Skip to content

Conversation

@bettio
Copy link
Collaborator

@bettio bettio commented Jan 28, 2024

Using i2c_driver_acquire and i2c_driver_release it is possible to implement native I2C drivers that coexist with erlang ones, without corruptions or race conditions.

This API has been designed on top of deprecated I2C API, so a new one might be needed.

Note: I2C API (the legacy one) from v4.4 until v5.1 is thread safe, new API is not, but we are not using it.

These changes are made under both the "Apache 2.0" and the "GNU Lesser General
Public License 2.1 or later" license terms (dual license).

SPDX-License-Identifier: Apache-2.0 OR LGPL-2.1-or-later

@bettio bettio force-pushed the acquire-i2c-api branch 2 times, most recently from bc775b5 to cbe29f2 Compare January 28, 2024 23:31
@pguyot
Copy link
Collaborator

pguyot commented Jan 29, 2024

Is this change introducing i2c_driver_acquire_peripheral and i2c_driver_release_peripheral that should be called from a "native" I2C driver? Why do we need any kind of mutex emulation if SMP is not defined?

I understand that lock is based on the idea that a "native" driver opens an Erlang port, but if it's native, why would it bother?

If we allow such native I2C driver, what kind of guarantee do we have against concurrent access to the I2C bus which seems to be what is not thread safe in esp-idf?

Eventually, the code calls smp_mutex_create but doesn't seem to call smp_mutex_destroy.

@bettio
Copy link
Collaborator Author

bettio commented Jan 29, 2024

Is this change introducing i2c_driver_acquire_peripheral and i2c_driver_release_peripheral that should be called from a "native" I2C driver? Why do we need any kind of mutex emulation if SMP is not defined?

I think some kind of improvement is needed for smp area (not in this PR). There some esp32 models, such as the esp32s2, for which we are disabling SMP support, however they still have threads support since they run FreeRTOS (like in any other esp-idf based firmware).
Hence since a native driver might still employ threads (other than the AtomVM main thread), a mutex is required even if SMP is not defining any of the primitives.

I understand that lock is based on the idea that a "native" driver opens an Erlang port, but if it's native, why would it bother?

Let's define the use case (similar to @arpunk one): foo needs to use BH1750 I2C sensors for which there is an erlang driver, but also wants to use a native display I2C driver. They need to share both the same bus, so some coexistence function is needed (that are those introduced with this PR).

So after this change the following can be done:

        i2c_opts = [
          sda: 4,
          scl: 15,
          clock_speed_hz: 1000000,
          peripheral: "i2c0"
        ]
        i2c_host = :i2c.open(i2c_opts)

        display_opts = [{:i2c_host, i2c_host} | @display_driver_opts]

If we allow such native I2C driver, what kind of guarantee do we have against concurrent access to the I2C bus which seems to be what is not thread safe in esp-idf?

For this specific reason I introduced i2c_driver_acquire_peripheral and i2c_driver_release_peripheral functions, a native driver muse use them: once acquired a mutex is locked.

There is still an open question (that can be fixed without changing the API exposed to native drivers). Is thread safety guaranteed across different I2C peripherals. E.g can I access I2C_NUM_0 while in parallel I'm using I2C_NUM_1? According to the answer to this question we might need a global I2C mutex or not.

Eventually, the code calls smp_mutex_create but doesn't seem to call smp_mutex_destroy.

Good catch.

@pguyot
Copy link
Collaborator

pguyot commented Jan 29, 2024

Is this change introducing i2c_driver_acquire_peripheral and i2c_driver_release_peripheral that should be called from a "native" I2C driver? Why do we need any kind of mutex emulation if SMP is not defined?

I think some kind of improvement is needed for smp area (not in this PR). There some esp32 models, such as the esp32s2, for which we are disabling SMP support, however they still have threads support since they run FreeRTOS (like in any other esp-idf based firmware).

Hence since a native driver might still employ threads (other than the AtomVM main thread), a mutex is required even if SMP is not defining any of the primitives.So the best option I think

I don't think a native driver should or can call an AtomVM function such as globalcontext_get_process_lock from an FreeRTOS task. Without smp, there is no process table lock so bad things could happen, there is no guarantee that the function can succeed from another task.

@bettio
Copy link
Collaborator Author

bettio commented Jan 29, 2024

Is this change introducing i2c_driver_acquire_peripheral and i2c_driver_release_peripheral that should be called from a "native" I2C driver? Why do we need any kind of mutex emulation if SMP is not defined?

I think some kind of improvement is needed for smp area (not in this PR). There some esp32 models, such as the esp32s2, for which we are disabling SMP support, however they still have threads support since they run FreeRTOS (like in any other esp-idf based firmware).
Hence since a native driver might still employ threads (other than the AtomVM main thread), a mutex is required even if SMP is not defining any of the primitives.So the best option I think

I don't think a native driver should or can call an AtomVM function such as globalcontext_get_process_lock from an FreeRTOS task. Without smp, there is no process table lock so bad things could happen, there is no guarantee that the function can succeed from another task.

Good point indeed.

So IMHO the best option is enabling i2c_driver_acquire_peripheral and i2c_driver_release_peripheral only when AVM_NO_SMP is not defined.
Since when AVM_NO_SMP it is not safe having anything except AtomVM.

Then we might have a separated discussion if we want to enable by default SMP on all ESP32 models.

Last but not least, I think the SPI API has the same issue, I think I need to make a fix for it.

@bettio bettio force-pushed the acquire-i2c-api branch 2 times, most recently from 5e5c05d to d10ef53 Compare January 29, 2024 13:06
// The native driver API defined here is supported only for SMP builds
#ifndef AVM_NO_SMP

I2CAcquireResult i2c_driver_acquire_peripheral(term i2c_port, i2c_port_t *i2c_num, GlobalContext *global)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

esp-idf calls the thing a port (because they do both master and slave), but I would advocate for bus. Peripheral definitely is misleading.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will change the name to just i2c_driver_acquire and i2c_driver_release. Functions do not acquire the bus, right now they just avoid the i2c is uninstalled while issuing commands.

return cmd == I2CCloseCmd ? NativeTerminate : NativeContinue;
}

static Context *get_i2c_context(term i2c_port, GlobalContext *global)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This function is only used with smp, isn't it?

So you may need to move up the #ifndef above it.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed.

@bettio bettio force-pushed the acquire-i2c-api branch 3 times, most recently from 7ba34e7 to 1afcd38 Compare January 29, 2024 21:37
@bettio bettio requested a review from pguyot January 29, 2024 21:38
ctx->platform_data = NULL;

#ifndef AVM_NO_SMP
smp_mutex_destroy(i2c_data->mutex);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this destroying the mutex before unlocking it on purpose? Not sure we can do that, I believe it is undefined behavior with thread APIs such as POSIX.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed.

@bettio bettio force-pushed the acquire-i2c-api branch 2 times, most recently from aa54485 to ead1b14 Compare January 31, 2024 00:31
{
// ugly hack for avoiding race conditions
// good news: close is not called frequently so the penalty is quite neglegible
Context *locked = globalcontext_get_process_lock(ctx->global, ctx->process_id);
Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@pguyot this change will avoid any change to platform_data, since before using the context and platform data we acquire a locked process. See line 672.
Maybe in the future we might change how platform_data is handled, but it is not our focus right now.

@bettio bettio requested a review from pguyot January 31, 2024 13:08
Copy link
Collaborator

@pguyot pguyot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems there still is a dead lock situation with the latest version of the code.

Close follows this sequence:

lock process table
acquire mutex
destroy bus
release mutex
release process table lock

Native driver follows this sequence:

lock process table
acquire mutex
release process table

do something with bus

lock process table
release mutex
release process table

This deadlocks as follows:

Close sequence Native driver sequence
lock process table
acquire mutex
release process table lock
do something with bus
lock process table
try to acquire mutex
try to acquire lock process table

Giving it more thoughts, I think we should rather adopt an Erlang-like approach and leverage the serialization at the process level rather than adding more locks.

This would mean external driver would send a message to the port to acquire the bus, the port would update an internal state (e.g. increment a reference counter), and the external driver would send a message to release the bus. This way we can ensure the bus is not closed while the external driver is using it.

@bettio bettio requested a review from pguyot February 1, 2024 12:04
@bettio
Copy link
Collaborator Author

bettio commented Feb 1, 2024

It seems there still is a dead lock situation with the latest version of the code.

Close follows this sequence:

lock process table acquire mutex destroy bus release mutex release process table lock

Native driver follows this sequence:

lock process table acquire mutex release process table

do something with bus

lock process table release mutex release process table

This deadlocks as follows:
Close sequence Native driver sequence
lock process table
acquire mutex
release process table lock
do something with bus
lock process table
try to acquire mutex
try to acquire lock process table

Giving it more thoughts, I think we should rather adopt an Erlang-like approach and leverage the serialization at the process level rather than adding more locks.

This would mean external driver would send a message to the port to acquire the bus, the port would update an internal state (e.g. increment a reference counter), and the external driver would send a message to release the bus. This way we can ensure the bus is not closed while the external driver is using it.

Good point, but not easy to fix it in little time, let's go for a naive approach and work on this known limitation later.
Let's just focus on correctness now.

return I2CAcquireOk;
}

void i2c_driver_release(term i2c_port, GlobalContext *global)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As mentioned elsewhere, this function is useless with this current implementation.

Consequently, acquire can be renamed and merged with get_i2c_context_lock.

return cmd == I2CCloseCmd ? NativeTerminate : NativeContinue;
}

// The native driver API defined here is supported only for SMP builds
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And the non-locking API could also be used with non-SMP builds.

Copy link
Collaborator

@pguyot pguyot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this current iteration, if i2c_driver_release is called after i2c_driver_acquire is called, it will deadlock.

Indeed, i2c_driver_acquire calls get_i2c_context_lock which locks process table and doesn't unlock it. And i2c_driver_release also calls get_i2c_context_lock and therefore deadlocks.

There are actually two options here:

  1. No lock at all

i2c native driver's author is responsible for making sure the i2c_port exists.
We don't do acquire/release but instead a i2c_driver_get_port function:

bool i2c_driver_get_port(term i2c_port, i2c_port_t *i2c_num, GlobalContext *global);

that peeks into i2c port context and returns the i2c_port_t (and result says if it succeeded or not).

  1. Acquire/release through messages

An acquire message is sent to the i2c_port that returns the native i2c_port_t and increments a counter of acquire calls.
A release message is also meant to be sent to the i2c_port.
The i2c_port process refuses to close while the i2c_port_t is acquired.

@bettio bettio force-pushed the acquire-i2c-api branch 2 times, most recently from f7e0514 to 4d4fcc6 Compare February 5, 2024 13:56
@bettio bettio requested a review from pguyot February 5, 2024 13:56
@bettio
Copy link
Collaborator Author

bettio commented Feb 5, 2024

As longer term solution 2 seems the right thing to do, however I'm pushing a simple ref count based approach so we can have a simple solution for the near term.
It should avoid any concurrent i2c deletion.

}

// The native driver API defined here is supported only for SMP builds
#ifndef AVM_NO_SMP
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This can be removed now.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

globalcontext_send_message(ctx->global, local_process_id, ret_msg);
mailbox_remove_message(&ctx->mailbox, &ctx->heap);

return cmd == I2CCloseCmd ? NativeTerminate : NativeContinue;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe this now needs to be changed to prevent the native process to be deallocated while a "native i2c driver" locked the port.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

@bettio bettio requested a review from pguyot February 5, 2024 22:11
int local_process_id = term_to_local_process_id(i2c_port);
Context *ctx = globalcontext_get_process_lock(global, local_process_id);

if (ctx->native_handler != i2cdriver_consume_mailbox || (ctx->platform_data == NULL)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ctx can be null if process doesn't exist (or more likely no longer exists).

int local_process_id = term_to_local_process_id(i2c_port);
Context *ctx = globalcontext_get_process_lock(global, local_process_id);

if (ctx->native_handler != i2cdriver_consume_mailbox || (ctx->platform_data == NULL)) {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ctx should exist here if caller called acquire but testing for NULL wouldn't hurt.

i2c_data->ref_count--;
NativeHandlerResult close_result = i2c_driver_maybe_close(ctx);
if (close_result == NativeTerminate) {
mailbox_send_term_signal(ctx, KillSignal, CLOSE_ATOM);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You could have called scheduler_terminate or send a regular message (e.g. CmdClose).

The problem here is you kill with reason CLOSE_ATOM which is not NORMAL_ATOM and may have unexpected consequences.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

scheduler_terminate was my first idea, but it is not clear to me (and we didn't document that information), if it is thread safe or even if it can be used when globalcontext_get_process_lock has been called.

@bettio bettio requested a review from pguyot February 6, 2024 13:01
@bettio bettio requested a review from pguyot February 6, 2024 20:53
Using `i2c_driver_acquire` and `i2c_driver_release` it
is possible to implement native I2C drivers that coexist with erlang
ones, without corruptions or race conditions.

This API has been designed on top of deprecated I2C API, so a new one
might be needed.

Signed-off-by: Davide Bettio <davide@uninstall.it>
@bettio bettio merged commit 8a3eca7 into atomvm:master Feb 7, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants