Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Device state retention #72688

Open
wants to merge 6 commits into
base: main
Choose a base branch
from

Conversation

benediktibk
Copy link
Collaborator

This PR is a follow-up to #71336.

It implements a device state retention module, whose instance is selected via a chosen node. The retention module itself currently stores only the information which device needs to be reinitialized or can be initialized normally. But it should be possible to implement the storage of additional driver specific data as well.

The drivers themselves have to implement the device state retention via a reinit function. The kernel then decides during the initialization if the init or reinit function should be called, dependening on the results from the device state retention module.

This is a deviation from the approach, which was agreed upon in the architecture work group. But I think it will address most of the concerns, as it is a more general approach which can be used for multiple drivers. If it is still undesirable to move forward with this approach another draft is available as #72382 which implements the device state retention only for one specific driver.

Improve logging of the driver for the port expander PCAL64XXA.

Signed-off-by: Benedikt Schmidt <benedikt.schmidt@embedded-solutions.at>
Extract methods from init function of PCAL64XXA.

Signed-off-by: Benedikt Schmidt <benedikt.schmidt@embedded-solutions.at>
@benediktibk benediktibk force-pushed the device_retention branch 2 times, most recently from ad2b17e to ad5f4ed Compare May 13, 2024 16:03
Implement a retention module for retaining the device state during reboots.

Signed-off-by: Benedikt Schmidt <benedikt.schmidt@embedded-solutions.at>
Implement optional device state retention during reboots in the kernel.

Signed-off-by: Benedikt Schmidt <benedikt.schmidt@embedded-solutions.at>
Implement the device state retention for the port expander PCAL64XXA.

Signed-off-by: Benedikt Schmidt <benedikt.schmidt@embedded-solutions.at>
Add a build_all test for the device state retention module with a PCAL64XXA
as a user of the module.

Signed-off-by: Benedikt Schmidt <benedikt.schmidt@embedded-solutions.at>
Copy link
Contributor

@andyross andyross left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Some notes. The device model experts will have more say for sure, but... I guess this seems a little too general? The problem you're trying to solve is very specific: "don't clobber the bootloader device settings on this GPIO controller", where this is an extremely general framework with a surprising amount (900 lines!) of code added.

Are there any real situations where this is going to help vs. just adding a "use_bootloader_state" flag to the handful of drivers that are sensitive to the transition?

* kernel during system initialization. This macro should only be used when the
* device is not being allocated from a devicetree node. If you are allocating a
* device from a devicetree node, use DEVICE_DT_DEFINE() or
* DEVICE_DT_INST_DEFINE() instead.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this documentation block should have something to say about what REINIT means and how this differs from DEVICE_DEFINE().

Like, the doc lines below for "init_fn" and "reinit_fn" say almost exactly the same things. You need to tell the user what the difference is.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The documentation definitely has room for improvement, so far I basically only copied the existing stuff.

* @param prio The device's priority within its initialization level. See
* SYS_INIT() for details.
* @param api Pointer to the device's API structure. Can be `NULL`.
* @param retention_index Retention index of the device.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Who allocates a "retention index"? The app? The board DTS? Is there a mechanism to detect or prevent collisions? This is the kind of thing that Zephyr has traditionally tried very hard to do automatically.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is currently defined in the devicetree via zephyr,retention-index. And yes, with this approach the user will have to ensure that there are no collisions. The original approach came from the other side, where the layout was defined with automatic collision checks via devicetree, but the comments #71336 basically were about this being to much a software thing, and therefore not belonging into the devicetree. The issue with this layout is, that it will have to be stable between the bootloader and the firmware. Hence, an automatic generation during the build process will be problematic, as this might produce images which are then incompatible.

@benediktibk
Copy link
Collaborator Author

Some notes. The device model experts will have more say for sure, but... I guess this seems a little too general? The problem you're trying to solve is very specific: "don't clobber the bootloader device settings on this GPIO controller", where this is an extremely general framework with a surprising amount (900 lines!) of code added.

Are there any real situations where this is going to help vs. just adding a "use_bootloader_state" flag to the handful of drivers that are sensitive to the transition?

This approach will solve the transition between bootloader and firmware, as well as the other way round. A "use_bootloader_state" flag won't work in the case of a switch from the firmware back into the bootloader.

The discussion during the architecture work group meeting went a little bit into two directions. Either a more general approach, or a rather specific solution for just one driver. This would be my proposal for the general approach, but I have also prepared the specific one here: #72382

@bjarki-trackunit
Copy link
Collaborator

bjarki-trackunit commented May 13, 2024

We currently have this function in tree

bool device_is_ready(const struct device *dev);

Which returns true if a device has been initialized successfully. It uses this structure stored in RAM to check:

struct device_state {
	/**
	 * Device initialization return code (positive errno value).
	 *
	 * Device initialization functions return a negative errno code if they
	 * fail. In Zephyr, errno values do not exceed 255, so we can store the
	 * positive result value in a uint8_t type.
	 */
	uint8_t init_res;

	/** Indicates the device initialization function has been
	 * invoked.
	 */
	bool initialized : 1;
};

Any device can check this at any time, in the existing device init function for example:

int pcal64xxa_init(const struct device *dev)
{
        if (device_is_ready(dev)) {
                return pcal64xxa_reinit(dev);
}

It seems way simpler to me to use the retention subsystem to preserve the struct device_state, instead of adding a new device variation. The retention problem is the same, yet this is compatible with the existing codebase and quite intuitive IMO.

Existing drivers which always reinit don't check if they are already initialized in their init functions, so the addition will not break anything in tree, while providing the ability for new drivers to use it.

Additionally, if retention is not used, and the reinit is not desired, device_is_ready() will always return false in the init function, prompting the init of the driver.

WDYT?

@benediktibk
Copy link
Collaborator Author

We currently have this function in tree

bool device_is_ready(const struct device *dev);

Which returns true if a device has been initialized successfully. It uses this structure stored in RAM to check:

struct device_state {
	/**
	 * Device initialization return code (positive errno value).
	 *
	 * Device initialization functions return a negative errno code if they
	 * fail. In Zephyr, errno values do not exceed 255, so we can store the
	 * positive result value in a uint8_t type.
	 */
	uint8_t init_res;

	/** Indicates the device initialization function has been
	 * invoked.
	 */
	bool initialized : 1;
};

Any device can check this at any time, in the existing device init function for example:

int pcal64xxa_init(const struct device *dev)
{
        if (device_is_ready(dev)) {
                ...
}

It seems way simpler to me to use the retention subsystem to preserve the struct device_state, instead of adding a new device variation. The retention problem is the same, yet this is compatible with the existing codebase and quite intuitive IMO.

Existing drivers which always reinit don't check if they are already initialized in their init functions, so the addition will not break anything in tree, while providing the ability for new drivers to use it.

Additionally, if retention is not used, and the reinit is not desired, device_is_ready() will always return false in the init function, prompting the init of the driver.

WDYT?

I like the idea, but I see one problem: There are different types of "initialized". There is the hardware (registers set, ...) and the software (semaphores initialized, ...) point of view. Therefore, reusing the existing flag will create two different meanings for the flag, depending on if the driver for the device calls it or a different part of the code.

But I could store the additional flag (e.g. hardware_initialized) itself in the device_state instead of the separate data structure from the retention module, if this would clear things up a little bit? Of course it then still would be necessary to create another accessor, like device_hardware_is_ready, but this could definitely be a way got get rid of the whole reinit-shenanigans.

@bjarki-trackunit
Copy link
Collaborator

bjarki-trackunit commented May 13, 2024

I like the idea, but I see one problem: There are different types of "initialized". There is the hardware (registers set, ...) and the software (semaphores initialized, ...) point of view. Therefore, reusing the existing flag will create two different meanings for the flag, depending on if the driver for the device calls it or a different part of the code.

But I could store the additional flag (e.g. hardware_initialized) itself in the device_state instead of the separate data structure from the retention module, if this would clear things up a little bit? Of course it then still would be necessary to create another accessor, like device_hardware_is_ready, but this could definitely be a way got get rid of the whole reinit-shenanigans.

I see, and agreed, adding a bit for "hardware initialized" to the existing struct is quite intuitive. device_is_ready() must retain its meaning -> both "software" and "hardware" initialized, so adding a new function to check the flag is also required.

maybe device_is_post_init(), or just device_is_initialized()?

Copy link
Member

@henrikbrixandersen henrikbrixandersen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still believe this to be a too generic solution to a very isolated, specific downstream problem.

I am sorry, but I do not see a general need for a device state retention subsystem.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants