Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

subsys/mgmt/hawkbit & sample: Bugs/improvements #37225

Closed
ycsin opened this issue Jul 27, 2021 · 8 comments · Fixed by #38081
Closed

subsys/mgmt/hawkbit & sample: Bugs/improvements #37225

ycsin opened this issue Jul 27, 2021 · 8 comments · Fixed by #38081
Assignees
Labels
area: hawkBit bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug

Comments

@ycsin
Copy link
Member

ycsin commented Jul 27, 2021

Describe the bug
Found something in the hawkbit that might be improved while I'm debugging why the hawkbit sample isn't working for me:

  1. The hawkbit_init function is assuming the "storage" label to be with DT_CHOSEN_ZEPHYR_FLASH_CONTROLLER_LABEL (usually flash0?) whereas it can can be with the external serial flash so this should follows where the "storage" is defined instead of a constant. Fixed
  2. Same goes for the nvs_init(DT_CHOSEN_ZEPHYR_FLASH_CONTROLLER_LABEL) Fixed
  3. There's an extra space before the "not" here that should be removed. Fixed
  4. For some reason it marked the image as OK here, but the image still isn't OK. Which led to a reboot here when hawkbit_autohandler(); is invoked. Expected behavior from mcuboot
  5. It rebooted before the log messages get flushed, so maybe have to delay before reboot, if not there's no point logging since the users won't be able to read them anyway
    • or do something in the reboot function to flush the logs. (Probably out of the scope of hawkbit)
  6. The hawkbit subsys is expecting confirmed image, but the MCUboot is designed to not confirm the primary image that's flashed.
  7. LOG_MODULE_REGISTER(hawkbit); is wrong, its verbosity can't be configured. Fixed
  8. LOG_ERR("only part 'bApp' is supported; got %s", chunk->part); missing log_strdup, throwing error in log.

To Reproduce
Build hawkbit sample with "storage" in external flash.

Expected behavior
hawkbit sample working.

Impact
Can't get hawkbit sample to work.

Environment (please complete the following information):

  • OS: Windows 10
  • gnuarmemb
  • v2.6.0-rc2-107-gebe282b02d9f

Logs

*** Booting Zephyr OS build v2.6.0-rc2-107-gebe282b02d9f  ***
I: Starting bootloader
I: Primary image: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
I: Scratch: magic=unset, swap_type=0x1, copy_done=0x3, image_ok=0x3
I: Boot source: primary slot
I: Swap type: none
I: Bootloader chainload address offset: 0x18000
I: Jumping to the first image slot


uart:~$ *** Booting Zephyr OS build v2.6.0-rc2-107-gebe282b02d9f  ***
[00:00:02.010,000] <inf> main: Hawkbit sample app started
[00:00:04.602,000] <inf> modem_gsm: Waiting for modem to boot up
[00:00:15.308,000] <inf> modem_gsm: Modem RDY
[00:00:16.385,000] <inf> modem_gsm: PPP channel 2 connected to GSM_1
[00:00:16.691,000] <inf> modem_gsm: Manufacturer: Quectel
[00:00:16.749,000] <inf> modem_gsm: Model: EC21
[00:00:16.808,000] <inf> modem_gsm: Revision: EC21EFAR06A01M4G
[00:00:16.868,000] <inf> modem_gsm: IMEI: 867962047311785
[00:00:17.052,000] <inf> modem_gsm: Network registered, home network.
[00:00:17.061,000] <inf> modem_gsm: Attached to packet service!
[00:00:17.080,000] <inf> net_ppp: Initializing PPP to use GSM_1
[00:00:17.084,000] <inf> modem_gsm: AT channel 1 connected to GSM_2
[00:00:17.157,000] <inf> main: hawkbit_init
[00:00:18.275,000] <inf> fs_nvs: 3 Sectors of 4096 bytes
[00:00:18.275,000] <inf> fs_nvs: alloc wra: 0, fe8
[00:00:18.275,000] <inf> fs_nvs: data wra: 0, 0
[00:00:18.276,000] <inf> hawkbit: Image is not confirmed OK
[00:00:18.276,000] <inf> hawkbit: Marked image as OK
[00:00:21.953,000] <inf> hawkbit: Erased second slot
[00:00:21.953,000] <inf> main: Starting hawkbit polling mode
[00:00:22.953,000] <err> hawkbit: The current image is not confirmed
[00:00:22.953,000] <err> hawkbit: Image is unconfirmed
[00:00:22.953,000] <err> hawkbit: Rebooting to previous confirmed image.

It manage to print the logs here because I k_sleep() before reboot function call. Or else the last message would be:

<inf> main: Starting hawkbit polling mode

before it repeats the bootloader process, and there's no way for anyone to immediately know what caused the reset.

Suggested fix

1 & 2: Refer to the nvs sample, determine the location of "storage" partition directly from devicetree: #37275

...
#define STORAGE_NODE DT_NODE_BY_FIXED_PARTITION_LABEL(storage)
#define FLASH_NODE DT_MTD_FROM_FIXED_PARTITION(STORAGE_NODE)
...
int hawkbit_init(void)
{
	bool image_ok;
	int ret = 0, rc = 0;
	struct flash_pages_info info;
	int32_t action_id;
	const struct device *flash_dev;

	flash_dev = DEVICE_DT_GET(FLASH_NODE);

	fs.offset = FLASH_AREA_OFFSET(storage);
	rc = flash_get_page_info_by_offs(flash_dev, fs.offset, &info);
	if (rc) {
		LOG_ERR("Unable to get storage page info");
		return -EIO;
	}

	fs.sector_size = info.size;
	fs.sector_count = 3U;

	rc = nvs_init(&fs, flash_dev->name);
...

3: Remove the space: #37278

LOG_INF("Image is %s confirmed OK", image_ok ? "" : "not");

4: Perform additional check if the image is actually "OK"? Expected behavior from mcuboot
5: Do k_sleep() in hawkbit before the reboot command is called so that the log subsys has time to print the logs

  • (out of hawkbit's scope) Implement function in the log subsys which can be called to flush all the logs
  • (out of hawkbit's scope) Do k_sleep() in the shutdown/reboot function so that the log subsys has time to print the logs

6: I recommend this needs to be mentioned in the hawkbit subsys's README & update the hawkbit sample's README such that the user should flash the confirmed image built with CONFIG_MCUBOOT_GENERATE_CONFIRMED_IMAGE=y instead of the typical one. Or else hawkbit will always reboot the system when autohandler() is invoked. (Thanks to @nvlsianpu for the enlightenment). This is especially useful for users that are new to this.

7: Should be LOG_MODULE_REGISTER(hawkbit, CONFIG_HAWKBIT_LOG_LEVEL); instead. #37277

8: Should be LOG_ERR("only part 'bApp' is supported; got %s", log_strdup(chunk->part)); instead #37276

@ycsin ycsin added the bug The issue is a bug, or the PR is fixing a bug label Jul 27, 2021
@ycsin
Copy link
Member Author

ycsin commented Jul 27, 2021

@ceolin @gmarull @Navin-Sankar Any clue/comments?

@Navin-Sankar Navin-Sankar self-assigned this Jul 27, 2021
@ycsin ycsin changed the title subsys: mgmt: hawkbit subsys: mgmt: hawkbit: Bugs/improvements Jul 27, 2021
@ycsin ycsin changed the title subsys: mgmt: hawkbit: Bugs/improvements subsys: mgmt: hawkbit & sample: Bugs/improvements Jul 27, 2021
@ycsin
Copy link
Member Author

ycsin commented Jul 27, 2021

Updated, please read again.

@Navin-Sankar
Copy link
Member

@ycsin I agree point 1 & 2. It's good to get the correct flash device from the devicetree storage partition node.
3. remove unwanted space between " not".

@cfriedt
Copy link
Member

cfriedt commented Jul 27, 2021

@sylvioalves - is this related to the PR you had just made? Just wondering if there is some commonality.

@cfriedt cfriedt added the priority: low Low impact/importance bug label Jul 27, 2021
@sylvioalves
Copy link
Collaborator

@sylvioalves - is this related to the PR you had just made? Just wondering if there is some commonality.

No, it is not related.

@ycsin
Copy link
Member Author

ycsin commented Jul 28, 2021

Updated

@cfriedt
Copy link
Member

cfriedt commented Jul 28, 2021

@ycsin - would you be open to making a PR for your fix?

@ycsin
Copy link
Member Author

ycsin commented Jul 29, 2021

@cfriedt already opened 4 PRs for different issue but @Navin-Sankar requested them to be made into 1 PR instead here. I will do it if its really necessary after I finish my other work.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area: hawkBit bug The issue is a bug, or the PR is fixing a bug priority: low Low impact/importance bug
Projects
None yet
4 participants