
[RSDK-8187] config cache changes #262

Open · wants to merge 4 commits into base: main

Conversation

allisonschiang (Contributor)

Added storage into the config_monitor in order to write newly detected configs to NVS before rebooting, preventing a reboot cycle where the config is out of date.

However, this introduced a new issue where some invalid configs would cause a reboot loop, as they would crash the robot before entry.rs could detect a robot error and call reset_robot_configuration().

Some fixes could be:

  • Keep this approach and try to catch configs that would cause a reboot loop (don't know if config is valid until bot crashes)
  • Keep this approach and individually return errors to catch crashing invalid configs as we discover them
  • Add another check that pulls the config from app on boot-up and replaces the cache, so if the user switches to a valid config it overwrites the invalid one in the cache
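The store-before-restart flow described above can be sketched roughly as follows. Note this is an illustrative sketch, not the actual micro-RDK API: `Storage`, `MemStorage`, and the simplified `ConfigResponse` are stand-ins for the real `RobotConfigurationStorage` trait and proto types.

```rust
// Sketch of the store-before-restart flow. All names here are
// illustrative stand-ins, not the actual micro-RDK API.

#[derive(Clone, PartialEq, Debug)]
struct ConfigResponse {
    // Simplified stand-in for the proto config payload.
    config: Option<String>,
}

trait Storage {
    type Error: std::fmt::Debug;
    fn store_robot_configuration(&mut self, cfg: String) -> Result<(), Self::Error>;
}

struct MemStorage {
    cached: Option<String>,
}

impl Storage for MemStorage {
    type Error = ();
    fn store_robot_configuration(&mut self, cfg: String) -> Result<(), Self::Error> {
        self.cached = Some(cfg);
        Ok(())
    }
}

// On a config change: persist the new config first, then restart, so the
// next boot starts from the fresh cache instead of the stale one.
fn on_config_change<S: Storage>(
    storage: &mut S,
    curr: &ConfigResponse,
    new: ConfigResponse,
    restart: impl FnOnce(),
) {
    if *curr != new {
        if let Some(cfg) = new.config {
            if let Err(e) = storage.store_robot_configuration(cfg) {
                eprintln!("failed to cache new config: {:?}", e);
            }
        }
        restart();
    }
}
```

The key ordering is that the cache write happens before the restart hook fires; the invalid-config reboot loop described above is exactly the failure mode this ordering does not yet guard against.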

@allisonschiang allisonschiang requested a review from a team as a code owner July 15, 2024 14:51
ServerError: From<<S as RobotConfigurationStorage>::Error>,
<S as WifiCredentialStorage>::Error: Sync + Send + 'static,
{
pub fn new(restart_hook: impl FnOnce() + 'a, curr_config: ConfigResponse, storage: S) -> Self {
Member

Let's make the lambda come last.

false => {
if let Err(e) = self
.storage
.store_robot_configuration(new_config.config.unwrap())
Member

I'm suspicious of unwrap. What should happen here if new_config.config is None?

allisonschiang (Contributor Author), Jul 15, 2024

Not sure; what happens in entry.rs when get_config is called and new_config is None or an AppClientError is returned?

let (app_config, cfg_response, cfg_received_datetime) =
app_client.get_config(Some(network.get_ip())).await.unwrap();

Contributor Author

I'm guessing it panics?

Contributor Author

oh I see, new_config is already unwrapped
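For reference, `Option::unwrap` on `None` panics, which on-device means a reboot mid-task; this is the concern behind the question above. A toy illustration (not the PR's code) of turning the absent case into a recoverable error instead:

```rust
// Toy illustration of the unwrap concern discussed above; not the PR's code.
// `cfg.unwrap()` would panic on None and reboot the robot mid-task; matching
// on the Option turns the absent case into a recoverable error instead.
fn cache_config(cfg: Option<&str>) -> Result<&str, &'static str> {
    match cfg {
        Some(c) => Ok(c),
        None => Err("no config in response; skipping cache write"),
    }
}
```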

@allisonschiang allisonschiang self-assigned this Jul 15, 2024
Comment on lines +15 to +18
S: RobotConfigurationStorage + WifiCredentialStorage + Clone + 'static,
<S as RobotConfigurationStorage>::Error: Debug,
ServerError: From<<S as RobotConfigurationStorage>::Error>,
<S as WifiCredentialStorage>::Error: Sync + Send + 'static,
Member

This won't work unless feature = "provisioning" is enabled, but see #267 which offers a way out. Perhaps this change should wait on that one.

micro-rdk/src/common/config_monitor.rs (resolved)
@acmorrow (Member)

What about the following idea: when building from a cached config, we first fetch the config, and then we immediately erase it. Then we build the robot. Only if the robot builds OK do we write it back to the cache again. I don't love it because of the extra flash wear it implies, but that's already a problem we have.
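That erase-build-rewrite sequence might be sketched like this (illustrative types and names, not the real entry.rs API):

```rust
// Sketch of the erase-build-rewrite idea: erase the cached config before
// building, write it back only on success. Illustrative types only.

#[derive(Clone, Debug, PartialEq)]
struct RobotConfig(String);

#[derive(Debug)]
struct BuildError;

// Take the cached config, erase it immediately, then try to build. A panic
// mid-build leaves the cache empty, so the next boot cannot re-enter the
// crash loop; the config is written back only after a successful build.
fn build_from_cache(
    cache: &mut Option<RobotConfig>,
    build: impl Fn(&RobotConfig) -> Result<(), BuildError>,
) -> Result<(), BuildError> {
    let cfg = match cache.take() {
        Some(c) => c, // cache is now erased
        None => return Err(BuildError),
    };
    build(&cfg)?; // on failure (or panic) the cache stays empty
    *cache = Some(cfg); // write back only on success
    Ok(())
}
```

The flash-wear cost comes from the erase-plus-rewrite happening on every successful boot, not just on config changes.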

@acmorrow (Member)

I discussed this idea with @npmenard and he feels like the flash wear is too much of a problem. Please move forward with committing this (once you have addressed the failure to build without provisioning), and file a new bug ticket about the situation that arises when a panic inducing config is cached. Then we can tackle that in a new PR.

@mattjperez (Member) left a comment

This looks good to me (nice work on the generics in the signatures), but definitely:

  1. wait for [RSDK-8219] allow micro-RDK to build w/out provisioning feature #267 to merge, then retest compilation with and without provisioning
  2. make a follow-up ticket and reference it in a TODO comment

Comment on lines +72 to +77
if let Some(config) = new_config.config {
if let Err(e) = self.storage.store_robot_configuration(config) {
log::warn!("Failed to store new robot configuration from app: {}", e);
}
}
self.restart();
Member

Please add the crash-loop bug as a TODO comment here once it's filed.

@@ -43,7 +69,12 @@ impl<'a> PeriodicAppClientTask for ConfigMonitor<'a> {
app_client.clone().get_config(None).await
{
if self.curr_config != *new_config {
self.restart()
if let Some(config) = new_config.config {
if let Err(e) = self.storage.store_robot_configuration(config) {
Member

did we consider erasing the config then restarting?

Member

We could. That would leave open the possibility that if you can't connect when you come back online, you won't be able to build a robot until you talk to app again. That might be a risk worth taking.
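A minimal sketch of that alternative, erasing the cached config before restarting, might look like this (hypothetical trait and struct names, not the real micro-RDK types):

```rust
// Minimal sketch of the erase-then-restart alternative discussed above;
// trait and struct names are hypothetical.

trait ConfigCache {
    fn reset_robot_configuration(&mut self);
}

struct Nvs {
    cached: Option<String>,
}

impl ConfigCache for Nvs {
    fn reset_robot_configuration(&mut self) {
        self.cached = None;
    }
}

// On detecting a changed config, wipe the cache and restart. The next boot
// must fetch a fresh config from app before it can build a robot, trading
// offline resilience for crash-loop safety.
fn on_config_change_erase<C: ConfigCache>(cache: &mut C, restart: impl FnOnce()) {
    cache.reset_robot_configuration();
    restart();
}
```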

Contributor Author

I think this could work; if you can't come back online it's either an empty config or an invalid config, and neither would work anyway. Should I implement this solution?
