Fix the flaky alarms tests #1765

PradeepKiruvale · 2023-02-27T14:25:10Z

Proposed changes

This PR fixes the flaky alarm tests:

Run the tests in serial
Fix the way the results are validated

Types of changes

Bugfix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds functionality)
Improvement (general improvements like code refactoring that doesn't explicitly fix a bug or add any new functionality)
Documentation Update (if none of the other choices apply)
Breaking change (fix or feature that would cause existing functionality to not work as expected)

Paste Link to the issue

Checklist

I have read the CONTRIBUTING doc
I have signed the CLA (in all commits with git commit -s)
I ran cargo fmt as mentioned in CODING_GUIDELINES
I used cargo clippy as mentioned in CODING_GUIDELINES
I have added tests that prove my fix is effective or that my feature works
I have added necessary documentation (if appropriate)

Further comments

didier-wenzek

Seems good. I have some questions though.

didier-wenzek · 2023-02-27T17:17:14Z

crates/core/tedge_mapper/src/c8y/tests.rs

@@ -1274,6 +1244,7 @@ fn extract_child_id(in_topic: &str, expected_child_id: Option<String>) {
 }

 #[tokio::test]
+#[serial]


Why do we need to have all these tests serial?

They behave correctly even without serial when both c8y_mapper_alarm_empty_payload and c8y_mapper_child_alarm_empty_payload are ignored.

Yes, that's right. But I observed that the c8y-converter tests need the broker, so to be on the safe side I made them run serially.
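
The idea behind running broker-dependent tests one at a time can be sketched without any test framework. The following is only an illustration under stated assumptions: `SERIAL`, `BROKER_LOG`, `run_test` and `run_both` are hypothetical names, and a plain `std::sync::Mutex` stands in for the `#[serial]` attribute used in the diff (presumably from the serial_test crate). It is not how the PR implements serialization.

```rust
use std::sync::{Mutex, MutexGuard};
use std::thread;

// Hypothetical stand-in for the shared broker: a log of which test touched it.
static BROKER_LOG: Mutex<Vec<&'static str>> = Mutex::new(Vec::new());
// Hypothetical global lock playing the role of #[serial].
static SERIAL: Mutex<()> = Mutex::new(());

fn serial_guard() -> MutexGuard<'static, ()> {
    // A panicking test poisons the lock; later tests should still be able to run.
    SERIAL.lock().unwrap_or_else(|poisoned| poisoned.into_inner())
}

fn run_test(name: &'static str) {
    // Held until the end of the function, so test bodies never overlap.
    let _guard = serial_guard();
    BROKER_LOG.lock().unwrap().push(name); // test starts using the broker
    thread::yield_now(); // give the other thread a chance to interleave
    BROKER_LOG.lock().unwrap().push(name); // test finishes
}

/// Runs two "tests" on separate threads and returns the order of broker accesses.
fn run_both() -> Vec<&'static str> {
    BROKER_LOG.lock().unwrap().clear();
    let a = thread::spawn(|| run_test("alarm_test_a"));
    let b = thread::spawn(|| run_test("alarm_test_b"));
    a.join().unwrap();
    b.join().unwrap();
    BROKER_LOG.lock().unwrap().clone()
}
```

With the guard in place, the two entries of each test are always adjacent in the log, whichever test happens to run first.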

didier-wenzek · 2023-02-27T17:24:26Z

crates/core/tedge_mapper/src/c8y/tests.rs

-            "tedge/alarms/major/temperature_alarm",
+            "tedge/alarms/major/temperature_alarm/external_sensor",


Why is this instance of tedge/alarms/major/temperature_alarm replaced by tedge/alarms/major/temperature_alarm/external_sensor while the others (above) are replaced by tedge/alarms/major/custom_temperature_alarm?

Because there was a typo in the topic: the alarm that was sent on tedge/alarms/major/temperature_alarm/external_sensor had to be cleared by sending an empty message on the same topic.

Signed-off-by: Pradeep Kumar K J <pradeepkumar.kj@softwareag.com>

albinsuresh · 2023-02-28T06:50:17Z

crates/core/tedge_mapper/src/c8y/tests.rs

-    while let Ok(Some(msg)) = messages.next().with_timeout(TEST_TIMEOUT_MS).await {
-        assert_json_include!(actual:serde_json::from_str::<serde_json::Value>(&msg).unwrap(), expected:expected_msg);
-    }
+    let expected_msg = r#"{"severity":"MAJOR","type":"custom_temperature_alarm","time":"2023-01-25T18:41:14.776170774Z","text":"Temperature high","customFragment":{"nested":{"value":"extra info"}}}"#;


I'm assuming that you converted this to a raw string to use the assert_received_all_expected function, which expects strings. You could have used the to_string() function on the earlier json::Value as well, right? That way you get a string without affecting the code readability. It doesn't matter in this specific case as the JSON was not formatted properly in the first place.

Yes, that's right. I felt converting from JSON to a string is not necessary if a raw string is used directly.

Using json! is just for better readability, especially with multi-level JSON structs. More of a nice-to-have than a must.

albinsuresh · 2023-02-28T06:54:45Z

crates/core/tedge_mapper/src/c8y/tests.rs


+    // Expect converted temperature alarm message
+    mqtt_tests::assert_received_all_expected(&mut messages, TEST_TIMEOUT_MS, &[expected_msg]).await;


This is still problematic, as an assertion failure will prevent the cleanup logic below (clearing alarms) from executing. Not just in this test: all alarm tests relying on this cleanup behaviour suffer from the same problem, as you discovered yesterday.

What I observed is that these tests run in serial, and when the next test starts it creates a new broker instance, which starts clean with no persisted messages.

So, we don't even need that "alarm cleanup" step in all these tests, huh? Fine then. You can remove those cleanup steps if they're useless anyway. But it's up to you.
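
The concern raised in this thread, that a failed assertion skips any cleanup placed after it, applies to Rust tests in general. One common way around it is a guard whose `Drop` implementation runs the cleanup even while a panic unwinds. This is not what the PR does (the thread concludes that the cleanup is unnecessary because each test gets a fresh broker); the sketch below only illustrates the mechanism, and `AlarmCleanup` and `cleanup_ran_despite_failed_assertion` are hypothetical names.

```rust
use std::panic::{self, AssertUnwindSafe};
use std::sync::atomic::{AtomicBool, Ordering};
use std::sync::Arc;

/// Hypothetical guard: performs the "clear alarm" step when it goes out of scope.
struct AlarmCleanup {
    cleared: Arc<AtomicBool>,
}

impl Drop for AlarmCleanup {
    fn drop(&mut self) {
        // In a real test, the empty retained message would be published here.
        self.cleared.store(true, Ordering::SeqCst);
    }
}

/// Runs a test body whose assertion fails and reports whether cleanup still ran.
fn cleanup_ran_despite_failed_assertion() -> bool {
    let cleared = Arc::new(AtomicBool::new(false));
    let flag = Arc::clone(&cleared);

    // catch_unwind plays the role of the test harness catching the panic.
    let outcome = panic::catch_unwind(AssertUnwindSafe(move || {
        let _cleanup = AlarmCleanup { cleared: flag };
        assert_eq!(1, 2, "simulated assertion failure");
    }));

    outcome.is_err() && cleared.load(Ordering::SeqCst)
}
```

The guard is created before the assertion, so its destructor runs during unwinding. This relies on the default unwinding panic strategy; with `panic = "abort"` no destructors run.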

didier-wenzek · 2023-02-28T13:16:13Z

This PR has to be rebased to get the fix for test report generation.

rina23q

Got a question only.

rina23q · 2023-02-28T14:39:45Z

crates/core/tedge_mapper/src/c8y/tests.rs

            r#"{ "text": "Temperature high","time":"2023-01-25T18:41:14.776170774Z","customFragment": {"nested":{"value": "extra info"}} }"#,
            mqtt_channel::QoS::AtLeastOnce,
            true,
        )
        .await
        .unwrap();

-    let expected_msg = json!({"severity":"MAJOR","type":"temperature_alarm","time":"2023-01-25T18:41:14.776170774Z","text":"Temperature high","customFragment":{"nested":{"value":"extra info"}}});
-
-    while let Ok(Some(msg)) = messages.next().with_timeout(TEST_TIMEOUT_MS).await {


So the sporadic test failure is fixed by replacing this block with mqtt_tests::assert_received_all_expected? It looks like all the other tests have this replacement.

I am just curious, because places other than the alarm tests also use this while let block for testing.

Yup, using mqtt_tests::assert_received_all_expected fixed the issue. Before, I was trying to assert on the JSON objects and was relying on a timeout error when nothing was received. But a timeout in with_timeout does not make the test fail, and because of this, the tests were always passing.
I did check at least two other places where while let is used, and they looked okay.
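
Why the old loop could never fail on a missing message can be shown with a small, self-contained sketch. It uses a `std::sync::mpsc` channel instead of MQTT, and `check_inside_loop` and `all_expected_received` are hypothetical helpers; the second one only mimics the idea behind assert_received_all_expected and is not the mqtt_tests implementation.

```rust
use std::sync::mpsc::Receiver;
use std::time::Duration;

const TIMEOUT: Duration = Duration::from_millis(50);

/// Old pattern: the assertion lives inside the loop body, so it only runs
/// for messages that actually arrive. A timeout simply ends the loop.
fn check_inside_loop(rx: &Receiver<String>, expected: &str) -> usize {
    let mut checked = 0;
    while let Ok(msg) = rx.recv_timeout(TIMEOUT) {
        assert_eq!(msg, expected);
        checked += 1;
    }
    checked // 0 means nothing was verified, yet no assertion failed
}

/// Stricter pattern: every expected message must arrive, in order,
/// before the timeout; a missing message is reported as a failure.
fn all_expected_received(rx: &Receiver<String>, expected: &[&str]) -> bool {
    for want in expected {
        match rx.recv_timeout(TIMEOUT) {
            Ok(msg) if msg == *want => continue,
            _ => return false,
        }
    }
    true
}
```

When nothing is published, `check_inside_loop` returns without ever running its assertion, whereas `all_expected_received` reports the missing message.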

github-actions · 2023-02-28T16:57:44Z

Robot Results

✅ Passed	❌ Failed	⏭️ Skipped	Total	Pass %
137	0	5	137	100

Passed Tests

Name	⏱️ Duration	Suite
Define Child device 1 ID	0.005 s	`C8Y Child Alarms Rpi`
Normal case when the child device does not exist on c8y cloud	1.655 s	`C8Y Child Alarms Rpi`
Normal case when the child device already exists	0.73 s	`C8Y Child Alarms Rpi`
Reconciliation when the new alarm message arrives, restart the mapper	1.8820000000000001 s	`C8Y Child Alarms Rpi`
Reconciliation when the alarm that is cleared	5.484 s	`C8Y Child Alarms Rpi`
Prerequisite Parent	18.434 s	`Child Conf Mgmt Plugin`
Prerequisite Child	0.369 s	`Child Conf Mgmt Plugin`
Child device bootstrapping	13.758 s	`Child Conf Mgmt Plugin`
Snapshot from device	23.288 s	`Child Conf Mgmt Plugin`
Child device config update	16.303 s	`Child Conf Mgmt Plugin`
Configuration types should be detected on file change (without restarting service)	45.992 s	`Inotify Crate`
Child devices support sending simple measurements	45.258 s	`Child Device Telemetry`
Child devices support sending custom measurements	47.257 s	`Child Device Telemetry`
Child devices support sending custom events	41.641 s	`Child Device Telemetry`
Child devices support sending custom events overriding the type	34.64 s	`Child Device Telemetry`
Child devices support sending custom alarms #1699	32.949 s	`Child Device Telemetry`
Child devices support sending inventory data via c8y topic	23.435 s	`Child Device Telemetry`
Main device support sending inventory data via c8y topic	21.586 s	`Child Device Telemetry`
Successful firmware operation	59.382 s	`Firmware Operation`
Install with empty firmware name	53.464 s	`Firmware Operation`
Supports restarting the device	73.997 s	`Restart Device`
Update tedge version from previous using Cumulocity	105.273 s	`Tedge Self Update`
Successful shell command with output	3.755 s	`Shell Operation`
Check Successful shell command with literal double quotes output	3.171 s	`Shell Operation`
Execute multiline shell command	3.006 s	`Shell Operation`
Failed shell command	3.018 s	`Shell Operation`
Software list should be populated during startup	51.834 s	`Software`
Install software via Cumulocity	66.688 s	`Software`
Software list should only show currently installed software and not candidates	42.299 s	`Software`
Stop tedge-agent service	0.254 s	`Log Path Config`
Customize the log path	0.109 s	`Log Path Config`
Initialize tedge-agent	0.15 s	`Log Path Config`
Check created folders	0.107 s	`Log Path Config`
Remove created custom folders	0.098 s	`Log Path Config`
Install latest via script (from current branch)	28.404 s	`Install Tedge`
Install specific version via script (from current branch)	15.576 s	`Install Tedge`
Install latest tedge via script (from main branch)	24.422 s	`Install Tedge`
Support starting and stopping services	37.792 s	`Service-Control`
Supports a reconnect	47.017 s	`Test-Commands`
Supports disconnect then connect	47.801 s	`Test-Commands`
Update unknown setting	27.375 s	`Test-Commands`
Update known setting	23.384 s	`Test-Commands`
Stop c8y-configuration-plugin	0.115 s	`Health C8Y-Configuration-Plugin`
Update the service file	0.128 s	`Health C8Y-Configuration-Plugin`
Reload systemd files	0.715 s	`Health C8Y-Configuration-Plugin`
Start c8y-configuration-plugin	0.132 s	`Health C8Y-Configuration-Plugin`
Start watchdog service	10.389 s	`Health C8Y-Configuration-Plugin`
Check PID of c8y-configuration-plugin	0.105 s	`Health C8Y-Configuration-Plugin`
Kill the PID	0.151 s	`Health C8Y-Configuration-Plugin`
Recheck PID of c8y-configuration-plugin	2.349 s	`Health C8Y-Configuration-Plugin`
Compare PID change	0.001 s	`Health C8Y-Configuration-Plugin`
Stop watchdog service	0.274 s	`Health C8Y-Configuration-Plugin`
Remove entry from service file	0.178 s	`Health C8Y-Configuration-Plugin`
Stop c8y-log-plugin	0.195 s	`Health C8Y-Log-Plugin`
Update the service file	0.243 s	`Health C8Y-Log-Plugin`
Reload systemd files	0.899 s	`Health C8Y-Log-Plugin`
Start c8y-log-plugin	0.311 s	`Health C8Y-Log-Plugin`
Start watchdog service	10.364 s	`Health C8Y-Log-Plugin`
Check PID of c8y-log-plugin	0.117 s	`Health C8Y-Log-Plugin`
Kill the PID	0.084 s	`Health C8Y-Log-Plugin`
Recheck PID of c8y-log-plugin	2.155 s	`Health C8Y-Log-Plugin`
Compare PID change	0.001 s	`Health C8Y-Log-Plugin`
Stop watchdog service	0.122 s	`Health C8Y-Log-Plugin`
Remove entry from service file	0.118 s	`Health C8Y-Log-Plugin`
Stop tedge-mapper	0.224 s	`Health Tedge Mapper C8Y`
Update the service file	0.245 s	`Health Tedge Mapper C8Y`
Reload systemd files	0.894 s	`Health Tedge Mapper C8Y`
Start tedge-mapper	0.249 s	`Health Tedge Mapper C8Y`
Start watchdog service	10.35 s	`Health Tedge Mapper C8Y`
Check PID of tedge-mapper	0.098 s	`Health Tedge Mapper C8Y`
Kill the PID	0.093 s	`Health Tedge Mapper C8Y`
Recheck PID of tedge-mapper	2.167 s	`Health Tedge Mapper C8Y`
Compare PID change	0.002 s	`Health Tedge Mapper C8Y`
Stop watchdog service	0.072 s	`Health Tedge Mapper C8Y`
Remove entry from service file	0.058 s	`Health Tedge Mapper C8Y`
Stop tedge-agent	0.22 s	`Health Tedge-Agent`
Update the service file	0.115 s	`Health Tedge-Agent`
Reload systemd files	0.688 s	`Health Tedge-Agent`
Start tedge-agent	0.142 s	`Health Tedge-Agent`
Start watchdog service	10.363 s	`Health Tedge-Agent`
Check PID of tedge-mapper	0.046 s	`Health Tedge-Agent`
Kill the PID	0.066 s	`Health Tedge-Agent`
Recheck PID of tedge-agent	2.259 s	`Health Tedge-Agent`
Compare PID change	0.002 s	`Health Tedge-Agent`
Stop watchdog service	0.276 s	`Health Tedge-Agent`
Remove entry from service file	0.127 s	`Health Tedge-Agent`
Stop tedge-mapper-az	0.19 s	`Health Tedge-Mapper-Az`
Update the service file	0.223 s	`Health Tedge-Mapper-Az`
Reload systemd files	0.716 s	`Health Tedge-Mapper-Az`
Start tedge-mapper-az	0.158 s	`Health Tedge-Mapper-Az`
Start watchdog service	10.233 s	`Health Tedge-Mapper-Az`
Check PID of tedge-mapper-az	0.064 s	`Health Tedge-Mapper-Az`
Kill the PID	0.158 s	`Health Tedge-Mapper-Az`
Recheck PID of tedge-agent	2.209 s	`Health Tedge-Mapper-Az`
Compare PID change	0.001 s	`Health Tedge-Mapper-Az`
Stop watchdog service	0.212 s	`Health Tedge-Mapper-Az`
Remove entry from service file	0.189 s	`Health Tedge-Mapper-Az`
Stop tedge-mapper-collectd	0.179 s	`Health Tedge-Mapper-Collectd`
Update the service file	0.177 s	`Health Tedge-Mapper-Collectd`
Reload systemd files	0.766 s	`Health Tedge-Mapper-Collectd`
Start tedge-mapper-collectd	0.175 s	`Health Tedge-Mapper-Collectd`
Start watchdog service	10.236 s	`Health Tedge-Mapper-Collectd`
Check PID of tedge-mapper-collectd	0.105 s	`Health Tedge-Mapper-Collectd`
Kill the PID	0.164 s	`Health Tedge-Mapper-Collectd`
Recheck PID of tedge-mapper-collectd	2.431 s	`Health Tedge-Mapper-Collectd`
Compare PID change	0.001 s	`Health Tedge-Mapper-Collectd`
Stop watchdog service	0.173 s	`Health Tedge-Mapper-Collectd`
Remove entry from service file	0.164 s	`Health Tedge-Mapper-Collectd`
c8y-log-plugin health status	5.633 s	`MQTT health endpoints`
c8y-configuration-plugin health status	5.606 s	`MQTT health endpoints`
Wrong package name	0.173 s	`Improve Tedge Apt Plugin Error Messages`
Wrong version	0.128 s	`Improve Tedge Apt Plugin Error Messages`
Wrong type	0.317 s	`Improve Tedge Apt Plugin Error Messages`
tedge_connect_test_positive	0.669 s	`Tedge Connect Test`
tedge_connect_test_negative	1.268 s	`Tedge Connect Test`
tedge_connect_test_sm_services	6.633 s	`Tedge Connect Test`
tedge_disconnect_test_sm_services	0.395 s	`Tedge Connect Test`
Install thin-edge.io	11.539 s	`Call Tedge`
call tedge -V	0.149 s	`Call Tedge`
call tedge -h	0.145 s	`Call Tedge`
call tedge -h -V	0.154 s	`Call Tedge`
call tedge help	0.091 s	`Call Tedge`
tedge config list	0.149 s	`Call Tedge Config List`
tedge config list --all	0.12 s	`Call Tedge Config List`
set/unset device.type	0.817 s	`Call Tedge Config List`
set/unset device.key.path	0.587 s	`Call Tedge Config List`
set/unset device.cert.path	0.502 s	`Call Tedge Config List`
set/unset c8y.root.cert.path	0.419 s	`Call Tedge Config List`
set/unset c8y.smartrest.templates	0.367 s	`Call Tedge Config List`
set/unset az.root.cert.path	0.421 s	`Call Tedge Config List`
set/unset az.mapper.timestamp	0.742 s	`Call Tedge Config List`
set/unset mqtt.bind_address	0.522 s	`Call Tedge Config List`
set/unset mqtt.port	0.588 s	`Call Tedge Config List`
set/unset tmp.path	0.335 s	`Call Tedge Config List`
set/unset logs.path	0.245 s	`Call Tedge Config List`
set/unset run.path	0.245 s	`Call Tedge Config List`
Get Put Delete	4.006 s	`Http File Transfer Api`


PradeepKiruvale requested review from albinsuresh and rina23q February 27, 2023 14:25

PradeepKiruvale temporarily deployed to Test Pull Request February 27, 2023 14:32 — with GitHub Actions Inactive

PradeepKiruvale temporarily deployed to Test Pull Request February 27, 2023 18:06 — with GitHub Actions Inactive

PradeepKiruvale temporarily deployed to Test Pull Request February 27, 2023 18:08 — with GitHub Actions Inactive

didier-wenzek reviewed Feb 27, 2023

PradeepKiruvale temporarily deployed to Test Pull Request February 27, 2023 21:21 — with GitHub Actions Inactive

Fix the flaky alarms tests

0234c42

Signed-off-by: Pradeep Kumar K J <pradeepkumar.kj@softwareag.com>

albinsuresh reviewed Feb 28, 2023

PradeepKiruvale force-pushed the flacky-alarm-tet branch from 8ffdae7 to 0234c42 Compare February 28, 2023 13:20

PradeepKiruvale temporarily deployed to Test Pull Request February 28, 2023 13:27 — with GitHub Actions Inactive

rina23q reviewed Feb 28, 2023

albinsuresh approved these changes Mar 1, 2023

PradeepKiruvale merged commit f95be3a into thin-edge:main Mar 2, 2023
