-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Adapt monitoring of daemons to new Service Monitoring Feature in 10.14 #1728
Adapt monitoring of daemons to new Service Monitoring Feature in 10.14 #1728
Conversation
6879e40
to
b50cdfd
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The key points of the feature are here, but the code must be re-organized.
let topic = message.topic.name.to_owned(); | ||
let payload: HashMap<String, Value> = serde_json::from_str(message.payload_str()?)?; | ||
|
||
if payload.len() > 1 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if the len is 0?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The service monitor
message will not be created. This can happen in the case of, a bridge health status message. [tedge/health/mosquitto-c8y-bridge] 1/0.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. An option could be to translate 1/0 as up/down. However not sure that this helpful as the point is to transfer this state over the bridge. @reubenmiller what do you think about this case?
0846397
to
22b3cb5
Compare
Robot Results
Passed Tests
|
// If not Bridge health status | ||
if !service_name.contains("mosquitto-c8y-bridge") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This check is too specific: there are also the az
and aws
bridge.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, it's for all the bridges. I changed to check for all the bridges.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. An option could be to translate 1/0 as up/down. However not sure that this helpful as the point is to transfer this state over the bridge. @reubenmiller what do you think about this case?
Yes I would love to have a consistent health payload if possible, e.g. using up/down
instead of 1/0
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. An option could be to translate 1/0 as up/down. However not sure that this helpful as the point is to transfer this state over the bridge. @reubenmiller what do you think about this case?
Yes I would love to have a consistent health payload if possible, e.g. using
up/down
instead of1/0
.
I think he meant that when we send it to c8y cloud. But do we really need the bridge status to be sent to the c8y cloud?. Because when the bridge goes down, the status can't be sent to the c8y cloud.
let topic = message.topic.name.to_owned(); | ||
let payload: HashMap<String, Value> = serde_json::from_str(message.payload_str()?)?; | ||
|
||
if payload.len() > 1 { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point. An option could be to translate 1/0 as up/down. However not sure that this helpful as the point is to transfer this state over the bridge. @reubenmiller what do you think about this case?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The failing test has been fixed by #1730.
So I'm okay to merge this PR. Can you remove the useless check on the message size, though? Thanks.
let payload_str = message | ||
.payload_str() | ||
.unwrap_or(r#""type":"thin-edge.io","status":"down""#); | ||
|
||
let health_status = serde_json::from_str(payload_str).unwrap_or_else(|_| HealthStatus { | ||
service_type: "unknown".to_string(), | ||
status: "down".to_string(), | ||
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid having to provide twice the default values (and involuntary introducing a difference), one can chain function calls.
let payload_str = message | |
.payload_str() | |
.unwrap_or(r#""type":"thin-edge.io","status":"down""#); | |
let health_status = serde_json::from_str(payload_str).unwrap_or_else(|_| HealthStatus { | |
service_type: "unknown".to_string(), | |
status: "down".to_string(), | |
}); | |
let health_status = message | |
.payload_str() | |
.and_then(serde_json::from_str) | |
.unwrap_or_else(|_| HealthStatus { | |
service_type: "unknown".to_string(), | |
status: "down".to_string(), | |
}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay to not have chaining. However: fix the default values. This must be the same line 61 and 64.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
a743ade
to
4af8648
Compare
@@ -1776,6 +1776,62 @@ async fn mapper_updating_the_inventory_fragments_from_file() { | |||
sm_mapper.abort(); | |||
} | |||
|
|||
#[tokio::test(flavor = "multi_thread", worker_threads = 2)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does that test really require 2 threads?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
); | ||
} | ||
|
||
#[tokio::test(flavor = "multi_thread", worker_threads = 2)] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does that test really require 2 threads?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
Below table gives more information about the `device type`, `thin-edge` topic, `health status` message format | ||
and mapping to the `cumulocity` cloud message. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not obvious for a user to guess what he has to do. I would focus on the tegde/health
topics telling the user when his software/service/child-device has to send a health status and how. In a second step I would tell what is done by thin-edge or more precisely the c8y mapper and how this can be observed.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Updated the doc.
|
||
## Send the health status of a service to `tedge/health` topic. | ||
|
||
Below table lists the MQTT topic to which the health status message should be sent, and the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Below table lists the MQTT topic to which the health status message should be sent, and the | |
The table below lists the MQTT topics to which the health status message should be sent, and the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
The above message says that the `tedge-mapper-c8y` is `up` and the `type` of the service is `thin-edge.io`. | ||
|
||
|
||
For example, to monitor the health of a `docker` service that is running on an `external-sensor` child device from cumulocity, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need to repeat "For example" as you are still in the example from the previous statement.
Also we don't have to mention that the child device is connected to c8y.
For example, to monitor the health of a `docker` service that is running on an `external-sensor` child device from cumulocity, | |
To monitor the health of a `docker` service that is running on an `external-sensor` child device, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
> Note: The `status` here can be `up or down` or any other string. For example, `unknown`. | ||
|
||
For example, to monitor the health status of a `tedge-mapper-c8y service` that is running on a `thin-edge.io` device | ||
from cumulocity, one has to send the below message. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We don't need to mention the Cumulocity bit as it is assumed already.
from cumulocity, one has to send the below message. | |
one has to send the below message. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
The `tedge-mapper-c8y` will translate the `health status` message that is received on `tedge/health/#` | ||
topic to `Cumulocity` specific `service monitor` message and sends it to `Cumulocity` cloud. | ||
|
||
The table below gives more information about the `cumulocity topic`, and the `translated service monitor` message for both `thin-edge` as well as for `child` device. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You don't need to to backticks for the Cumulocity IoT product name.
The table below gives more information about the `cumulocity topic`, and the `translated service monitor` message for both `thin-edge` as well as for `child` device. | |
The table below gives more information about the **Cumulocity IoT** topic and the translated service monitor message for both `thin-edge` as well as for `child` device. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
# How to monitor health of service from Cumulocity IoT | ||
|
||
The health of a `thin-edge.io` service or any other `service` that is running on the `thin-edge.io` device | ||
or on the `child` device can be monitored from the Cumulocity by sending the `health-status` message to `Cumulocity IoT`. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
General change (in multiple places), we should used the markdown **Cumulocity IoT**
when referring to Cumulocity (e.g. in bold, and always using a capital letter for Cumulocity)
or on the `child` device can be monitored from the Cumulocity by sending the `health-status` message to `Cumulocity IoT`. | |
or on the `child` device can be monitored from the **Cumulocity IoT** by sending the `health-status` message to **Cumulocity IoT**. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
from cumulocity, one has to send the below message. | ||
|
||
``` | ||
tedge mqtt pub tedge/health/tedge-mapper-c8y `{"status":"up","type":"thin-edge.io"}` -q 2 -r |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might make more sense to use the generic service
value for the type
field (also in the second example).
tedge mqtt pub tedge/health/tedge-mapper-c8y `{"status":"up","type":"thin-edge.io"}` -q 2 -r | |
tedge mqtt pub tedge/health/tedge-mapper-c8y `{"status":"up","type":"service"}` -q 2 -r |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved
Signed-off-by: Pradeep Kumar K J <pradeepkumar.kj@softwareag.com>
ec82828
to
97cca1c
Compare
Proposed changes
These code changes support monitoring of the
thin-edge
device service as well as thechild
devices service on c8y.The
tedge-mapper-c8y
receives the health status from the thin-edge c8y services (tedge-agent, c8y-log-plugin, etc) or any child device service and translates into a smart rest (102) message, and sends it to c8y.tedge/health/<service-name>
topictedge/health/<child-id>/<service-name>
topicThe health status will be published to c8y only on state change.i.e on start (up) or stop(down).
Types of changes
Paste Link to the issue
#1353
Checklist
cargo fmt
as mentioned in CODING_GUIDELINEScargo clippy
as mentioned in CODING_GUIDELINESFurther comments