-
Notifications
You must be signed in to change notification settings - Fork 53
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Move c8y-mapper to new health topics #2276
Conversation
Robot Results
|
eb9bec8
to
d042c6f
Compare
Codecov Report
Additional details and impacted files
|
a194320
to
1c5a9ec
Compare
1c5a9ec
to
d8360ae
Compare
d8360ae
to
f6cf40f
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I still need some time to finalize this review.
/// | ||
/// It's most often in a format `device/DEVICE_NAME/service/SERVICE_NAME`. | ||
#[derive(Debug, Clone, Hash, PartialEq, Eq)] | ||
pub struct ServiceTopicId(EntityTopicId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not convinced by this type.
Why would we have a specific type to identify services but not devices?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also addressed it here.
We can have a specific type to identify devices as well, just by the scope of this PR I thought it was a good place to add one for a service. I think as we plan to support custom MQTT schemes, where any identifier can be either a device or a service, it makes sense to save this information by encoding it in a type. Granted, producers of these wrappers are currently not well defined, so maybe this is a premature abstraction, but ultimately, I think these ServiceTopicId
/DeviceTopicId
wrapper types could be produced by the entity store (that is, they can be derived from EntityType
which is a part of EntityMetadata
), or derived from a default topic scheme.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still not convinced with these abstractions because they assume things about the topic. An EntityTopicId
in itself doesn't describe whether it's a device or not. It's the registration message that dictates whether that topic maps to a device or service. So, we shouldn't overload the topic abstraction itself with that information.
And yes, we have default topic schemes for devices and services, but they're just specializations of EntityTopicId
and can easily be created with helpers like default_child_device
, default_main_service
etc. So, I'm really not sure if these additional types are required. There's already enough confusion around all the IDs like EntityTopicId
, EntityExternalId
etc. Why add more for very little value addition?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Responded in this comment, if you think this abstraction adds little value I can remove ServiceTopicId
and use EntityTopicId
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
As of now, I see indeed little value in ServiceTopicId
and DeviceTopicId
. However, this is not blocking.
What really add values is the struct Service
which combines the EntityTopicId
of a service with the EntityTopicId
of its parent device removing the need for guessing one from the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I finally understood what is worrying me. This struct ServiceTopicId
and even the struct Service
are halfway between EntityTopicId
and EntityMetadata
.
EntityTopicId
s are just identifiers of sources & targets of MQTT messages - with no assumption of the nature of these sources and targets.EntityMetadata
s describe these entities - notably providing their type and their parent if any.
So, if a feature needs to know the type of entity or its parent, then the implementation must be given an EntityMetadata
not a decorated EntityTopicId
.
Hence, now I do think that it would be better to not introduce neither ServiceTopicId
nor DeviceTopicId
and not even Service
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, if a feature needs to know the type of entity or its parent, then the implementation must be given an
EntityMetadata
not a decoratedEntityTopicId
.
Yeah, I agree that EntityMetadata
is probably the closest to what we eventually want to have. One problem with it though is that this type is designed for flat storage in a vector, i.e. parent
field being just a string key that one can look in a hashmap - this is far simpler than using references. So currently, if we want to see who is the parent in EntityMetadata
, we have to call .get()
again on the entity store.
The solution for this I think would be to have an Entity
struct, which is models the entity so as to make invalid states unrepresentable and contain any other relevant entities, e.g. a parent, so one wouldn't have to call entity store again. This is similar to .ancestors()
method, but instead of returning IDs, it could return objects with all necessary information.
For now though, as agreed, I won't be making any sweeping API changes and just do it with EntityMetadata
directly, in order to have full picture of the pain points before resolving them.
/// If in a default MQTT scheme, returns a device topic id of this service. | ||
pub fn to_device_topic_id(&self) -> Option<DeviceTopicId> { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We really should avoid these methods that assume a default topic scheme.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, this assumption of the default topic scheme, spread everywhere with no good way to indicate it (i.e. using a comment or a custom .expect()
message, instead of better utilising the type system) leads to some unreliable code, and I think implementing a solution to this should be a priority. I'll look at the current usages and make a ticket.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is why I see more values in the struct Service
which removes the need for this kind of guessing work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
See this comment #2276 (comment).
Now I do think that any use of a Service
should be replaced by an EntityMetadata
.
825c291
to
4c88266
Compare
/// | ||
/// It's most often in a format `device/DEVICE_NAME/service/SERVICE_NAME`. | ||
#[derive(Debug, Clone, Hash, PartialEq, Eq)] | ||
pub struct ServiceTopicId(EntityTopicId); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm still not convinced with these abstractions because they assume things about the topic. An EntityTopicId
in itself doesn't describe whether it's a device or not. It's the registration message that dictates whether that topic maps to a device or service. So, we shouldn't overload the topic abstraction itself with that information.
And yes, we have default topic schemes for devices and services, but they're just specializations of EntityTopicId
and can easily be created with helpers like default_child_device
, default_main_service
etc. So, I'm really not sure if these additional types are required. There's already enough confusion around all the IDs like EntityTopicId
, EntityExternalId
etc. Why add more for very little value addition?
93b1e24
to
f509718
Compare
3862236
to
3fbb470
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The tedge_to_te_converter
must stop converting tedge/health/#
messages.
See https://github.com/thin-edge/thin-edge.io/pull/2276/files#r1362176975
/// to set the status of). | ||
/// | ||
/// https://cumulocity.com/guides/reference/smartrest-two/#104 | ||
pub fn service_status_update_message(external_ids: &[String], service_status: &str) -> Message { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To avoid any confusion, as described in the comments above, and to keep this API consistent with the other APIs in this module like service_creation_message
, I'd just take the service_id
and ancestors
as separate parameters, the latter without the target service_id
in it. But while passing it to publish_topic_from_ancestors
, you'll have to append them.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like do do it properly later as part of a follow-up task.
Should be fixed in 87ac7f8. |
@@ -30,9 +30,6 @@ impl TedgetoTeConverter { | |||
} | |||
topic if topic.name.starts_with("tedge/events") => self.convert_event(message), | |||
topic if topic.name.starts_with("tedge/alarms") => self.convert_alarm(message), | |||
topic if topic.name.starts_with("tedge/health") => { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I understand that this was removed to avoid a certain bug discussed in an earlier comment. But, I'm wondering about the impact of this change on backward compatibility. I was of the impression that these conversions exist to make the transition of third-party component written against the old APIs smoother on our next release where they'll continue publishing to the older topics and we'll internally map them with this tedge_to_te_converter
. So, removing this conversion support for tedge/health
will break all third-party components publishing to the old health topics.
But, if tedge/health
was only an internal API in the earlier releases, meant only to be used by our own components and not by third-party components, then this is okay. I'm just not sure if it was just an internal API. @didier-wenzek could you confirm?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's true that we could only exclude a white-list of tedge daemons. However I don't think this API has been used by many as it was not documented beyond tedge daemons.
@@ -99,26 +99,6 @@ Check health status of tedge-mapper-c8y service on broker restart | |||
|
|||
Custom Test Teardown | |||
|
|||
Check health status of child device service |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why was this test removed and not just adapted to use the new health endpoints? I understand that this test probably was relying on the tedge_to_te_converter
earlier and that's why you removed. But, we could have just updated it to use the new te/.../health/status
endpoints directly, unless this is duplicate and a test for child device service health status already exists elsewhere.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I may have made a mistake with that one. We agreed with @reubenmiller that health status messages should not change entity properties, so it follows that they shouldn't also create entities by themselves either.
However I forgot about auto-registration mechanism, in which it is expected that an entity sending any message will be automatically registered in C8y, so the same goes for health status messages.
I'll try to restore and update the test.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approved. Thank you for this sustained effort.
Follow-up tasks:
- Introduce a new constructor for
HealthMonitorBuilder
from tedge-config to avoid repeating the same logic in all tedge daemons. - Add a user-configurable service topic id to all tedge daemons, instead of building it after the device id.
- Consider to deprecate
ServiceTopicId
,EntityTopicId
andService
, replacing their usage withEntityMetadata
or intriducing anEntity
type (see Move c8y-mapper to new health topics #2276 (comment)). - Consider to deprecate this idea of a service type at the level of thin-edge
Entity
. This should be related to the inventory i.e. data attached by the user to devices and services.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Approving, but still review my last comment on an existing test that was removed, which might reduce our test coverage. I'm just hoping that it was removed for genuine reasons.
Also, create a list of planned follow-up tasks in a comment, or even better as issues. Group trivial and related ones together and feel free to raise multiple issues, if some are going to be more time consuming than others, and unrelated to other changes.
87ac7f8
to
c2b618e
Compare
Created a list of follow-up tasks in this issue: #2274 |
This commit updates tedge-mapper-c8y to subscribe to new health status topics: ``` tedge/health/SERVICE_NAME -> te/device/main/service/SERVICE_NAME/status/health tedge/health/CHILD_ID/SERVICE_NAME -> te/device/CHILD_ID/service/SERVICE_NAME/status/health ``` Producers of health status topics (i.e. different thin-edge services) are not in the scope of this PR, however the tests for tedge-agent were failing because after it terminated, the topic converter could not convert a last will message to the new health status topic. As such, only tedge-agent was updated to publish its health status messages to a new health topic, introducing a new health topic API that will be used by other services, however for now they still publish to old health status topics, and these messages are converted by the tedge-agent's topic converter. Signed-off-by: Marcel Guzik <marcel.guzik@inetum.com>
A new `Service` type was created, containing topic ids of the service itself and the associated device. Functions which need to take an entity topic id of a service can use this type to ensure that the given argument is indeed a service, as well as to obtain the topic id of the associated device. Signed-off-by: Marcel Guzik <marcel.guzik@inetum.com>
Signed-off-by: Marcel Guzik <marcel.guzik@inetum.com>
Thank you |
Signed-off-by: Marcel Guzik <marcel.guzik@inetum.com>
Signed-off-by: Marcel Guzik <marcel.guzik@inetum.com>
Signed-off-by: Marcel Guzik <marcel.guzik@inetum.com>
c2b618e
to
2a8cccc
Compare
Proposed changes
This PR updates most of the services to use new health status topics:
Health check commands are not part of this PR and will be done in a follow-up PR.
tedge_to_te_converter
mapping oftedge/health
tote/+/+/+/+/health/status
is still active so e.g. cloud bridges can still publish messages to old topics but be converted to new ones.Types of changes
Paste Link to the issue
te/
#2274: This PR completes half of the scope of this ticket by moving c8y-mapper and other services to new health status messagete
MQTT topics. #2308: tedge-watchdog was updated to read health status messages from new topics, but still uses oldtedge/health-check
topic for health check commandsChecklist
cargo fmt
as mentioned in CODING_GUIDELINEScargo clippy
as mentioned in CODING_GUIDELINESFurther comments
Follow-up tasks