Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

encoding/: Adopt serde style encoding #105

Merged
merged 1 commit into from
Dec 10, 2022

Conversation

mxinden
Copy link
Member

@mxinden mxinden commented Oct 16, 2022

Adopt encoding style similar to serde. EncodeXXX for a type that can be
encoded (see serde's Serialize) and XXXEncoder for a supported encoding i.e.
text and protobuf (see serde's Serializer).

  • Compatible with Rust's additive features. Enabling the protobuf feature does
    not change type signatures.
  • EncodeMetric is trait object safe, and thus Registry can use dynamic dispatch.
  • Implement a single encoding trait EncodeMetric per metric type instead of
    one implementation per metric AND per encoding format.
  • Leverage std::fmt::Write instead of std::io::Write. The OpenMetrics text
    format has to be unicode compliant. The OpenMetrics Protobuf format requires
    labels to be valid strings.

Builds on top of #100.

Status: It compiles. Still a couple of rough edges. Tests don't yet compile.

Copy link
Contributor

@thomaseizinger thomaseizinger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Did an initial, high-level pass!

This looks like a good step forward, I still need to dig into the details :)
Thanks for giving this a shot!

src/encoding.rs Outdated
#[cfg(feature = "protobuf")]
pub mod proto;
pub mod text;

// TODO: Rename to MetricEncode to be consistent with MetricEncoder.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Everything we do are metrics in here, yes? Can we drop the Metric prefix?

Protobuf(proto::LabelKeyEncoder<'a>),
}

impl<'a> From<text::LabelKeyEncoder<'a>> for LabelKeyEncoder<'a> {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd expect that these are only used within this crate (and not exported publicly) so I am not sure we need these impls.

let (key, value) = self;

let mut label_key_encoder = encoder.encode_label_key();
key.encode(&mut label_key_encoder)?;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In serde, I believe this is flipped: The serializer has functions for writing certain types, like Serializer::serialize_str: https://docs.rs/serde/latest/serde/trait.Serializer.html#tymethod.serialize_str

Would it make sense to do the same thing here? Our "data format" are metrics so I think it would make sense if an encoder has a function "write_label" and you can pass a string to it.

I still need to dig into this in detail so excuse if it doesn't make any sense!

) -> Result<(), std::fmt::Error>
where
S: EncodeLabelSet,
CounterValue: EncodeValue,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Instead of dispatching to the value here, we could have one function on MetricEncoder per type and we'd probably just need to macro_rules the counter definition for each supported value type.

Not sure if worth it but I think it would reduce some indirection in dispatching between all these things.

@thomaseizinger
Copy link
Contributor

I had a closer look at this and I found the the decomposition of the functions on MetricEncoder a bit confusing to understand. What do you think about an approach where we provide more concrete functions on MetricEncoder like 120cfb0?

I think this could drastically simplify, how often we need to dispatch and e.g. also solves the issue that the protobuf encoding does not know about suffixes. On the flip side, this will decrease the flexibility a bit because users cannot make arbitrary suffixes anymore. This feature is a bit of an illusion anyway because it only works with the text encoding.

@mxinden
Copy link
Member Author

mxinden commented Nov 2, 2022

I had a closer look at this and I found the the decomposition of the functions on MetricEncoder a bit confusing to understand. What do you think about an approach where we provide more concrete functions on MetricEncoder like 120cfb0?

Thanks for looking into this!

I think that makes sense. Would you agree with me suggesting that we can do this as a follow up @thomaseizinger? I would like this pull request to make the architectural change only. Breaking individual methods can happen in a one-by-one case later on, ideally in a deprecate-then-remove style in case users already depend on it.

@thomaseizinger
Copy link
Contributor

I had a closer look at this and I found the the decomposition of the functions on MetricEncoder a bit confusing to understand. What do you think about an approach where we provide more concrete functions on MetricEncoder like 120cfb0?

Thanks for looking into this!

I think that makes sense. Would you agree with me suggesting that we can do this as a follow up @thomaseizinger? I would like this pull request to make the architectural change only. Breaking individual methods can happen in a one-by-one case later on, ideally in a deprecate-then-remove style in case users already depend on it.

I suggested it because I think it would really simplify the implementation and thus ease the review of this PR.

I get the desire to keep the PR small. Unfortunately, we have to add a lot of code to retain some of the interfaces. I think this PR could be smaller if we just break those.

@mxinden
Copy link
Member Author

mxinden commented Nov 16, 2022

I had a closer look at this and I found the the decomposition of the functions on MetricEncoder a bit confusing to understand. What do you think about an approach where we provide more concrete functions on MetricEncoder like 120cfb0?

Gave this more thought and tested it out. See e030f76. As you predicted, this simplifies things a lot.

As always, thanks @thomaseizinger for the great suggestion.

There are still a couple of TODOs, but I see light at the end of the tunnel.

@thomaseizinger
Copy link
Contributor

I had a closer look at this and I found the the decomposition of the functions on MetricEncoder a bit confusing to understand. What do you think about an approach where we provide more concrete functions on MetricEncoder like 120cfb0?

Gave this more thought and tested it out. See e030f76. As you predicted, this simplifies things a lot.

That is great to hear! I think we can do even better though. Let me know what you think of thomaseizinger@d2ad9df.

@mxinden
Copy link
Member Author

mxinden commented Nov 18, 2022

I had a closer look at this and I found the the decomposition of the functions on MetricEncoder a bit confusing to understand. What do you think about an approach where we provide more concrete functions on MetricEncoder like 120cfb0?

Gave this more thought and tested it out. See e030f76. As you predicted, this simplifies things a lot.

That is great to hear! I think we can do even better though. Let me know what you think of thomaseizinger@d2ad9df.

Wonderful simplification. Thanks for proposing a patch. Only issue I see is that it makes certain invalid states possible at compile time. See thomaseizinger@d2ad9df#r90329584.

@thomaseizinger
Copy link
Contributor

I sent a PR here: mxinden#1

Adopt encoding style similar to serde. `EncodeXXX` for a type that can be
encoded (see serde's `Serialize`) and `XXXEncoder` for a supported encoding i.e.
text and protobuf (see serde's `Serializer`).

- Compatible with Rust's additive features. Enabling the `protobuf` feature does
not change type signatures.
- `EncodeMetric` is trait object safe, and thus `Registry` can use dynamic dispatch.
- Implement a single encoding trait `EncodeMetric` per metric type instead of
one implementation per metric AND per encoding format.
- Leverage `std::fmt::Write` instead of `std::io::Write`. The OpenMetrics text
format has to be unicode compliant. The OpenMetrics Protobuf format requires
labels to be valid strings.

Signed-off-by: Max Inden <mail@max-inden.de>
@howardjohn
Copy link
Contributor

Is there a reason this is adopting a serde style rather than directly using Serde (#97)? The motivation behind that would be that serde already has a large ecosystem behind it, with plenty of helpers to construct Serialize implementations like #[flatten], etc

@thomaseizinger
Copy link
Contributor

Not every type that implements Serialize is a valid metric.

Maybe we could use serde internally but even that is tricky because protobuf doesn't map cleanly to serde so I am struggling to see the benefit!

How do you expect the ecosystem support of serde to benefit this prometheus client implementation?

@howardjohn
Copy link
Contributor

That makes sense, thanks for the explanation. My main interest was from more advanced derive support; in particular I want #[serde(flatten)] for now, but possibly other customizations in the future. While these could be solved by alternative mechanisms, it seemed nice to just piggy-back on Serde

@08d2
Copy link

08d2 commented Dec 6, 2022

Each Prometheus metric encoding would map to a unique Serde data format, all distinct from e.g. raw Protobufs or whatever. Only a well-defined set of encodings (data formats) would be acceptable representations of metrics. One couldn't derive an arbitrary data format for a type and expect it to work. (If this was already understood, apologies!)

@thomaseizinger
Copy link
Contributor

That makes sense, thanks for the explanation. My main interest was from more advanced derive support; in particular I want #[serde(flatten)] for now, but possibly other customizations in the future. While these could be solved by alternative mechanisms, it seemed nice to just piggy-back on Serde

Can you please explain further what you are trying to achieve? Like, post the type you are trying to write (with serde annotations) and the corresponding prometheus text encoding that you are expecting.

@howardjohn
Copy link
Contributor

howardjohn commented Dec 6, 2022 via email

@thomaseizinger
Copy link
Contributor

Okay, thanks for sharing this. It makes sense that we support something like this but we can't do it via serde, simply because there is no way of how we can possible map all types that are Serialize to a valid prometheus text encoding.

With the features introduced in this PR, it will be possible to implement a #[labels(flatten)] attribute in the custom derive. Until then, you can workaround it by making a constructor that accepts Common and flattens out the fields manually. That is only a few more LoC and provides roughly the same convenience upon constructing the label set.

@howardjohn
Copy link
Contributor

In real world we have about 10 metrics with 15 common labels so it ends up being a bit of duplication. Anyhow, I don't want to derail this PR. Thanks for the help and explaination, I'll open an issue to track adding a flatten attribute.

@mxinden
Copy link
Member Author

mxinden commented Dec 10, 2022

With the features introduced in this PR, it will be possible to implement a #[labels(flatten)] attribute in the custom derive.

That would be my preference as well.

@howardjohn let's discuss how to proceed once you created the issue.

@mxinden
Copy link
Member Author

mxinden commented Dec 10, 2022

I will merge here. I tested this in conjunction with #82 on https://github.com/mxinden/kademlia-exporter/. While #82 still needs some work, this is ready to go. I suggest we cut an alpha in between to allow folks to test this patch and the new protobuf encoding from @ackintosh.

@howardjohn
Copy link
Contributor

I opened #117 to discuss further. Thanks everyone. I may be able to submit a PR myself if I can find time + figure it out

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants