Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Logs SDK] Modify LogRecord to use Vector instead of OrderMap for attributes #1142

Merged
merged 4 commits into from
Jul 11, 2023

Conversation

lalitb
Copy link
Member

@lalitb lalitb commented Jul 5, 2023

Changes

This is to further validate the benchmark introduced in #1121, changing the LogRecord to store attributes in Vec instead of OrderMap/IndexMap. The benchmark was done to emit the tokio-tracing log to user-event exporter.
As an example, to emit the error log with below attributes:

error!(
        event_name = "my-event-name",
        event_id = 20,
        user_name = "otel user",
        user_email = "otel@opentelemetry.io",
        login_success =  false
)

With OrderMap:
It takes around ~1000 ns to emit a single event.
With Vector:
It takes around ~700 ns to emit a single event.

The user_events tracepoint was disabled, so no actual export was happening. Which means most of the above overhead was coming from logs SDK (and not the exporter).

The user_events export iterate over the attributes to serialize, and doesn't require to do any lookup on particular key.

These changes come at below cost:

  • If the number of attributes are large, and exporter need to do direct lookup for particular attribute - the operation would be fast in OrderMap as compared to Vector. I feel this use-case is somewhat rare.
  • The duplicate key detection need to be done by application/instrumentation library before writing attributes.

The PR is raised to further discuss if these changes would be good for log signal, or any better suggestion. Another option I thought was to have SDK support both data structures, and let user select which one to use. Something like this (not tested):

enum AttributeData<K, V> {
    HashStore(HashMap<K, V>),
    VectorStore(Vec<(K, V)>),
}

 
impl<K, V> AttributeData<K, V> {
    fn iterator(&self) -> Box<dyn Iterator<Item = &(K, V)>> {
        match self {
            AttributeData::HashStore(index_map) => {
                Box::new(index_map.iter().map(|(k, v)| &(k, v)))
            }
            AttributeData::VectorStore(vector) => Box::new(vector.iter()),
        }
    }

  fn insert(&mut self, key: K, value: V) {
        match self {
            AttributeData::HashStore(index_map) => {
                index_map.insert(key, value);
            }
            AttributeData::VectorStore(vector) => {
                vector.push((key, value));
            }
        }
    }
}

pub struct LogRecord {
    // existing fields ..

    /// Additional attributes associated with this record
    pub attributes: Option<AttributeData<Key, AnyValue>>,
}

Merge requirement checklist

  • CONTRIBUTING guidelines followed
  • Unit tests added/updated (if applicable)
  • Appropriate CHANGELOG.md files updated for non-trivial, user-facing changes
  • Changes in public API reviewed (if applicable)

@lalitb lalitb requested a review from a team as a code owner July 5, 2023 23:04
Copy link
Contributor

@djc djc left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why was an OrderMap chosen in the first place? Do we do any lookups/deduplication on this data?

@shaun-cox
Copy link
Contributor

Why was an OrderMap chosen in the first place? Do we do any lookups/deduplication on this data?

I think it was a result of #794

@lalitb
Copy link
Member Author

lalitb commented Jul 6, 2023

Do we do any lookups/deduplication on this data?

There are no lookups in the log SDK. The lookups could be part of any external exporters, but there shouldn't be significant difference in complexity - o(n) vs o(1) - for smaller number of attributes.
With this change, the deduplication would now be the responsibility of the instrumented library/application. The application needs to ensure that all the attributes are unique.

@codecov
Copy link

codecov bot commented Jul 10, 2023

Codecov Report

Patch coverage: 28.5% and project coverage change: -0.6 ⚠️

Comparison is base (3209577) 49.8% compared to head (6093b5b) 49.3%.

❗ Current head 6093b5b differs from pull request most recent head 5554e19. Consider uploading reports for the commit 5554e19 to get more accurate results

Additional details and impacted files
@@           Coverage Diff           @@
##            main   #1142     +/-   ##
=======================================
- Coverage   49.8%   49.3%   -0.6%     
=======================================
  Files        171     175      +4     
  Lines      20171   20464    +293     
=======================================
+ Hits       10061   10101     +40     
- Misses     10110   10363    +253     
Impacted Files Coverage Δ
opentelemetry-api/src/global/trace.rs 30.6% <0.0%> (+2.6%) ⬆️
opentelemetry-api/src/logs/record.rs 0.0% <0.0%> (ø)
opentelemetry-api/src/trace/noop.rs 54.8% <0.0%> (+2.5%) ⬆️
opentelemetry-api/src/trace/tracer_provider.rs 42.1% <0.0%> (-57.9%) ⬇️
opentelemetry-appender-tracing/src/layer.rs 0.0% <0.0%> (ø)
...elemetry-contrib/src/trace/exporter/jaeger_json.rs 0.0% <ø> (ø)
opentelemetry-jaeger/src/exporter/config/agent.rs 32.6% <0.0%> (ø)
opentelemetry-jaeger/src/exporter/mod.rs 57.5% <ø> (ø)
opentelemetry-jaeger/src/exporter/uploader.rs 18.1% <ø> (ø)
opentelemetry-otlp/src/metric.rs 0.0% <0.0%> (ø)
... and 9 more

... and 12 files with indirect coverage changes

☔ View full report in Codecov by Sentry.
📢 Do you have feedback about the report comment? Let us know in this issue.

@lalitb lalitb changed the title To discuss - [Logs SDK] Modify LogRecord to use Vector instead of OrderMap for attributes [Logs SDK] Modify LogRecord to use Vector instead of OrderMap for attributes Jul 10, 2023
@TommyCpp TommyCpp merged commit 1f1a4fe into open-telemetry:main Jul 11, 2023
10 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants