Conversation

marciw
Contributor

@marciw marciw commented Sep 30, 2025

Part of https://github.com/elastic/docs-team/issues/31?issue=elastic%7Cdocs-team%7C41

Status

🟢 Ready for PM/engineer review
🚧 Not ready for tech writer review

❗ Note for reviewers: We're going for "MVP" docs for now and tracking additional improvements in #3179

Changes

  • Revised overview: simplified, clarified
  • Revised setup: removed component templates, simplified
  • New advanced section (reindex, advanced concepts)

TODO:

  • Reconcile with recent changes to general data stream docs
  • Check docs patterns/style/etc.

@kkrik-es kkrik-es requested a review from felixbarny October 1, 2025 06:54
@marciw marciw changed the title from "[WIP] Remaining TSDS edits" to "Edit time series docs for clarity" Oct 5, 2025
@marciw marciw marked this pull request as ready for review October 5, 2025 22:12
@marciw marciw requested review from a team as code owners October 5, 2025 22:12
@marciw
Contributor Author

marciw commented Oct 5, 2025

@kkrik-es @gmarouli Thanks for all your comments! I think I've resolved everything but please take another quick look.

manage-data/data-store/data-streams/time-series-data-stream-tsds.md
manage-data/data-store/data-streams/set-up-tsds.md
manage-data/data-store/data-streams/time-bound-tsds.md

(or any other topics in the section, but those 3 are the main ones)

@marciw marciw requested review from kkrik-es and gmarouli October 5, 2025 22:44
Metrics differ from dimensions in that while dimensions generally remain constant, metrics are expected to change over time, even if rarely or slowly.
:::{tip}
Metrics are expected to change (even if rarely or slowly), while dimensions generally remain constant.
:::
Contributor

You can probably remove this note. Dimension values may also change, e.g. as nodes join and leave a cloud deployment.

#### `_tsid` metadata field [tsid]

[Pass-through](elasticsearch://reference/elasticsearch/mapping-reference/passthrough.md#passthrough-dimensions) fields may be configured as dimension containers. In this case, their sub-fields get included to the routing path automatically.
The `_tsid` is an automatically generated object containing the document’s dimensions. It's intended for internal {{es}} use, so in most cases you won't need to work with it. The format of the `_tsid` field is subject to change.
Contributor

Suggested change
The `_tsid` is an automatically generated object containing the document’s dimensions. It's intended for internal {{es}} use, so in most cases you won't need to work with it. The format of the `_tsid` field is subject to change.
The `_tsid` is an automatically generated object derived from the document’s dimensions. It's intended for internal {{es}} use, so in most cases you won't need to work with it. The format of the `_tsid` field is subject to change.

Contributor

"containing" is not accurate (it used to contain parts of the dimension values, but not any more). It's calculated using all dimension values per doc.

navigation_title: "Querying"
products:
- id: elasticsearch
---
Contributor

We can probably include a reference to the TS command here as tech preview.

Contributor

@kkrik-es kkrik-es left a comment

Amazing, thanks!


A TSDS document is uniquely identified by its time series and timestamp, both of which are used to generate the document `_id`. So, two documents with the same dimensions and the same timestamp are considered to be duplicates. When you use the `_bulk` endpoint to add documents to a TSDS, a second document with the same timestamp and dimensions overwrites the first. When you use the `PUT /<target>/_create/<_id>` format to add an individual document and a document with the same `_id` already exists, an error is generated.
:::{tip}
{{es}} uses dimensions and timestamps to generate time series document `_id` values. Two documents with the same dimensions and timestamp are considered duplicates.
Member

Maybe this is a bit too much detail. But 409s (either due to actual duplicates or due to misconfiguration) are a common issue and are difficult to debug. So I think briefly mentioning the symptom could help. Alternatively, we could also add it to a section of common issues. This can definitely be a follow-up.

Suggested change
{{es}} uses dimensions and timestamps to generate time series document `_id` values. Two documents with the same dimensions and timestamp are considered duplicates.
{{es}} uses dimensions and timestamps to generate time series document `_id` values. Two documents with the same dimensions and timestamp are considered duplicates. Duplicates are rejected during ingestion with a `409 Conflict` status.
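
To make the duplicate behavior in this thread concrete, here is a minimal Python sketch. It is only an illustration under stated assumptions: `tsds_like_id`, `create`, and `ConflictError` are hypothetical names, and the SHA-256 hash stands in for whatever {{es}} does internally, which this thread does not specify. It models only the create-style rejection described in the suggested wording.

```python
import hashlib
import json


def tsds_like_id(dimensions: dict, timestamp: str) -> str:
    """Hypothetical stand-in for the generated `_id`: a hash of dimensions plus timestamp."""
    # Sort keys so the same dimensions always serialize the same way.
    payload = json.dumps({"dimensions": dimensions, "@timestamp": timestamp}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()[:20]


class ConflictError(Exception):
    """Models the `409 Conflict` response mentioned in the suggested change."""


def create(store: dict, doc: dict, dimension_fields: list) -> str:
    """Add a document, rejecting it if one with the same derived id already exists."""
    dims = {name: doc[name] for name in dimension_fields}
    doc_id = tsds_like_id(dims, doc["@timestamp"])
    if doc_id in store:
        raise ConflictError(f"409 Conflict: document {doc_id} already exists")
    store[doc_id] = doc
    return doc_id


store = {}
first = {"@timestamp": "2025-10-05T22:12:00Z", "host": "a", "cpu": 0.41}
# Same dimension value and timestamp, different metric value: still a duplicate.
duplicate = {"@timestamp": "2025-10-05T22:12:00Z", "host": "a", "cpu": 0.99}

create(store, first, ["host"])
try:
    create(store, duplicate, ["host"])
    rejected = False
except ConflictError:
    rejected = True
```

The metric value plays no part in the derived id, which is why a misconfigured dimension set (for example, a missing dimension) can make distinct data points look like duplicates.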

- To define a metric, use the `time_series_metric` mapping parameter. For more details, refer to [Metrics](/manage-data/data-store/data-streams/time-series-data-stream-tsds.md#time-series-metric).
- (Optional) Define a `date` or `date_nanos` mapping for the `@timestamp` field. If you don't specify a mapping, {{es}} maps `@timestamp` as a `date` field with default options.
* (Optional) Other index settings, such as [`index.number_of_replicas`](elasticsearch://reference/elasticsearch/index-settings/index-modules.md#dynamic-index-number-of-replicas), for the data stream's backing indices.
- A priority higher than `200`, to avoid [collisions](/manage-data/data-store/templates.md#avoid-index-pattern-collisions) with built-in templates.
Contributor

@gmarouli gmarouli Oct 6, 2025

I think we need to add a lifecycle management section here, at least for stateful; in serverless, this is enforced. I would recommend adding the following to the template:

"lifecycle": {
  "enabled": true
}

The main reason we need this is rollover: if a user doesn't add this, they are going to end up with a gigantic index. Everything else is optional, so we can leave it under the advanced setup.

Contributor

I second this. I also think that in the advanced setup we should at least explain why lifecycle management is useful for setting up rollover (one quick sentence that tells readers why the provided links are useful, in case they don't already know).
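
For reference, the template shape discussed in this thread could be sketched as follows. This is a rough illustration, not the text proposed for the docs: the index pattern, field names, and priority value are placeholder assumptions, and the body is built as a Python dict only so that it can be serialized and checked. It leaves out `index.routing_path`, per the review comment that the setting is auto-populated.

```python
import json

# Hypothetical names: the pattern, field names, and priority value are placeholders.
index_template = {
    "index_patterns": ["metrics-example-*"],
    "data_stream": {},
    "priority": 500,  # higher than 200, to avoid collisions with built-in templates
    "template": {
        "settings": {"index.mode": "time_series"},
        "mappings": {
            "properties": {
                "@timestamp": {"type": "date"},
                "host": {"type": "keyword", "time_series_dimension": True},
                "cpu_usage": {"type": "double", "time_series_metric": "gauge"},
            }
        },
        # The lifecycle block recommended in the review comment, so that
        # backing indices roll over instead of growing without bound.
        "lifecycle": {"enabled": True},
    },
}

# JSON body that could be sent with a request such as PUT _index_template/<name>.
request_body = json.dumps(index_template, indent=2)
```

Everything beyond `"enabled": true` in the lifecycle block (retention, downsampling) is omitted here, matching the suggestion to keep those options in the advanced setup.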

---

# Reindex a TSDS [tsds-reindex]
# Reindex a time series data stream [tsds-reindex]
Contributor

@kkrik-es if I understand correctly, this reindexing manual is suggesting to reindex the data of one data stream into a single backing index of another data stream. Right?

If this is true, then I think we need to add a disclaimer before a user gets unpleasantly surprised.

We could also mention the reindex data stream API that was added for upgrades; I will check if it works.

Contributor

Yeah, not ideal. It's orthogonal to this PR though, so let's file an issue to provide a better path (I thought we had one...).

Contributor

We should at least mention that the result will be a single index, in a note box or something similar. @marciw, what do you think?

The rest is indeed orthogonal to this PR.

* One or more [metric fields](#time-series-metric)
* An auto-generated document `_id` (custom `_id` values are not supported)
* **Backing indices:** A TSDS uses [time-bound indices](/manage-data/data-store/data-streams/time-bound-tsds.md) to store data from the same time period in the same backing index.
* **Dimension-based routing:** The routing logic uses dimension fields to map data to shards, improving storage efficiency and query performance.
Member

Also, consider linking to the corresponding section in time-bound-tsds.md.

Suggested change
* **Dimension-based routing:** The routing logic uses dimension fields to map data to shards, improving storage efficiency and query performance.
* **Dimension-based routing:** The routing logic uses dimension fields to map all data points of a time series to the same shard, improving storage efficiency and query performance, and ensuring that duplicate data points are rejected.
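
The point of the suggested wording, that every data point of a time series lands on the same shard, can be shown with a small Python sketch. It assumes a simple hash-modulo scheme purely for illustration; `shard_for` is a hypothetical helper and is not the routing function {{es}} actually uses.

```python
import hashlib


def shard_for(dimensions: dict, number_of_shards: int) -> int:
    """Pick a shard from the dimension values only (illustrative hash-modulo scheme)."""
    # Serialize dimensions in sorted order so that field ordering doesn't matter.
    key = "|".join(f"{name}={dimensions[name]}" for name in sorted(dimensions))
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % number_of_shards


# Two data points from the same time series: same dimensions, different arrival order of fields.
shard_first = shard_for({"host": "node-1", "metricset": "cpu"}, 4)
shard_later = shard_for({"metricset": "cpu", "host": "node-1"}, 4)
```

Because the timestamp and metric values are not part of the routing key, `shard_first` and `shard_later` are always equal, which is what lets a shard detect a second data point with the same dimensions and timestamp.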

```

Most time series data contains repeated values. Dimensions are repeated across documents in the same time series. The metric values of a time series may also change slowly over time.
You can use the {{esql}} [`TS` command](elasticsearch://reference/query-languages/esql/commands/ts.md) (in technical preview) to query time series data streams. The `TS` command is optimized for time series data. It also enables the use of aggregation functions that efficiently process metrics per time series, before aggregating results.
Member

Maybe remove the (in technical preview). The TS docs also contain that and it's another thing we need to remember to keep in sync when TS goes beta or GA.

Contributor

I kinda like the note here, so that people don't miss that it shouldn't be used in production yet. We'll hopefully update it when the time comes, as it's very prominent.

"routing_path": [ "metricset" ]
}
"index.mode": "time_series",
"index.routing_path": ["dimension_field"]
Contributor

This setting gets auto-populated, let's remove it.

Contributor

@yannis-roussos yannis-roussos left a comment

Thank you Marci, looks great! I have added a few minor comments for your consideration.

Small additional thing I noticed: the link in the "Create a data stream and add sample data" section of the quickstart is now broken, because we moved the accepted time range section to the Time-bound indices page.

* The matching index template for a TSDS must contain the `index.routing_path` index setting. A TSDS uses this setting to perform [dimension-based routing](#dimension-based-routing).
* A TSDS uses internal [index sorting](elasticsearch://reference/elasticsearch/index-settings/sorting.md) to order shard segments by `_tsid` and `@timestamp`.
* TSDS documents only support auto-generated document `_id` values. For TSDS documents, the document `_id` is a hash of the document’s dimensions and `@timestamp`. A TSDS doesn’t support custom document `_id` values.
* A TSDS uses [synthetic `_source`](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source), and as a result is subject to some [restrictions](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-restrictions) and [modifications](elasticsearch://reference/elasticsearch/mapping-reference/mapping-source-field.md#synthetic-source-modifications) applied to the `_source` field.
Contributor

I think that we should still mention this. It could cause problems for people who aren't aware that synthetic source is enabled.

marciw and others added 2 commits October 7, 2025 16:52
Co-authored-by: Yannis Roussos <yannis.roussos@elastic.co>
Co-authored-by: Yannis Roussos <yannis.roussos@elastic.co>
Adds docs for the new OTLP endpoint added via
elastic/elasticsearch#133057

Closes #3363

---------

Co-authored-by: Fabrizio Ferri-Benedetti <fabri.ferribenedetti@elastic.co>
Co-authored-by: Kostas Krikellas <131142368+kkrik-es@users.noreply.github.com>

5 participants