Skip to content
Permalink
Browse files
remove arbitrary granularity spec from docs (#12460)
* remove arbitrary granularity spec from docs

* Update docs/ingestion/ingestion-spec.md

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>

Co-authored-by: Victoria Lim <vtlim@users.noreply.github.com>
  • Loading branch information
techdocsmith and vtlim committed Apr 28, 2022
1 parent df074f2 commit 42fa5c26e1f4e09692c9752ccba6126a9ae96905
Showing 2 changed files with 4 additions and 4 deletions.
@@ -298,11 +298,11 @@ A `granularitySpec` can have the following components:

| Field | Description | Default |
|-------|-------------|---------|
| type | Either `uniform` or `arbitrary`. In most cases you want to use `uniform`.| `uniform` |
| segmentGranularity | [Time chunking](../design/architecture.md#datasources-and-segments) granularity for this datasource. Multiple segments can be created per time chunk. For example, when set to `day`, the events of the same day fall into the same time chunk which can be optionally further partitioned into multiple segments based on other configurations and input size. Any [granularity](../querying/granularities.md) can be provided here. Note that all segments in the same time chunk should have the same segment granularity.<br><br>Ignored if `type` is set to `arbitrary`.| `day` |
| type |`uniform`| `uniform` |
| segmentGranularity | [Time chunking](../design/architecture.md#datasources-and-segments) granularity for this datasource. Multiple segments can be created per time chunk. For example, when set to `day`, the events of the same day fall into the same time chunk which can be optionally further partitioned into multiple segments based on other configurations and input size. Any [granularity](../querying/granularities.md) can be provided here. Note that all segments in the same time chunk should have the same segment granularity.| `day` |
| queryGranularity | The resolution of timestamp storage within each segment. This must be equal to, or finer, than `segmentGranularity`. This will be the finest granularity that you can query at and still receive sensible results, but note that you can still query at anything coarser than this granularity. E.g., a value of `minute` will mean that records will be stored at minutely granularity, and can be sensibly queried at any multiple of minutes (including minutely, 5-minutely, hourly, etc).<br><br>Any [granularity](../querying/granularities.md) can be provided here. Use `none` to store timestamps as-is, without any truncation. Note that `rollup` will be applied if it is set even when the `queryGranularity` is set to `none`. | `none` |
| rollup | Whether to use ingestion-time [rollup](./rollup.md) or not. Note that rollup is still effective even when `queryGranularity` is set to `none`. Your data will be rolled up if they have the exactly same timestamp. | `true` |
| intervals | A list of intervals defining time chunks for segments. Specify interval values using ISO8601 format. For example, `["2021-12-06T21:27:10+00:00/2021-12-07T00:00:00+00:00"]`. If you omit the time, the time defaults to "00:00:00".<br><br>If `type` is set to `uniform`, Druid breaks the list up and rounds-off the list values based on the `segmentGranularity`. If `type` is set to `arbitrary`, Druid uses the list as-is.<br><br>If `null` or not provided, batch ingestion tasks generally determine which time chunks to output based on the timestamps found in the input data.<br><br>If specified, batch ingestion tasks may be able to skip a determining-partitions phase, which can result in faster ingestion. Batch ingestion tasks may also be able to request all their locks up-front instead of one by one. Batch ingestion tasks throw away any records with timestamps outside of the specified intervals.<br><br>Ignored for any form of streaming ingestion. | `null` |
| intervals | A list of intervals defining time chunks for segments. Specify interval values using ISO8601 format. For example, `["2021-12-06T21:27:10+00:00/2021-12-07T00:00:00+00:00"]`. If you omit the time, the time defaults to "00:00:00".<br><br>Druid breaks the list up and rounds off the list values based on the `segmentGranularity`.<br><br>If `null` or not provided, batch ingestion tasks generally determine which time chunks to output based on the timestamps found in the input data.<br><br>If specified, batch ingestion tasks may be able to skip a determining-partitions phase, which can result in faster ingestion. Batch ingestion tasks may also be able to request all their locks up-front instead of one by one. Batch ingestion tasks throw away any records with timestamps outside of the specified intervals.<br><br>Ignored for any form of streaming ingestion. | `null` |

### `transformSpec`

@@ -251,7 +251,7 @@ If we were not using rollup, all columns would be specified in the `dimensionsSp
At this point, we are done defining the `parser` and `metricsSpec` within the `dataSchema` and we are almost done writing the ingestion spec.

There are some additional properties we need to set in the `granularitySpec`:
* Type of granularitySpec: `uniform` and `arbitrary` are the two supported types. For this tutorial, we will use a `uniform` granularity spec, where all segments have uniform interval sizes (for example, all segments cover an hour's worth of data).
* Type of granularitySpec: the `uniform` granularity spec defines segments with uniform interval sizes. For example, all segments cover an hour's worth of data.
* The segment granularity: what size of time interval should a single segment contain data for? e.g., `DAY`, `WEEK`
* The bucketing granularity of the timestamps in the time column (referred to as `queryGranularity`)

0 comments on commit 42fa5c2

Please sign in to comment.