Skip to content

Add documentation re alphabetical sorted of MV dimensions#10695

Merged
clintropolis merged 1 commit intoapache:masterfrom
lkm:hotfix/mv-doc
May 7, 2021
Merged

Add documentation re alphabetical sorted of MV dimensions#10695
clintropolis merged 1 commit intoapache:masterfrom
lkm:hotfix/mv-doc

Conversation

@lkm
Copy link
Contributor

@lkm lkm commented Dec 20, 2020

Simple updating of documentation. It's very relevant to know that that multi-value dimensions are sorted by default and that it's possible to change this behaviour.

To someone more intune with Druid's inner workings: Do we really want alphabetical sorted to be the default for MV dimensions? Would it not be better to default to ARRAY in DimensionSchema L89?

Description

Fixed the bug ...

Renamed the class ...

Added a forbidden-apis entry ...


This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths, ensuring the threshold for code coverage is met.
  • added integration tests.
  • been tested in a test Druid cluster.

Key changed/added classes in this PR
  • MyFoo
  • OurBar
  • TheirBaz

Copy link
Contributor

@techdocsmith techdocsmith left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments. Please fix link style.

Copy link
Contributor

@techdocsmith techdocsmith Dec 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
| multiValueHandling | Specify the type of handling multi-value fields will get, possible values are `SORTED_ARRAY`, `SORTED_SET` and `ARRAY`. The first two will order the array upon ingestion and `SORTED_SET` will remove duplicates. `ARRAY` will ingest data as-is | `SORTED_ARRAY` |
| multiValueHandling | Specify the type of handling for multi-value fields. Possible values are `SORTED_ARRAY`, `SORTED_SET`, and `ARRAY`. `SORTED_ARRAY` and `SORTED_SET` order the array upon ingestion. `SORTED_SET` removes duplicates. `ARRAY` ingests data as-is | `SORTED_ARRAY` |

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
characters), by default they are sorted alphabetically, see [Dimension Objects](../ingestion/index.html#dimension-objects) for configuration.
characters), by default Druid lists the values in alphabetical order, see [Dimension Objects](../ingestion/index.md#dimension-objects) for configuration.

Please use .md for links. Also, it's not clear to me what the related information is at the destination. But I'm a still fairly new to Druid.

Copy link
Contributor Author

@lkm lkm Dec 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @techdocsmith, I've updated the commit with your recommendations. "Lists" is the wrong verb here, "ingests" would be more appropriate. The referred section is the above committed detail about multiValueHandling.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also .md should be .html, check the rest of the reference in the file. The markdown is compiled for the web version of the doc.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@lkm , please see: #10528 Using .md is more flexible for viewing docs locally and for publishing to the web

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the context @techdocsmith, I've changed it to .md

@lkm lkm force-pushed the hotfix/mv-doc branch 2 times, most recently from bb0ebb0 to 77595db Compare December 23, 2020 08:12
Copy link
Contributor

@gianm gianm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I double-checked to make sure there are tests for this feature, and it looks like there are, so I think it's good to add the docs. (We wouldn't want to add docs for an untested feature!)

A couple of comments and then I think this is good to go.

@gianm
Copy link
Contributor

gianm commented Jan 8, 2021

To someone more intune with Druid's inner workings: Do we really want alphabetical sorted to be the default for MV dimensions? Would it not be better to default to ARRAY in DimensionSchema L89?

Probably ARRAY or maybe SORTED_SET would be an ideal default. SORTED_ARRAY is a little weird, admittedly, but it's the default for legacy reasons. Maybe we'll change it in a future version.

@lkm
Copy link
Contributor Author

lkm commented Feb 9, 2021

Thanks @gianm. Requested changes added

Copy link

@sthetland sthetland left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@clintropolis clintropolis merged commit 9be2a5c into apache:master May 7, 2021
@clintropolis clintropolis added this to the 0.22.0 milestone Aug 12, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants