Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Web console: Make array ingest mode ux better #15927

Merged
merged 4 commits into from
Mar 13, 2024

Conversation

vogievetsky
Copy link
Contributor

This PR tries to address an issue of "web console tab context poisoning" that is when a web console user re-uses a tab by deleting the SQL query in the tab and writing/pasting a new one. The user might forget that deleting all the SQL does not clear the context parameters which will be attached to subsequent queries issued.

Specifically here is a scenario that is of concern:

  • User plays with the "Connect external data" flow which creates a new tab for them with arrayIngestMode: array set (as was added in Web console: use arrayIngestMode: array #15588)
  • After sometime, the user comes back and re-uses that tab by selecting the entire SQL query and pasting over some REPLACE/INSERT query into their production datasource that contains MVDs to do some fix up. They do not realize that while they cleared the SQL query, arrayIngestMode: array is still set in the query context in that tab.
  • The MVD columns become mixed with ARRAYs and some queries against these columns start working in unexpected ways.

This PR makes two changes (in two respective commits):

  1. Change it so that the arrayIngestMode: array context parameter is only set when the user has opted in to arrays (via the toggle that is off by default). Currently arrayIngestMode: array is always being set by all UI flows and the toggle only affects the SQL that is generated.
    This change affects:
  • The "Connect external data" wizard flow
  • The "SQL data loader"
  • The spec to SQL converter (in this case arrayIngestMode: array is only set if the ingestion spec is detected as using arrays by having dimensions specs of type "auto" + "castToType: ARRAY<...>")
  1. Make the arrayIngestMode selection more prominent in the UI so that it is shown on the "Run panel" and is not hidden inside of a menu.

image

@cryptoe cryptoe added this to the 29.0.1 milestone Feb 21, 2024
@gianm
Copy link
Contributor

gianm commented Mar 8, 2024

What happens when you click on that? does it explain what the settings do? (IMO it should, since the behavior is potentially confusing. "array ingest mode: array" or "array ingest mode: MVD" might make people think they are definitely getting arrays or MVDs, but that's not how it works; it just controls what happens for things with SQL type ARRAY<STRING>. there's some asymmetry: it's possible to get MVDs in "array" mode, although it's not possible to get arrays in "MVD" mode.)

@vogievetsky
Copy link
Contributor Author

I am not sure what that refers to in your comment in the first question. What happens when you click on that?

There is this toggle in the UX flow that is on by default:

image

It has this info section:

image

(This was added before this PR)

In the Run bar dropdown there is a link to the docs:

image

Which links to https://druid.apache.org/docs/latest/querying/arrays/#differences-between-arrays-and-multi-value-dimensions

@gianm
Copy link
Contributor

gianm commented Mar 13, 2024

@vogievetsky -- sorry, by "that" I meant the "array ingest mode" dropdown. I see it now in the screenshot. I just updated with new docs for arrayIngestMode that will appear on https://druid.apache.org/docs/latest/querying/arrays#arrayingestmode -- would you mind linking to that?

Some flavor text for array and mvd would also be nice, like for array:

Load SQL VARCHAR ARRAY as Druid ARRAY

And for mvd:

Load SQL VARCHAR ARRAY as Druid multi-value STRING

With the "? Documentation" link pointing to https://druid.apache.org/docs/latest/querying/arrays#arrayingestmode for the full details.

@vogievetsky
Copy link
Contributor Author

Thank you for the feedback, updated:

image

Copy link
Contributor

@gianm gianm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nits:

  • ARRAY<STRING> is more typical formatting than ARRAY<string>
  • multi-value STRING is better than multi-value STRING. (the "multi-value" isn't an identifier that needs to be code-formatted, it's just English)

I am good with this, though, so, approving. But feel free to make those changes 😄

@vogievetsky
Copy link
Contributor Author

Thanks for the feedback I was going back and forth on that second point. I guess I manifested my inner wishes that Druid did have a mv-string type :-)

image

@vogievetsky vogievetsky merged commit ccae19a into apache:master Mar 13, 2024
11 checks passed
@vogievetsky vogievetsky deleted the make_arrayIngestMode_ux_better branch March 13, 2024 20:04
cryptoe pushed a commit to cryptoe/druid that referenced this pull request Mar 18, 2024
* only set arrayIngestMode: array when needed (still do the queries the correct way)

* arrayIngestMode control

* update wording

* feedback fixes
vogievetsky added a commit that referenced this pull request Mar 18, 2024
* only set arrayIngestMode: array when needed (still do the queries the correct way)

* arrayIngestMode control

* update wording

* feedback fixes

Co-authored-by: Vadim Ogievetsky <vadim@ogievetsky.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants