refactor: Changes the DatabaseSelector and TableSelector to use the new Select component #16483

Conversation

Member

@michael-s-molina michael-s-molina commented Aug 27, 2021

SUMMARY

Follow-up of #16334 which introduced #16475 and was reverted in #16478.

You can find the descriptions and videos of the changes in the original PR #16334.

Fixes #16475

Thanks, @etr2460 for providing the revert PR.

@ktmud @etr2460 @villebro
I'll add comments indicating the relevant changes from the original PR.

It's also worth mentioning this fix by @AAfghahi: #16472

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Check the original PR.

TESTING INSTRUCTIONS

Check the original PR.

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

Comment on lines 52 to 66
type DatabaseValue = {
  label: string;
  value: number;
  id: number;
  database_name: string;
  backend: string;
};
Member Author

Changed the database type to include id, database_name, and backend.

schema,
const [currentDb, setCurrentDb] = useState<DatabaseValue | undefined>(
db
? { label: `${db.backend}: ${db.database_name}`, value: db.id, ...db }
Member Author

Loaded db props in the initial value.

}
if (onDbChange) {
-  onDbChange(db);
+  onDbChange(actualDb);
Member Author

Changed to use actualDb instead of db

value: { label: string; value: number },
option: DatabaseValue,
) {
const actualDb = option || db;
Member Author

Changed actualDb to use the option instead of the value.

Comment on lines +235 to +179
id: row.id,
database_name: row.database_name,
backend: row.backend,
Member Author

Loaded the additional db information.
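
For readers following the review comments above, here is a minimal sketch of how a row from the databases endpoint could be mapped into a `DatabaseValue` option. The `DatabaseRow` type and the `toDatabaseValue` helper are hypothetical names used only for illustration; the label format follows the template string shown in the diff and may differ from the merged code.

```typescript
// Shape mirrors the DatabaseValue type quoted in the review above.
type DatabaseValue = {
  label: string;
  value: number;
  id: number;
  database_name: string;
  backend: string;
};

// Hypothetical shape of one row returned by the databases endpoint.
type DatabaseRow = { id: number; database_name: string; backend: string };

// Build the option the Select stores, keeping the extra db fields on it
// so that consumers do not have to look the database up again.
function toDatabaseValue(row: DatabaseRow): DatabaseValue {
  return {
    label: `${row.backend}: ${row.database_name}`,
    value: row.id,
    id: row.id,
    database_name: row.database_name,
    backend: row.backend,
  };
}

const exampleOption = toDatabaseValue({
  id: 1,
  database_name: 'test',
  backend: 'postgresql',
});
console.log(exampleOption.label); // "postgresql: test"
```

Carrying `id`, `database_name`, and `backend` on the option itself is what lets the change handler pass a complete database object to its callback.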

@codecov

codecov bot commented Aug 27, 2021

Codecov Report

Merging #16483 (8ac1edb) into master (adc3d24) will decrease coverage by 0.02%.
The diff coverage is 82.14%.

❗ Current head 8ac1edb differs from pull request most recent head 03a2788. Consider uploading reports for the commit 03a2788 to get more accurate results

@@            Coverage Diff             @@
##           master   #16483      +/-   ##
==========================================
- Coverage   76.95%   76.92%   -0.03%     
==========================================
  Files        1007     1007              
  Lines       54149    54156       +7     
  Branches     7369     7369              
==========================================
- Hits        41669    41661       -8     
- Misses      12240    12255      +15     
  Partials      240      240              
Flag Coverage Δ
hive 81.24% <27.27%> (-0.06%) ⬇️
mysql ?
postgres 81.74% <90.90%> (+<0.01%) ⬆️
presto 81.52% <27.27%> (-0.03%) ⬇️
python 82.20% <90.90%> (-0.05%) ⬇️
sqlite 81.35% <90.90%> (+<0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...ontend/src/views/CRUD/data/dataset/DatasetList.tsx 69.41% <ø> (ø)
...nd/src/views/CRUD/data/dataset/AddDatasetModal.tsx 56.25% <40.00%> (ø)
...erset-frontend/src/datasource/DatasourceEditor.jsx 74.81% <59.09%> (ø)
...rontend/src/SqlLab/components/SqlEditorLeftBar.jsx 56.06% <80.00%> (ø)
...et-frontend/src/components/TableSelector/index.tsx 84.25% <82.60%> (ø)
superset/views/core.py 75.86% <90.00%> (-0.42%) ⬇️
...frontend/src/components/DatabaseSelector/index.tsx 92.70% <91.54%> (ø)
...et-frontend/src/components/CertifiedIcon/index.tsx 100.00% <100.00%> (ø)
superset-frontend/src/components/Icons/Icon.tsx 100.00% <100.00%> (ø)
...nd/src/components/WarningIconWithTooltip/index.tsx 100.00% <100.00%> (ø)
... and 15 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update adc3d24...03a2788. Read the comment docs.

@michael-s-molina
Member Author

Here's an example of switching databases and querying different tables.

Screen.Recording.2021-08-27.at.10.26.31.AM.mov

@michael-s-molina
Member Author

/testenv up

@github-actions
Contributor

@michael-s-molina Ephemeral environment spinning up at http://54.212.66.149:8080. Credentials are admin/admin. Please allow several minutes for bootstrapping and startup.

@AAfghahi
Member

This PR is also relevant and helps with the undefined engine bug:

#16472

@michael-s-molina
Member Author

> This PR is also relevant and helps with the undefined engine bug:
>
> #16472

Thanks! Added a reference in the description.

Member

@etr2460 etr2460 left a comment

2 questions about testing/verification

expect(select).toBeInTheDocument();
userEvent.click(select);
expect(
await screen.findByRole('option', { name: 'postgresql: test' }),
Member

should we also add expect(props.onDbChange).toBeCalledTimes(1); here to catch the previous regression?

Member Author

Great suggestion. I created two new tests to catch the previous regression:

  • Sends the correct db when changing the database
  • Sends the correct schema when changing the schema
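
As a rough illustration of what the first of these tests guards against, the sketch below wires a change handler to a hand-rolled spy and checks which database the callback receives. `makeChangeDatabase`, `DbOption`, and the array-based spy are assumptions made for this example; the tests in the PR itself use `expect`, `screen`, and `userEvent` as in the snippet above.

```typescript
type DbOption = {
  label: string;
  value: number;
  id: number;
  database_name: string;
  backend: string;
};

// Hypothetical factory standing in for the component's change handler.
function makeChangeDatabase(
  db: DbOption | undefined,
  onDbChange?: (db: DbOption | undefined) => void,
) {
  return function changeDatabase(
    _value: { label: string; value: number },
    option?: DbOption,
  ): void {
    // Prefer the option that was just selected; fall back to the current db.
    const actualDb = option || db;
    if (onDbChange) {
      onDbChange(actualDb);
    }
  };
}

// Hand-rolled spy: records every argument the callback receives.
const receivedDbs: Array<DbOption | undefined> = [];
const previousDb: DbOption = {
  label: 'postgresql: test',
  value: 1,
  id: 1,
  database_name: 'test',
  backend: 'postgresql',
};
const selectedDb: DbOption = {
  label: 'mysql: examples',
  value: 2,
  id: 2,
  database_name: 'examples',
  backend: 'mysql',
};

const handleChange = makeChangeDatabase(previousDb, db => {
  receivedDbs.push(db);
});
handleChange({ label: selectedDb.label, value: selectedDb.value }, selectedDb);

console.log(receivedDbs.length); // 1
console.log(receivedDbs[0]?.database_name); // "examples"
```

Passing the stale `db` to the callback instead of `actualDb` would make the last line print "test", which is the kind of mismatch the new tests are meant to catch.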

@@ -227,10 +227,7 @@ class DatasourceControl extends React.PureComponent {
</Tooltip>
)}
{extra?.warning_markdown && (
<WarningIconWithTooltip
warningMarkdown={extra.warning_markdown}
size={30}
Member

can you attach a screenshot of this change? i seem to recall a reason for the different size

Member Author

Of course. I'm attaching a screenshot with both healthCheckMessage and warning_markdown enabled for comparison.

Screen Shot 2021-09-01 at 12 01 19 PM

@ktmud
Member

ktmud commented Aug 27, 2021

Some UX issues found during local testing:

  1. For db types, can we use the same tag style instead of just text?

    Now

    Xnip2021-08-27_09-27-41

    Before

    Xnip2021-08-27_09-30-45
  2. Loading state should show "Loading..." instead of "No data"

    Now

    Xnip2021-08-27_09-28-10

    Before

    Xnip2021-08-27_09-33-32

    I think we should keep the same placeholder text as well.

  3. Previously the Select will automatically trigger the initial load of the tables list if there is a selected schema and the schema list if there is a selected database (just select a table and refresh the page to test this), now you have to focus on the tables select to trigger the loading. I think we should keep this preload behavior because it can be quite slow to load these APIs.

  4. Scrolling tables list to the end will trigger an unnecessary new query, probably trying to fetch the next page?
    Xnip2021-08-27_09-29-39

    It also freezes the whole page on one of our schemas with 30k tables.

  5. Switching schema does not clear the selected table, even though the table does not exist in the new schema.

Will take a look at the code once these issues are fixed.

@michael-s-molina michael-s-molina force-pushed the refactor-database-selector-selects branch 2 times, most recently from 14aa926 to 273231d Compare September 1, 2021 16:17
@michael-s-molina
Member Author

> 1. For db types, can we use the same tag style instead of just text?

Done

> 2. Loading state should show "Loading..." instead of "No data"

Done in #16531

> 3. Previously the Select will automatically trigger the initial load of the tables list if there is a selected schema and the schema list if there is a selected database (just select a table and refresh the page to test this), now you have to focus on the tables select to trigger the loading. I think we should keep this preload behavior because it can be quite slow to load these APIs.

The idea here is to avoid unnecessary queries to the server. This component is used in Explore, SQL Editor, and the datasets list. Many times the user does not modify any of the selects but they are rendered. Some examples:

  • When the user enters the datasets modal from the dataset list to edit anything from the other tabs
  • When changing or just viewing a query in SQL Editor

I do understand your concern though. Here we have a limitation because the tables endpoint does not currently support pagination. In talks with @villebro to add pagination support we ended up with the following:

> That tables API will be horrible to paginate, as SqlAlchemy doesn’t natively support pagination of retrieval of table names.
> I can open up a discussion on the sqla repo to see if they're open to adding the optional feature, but it will probably take a long time to get it in.

So one of the reasons for the slow behavior is the lack of pagination support. For cases where the queries are heavy even with pagination, we created a prop called fetchOnlyOnSearch that only triggers a filtered query. I think Airbnb should enable this prop for this case given that you have a schema with 30k tables.
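
To make the intent of `fetchOnlyOnSearch` concrete, here is a small sketch of the gating decision it implies. Only the prop name comes from this discussion; the `shouldFetch` helper and its exact rules are assumptions, not the component's actual implementation.

```typescript
// Hypothetical helper: decides whether the select should hit the server
// for the current search text.
function shouldFetch(searchText: string, fetchOnlyOnSearch: boolean): boolean {
  const hasSearch = searchText.trim().length > 0;
  // With the prop enabled, an empty search box never triggers a request,
  // so a schema with tens of thousands of tables is never loaded unfiltered.
  return fetchOnlyOnSearch ? hasSearch : true;
}

console.log(shouldFetch('', true)); // false
console.log(shouldFetch('orders', true)); // true
console.log(shouldFetch('', false)); // true
```

Under this assumption, every request sent for a large schema carries a filter, which keeps the response small even without pagination on the endpoint.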

> 4. Scrolling tables list to the end will trigger an unnecessary new query, probably trying to fetch the next page?

This was a result of the lack of pagination support. The component thinks that more results are available, given the page size and total results. To work around this until pagination support is added to this endpoint, I added the following:

// TODO: Would be nice to add pagination in a follow-up. Needs endpoint changes.
pageSize={Number.MAX_SAFE_INTEGER}

This is just disabling the pagination in practice. We don't want to provide a prop to disable the pagination because we want to make sure all endpoints support this feature. This is especially important if we think about devices with slow connections that are using Superset endpoints.
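
The effect of the large page size can be sketched with a simple "are there more pages?" check. The `hasMorePages` function and the page size of 100 are assumptions for illustration; the component's real bookkeeping may differ.

```typescript
// Assumed check: more pages exist while the rows covered by the pages
// requested so far are fewer than the total reported by the endpoint.
function hasMorePages(page: number, pageSize: number, totalCount: number): boolean {
  // page is zero-based, so (page + 1) * pageSize rows are covered so far.
  return (page + 1) * pageSize < totalCount;
}

// The tables endpoint returns everything at once, but with a small page
// size the component still believes another page is waiting.
console.log(hasMorePages(0, 100, 30000)); // true

// With a huge page size the first page always covers the total,
// so scrolling to the end never triggers another request.
console.log(hasMorePages(0, Number.MAX_SAFE_INTEGER, 30000)); // false
```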

> It also freezes the whole page on one of our schemas with 30k tables.

I tested here with 50k values and the Select was able to render smoothly. Virtualization is native in Ant Design Select. Their documentation has an example with 100k items. So I don't know if it froze because of query performance or another reason. Maybe we can test again with fetchOnlyOnSearch enabled.

> 5. Switching schema does not clear the selected table, even though the table does not exist in the new schema.

Fixed.

@ktmud @etr2460 @villebro Thank you so much for the tests, reviews and help improving this feature.

@ktmud
Member

ktmud commented Sep 1, 2021

Table names and schema names are cached, so I don't think unnecessary db metadata queries are a huge concern.

Re: pagination, shouldn't the Select itself have a built-in "async but no pagination" mode? Why is the pageSize hack necessary?

The freeze might be related to the refetch. It indeed worked well with the initial rendering.

@michael-s-molina
Member Author

@ktmud Let me give you a little more context to help with the decision. When we started the select migration project, @geido compiled a great spreadsheet with all the different types of selects that we had and all the props that were available for each select type. We had a lot of props, more than 50! What we noticed was that this led to multiple behaviors throughout the application and increased the complexity of components because of the different combinations. Then we paired with @villebro to design the component API and decided that we would try to reduce the number of props by establishing default behaviors and making the user experience more similar. Another objective was to decrease the complexity of the component.

This context is important because we can resolve these two questions by adding new props to the component like lazyLoading (default true) and paginated (default true). In fact, these props appeared during our discussions but we assumed that we would always want to avoid unnecessary queries and make sure that data is always fetched in blocks. That's why they are default behaviors of the component and not props.

> Table names and schema names are cached, so I don't think unnecessary db metadata queries are a huge concern.

This could be addressed by adding the lazyLoading prop, but I don't think disabling lazy loading would fully resolve the problem. The query can still be running when the user opens the select, even if it started during the rendering phase. So I think using fetchOnlyOnSearch would be more appropriate for this case.

> Re: pagination, shouldn't the Select itself have a built-in "async but no pagination" mode? Why is the pageSize hack necessary?

The hack is needed because the default behavior assumes that we should always try to fetch data using pagination. Adding a paginated prop could encourage developers to skip adding pagination to the endpoints, which is pretty common right now. What the hack does is set a really big page size so that the content fits on one page until we fix the endpoint.

As these are subjective decisions, I still can add these props if you think they pay off. I just wanted to add the whole context to help with the decision.

@ktmud
Member

ktmud commented Sep 1, 2021

> The query can still be running when the user opens the select, even if it started during the rendering phase.

This is about improving the availability of UI elements for most cases. A lot of users land on SQL Lab page without immediately going to change the selected table, but when they do, it's better to have the list preloaded already.

TBH I don't think pagination and async loading itself should be part of the default Select component. A lot of select controls do not need async loading at all (but still need the isCreatable and isMultible options).

The Async behavior should either be an HOC or another hook. For example:

const { options, onSearch } = useAsyncSelectOptions({
  fetchOptions: ...,
  preload: true
});

<Select options={options} onSearch={onSearch} .../>

With the complexity abstracted away from the Select component itself, adding more options to the fetch behavior should not be that big of a concern.
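
To make the proposal a bit more tangible, below is a framework-free sketch of the state logic such a hook might hold. `useAsyncSelectOptions` is only proposed in the comment above and does not exist in the codebase as far as this thread shows; every name in the sketch (`AsyncState`, `asyncOptionsReducer`, `shouldStartFetch`) is hypothetical, and a real hook would wrap this logic with React state and an actual request.

```typescript
type SelectOption = { label: string; value: string };

type AsyncState = {
  options: SelectOption[];
  loading: boolean;
  loaded: boolean;
};

type AsyncAction =
  | { type: 'fetchStarted' }
  | { type: 'fetchSucceeded'; options: SelectOption[] };

const initialAsyncState: AsyncState = { options: [], loading: false, loaded: false };

// Pure reducer: the hook would dispatch these actions around its request.
function asyncOptionsReducer(state: AsyncState, action: AsyncAction): AsyncState {
  switch (action.type) {
    case 'fetchStarted':
      return { ...state, loading: true };
    case 'fetchSucceeded':
      return { options: action.options, loading: false, loaded: true };
    default:
      return state;
  }
}

// Decide whether to start a request: on mount only when preload is set,
// on open otherwise, and never while loading or after a successful load.
function shouldStartFetch(
  state: AsyncState,
  trigger: 'mount' | 'open',
  preload: boolean,
): boolean {
  if (state.loading || state.loaded) {
    return false;
  }
  return trigger === 'open' || preload;
}

console.log(shouldStartFetch(initialAsyncState, 'mount', true)); // true
console.log(shouldStartFetch(initialAsyncState, 'mount', false)); // false
```

Keeping the fetch policy in one small function like this is what would make options such as `preload` cheap to add without touching the Select's rendering code.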

@michael-s-molina
Member Author

> The Async behavior should either be an HOC or another hook.

We considered this approach when designing the component, but we chose to make it an integral part of the Select because we have UI behaviors that are more easily implemented as a native feature than as an HOC. It's not only about how to retrieve the data; the interaction with the component also changes.

> A lot of select controls do not need async loading at all.

We just replaced all the selects of the application and the majority of them require async loading. In many cases where the previous select received an options array, it was the parent component that was doing the fetch and managing the pagination. In fact, absorbing this behavior and avoiding this duplicated logic was one of the reasons to create the component in the first place.

In the interest of moving this PR forward, can we settle on creating the paginated and lazyLoading props for now and revisit this decision when all endpoints support pagination?

@ktmud
Member

ktmud commented Sep 1, 2021

Most Select controls on the Explore page do not need async.

Not sure if this would help, but recently I've created yet another Select component in another project using the amazing Downshift.js library and chakra-ui. It does not handle pagination yet, but the "letting hooks take over event handlers" approach that Downshift.js has taken really opened my eyes. Thought that would be relevant to the discussion here as well.

This Select component also uses a custom render hook which has made it easy to break down a complex component into overridable smaller subcomponents.

@michael-s-molina michael-s-molina force-pushed the refactor-database-selector-selects branch 3 times, most recently from 2bb58c3 to aa2ea31 Compare September 2, 2021 14:44
@michael-s-molina
Member Author

> Not sure if this would help, but recently I've created yet another Select component in another project using the amazing Downshift.js library and chakra-ui. It does not handle pagination yet, but the "letting hooks take over event handlers" approach that Downshift.js has taken really opened my eyes. Thought that would be relevant to the discussion here as well.
>
> This Select component also uses a custom render hook which has made it easy to break down a complex component into overridable smaller subcomponents.

Thanks for sharing this @ktmud. I'll definitely take a look for future improvements to the component. In the meantime, I removed the pagination and lazy loading, reverting the behavior to what it was before. We can continue with the review process.

@michael-s-molina michael-s-molina force-pushed the refactor-database-selector-selects branch 2 times, most recently from bc70a61 to f787ac8 Compare September 2, 2021 19:01
@michael-s-molina
Member Author

/testenv up

@github-actions
Contributor

github-actions bot commented Sep 3, 2021

@michael-s-molina Ephemeral environment spinning up at http://54.185.15.154:8080. Credentials are admin/admin. Please allow several minutes for bootstrapping and startup.

@geido
Member

geido commented Sep 9, 2021

/testenv up

Member

@geido geido left a comment

Code LGTM! Thank you!

@github-actions
Contributor

github-actions bot commented Sep 9, 2021

@geido Ephemeral environment spinning up at http://34.216.189.221:8080. Credentials are admin/admin. Please allow several minutes for bootstrapping and startup.

@michael-s-molina
Member Author

/testenv up

@github-actions
Contributor

@michael-s-molina Ephemeral environment spinning up at http://54.189.216.32:8080. Credentials are admin/admin. Please allow several minutes for bootstrapping and startup.

Member

@etr2460 etr2460 left a comment

lgtm!

@michael-s-molina michael-s-molina merged commit 596e1cd into apache:master Sep 22, 2021
@github-actions
Contributor

Ephemeral environment shutdown and build artifacts deleted.

opus-42 pushed a commit to opus-42/incubator-superset that referenced this pull request Nov 14, 2021
QAlexBall pushed a commit to QAlexBall/superset that referenced this pull request Dec 28, 2021
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 1.4.0 labels Mar 13, 2024
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/XXL test_priority:high 🚢 1.4.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Unable to switch database in SQL Lab
6 participants