feat: Flow for tables that already have a dataset #22136

Merged
merged 16 commits into from
Dec 5, 2022

lyndsiWilliams
Member

SUMMARY

This implements a flow in the new dataset creation page for when a table already has a dataset. It uses a simplified dataset fetch in the AddDataset/index.tsx file to cross-reference with the table list. If the table has a dataset, it gets a warning icon in the left panel. If the table is selected, the table's columns are still displayed, but the "Create Dataset" button is disabled and it has an info banner at the top informing the user of the pre-existing dataset with a link to that dataset in the upper-left corner of the alert. Clicking the "View Dataset" button in this alert will bring the user to the explore view of the existing dataset in a new tab.

ANIMATED GIF / SCREENSHOT

[Screenshot: selected table with existing dataset]

[Screenshot: left panel warning icons]

[Screenshot: "View dataset"]

TESTING INSTRUCTIONS

  • Go to http://localhost:9000/dataset/add/?testing
  • Select a database and a schema
  • Observe that any tables in the left panel with a pre-existing dataset will have a warning icon
  • Click a table with a dataset
  • Observe that the "Create dataset" button is disabled and there is an info banner at the top informing of the pre-existing dataset
  • Click "View dataset" in the alert
  • Observe that you are taken to the explore view of the existing dataset in a new tab

ADDITIONAL INFORMATION

  • Has associated issue:
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

codecov bot commented Nov 16, 2022

Codecov Report

Merging #22136 (11e2409) into master (6f6cb18) will decrease coverage by 1.59%.
The diff coverage is 72.72%.

@@            Coverage Diff             @@
##           master   #22136      +/-   ##
==========================================
- Coverage   66.98%   65.39%   -1.60%     
==========================================
  Files        1832     1851      +19     
  Lines       69918    72903    +2985     
  Branches     7570     8661    +1091     
==========================================
+ Hits        46838    47672     +834     
- Misses      21122    23069    +1947     
- Partials     1958     2162     +204     
Flag Coverage Δ
hive 52.60% <ø> (ø)
mysql 78.15% <ø> (ø)
postgres 78.21% <ø> (ø)
presto 52.49% <ø> (ø)
python 78.69% <ø> (-2.69%) ⬇️
sqlite 76.67% <ø> (ø)
unit ?

Impacted Files Coverage Δ
...RUD/data/dataset/AddDataset/DatasetPanel/index.tsx 29.03% <ø> (ø)
...set-frontend/src/views/CRUD/data/dataset/styles.ts 100.00% <ø> (ø)
superset-frontend/src/views/CRUD/data/hooks.ts 57.69% <25.00%> (-5.95%) ⬇️
...d/src/views/CRUD/data/dataset/AddDataset/index.tsx 51.61% <41.66%> (-8.39%) ⬇️
...s/CRUD/data/dataset/AddDataset/LeftPanel/index.tsx 87.32% <78.57%> (+0.65%) ⬆️
...a/dataset/AddDataset/DatasetPanel/DatasetPanel.tsx 86.95% <90.90%> (-4.72%) ⬇️
...D/data/dataset/AddDataset/DatasetPanel/fixtures.ts 100.00% <100.00%> (ø)
...iews/CRUD/data/dataset/AddDataset/Footer/index.tsx 39.39% <100.00%> (+1.89%) ⬆️
...set/advanced_data_type/plugins/internet_address.py 16.32% <0.00%> (-79.60%) ⬇️
superset/utils/pandas_postprocessing/boxplot.py 20.51% <0.00%> (-79.49%) ⬇️
... and 191 more

@lyndsiWilliams
Member Author

/testenv up

@github-actions
Contributor

@lyndsiWilliams Ephemeral environment spinning up at http://35.85.146.196:8080. Credentials are admin/admin. Please allow several minutes for bootstrapping and startup.

}

const renderExistingDatasetAlert = (linkedDataset: any) => (
Member

Can we put a type on this linkedDataset?

Member Author

Oops, good catch! Typed in this commit.

Contributor

@eric-briscoe eric-briscoe left a comment

Overall looks good and works as described when testing on ephemeral. I left a few comments I think are worth addressing

message={t('This table already has a dataset')}
description={
<>
{t(
Contributor

I recommend defining all text that uses t() as a const outside of functions, so that the text is translated only once rather than on every render / function call. It also makes the text exportable for use in test files, so you don't have to manually keep string literals in sync across multiple files.

I am commenting once, but I recommend this anywhere t() is called inside a function / functional component.

Examples can be seen on lines 149, 150, and 151.
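
A minimal sketch of the hoisting pattern described above. It assumes a stand-in `t` in place of the real one from `@superset-ui/core`; the stand-in only counts calls so the difference is visible. The constant and function names are illustrative, not taken from the PR.

```typescript
// Stand-in for `t` from '@superset-ui/core' (assumption: the real function
// looks up a translation; this one just counts how often it is called).
let translateCalls = 0;
const t = (text: string): string => {
  translateCalls += 1;
  return text;
};

// Hoisted: translated once when the module loads. In a real file this
// constant could also be exported and reused in tests.
const EXISTING_DATASET_MESSAGE = t('This table already has a dataset');

// Inline: translates again on every call, i.e. on every render.
const renderMessageInline = (): string =>
  t('This table already has a dataset');

// Hoisted usage: reuses the constant, no extra translation work.
const renderMessageHoisted = (): string => EXISTING_DATASET_MESSAGE;
```

One trade-off to keep in mind: a module-level call runs at import time, so this pattern assumes translations are already configured before the module is loaded.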

Member Author

Thanks for the explanation on this, I didn't realize it was running on each render/function call but it definitely makes sense! I fixed these spots and some others that I found in this commit.

@@ -110,7 +113,7 @@ const DatasetPanelWrapper = ({
if (tableName && schema && dbId) {
getTableMetadata({ tableName, dbId, schema });
}
- // getTableMetadata is a const and should not be independency array
+ // getTableMetadata is a const and should not be in dependency array
Contributor

nice catch!

Member Author

Thanks! 😁

- .catch(e => {
- console.log('error', e);
- });
+ .catch(error => console.log('There was an error fetching tables', error));
Contributor

Should this be an error instead of a log? Also, there is a logging utility in superset-frontend that I was introduced to lately, which should be the best practice instead of calling console.log, console.error, etc. directly.

You can: import { logging } from '@superset-ui/core';

Then call the logging methods defined at: https://github.com/apache/superset/blob/3c41ff68a43b5ab6b871226a73de9f2129d64766/superset-frontend/packages/superset-ui-core/src/utils/logging.ts

This lets us avoid ts-ignore comments anywhere we legitimately need to log something, and will allow for enriching what we do with logging calls in the future, such as sending some of the errors to the backend for logging, to capture in usability metrics.
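
A rough sketch of what the suggested catch handler could look like. To keep it self-contained, `logging` here is a local stand-in modelled on the export from `@superset-ui/core` (only three methods are reproduced), and `fetchTables` / `loadTables` are hypothetical helpers rather than code from this PR.

```typescript
// Local stand-in for `import { logging } from '@superset-ui/core'`
// (assumption: only the methods needed for this sketch are modelled).
const logging = {
  log: (...args: unknown[]): void => console.log(...args),
  warn: (...args: unknown[]): void => console.warn(...args),
  error: (...args: unknown[]): void => console.error(...args),
};

// Hypothetical fetcher standing in for the real API call.
const fetchTables = (shouldFail: boolean): Promise<string[]> =>
  shouldFail
    ? Promise.reject(new Error('network down'))
    : Promise.resolve(['orders', 'users']);

// Failures are reported at error level through the shared utility,
// and the caller gets an empty list instead of a rejected promise.
const loadTables = (shouldFail: boolean): Promise<string[]> =>
  fetchTables(shouldFail).catch(error => {
    logging.error('There was an error fetching tables', error);
    return [];
  });
```

Routing every call through one object means the behaviour can later be changed in a single place, which is the enrichment the comment above anticipates.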

Member Author

I actually didn't know about this logging utility, thanks for the info! Changed the catch in this commit.

);
})
.catch(error =>
console.log('There was an error fetching dataset', error),
Contributor

Please see other comment about using logging utility

Member Author

Changed the catch in this commit.

@@ -113,7 +115,11 @@ function Footer({
<Button onClick={cancelButtonOnClick}>Cancel</Button>
<Button
buttonStyle="primary"
- disabled={!datasetObject?.table_name || !hasColumns}
+ disabled={
Contributor

This is a nit, but since this disabled check is getting kind of long it may be worth defining a variable doing this check before the return JSX block which will make the code easier to debug in future and keep the actual JSX portion cleaner.

Example:

const disabled = !datasetObject?.table_name ||
          !hasColumns ||
          linkedDatasets?.includes(datasetObject?.table_name);

  return (
    <>
      <Button onClick={cancelButtonOnClick}>Cancel</Button>
      <Button
        buttonStyle="primary"
        disabled={disabled}
        tooltip={!datasetObject?.table_name ? tooltipText : undefined}
        onClick={onSave}
      >

Member Author

Oh yeah good call, much cleaner! Changed in this commit

};

useEffect(() => {
if (dataset?.schema) getDatasetsList();
Contributor

Although JavaScript allows omitting curly braces around one-line conditionals and functions, over the years it has become a widely recommended practice to always include curly braces, for the following reasons:

  1. It is a common area for bugs to appear when refactoring from a single line to multiple lines
  2. It can make code ambiguous based on line spacing
    if (myVar > 3) myVar = 0; doSomething(myVar);
    In the above code it can be hard to tell whether the author intended to always run doSomething(myVar); after the if statement, or made a mistake, wanted it to run only in the context of the if statement, and forgot to add braces. This can get a lot more confusing than this simple example when this syntax is used frequently.
  3. If additional logic is added later, the curly braces will need to be added back. If the dev forgets this, the logic will still run, but the second line will run outside of the conditional
  4. Because it is recommended against, this syntax is uncommon in JavaScript code, and some developers may be confused by and/or misinterpret the code and change / refactor it incorrectly.

Many tools have rules defined to track this syntax as a bug, for example: This is a common SonarQube rule: https://3layer.com.br/sonar/rules/show/javascript:CurlyBraces?layout=false

Consistency, code clarity, and avoiding bugs typically trump saving extra characters. This may be worth discussing with the wider group as a potential lint rule.
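
To make point 2 concrete, here is a small sketch (function names are illustrative) showing that, without braces, only the first statement after the `if` is guarded:

```typescript
// Without braces: only `current = 0` belongs to the `if`.
// `calls.push(current)` runs unconditionally, whatever the spacing suggests.
function resetWithoutBraces(value: number, calls: number[]): number {
  let current = value;
  if (current > 3) current = 0; calls.push(current);
  return current;
}

// With braces: both statements are clearly inside the conditional.
function resetWithBraces(value: number, calls: number[]): number {
  let current = value;
  if (current > 3) {
    current = 0;
    calls.push(current);
  }
  return current;
}
```

Calling both with a value of 1 records a call only in the brace-less version, even though the condition is false. For the lint rule follow-up, ESLint's `curly` rule is one existing option for enforcing braces.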

Member

@eschutho eschutho Nov 18, 2022

agree on the lint rule as a follow up +1

Member Author

Oh wow I didn't realize this was so risky! I added brackets in this commit and will avoid this in the future.

Contributor

@eric-briscoe eric-briscoe left a comment

I found a theme anti-pattern we should adjust

title={tableName || ''}
>
{tableName && (
<Icons.Table iconColor={supersetTheme.colors.grayscale.base} />
Contributor

This is where, as I commented earlier, using supersetTheme directly is an anti-pattern that we did not catch during the initial creation of this file.

Above, in the functional component, we should have
const theme = useTheme();

Then here use
<Icons.Table iconColor={theme.colors.grayscale.base} />

Member Author

Fixed in this commit.

@@ -19,10 +19,12 @@
import React from 'react';
import { supersetTheme, t, styled } from '@superset-ui/core';
Contributor

@lyndsiWilliams I missed this when we originally worked on DatasetPanel but using supersetTheme directly is an anti-pattern. We should remove this import, and instead bring in useTheme
import { useTheme, t, styled } from '@superset-ui/core';
Lines 131-148, line 288, and anywhere else we use supersetTheme directly should be changed.

Member Author

Fixed in this commit.

@@ -123,6 +128,24 @@ const StyledTable = styled(Table)`
right: 0;
`;

const StyledAlert = styled(Alert)`
Contributor

Related to the comment on line 20, we should replace use of supersetTheme with:
const StyledAlert = styled(Alert)( ({ theme }) =>

and replace all the supersetTheme with theme
for example:
`border: 1px solid ${theme.colors.info.base};
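
The idea behind this suggestion can be sketched without React: the style callback receives whichever theme is active through its props, rather than reading a fixed imported object. The `SketchTheme` type and the color values below are made up for illustration and are far smaller than the real Superset theme.

```typescript
// Hypothetical, pared-down theme shape (assumption: the real theme has
// many more keys; only `colors.info.base` is needed here).
interface SketchTheme {
  colors: { info: { base: string } };
}

// Style callback in the shape passed to styled(Component)(...):
// the theme arrives via props, so a different provider theme just works.
const alertStyles = ({ theme }: { theme: SketchTheme }): string => `
  border: 1px solid ${theme.colors.info.base};
`;

// Two made-up themes to show the same callback adapting to each.
const defaultTheme: SketchTheme = { colors: { info: { base: '#1111aa' } } };
const customTheme: SketchTheme = { colors: { info: { base: '#aa1111' } } };

const defaultCss = alertStyles({ theme: defaultTheme });
const customCss = alertStyles({ theme: customTheme });
```

With a direct `supersetTheme` import, both results would carry the same border color regardless of which theme the application provides.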

Member Author

Fixed in this commit.

})
: undefined;

const getDatasetsList = () => {
Member

I'd recommend moving any api calls outside of the view component into either a hook or redux action for separation of concerns.

Member Author

Moved the API call into a hook in this commit.

@EugeneTorap
Contributor

EugeneTorap commented Nov 18, 2022

@eschutho @eric-briscoe @lyndsiWilliams I have some nice suggestions!

  • This is the new dataset creation page and we have [SIP-61] - Improve the organization of front-end folders created by @michael-s-molina
    We can move this page into src/pages/addDataset, making it the first component in src/pages. After that, we can gradually move other pages into the folder, and any new page created for Superset can go directly into src/pages.
  • Creating a virtual dataset on this new page. It would be good if running a SQL query here showed all of the metadata: column types, tables used in the query, column labels/aliases, and so on. In SQL Lab, running a SQL query only shows the result data, without column metadata or other additional info.

What do you think about it?

@lyndsiWilliams
Member Author

Hey @EugeneTorap!

  • This is a good suggestion, but it is a pretty substantial change for this PR. It will be better to do this in a future cleanup PR.
  • We currently don't have plans to add a virtual dataset from this flow. Now that we can create charts from a query in SQL Lab, the left panel in Explore should show you the column metadata info.

@@ -168,15 +202,52 @@ export interface IDatasetPanelProps {
* Boolean indicating if the component is in a loading state
*/
loading: boolean;
datasets?: DatasetObject[] | undefined;
Member

nit: I kind of want this to have a better name, maybe existingDatasets? or something along those lines.

Member Author

I was going off of what @eschutho suggested in this review comment. I'm not really settled on the best name here but I'd be fine changing it, would like to hear what Elizabeth thinks of existingDatasets as well.

Member

Makes sense!

Does this need to be a list, or can it be a boolean? From my limited understanding of this feature, you're checking whether there is already a dataset linked, so we don't need to pass in a list of datasets; we could just check whether there is an associated one.

Member Author

We need the whole object so we can cross-reference the list of table names with the table names in the list of datasets here:

{datasetNames?.includes(tableName) &&
renderExistingDatasetAlert(
datasets?.find(dataset => dataset.table_name === tableName),
)}

And when it's passed into renderExistingDatasetAlert() we need to be able to pull out the explore_url from the dataset object here:

onClick={() => {
window.open(
dataset?.explore_url,
'_blank',
'noreferrer noopener popup=false',
);
}}
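
The reasoning above can be sketched as a small lookup. The `SketchDataset` type is a reduced, hypothetical version of `DatasetObject`, and the sample `explore_url` value is made up; `findExistingDataset` is an illustrative helper, not a function from the PR.

```typescript
// Reduced, hypothetical dataset shape with only the fields discussed here.
interface SketchDataset {
  table_name: string;
  explore_url: string;
}

// Returns the matching dataset (not just true/false), so the caller can
// still read `explore_url` for the "View dataset" link.
const findExistingDataset = (
  tableName: string,
  datasets: SketchDataset[] = [],
): SketchDataset | undefined =>
  datasets.find(dataset => dataset.table_name === tableName);

// Made-up sample data for illustration.
const sampleDatasets: SketchDataset[] = [
  { table_name: 'orders', explore_url: '/explore/?datasource_id=1' },
];

const match = findExistingDataset('orders', sampleDatasets);
const noMatch = findExistingDataset('users', sampleDatasets);
```

A boolean flag would answer whether the alert should be shown, but it would drop the URL that the alert's button needs.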

Member

got it! Makes sense

}

const EXISTING_DATASET_DESCRIPTION = t(
'You can only associate one dataset with one table. This table already has a dataset associated with it in Preset.\n',
Member

Hmm, does this need to be Preset-specific language? If so, then I think we need to add that on the Preset side and have a more generic description in Superset.

Member Author

That makes sense, but I'm not sure about Preset-specific language here. I think we'd need @yousoph 's input on this one.

Member

Updated text: This table already has a dataset associated with it. You can only associate one dataset with a table.

Member Author

I spoke with Sophie and Karan in Slack and we discussed changing the text to be more general and not include Superset or Preset. Changed in this commit.

@AAfghahi
Member

/testenv up

@github-actions
Contributor

@AAfghahi Ephemeral environment spinning up at http://52.42.126.171:8080. Credentials are admin/admin. Please allow several minutes for bootstrapping and startup.

@lyndsiWilliams
Member Author

/testenv up

@github-actions
Contributor

@lyndsiWilliams Ephemeral environment spinning up at http://54.202.36.48:8080. Credentials are admin/admin. Please allow several minutes for bootstrapping and startup.

@@ -94,7 +94,7 @@ const LoaderContainer = styled.div`
display: flex;
align-items: center;
justify-content: center;
- height: 100%;
+ height: 98%;
Contributor

@lyndsiWilliams typically we see scrolling when something's width / height is set to 100% because of box-sizing logic. By default, CSS does not include padding and border thickness in an element's specified width / height. The child therefore sets its content size to 100% of the parent element and then adds its own padding and border on top (100% of parent + child's padding and borders), making the child larger than the parent's visible area. Getting rid of the scrollbar should be achievable with:

height: 100%;
box-sizing: border-box;

The CSS box-sizing property allows us to include the padding and border in an element's total width and height calculation

The reason we might want to try this instead of reducing the height to 98% is that 2% of the total height in actual pixels varies with the browser window's height. The higher the screen resolution and the taller the browser window, the more that 2% offsets the actual vertical center alignment. For example, on an HD screen the offset would be ~17px (850px of usable screen after the browser's UI; 850 * 0.02 = 17px). On a 4K screen the offset could be significantly larger. In other words, the taller the browser grows, the further off-center the item becomes. Conversely, as the screen size shrinks, it becomes possible that 2% of the height is less than the padding + border, and the scrollbar still ends up appearing. Using 100% with box-sizing ensures we remain center-aligned and avoid scrolling regardless of browser height and screen resolution.

good explanation with interactive examples of box-sizing here: https://www.w3schools.com/css/css3_box-sizing.asp
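
The arithmetic in this comment can be worked through with a small sketch. The padding and border values are made-up inputs, and the helper names are illustrative; the model only covers the simple case of a child with `height: 100%`.

```typescript
// Inputs for the sketch; all values are in px.
interface BoxMetrics {
  parentHeight: number;
  padding: number; // applied to both top and bottom
  border: number;  // applied to both top and bottom
}

// Rendered outer height of a child with `height: 100%`.
// content-box: padding and border are added on top of the 100%.
// border-box: padding and border are included in the 100%.
const renderedHeight = (
  box: BoxMetrics,
  sizing: 'content-box' | 'border-box',
): number =>
  sizing === 'content-box'
    ? box.parentHeight + 2 * (box.padding + box.border)
    : box.parentHeight;

// Pixels removed when the child is shrunk to `percent` of the parent.
const shrinkOffset = (parentHeight: number, percent: number): number =>
  (parentHeight * (100 - percent)) / 100;

const hdBox: BoxMetrics = { parentHeight: 850, padding: 8, border: 1 };
const overflowingHeight = renderedHeight(hdBox, 'content-box'); // 868
const fittingHeight = renderedHeight(hdBox, 'border-box');      // 850
const hdOffset = shrinkOffset(850, 98);                         // 17
```

With these assumed values the content-box child overflows its parent by 18px, which is what triggers the scrollbar, while the 98% workaround removes a fixed share of the height that grows with the window.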

Member Author

Oh that makes sense, I didn't know about box-sizing, thank you! But setting height to 100% and box-sizing: border-box did not get rid of the scrollbar on my end. Did it work on your end?

I see the reasoning for wanting to include 100% of the element, but I was thinking it might be acceptable to do 98% height for this since it's just the loading gif that shows in the center, so that 2% will never cut any visual elements off. What do you think?

Member Author

Implemented the border-box fix we discussed on Slack in this commit.

Contributor

@eric-briscoe eric-briscoe left a comment

@lyndsiWilliams Latest looks good! There is one more CSS height issue we discussed that is already fixed in follow on PR to this so approving as is. thanks!

Member

@eschutho eschutho left a comment

This is great @lyndsiWilliams!!

@lyndsiWilliams lyndsiWilliams merged commit 04b7a26 into master Dec 5, 2022
@lyndsiWilliams lyndsiWilliams deleted the lyndsi/tables-with-preexisting-datasets branch December 5, 2022 21:43
@github-actions
Contributor

github-actions bot commented Dec 5, 2022

Ephemeral environment shutdown and build artifacts deleted.

@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 2.1.0 and removed 🚢 2.1.3 labels Mar 13, 2024
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/L 🚢 2.1.0

7 participants