Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(databases): GSheets and Clickhouse DBs are not allowed to upload files #21065

Merged
merged 6 commits into from
Sep 26, 2022

Conversation

Antonio-RiveroMartnez
Copy link
Member

@Antonio-RiveroMartnez Antonio-RiveroMartnez commented Aug 12, 2022

SUMMARY

Our users are getting an Internal Server Error 500 when trying to upload files to GSheets databases, that's because such databases don't support file upload nor the Clickhouse ones thus we need to stop showing such options to the users and prevent that existing ones get displayed in the upload form.

Additionally, schemas allowed field is dependent on having file upload enabled but doesn't appear that way, so we are rendering it only when user checks the Allow file upload checkbox and placing it underneath the file upload checkbox.

Wording changes:

  1. Update field name to : "Allow file uploads to database" instead of "Allow data upload" and remove the tooltip
  2. On the Schemas Allowed field, we should update "CSV Upload" text to be "File Upload" and "CSVs" to "files", including in the hint text

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

Before:
error

error

After:
test

test

TESTING INSTRUCTIONS

  1. When selecting a DB type, if GSheet or Clickhouse is selected the Allow data upload checkbox is not displayed in the form
  2. Enabled the file upload menu only if there is any exiting DB with allow_file_upload flag set to true and that is not GSheets nor Clickhouse
  3. When the form to file upload files is opened, the Database dropdown must not contain any GSheets nor Clickhouse DB.

ADDITIONAL INFORMATION

  • Has associated issue: Upload csv to existing clickhouse is not supported #19247
  • Required feature flags:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

@codecov
Copy link

codecov bot commented Aug 12, 2022

Codecov Report

Merging #21065 (cc338fe) into master (7938e81) will decrease coverage by 0.00%.
The diff coverage is 79.41%.

@@            Coverage Diff             @@
##           master   #21065      +/-   ##
==========================================
- Coverage   66.61%   66.61%   -0.01%     
==========================================
  Files        1791     1791              
  Lines       68559    68586      +27     
  Branches     7319     7325       +6     
==========================================
+ Hits        45671    45686      +15     
- Misses      21002    21017      +15     
+ Partials     1886     1883       -3     
Flag Coverage Δ
hive 52.98% <72.22%> (+0.01%) ⬆️
javascript 52.80% <87.50%> (-0.01%) ⬇️
mysql 80.83% <72.22%> (-0.01%) ⬇️
postgres 80.88% <72.22%> (-0.01%) ⬇️
presto 52.88% <72.22%> (+0.01%) ⬆️
python 81.35% <72.22%> (-0.01%) ⬇️
sqlite 79.43% <72.22%> (-0.01%) ⬇️
unit 50.84% <72.22%> (+0.01%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
superset-frontend/src/views/components/SubMenu.tsx 86.88% <ø> (ø)
superset/databases/api.py 95.49% <ø> (ø)
superset/views/database/forms.py 85.71% <40.00%> (-3.18%) ⬇️
...tend/src/views/CRUD/data/database/DatabaseList.tsx 65.68% <66.66%> (-0.32%) ⬇️
superset/models/core.py 88.76% <71.42%> (-0.29%) ⬇️
...perset-frontend/src/views/components/RightMenu.tsx 71.31% <83.33%> (+7.97%) ⬆️
.../CRUD/data/database/DatabaseModal/ExtraOptions.tsx 86.95% <100.00%> (+3.62%) ⬆️
...c/views/CRUD/data/database/DatabaseModal/index.tsx 33.00% <100.00%> (+0.72%) ⬆️
superset/db_engine_specs/base.py 88.40% <100.00%> (+0.07%) ⬆️
superset/db_engine_specs/clickhouse.py 66.66% <100.00%> (+0.62%) ⬆️
... and 5 more

📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more


class DatabasesNotAllowedToFileUpload(str, Enum):
GSHEETS = "gsheets"
CLICKHOUSE = "clickhousedb"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Rather than maintaining a separate enum for this, why not define this as a property on the db_engine_spec?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @nytai! I just updated the PR with such property in our base and each engine now. Thanks for the suggestion.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@nytai let us know if you want us to recruit additional folks for a review, or if you have time to take a look. Thanks!

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think having someone with more experience with these per-db configs, like @villebro or @john-bodley, have a look would be good.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@Antonio-RiveroMartnez Thanks for making that change, multiple places in the frontend where you're checking
db?.engine !== Engines.GSheet && db?.engine !== Engines.ClickHouse. This makes adding a new engine that does not support file upload a pain, since you have to update a bunch of files across both backend and frontend code. Ideally one could just specify in the db_engine_spec allows_file_upload=False and that's it. So maybe adding a new api that you could call from the db modal to check is a supplied connection string or config supports file uploads would be a better approach?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree with @nytai here - we should place this flag in the db engine specs, and the UI should accommodate accordingly. I may be wrong, but I believe we're already sending some db engine spec metadata in bootstrap data. Maybe we could just add allows_file_upload to BaseEngineSpec, override it in the GSheets and ClickHouse specs, and then make sure the flag is made available in bootstrap data? If we're not passing any info about db engine specs, now would probably be a good time to do so.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @nytai @villebro! Yeah, currently the flag is being set in our BaseEngineSpec and being overridden in our GSheets and Clickhouse specs. The missing part is sending it (or using it) in the frontend instead of db?.engine !== Engines.GSheet && db?.engine !== Engines.ClickHouse. So, let me check the bootstrap data and if not doing it already let me start sending it. Thank you both for the input.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok @nytai @villebro , I updated the PR, we were sending some extra information in the response but that might is editable by the user so I decided to include a new engine_information that currently only has allows_file_upload flag. Now, There is no need for updating multiple places but instead just setting such flags as true to prevent file uploading.

@yousoph
Copy link
Member

yousoph commented Aug 17, 2022

/testenv up

@github-actions
Copy link
Contributor

@yousoph Ephemeral environment spinning up at http://54.201.86.244:8080. Credentials are admin/admin. Please allow several minutes for bootstrapping and startup.

@yousoph
Copy link
Member

yousoph commented Aug 19, 2022

/testenv up

@github-actions
Copy link
Contributor

@yousoph Ephemeral environment spinning up at http://54.213.205.154:8080. Credentials are admin/admin. Please allow several minutes for bootstrapping and startup.

:returns: Dict with properties of our class like allows_file_upload
"""
return {
"allows_file_upload": cls.allows_file_upload,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

how is this different from the current colum allow_file_upload? If they are very different, then should we think about changing the name of this?

allows_file_upload and allow_file_upload being different columns that serve different purposes is going to be way too confusing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The allow_file_upload is in the Database model and it's set per each user's DB. While the allows_file_upload is set in the DB Engine Spec model and cannot be set by the user. So, even when a Database has its flag set to true (think of existing user's DB), if its DB Engine Spec doesn't support it, we prevent file uploading to it.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok that makes sense!

Could we name it supports_file_upload or something along the like? Having two columns only separated by a singular/plural shift is going to be confusing.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, totally! Let me make the change and ILYK. Thanks for the input @AAfghahi

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

of course! Otherwise this LGTM!

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Updated with the new supports_file_upload @AAfghahi :)

@github-actions
Copy link
Contributor

Storybook has completed and can be viewed at

- GSheets  and Clickhouse databases are not allowed to upload files to it so we need to stop showing such options to the users. This needs to happens for new DBs and existing ones. Also include Clickhouse DB as not allowed for data upload.
- Add a new property in db_engine_spec so we can tell whether a db engine support file upload. It is True in by default.
- Stop using an Enum and instead make the method that populates the dropdown in our form use the new property for the engine of  a given database
- The text input Schemas allowed for File upload renders conditionally based on whther the user selects Allow file upload.
- Fix margins for checboxes and new location of the text input
- Update the wording for Allow file upload and text input for schemas
- Remove tooltip from Allow file upload checkbox
- Update tests
        - Include a new engine_information property in the response got our GET apis (single and list). Currently inside that property we send the allows_file_upload config value for the specific engine but it can added more information later.
        - Instead of manually checking if engine is GSheets or Clickhouse, use the new property so when new engines are added we dont need to change multiple file but just the property for it.
        - Update our tests and openapi spec to consider the new property
- Rename allows_file_upload to supports_file_upload so we avoid confusion from the DB property
- Update our new RightMenu tests to include the environmentTag required prop
@github-actions
Copy link
Contributor

github-actions bot commented Sep 9, 2022

Ephemeral environment shutdown and build artifacts deleted.

@@ -118,8 +118,8 @@ const RightMenu = ({
ALLOWED_EXTENSIONS,
HAS_GSHEETS_INSTALLED,
} = useSelector<any, ExtentionConfigs>(state => state.common.conf);
const [showModal, setShowModal] = useState<boolean>(false);
const [engine, setEngine] = useState<string>('');
const [showModal, setShowModal] = React.useState<boolean>(false);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What is the reason for using this versus importing in useState and invoking it normally?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So we can spyOn during our tests, e.g. const useStateMock = jest.spyOn(React, 'useState'); https://github.com/apache/superset/pull/21065/files#diff-6a8138bf7b3a4c95d31569dfe4497ceb65ae07faddc47fb944a611ec7fb19776R86

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you look at the line above that for useSelector, could you follow that method instead and import useState?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You cannot because if you do something like:

jest.mock('react', () => ({
  ...jest.requireActual('react'),
  useState: jest.fn(),
}));

the whole tests are going to fail because React itself wont have all what it needs to render and handle cases where we actually need regular useState, and after a while researching I found that spying on native hooks from react would require such usage as we have here. I guess it's the same reason why this other component uses it already in our code base:

https://github.com/apache/superset/tree/master/superset-frontend/src/explore/components/controls/ControlPopover

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But that test and file still imports useState and uses it instead of doing React.useState

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indeed but the usage of the spied hook is different which might be affecting why these tests cannot do it that way, either way I'll give a spin to see if it works that way, if not possible we need to decide whether we move forward with it as is or not 🤔 I'll keep you posted @AAfghahi , thanks for the input 🎉

@eschutho eschutho merged commit b36bd3f into apache:master Sep 26, 2022
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 2.1.0 and removed 🚢 2.1.3 labels Mar 13, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/XL 🚢 2.1.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants