Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: create backend routes and API for importing saved queries #13893

Merged
merged 9 commits into from
Apr 8, 2021

Conversation

AAfghahi
Copy link
Member

SUMMARY

Created backend routes and validation for Import Saved Queries feature that Superset is rolling out.

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

none yet

TEST PLAN

none yet!

ADDITIONAL INFORMATION

  • Has associated issue:
  • Changes UI
  • Includes DB Migration (follow approval process in SIP-59)
    • Migration is atomic, supports rollback & is backwards-compatible
    • Confirm DB migration upgrade and downgrade tested
    • Runtime estimates and downtime expectations provided
  • Introduces new feature or API
  • Removes existing feature or API

log_to_statsd=False,
)
def import_(self) -> Response:
"""Import chart(s) with associated datasets and databases
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

chart -> saved_query (same thing in other places)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

embarrassing, thank you for pointing it out.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've done that so many times! 😆

action=lambda self, *args, **kwargs: f"{self.__class__.__name__}.post",
log_to_statsd=False,
)
def post(self) -> Response:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We don't need this for this PR.

Creating a new saved query would be an API for when the user clicks "save query" in SQL Lab. Currently that flow uses the old API (savedqueryviewapi/api/create), but when we eventually migrate it to the new v1 API this is the endpoint that would be used.

I would suggest moving this out of the PR, and have a separate PR introducing this and the command, so we can move "save query" to this endpoint.

@@ -0,0 +1,56 @@
# Licensed to the Apache Software Foundation (ASF) under one
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is also not needed, let's move to a separate PR.

prefix ="saved_queries/"
schemas: Dict[str, Schema] = {
"datasets/": ImportV1DatasetSchema(),
"saved_query": ImportV1SavedQuerySchema(),
Copy link
Member

@betodealmeida betodealmeida Mar 31, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

queries/, this maps to the directory inside the ZIP file

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

by this do you mean changing it to saved_queries or replacing it with queries/ altogether. Furthermore, would I then use ImportV1SavedQuerySchema still?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is used when validating the schema of all the files inside the ZIP: https://github.com/apache/superset/blob/master/superset/commands/importers/v1/__init__.py#L108-L111

If you open the ZIP file from a saved query export you'll see that saved queries are stored inside the queries/ subdirectory, so we need to use that:

schemas: Dict[str, Schema] = {
        "datasets/": ImportV1DatasetSchema(),
        "queries/": ImportV1SavedQuerySchema(),

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok that makes a lot of sense. Thank you.

# discover datasets associated with saved queries
dataset_uuids: Set[str] = set()
for file_name, config in configs.items():
if file_name.startswith("saved_queries/"):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same here, currently the directory is queries/. (Check the ZIP file with an exported saved query.)

(We can't change it to saved_queries/ unless we bump the version of the importer.)

def _import(
session: Session, configs: Dict[str, Any], overwrite: bool = False
) -> None:
# discover datasets associated with saved queries
Copy link
Member

@betodealmeida betodealmeida Mar 31, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Saved queries are not associated with datasets, they're associated with databases only. This is how the exported saved query looks like:

schema: master
label: Untitled Query 2
description: test
sql: SELECT * FROM covid_dashboard WHERE WHERE FROM HA AVG SUM MAX
uuid: 5a8be71f-f59f-4a7c-a7b3-e59394168062
version: 1.0.0
database_uuid: 45812a92-6844-4578-8692-0cc603c3fed0

So you need to scan saved queries, find the associated databases, import them, then import the saved queries. There's no datasets involved here.

Copy link
Member Author

@AAfghahi AAfghahi Apr 1, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

def _import(
        session: Session, configs: Dict[str, Any], overwrite: bool = False
    ) -> None:
        # discover databases associated with saved queries
        dataset_uuids: Set[str] = set()
        for file_name, config in configs.items():
            if file_name.startswith("queries/"):
                dataset_uuids.add(config["database_uuid"])

        # import related databases
        database_ids: Dict[str, int] = {}
        for file_name, config in configs.items():
            if file_name.startswith("databases/") and config["uuid"] in database_uuids:
                database = import_database(session, config, overwrite=False)
                database_ids[str(database.uuid)] = database.id


        # import saved queries with the correct parent ref
        for file_name, config in configs.items():
            if file_name.startswith("queries/") and config["database_uuid"] in database:
                # update datasource id, type, and name
                database = database[config["dataset_uuid"]]
                config.update(
                    {
                        "datasource_id": database.id,
                        "datasource_name": database.table_name,
                    }
                )
                config["params"].update({"datasource": database.uuid})
                import_saved_query(session, config, overwrite=overwrite)

is what I set it to, if I am understanding this correctly. Though I'd like to talk through what this file is doing

@betodealmeida
Copy link
Member

Don't forget to add unit tests, you can find examples for all the other imports inside tests.

@hughhhh hughhhh self-requested a review March 31, 2021 20:46
@@ -25,3 +30,10 @@ class SavedQueryBulkDeleteFailedError(DeleteFailedError):

class SavedQueryNotFoundError(CommandException):
message = _("Saved query not found.")

class SavedQueryImportError(ImportFailedError):
message = _("Import chart failed for an unknown reason")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

End the message in a period for consistency.

message = _("Import chart failed for an unknown reason")

class SavedQueryInvalidError(CommandInvalidError):
message = _("Saved Query Parameters are invalid")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

message = _("Saved Query parameters are invalid.")


dao = SavedQueryDAO
model_name= "saved_queries"
prefix ="saved_queries/"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This also needs to be queries/, matching the name of the directory inside the ZIP file.

@AAfghahi AAfghahi changed the title feature: create backend routes and API for importing saved queries feat: create backend routes and API for importing saved queries Apr 1, 2021
@AAfghahi AAfghahi force-pushed the saved_queries branch 2 times, most recently from 80a37cf to fd5f4b4 Compare April 6, 2021 15:36
@AAfghahi AAfghahi marked this pull request as ready for review April 7, 2021 00:00
@codecov
Copy link

codecov bot commented Apr 7, 2021

Codecov Report

Merging #13893 (a2235c3) into master (9d0bb3a) will increase coverage by 2.64%.
The diff coverage is 87.87%.

❗ Current head a2235c3 differs from pull request most recent head a48bd8c. Consider uploading reports for the commit a48bd8c to get more accurate results
Impacted file tree graph

@@            Coverage Diff             @@
##           master   #13893      +/-   ##
==========================================
+ Coverage   76.78%   79.42%   +2.64%     
==========================================
  Files         935      939       +4     
  Lines       47292    47542     +250     
  Branches     5895     5938      +43     
==========================================
+ Hits        36311    37759    +1448     
+ Misses      10824     9657    -1167     
+ Partials      157      126      -31     
Flag Coverage Δ
cypress 56.04% <74.07%> (+3.73%) ⬆️
hive 80.52% <83.76%> (+0.23%) ⬆️
mysql 80.82% <83.11%> (+0.21%) ⬆️
postgres 80.85% <83.11%> (+0.21%) ⬆️
presto 80.54% <85.06%> (+0.25%) ⬆️
python 81.40% <86.03%> (+0.26%) ⬆️
sqlite 80.41% <83.11%> (+0.22%) ⬆️

Flags with carried forward coverage won't be shown. Click here to find out more.

Impacted Files Coverage Δ
...tend/src/SqlLab/components/ScheduleQueryButton.tsx 28.12% <ø> (ø)
...et-frontend/src/SqlLab/components/TableElement.jsx 88.60% <ø> (+0.23%) ⬆️
...end/src/common/components/DropdownButton/index.tsx 24.00% <0.00%> (ø)
...frontend/src/components/DatabaseSelector/index.tsx 92.70% <ø> (+1.04%) ⬆️
...rset-frontend/src/components/ErrorMessage/types.ts 100.00% <ø> (ø)
superset-frontend/src/components/Icon/index.tsx 100.00% <ø> (ø)
...rset-frontend/src/components/ProgressBar/index.tsx 100.00% <ø> (ø)
...ontend/src/components/URLShortLinkButton/index.jsx 100.00% <ø> (ø)
.../components/CrossFilterScopingModal/utils/index.ts 100.00% <ø> (ø)
...ntend/src/dashboard/components/CssEditor/index.jsx 95.83% <ø> (ø)
... and 247 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 3c4591e...a48bd8c. Read the comment docs.


# import saved queries with the correct parent ref
for file_name, config in configs.items():
if file_name.startswith("queries/") and config["database_uuid"] in database:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

database here is a Database object, you want:

Suggested change
if file_name.startswith("queries/") and config["database_uuid"] in database:
if file_name.startswith("queries/") and config["database_uuid"] in database_ids:

log_to_statsd=False,
)
def import_(self) -> Response:
"""Import Saved Queries with associated datasets and databases
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Import Saved Queries with associated datasets and databases
"""Import Saved Queries with associated databases

description: JSON map of passwords for each file
type: string
overwrite:
description: overwrite existing databases?
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
description: overwrite existing databases?
description: overwrite existing saved queries?

"""Import Saved Queries"""

dao = SavedQueryDAO
model_name= "saved_queries"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The lint is going to fail here and in a few other places. You can auto format it by running pre-commit run --all-files.

Suggested change
model_name= "saved_queries"
model_name = "saved query"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok that saves me a lot of heart ache. ty

buf.seek(0)
return buf

@pytest.mark.usefixtures("create_saved_queries")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A fixture runs setup code before a test function, and cleanup code after. There's no fixture called create_saved_queries, so you should remove this.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that makes sense, sort of like a beforeEach afterEach in RTL. Would it be more thorough to add this fixture?

Comment on lines 148 to 149
assert saved_query.uuid == "05b679b5-8eaf-452c-b874-a7a774cfa4e9"
assert saved_query.database_uuid == "b8a1ccd3-779d-4ab7-8ad8-9ab119d7fe89"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suspect these will change every time you run the tests, so you might need to remove these tests.

Comment on lines 139 to 147
assert saved_query.sql == (
"""
-- Note: Unless you save your query,
these tabs will NOT persist if you clear
your cookies or change browsers.

SELECT * from birth_names
"""
)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test will probably fail because the SQL is formatted differently (there's no line break after "your query,", eg).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, they do. Was wondering about this actually, should I just not test the sql or is there a comment to disable the line is too long linter?

Copy link
Member

@betodealmeida betodealmeida Apr 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can disable to line it too linter, or break the line:

assert saved_query == (
    "-- Note: Unless you save your query, "
    "these tabs will NOT persist if you clear "
    "our cookies or change browsers.\n\n"
    ...
)

db.session.commit()

def test_import_v1_saved_queries_validation(self):
"""Test different validations applied when importing a chart"""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"""Test different validations applied when importing a chart"""
"""Test different validations applied when importing a saved query"""

"""Test different validations applied when importing a chart"""
# metadata.yaml must be present
contents = {
"metadata.yaml": yaml.safe_dump(saved_queries_metadata_config),
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You need to remove metadata.yaml, since here you're testing that a validation message "Missing metadata.yaml" is shown when the file is missing.

Copy link
Member Author

@AAfghahi AAfghahi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you

Copy link
Member Author

@AAfghahi AAfghahi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ty

Copy link
Member

@betodealmeida betodealmeida left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@AAfghahi
Copy link
Member Author

AAfghahi commented Apr 7, 2021

@betodealmeida Thank you! Though, its still failing two tests :/

@hughhhh hughhhh merged commit b5e5b3a into apache:master Apr 8, 2021
@AAfghahi AAfghahi deleted the saved_queries branch April 8, 2021 18:23
allanco91 pushed a commit to allanco91/superset that referenced this pull request May 21, 2021
…he#13893)

* initial commit

* revisions

* started tests

* added unit tests

* revisions

* tests passing

* fixed api test

* Update superset/queries/saved_queries/commands/importers/v1/utils.py

Co-authored-by: Hugh A. Miles II <hughmil3s@gmail.com>

* Revert "Update superset/queries/saved_queries/commands/importers/v1/utils.py"

This reverts commit 18580aa.

Co-authored-by: Hugh A. Miles II <hughmil3s@gmail.com>
QAlexBall pushed a commit to QAlexBall/superset that referenced this pull request Dec 29, 2021
…he#13893)

* initial commit

* revisions

* started tests

* added unit tests

* revisions

* tests passing

* fixed api test

* Update superset/queries/saved_queries/commands/importers/v1/utils.py

Co-authored-by: Hugh A. Miles II <hughmil3s@gmail.com>

* Revert "Update superset/queries/saved_queries/commands/importers/v1/utils.py"

This reverts commit 18580aa.

Co-authored-by: Hugh A. Miles II <hughmil3s@gmail.com>
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 1.2.0 labels Mar 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/L 🚢 1.2.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants