fix: datatype tracking issue on virtual dataset #20088

codemaster08240328 · 2022-05-16T20:20:28Z

SUMMARY

Column types are not detected when creating a virtual dataset.

When you create a virtual dataset and then build a chart on it, some column types may not be detected as they are defined in the original dataset. This happens when a column has no non-null values in the result of the SQL query used to create the dataset.
The previous version inferred each column's type from the query result values, which is why some columns ended up with a null type even though they have a definite type in the original dataset.

To fix this, the column type is now taken from the original dataset rather than from the values returned by the SQL query.
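The difference between the two approaches can be sketched as follows. This is a minimal illustration, not the code from this PR: it assumes the database driver exposes a PEP 249 style `cursor.description`, where each entry carries a `type_code` in position 1. The helper names, the `TYPE_CODE_NAMES` mapping, and the `cancelled_at` column are hypothetical.

```python
from typing import Any, Dict, Optional, Sequence, Tuple

# Illustrative subset of PostgreSQL type OIDs; the names on the right are
# made up for this sketch and are not Superset's type names.
TYPE_CODE_NAMES: Dict[int, str] = {23: "INTEGER", 25: "STRING", 1114: "DATETIME"}


def infer_type_from_values(
    rows: Sequence[Tuple[Any, ...]], index: int
) -> Optional[str]:
    """Value-based inference: returns None when every value in the column is NULL."""
    for row in rows:
        if row[index] is not None:
            return type(row[index]).__name__
    return None


def type_from_description(
    description: Sequence[Tuple[Any, ...]], index: int
) -> Optional[str]:
    """Metadata-based lookup: reads the type_code slot of a description entry."""
    return TYPE_CODE_NAMES.get(description[index][1])


# Hypothetical result set: the second column is NULL in every returned row.
description = [
    ("site_slug", 25, None, None, None, None, None),
    ("cancelled_at", 1114, None, None, None, None, None),
]
rows = [("connectmiles", None), ("rocketmiles", None)]

values_based = [infer_type_from_values(rows, i) for i in range(len(description))]
metadata_based = [type_from_description(description, i) for i in range(len(description))]
```

With this input, the value-based approach has nothing to inspect for the all-NULL column and returns `None` for it, while the metadata-based lookup still resolves a type for both columns.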

BEFORE/AFTER SCREENSHOTS OR ANIMATED GIF

BEFORE

AFTER:

TESTING INSTRUCTIONS

Create a virtual dataset using this query:

SELECT *
FROM public."SQLLab_Test_For_VirtualDS"
WHERE site_slug in ('connectmiles', 'rocketmiles')
AND reward_program_slug in ('connectmiles')
LIMIT 10

Try to create a chart based on the virtual dataset.
Check that every column has a concrete data type rather than NULL.

Source data for the virtual dataset:
sqllab_query_publicconnectmiles_all_reservation_stats_20220412T210849.csv

ADDITIONAL INFORMATION

Has associated issue:
Required feature flags:
Changes UI
Includes DB Migration (follow approval process in SIP-59)
- Migration is atomic, supports rollback & is backwards-compatible
- Confirm DB migration upgrade and downgrade tested
- Runtime estimates and downtime expectations provided
Introduces new feature or API
Removes existing feature or API

codecov · 2022-05-16T20:27:12Z

Codecov Report

Merging #20088 (fa4f374) into master (c8fe518) will decrease coverage by 11.91%.
The diff coverage is 37.50%.

@@             Coverage Diff             @@
##           master   #20088       +/-   ##
===========================================
- Coverage   66.47%   54.55%   -11.92%     
===========================================
  Files        1727     1727               
  Lines       64724    64732        +8     
  Branches     6822     6822               
===========================================
- Hits        43024    35314     -7710     
- Misses      19969    27687     +7718     
  Partials     1731     1731

Flag	Coverage Δ
hive	`53.69% <37.50%> (-0.01%)`	⬇️
mysql	`?`
postgres	`?`
presto	`53.55% <37.50%> (-0.01%)`	⬇️
python	`57.98% <37.50%> (-24.66%)`	⬇️
sqlite	`?`
unit	`49.45% <37.50%> (-0.01%)`	⬇️

Flags with carried forward coverage won't be shown.

Impacted Files	Coverage Δ
superset/db_engine_specs/postgres.py	`59.32% <37.50%> (-37.96%)`	⬇️
superset/utils/dashboard_import_export.py	`0.00% <0.00%> (-100.00%)`	⬇️
superset/key_value/commands/upsert.py	`0.00% <0.00%> (-89.59%)`	⬇️
superset/key_value/commands/update.py	`0.00% <0.00%> (-89.37%)`	⬇️
superset/key_value/commands/delete.py	`0.00% <0.00%> (-85.30%)`	⬇️
superset/key_value/commands/delete_expired.py	`0.00% <0.00%> (-80.77%)`	⬇️
superset/dashboards/commands/importers/v0.py	`15.62% <0.00%> (-76.25%)`	⬇️
superset/datasets/commands/update.py	`25.88% <0.00%> (-68.24%)`	⬇️
superset/datasets/commands/create.py	`30.18% <0.00%> (-67.93%)`	⬇️
superset/datasets/commands/importers/v0.py	`24.03% <0.00%> (-67.45%)`	⬇️
... and 272 more

Legend:
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update c8fe518...fa4f374.

betodealmeida

Looks good! :)

betodealmeida · 2022-05-20T20:05:24Z

Looks like there's a test failing.

villebro · 2022-06-03T11:13:59Z

superset/db_engine_specs/postgres.py

@@ -21,6 +21,7 @@
 from typing import Any, Dict, List, Optional, Pattern, Tuple, TYPE_CHECKING

 from flask_babel import gettext as __
+from psycopg2.extensions import binary_types, string_types


I think this import should be moved into `get_datatype`, as psycopg2 isn't a required dependency.

Fixed by #20543
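For illustration, deferring the import as suggested in the review comment could look roughly like this. It is a sketch under assumptions, not the code from this PR or from #20543: the standalone function form and the `ImportError` fallback are hypothetical, while `binary_types` and `string_types` are the psycopg2 names shown in the diff above.

```python
from typing import Any, Optional


def get_datatype(type_code: Any) -> Optional[str]:
    """Map a cursor type code to a type name, or None if it cannot be resolved."""
    try:
        # Deferred import: psycopg2 is only loaded when this function runs,
        # so importing the enclosing module does not require the driver.
        from psycopg2.extensions import binary_types, string_types
    except ImportError:
        # Assumption for this sketch: treat a missing driver as "type unknown".
        return None

    # Both objects are dicts keyed by type OID; merge them into one lookup table.
    types = binary_types.copy()
    types.update(string_types)
    if type_code in types:
        return types[type_code].name
    return None
```

The trade-off is that a missing driver surfaces only when the function is first called rather than at module import time, which is the point when the dependency is optional.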

* Fix datatype tracking issue on virtual dataset
* fix pytest issue on postgresql

codemaster08240328 added 2 commits May 16, 2022 16:05

Fix datatype tracking issue on virtual dataset

df70299

Merge branch 'master' into fix/virtual-dataset-tracking-datatype

e6159de

superset-github-bot bot added the Preset-Patch label May 16, 2022

pull-request-size bot added the size/XS label May 16, 2022

rusackas requested a review from betodealmeida May 16, 2022 20:36

diegomedina248 approved these changes May 16, 2022


betodealmeida approved these changes May 20, 2022


codemaster08240328 added 2 commits May 31, 2022 08:01

Merge branch 'master' into fix/virtual-dataset-tracking-datatype

fa4f374

fix pytest issue on postgresql

7b22e6b

pull-request-size bot added size/S and removed size/XS labels May 31, 2022

rusackas merged commit 74c5479 into apache:master Jun 1, 2022

villebro reviewed Jun 3, 2022


philipher29 pushed a commit to ValtechMobility/superset that referenced this pull request Jun 9, 2022

fix: datatype tracking issue on virtual dataset (apache#20088)

588b712

* Fix datatype tracking issue on virtual dataset
* fix pytest issue on postgresql

mistercrunch added the 🏷️ bot and 🚢 2.0.0 labels Mar 13, 2024

michael-s-molina Jun 29, 2022