Simplify and improve expand_data in Presto #8233

Merged
6 commits merged on Sep 17, 2019

betodealmeida
Member

CATEGORY

Choose one

  • Bug Fix
  • Enhancement (new features, refinement)
  • Refactor
  • Add tests
  • Build / Development Environment
  • Documentation

SUMMARY

This PR simplifies the logic of the expand_data method in Presto, while making it more generic. Queries that were failing before, like

SELECT ARRAY[ROW(100,0,'hello')]

are now working.

Some of the methods in db_engine_specs/presto.py are no longer used and can be removed; I'll do that in a subsequent PR. I wanted to keep this PR simple and small because it's blocking an important release.

TEST PLAN

I updated the unit tests, and tested with real data. I'll add more unit tests in the next PR, covering more complex cases.

ADDITIONAL INFORMATION

  • Has associated issue:
  • Changes UI
  • Requires DB Migration.
  • Confirm DB Migration upgrade and downgrade tested.
  • Introduces new feature or API
  • Removes existing feature or API

REVIEWERS

@khtruong

@codecov-io

Codecov Report

Merging #8233 into master will increase coverage by 0.04%.
The diff coverage is 93.97%.

@@            Coverage Diff             @@
##           master    #8233      +/-   ##
==========================================
+ Coverage   65.98%   66.03%   +0.04%     
==========================================
  Files         479      479              
  Lines       23038    23105      +67     
  Branches     2552     2552              
==========================================
+ Hits        15202    15257      +55     
- Misses       7700     7712      +12     
  Partials      136      136

Impacted Files                       Coverage Δ
superset/utils/core.py               88.33% <100%> (+0.38%) ⬆️
superset/db_engine_specs/presto.py   77.52% <92.06%> (-0.28%) ⬇️

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1ad1793...e337b35.


For rows, we return a list of the columns:

>>> get_children(dict(name="a", type="ROW(BIGINT,FOO VARCHAR)"))

Contributor

This is more for my understanding than a PR comment. I thought each field in a row has a name before its type, for example ROW(FOO BIGINT, BAR VARCHAR). When would a column be nameless?

Member Author

Good question! If you do a query like:

SELECT ARRAY[ROW(100,0,'hello')]

the resulting column has type ARRAY(ROW(INTEGER,INTEGER,VARCHAR(5))), whose ROW fields have no names.
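
To illustrate the named versus nameless distinction, here is a minimal sketch of splitting a ROW type string into its fields. It is not the code in this PR: `split_top_level` and `row_fields` are hypothetical helpers written for this comment, and the sketch assumes single-word type names (something like TIMESTAMP WITH TIME ZONE would need extra handling).

```python
from typing import Iterator, List, Optional, Tuple


def split_top_level(text: str, delimiter: str) -> Iterator[str]:
    """Split text on delimiter, skipping delimiters nested inside parentheses."""
    depth = 0
    start = 0
    for i, char in enumerate(text):
        if char == "(":
            depth += 1
        elif char == ")":
            depth -= 1
        elif char == delimiter and depth == 0:
            yield text[start:i]
            start = i + 1
    yield text[start:]


def row_fields(type_str: str) -> List[Tuple[Optional[str], str]]:
    """Return (name, type) pairs for a ROW type; name is None when absent."""
    prefix, suffix = "ROW(", ")"
    if not (type_str.upper().startswith(prefix) and type_str.endswith(suffix)):
        raise ValueError(f"not a ROW type: {type_str}")
    inner = type_str[len(prefix):-len(suffix)]
    fields: List[Tuple[Optional[str], str]] = []
    for field in split_top_level(inner, ","):
        field = field.strip()
        # A named field looks like "FOO VARCHAR"; a nameless one is just "BIGINT".
        parts = [part for part in split_top_level(field, " ") if part]
        if len(parts) == 2:
            fields.append((parts[0], parts[1]))
        else:
            # Assumption: anything else is treated as a nameless field.
            fields.append((None, field))
    return fields
```

With this sketch, `row_fields("ROW(FOO BIGINT, BAR VARCHAR)")` yields a name for every field, while `row_fields("ROW(INTEGER,INTEGER,VARCHAR(5))")` yields `None` for every name, so any caller has to invent placeholder names for those columns. The ARRAY(...) wrapper has to be stripped before calling it.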

@betodealmeida betodealmeida merged commit 4132d8f into apache:master Sep 17, 2019
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.35.0 labels Feb 28, 2024
Labels

  • 🏷️ bot: A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels
  • lyft: Related to Lyft
  • size/L
  • 🚢 0.35.0

4 participants