[DB Engine] Support old and new Presto syntax #7977

Merged

@etr2460 etr2460 commented Aug 2, 2019

CATEGORY

Choose one

  • Bug Fix
  • Enhancement (new features, refinement)
  • Refactor
  • Add tests
  • Build / Development Environment
  • Documentation

SUMMARY

In Presto v0.199 the syntax for getting a table's partitions changed. #7250 changed this logic to support only the new syntax, but some people might still be on older versions. This PR adds a config var to set your Presto version (as I couldn't find any way to get it programmatically with SQLAlchemy) and uses the correct syntax for your version of Presto.

TEST PLAN

CI

ADDITIONAL INFORMATION

  • Has associated issue:
  • Changes UI
  • Requires DB Migration.
  • Confirm DB Migration upgrade and downgrade tested.
  • Introduces new feature or API
  • Removes existing feature or API

REVIEWERS

@graceguo-supercat @betodealmeida @michellethomas @john-bodley

from superset.db_engine_specs.base import BaseEngineSpec
from superset.exceptions import SupersetTemplateException
from superset.models.sql_types.presto_sql_types import type_map as presto_type_map
from superset.utils import core as utils

QueryStatus = utils.QueryStatus

PRESTO_VERSION = app.config.get("PRESTO_VERSION")
Member

@etr2460 I don't think this is correct as there may be multiple Presto databases. Either this should be defined in the Database CRUD model (preferred) or this variable needs to be a dictionary keyed by database ID.

Member Author

Eek, that's true... This should be added to the database model

Member

There may be a way of determining this from the connection, though regardless it probably should be stored either in the model or cached.

Member Author

Yeah, I tried to figure out a way to get it from the connection, but couldn't find one. I've now added it to the extra params in the database model, as I didn't think it warranted a new column (since we're only using it with Presto DBs right now).
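A minimal sketch of reading a per-database version from the extra params, so that two Presto databases can carry different versions. It assumes the extra params are a JSON string and that the key is named `version`; both the key name and the helper `get_presto_version` are hypothetical and not taken from this PR's diff.

```python
import json
from typing import Optional


def get_presto_version(extra: Optional[str]) -> Optional[str]:
    """Return the version stored in a database's extra JSON, or None.

    Hypothetical helper: the "version" key name is an assumption.
    """
    if not extra:
        return None
    try:
        params = json.loads(extra)
    except json.JSONDecodeError:
        # Malformed extra params are treated the same as "version unset".
        return None
    if not isinstance(params, dict):
        return None
    version = params.get("version")
    return str(version) if version is not None else None


# Two databases on different Presto versions, each with its own extra params.
old_cluster_extra = '{"version": "0.123"}'
new_cluster_extra = "{}"
print(get_presto_version(old_cluster_extra))
print(get_presto_version(new_cluster_extra))
```

Returning `None` for a missing or unreadable value lets the caller fall back to the new syntax, which matches the "default to the new syntax if version is unset" behaviour discussed later in this review.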

@etr2460 etr2460 force-pushed the erik-ritter--fix-presto-partition branch 6 times, most recently from 64d1daa to 1722a3f Compare August 3, 2019 00:21
codecov-io commented Aug 3, 2019

Codecov Report

Merging #7977 into master will increase coverage by <.01%.
The diff coverage is 83.33%.

@@            Coverage Diff             @@
##           master    #7977      +/-   ##
==========================================
+ Coverage   65.59%   65.59%   +<.01%     
==========================================
  Files         469      469              
  Lines       22401    22403       +2     
  Branches     2432     2432              
==========================================
+ Hits        14694    14696       +2     
  Misses       7586     7586              
  Partials      121      121
Impacted Files Coverage Δ
superset/views/database/__init__.py 81.63% <ø> (ø) ⬆️
superset/db_engine_specs/presto.py 78.01% <83.33%> (+0.1%) ⬆️


Member

@villebro villebro left a comment

Most SQL engines support querying the version. For example, on SQLite the query select sqlite_version(); yields 3.28.0 for me. I would expect Presto to have something similar, but I was unable to find anything in the docs. If other similar use cases turn up, I think it would be worthwhile to add some type of version-checking mechanism to db_engine_specs, but for now I think this solution is OK. However, it would be good to add a note in docs/installation.rst that the version needs to be defined in the extra params for Presto to support the old syntax.
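To illustrate the SQLite case mentioned above, here is a small sketch that asks the engine for its version over a connection. It uses only the standard-library `sqlite3` module; the function name `sqlite_version_via_sql` is made up for this example, and nothing here implies that Presto exposes an equivalent function.

```python
import sqlite3


def sqlite_version_via_sql() -> str:
    """Ask SQLite for its library version with a plain SQL query."""
    conn = sqlite3.connect(":memory:")
    try:
        # fetchone() returns a one-element row such as ("3.28.0",).
        (version,) = conn.execute("SELECT sqlite_version()").fetchone()
    finally:
        conn.close()
    return version


print(sqlite_version_via_sql())
```

A generic version check in db_engine_specs could follow the same shape (run an engine-specific query, then cache the result), though that design is only a suggestion in this thread.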

# Default to the new syntax if version is unset.
partition_select_clause = (
f'SELECT * FROM "{table_name}$partitions"'
if not presto_version or presto_version >= "0.199"
Member

I think you may want something like

version_tuple = lambda v: tuple(map(int, v.split('.')))
version_tuple(presto_version) >= (0, 199)

Member

I would suggest using distutils which provides a version comparison method as described here.

Member Author

Although string compare is probably going to work in almost all cases, I agree that using distutils is a better solution. Updated!
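For context, a sketch of why plain string comparison can misorder versions, and of a numeric comparison that avoids it. The thread settled on distutils; this example swaps in a tuple-of-ints comparison instead, because distutils was deprecated by PEP 632 and removed in Python 3.12. The function names are hypothetical, and the `SHOW PARTITIONS FROM` branch is an assumption about the pre-0.199 syntax, since the diff hunk quoted above does not show the else branch.

```python
from typing import Optional, Tuple


def version_tuple(version: str) -> Tuple[int, ...]:
    """Turn a dotted version such as "0.199" into (0, 199)."""
    return tuple(int(part) for part in version.split("."))


def partition_select_clause(table_name: str, presto_version: Optional[str]) -> str:
    """Pick the partition query for a given Presto version (hypothetical helper)."""
    # Default to the new syntax if the version is unset.
    if not presto_version or version_tuple(presto_version) >= (0, 199):
        return f'SELECT * FROM "{table_name}$partitions"'
    # Assumed old syntax; not shown in the quoted diff.
    return f"SHOW PARTITIONS FROM {table_name}"


# Strings compare character by character, so "0.99" sorts after "0.199".
print("0.99" >= "0.199")
# Comparing integer tuples orders the same versions numerically.
print(version_tuple("0.99") >= version_tuple("0.199"))
print(partition_select_clause("logs", "0.99"))
```

This simple parser only handles purely numeric dotted versions; a version string with a suffix would raise `ValueError`, so a real implementation would need to decide how to treat those.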

@etr2460 etr2460 force-pushed the erik-ritter--fix-presto-partition branch from 1722a3f to aee8af4 Compare August 5, 2019 16:04
etr2460 commented Aug 5, 2019

@villebro I've added additional documentation in installation.rst

@john-bodley john-bodley merged commit d58dbad into apache:master Aug 5, 2019
@etr2460 etr2460 deleted the erik-ritter--fix-presto-partition branch August 5, 2019 19:57
graceguo-supercat pushed a commit to graceguo-supercat/superset that referenced this pull request Aug 5, 2019
etr2460 pushed a commit to etr2460/incubator-superset that referenced this pull request Aug 8, 2019
@mistercrunch mistercrunch added 🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels 🚢 0.34.0 labels Feb 28, 2024
Labels
🏷️ bot A label used by `supersetbot` to keep track of which PR where auto-tagged with release labels size/S 🚢 0.34.0
5 participants