-
Notifications
You must be signed in to change notification settings - Fork 13.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
refactor(database): use SupersetResultSet on SqlaTable.get_df() #10707
Conversation
Codecov Report
@@ Coverage Diff @@
## master #10707 +/- ##
==========================================
- Coverage 64.76% 60.23% -4.54%
==========================================
Files 789 789
Lines 37091 37077 -14
Branches 3555 3555
==========================================
- Hits 24021 22332 -1689
- Misses 12963 14558 +1595
- Partials 107 187 +80
Flags with carried forward coverage won't be shown. Click here to find out more.
Continue to review full report at Codecov.
|
83d46d5
to
383664e
Compare
@@ -25,7 +25,7 @@ | |||
str, str, Optional[str], Optional[str], Optional[int], Optional[int], bool | |||
] | |||
DbapiDescription = Union[List[DbapiDescriptionRow], Tuple[DbapiDescriptionRow, ...]] | |||
DbapiResult = List[Union[List[Any], Tuple[Any, ...]]] | |||
DbapiResult = Sequence[Union[List[Any], Tuple[Any, ...]]] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for some reason MyPy
was giving me errors on this locally (I'm probably running a newer version than CI). As I know some Dbapis are actually returning tuples, I think this makes sense anyway.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
8074f28
to
889ab05
Compare
@@ -119,7 +129,7 @@ describe('Visualization > Table', () => { | |||
it('Test table with percent metrics and groupby', () => { | |||
const formData = { | |||
...VIZ_DEFAULTS, | |||
percent_metrics: NUM_METRIC, | |||
percent_metrics: PERCENT_METRIC, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For some weird reason this test was flaking out when run with NUM_METRIC
(I replicated this test locally, but was not able to make it fail). Switching to PERCENT_METRIC
below seems to have solved it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM!
…he#10707) * refactor(database): use SupersetResultSet on SqlaTable.get_df() * lint * change cypress test
…he#10707) * refactor(database): use SupersetResultSet on SqlaTable.get_df() * lint * change cypress test
* refactor(database): use SupersetResultSet on SqlaTable.get_df() * lint * change cypress test
…he#10707) * refactor(database): use SupersetResultSet on SqlaTable.get_df() * lint * change cypress test
…he#10707) * refactor(database): use SupersetResultSet on SqlaTable.get_df() * lint * change cypress test
* refactor(database): use SupersetResultSet on SqlaTable.get_df() * lint * change cypress test
* refactor(database): use SupersetResultSet on SqlaTable.get_df() * lint * change cypress test
…he#10707) * refactor(database): use SupersetResultSet on SqlaTable.get_df() * lint * change cypress test
…he#10707) * refactor(database): use SupersetResultSet on SqlaTable.get_df() * lint * change cypress test
…he#10707) * refactor(database): use SupersetResultSet on SqlaTable.get_df() * lint * change cypress test
SUMMARY
Currently
SqlaTable.get_df()
is fetching data usingcursor.fetchall()
instead ofBaseEngineSpec.fetch_data()
, which contains critical customizations for multiple databases. In addition, instead of constructing theDataFrame
usingpd.DataFrame.from_records()
, we can leverageSupersetResultSet
which is already implemented on SQL Lab, and which behind the scenes is usingPyArrow
for storing the datastructure. This is a yet another step in the direction of usingSupersetResultSet
, and thereforePyArrow
, instead of Pandas DataFrames whenever possible.TEST PLAN
CI
ADDITIONAL INFORMATION
Ping: @robdiciuccio