Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adding support for all dtypes, adds more tests and fixes small bugs #102

Merged
merged 19 commits into from
Oct 2, 2020

Conversation

westernguy2
Copy link
Contributor

This pull request solves the issue in #79 where the dtype of a column is not being assigned a data_type in data_type_lookup, resulting in an error. This is resolved by setting any column that has a dtype not being accounted for as nominal. In addition, it became clear that the ordinal data type wasn't being used, so that was removed from Lux.

Also fixed are bugs relating to __getattr__ due to a missing return value, which fixed the issue where accessing columns of a DataFrame via dot notation wasn't working.

Finally, some more tests for .loc and .iloc were added, as well as a new test for groupby.agg which hasn't been resolved yet (in issue #66 ).

@codecov-commenter
Copy link

codecov-commenter commented Sep 30, 2020

Codecov Report

Merging #102 into master will increase coverage by 0.05%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #102      +/-   ##
==========================================
+ Coverage   74.94%   75.00%   +0.05%     
==========================================
  Files          36       36              
  Lines        2415     2412       -3     
==========================================
- Hits         1810     1809       -1     
+ Misses        605      603       -2     
Impacted Files Coverage Δ
lux/action/filter.py 90.16% <ø> (-0.16%) ⬇️
lux/vis/Clause.py 55.35% <ø> (ø)
lux/vislib/altair/AltairRenderer.py 96.49% <ø> (+0.12%) ⬆️
lux/core/frame.py 68.28% <100.00%> (+0.72%) ⬆️
lux/executor/Executor.py 72.72% <100.00%> (ø)
lux/executor/PandasExecutor.py 95.94% <100.00%> (+0.01%) ⬆️
lux/action/generalize.py 80.95% <0.00%> (-0.87%) ⬇️
lux/action/univariate.py 87.87% <0.00%> (-0.70%) ⬇️
lux/action/correlation.py 86.84% <0.00%> (-0.34%) ⬇️
... and 32 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 1ebc02c...b8fa059. Read the comment docs.

@dorisjlee
Copy link
Member

This looks great! One remaining issue in #79 is the use of qcut, which creates Interval data type, it seems like there is some difficulty in parsing this into the widget representation (i.e. JSON serializable). The use of cut seems to be working fine. I've added a test for this in the PR.
image

@westernguy2
Copy link
Contributor Author

This commit checks if dtype of a column is of type Interval in create_vis. If it is, then the column is converted to type str instead. This only modifies the visualization's data, not the original LuxDataFrame. This should fix the issue where columns of type Interval are unable to be visualized, as now it's a String type.

In addition, metadata is also expired in _set_item as there was a bug where direct column assignment was not updating the metadata accurately.

@dorisjlee dorisjlee merged commit f1cf523 into lux-org:master Oct 2, 2020
@dorisjlee
Copy link
Member

This looks great, thanks @westernguy2 !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants