Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix `read_csv` when `parse_dates` and `index_col` are the same #548

Merged
merged 2 commits into from Apr 14, 2019

Conversation

Projects
None yet
2 participants
@devin-petersohn
Copy link
Member

commented Apr 14, 2019

  • Resolves #537
  • Now moves all names into the partition
    • This ensures that parse_dates and similar parameters have the
      correct columns names to find
    • Required that we also compute index_col inside the partitions
      (Previously, we were poping the parameter
  • Adds tests to ensure that this use-case is supported

What do these changes do?

Related issue number

  • passes flake8 modin
  • passes black --check modin
  • tests added and passing
Fix `read_csv` when `parse_dates` and `index_col` are the same
* Resolves #537
* Now moves all `names` into the partition
  * This ensures that `parse_dates` and similar parameters have the
    correct columns names to find
  * Required that we also compute `index_col` inside the partitions
  (Previously, we were `pop`ing the parameter
* Adds tests to ensure that this use-case is supported
@codecov

This comment has been minimized.

Copy link

commented Apr 14, 2019

Codecov Report

Merging #548 into master will decrease coverage by 5.05%.
The diff coverage is 100%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #548      +/-   ##
==========================================
- Coverage   90.24%   85.19%   -5.06%     
==========================================
  Files          34       34              
  Lines        4851     4856       +5     
==========================================
- Hits         4378     4137     -241     
- Misses        473      719     +246
Impacted Files Coverage Δ
modin/engines/ray/pandas_on_ray/io.py 100% <ø> (ø) ⬆️
modin/engines/ray/generic/io.py 92.98% <100%> (+0.12%) ⬆️
modin/engines/base/frame/axis_partition.py 45.45% <0%> (-52.28%) ⬇️
...engines/python/pandas_on_python/frame/partition.py 35.29% <0%> (-50.99%) ⬇️
modin/data_management/utils.py 44.44% <0%> (-40%) ⬇️
...python/pandas_on_python/frame/partition_manager.py 63.15% <0%> (-36.85%) ⬇️
modin/engines/base/io.py 75.96% <0%> (-20.2%) ⬇️
modin/backends/pandas/query_compiler.py 78.28% <0%> (-12.03%) ⬇️
...es/python/pandas_on_python/frame/axis_partition.py 93.33% <0%> (-6.67%) ⬇️
modin/data_management/factories.py 88.57% <0%> (-6.43%) ⬇️
... and 2 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 9551e7a...bd01f7f. Read the comment docs.

@@ -228,6 +228,19 @@ def _read_csv_from_file_pandas_on_ray(cls, filepath, kwargs={}):
Returns:

This comment has been minimized.

Copy link
@williamma12

williamma12 Apr 14, 2019

Collaborator

Should we use this opportunity to move the _read_csv_file_from_pandas_on_ray away from generic/io.py and to the pandas on ray specific io file?

This comment has been minimized.

Copy link
@devin-petersohn

devin-petersohn Apr 14, 2019

Author Member

It can be renamed, but this defines the general functionality for how the arguments are handled and how the index is built. I think it should stay because it is also used by #502 for the arrow backend.

This comment has been minimized.

Copy link
@williamma12

williamma12 Apr 14, 2019

Collaborator

Ah ok let's rename the function in #502 then?

@williamma12
Copy link
Collaborator

left a comment

great pr @devin-petersohn!

@williamma12 williamma12 merged commit dd1a62e into modin-project:master Apr 14, 2019

3 checks passed

Travis CI - Pull Request Build Passed
Details
codecov/patch 100% of diff hit (target 90.24%)
Details
codecov/project Absolute coverage decreased by -5.05% but relative coverage increased by +9.75% compared to 9551e7a
Details
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.