Load retail demo data using compressed file #271

Merged
merged 9 commits into from
Oct 2, 2018

WillKoehrsen
Contributor

Changed `load_retail` to load a gz-compressed version of the retail data from S3. This reduces the size of the downloaded file from 43 MB to 7.3 MB.
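
As a rough illustration of why the compressed file is so much smaller, the sketch below gzips some repetitive CSV text in memory and compares byte counts. The rows are made up for the example and are not the retail dataset, and `csv_sizes` is a hypothetical helper, so the ratio it reports will differ from the 43 MB to 7.3 MB figure above.

```python
import csv
import gzip
import io


def csv_sizes(rows):
    """Return (raw_bytes, gzipped_bytes) for rows serialized as CSV text."""
    buf = io.StringIO()
    csv.writer(buf).writerows(rows)
    raw = buf.getvalue().encode("utf-8")
    return len(raw), len(gzip.compress(raw))


# Made-up, highly repetitive rows (an assumption for illustration only).
rows = [["order_id", "product", "country"]] + [
    [str(1000 + i // 5), "WHITE HANGING HEART", "United Kingdom"]
    for i in range(5000)
]

raw_size, gz_size = csv_sizes(rows)
print(f"raw: {raw_size} bytes, gzipped: {gz_size} bytes")
```

Transactional data tends to repeat product names, countries, and IDs, which is the kind of redundancy gzip handles well.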

@codecov-io

codecov-io commented Oct 1, 2018

Codecov Report

Merging #271 into master will decrease coverage by 0.02%.
The diff coverage is 60%.

@@            Coverage Diff             @@
##           master     #271      +/-   ##
==========================================
- Coverage   94.47%   94.45%   -0.03%     
==========================================
  Files          71       71              
  Lines        7696     7700       +4     
==========================================
+ Hits         7271     7273       +2     
- Misses        425      427       +2

| Impacted Files | Coverage Δ | |
|---|---|---|
| featuretools/demo/retail.py | 85% <60%> (-8.75%) | ⬇️ |

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 8498bb6...8903c05. Read the comment docs.

* Update to reference Spark

* Added reference to Spark scaling

Included references to the Spark notebook and article on using Featuretools on a cluster.
Tries to load the `gz` compressed file and falls back to the uncompressed `csv` on error.
Docstring now refers to both the `gz` compressed and uncompressed versions of the CSV data file.
@WillKoehrsen
Contributor Author

Added a fallback to the uncompressed version. This might be necessary if `pandas.read_csv()` cannot read the gz-compressed version.
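
The fallback described above could look roughly like the following. This is a minimal sketch, not the code merged in this PR: `load_csv_with_fallback` and its two path parameters are hypothetical names, and the real `load_retail` may catch narrower exceptions or build its URLs differently.

```python
import pandas as pd


def load_csv_with_fallback(compressed_path, uncompressed_path, **read_csv_kwargs):
    """Try the gz-compressed file first; fall back to the plain csv on error.

    Both paths are hypothetical parameters; they can be local paths or URLs
    that pandas.read_csv accepts.
    """
    try:
        # compression="gzip" is explicit here; pandas can also infer it
        # from a ".gz" suffix when compression="infer" (the default).
        return pd.read_csv(compressed_path, compression="gzip", **read_csv_kwargs)
    except Exception:
        # Deliberately broad for this sketch: any failure to fetch or
        # decompress the gz file triggers the uncompressed fallback.
        return pd.read_csv(uncompressed_path, **read_csv_kwargs)
```

A call would pass the compressed location first and the uncompressed one second, for example `load_csv_with_fallback("data.csv.gz", "data.csv", nrows=1000)`. Extra keyword arguments are forwarded to both `read_csv` calls so the two code paths return frames parsed the same way.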

@kmax12
Contributor

kmax12 commented Oct 2, 2018

Looks good. Merging

@kmax12 changed the title from "Load retail gz compressed" to "Load retail demo data as compressed file" on Oct 2, 2018
@kmax12 changed the title from "Load retail demo data as compressed file" to "Load retail demo data using compressed file" on Oct 2, 2018
@kmax12 merged commit da09b20 into master on Oct 2, 2018
@WillKoehrsen deleted the load-retail-zip branch on October 3, 2018 13:40
@rwedge mentioned this pull request on Oct 31, 2018
3 participants