Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Extend Excel write/read to handle large data beyond the row limits #309

Merged
merged 23 commits into from
Apr 21, 2020

Conversation

behnam-zakeri
Copy link
Contributor

@behnam-zakeri behnam-zakeri commented Apr 10, 2020

In response to issue #292, this PR extends the code in ixmp.backend.io.py to write and read large data, i.e., dataframes and series longer than the maximum row number of Excel.

How to review

You can test this branch as it is. The tests are improved to reflect the new changes. ALternatively, you can export one of the global scenarios and check if parameter land_output is completely transferred to Excel in two sheets.

PR checklist

  • Tests improved to reflect the enhancements.
  • Documentation is not needed, minor enhancement.
  • Release notes updated by adding a short note.

@codecov
Copy link

codecov bot commented Apr 10, 2020

Codecov Report

Merging #309 into master will increase coverage by 0.10%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #309      +/-   ##
==========================================
+ Coverage   97.31%   97.41%   +0.10%     
==========================================
  Files          39       41       +2     
  Lines        4090     4251     +161     
==========================================
+ Hits         3980     4141     +161     
  Misses        110      110              
Impacted Files Coverage Δ
ixmp/backend/base.py 98.65% <100.00%> (ø)
ixmp/backend/io.py 98.55% <100.00%> (+0.11%) ⬆️
ixmp/cli.py 98.67% <100.00%> (+<0.01%) ⬆️
ixmp/core.py 95.80% <100.00%> (-0.01%) ⬇️
ixmp/tests/test_cli.py 100.00% <100.00%> (ø)
ixmp/reporting/key.py 100.00% <0.00%> (ø)
ixmp/reporting/quantity.py 95.00% <0.00%> (ø)
ixmp/tests/reporting/__init__.py 100.00% <0.00%> (ø)
... and 6 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update ea17bab...96528d8. Read the comment docs.

Copy link
Contributor

@zikolach zikolach left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I left some comments

ixmp/backend/io.py Outdated Show resolved Hide resolved
ixmp/backend/io.py Outdated Show resolved Hide resolved
ixmp/backend/io.py Outdated Show resolved Hide resolved
if i > 1:
suffix = '({})'.format(i)
else:
suffix = ''
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't it better to always add numeric prefix if number of elements is more than 1 sheet?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now, the sheets will be saved as foo, foo(1), foo(2), ... . I think it's better to see the sheets starting with the name of the items (in case the sheet name is not fully expanded).

ixmp/backend/io.py Outdated Show resolved Hide resolved
ixmp/backend/io.py Outdated Show resolved Hide resolved
ixmp/cli.py Outdated Show resolved Hide resolved
@khaeru khaeru linked an issue Apr 17, 2020 that may be closed by this pull request
@khaeru
Copy link
Member

khaeru commented Apr 17, 2020

In addition to the points raised by @zikolach, the documentation (current version shown here: https://message.iiasa.ac.at/projects/ixmp/en/master/file-io.html#scenario-model-data) needs to be adjusted to describe the new format.

@behnam-zakeri
Copy link
Contributor Author

In addition to the points ... needs to be adjusted to describe the new format.

Thanks @khaeru, I extended the doc to reflect this. I also noticed a redundant sentence and deleted that.

Copy link
Member

@khaeru khaeru left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Excellent @behnam-zakeri, thank you!

I used the Python built-in functions range(), zip(), and enumerate(), plus the technique of defining a function-within-a-function, to streamline the code a little, but your improvements were already very tidy, and the addition of tests is a solid example.

Will merge once tests pass.

@khaeru khaeru merged commit 56355b8 into iiasa:master Apr 21, 2020
@behnam-zakeri
Copy link
Contributor Author

Thanks @zikolach for useful comments and @khaeru for very nice improvements of the code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Handle items with >10⁶ elements in Excel data I/O
3 participants