
Features/517-netcdf4-features #559

Merged

Conversation

ben-bou
Collaborator

@ben-bou ben-bou commented Apr 24, 2020

Description

Adding the following features to the `save_netcdf` method:

  • Naming of dimensions
  • Creating unlimited dimensions
  • Reusing already existing dimensions and variables
  • Specifying the location of data in the netCDF4 file using indices or slices

Issue/s resolved: #517 (partially)

Changes proposed:

Type of change

Due Diligence

  • All split configurations tested
  • Multiple dtypes tested in relevant functions
  • Documentation updated (if needed)
  • Updated changelog.md under the title "Pending Additions"

Does this change modify the behaviour of other functions? If so, which?

no

@coquelin77
Member

coquelin77 commented Apr 27, 2020

some of the formatting is wrong here. this is fixed automatically by pre-commit (most of the time). all you need to do is install pre-commit (`pip install pre-commit` or `conda install -c conda-forge pre-commit`), then run `pre-commit install` and `pre-commit run --all-files`

@ben-bou
Collaborator Author

ben-bou commented Apr 29, 2020

Thanks, I ran pre-commit and updated the branch.

@coquelin77
Member

bump. is this being worked on? to run the tests locally you can use `mpirun -np # ...`

@ben-bou
Collaborator Author

ben-bou commented May 7, 2020

What do you mean, is anything missing?

@coquelin77
Member

the tests are failing. it looks like something in the IO is hanging. you can see it in the Travis CI - Pull Request details

@ben-bou
Collaborator Author

ben-bou commented May 7, 2020

In Travis CI, IO is hanging, but if I run

```
mpiexec -n 4 coverage run --source=heat --parallel-mode -m pytest heat/ && coverage combine && coverage report && coverage xml
```

it passes for 1...4 processes.
Any suggestions? Maybe there are incompatibilities in the underlying netCDF versions?

@coquelin77
Member

check the versions. they are listed in the log for the travis builds but i have also listed them here:

Package        Version  Location
-------------- -------- --------
atomicwrites   1.3.0    
attrs          19.1.0   
certifi        2019.3.9 
cftime         1.1.2    
chardet        3.0.4    
codecov        2.0.15   
coverage       4.5.3    
future         0.18.2   
h5py           2.10.0   
heat           0.3.0    /heat   
idna           2.8      
more-itertools 7.0.0    
mpi4py         3.0.3    
netCDF4        1.5.2    
numpy          1.18.4   
pip            19.0.3   
pluggy         0.9.0    
py             1.8.0    
pytest         4.4.0    
requests       2.21.0   
setuptools     40.8.0   
six            1.12.0   
torch          1.5.0    
urllib3        1.24.1 

@ben-bou
Collaborator Author

ben-bou commented May 13, 2020

Hi, I've got a quick question:

Let's say I've got a DNDarray `data` and a shape `out_shape`, which is basically the same as `data.shape` but might have ones inserted in between. Is there a way to find out the hypothetical split axis of `data` if it were broadcast to `out_shape`?

@ben-bou
Collaborator Author

ben-bou commented May 13, 2020

ok, seems like `out_split = data.split + out_shape[: data.split].count(1) - data.shape[: data.split].count(1)` does this
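The one-liner can be sanity-checked with plain tuples. A minimal sketch (the function name `broadcast_split` is just for illustration; it mirrors the expression above and assumes `out_shape` is `shape` with size-1 axes inserted):

```python
def broadcast_split(shape, split, out_shape):
    """Hypothetical split axis after broadcasting `shape` to `out_shape`.
    The split axis shifts right by the number of newly inserted size-1
    axes that land before it; counting ones in the prefix of length
    `split` in both shapes mirrors the one-liner above."""
    return split + out_shape[:split].count(1) - shape[:split].count(1)


# a size-1 axis inserted in front of the split axis shifts it right
print(broadcast_split((4, 5), 1, (1, 4, 5)))  # -> 2
# a size-1 axis inserted after the split axis leaves it unchanged
print(broadcast_split((4, 5), 0, (4, 1, 5)))  # -> 0
```

Note this counts ones only in the prefix up to the old split index, so ambiguous patterns (e.g. several ones inserted directly before a leading axis) may need extra care.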

Ben Bourgart and others added 3 commits May 13, 2020 19:37
…eft-hand shape of set-operation). Then only merge slices for split-axis. This is a testing/debugging version, cleanup needed
@ben-bou
Collaborator Author

ben-bou commented May 14, 2020

These formatting errors have nothing to do with my changes. Where do they suddenly come from? Is there a way to suppress them? I don't get any errors when running pre-commit locally.

@codecov

codecov bot commented May 15, 2020

Codecov Report

Merging #559 (9eedf74) into master (debeaf7) will decrease coverage by 0.05%.
The diff coverage is 88.99%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #559      +/-   ##
==========================================
- Coverage   97.59%   97.54%   -0.06%     
==========================================
  Files          87       87              
  Lines       18030    18219     +189     
==========================================
+ Hits        17596    17771     +175     
- Misses        434      448      +14     
Impacted Files Coverage Δ
heat/core/io.py 88.73% <80.16%> (-1.66%) ⬇️
heat/core/tests/test_io.py 94.79% <100.00%> (+1.76%) ⬆️
heat/core/types.py 95.36% <0.00%> (+0.51%) ⬆️

Continue to review full report at Codecov.

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update debeaf7...9eedf74.

@coquelin77
Member

please extend the unit tests to cover the new additions

@Markus-Goetz
Member

Also, I will have time to look at it next week and make some additions/changes. Sorry for the long delay.

@coquelin77
Member

bump, there was an update which removed the `ht_device` parameter. see other tests for how to use it

@ClaudiaComito ClaudiaComito self-assigned this Jul 14, 2020
@ClaudiaComito ClaudiaComito added this to In progress in Current sprint via automation Jul 14, 2020
@mtar
Collaborator

mtar commented Oct 20, 2020

GPU cluster tests are currently disabled on this Pull Request.

heat/core/io.py Outdated
merged_slices = __merge_slices(var, file_slices, data)
try:
var[merged_slices] = (
data._DNDarray__array.cpu()

Hi Ben, please replace `_DNDarray__array` with the new `larray` for consistency with the rest - also in the follow-up code

heat/core/io.py Outdated
if is_split:
var[slices] = data.larray.cpu()
merged_slices = __merge_slices(var, file_slices, data)
var[merged_slices] = data._DNDarray__array.cpu()

`_DNDarray__array` --> `larray`
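For context on what the slice merging in this hunk is about: when writing a split array, each process must compose the user-given file slice with its own local chunk range along the split axis. A conceptual sketch only (the name `merge_slice` and its signature are illustrative, not heat's actual `__merge_slices`):

```python
def merge_slice(file_slice, local_start, local_stop, dim_len):
    """Compose a user-given slice into a dimension of length `dim_len`
    with a process-local chunk [local_start, local_stop) of that
    selection, so each rank writes only its own part of the data."""
    start, stop, step = file_slice.indices(dim_len)
    return slice(start + local_start * step, start + local_stop * step, step)


# a rank owning elements 2..4 of a selection starting at file index 10, step 2
s = merge_slice(slice(10, None, 2), 2, 4, 100)
print(s)  # -> slice(14, 18, 2)
```

The resulting slice can then be applied directly to the netCDF variable, e.g. `var[s] = local_chunk`, without the ranks overlapping.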

@ClaudiaComito ClaudiaComito merged commit 0c9c763 into helmholtz-analytics:master Nov 18, 2020
Current sprint automation moved this from In progress to Done Nov 18, 2020
@ClaudiaComito
Contributor

Thanks a lot @ben-bou for your contribution!


Successfully merging this pull request may close these issues.

Enhance HDF5 and netCDF4 API
5 participants