Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shared diskless files #12

Closed
wants to merge 11 commits into from
Closed

Shared diskless files #12

wants to merge 11 commits into from

Conversation

SiggyF
Copy link

@SiggyF SiggyF commented Dec 27, 2013

I extended the diskless mode. Files that are created in memory can now be opened by opening a
file with NC_DISKLESS mode using the same name.
This is in improvement over sharing open file identifiers between processes and provides an easy path
from disk-based to memory-based model coupling.

This is a simplified example of the use case I had (netcdf as shared memory between python and c) :
http://nbviewer.ipython.org/github/SiggyF/notebooks/blob/master/diskless.ipynb

and this is more detailed example (create a dataset in netcdf in python and reuse it in a fortran based model) :
http://nbviewer.ipython.org/github/nens/python-subgrid/blob/master/notebooks/delflandrain.ipynb

I added an extra test tst_diskless5.c. Note that if you want to ignore the whitespace cleanup you can do so by adding ?w=1 in a github url or add --ignore-all-space in the git command.

Please let me know if you can merge this code into the netcdf library. Feel free to contact me if extra effort is required.

Fedor Baart

@DennisHeimbigner
Copy link
Collaborator

Fedor, I am not sure I understand the point of this change.
Is this interpretation correct?

We assume that some part of the code has opened/created
an in-memory file. This change allows another part of the code
in the same process to open that diskless file by name rather than
by ncid. Is this correct?

@SiggyF
Copy link
Author

SiggyF commented Dec 31, 2013

Hi Dennis,

Thanks for considering the patch.
Indeed this change allows another part of the code (in the same process space) to open the same in memory file by name.

The following scenario's are described in the docs:

status = nc_create("diskless.nc", NC_DISKLESS, &ncid);
// Creates an in memory file, using the memio.c.

status = nc_create("diskless.nc", NC_DISKLESS|NC_WRITE, &ncid);
// Creates an in memory file, and also an empty file with the same name.
// The file is made persistent on close.


 status = nc_open("existing.nc", NC_DISKLESS, &ncid);
 // An existing file is read into memory.

This is the scenario that I implemented.

 // After an nc_create("diskless.nc", &ncid1);
 status = nc_open("diskless.nc", NC_DISKLESS, &ncid2);

Before my change this results in an file not found error.
After my change this results in ncid2 being set to the ncid1 created nc_create.
I can also imagine the ncid2 being different from ncid1, but I would expect the
combination of create and open to also work for in memory files.

My implementation indeed only works for single processes, as memio.c does not create shared memory and
the nc_filelist is also not shared. I think for openening in memory files between processes
the NC_MMAP should already work, but I did not test that.

You could also use the ncid, but that is not always very elegant.
For example if you create the file in python and reuse it in fortran you could
pass the dataset._grpid to fortran and start reading without opening a file.

Cheers,

Fedor

@DennisHeimbigner
Copy link
Collaborator

I expect to accept this change, but after
discussion, there is no reason to limit it
to diskless files only, so we will modify
your fix to work for any file. And thanks
for the changes and idea.
=Dennis Heimbigner
Unidata

SiggyF wrote:

Hi Dennis,

Thanks for considering the patch.
Indeed this change allows another part of the code (in the same process space) to open the same in memory file by name.

The following scenario's are described in the docs:

status = nc_create("diskless.nc", NC_DISKLESS, &ncid);
// Creates an in memory file, using the memio.c.

status = nc_create("diskless.nc", NC_DISKLESS|NC_WRITE, &ncid);
// Creates an in memory file, and also an empty file with the same name.
// The file is made persistent on close.


 status = nc_open("existing.nc", NC_DISKLESS, &ncid);
 // An existing file is read into memory.

This is the scenario that I implemented.

 // After an nc_create("diskless.nc", &ncid1);
 status = nc_open("diskless.nc", NC_DISKLESS, &ncid2);

Before my change this results in an file not found error.
After my change this results in ncid2 being set to the ncid1 created nc_create.
I can also imagine the ncid2 being different from ncid1, but I would expect the
combination of create and open to also work for in memory files.

My implementation indeed only works for single processes, as memio.c does not create shared memory and
the nc_filelist is also not shared. I think for openening in memory files between processes
the NC_MMAP should already work, but I did not test that.

You could also use the ncid, but that is not always very elegant.
For example if you create the file in python and reuse it in fortran you could
pass the dataset._grpid to fortran and start reading without opening a file.

Cheers,

Fedor


Reply to this email directly or view it on GitHub:
#12 (comment)

@WardF
Copy link
Member

WardF commented Feb 25, 2014

Following up on open pull requests/issue reports. Dennis, can this be merged? If so I'm happy to handle the actual merge into the development branch.

@DennisHeimbigner
Copy link
Collaborator

Not yet; I recall that I realized there might be
a problem with this. Unfortunately, I forgot what it was.
=Dennis

Ward Fisher wrote:

Following up on open pull requests/issue reports. Dennis, can this be merged? If so I'm happy to handle the actual merge into the development branch.


Reply to this email directly or view it on GitHub:
#12 (comment)

@WardF
Copy link
Member

WardF commented Mar 12, 2014

Given that this pull request can't be merged at this time, but that we also are interested in incorporating it once we're able to, I am closing the pull request and adding a link to it to the 'Requested Features' page on the NetCDF-C wiki. This way we can re-open it once we have the resources to devote to it.

https://github.com/Unidata/netcdf-c/wiki/Requested-Features

@WardF WardF closed this Mar 12, 2014
edhartnett added a commit to NetCDF-World-Domination-Council/netcdf-c that referenced this pull request Feb 2, 2018
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants