Shared diskless files #12

SiggyF · 2013-12-27T15:10:23Z

I extended the diskless mode. Files that are created in memory can now be opened by opening a
file with NC_DISKLESS mode using the same name.
This is an improvement over sharing open file identifiers between processes and provides an easy path
from disk-based to memory-based model coupling.

This is a simplified example of the use case I had (netCDF as shared memory between Python and C):
http://nbviewer.ipython.org/github/SiggyF/notebooks/blob/master/diskless.ipynb

and this is a more detailed example (create a dataset in netCDF in Python and reuse it in a Fortran-based model):
http://nbviewer.ipython.org/github/nens/python-subgrid/blob/master/notebooks/delflandrain.ipynb

I added an extra test tst_diskless5.c. Note that if you want to ignore the whitespace cleanup you can do so by adding ?w=1 in a github url or add --ignore-all-space in the git command.

Please let me know if you can merge this code into the netcdf library. Feel free to contact me if extra effort is required.

Fedor Baart

DennisHeimbigner · 2013-12-28T19:06:16Z

Fedor, I am not sure I understand the point of this change.
Is this interpretation correct?

We assume that some part of the code has opened/created
an in-memory file. This change allows another part of the code
in the same process to open that diskless file by name rather than
by ncid. Is this correct?

SiggyF · 2013-12-31T11:56:50Z

Hi Dennis,

Thanks for considering the patch.
Indeed, this change allows another part of the code (in the same process space) to open the same in-memory file by name.

The following scenarios are described in the docs:

    status = nc_create("diskless.nc", NC_DISKLESS, &ncid);
    // Creates an in-memory file, using memio.c.

    status = nc_create("diskless.nc", NC_DISKLESS|NC_WRITE, &ncid);
    // Creates an in-memory file, and also an empty file with the same name.
    // The file is made persistent on close.

    status = nc_open("existing.nc", NC_DISKLESS, &ncid);
    // An existing file is read into memory.

This is the scenario that I implemented.

    // After nc_create("diskless.nc", NC_DISKLESS, &ncid1):
    status = nc_open("diskless.nc", NC_DISKLESS, &ncid2);

Before my change this results in a file-not-found error.
After my change this results in ncid2 being set to the ncid1 created by nc_create.
I can also imagine the ncid2 being different from ncid1, but I would expect the
combination of create and open to also work for in memory files.
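
To make the intended behavior concrete, here is a minimal self-contained sketch in C. It does not use netCDF at all and is not the code in the patch: the table and the `mem_create`, `mem_open`, and `mem_find` functions are hypothetical names invented for illustration. It only models the idea that a name registered by a create call can later be resolved by an open call in the same process, while an unknown name still gives a not-found result.

```c
#include <string.h>

/* Hypothetical per-process table of in-memory files.
   This is NOT netCDF's nc_filelist; all names here are invented. */
#define MAX_FILES 16
#define MAX_NAME  64

typedef struct {
    char name[MAX_NAME];
    int  id;
    int  in_use;
} mem_entry;

static mem_entry table[MAX_FILES];
static int next_id = 1;

/* Look up an in-memory file by name; returns its id, or -1 if absent. */
static int mem_find(const char *name) {
    for (int i = 0; i < MAX_FILES; i++)
        if (table[i].in_use && strcmp(table[i].name, name) == 0)
            return table[i].id;
    return -1;
}

/* Register a new in-memory file. Returns the new id, or -1 if the
   name is too long, already registered, or the table is full. */
static int mem_create(const char *name) {
    if (strlen(name) >= MAX_NAME || mem_find(name) != -1)
        return -1;
    for (int i = 0; i < MAX_FILES; i++) {
        if (!table[i].in_use) {
            strcpy(table[i].name, name);   /* length checked above */
            table[i].id = next_id++;
            table[i].in_use = 1;
            return table[i].id;
        }
    }
    return -1;
}

/* Open by name: returns the id of the already-created in-memory file,
   or -1 (the "file not found" case) if no such file was created. */
static int mem_open(const char *name) {
    return mem_find(name);
}
```

In this model, calling `mem_create("diskless.nc")` and then `mem_open("diskless.nc")` yields the same id, while `mem_open` on a name that was never created returns -1. Because the table is an ordinary static array, it is visible only inside one process, which mirrors the single-process limitation of the patch.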

My implementation indeed only works within a single process, as memio.c does not create shared memory and
the nc_filelist is also not shared. I think that for opening in-memory files between processes
NC_MMAP should already work, but I did not test that.

You could also use the ncid, but that is not always very elegant.
For example, if you create the file in Python and reuse it in Fortran, you could
pass the dataset._grpid to Fortran and start reading without opening a file.

Cheers,

Fedor

DennisHeimbigner · 2013-12-31T18:01:44Z

I expect to accept this change, but after
discussion, there is no reason to limit it
to diskless files only, so we will modify
your fix to work for any file. And thanks
for the changes and idea.
=Dennis Heimbigner
Unidata

WardF · 2014-02-25T19:21:04Z

Following up on open pull requests/issue reports. Dennis, can this be merged? If so I'm happy to handle the actual merge into the development branch.

DennisHeimbigner · 2014-02-25T19:23:45Z

Not yet; I recall that I realized there might be
a problem with this. Unfortunately, I forgot what it was.
=Dennis

WardF · 2014-03-12T20:13:40Z

Given that this pull request can't be merged at this time, but that we also are interested in incorporating it once we're able to, I am closing the pull request and adding a link to it to the 'Requested Features' page on the NetCDF-C wiki. This way we can re-open it once we have the resources to devote to it.

https://github.com/Unidata/netcdf-c/wiki/Requested-Features

SiggyF added 11 commits December 27, 2013 12:53

e83c365 testcase for diskless create/open
abb0ebe start implementing create/open in diskless mode
5ed4624 This is not a slow test
b2cac65 my test works, but diskless with scalar vars dumps core
8213de7 all tests passed
ebd754e fix for empty persistent files
7552a59 when a file exists with the same name, read it (to discuss)
13138fb test for dimension and ncids
ca711b3 possible uninitialized variables
c825f65 add new function to header file
202b592 commented code

DennisHeimbigner closed this Dec 28, 2013

DennisHeimbigner reopened this Dec 28, 2013

WardF closed this Mar 12, 2014

WardF mentioned this pull request Sep 22, 2014

Failures on 32-bit platforms #83

Closed

edhartnett mentioned this pull request Mar 21, 2016

Can argument start be NULL in get/put APIs? #231

Closed

edhartnett added a commit to NetCDF-World-Domination-Council/netcdf-c that referenced this pull request Feb 2, 2018

Merge pull request Unidata#12 from NetCDF-World-Domination-Council/ej…

637b0d4

…h_pio2 Adding PIO to netCDF
