Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PFIO server needs to be smarter and remember who created a file #1620

Closed
bena-nasa opened this issue Aug 5, 2022 · 14 comments
Closed

PFIO server needs to be smarter and remember who created a file #1620

bena-nasa opened this issue Aug 5, 2022 · 14 comments
Assignees
Labels
🎁 New Feature This is a new feature

Comments

@bena-nasa
Copy link
Collaborator

@tclune here's about what PFIO should probably do with clobbering and History

Recently a change was added in v2.22.0 of History with the following behaviour.
History has logic each time it runs that decides if I am writing, will it be to a new file or an existing file (say you have multiple times per file).
However, it did not check if it decides it is time for a new file, does it exist already for some reason. Maybe you had a previous run in the same working space, maybe you have scripting that copied a file in of the same name like in this issue: Das Scripting Issue. This can be a problem if said file is not consistent with what you are trying to write.

So the simple solution is that if History decides, hey time for a new file, if somehow the file is already there, just fail which I did. This apparently broke NOAA. Now the "right" thing here is to make sure your run scripting cleans up before running. But I suppose we could make it so that hey if a file is there, just clobber it in a NetCDF sense.

However, this is not as simple as just making the default when you call the create method in the NETCDF4_FileFormatter (which is just a hook into the NetCDF create) to be clobber for the following reason.

The PFIO server is not that smart. Basically PFIO server has this logic. You make a request to write some data to some file. PFIO checks, does the file exist already which if you have multiple times per file is perfectly reasonable. If the file does exist, is says, I must have created it already so I"ll just open it and write the hyperslice you told me to. If it doesn't it says, ok, I'll create the file. Even if the create clobbers this doesn't solve the problem.

Because what the file is there and it shouldn't be and PFIO DIDN'T create it. Well, it will still just open it and who knows what's in the file.

So PFIO needs to keep track of, did I create this file, if not, I shouldn't be trying to write to so rather the OPEN the file, I need to call create with clobber.

@bena-nasa bena-nasa self-assigned this Aug 5, 2022
@bena-nasa bena-nasa added the 🎁 New Feature This is a new feature label Aug 5, 2022
@tclune
Copy link
Collaborator

tclune commented Aug 5, 2022

Looking at it now. PFIO (NetCDF4_FileFormatter) has the necessary interface, as mode is one of the arguments to open(). The client code would then just include NO_CLOBBER as one of the options.

So some steps:

  1. Create a new pFIO enum for NO_CLOBBER alongside pFIO_READ and pFIO_WRITE.
  2. Fix current logic that converts mode (pfio) to mode (netcdf). This needs to involve some IOR type stuff (or adding of integers).
  3. The actual nf90_open() can then just use omode and won`t need the IOR bit.

If there is a way to pass a "null" communicator for the serial nf90_open() that would also make things a bit clearer.

@tclune
Copy link
Collaborator

tclune commented Aug 5, 2022

I don't think PFIO should track whether or not it created a file. Client code needs to tell PFIO what to do including setting non-default NO CLOBBER if desired.

@bena-nasa
Copy link
Collaborator Author

bena-nasa commented Aug 5, 2022

@weiyuan-jiang @tclune Then solution is we need (this this is NOT just clobber/noclobber toggle issue) is to modify the client - server api so that the client can tell the server as needed that when it is time to write, DO NOT check if the file exists and open (by open I mean use NF90_Open)/use the existing file. Simply call the NETCDF4_FileFormatter's create method which of course it just a hook to NF90_Create and has the appropriate optional mode, presumable clobbering if the file exists. Looking at the code this will involve touching at least HistoryCollection.F90

@bena-nasa
Copy link
Collaborator Author

bena-nasa commented Aug 5, 2022

@tclune @weiyuan-jiang
On the other hand, what's wrong with the PFIO server tracking if it created the file as long as the client code can override it's behaviour? I mean you are saying write to this file, certainly we are relying on the server to create the file in the first place. It seems perfectly logical that if the server gets a request to write to a file it did not create, at the very least the default behaviour should be, danger Will Robinson. I didn't create this yet you are asking me to write to it..., so I should clobber before writing, that's the safest unless you tell me not too. But I can certainly keep track of that, knowing what filename I called a create on.

@tclune
Copy link
Collaborator

tclune commented Aug 5, 2022

I'm not opposed to the server remembering. I just don't want the NetCDF layer remembering.

@bena-nasa
Copy link
Collaborator Author

bena-nasa commented Aug 5, 2022

I'm not opposed to the server remembering. I just don't want the NetCDF layer remembering.

Neither do I, I was never saying it should... this is a issue with the server logic, always has been. The bottom line is that if the file exists, but the server never created it (which it can easily track), either die or clobber, let the client code tell it what to do, but writing to it without clobbering just seems dangerous.

@bena-nasa
Copy link
Collaborator Author

@tclune I think I see how to track and take action and do things correctly on the PFIO Server side (if the server didn't create it clobber), modifying the API so the client can control this behaviour is more work, I bet @weiyuan-jiang can do that.

@weiyuan-jiang
Copy link
Contributor

I agree it should let the client control the behavior.

bena-nasa added a commit that referenced this issue Aug 5, 2022
@bena-nasa
Copy link
Collaborator Author

@weiyuan-jiang @tclune
I started this branch, and it is just a start:
https://github.com/GEOS-ESM/MAPL/tree/feature/bmauer/fixes-%231620
I've modified HistoryCollection.F90 in PFIO so that it tracks if it creates the file and clobbers in an NF90_Create sense if the file already exists and it was not created by the PFIO Server.
The hard part is getting the option from history to modify the server behaviour as the client code wants to here where the file is ultimately created or opened.

@weiyuan-jiang
Copy link
Contributor

I think it really does not matter if the file is created by pFio or not. There are 3 choices if the file exists. 1) overwrite 2) exit 3) append ( need to match the meta data, at least the dimension)

@bena-nasa
Copy link
Collaborator Author

bena-nasa commented Aug 8, 2022

@weiyuan-jiang yes it absolutely does matter! What if pfio did not create the file and the user has put some other file in with the same name before execution of the application using the server that is just not right. Or worse put a file in there that has everything in it except one variable you are trying to write! So yes it does matter and we have to be very, very clear about the terminology/behaviour here, there are two ways to overwrite a file. If the PFIO server did not create it, it should either overwrite it by doing an nf90_create with the mode = clobber or throw an error. It should not use an nf90_open to just open it and write in that case. Obviously if the pfio server did create it and the user wants to write multiple times to the file then yes, using an nf90_open is the correct behaviour; but if the pfio server did not create it in that execution of the application, the user screwed up before execution of their application. At least in any use case we have via History now or for the foreseeable future.

@bena-nasa
Copy link
Collaborator Author

bena-nasa commented Aug 8, 2022

@weiyuan-jiang pushed the branch, realized, no need to store a list, just the last file the server created for that collection, at least I think...

@weiyuan-jiang
Copy link
Contributor

That is ok. I overloaded the message. Now users (or clients) can send the message if they wants NOCLOBBER. It will report errors if NOCLOBBER but the file exists

@bena-nasa
Copy link
Collaborator Author

Closing this, issue resovled by #1631

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
🎁 New Feature This is a new feature
Projects
None yet
Development

No branches or pull requests

3 participants