Why is fsync calling fdsync #83

Closed
ustulation opened this issue Aug 16, 2023 · 2 comments

Comments

@ustulation

Long story short

The fsync code in AIOFile.fsync ends up calling fdsync:

return await self.__context.fdsync(self.fileno())

Is there a particular reason for this? I assume fsync and fdatasync differ under the hood (at the OS level), with fsync being slower but more thorough in what it flushes (e.g. on BSD systems). For those who care about these details, having fsync call fdsync could be unexpected and misleading.
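
For readers less familiar with the distinction, here is a minimal sketch using only the Python standard library, not aiofile or caio. The helper names `flush_to_disk` and `write_and_flush` are made up for illustration. On POSIX systems, `fdatasync` is allowed to skip metadata that is not needed to read the data back (such as the modification time), while `fsync` flushes the data and the file's metadata.

```python
import os


def flush_to_disk(fd: int, data_only: bool = False) -> str:
    """Flush a file descriptor and report which call was used.

    data_only=True requests fdatasync, which may skip metadata that is not
    required to read the data back (for example the modification time).
    """
    # os.fdatasync does not exist on every platform (macOS lacks it),
    # so fall back to a full fsync when it is unavailable.
    if data_only and hasattr(os, "fdatasync"):
        os.fdatasync(fd)
        return "fdatasync"
    os.fsync(fd)
    return "fsync"


def write_and_flush(path: str, payload: bytes, data_only: bool = False) -> str:
    """Write a small payload to path and flush it before closing."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        os.write(fd, payload)  # assumes a small payload written in one call
        return flush_to_disk(fd, data_only=data_only)
    finally:
        os.close(fd)
```

How large the practical difference is depends on the operating system and file system, so this only shows that the two calls are distinct, not how each platform treats them.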

Expected behavior

fsync should call fsync and a new API should be exposed for fdsync which maps to the underlying context's fdsync.
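
A rough sketch of the requested shape, under stated assumptions: `RecordingContext` and `FileWrapper` are hypothetical stand-ins written for this example, not the real caio context or the real `AIOFile` class. The only point is that each public method forwards to the context method of the same name.

```python
import asyncio
from typing import List, Tuple


class RecordingContext:
    """Hypothetical stand-in for the underlying context; it only records calls."""

    def __init__(self) -> None:
        self.calls: List[Tuple[str, int]] = []

    async def fsync(self, fd: int) -> None:
        self.calls.append(("fsync", fd))

    async def fdsync(self, fd: int) -> None:
        self.calls.append(("fdsync", fd))


class FileWrapper:
    """Hypothetical file wrapper (not the real AIOFile)."""

    def __init__(self, context: RecordingContext, fd: int) -> None:
        self._context = context
        self._fd = fd

    def fileno(self) -> int:
        return self._fd

    async def fsync(self) -> None:
        # Full sync: forwards to the context's fsync.
        return await self._context.fsync(self.fileno())

    async def fdsync(self) -> None:
        # Data-only sync: forwards to the context's fdsync.
        return await self._context.fdsync(self.fileno())


async def demo() -> List[Tuple[str, int]]:
    ctx = RecordingContext()
    f = FileWrapper(ctx, fd=3)  # fd=3 is an arbitrary placeholder value
    await f.fsync()
    await f.fdsync()
    return ctx.calls
```

Running `asyncio.run(demo())` returns the recorded calls in order, so a caller can see that `fsync` and `fdsync` reach different context methods.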

@mosquito
Owner

This offers speed advantages, and if journaling is enabled on the file system, there is a significantly small potential risk of data loss.

@ustulation
Author

The point is:

  1. If you expose a well-known API whose name says it does x but it ends up doing y, that API can fairly be called misleading.
  2. As a library writer, IMO you should just expose the functionality and let users decide about the "significantly small potential risk of data loss". Otherwise you are assuming that all past, current and future users and systems will be unaffected, or will see things the way you do rather than the way that is otherwise the norm, which I think is a flawed premise. For example, I write millions upon millions of records and checkpoints to an NFS share that multiple systems read from (so multiple OSes are involved). This runs for weeks and months. At that scale the "small potential" becomes magnified. There have been occasions where the checkpoint was written and the writer then crashed before the actual data could be persisted. So if I want my fsyncs to really mean fsyncs, without having to work out which systems treat the two calls differently and how, I'm unable to do it here.
  3. I don't claim to have read or understood your source code thoroughly, but it seems you already have the underlying facility in place: https://github.com/mosquito/caio/blob/c2a39f8e3fdb52068a11d1193ed530cd3588e039/caio/abstract.py#L42. This frontend is simply calling the wrong function. Could it not just have fsync call fsync and fdsync call fdsync?
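
The ordering concern in point 2 can be sketched with the standard library alone. This is an illustrative pattern, not the reporter's actual code: the function names and file layout are assumptions, and whether `os.fsync` gives the desired guarantee on NFS depends on the client and server configuration. The idea is that the checkpoint is only written after the record it refers to has been flushed.

```python
import os


def _write_all(fd: int, payload: bytes) -> None:
    # os.write may write fewer bytes than requested, so loop until done.
    view = memoryview(payload)
    while view:
        written = os.write(fd, view)
        view = view[written:]


def durable_write(path: str, payload: bytes) -> None:
    """Write payload to path and fsync before closing."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        _write_all(fd, payload)
        os.fsync(fd)  # full sync, not fdatasync
    finally:
        os.close(fd)


def write_record_then_checkpoint(
    data_path: str, checkpoint_path: str, record: bytes, marker: bytes
) -> None:
    # Step 1: flush the record itself.
    durable_write(data_path, record)
    # Step 2: only then write the checkpoint that claims the record exists.
    durable_write(checkpoint_path, marker)
```

If the writer crashes between the two steps, the data exists without a checkpoint, which is recoverable; the reverse situation described above is the one this ordering tries to avoid.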
