
Failed write corrupts LIFO queue file #14

Open
ArturGaspar opened this issue Mar 8, 2016 · 5 comments

@ArturGaspar

If a write fails while pushing an item to the LIFO queue, some data may actually be written before the exception is raised. On the next pop, the end of the file is supposed to hold the size of the last item, but it instead contains whatever was partially written by the earlier failed push() calls.

>>> from queuelib import LifoDiskQueue
>>> q = LifoDiskQueue("./small_fs/queue")  # Filesystem has less than 100KB available.
>>> for i in range(100): 
...     q.push(b'a' * 1000)
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "queuelib/queue.py", line 152, in push
    self.f.write(string)
IOError: [Errno 28] No space left on device
>>> q.pop()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "queuelib/queue.py", line 162, in pop
    self.f.seek(-size-self.SIZE_SIZE, os.SEEK_END)
IOError: [Errno 22] Invalid argument

The error above comes from the value of size, which is decoded from the last SIZE_SIZE bytes of the file. When that value is larger than the file itself, the seek points before the start of the file and fails.
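
A minimal sketch of how that failure could be detected before seeking. This is not queuelib's code: the helper name read_last_size is hypothetical, and the 4-byte big-endian size field (">L") is an assumption about the on-disk format. It only shows the bounds check implied by the explanation above, using an in-memory file.

```python
import io
import os
import struct

SIZE_FORMAT = ">L"  # assumption: 4-byte big-endian unsigned size field
SIZE_SIZE = struct.calcsize(SIZE_FORMAT)


def read_last_size(f):
    """Return the size stored in the trailing SIZE_SIZE bytes of f.

    Returns None when the file is too short to hold a size field, or when
    the decoded size describes a record larger than the file itself.
    """
    f.seek(0, os.SEEK_END)
    file_len = f.tell()
    if file_len < SIZE_SIZE:
        return None
    f.seek(-SIZE_SIZE, os.SEEK_END)
    (size,) = struct.unpack(SIZE_FORMAT, f.read(SIZE_SIZE))
    if size + SIZE_SIZE > file_len:
        # Seeking back by size + SIZE_SIZE would land before the file start.
        return None
    return size


# A well-formed record: payload followed by its size.
good = io.BytesIO(b"hello" + struct.pack(SIZE_FORMAT, 5))
print(read_last_size(good))  # 5

# A partially written payload: the trailing bytes are data, not a size.
bad = io.BytesIO(b"a" * 10)
print(read_last_size(bad))  # None
```

A check like this only turns the IOError into a detectable condition. It does not recover the queue, and a truncated payload whose last bytes happen to decode to a small number would still pass it.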

@artscrap

An attempt (not finished yet) to fix this error for both the LIFO and FIFO queues:
https://github.com/artscrap/queuelib/tree/fixdiskwrite

@Gallaecio added the bug label Aug 22, 2019
@Gallaecio
Member

I’m thinking we could solve this by:

  • Replacing every \ by \\ in the input string while pushing.
  • Adding \n after the size in every write.
  • If, on reading, the byte where \n should be is not \n, find the earliest \n and truncate the file contents to that point.
  • Replacing every \\ by \ on the string read while popping.
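
The escaping steps in the list above could look roughly like this. It is only an illustration, not queuelib code: the helper names escape and unescape are hypothetical. The sketch also rewrites each newline byte as the two bytes \ and n, which the list does not spell out, on the assumption that a raw \n must never appear inside a payload if it is to work as a record marker.

```python
def escape(data: bytes) -> bytes:
    """Double every backslash, then encode newlines as backslash + 'n'."""
    return data.replace(b"\\", b"\\\\").replace(b"\n", b"\\n")


def unescape(data: bytes) -> bytes:
    """Reverse escape() by scanning left to right.

    Two chained replace() calls would misread an escaped backslash that is
    followed by a literal 'n', so each escape pair is decoded in order.
    """
    out = bytearray()
    i = 0
    while i < len(data):
        if data[i:i + 1] == b"\\":
            nxt = data[i + 1:i + 2]
            out += b"\n" if nxt == b"n" else nxt
            i += 2
        else:
            out += data[i:i + 1]
            i += 1
    return bytes(out)


payload = b"line one\nback\\slash and a literal \\n"
stored = escape(payload)
assert b"\n" not in stored          # no raw newline left in the payload
assert unescape(stored) == payload  # round trip restores the original
```

Note that this covers the payload only. A size field written as raw binary could itself contain a newline byte, so the search for \n during recovery would need to account for that.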

Because of the format change, as well as the potential performance impact of the backslash conversions, I wonder whether we should change the existing queue class or implement a new one that follows this approach (e.g. RecoverableLifoDiskQueue).

If we do modify the existing queue class, I wonder how to handle format versioning. For example, should we detect old-format files and upgrade them on disk initialization? Or should we empty such files and start from scratch?

@Gallaecio
Member

@dangra Thoughts?

@MatthewScholefield

MatthewScholefield commented Oct 5, 2020

Not sure if this makes sense here or as a separate issue, but I ran into a different FIFO corruption issue: when the drive ran out of space, the info.json file somehow got truncated. Right now I'm manually recovering it by iterating through the entries up to the entry I know I popped/added.

@Dobatymo

I am having the same issue. It should at least be documented that the LifoDiskQueue is not safe against corruption at the moment.
