This repository has been archived by the owner on Sep 18, 2021. It is now read-only.

Sync Issue in PersistentQueue on I/O Exception? #21

Closed
ebarlas opened this issue Aug 4, 2010 · 5 comments

Comments

@ebarlas

ebarlas commented Aug 4, 2010

Upon close inspection of the PersistentQueue class, it occurred to me that if an I/O Exception is raised at certain points, the in-memory queue may become out of sync with the journal. For example, this can occur in add if an I/O Exception occurs on journal.add after the item has been added to the in-memory queue. Similar behavior exists in remove. Is this an accurate reading of the code? If so, what is the reasoning behind it? Thanks.
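To make the concern concrete, here is a minimal sketch of the ordering described above. It is not Kestrel's code: `FlakyJournal`, `MemoryFirstQueue`, and `journaledCount` are hypothetical names used only to model "in-memory update first, journal write second".

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for the journal; not Kestrel's actual API.
interface FlakyJournal {
    void add(byte[] item) throws IOException;
}

// Simplified model of the ordering in question: memory first, journal second.
final class MemoryFirstQueue {
    private final Deque<byte[]> queue = new ArrayDeque<>();
    private final FlakyJournal journal;
    private int journaledCount = 0; // number of items the journal accepted

    MemoryFirstQueue(FlakyJournal journal) {
        this.journal = journal;
    }

    void add(byte[] item) throws IOException {
        queue.addLast(item);  // in-memory state changes first
        journal.add(item);    // if this throws, the item exists only in memory
        journaledCount++;     // reached only when the journal write succeeded
    }

    int inMemorySize() {
        return queue.size();
    }

    int journaledCount() {
        return journaledCount;
    }
}
```

With a journal whose `add` always throws, one call to `MemoryFirstQueue.add` leaves `inMemorySize()` at 1 while `journaledCount()` stays at 0, which is the divergence being asked about.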

@robey
Contributor

robey commented Aug 5, 2010

it looks like an i/o exception would bounce out to the handler and possibly disconnect the client. i guess we should catch exceptions when writing the journal, and kill the server if they happen, so that queues don't get into this weird state if the disk fills. does that sound okay?

@ebarlas
Author

ebarlas commented Aug 5, 2010

Hmm, possibly. The best approach, I suppose, would be to roll back journal operations, but that doesn't seem possible with the current system. Another approach is to perform I/O operations before in-memory data structure operations, so that an I/O exception is raised before the queue, transaction table, or other PersistentQueue data is modified. That should keep the PersistentQueue in a consistent state. Yet another approach is to close and reopen the queue on I/O exceptions, though this could result in a large number of journal reads as the journals are replayed. Perhaps this is just something to be aware of and need not be addressed?
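The "I/O first" ordering could look roughly like the sketch below. This is an illustration under assumptions, not a patch against Kestrel: `QueueJournal` and `JournalFirstQueue` are hypothetical names, and the transaction table is left out.

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical journal interface; not Kestrel's actual API.
interface QueueJournal {
    void add(byte[] item) throws IOException;
    void remove() throws IOException;
}

// Journal writes happen before any in-memory change, so an IOException
// leaves the in-memory queue exactly as it was.
final class JournalFirstQueue {
    private final Deque<byte[]> queue = new ArrayDeque<>();
    private final QueueJournal journal;

    JournalFirstQueue(QueueJournal journal) {
        this.journal = journal;
    }

    synchronized void add(byte[] item) throws IOException {
        journal.add(item);     // may throw; nothing in memory has changed yet
        queue.addLast(item);   // only reached after the journal accepted the item
    }

    synchronized byte[] remove() throws IOException {
        if (queue.isEmpty()) {
            return null;       // nothing to remove, nothing to journal
        }
        journal.remove();      // record the removal first; may throw
        return queue.removeFirst();
    }

    synchronized int size() {
        return queue.size();
    }
}
```

This only protects the in-memory side. A write that fails partway may still leave a partial record in the journal file, so the journal itself would need to tolerate or truncate a torn tail on replay.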

@ebarlas
Author

ebarlas commented Aug 9, 2010

Thoughts?

@robey
Contributor

robey commented Aug 10, 2010

i think you're right that it shouldn't try to continue as if nothing happened.

i'm leaning toward catching i/o exceptions inside the journal code, writing a fatal log message, and calling System.exit. it would be an unambiguous signal that something has gone wrong with the machine, and i think if the machine is hosed, kestrel shouldn't try to paste over it.

@ebarlas
Author

ebarlas commented Aug 11, 2010

Okay, that does seem reasonable. One problem is that it might adversely affect folks using Kestrel as a library, since the proposed fix would shut down the JVM.
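One way to reconcile the two positions might be to make the fatal action pluggable: the server installs a handler that logs and exits, while library users install their own. The sketch below is only an assumption about how that could be structured; `RawJournal`, `FailFastJournal`, and `EXIT_JVM` are hypothetical names, not anything in Kestrel.

```java
import java.io.IOException;
import java.util.function.Consumer;

// Hypothetical low-level journal; not Kestrel's actual API.
interface RawJournal {
    void add(byte[] item) throws IOException;
}

// Catches I/O errors inside the journal layer and hands them to a
// configurable handler instead of hard-coding System.exit.
final class FailFastJournal {
    // Server-style policy from the discussion: log a fatal message, stop the JVM.
    static final Consumer<IOException> EXIT_JVM = e -> {
        System.err.println("FATAL: journal write failed: " + e.getMessage());
        System.exit(1);
    };

    private final RawJournal delegate;
    private final Consumer<IOException> onFatal;
    private boolean failed = false;

    FailFastJournal(RawJournal delegate, Consumer<IOException> onFatal) {
        this.delegate = delegate;
        this.onFatal = onFatal;
    }

    synchronized void add(byte[] item) {
        if (failed) {
            // Refuse further writes rather than continue as if nothing happened.
            throw new IllegalStateException("journal is in a failed state");
        }
        try {
            delegate.add(item);
        } catch (IOException e) {
            failed = true;
            onFatal.accept(e); // EXIT_JVM never returns; a library handler may
            throw new IllegalStateException("journal write failed", e);
        }
    }

    synchronized boolean hasFailed() {
        return failed;
    }
}
```

A standalone server would pass `FailFastJournal.EXIT_JVM`, whereas an embedding application could pass a handler that records the error and shuts the queue down without taking the JVM with it.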

@robey robey closed this as completed Apr 10, 2012