Sync Issue in PersistentQueue on I/O Exception? #21
Comments
it looks like an i/o exception would bounce out to the handler and possibly disconnect the client. i guess we should catch exceptions when writing the journal, and kill the server if they happen, so that queues don't get into this weird state if the disk fills. does that sound okay?
Hmm, possibly. The best approach, I suppose, would be to roll back journal operations, but that doesn't seem possible with the current system. Another approach is to perform I/O operations before the in-memory data structure operations, so that any I/O exception is raised before the queue, transaction table, or other PersistentQueue data is modified. That should keep the PersistentQueue in a consistent state. Yet another approach is to close and reopen the queue on I/O exceptions; however, this could result in a large number of journal reads as the journals are replayed. Perhaps this is just something to be aware of and need not be addressed?
Thoughts?
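To make the "I/O before in-memory" idea concrete, here is a minimal sketch. It is not Kestrel's actual code: `SketchJournal` and `JournalFirstQueue` are hypothetical names invented for illustration, and the real PersistentQueue tracks more state (transactions, sizes, and so on).

```java
import java.io.IOException;
import java.util.ArrayDeque;
import java.util.Deque;

// Hypothetical stand-in for the journal; not Kestrel's actual Journal class.
interface SketchJournal {
    void add(byte[] item) throws IOException;
    void remove() throws IOException;
}

// Hypothetical queue that writes the journal before touching in-memory state.
class JournalFirstQueue {
    private final SketchJournal journal;
    private final Deque<byte[]> items = new ArrayDeque<>();

    JournalFirstQueue(SketchJournal journal) {
        this.journal = journal;
    }

    synchronized void add(byte[] item) throws IOException {
        journal.add(item);    // may throw; memory is still untouched
        items.addLast(item);  // only reached if the journal write succeeded
    }

    synchronized byte[] remove() throws IOException {
        if (items.isEmpty()) {
            return null;
        }
        journal.remove();     // may throw; the item stays in memory
        return items.pollFirst();
    }

    synchronized int size() {
        return items.size();
    }
}
```

With this ordering, a failed journal write leaves the in-memory queue as it was before the call. It does not cover a write that fails partway through and leaves a partial record on disk; that case would still need handling when the journal is replayed.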
i think you're right that it shouldn't try to continue as if nothing happened. i'm leaning toward catching i/o exceptions inside the journal code, and writing a fatal log message and calling System.exit. it would be an unambiguous signal that something has gone wrong with the machine, and i think if the machine is hosed, kestrel shouldn't try to paper over it.
Okay, that does seem reasonable. One problem is that it might adversely affect folks using Kestrel as a library, since the proposed fix would shut down the JVM.
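One possible way to reconcile the two positions is to make the fatal action pluggable, so a standalone server exits while an embedding application decides for itself. The sketch below is only an illustration under that assumption; `JournalWriter`, `GuardedJournal`, and `exitJvm` are hypothetical names, not part of Kestrel.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.function.Consumer;

// Hypothetical low-level writer; stands in for the real journal file I/O.
interface JournalWriter {
    void write(byte[] record) throws IOException;
}

// Hypothetical wrapper that routes journal I/O failures to a configurable handler.
class GuardedJournal {
    private final JournalWriter writer;
    private final Consumer<IOException> onFatal;

    GuardedJournal(JournalWriter writer, Consumer<IOException> onFatal) {
        this.writer = writer;
        this.onFatal = onFatal;
    }

    // Handler a standalone server might use: log a fatal message, then exit.
    static Consumer<IOException> exitJvm() {
        return e -> {
            System.err.println("FATAL: journal write failed: " + e);
            System.exit(1);
        };
    }

    void write(byte[] record) {
        try {
            writer.write(record);
        } catch (IOException e) {
            onFatal.accept(e);                 // exitJvm() never returns from here
            throw new UncheckedIOException(e); // other handlers: still surface the failure
        }
    }
}
```

A server would construct the journal with `GuardedJournal.exitJvm()`, while a library user could pass a handler that only logs or marks the queue as failed. The rethrow ensures a caller never sees a failed write as a success.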
Upon close inspection of the PersistentQueue class, it occurred to me that if an I/O Exception is raised at certain points, the in-memory queue may become out of sync with the journal. For example, this can occur in add if an I/O Exception occurs on journal.add after the item has been added to the in-memory queue. Similar behavior exists in remove. Is this an accurate reading of the code? If so, what is the reasoning behind it? Thanks.
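The failure mode described here can be reproduced with a toy model. This is a simplified sketch, not the real PersistentQueue: `MemoryFirstQueueDemo` and its `diskFull` flag are hypothetical, and the "journal" is just an in-memory list that simulates a failing disk.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model: updates memory first, then the journal.
class MemoryFirstQueueDemo {
    final List<String> memory = new ArrayList<>();
    final List<String> journal = new ArrayList<>();
    boolean diskFull = false; // flip to simulate an I/O failure

    private void journalAdd(String item) throws IOException {
        if (diskFull) {
            throw new IOException("no space left on device");
        }
        journal.add(item);
    }

    void add(String item) throws IOException {
        memory.add(item);  // in-memory state changes first
        journalAdd(item);  // if this throws, memory already holds the item
    }
}
```

After a failed `add`, `memory` contains an item that `journal` does not, so a restart that replays the journal would silently lose it.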