New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disk state validation to avoid crashing on corruption #121

Closed
mbsanchez opened this Issue Dec 2, 2015 · 3 comments

Comments

Projects
None yet
2 participants
@mbsanchez

mbsanchez commented Dec 2, 2015

I have fixed a crash on nzbget caused by a bad disk state: when an user has a bad disk state (caused by a system crash perhaps), the user can't start nzbget because it crashes on startup, I have debugged nzbget and I have found that the article size on diskstate file is different to the articles length on pFileInfo vector, to fix this and allow nzbget to run without crashing, I have made the next changes:

 daemon/queue/DiskState.cpp | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/daemon/queue/DiskState.cpp b/daemon/queue/DiskState.cpp
index e603e2c..e7229a4 100644
--- a/daemon/queue/DiskState.cpp
+++ b/daemon/queue/DiskState.cpp
@@ -1429,7 +1429,8 @@ bool DiskState::LoadFileState(FileInfo* pFileInfo, Servers* pServers, bool bComp
    iCompletedArticles = 0; //clang requires initialization in a separate line (due to goto statements)

    int size;
-   if (fscanf(infile, "%i\n", &size) != 1) goto error;
+   if (fscanf(infile, "%i\n", &size) != 1 || (size_t)size > pFileInfo->GetArticles()->size()) goto error;
    for (int i = 0; i < size; i++)
    {
        if (!bHasArticles)

I have a backup of queue folder that you can use to reproduce this issue, but i don't know how to attach the rar file.

@mbsanchez mbsanchez changed the title from nzbget crahs on bad diskstate. to nzbget crahess on bad diskstate. Dec 2, 2015

@mbsanchez mbsanchez changed the title from nzbget crahess on bad diskstate. to nzbget crahes on bad diskstate. Dec 2, 2015

@mbsanchez mbsanchez changed the title from nzbget crahes on bad diskstate. to nzbget crashes on bad diskstate. Dec 2, 2015

@hugbug

This comment has been minimized.

Show comment
Hide comment
@hugbug

hugbug Dec 5, 2015

Member

Yes, that's a problem. A corrupted disk state file(s) can cause all kinds of troubles since there is no any validation. What you have fixed is one place of probably hundreds.

May be a better more complete solution would be to store a checksum of file content and validate it before building objects from the file. Performance is always a concern so the solution must take this into account.

I'm changing the title a little and converting to "feature" for a next release.

Member

hugbug commented Dec 5, 2015

Yes, that's a problem. A corrupted disk state file(s) can cause all kinds of troubles since there is no any validation. What you have fixed is one place of probably hundreds.

May be a better more complete solution would be to store a checksum of file content and validate it before building objects from the file. Performance is always a concern so the solution must take this into account.

I'm changing the title a little and converting to "feature" for a next release.

@hugbug hugbug changed the title from nzbget crashes on bad diskstate. to Disk state validation to avoid crashing on corruption Dec 5, 2015

@hugbug hugbug added the feature label Dec 5, 2015

@hugbug hugbug added this to the v17.0 milestone Dec 5, 2015

hugbug added a commit that referenced this issue Mar 23, 2016

#121: unified disk state management
… to use StateDiskFile for all operations.
@hugbug

This comment has been minimized.

Show comment
Hide comment
@hugbug

hugbug Apr 28, 2016

Member

have found that the article size on diskstate file is different to the articles length on pFileInfo vector

I had another similar case. After an investigation it turned out the problem wasn't the corrupted disk state file. Sometimes (it's not clear when) it may happen that after a deleting of file from queue its disk state progress file (which holds progress data for partially downloaded files) remain on disk. After a program restart the same file ID may be reused. This can lead to a situation when a disk state info file and a disk state progress file (for the same file ID) both exist but do not belong together. They are processed together nonetheless, causing the trouble.

This has to be fixed.

Member

hugbug commented Apr 28, 2016

have found that the article size on diskstate file is different to the articles length on pFileInfo vector

I had another similar case. After an investigation it turned out the problem wasn't the corrupted disk state file. Sometimes (it's not clear when) it may happen that after a deleting of file from queue its disk state progress file (which holds progress data for partially downloaded files) remain on disk. After a program restart the same file ID may be reused. This can lead to a situation when a disk state info file and a disk state progress file (for the same file ID) both exist but do not belong together. They are processed together nonetheless, causing the trouble.

This has to be fixed.

@hugbug

This comment has been minimized.

Show comment
Hide comment
@hugbug

hugbug Apr 28, 2016

Member

May be a better more complete solution would be to store a checksum of file content and validate it before building objects from the file. Performance is always a concern so the solution must take this into account.

I've developed checksum calculating (when writing) and validation (when reading). Since it's currently doesn't look like there is an issue with corrupted files I'm not committing the checksum solution yet, to avoid unnecessary performance overhead.

Member

hugbug commented Apr 28, 2016

May be a better more complete solution would be to store a checksum of file content and validate it before building objects from the file. Performance is always a concern so the solution must take this into account.

I've developed checksum calculating (when writing) and validation (when reading). Since it's currently doesn't look like there is an issue with corrupted files I'm not committing the checksum solution yet, to avoid unnecessary performance overhead.

hugbug added a commit that referenced this issue Apr 28, 2016

#121: queue directory cleanup on program start
Automatically removing orphaned diskstate files from QueueDir:
- file info state;
- file progress state;
- file completion state;
- nzb-log.

@hugbug hugbug added improvement and removed feature labels Apr 29, 2016

@hugbug hugbug closed this May 14, 2016

hugbug added a commit that referenced this issue Oct 9, 2017

#121: unified disk state management
… to use StateDiskFile for all operations.

hugbug added a commit that referenced this issue Oct 9, 2017

#121: queue directory cleanup on program start
Automatically removing orphaned diskstate files from QueueDir:
- file info state;
- file progress state;
- file completion state;
- nzb-log.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment