
possible DoS attack against PAR2 clients #31

Closed
BlackIkeEagle opened this issue May 7, 2014 · 4 comments

Comments

@BlackIkeEagle (Member)

This is a forwarded issue from multipar

check if the problem exists in par2cmdline and fix if needed

--- refer start ---

This is an important update to fix a rare (but fatal when it happens) problem.
When the data of a source file is mostly identical or sparse,
finding slices can become very slow,
almost impossible for large data sizes.
This is categorized as a hash-collision problem.

PAR2 uses two hash algorithms (CRC-32 and MD5) to detect slices.
CRC-32 is used to locate a slice, because MD5 is slow.
When there are several slices with the same contents (both CRC-32 and MD5 are the same),
detection by CRC-32 happens many times for the same data.
It requires an MD5 calculation of the slice every time,
and verification becomes very slow.
For example, if all bytes of a file are 0,
the hash at every offset matches the hash of a slice.
To avoid this "too many detections" problem,
par2cmdline and QuickPar skip the range of a found slice.
Because MultiPar tries to find more complete slices than QuickPar does,
MultiPar froze in this case.
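The search described above can be illustrated with a small sketch. This is not par2cmdline's actual code: the raw-register CRC helper, the window table, and the sizes below are invented for the demonstration, but rolling a CRC-32 window one byte at a time is the standard technique for this kind of scan, and the all-zero buffer shows why every offset then looks like a match.

```python
import zlib

def crc32_raw(data, reg=0):
    # zlib.crc32 applies an initial and a final XOR with 0xFFFFFFFF;
    # stripping both gives the pure linear CRC register, which is what
    # makes the rolling update below valid.
    return zlib.crc32(data, reg ^ 0xFFFFFFFF) ^ 0xFFFFFFFF

def make_window_table(n):
    # Effect of a byte leaving an n-byte window: the raw CRC of that
    # byte followed by n zero bytes.
    return [crc32_raw(bytes([b]) + b"\x00" * n) for b in range(256)]

def sliding_matches(data, slice_bytes):
    # Count every offset whose CRC-32 equals the target slice's CRC-32,
    # updating the checksum incrementally instead of rehashing n bytes.
    n = len(slice_bytes)
    target = crc32_raw(slice_bytes)
    table = make_window_table(n)
    reg = crc32_raw(data[:n])
    hits = [0] if reg == target else []
    for i in range(n, len(data)):
        # Roll the window one byte: add data[i], remove data[i - n].
        reg = crc32_raw(data[i:i + 1], reg) ^ table[data[i - n]]
        if reg == target:
            hits.append(i - n + 1)
    return hits

# All-zero data: the slice CRC matches at *every* offset, so a naive
# client would run MD5 once per byte instead of once per slice.
data = bytes(4096)          # 4 KiB of zeros (small stand-in for 2mb_0s.bin)
slice_size = 512
hits = sliding_matches(data, data[:slice_size])
print(len(hits))            # → 3585, i.e. every possible offset
```

A client that only compares CRC-32 this way must fall back to MD5 on each of those hits, which is the slowdown being reported.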

When there are some slices with the same CRC-32 (but different MD5),
false detections by CRC-32 happen many times for the same data.
Given the limitations of a 32-bit checksum, false detections can happen,
but the frequency is low.
When the number of blocks is 32768, a false detection may happen only
about once per 128 KB.
Because the chance of a CRC-32 collision is normally very low,
most PAR2 clients don't guard against it.
As nobody reported this kind of problem after PAR2 was published,
users don't need to worry about the risk in general usage.
In any case, I implemented a feature to count the frequency of false
detections, and par2j.exe will avoid freezing in most cases.
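The quoted once-per-128 KB figure can be checked with a quick back-of-the-envelope calculation (an illustration, assuming each of the 2^32 checksum values is equally likely at every scan offset):

```python
# With 32768 slice checksums in play, any of them counts as a (false)
# match, so the expected spacing between false matches while scanning
# byte by byte is 2**32 / 32768 bytes.
blocks = 32768
spacing = 2**32 / blocks       # bytes scanned per expected false match
print(spacing / 1024)          # → 128.0 (KB), matching the quoted figure
```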

I made a sample data set to test behavior under this collision problem.
There are only 4 blocks in a 2 MB file.
Because I changed the first 2 bytes to simulate damage,
a PAR2 client should detect 3 complete blocks in the damaged file.

2mb_0s.bin : This is a sample of sparse data. Bytes are almost all 0.
par2cmdline and QuickPar find only 2 blocks.
Old MultiPar froze, but would find 3 blocks after many minutes.
New MultiPar finds 3 blocks.

2mb_col.bin : This is a sample with a checksum collision at every byte.
par2cmdline and QuickPar freeze, but find only 1 block after
several minutes.
Old MultiPar froze, but would find 3 blocks after many minutes.
New MultiPar finds 3 blocks.

pattern16.bin : This is a sample with a checksum collision every 16 bytes.
par2cmdline and QuickPar freeze, but find only 2 blocks after a
few minutes.
Old MultiPar froze, but would find 3 blocks after several minutes.
New MultiPar finds 3 blocks.

If you want to test the PAR2 client you use, the "pattern16.bin"
set is the fastest to finish.
If your PC is slow, the freeze period may be longer.
I'm not sure whether this problem can be used for a DoS attack against
automated PAR2 usage.
If someone has the skill to forge CRC-32 values,
it is possible to make a set of source files and PAR2 files
that freezes a PAR2 client for several hours.
But the malicious file data consists of continually repeating short
patterns, and a human can recognize the oddness with ease.
Also, a developer can solve the problem, as I did.

--- refer end ---

BlackIkeEagle added a commit that referenced this issue Jun 9, 2014
Signed-off-by: BlackEagle <ike.devolder@gmail.com>
@eMPee584 (Contributor)

[side note.. ⛵
ah co0l.. so Mr Sawada set up a proper website and respects the GPL: http://multipar.eu/source
good you are watching that one. Is there any notable functionality in his so-called PAR3-implementation lacking in current par2cmdline?]

@BlackIkeEagle (Member, Author)

par3 is similar to par2, but has some advantages

it has smaller headers and uses another technique to create recovery files
this results in smaller recovery files for the same % of recoverable data

@BlackIkeEagle (Member, Author)

the solution offered is not perfect, but it will avoid a massive slowdown

it could in some cases give false positives for missing blocks

pcordes pushed a commit to pcordes/par2-cleanhistory that referenced this issue May 6, 2015
Signed-off-by: BlackEagle <ike.devolder@gmail.com>
PeterBClements pushed a commit that referenced this issue May 26, 2015
Issue #31 was reported as a problem whereby collisions of crc32
values resulted in the calculation of md5 values at a large number of
file positions for certain types of data files.

The fix for issue #31 was to introduce code that would only scan
forwards 10 bytes whilst searching for a good datablock, and then skip
the next blocksize-10 bytes before continuing the search.

This code will never find blocks that are positioned earlier in the file
than they should be, and will also never find blocks that are positioned
more than 10 bytes later in the file than they should be.

Issue #50 reported that a single byte erasure at the start of a data
file resulted in the complete failure to find any good data block in
the file.

The fix for issue #50 is an improvement of the code intended to deal
with issue #31 such that instead of only finding datablocks that are
either exactly at the correct position in the file or no more than
10 bytes after the correct position: it will now (by default) check
up to 64 bytes before and up to 64 bytes after the expected position
of the next block (where the 'expected' position is calculated relative
to the position where the last block was found).

Additionally, two new commandline parameters were added to enable this
code to be tuned:

  -N    : No data skipping (every byte of data will be scanned for blocks)
  -S<n> : Skip leeway of n bytes (check for the next block within +/- n
          bytes of the expected position before skipping forward)

Finally, where the code would previously skip forward a whole block
if it found data that it thinks is a duplicate of a block that has
already been found, it will now start scanning from that position.
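The scanning strategy this commit message describes can be sketched as follows. This is an illustrative reconstruction, not par2cmdline's actual code: the names `scan_with_leeway` and `is_block_at` are invented, and the CRC-32/MD5 block check is reduced to a single predicate.

```python
import hashlib, os

def scan_with_leeway(data, block_size, is_block_at, leeway=64):
    """Look for the next block within +/- leeway bytes of its expected
    position, where the expected position is computed relative to the
    last block found; if nothing is nearby, skip a whole block forward.
    is_block_at(data, pos) stands in for the CRC-32 + MD5 check."""
    found = []
    expected = 0                      # where the next block should start
    while expected <= len(data) - block_size:
        hit = None
        lo = max(0, expected - leeway)
        hi = min(len(data) - block_size, expected + leeway)
        for pos in range(lo, hi + 1):
            if is_block_at(data, pos):
                hit = pos
                break
        if hit is None:
            # No block near the expected position: skip forward rather
            # than scanning every byte (the -N option would scan all).
            expected += block_size
        else:
            found.append(hit)
            expected = hit + block_size
    return found

# A file of four 512-byte blocks with a 1-byte erasure at the start:
# every surviving block sits 1 byte before its nominal position, so a
# scanner that only checks exact positions finds nothing, while a
# +/- 64 byte leeway recovers blocks 1-3.
block_size = 512
blocks = [os.urandom(block_size) for _ in range(4)]
digests = {hashlib.md5(b).digest() for b in blocks}

def is_block_at(data, pos):
    return hashlib.md5(data[pos:pos + block_size]).digest() in digests

damaged = b"".join(blocks)[1:]        # single-byte erasure (issue #50)
print(scan_with_leeway(damaged, block_size, is_block_at))
# → [511, 1023, 1535]: three blocks recovered despite the shift
```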
@Yutaka-Sawada
I found that this issue is not solved in the recent version. But the problem had been solved back in 2014, so some later change may have removed the solution.

I put the sample set of PAR2 files and source files (testCollision_2014-05-06.zip) in the "Tool" folder on OneDrive. They are the same as the old data set, which was referred to in the first post. par2cmdline v0.8.1 fails (freezes) when verifying these three cases.
