Skip to content

Commit d65fe7b

Browse files
authored
Merge pull request #403 from rubyzip/check-size
Validate entry sizes when extracting
2 parents 94b7fa2 + 97cb6ae commit d65fe7b

File tree

6 files changed

+149
-14
lines changed

6 files changed

+149
-14
lines changed

Diff for: Changelog.md

+11
Original file line numberDiff line numberDiff line change
@@ -1,5 +1,16 @@
11
# X.X.X (Next)
22

3+
-
4+
5+
# 1.3.0 (Next)
6+
7+
Security
8+
9+
- Add `validate_entry_sizes` option so that callers can trust an entry's reported size when using `extract` [#403](https://github.com/rubyzip/rubyzip/pull/403)
10+
- This option defaults to `false` for backward compatibility in this release, but you are strongly encouraged to set it to `true`. It will default to `true` in rubyzip 2.0.
11+
12+
New Feature
13+
314
- Add `add_stored` method to simplify adding entries without compression [#366](https://github.com/rubyzip/rubyzip/pull/366)
415

516
Tooling / Documentation

Diff for: README.md

+60-13
Original file line numberDiff line numberDiff line change
@@ -152,19 +152,23 @@ When modifying a zip archive the file permissions of the archive are preserved.
152152
### Reading a Zip file
153153

154154
```ruby
155+
MAX_SIZE = 1024**2 # 1MiB (but of course you can increase this)
155156
Zip::File.open('foo.zip') do |zip_file|
156157
# Handle entries one by one
157158
zip_file.each do |entry|
158-
# Extract to file/directory/symlink
159159
puts "Extracting #{entry.name}"
160-
entry.extract(dest_file)
160+
raise 'File too large when extracted' if entry.size > MAX_SIZE
161+
162+
# Extract to file or directory based on name in the archive
163+
entry.extract
161164

162165
# Read into memory
163166
content = entry.get_input_stream.read
164167
end
165168

166169
# Find specific entry
167170
entry = zip_file.glob('*.csv').first
171+
raise 'File too large when extracted' if entry.size > MAX_SIZE
168172
puts entry.get_input_stream.read
169173
end
170174
```
@@ -219,6 +223,8 @@ File.open(new_path, "wb") {|f| f.write(buffer.string) }
219223

220224
## Configuration
221225

226+
### Existing Files
227+
222228
By default, rubyzip will not overwrite files if they already exist inside of the extracted path. To change this behavior, you may specify a configuration option like so:
223229

224230
```ruby
@@ -233,18 +239,63 @@ Additionally, if you want to configure rubyzip to overwrite existing files while
233239
Zip.continue_on_exists_proc = true
234240
```
235241

242+
### Non-ASCII Names
243+
236244
If you want to store non-english names and want to open them on Windows(pre 7) you need to set this option:
237245

238246
```ruby
239247
Zip.unicode_names = true
240248
```
241249

250+
Sometimes file names inside zip contain non-ASCII characters. If you can assume which encoding was used for such names and want to be able to find such entries using `find_entry` then you can force assumed encoding like so:
251+
252+
```ruby
253+
Zip.force_entry_names_encoding = 'UTF-8'
254+
```
255+
256+
Allowed encoding names are the same as accepted by `String#force_encoding`
257+
258+
### Date Validation
259+
242260
Some zip files might have an invalid date format, which will raise a warning. You can hide this warning with the following setting:
243261

244262
```ruby
245263
Zip.warn_invalid_date = false
246264
```
247265

266+
### Size Validation
267+
268+
**This setting defaults to `false` in rubyzip 1.3 for backward compatibility, but it will default to `true` in rubyzip 2.0.**
269+
270+
If you set
271+
```
272+
Zip.validate_entry_sizes = true
273+
```
274+
then `rubyzip`'s `extract` method checks that an entry's reported uncompressed size is not (significantly) smaller than its actual size. This is to help you protect your application against [zip bombs](https://en.wikipedia.org/wiki/Zip_bomb). Before `extract`ing an entry, you should check that its size is in the range you expect. For example, if your application supports processing up to 100 files at once, each up to 10MiB, your zip extraction code might look like:
275+
276+
```ruby
277+
MAX_FILE_SIZE = 10 * 1024**2 # 10MiB
278+
MAX_FILES = 100
279+
Zip::File.open('foo.zip') do |zip_file|
280+
num_files = 0
281+
zip_file.each do |entry|
282+
num_files += 1 if entry.file?
283+
raise 'Too many extracted files' if num_files > MAX_FILES
284+
raise 'File too large when extracted' if entry.size > MAX_FILE_SIZE
285+
entry.extract
286+
end
287+
end
288+
```
289+
290+
If you need to extract zip files that report incorrect uncompressed sizes and you really trust them not too be too large, you can disable this setting with
291+
```ruby
292+
Zip.validate_entry_sizes = false
293+
```
294+
295+
Note that if you use the lower level `Zip::InputStream` interface, `rubyzip` does *not* check the entry `size`s. In this case, the caller is responsible for making sure it does not read more data than expected from the input stream.
296+
297+
### Default Compression
298+
248299
You can set the default compression level like so:
249300

250301
```ruby
@@ -253,13 +304,17 @@ Zip.default_compression = Zlib::DEFAULT_COMPRESSION
253304

254305
It defaults to `Zlib::DEFAULT_COMPRESSION`. Possible values are `Zlib::BEST_COMPRESSION`, `Zlib::DEFAULT_COMPRESSION` and `Zlib::NO_COMPRESSION`
255306

256-
Sometimes file names inside zip contain non-ASCII characters. If you can assume which encoding was used for such names and want to be able to find such entries using `find_entry` then you can force assumed encoding like so:
307+
### Zip64 Support
308+
309+
By default, Zip64 support is disabled for writing. To enable it do this:
257310

258311
```ruby
259-
Zip.force_entry_names_encoding = 'UTF-8'
312+
Zip.write_zip64_support = true
260313
```
261314

262-
Allowed encoding names are the same as accepted by `String#force_encoding`
315+
_NOTE_: If you will enable Zip64 writing then you will need zip extractor with Zip64 support to extract archive.
316+
317+
### Block Form
263318

264319
You can set multiple settings at the same time by using a block:
265320

@@ -272,14 +327,6 @@ You can set multiple settings at the same time by using a block:
272327
end
273328
```
274329

275-
By default, Zip64 support is disabled for writing. To enable it do this:
276-
277-
```ruby
278-
Zip.write_zip64_support = true
279-
```
280-
281-
_NOTE_: If you will enable Zip64 writing then you will need zip extractor with Zip64 support to extract archive.
282-
283330
## Developing
284331

285332
To run the test you need to do this:

Diff for: lib/zip.rb

+3-1
Original file line numberDiff line numberDiff line change
@@ -42,7 +42,8 @@ module Zip
4242
:write_zip64_support,
4343
:warn_invalid_date,
4444
:case_insensitive_match,
45-
:force_entry_names_encoding
45+
:force_entry_names_encoding,
46+
:validate_entry_sizes
4647

4748
def reset!
4849
@_ran_once = false
@@ -54,6 +55,7 @@ def reset!
5455
@write_zip64_support = false
5556
@warn_invalid_date = true
5657
@case_insensitive_match = false
58+
@validate_entry_sizes = false
5759
end
5860

5961
def setup

Diff for: lib/zip/entry.rb

+12
Original file line numberDiff line numberDiff line change
@@ -603,9 +603,21 @@ def create_file(dest_path, _continue_on_exists_proc = proc { Zip.continue_on_exi
603603
get_input_stream do |is|
604604
set_extra_attributes_on_path(dest_path)
605605

606+
bytes_written = 0
607+
warned = false
606608
buf = ''.dup
607609
while (buf = is.sysread(::Zip::Decompressor::CHUNK_SIZE, buf))
608610
os << buf
611+
bytes_written += buf.bytesize
612+
if bytes_written > size && !warned
613+
message = "Entry #{name} should be #{size}B but is larger when inflated"
614+
if ::Zip.validate_entry_sizes
615+
raise ::Zip::EntrySizeError, message
616+
else
617+
puts "WARNING: #{message}"
618+
warned = true
619+
end
620+
end
609621
end
610622
end
611623
end

Diff for: lib/zip/errors.rb

+1
Original file line numberDiff line numberDiff line change
@@ -4,6 +4,7 @@ class EntryExistsError < Error; end
44
class DestinationFileExistsError < Error; end
55
class CompressionMethodError < Error; end
66
class EntryNameError < Error; end
7+
class EntrySizeError < Error; end
78
class InternalError < Error; end
89
class GPFBit3Error < Error; end
910

Diff for: test/file_extract_test.rb

+62
Original file line numberDiff line numberDiff line change
@@ -10,6 +10,10 @@ def setup
1010
::File.delete(EXTRACTED_FILENAME) if ::File.exist?(EXTRACTED_FILENAME)
1111
end
1212

13+
def teardown
14+
::Zip.reset!
15+
end
16+
1317
def test_extract
1418
::Zip::File.open(TEST_ZIP.zip_name) do |zf|
1519
zf.extract(ENTRY_TO_EXTRACT, EXTRACTED_FILENAME)
@@ -80,4 +84,62 @@ def test_extract_non_entry_2
8084
end
8185
assert(!File.exist?(outFile))
8286
end
87+
88+
def test_extract_incorrect_size
89+
# The uncompressed size fields in the zip file cannot be trusted. This makes
90+
# it harder for callers to validate the sizes of the files they are
91+
# extracting, which can lead to denial of service. See also
92+
# https://en.wikipedia.org/wiki/Zip_bomb
93+
Dir.mktmpdir do |tmp|
94+
real_zip = File.join(tmp, 'real.zip')
95+
fake_zip = File.join(tmp, 'fake.zip')
96+
file_name = 'a'
97+
true_size = 500_000
98+
fake_size = 1
99+
100+
::Zip::File.open(real_zip, ::Zip::File::CREATE) do |zf|
101+
zf.get_output_stream(file_name) do |os|
102+
os.write 'a' * true_size
103+
end
104+
end
105+
106+
compressed_size = nil
107+
::Zip::File.open(real_zip) do |zf|
108+
a_entry = zf.find_entry(file_name)
109+
compressed_size = a_entry.compressed_size
110+
assert_equal true_size, a_entry.size
111+
end
112+
113+
true_size_bytes = [compressed_size, true_size, file_name.size].pack('LLS')
114+
fake_size_bytes = [compressed_size, fake_size, file_name.size].pack('LLS')
115+
116+
data = File.binread(real_zip)
117+
assert data.include?(true_size_bytes)
118+
data.gsub! true_size_bytes, fake_size_bytes
119+
120+
File.open(fake_zip, 'wb') do |file|
121+
file.write data
122+
end
123+
124+
Dir.chdir tmp do
125+
::Zip::File.open(fake_zip) do |zf|
126+
a_entry = zf.find_entry(file_name)
127+
assert_equal fake_size, a_entry.size
128+
129+
::Zip.validate_entry_sizes = false
130+
a_entry.extract
131+
assert_equal true_size, File.size(file_name)
132+
FileUtils.rm file_name
133+
134+
::Zip.validate_entry_sizes = true
135+
error = assert_raises ::Zip::EntrySizeError do
136+
a_entry.extract
137+
end
138+
assert_equal \
139+
'Entry a should be 1B but is larger when inflated',
140+
error.message
141+
end
142+
end
143+
end
144+
end
83145
end

0 commit comments

Comments
 (0)