Meaningful error message required rather than core dump #109
Comments
Hey Terry - I've reproduced this locally with a synthetic file (a cool tool on Windows for generating files of arbitrary size full of random bytes: RDFC). My computer heroically survived unzipping a 2GB file and a 5GB file, but finally choked on a 15GB file, giving a runtime error:
runtime: out of memory: cannot allocate 17179869184-byte block (17242652672 in use)
runtime stack:
goroutine 11 [running]:
goroutine 1 [chan receive, 3 minutes]:
goroutine 9 [chan receive, 3 minutes]:
goroutine 12 [chan receive, 3 minutes]:
goroutine 14 [semacquire, 3 minutes]:
goroutine 18 [select, 3 minutes]:
It may be a little hard to guard against this error, given that computers all differ in their RAM capacity, so I can't just set an arbitrary limit of, say, 3GB of zipped content. But I'll see what can be done.
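A note on the number in that error: 17179869184 bytes is exactly 2^34, or 16 GiB. One possible reading, and it is only an assumption (the thread does not say how the buffer is sized), is that the stream buffer grows by doubling, so holding about 15GB of data means requesting the next power of two. The sketch below just checks that arithmetic; `nextPow2` is a hypothetical helper, not a siegfried function.

```go
package main

import (
	"fmt"
	"math/bits"
)

// nextPow2 returns the smallest power of two that is >= n, for n > 0.
// Hypothetical helper used only to illustrate the arithmetic.
func nextPow2(n uint64) uint64 {
	if n&(n-1) == 0 {
		return n // already a power of two
	}
	return 1 << uint(bits.Len64(n))
}

func main() {
	const failed = 17179869184 // block size reported in the runtime error

	// The failed allocation is exactly 2^34 bytes (16 GiB).
	fmt.Println(failed == 1<<34)

	// ASSUMPTION: if the buffer doubles as it grows, a 15 GiB stream
	// would need a 16 GiB block.
	fmt.Println(nextPow2(15<<30) == failed)
}
```

If that reading is right, the peak request can be noticeably larger than the data being buffered, which makes a fixed "safe" archive size even harder to pick.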
P.S. Interesting that I got a golang runtime panic where you got a Windows OS exception panic... maybe because of accessing the file over the network?
Hmmm, tried it on the same zip that broke for you, TJ, but I didn't get your error. I got one that looks a lot like my synthetic file error:
runtime: out of memory: cannot allocate 17179869184-byte block (17244651520 in use)
runtime stack:
goroutine 277 [running]:
goroutine 1 [chan receive, 9 minutes]:
goroutine 33 [chan receive, 9 minutes]:
goroutine 278 [chan receive, 9 minutes]:
goroutine 280 [semacquire, 9 minutes]:
goroutine 327 [select, 9 minutes]:
TJ: I did some research into this. Detecting and preventing out of memory errors is evidently a hard problem! But the next release of golang (1.10) has something promising: they are working on "APIs for memory and CPU resource control". This will hopefully allow me to detect available memory before attempting to allocate a big slice. So any fix for this likely won't land before golang 1.10, which is due in early 2018. In the meantime, if you are using the "-z" flag, be aware that if your compressed file contains really big files, you can hit these out of memory errors. A temporary solution is to unzip before scanning with siegfried.
Thanks Richard. My default approach will be to unzip pre-SF scan from now on anyway.
A possible alternative approach is to back up stream contents to a temp file on disk. That way I won't need to reserve such a large chunk of memory. It is a little less tidy and may mean a significant slowdown in some scenarios, but it will at least avoid things blowing up like this.
I think an adjustable limit would be a good idea, given the wide variety in specs for user machines. Perhaps a short description in the help page could help users guesstimate their optimal ARBITRARY_LIMIT. Regarding the default limit size, it would be interesting to see how much faster it would be to process the same "consignment AV.zip" test file if the ARBITRARY_LIMIT is set to 10 times the size (~650MB).
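As a rough illustration of an adjustable limit that fails with a readable message instead of a crash, the sketch below checks each zip entry's declared uncompressed size before anything is decompressed. Everything named here is hypothetical (`checkEntries`, the `limit` value, and the `makeZip` demo helper); it is not an existing sf option. It also assumes the sizes recorded in the archive are honest, which a malformed zip need not be.

```go
package main

import (
	"archive/zip"
	"bytes"
	"fmt"
)

// checkEntries returns an error naming the first entry whose declared
// uncompressed size is larger than limit bytes. Hypothetical helper.
func checkEntries(zr *zip.Reader, limit uint64) error {
	for _, f := range zr.File {
		if f.UncompressedSize64 > limit {
			return fmt.Errorf("%s: uncompressed size %d bytes exceeds the limit of %d bytes; extract the archive and scan the files directly",
				f.Name, f.UncompressedSize64, limit)
		}
	}
	return nil
}

// makeZip builds a small in-memory archive so the example is self-contained.
func makeZip(name string, size int) (*zip.Reader, error) {
	var buf bytes.Buffer
	zw := zip.NewWriter(&buf)
	w, err := zw.Create(name)
	if err != nil {
		return nil, err
	}
	if _, err := w.Write(make([]byte, size)); err != nil {
		return nil, err
	}
	if err := zw.Close(); err != nil {
		return nil, err
	}
	return zip.NewReader(bytes.NewReader(buf.Bytes()), int64(buf.Len()))
}

func main() {
	zr, err := makeZip("video.mov", 4096)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	// In a real tool the limit would come from user configuration.
	const limit = 1024
	if err := checkEntries(zr, limit); err != nil {
		fmt.Println("skipping archive:", err)
	}
}
```

A check like this cannot know how much RAM is free, but it lets each user pick a ceiling that suits their machine and turns the failure into an ordinary error.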
Hi Richard,
I tried to use siegfried on a zip file which was too large to process (46 GB in size).
Siegfried attempted to load the whole zip into memory and failed, displaying the message below.
An out-of-memory error message would be better than a core dump in this instance.
Y:\XXXX\Converted videos, film\Consignment AV>sf -z -csv "Consignment AV.zip"
github.com/richardlehane/siegfried/internal/siegreader.(*Reader).ReadAt(0xc04253
68c0, 0xc0426345e6, 0x7010, 0x7a1a, 0x3c0632e6c, 0xa36330, 0x3c655e8, 0xd0)
c:/gopath/src/github.com/richardlehane/siegfried/internal/siegreader/rea
der.go:123 +0xd6
io.(*SectionReader).Read(0xc045cfa2a0, 0xc0426345e6, 0x7010, 0x7a1a, 0xc045c9f26
0, 0xc04201b080, 0xc045c9f260)
C:/go/src/io/io.go:465 +0x83
bufio.(*Reader).Read(0xc045b8d4a0, 0xc0426345e6, 0x7010, 0x7a1a, 0x5e6, 0x0, 0x0
)
C:/go/src/bufio/bufio.go:199 +0x1aa
io.ReadAtLeast(0x9c49a0, 0xc045b8d4a0, 0xc042634000, 0x75f6, 0x8000, 0x75f6, 0x7
d3b80, 0xc045c9f300, 0x9c49a0)
C:/go/src/io/io.go:309 +0x8d
io.ReadFull(0x9c49a0, 0xc045b8d4a0, 0xc042634000, 0x75f6, 0x8000, 0xc045c9f3b0,
0x411a6d, 0xc042046018)
C:/go/src/io/io.go:327 +0x5f
compress/flate.(*decompressor).copyData(0xc04257d300)
C:/go/src/compress/flate/inflate.go:663 +0xf5
compress/flate.(*decompressor).Read(0xc04257d300, 0xc1c5cd6000, 0x1000, 0x800000
00, 0x0, 0x100000000, 0xc145cd6000)
C:/go/src/compress/flate/inflate.go:347 +0x79
archive/zip.(*pooledFlateReader).Read(0xc045cf66a0, 0xc1c5cd6000, 0x1000, 0x8000
0000, 0x0, 0x0, 0x0)
C:/go/src/archive/zip/register.go:90 +0x139
archive/zip.(*checksumReader).Read(0xc045bbee10, 0xc1c5cd6000, 0x1000, 0x8000000
0, 0x100000000, 0x100000000, 0x0)
C:/go/src/archive/zip/reader.go:194 +0x7f
github.com/richardlehane/siegfried/internal/siegreader.(*stream).fill(0xc0422316
80, 0x80000000, 0x0, 0x0)
c:/gopath/src/github.com/richardlehane/siegfried/internal/siegreader/str
eam.go:76 +0xd3
github.com/richardlehane/siegfried/internal/siegreader.(*stream).CanSeek(0xc0422
31680, 0x0, 0xc042221001, 0xc045d00400, 0x0, 0x0)
c:/gopath/src/github.com/richardlehane/siegfried/internal/siegreader/str
eam.go:155 +0x229
github.com/richardlehane/siegfried/internal/bytematcher.(*Matcher).identify(0xc0
4207d8c0, 0xc045cf66c0, 0xc045cf2c00, 0xc045cf2c60, 0xc044701b48, 0x0, 0x1)
c:/gopath/src/github.com/richardlehane/siegfried/internal/bytematcher/id
entify.go:96 +0x15d6
created by github.com/richardlehane/siegfried/internal/bytematcher.(*Matcher).Id
entify
c:/gopath/src/github.com/richardlehane/siegfried/internal/bytematcher/by
tematcher.go:173 +0xd3
goroutine 1 [chan receive, 3 minutes]:
github.com/richardlehane/siegfried.(*Siegfried).IdentifyBuffer(0xc04207d970, 0xc
045cf66c0, 0x0, 0x0, 0xc045bc0ee0, 0x70, 0x0, 0x0, 0xc04202d150, 0x47d227, ...)
c:/gopath/src/github.com/richardlehane/siegfried/siegfried.go:385 +0x103
4
main.identifyRdr(0x3c30030, 0xc045bbee10, 0xc045be0000, 0xc042034fc0, 0x809918)
c:/gopath/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:205 +0x127
main.identifyRdr(0x9c5960, 0xc0424386c8, 0xc04239e580, 0xc042034fc0, 0x809918)
c:/gopath/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:266 +0x59f
main.readFile(0xc04239e580, 0xc042034fc0, 0x809918)
c:/gopath/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:183 +0xa9
main.identifyFile(0xc04239e580, 0xc042034fc0, 0x809918)
c:/gopath/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:191 +0x97
main.identify.func1(0xc0420480a0, 0x12, 0x9cc400, 0xc042035020, 0x0, 0x0, 0xc042
5edbf0, 0xc0420e61a0)
c:/gopath/src/github.com/richardlehane/siegfried/cmd/sf/longpath_windows
.go:113 +0x4fb
path/filepath.walk(0xc0420480a0, 0x12, 0x9cc400, 0xc042035020, 0xc042231590, 0x0
, 0x50)
C:/go/src/path/filepath/path.go:356 +0x88
path/filepath.Walk(0xc0420480a0, 0x12, 0xc042231590, 0x7, 0x0)
C:/go/src/path/filepath/path.go:403 +0x124
main.identify(0xc042034fc0, 0xc0420480a0, 0x12, 0x0, 0x0, 0x0, 0x809918, 0xed194
6cf2, 0xa16ca0)
c:/gopath/src/github.com/richardlehane/siegfried/cmd/sf/longpath_windows
.go:116 +0xe3
main.main()
c:/gopath/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:466 +0xad4
goroutine 4 [chan receive, 3 minutes]:
main.printer(0xc042034fc0, 0xc042231540)
c:/gopath/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:153 +0xba
created by main.main
c:/gopath/src/github.com/richardlehane/siegfried/cmd/sf/sf.go:388 +0x8a1
goroutine 287 [chan receive, 3 minutes]:
github.com/richardlehane/siegfried/internal/bytematcher.(*Matcher).scorer.func6(
0xc045cf2cc0, 0xc044701b60, 0xc044701b58, 0xc044701b50, 0xc045cfa300, 0xc04207d8
c0, 0xc045cfa330, 0xc045cfa390, 0xc045cf6780, 0xc045cfa360, ...)
c:/gopath/src/github.com/richardlehane/siegfried/internal/bytematcher/sc
orer.go:390 +0x57
created by github.com/richardlehane/siegfried/internal/bytematcher.(*Matcher).sc
orer
c:/gopath/src/github.com/richardlehane/siegfried/internal/bytematcher/sc
orer.go:389 +0x3cf
goroutine 321 [semacquire, 3 minutes]:
sync.runtime_SemacquireMutex(0xc0422316bc, 0xc045b33d00)
C:/go/src/runtime/sema.go:71 +0x44
sync.(*Mutex).Lock(0xc0422316b8)
C:/go/src/sync/mutex.go:134 +0xf5
github.com/richardlehane/siegfried/internal/siegreader.(*stream).Slice(0xc042231
680, 0x41000, 0x1000, 0x0, 0x0, 0x0, 0x0, 0x0)
c:/gopath/src/github.com/richardlehane/siegfried/internal/siegreader/str
eam.go:100 +0x74
github.com/richardlehane/siegfried/internal/siegreader.(*Reader).setBuf(0xc045ce
5480, 0x41000, 0x0, 0x0)
c:/gopath/src/github.com/richardlehane/siegfried/internal/siegreader/rea
der.go:50 +0x56
github.com/richardlehane/siegfried/internal/siegreader.(*Reader).ReadByte(0xc045
ce5480, 0xc045b33f05, 0x0, 0x0)
c:/gopath/src/github.com/richardlehane/siegfried/internal/siegreader/rea
der.go:70 +0x86
github.com/richardlehane/siegfried/vendor/github.com/richardlehane/match/fwac.(*
fwac).match(0xc04256bb20, 0x9c4e20, 0xc045ce5480, 0xc045cf2d80)
c:/gopath/src/github.com/richardlehane/siegfried/vendor/github.com/richa
rdlehane/match/fwac/fwac.go:448 +0x2be
created by github.com/richardlehane/siegfried/vendor/github.com/richardlehane/ma
tch/fwac.(*fwac).Index
c:/gopath/src/github.com/richardlehane/siegfried/vendor/github.com/richa
rdlehane/match/fwac/fwac.go:439 +0x86
goroutine 299 [select, 3 minutes]:
github.com/richardlehane/siegfried/internal/siegreader.(*stream).EofSlice(0xc042
231680, 0x0, 0x1000, 0x0, 0x0, 0x0, 0x0, 0x0)
c:/gopath/src/github.com/richardlehane/siegfried/internal/siegreader/str
eam.go:132 +0x12a
github.com/richardlehane/siegfried/internal/siegreader.(*ReverseReader).setBuf(0
xc045d12140, 0x0, 0x0, 0x0)
c:/gopath/src/github.com/richardlehane/siegfried/internal/siegreader/rea
der.go:169 +0x56
github.com/richardlehane/siegfried/internal/siegreader.(*ReverseReader).ReadByte
(0xc045d12140, 0xc04256b920, 0x7706c0, 0xc045c7ee60)
c:/gopath/src/github.com/richardlehane/siegfried/internal/siegreader/rea
der.go:212 +0x86
github.com/richardlehane/siegfried/internal/siegreader.(*LimitReverseReader).Rea
dByte(0xc042221040, 0xc045770a80, 0x65, 0x65)
c:/gopath/src/github.com/richardlehane/siegfried/internal/siegreader/rea
der.go:274 +0x68
github.com/richardlehane/siegfried/vendor/github.com/richardlehane/match/fwac.(*
fwac).match(0xc04256b940, 0x9c4de0, 0xc042221040, 0xc045d00480)
c:/gopath/src/github.com/richardlehane/siegfried/vendor/github.com/richa
rdlehane/match/fwac/fwac.go:448 +0xd0
created by github.com/richardlehane/siegfried/vendor/github.com/richardlehane/ma
tch/fwac.(*fwac).Index
c:/gopath/src/github.com/richardlehane/siegfried/vendor/github.com/richa
rdlehane/match/fwac/fwac.go:439 +0x86
rax 0x440622e6c
rbx 0x7010
rcx 0x440629e7c
rdi 0xc0426345e6
rsi 0x440622e6c
rbp 0xc045c9f1a0
rsp 0xc045c9f138
r8 0x1
r9 0x0
r10 0xc0426345e6
r11 0x20
r12 0x0
r13 0x0
r14 0x456320
r15 0x0
rip 0x4581f5
rflags 0x10283
cs 0x33
fs 0x53
gs 0x2b
Y:\XXXX\Converted videos, film\Consignment AV>