Extractor failing on majority of images #2

Open
ksbickmore opened this issue Apr 2, 2017 · 8 comments
@ksbickmore

I'm having trouble determining why the extractor fails on the vast majority of images downloaded via the scraper. It works on the Netgear WNAP320, 29 of the scraped DLink images, and another 30 or so of the other Netgear images. All of the remaining images produce errors similar to the output posted below.

Extracting the same images with binwalk -e works fine (binwalk version 2.1.2b).

./sources/extractor/extractor.py -b netgear -sql 127.0.0.1 -np -nk ../b547a37d517c20b70b10657cc9f15a9a0268e62f.zip images/

Database Image ID: 768

/home/user/firmadyne/b547a37d517c20b70b10657cc9f15a9a0268e62f.zip

MD5: 29b3378ac4edad8ebd119e86c3511b82
Tag: 768
Temp: /tmp/tmpzDF4ji
Status: Kernel: True, Rootfs: False, Do_Kernel: False, Do_Rootfs: True

Zip archive data, at least v2.0 to extract, compressed size: 2112, uncompressed size: 5845, name: ReleaseNotes_FVS338_fw_2.0.0.html
Recursing into archive ...

/tmp/tmpzDF4ji/_b547a37d517c20b70b10657cc9f15a9a0268e62f.zip.extracted/fvs338_2_0_0_139.img
>> MD5: 8f6d717d07234d1be491853872fcef38
>> Tag: 768
>> Temp: /tmp/tmpaEx6r5
>> Status: Kernel: True, Rootfs: False, Do_Kernel: False, Do_Rootfs: True
>> Recursing into archive ...
>>>> Extraction failed!
>>>> gzip compressed data, maximum compression, from Unix, last modified: 2006-10-07 16:28:56
>> Recursing into compressed ...

/tmp/tmpaEx6r5/_fvs338_2_0_0_139.img.extracted/333C
>> MD5: 97bff8c2cdc9d4fdb54509fe04906df6
>> Tag: 768
>> Temp: /tmp/tmpB60hPN
>> Status: Kernel: True, Rootfs: False, Do_Kernel: False, Do_Rootfs: True
>>>> PARity archive data - file number 27745
>> Recursing into archive ...
>>>> Extraction failed!
>>>> gzip compressed data, maximum compression, from Unix, last modified: 2006-10-07 16:28:37
>> Recursing into compressed ...

/tmp/tmpB60hPN/_333C.extracted/1A3E0
>> Skipping: recursion depth 3
>> Cleaning up /tmp/tmpB60hPN...

/tmp/tmpaEx6r5/_fvs338_2_0_0_139.img.extracted/210AC0
>> MD5: a14c6cefeb7a3f73a36ca73668b9a55a
>> Tag: 768
>> Temp: /tmp/tmpBbZKR7
>> Status: Kernel: True, Rootfs: False, Do_Kernel: False, Do_Rootfs: True
>> Recursing into archive ...
>>>> Extraction failed!
>> Recursing into compressed ...
>> Cleaning up /tmp/tmpBbZKR7...

/tmp/tmpaEx6r5/_fvs338_2_0_0_139.img.extracted/210AE8
>> MD5: 19d48884221170c2085f9ab3c068774f
>> Tag: 768
>> Temp: /tmp/tmpGkoMwE
>> Status: Kernel: True, Rootfs: False, Do_Kernel: False, Do_Rootfs: True
>> Recursing into archive ...
>>>> Extraction failed!
>> Recursing into compressed ...
>> Cleaning up /tmp/tmpGkoMwE...

/tmp/tmpaEx6r5/_fvs338_2_0_0_139.img.extracted/210B10
>> MD5: 9b1103e2d69ce3e6eb951537d7406de4
>> Tag: 768
>> Temp: /tmp/tmpMwFyjT
>> Status: Kernel: True, Rootfs: False, Do_Kernel: False, Do_Rootfs: True
>> Recursing into archive ...
>>>> Extraction failed!
>> Recursing into compressed ...
>> Cleaning up /tmp/tmpMwFyjT...

/tmp/tmpaEx6r5/_fvs338_2_0_0_139.img.extracted/210B38
>> MD5: 12a4df70c4a7495c702cf18156f23197
>> Tag: 768
>> Temp: /tmp/tmp0Bdoxz
>> Status: Kernel: True, Rootfs: False, Do_Kernel: False, Do_Rootfs: True
>> Recursing into archive ...
>>>> Extraction failed!
>> Recursing into compressed ...
>> Cleaning up /tmp/tmp0Bdoxz...

/tmp/tmpaEx6r5/_fvs338_2_0_0_139.img.extracted/210B60
>> MD5: 0d4a362fa6c9cfe47412120640ee4ec0
>> Tag: 768
>> Temp: /tmp/tmptochU8
>> Status: Kernel: True, Rootfs: False, Do_Kernel: False, Do_Rootfs: True
>> Recursing into archive ...
>>>> Extraction failed!
>> Recursing into compressed ...
>> Skipping: recursion breadth 5
>> Cleaning up /tmp/tmptochU8...
>> Skipping: completed!
>> Cleaning up /tmp/tmpaEx6r5...

/tmp/tmpzDF4ji/_b547a37d517c20b70b10657cc9f15a9a0268e62f.zip.extracted/ReleaseNotes_FVS338_fw_2.0.0.html
>> MD5: af25e9cdeba6d6a5afc380dd56e54a21
>> Skipping: text/plain...

Extraction failed!
Zlib compressed data, best compression
Recursing into compressed ...

/tmp/tmpzDF4ji/_b547a37d517c20b70b10657cc9f15a9a0268e62f.zip.extracted/fvs338_2_0_0_139.img
>> MD5: 8f6d717d07234d1be491853872fcef38
>> Skipping: 8f6d717d07234d1be491853872fcef38...

/tmp/tmpzDF4ji/_b547a37d517c20b70b10657cc9f15a9a0268e62f.zip.extracted/ReleaseNotes_FVS338_fw_2.0.0.html
>> MD5: af25e9cdeba6d6a5afc380dd56e54a21
>> Skipping: af25e9cdeba6d6a5afc380dd56e54a21...

/tmp/tmpzDF4ji/_b547a37d517c20b70b10657cc9f15a9a0268e62f.zip-0.extracted/2170E3
>> MD5: 5ba158ccd71d703366609b02933972c9
>> Tag: 768
>> Temp: /tmp/tmpwPh5a9
>> Status: Kernel: True, Rootfs: False, Do_Kernel: False, Do_Rootfs: True
>> Recursing into archive ...
>>>> Extraction failed!
>> Recursing into compressed ...
>> Cleaning up /tmp/tmpwPh5a9...

/tmp/tmpzDF4ji/_b547a37d517c20b70b10657cc9f15a9a0268e62f.zip-0.extracted/217900
>> MD5: 0225d6e816bcfd23c77cee5d26a73d2d
>> Tag: 768
>> Temp: /tmp/tmppJhxpB
>> Status: Kernel: True, Rootfs: False, Do_Kernel: False, Do_Rootfs: True
>> Recursing into archive ...
>>>> Extraction failed!
>> Recursing into compressed ...
>> Cleaning up /tmp/tmppJhxpB...

/tmp/tmpzDF4ji/_b547a37d517c20b70b10657cc9f15a9a0268e62f.zip-0.extracted/2181ED
>> MD5: 22e744cae8cdec1aa9ea16073c05f4ac
>> Tag: 768
>> Temp: /tmp/tmphgOKKy
>> Status: Kernel: True, Rootfs: False, Do_Kernel: False, Do_Rootfs: True
>> Recursing into archive ...
>>>> Extraction failed!
>> Recursing into compressed ...
>> Cleaning up /tmp/tmphgOKKy...

/tmp/tmpzDF4ji/_b547a37d517c20b70b10657cc9f15a9a0268e62f.zip-0.extracted/218A07
>> MD5: 2b1dda9d98da2e30fc8fa982b93d6ae8
>> Tag: 768
>> Temp: /tmp/tmpha8u9m
>> Status: Kernel: True, Rootfs: False, Do_Kernel: False, Do_Rootfs: True
>> Recursing into archive ...
>>>> Extraction failed!
>> Recursing into compressed ...

Skipping: recursion breadth 5
>> Cleaning up /tmp/tmpha8u9m...
Skipping: completed!
Cleaning up /tmp/tmpzDF4ji...

@ddcc
Collaborator

ddcc commented Apr 2, 2017

Did you read the usage instructions? Also, the last few lines of the output indicate you're running into a recursion limit; see the variable RECURSION_BREADTH. Also, the extractor only looks for directory structures that resemble a *NIX filesystem, and ignores those that don't satisfy the heuristic; see the variable UNIX_THRESHOLD.
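The filesystem heuristic mentioned above can be pictured roughly as follows. This is a minimal sketch, not the extractor's actual code: only the name UNIX_THRESHOLD comes from this thread, while the directory list, the threshold value, and the helper name `looks_like_unix_root` are assumptions made for illustration.

```python
import os
import tempfile

# Illustrative values only; the extractor's real list and threshold may differ.
UNIX_DIRS = ["bin", "etc", "dev", "lib", "sbin", "usr", "var", "tmp", "proc", "root"]
UNIX_THRESHOLD = 4


def looks_like_unix_root(path, threshold=UNIX_THRESHOLD):
    """Return True if `path` holds at least `threshold` typical *NIX top-level directories."""
    try:
        entries = set(os.listdir(path))
    except OSError:
        # Unreadable or missing directory: treat as "not a root filesystem".
        return False
    hits = sum(
        1
        for name in UNIX_DIRS
        if name in entries and os.path.isdir(os.path.join(path, name))
    )
    return hits >= threshold


if __name__ == "__main__":
    # Build a fake extraction result and check it against the heuristic.
    with tempfile.TemporaryDirectory() as root:
        for name in ("bin", "etc", "lib", "usr"):
            os.mkdir(os.path.join(root, name))
        print(looks_like_unix_root(root))
```

Under a heuristic of this kind, an extraction that yields only loose files or a handful of unrelated directories is reported as a failure even though the unpacking itself succeeded.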

@ksbickmore
Author

I've looked at the usage instructions and the instructions here as well. fakeroot doesn't seem to work and just throws an error:
fakeroot: FAKEROOTKEY set to 604242434
fakeroot: nested operation not yet supported

I'm also using python2.7 instead of python3, because running the extractor under python3 throws errors such as:
...
  File "./sources/extractor/extractor.py", line 442, in <genexpr>
    if any(s in filetype for s in [b"application/x-executable",
TypeError: 'in <string>' requires string as left operand, not bytes

I've tried increasing RECURSION_BREADTH, and even at 100 every extraction within that same image still fails. Watching the extractor work through the directory of downloaded Netgear and DLink images, it fails on the majority of them as well. Are most of those not *nix?
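The Python 3 TypeError quoted above is the usual bytes-versus-str mismatch: under Python 3, `b"..." in some_str` raises instead of comparing. The sketch below shows one way to make such a check work for either input type. It is an illustration only and does not claim to reproduce the fix in the extractor; the helper names `to_bytes` and `is_executable_type` are hypothetical, and the type list contains just the one entry visible in the truncated traceback.

```python
def to_bytes(value, encoding="utf-8"):
    """Return `value` as bytes, encoding it first if it is a str."""
    if isinstance(value, bytes):
        return value
    return value.encode(encoding)


# Only the entry visible in the traceback; the original list is truncated there.
EXECUTABLE_TYPES = [b"application/x-executable"]


def is_executable_type(filetype):
    """Check a MIME type string (str or bytes) against the known executable types."""
    filetype = to_bytes(filetype)
    return any(s in filetype for s in EXECUTABLE_TYPES)


if __name__ == "__main__":
    print(is_executable_type("application/x-executable; charset=binary"))
    print(is_executable_type(b"text/plain"))
```

Normalizing at the boundary keeps the comparison itself unchanged, which is why the same code can run under both interpreters.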

@ddcc
Collaborator

ddcc commented Apr 2, 2017

That error indicates that you're trying to run fakeroot inside a fakeroot, which isn't supported. Are you running an entire bash session in fakeroot or something?
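One quick way to test for this situation is to look for the FAKEROOTKEY variable named in the error message. The sketch below assumes that fakeroot exports FAKEROOTKEY to the processes it wraps, which is what the message suggests; the function name `inside_fakeroot` is hypothetical.

```python
import os


def inside_fakeroot(environ=None):
    """Return True if the given environment looks like a fakeroot session."""
    env = os.environ if environ is None else environ
    # A non-empty FAKEROOTKEY suggests a fakeroot session is already active.
    return bool(env.get("FAKEROOTKEY"))


if __name__ == "__main__":
    if inside_fakeroot():
        print("FAKEROOTKEY is set; starting fakeroot again would nest.")
    else:
        print("No fakeroot session detected.")
```

If this reports an active session in a freshly opened terminal, the variable is probably being inherited from a parent process or a shell startup file.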

Compatibility with python3 should be fixed with 481fe28 .

I took a look at fvs338_2_0_0_139.img, and the reason it fails is that there's no recognizable *NIX filesystem. Binwalk just sees a bunch of gzip'ed files, which it can extract, but that's not useful for emulation. I'd guess that most of the unsupported files probably fall into this category.
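To see why a signature scanner reports "a bunch of gzip'ed files" without finding a filesystem, here is a minimal sketch of such a scan. It is not binwalk's implementation; it assumes only the gzip header bytes defined in RFC 1952 (magic 1f 8b followed by compression method 08), and the function name `find_gzip_streams` is hypothetical.

```python
import gzip
import zlib

GZIP_MAGIC = b"\x1f\x8b\x08"  # ID1, ID2, CM=deflate (RFC 1952)


def find_gzip_streams(blob):
    """Return offsets in `blob` where a complete, decodable gzip stream starts."""
    offsets = []
    start = blob.find(GZIP_MAGIC)
    while start != -1:
        decomp = zlib.decompressobj(31)  # 16 + 15: expect a gzip header
        try:
            decomp.decompress(blob[start:])
            if decomp.eof:  # stream ended cleanly, so this is a real hit
                offsets.append(start)
        except zlib.error:
            pass  # magic bytes appeared by coincidence
        start = blob.find(GZIP_MAGIC, start + 1)
    return offsets


if __name__ == "__main__":
    # A toy "firmware image": header, one gzip stream, padding, another stream.
    blob = b"HDR!" + gzip.compress(b"kernel") + b"PAD!" + gzip.compress(b"rootfs")
    print([hex(offset) for offset in find_gzip_streams(blob)])
```

Finding and unpacking these streams says nothing about whether their contents form a root filesystem, which is the distinction drawn above.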

@ksbickmore
Author

I'm just opening a terminal normally in Ubuntu, so I'm not sure why fakeroot would think that is happening.
Is it odd that such a huge number of the files that get scraped are unsupported in this way?

@ddcc
Collaborator

ddcc commented Apr 2, 2017

You might want to take a look at Table VII (last page) of the paper for some concrete numbers. There's no standard format for firmware images, so automated extraction isn't straightforward. Even manual extraction with binwalk can be quite challenging, especially if the firmware uses an RTOS, a custom filesystem, or anything non-*NIX.

@ksbickmore
Author

With the new extractor the success rate improved, but it still fails on far more Netgear and DLink images than Table VII reports. For instance, it can now extract fvs338_2_0_0_139.img, since there is a filesystem in 333C.

@ddcc
Collaborator

ddcc commented Apr 3, 2017

By new extractor, you mean with the latest commit? The only difference running under python2 vs python3 should be runtime performance; there's no change in functionality.

@ksbickmore
Author

Sorry, yes I meant the new commit. For whatever reason it was able to extract that image.
