DWG large files errors #272

Open
arturred opened this issue Oct 14, 2020 · 9 comments

@arturred

Hi
Reading these files with the current binary reports many errors:

Warning: checksum: 0x31b7133b (calculated) mismatch

ERROR: read_R2004_section_info out of range
Warning: Failed to find section_info[7] with type 1
ERROR: Failed to read compressed Header section
ERROR: Invalid .props x 28191
Warning: Failed to find section_info[7] with type 3
ERROR: Failed to read compressed Classes section
Warning: Skip empty section 2329 AcDb:AcDbObjects
ERROR: Invalid opcode 0x0 in input stream at pos 294
ERROR: Failed to read compressed AcDbObjects section
Warning: Failed to find section_info[7] with type 2
ERROR: Failed to read uncompressed AuxHeader section
ERROR: Preview overflow > 29067
Warning: thumbnail.size mismatch: 29071 != 0
ERROR: Some section size or address out of bounds
ERROR: Template section not found
...

Please download samples from https://easyupload.io/jvzytl
Teigha or AutoCAD can read them. If I convert them to other dwg formats (lower or higher), they open fine.
This seems to be a file specific issue.

@rurban
Contributor

rurban commented Oct 14, 2020

Yes, this is the known section map bug #144

@rurban rurban self-assigned this Oct 14, 2020
@arturred
Author

Thanks for the info. This seems to have been a hard bug for a year now. I've also tested libdxfrw and tried to fix it there (using your suggestions about variable overflow), but no luck so far. I either get duplicated ids in the page map section or invalid addresses outside the buffer range. In other files the page map seems to be correct, but reading the section info fails. The problem is not only overflowing values but also that the decompressed buffer may contain gaps (negative page id?). I have no idea, so I hope that you will figure this out.
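
The two symptoms described above (duplicated page ids and addresses outside the buffer) can be checked mechanically. The following is only a minimal Python sketch, not code from libredwg or libdxfrw. It assumes the decompressed page map is a run of little-endian int32 pairs (page id, page size), that page addresses start at 0x100 and accumulate by size, and that a negative id marks a gap followed by four extra int32 fields. The function name `walk_page_map` and its parameters are hypothetical.

```python
import struct

def walk_page_map(data: bytes, file_size: int, first_address: int = 0x100):
    """Walk a decompressed page map and report suspicious entries.

    Assumed layout: little-endian int32 pairs (page id, page size).
    Assumed: a negative id is a gap followed by four extra int32 values.
    """
    pages, problems = {}, []
    address, pos = first_address, 0
    while pos + 8 <= len(data):
        page_id, size = struct.unpack_from("<ii", data, pos)
        pos += 8
        if size <= 0 or address + size > file_size:
            # Addresses can no longer be trusted after this point, so stop.
            problems.append(
                f"page {page_id}: range {address:#x}+{size:#x} outside file")
            break
        if page_id < 0:
            pos += 16  # skip the four extra int32 fields of a gap entry
        elif page_id in pages:
            problems.append(f"page {page_id}: duplicate id at {address:#x}")
        else:
            pages[page_id] = (address, size)
        address += size  # gaps still occupy space in the file
    return pages, problems
```

If the assumed layout is right, a file that triggers this bug should show up as either a duplicate id or an out-of-file range in `problems`, which narrows down where the map goes wrong.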

@rurban
Contributor

rurban commented Oct 15, 2020

Yes, a hard one. The more failing DWG examples we have, the easier it is to figure out the scheme.
In principle, reproducing it needs a big DWG from which many entities are then deleted, which causes the gaps.

@markstock

I think I found a ton of files that fail in this same manner: https://www.3drotterdam.nl/downloads/#/

From HEAD build, Fedora 29, GCC 8.3.1:

curl -O https://www.3drotterdam.nl/downloads/global/download//DWG/Rotterdam_Centrum.zip
unzip Rotterdam_Centrum.zip
dwgread -O GeoJSON -o Cool.json Rotterdam_Centrum/Bomen/Cool.dwg
Warning: checksum: 0x28e2125d (calculated) mismatch

ERROR: Skip section A with size 89 > 1 * 0
ERROR: read_R2004_section_info out of range
Warning: Failed to find section_info[7] with type 1
ERROR: Failed to read compressed Header section
Warning: Failed to find section_info[7] with type 3
ERROR: Failed to read compressed Classes section
Warning: Failed to find section_info[7] with type 4
ERROR: Failed to read compressed Handles section
Warning: Failed to find section_info[7] with type 2
ERROR: Failed to read uncompressed AuxHeader section
ERROR: Preview overflow
ERROR: Invalid product_checksum size 16. Need min. 16 bits, have 65280 for .
ERROR: Template section not found

ERROR: Failed to decode file: Cool.dwg 0x941

ERROR 0x941

@rurban
Contributor

rurban commented May 4, 2021

This seems to be a good example, thanks. No deleted pages, just a corrupt section_info[6] out of thin air. Interesting.

@no-such-user

It seems that many files have the checksum and other issues, including some that ship with libredwg:

root@695c816fa3f9:/libredwg/test/test-data# ../../programs/dwgread --format json example_2010.dwg 2>&1 | grep "Warning: checksum"
Warning: checksum: 0x2edd12f6 (calculated) mismatch
root@695c816fa3f9:/libredwg/test/test-data# ../../programs/dwgread --format json example_2013.dwg 2>&1 | grep "Warning: checksum"
Warning: checksum: 0x2c7512b9 (calculated) mismatch
root@695c816fa3f9:/libredwg/test/test-data# ../../programs/dwgread --format json example_2018.dwg 2>&1 | grep "Warning: checksum"
Warning: checksum: 0x2d1512c9 (calculated) mismatch
root@695c816fa3f9:/libredwg/test/test-data# ../../programs/dwgread --format json sample_2018.dwg 2>&1 | grep "Warning: checksum"
Warning: checksum: 0x2845124f (calculated) mismatch

And about a third of the sample files I am using to test.

How much does this impact the ability to extract text from the file? Are we going to miss any sections due to this issue?
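
One rough way to see which sections a given file loses is to summarize the dwgread stderr output. The sketch below is an illustration only: it assumes the `ERROR:` / `Warning:` line prefixes and the "Failed to read (un)compressed X section" wording seen in the logs quoted in this thread, and the function name `summarize_log` is hypothetical. It does not tell you how much text is actually lost.

```python
import re
from collections import Counter

# Line shapes taken from the logs quoted in this thread (an assumption
# about dwgread's output format, not a documented interface).
LINE_RE = re.compile(r"^(ERROR|Warning):\s*(.*)$")
FAILED_RE = re.compile(r"Failed to read (?:un)?compressed (\S+) section")

def summarize_log(stderr_text: str) -> dict:
    """Count errors and warnings and list sections that failed to read."""
    counts = Counter()
    failed_sections = []
    for line in stderr_text.splitlines():
        m = LINE_RE.match(line.strip())
        if not m:
            continue  # e.g. blank lines or "ERROR 0x941" without a colon
        level, message = m.groups()
        counts[level] += 1
        failed = FAILED_RE.match(message)
        if level == "ERROR" and failed:
            failed_sections.append(failed.group(1))
    return {
        "errors": counts["ERROR"],
        "warnings": counts["Warning"],
        "failed_sections": failed_sections,
    }
```

A file that only reports the checksum warning would come back with an empty `failed_sections` list, while the Cool.dwg log above would list Header, Classes, Handles and AuxHeader.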

@FishOrBear

Large files cause all the text to be garbled.
Is there a way to solve this?

@rurban
Contributor

rurban commented Apr 2, 2022 via email

@FishOrBear

FishOrBear wrote on Fri, 1 Apr 2022, 10:09:
Large files will cause all the text to be garbled. Is there a way to solve this?

Only if many objects have been deleted. No way, as of yet.

Why does dwggrep.exe read the text without garbled characters?
