Further improve parsing of dictionaries / names #795
Is there any way to get into a discussion of an issue?
As written above, it's a follow-up to the previous change in #776, related to issue #775. Even with the change from #776, you can construct a PDF with dictionaries that take ages to parse (see the updated test in pdfcpu/pkg/pdfcpu/model/parse_dict_test.go, line 153 in 4978e9c).
I'm happy to discuss the changes in this merge request - at least that's my understanding of what merge requests are for. You can easily comment on individual lines, add a review, or add global comments as we are doing right now.
It's easy to come up with spec-compliant PDFs that pdfcpu will choke on, but that's true for many PDF processors out there. Instead of focusing on theoretical corner cases, I'd like to spend my time on real-world PDFs that are spec-compliant and yet cause trouble. That said, I am on board if this is about speeding up parsing of average but bigger PDF files.
Do you think you can rebase this onto the latest commit?
With this change, the names are decoded internally, so they can be compared directly when adding entries to dictionaries. On writing, the names are encoded if necessary. Also removed some duplicate code for name encoding / decoding.
Branch updated from 50422c4 to 044a6c0.
Sure, just rebased the branch on fc87a22.
Heads up: ValidationNone is gone, in case you were using it.
Thanks!
With this change, the names are decoded internally, so they can be compared directly when adding entries to dictionaries. On writing, the names are encoded if necessary.
Also removed some duplicate code for name encoding / decoding and simplified object type tests.
Follow-up to #776 to also speed up parsing of dictionaries that contain keys with a `#`.
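To illustrate the idea of decoding names once on parse and encoding them again on write, here is a minimal sketch. It is not the code from this change: `decodeName`, `encodeName` and `unhex` are hypothetical helpers, and the escape rules are assumed from the PDF name syntax (`#` followed by two hex digits). Because keys are stored decoded, `A#42` and `AB` collide in a plain map lookup without any per-comparison decoding.

```go
package main

import (
	"fmt"
	"strings"
)

// unhex converts a single hex digit to its value (hypothetical helper).
func unhex(c byte) (byte, bool) {
	switch {
	case '0' <= c && c <= '9':
		return c - '0', true
	case 'a' <= c && c <= 'f':
		return c - 'a' + 10, true
	case 'A' <= c && c <= 'F':
		return c - 'A' + 10, true
	}
	return 0, false
}

// decodeName resolves #xx escapes in a raw name (given without the leading '/').
func decodeName(raw string) (string, error) {
	if !strings.Contains(raw, "#") {
		return raw, nil // fast path: nothing to decode
	}
	var sb strings.Builder
	for i := 0; i < len(raw); i++ {
		c := raw[i]
		if c != '#' {
			sb.WriteByte(c)
			continue
		}
		if i+2 >= len(raw) {
			return "", fmt.Errorf("truncated escape in name %q", raw)
		}
		hi, ok1 := unhex(raw[i+1])
		lo, ok2 := unhex(raw[i+2])
		if !ok1 || !ok2 {
			return "", fmt.Errorf("invalid escape in name %q", raw)
		}
		sb.WriteByte(hi<<4 | lo)
		i += 2 // skip the two hex digits just consumed
	}
	return sb.String(), nil
}

// encodeName escapes bytes that may not appear literally in a written name.
func encodeName(name string) string {
	const delims = "()<>[]{}/%#"
	var sb strings.Builder
	for i := 0; i < len(name); i++ {
		c := name[i]
		if c < 0x21 || c > 0x7e || strings.IndexByte(delims, c) >= 0 {
			fmt.Fprintf(&sb, "#%02X", c)
			continue
		}
		sb.WriteByte(c)
	}
	return sb.String()
}

func main() {
	dict := map[string]string{} // keys are stored decoded
	for _, raw := range []string{"AB", "A#42", "Lime#20Green"} {
		name, err := decodeName(raw)
		if err != nil {
			fmt.Println("invalid name:", err)
			continue
		}
		if _, dup := dict[name]; dup {
			fmt.Printf("duplicate key %q (from raw %q)\n", name, raw)
			continue
		}
		dict[name] = raw
	}
	// On writing, the decoded key is encoded again only where necessary.
	fmt.Println(encodeName("Lime Green"))
}
```

The fast path in `decodeName` matters for the performance goal discussed above: names without a `#` are returned unchanged, so typical dictionaries pay almost nothing for the decoding step.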