Fatal: strconv.Atoi: parsing "j": invalid syntax #386

Closed
wydhit opened this issue Oct 11, 2021 · 4 comments

wydhit commented Oct 11, 2021

pdfcpu: 0.3.12
build : 2021-07-12T00:48:22Z
commit: 63d5ab6

.\pdfcpu.exe validate -vv .\testpdf.pdf

testpdf.pdf

READ: 2021/10/11 11:15:01 logStream: no stream content
READ: 2021/10/11 11:15:01 dereferenceObject: begin, dereferencing object 842
READ: 2021/10/11 11:15:01 in use object 842
READ: 2021/10/11 11:15:01 dereferenceObject: dereferencing object 842
READ: 2021/10/11 11:15:01 ParseObject: begin, obj#842, offset:310328
READ: 2021/10/11 11:15:01 newPositionedReader: positioned to offset: 310328
READ: 2021/10/11 11:15:01 object: small object w/o stream, parse until endobj
Fatal: strconv.Atoi: parsing "j": invalid syntax
dereferenceObject: problem dereferencing object 842
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.dereferenceObject
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/pdfcpu/read.go:2304
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.dereferenceObjects
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/pdfcpu/read.go:2404
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.dereferenceXRefTable
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/pdfcpu/read.go:2483
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.Read
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/pdfcpu/read.go:80
github.com/pdfcpu/pdfcpu/pkg/api.ReadContext
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/api/api.go:48

github.com/pdfcpu/pdfcpu/pkg/api.readAndValidate
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/api/api.go:107
github.com/pdfcpu/pdfcpu/pkg/api.Info
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/api/info.go:35
github.com/pdfcpu/pdfcpu/pkg/api.InfoFile
github.com/pdfcpu/pdfcpu/pkg/cli.Info
github.com/pdfcpu/pdfcpu/pkg/cli.Process
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/cli/process.go:35
main.process
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/cmd/pdfcpu/process.go:91
main.processInfoCommand
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/cmd/pdfcpu/process.go:1179
main.commandMap.process
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/cmd/pdfcpu/cmd.go:143
main.main
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/cmd/pdfcpu/main.go:55
runtime.main
/Users/horstrutter/gotip/src/runtime/proc.go:225
runtime.goexit
/Users/horstrutter/gotip/src/runtime/asm_amd64.s:1371
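
For anyone reading the trace: the "Fatal:" text is the standard error that Go's strconv.Atoi returns for non-numeric input. The sketch below only reproduces that message. The helper parseObjNr is hypothetical and not taken from the pdfcpu source; that the parser read the token "j" where it expected a number is an assumption based on the log.

```go
package main

import (
	"fmt"
	"strconv"
)

// parseObjNr is a hypothetical helper (not pdfcpu code): it converts a
// token that is expected to be a PDF object number into an int.
func parseObjNr(tok string) (int, error) {
	return strconv.Atoi(tok)
}

func main() {
	// "842" is the object number from the log; "j" is the token from the error.
	for _, tok := range []string{"842", "j"} {
		n, err := parseObjNr(tok)
		if err != nil {
			fmt.Println("Fatal:", err)
			continue
		}
		fmt.Println("object number:", n)
	}
}
```

Running this prints "object number: 842" followed by the same "Fatal: strconv.Atoi: parsing "j": invalid syntax" line seen above.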

@hhrutter
Collaborator

Thank you for reporting this!

hhrutter self-assigned this Oct 11, 2021

ghost commented Oct 11, 2021

Oh wow I was also about to report having this issue. Here is another PDF in case it helps.

validating(mode=relaxed) DWS_KID_LU0952581584_GB_en_2021-03-29.pdf ...
 READ: 2021/10/11 08:48:38 Read: begin
 INFO: 2021/10/11 08:48:38 PDF Version 1.5 conforming reader
 READ: 2021/10/11 08:48:38 readXRefTable: begin
 READ: 2021/10/11 08:48:38 scanning for offsetLastXRefSection starting at 153335
 READ: 2021/10/11 08:48:38 Offset last xrefsection: 152830
 READ: 2021/10/11 08:48:38 buildXRefTableStartingAt: begin
 READ: 2021/10/11 08:48:38 headerVersion begin
 READ: 2021/10/11 08:48:38 headerVersion: end, found header version: 1.4
 READ: 2021/10/11 08:48:38 newPositionedReader: positioned to offset: 152830
 READ: 2021/10/11 08:48:38 xref line 1: <�:�H���,>
 READ: 2021/10/11 08:48:38 xref line 2: </U(�eL*G~�\)��?)>
 READ: 2021/10/11 08:48:38 buildXRefTableStartingAt: found xref stream
 READ: 2021/10/11 08:48:38 newPositionedReader: positioned to offset: 152830
 READ: 2021/10/11 08:48:38 parseXRefStream: begin at offset 152830
 READ: 2021/10/11 08:48:38 parseXRefStream: endInd=71(47) streamInd=-1(-1)
 READ: 2021/10/11 08:48:38 bypassXRefSection after pdfcpu: parseXRefStream: corrupt pdf file
 READ: 2021/10/11 08:48:38 newPositionedReader: positioned to offset: 0
 READ: 2021/10/11 08:48:38 processTrailer: trailer leftover: <
<<
/Size 33
/Info 1 0 R
/Root 2 0 R
/ID[<91A4CACA6C807AB4B91A0A4121774209><7D7DD03C942D9A46B3775ED6C6F95DA9>]
/Encrypt 32 0 R
>>
startxref>
 READ: 2021/10/11 08:48:38 line: <
<<
/Size 33
/Info 1 0 R
/Root 2 0 R
/ID[<91A4CACA6C807AB4B91A0A4121774209><7D7DD03C942D9A46B3775ED6C6F95DA9>]
/Encrypt 32 0 R
>>
startxref>
 READ: 2021/10/11 08:48:38 scanTrailer dictBuf after start tag: <<<
/Size 33
/Info 1 0 R
/Root 2 0 R
/ID[<91A4CACA6C807AB4B91A0A4121774209><7D7DD03C942D9A46B3775ED6C6F95DA9>]
/Encrypt 32 0 R
>>
startxref>
 READ: 2021/10/11 08:48:38 processTrailer: trailerString: (len:139) <<<
/Size 33
/Info 1 0 R
/Root 2 0 R
/ID[<91A4CACA6C807AB4B91A0A4121774209><7D7DD03C942D9A46B3775ED6C6F95DA9>]
/Encrypt 32 0 R
>>
startxref
>
 READ: 2021/10/11 08:48:38 processTrailer: trailerDict:
<<
        <Encrypt, (32 0 R)>
        <ID, [<91A4CACA6C807AB4B91A0A4121774209> <7D7DD03C942D9A46B3775ED6C6F95DA9>]>
        <Info, (1 0 R)>
        <Root, (2 0 R)>
        <Size, 33>
>>
 READ: 2021/10/11 08:48:38 parseTrailerDict begin
 READ: 2021/10/11 08:48:38 parseTrailerInfo begin
 READ: 2021/10/11 08:48:38 parseTrailerInfo: Encrypt object: (32 0 R)
 READ: 2021/10/11 08:48:38 parseTrailerInfo: Root object: (2 0 R)
 READ: 2021/10/11 08:48:38 parseTrailerInfo: Info object: (1 0 R)
 READ: 2021/10/11 08:48:38 parseTrailerInfo: ID object: [<91A4CACA6C807AB4B91A0A4121774209> <7D7DD03C942D9A46B3775ED6C6F95DA9>]
 READ: 2021/10/11 08:48:38 parseTrailerInfo end
 READ: 2021/10/11 08:48:38 parseTrailerDict end
TRACE: 2021/10/11 08:48:38 EnsureValidFreeList begin
TRACE: 2021/10/11 08:48:38 EnsureValidFreeList: empty free list.
 READ: 2021/10/11 08:48:38 readXRefTable: end
 READ: 2021/10/11 08:48:38 dereferenceXRefTable: begin
 READ: 2021/10/11 08:48:38 Encryption: (32 0 R)
 READ: 2021/10/11 08:48:38 dereferencedObject: dereferencing object 32
 READ: 2021/10/11 08:48:38 ParseObject: begin, obj#32, offset:152762
 READ: 2021/10/11 08:48:38 newPositionedReader: positioned to offset: 152762
 READ: 2021/10/11 08:48:38 object: small object w/o stream, parse until endobj
Fatal: strconv.Atoi: parsing "bj": invalid syntax

DWS_KID_LU0952581584_GB_en_2021-03-29.pdf

Author

wydhit commented Oct 11, 2021

Many PDF version 1.3 files have problems (header: %PDF-1.3).

cookbook-LeetCode-go1.3.pdf

READ: 2021/10/11 18:31:41 parseTrailerDict begin
READ: 2021/10/11 18:31:41 parseTrailerInfo begin
READ: 2021/10/11 18:31:41 parseTrailerInfo: Info object: (548 0 R)
READ: 2021/10/11 18:31:41 parseTrailerInfo: ID object: [<6828194B8291FD02D96B565CFA2AA0D5> <6828194B8291FD02D96B565CFA2AA0D5>]
READ: 2021/10/11 18:31:41 parseTrailerInfo end
READ: 2021/10/11 18:31:41 parseTrailerDict end
TRACE: 2021/10/11 18:31:41 EnsureValidFreeList begin
TRACE: 2021/10/11 18:31:41 EnsureValidFreeList: empty free list.
READ: 2021/10/11 18:31:41 readXRefTable: end
READ: 2021/10/11 18:31:41 dereferenceXRefTable: begin
READ: 2021/10/11 18:31:41 decodeObjectStreams: begin
READ: 2021/10/11 18:31:41 decodeObjectStreams: end
READ: 2021/10/11 18:31:41 dereferenceObjects: begin
READ: 2021/10/11 18:31:41 dereferenceObject: begin, dereferencing object 0
READ: 2021/10/11 18:31:41 free object 0
READ: 2021/10/11 18:31:41 dereferenceObject: begin, dereferencing object 1
READ: 2021/10/11 18:31:41 in use object 1
READ: 2021/10/11 18:31:41 dereferenceObject: dereferencing object 1
READ: 2021/10/11 18:31:41 ParseObject: begin, obj#1, offset:44234858
READ: 2021/10/11 18:31:41 newPositionedReader: positioned to offset: 44234858
READ: 2021/10/11 18:31:41 object: small object w/o stream, parse until endobj
Fatal: pdfcpu: ParseObjectAttributes: can't find "obj"
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.parseObjectAttributes
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/pdfcpu/parse.go:235
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.object
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/pdfcpu/read.go:1685
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.ParseObject
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/pdfcpu/read.go:1714
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.dereferenceObject
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/pdfcpu/read.go:2302
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.dereferenceObjects
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/pdfcpu/read.go:2404
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.dereferenceXRefTable
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/pdfcpu/read.go:2483
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.Read
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/pdfcpu/read.go:80
github.com/pdfcpu/pdfcpu/pkg/api.ReadContext
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/api/api.go:48
github.com/pdfcpu/pdfcpu/pkg/api.Validate
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/api/validate.go:43
github.com/pdfcpu/pdfcpu/pkg/api.ValidateFile
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/api/validate.go:92
github.com/pdfcpu/pdfcpu/pkg/cli.Validate
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/cli/cli.go:33
github.com/pdfcpu/pdfcpu/pkg/cli.Process
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/cli/process.go:35
main.process
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/cmd/pdfcpu/process.go:91
main.processValidateCommand
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/cmd/pdfcpu/process.go:133
main.commandMap.process
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/cmd/pdfcpu/cmd.go:143
main.main
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/cmd/pdfcpu/main.go:55
runtime.main
/Users/horstrutter/gotip/src/runtime/proc.go:225
runtime.goexit
/Users/horstrutter/gotip/src/runtime/asm_amd64.s:1371
dereferenceObject: problem dereferencing object 1
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.dereferenceObject
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/pdfcpu/read.go:2304
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.dereferenceObjects
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/pdfcpu/read.go:2404
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.dereferenceXRefTable
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/pdfcpu/read.go:2483
github.com/pdfcpu/pdfcpu/pkg/pdfcpu.Read
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/pdfcpu/read.go:80
github.com/pdfcpu/pdfcpu/pkg/api.ReadContext
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/api/api.go:48
github.com/pdfcpu/pdfcpu/pkg/api.Validate
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/api/validate.go:43
github.com/pdfcpu/pdfcpu/pkg/api.ValidateFile
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/api/validate.go:92
github.com/pdfcpu/pdfcpu/pkg/cli.Validate
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/cli/cli.go:33
github.com/pdfcpu/pdfcpu/pkg/cli.Process
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/pkg/cli/process.go:35
main.process
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/cmd/pdfcpu/process.go:91
main.processValidateCommand
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/cmd/pdfcpu/process.go:133
main.commandMap.process
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/cmd/pdfcpu/cmd.go:143
main.main
/Users/horstrutter/go/src/github.com/pdfcpu/pdfcpu/cmd/pdfcpu/main.go:55
runtime.main
/Users/horstrutter/gotip/src/runtime/proc.go:225
runtime.goexit
/Users/horstrutter/gotip/src/runtime/asm_amd64.s:1371

@hhrutter
Collaborator

Thank you all for reporting this.
Please install the latest commit with go install.

FYI, these files are all corrupt, but pdfcpu will fix them on the fly.
Validation will pass now, but running pdfcpu optimize is highly recommended for further processing.
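
To illustrate what fixing a corrupt file "on the fly" could look like, here is a minimal sketch of one conceivable fallback: when the offset recorded for an object does not point at its header, scan the file for the "N G obj" header and use the position found. This is an assumption for illustration only; it is not pdfcpu's repair code, findObject is a hypothetical name, and the buffer is made-up data.

```go
package main

import (
	"fmt"
	"regexp"
	"strconv"
)

// findObject scans buf for the header "<objNr> <genNr> obj" at the start of a
// line and returns its byte offset, or -1 if no such header exists.
// Hypothetical fallback for a wrong xref offset; not pdfcpu's implementation.
func findObject(buf []byte, objNr, genNr int) int {
	pat := `(?m)^` + strconv.Itoa(objNr) + `\s+` + strconv.Itoa(genNr) + `\s+obj\b`
	loc := regexp.MustCompile(pat).FindIndex(buf)
	if loc == nil {
		return -1
	}
	return loc[0]
}

func main() {
	// Made-up data: assume the xref entry for object 32 claims offset 18,
	// which lands inside the "endobj" of object 31.
	buf := []byte("31 0 obj\n<<>>\nendobj\n32 0 obj\n<</Filter/Standard>>\nendobj\n")
	recorded := 18

	repaired := findObject(buf, 32, 0)
	fmt.Printf("recorded offset %d, repaired offset %d\n", recorded, repaired)
}
```

The line-start anchor keeps a search for object 2 from matching inside "32 0 obj". A real reader would also have to handle files that use bare carriage returns as line endings, which this sketch ignores.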
