Skip to content
This repository has been archived by the owner on Nov 13, 2021. It is now read-only.

AddUTF8FromBytes modifies its argument utf8bytes resulting in a panic if you generate two PDFs wih the "same" font bytes. #316

Closed
vdobler opened this issue Oct 1, 2019 · 11 comments

Comments

@vdobler
Copy link

vdobler commented Oct 1, 2019

I've encountered a strange problem: I can generate a PDF with embedded UTF-8 font without any problems, but just once. Generating the same PDF once more results in a panic because the input utf8bytes to AddUTF8FontFromBytes is modified.

$ ./fpdfpanic
Making PDF
--> OK
Making PDF
panic: runtime error: index out of range [20] with length 12

goroutine 1 [running]:
github.com/jung-kurt/gofpdf/v2.(*utf8FontFile).getSymbols(0xc0000cc2a0, 0x14, 0xc0000c6e28, 0xc00006c210, 0xc00006c1b0, 0xc00006a120, 0xb, 0xb, 0xc0000c6e28, 0xc00006c210, ...)
	/Users/m0252720/go/pkg/mod/github.com/jung-kurt/gofpdf/v2@v2.13.3/utf8fontfile.go:793 +0x522
github.com/jung-kurt/gofpdf/v2.(*utf8FontFile).parseSymbols(0xc0000cc2a0, 0xc0000a36e0, 0xb, 0xc00006c2a0, 0x3ff0000000000000, 0x0, 0x0, 0x0)
	/Users/m0252720/go/pkg/mod/github.com/jung-kurt/gofpdf/v2@v2.13.3/utf8fontfile.go:563 +0x541
github.com/jung-kurt/gofpdf/v2.(*utf8FontFile).GenerateСutFont(0xc0000cc2a0, 0xc0000a36e0, 0x0, 0xc000014a3a, 0x3)
	/Users/m0252720/go/pkg/mod/github.com/jung-kurt/gofpdf/v2@v2.13.3/utf8fontfile.go:672 +0x3eb
github.com/jung-kurt/gofpdf/v2.(*Fpdf).putfonts(0xc0002ce800)
	/Users/m0252720/go/pkg/mod/github.com/jung-kurt/gofpdf/v2@v2.13.3/fpdf.go:4036 +0xb78
github.com/jung-kurt/gofpdf/v2.(*Fpdf).putresources(0xc0002ce800)
	/Users/m0252720/go/pkg/mod/github.com/jung-kurt/gofpdf/v2@v2.13.3/fpdf.go:4511 +0x75
github.com/jung-kurt/gofpdf/v2.(*Fpdf).enddoc(0xc0002ce800)
	/Users/m0252720/go/pkg/mod/github.com/jung-kurt/gofpdf/v2@v2.13.3/fpdf.go:4698 +0x121
github.com/jung-kurt/gofpdf/v2.(*Fpdf).Close(0xc0002ce800)
	/Users/m0252720/go/pkg/mod/github.com/jung-kurt/gofpdf/v2@v2.13.3/fpdf.go:688 +0xa1
main.makePDF(0xc00006c150)
	/Users/m0252720/go/src/tmp/fpdfpanic/main.go:28 +0x215
main.main()
	/Users/m0252720/go/src/tmp/fpdfpanic/main.go:17 +0xd5

Here is (stripped) the code to reproduce above error:

package main

import (
	"bytes"
	"encoding/base64"
	"fmt"

	"github.com/jung-kurt/gofpdf/v2"
)

var fontBytes []byte

func main() {
	fontBytes, _ = base64.StdEncoding.DecodeString(dvscData)
	makePDF()
	makePDF()
}

func makePDF() *bytes.Buffer {
	fmt.Println("Making PDF")
	pdf := gofpdf.New("P", "mm", "A4", "")
	pdf.AddUTF8FontFromBytes("HNC", "", fontBytes)
	pdf.AddPage()
	pdf.SetXY(20, 20)
	pdf.SetFont("HNC", "", 12)
	pdf.MultiCell(0, 15, "abcABC1234", "", "L", false)
	pdf.Close()

	out := &bytes.Buffer{}
	pdf.Output(out)
	fmt.Println("--> OK")
	return out
}

// DejaVuSansCondensed.ttf in base64 encoding. (trimmed).
var dvscData = `AAEAAAAUAQAAB ..... `

The problem does not happen if the font is loaded from disk, e.g. with pdf.AddUTF8Font("HNC", "", "..../github.com/jung-kurt/gofpdf/font/DejaVuSansCondensed.ttf")

The panic does not happen on the first rendering but on all subsequent renderings (not show in demo code above).

The reason is that somehow AddUTF8FontFromBytes modifies its input utf8bytes. Makeing a copy beforehand avoids the problem and renders all PDFs without problems.

         // Working code:
	pdf := gofpdf.New("P", "mm", "A4", "")
	cpy := make([]byte, len(fontBytes))
	copy(cpy, fontBytes)
	pdf.AddUTF8FontFromBytes("HNC", "", cpy)

(Maybe this modification if the utf8bytes argument is intended but then it should be documented.)

The delta is not big, here the output of github.com/google/go-cmp/cmp.Diff'ing utf8bytes before and after:

[]uint8{
  	... // 55292 identical bytes
  	0x00, 0x9c, 0x04, 0xe1, 0x02, 0x66, 0x00, 0x8f, 0x01, 0x8d, 0x02, 0xf6, 0x00, 0xcd, 0x03, 0x44, //  |.....f.........D|
  	0x00, 0x29, 0x00, 0x66, 0x04, 0xee, 0x00, 0x73, 0x00, 0x00, 0x14, 0x00, 0x00, 0x96, 0x00, 0x00, //  |.).f...s........|
- 	0xb7,                                                                                           // -|.|
+ 	0x00,                                                                                           // +|.|
  	0x07, 0x06, 0x05, 0x04, 0x03, 0x02, 0x01, 0x00, 0x2c, 0x20, 0x10, 0xb0, 0x02, 0x25, 0x49, 0x64, //  |........, ...%Id|
  	0xb0, 0x40, 0x51, 0x58, 0x20, 0xc8, 0x59, 0x21, 0x2d, 0x2c, 0xb0, 0x02, 0x25, 0x49, 0x64, 0xb0, //  |.@QX .Y!-,..%Id.|
  	... // 120 identical bytes
  	0x25, 0x49, 0x60, 0xb0, 0x20, 0x63, 0x68, 0x20, 0x8a, 0x10, 0x8a, 0x23, 0x3a, 0x8a, 0x10, 0x65, //  |%I`. ch ...#:..e|
  	0x3a, 0x2d, 0x00, 0x00, 0x00, 0x00, 0x02, 0x00, 0x08, 0x00, 0x02, 0xff, 0xff, 0x00, 0x03, 0x00, //  |:-..............|
- 	0x02, 0x00,                                                                                     // -|..|
+ 	0x00, 0x00,                                                                                     // +|..|
  	0x5b, 0xfe, 0x96, 0x03, 0xf6, 0x05, 0xa4, 0x00, 0x03, 0x00, 0x07, 0x00, 0x00, 0x13, 0x11, 0x21, //  |[..............!|
  	0x11, 0x25, 0x21, 0x11, 0x21, 0x5c, 0x03, 0x99, 0xfc, 0xce, 0x02, 0xcc, 0xfd, 0x34, 0xfe, 0x96, //  |.%!.!\.......4..|
  	... // 479001 identical bytes
  	0xe8, 0xec, 0xbc, 0xb1, 0xc6, 0x15, 0xb8, 0xf8, 0xf8, 0x7d, 0xec, 0xbc, 0xb1, 0xc6, 0x15, 0xb8, //  |.........}......|
  	0xf7, 0xfc, 0xe8, 0xe3, 0xe3, 0x00, 0x00, 0x00, 0x00, 0x01, 0x00, 0x00, 0x00, 0x02, 0x5e, 0xb8, //  |..............^.|
- 	0x5b, 0x12, 0x3e, 0xed,                                                                         // -|[.>.|
+ 	0x00, 0x00, 0x00, 0x00,                                                                         // +|....|
  	0x5f, 0x0f, 0x3c, 0xf5, 0x00, 0x1f, 0x08, 0x00, 0x00, 0x00, 0x00, 0x00, 0xd3, 0xc2, 0x2b, 0xf9, //  |_.<...........+.|
  	0x00, 0x00, 0x00, 0x00, 0xd3, 0xc2, 0x2b, 0xf9, 0xf8, 0xa7, 0xfc, 0x4c, 0x0c, 0xea, 0x09, 0xdc, //  |......+....L....|
  	0x00, 0x20, 0x00, 0x08, 0x00, 0x00, 0x00,                                                       //  |. .....|
- 	0x01,                                                                                           // -|.|
  	0x00, 0x00, 0x00, 0x00, 0x00,                                                                   //  |.....|
+ 	0x00,                                                                                           // +|.|
  	0x01, 0x00, 0x00, 0x07, 0x6d, 0xfe, 0x1d, 0x00, 0x00, 0x0d, 0x7e, 0xf8, 0xa7, 0xfa, 0xe3, 0x0c, //  |....m.....~.....|
  	0xea,                                                                                           //  |.|
- 	0x00, 0x01,                                                                                     // -|..|
+ 	0x00, 0x01, 0x00, 0x00, 0x00,                                                                   // +|.....|
  	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00,                         //  |............|
- 	0x00, 0x00, 0x18, 0x5e, 0x04, 0x52,                                                             // -|...^.R|
+ 	0x0b, 0x00, 0x00,                                                                               // +|...|
  	0x00, 0x5b, 0x00, 0x00, 0x00, 0x00, 0x02, 0xaa, 0x00, 0x00, 0x02, 0x49, 0x00, 0x00, 0x02, 0xe2, //  |.[.........I....|
  	0x01, 0x16, 0x03, 0x4f, 0x00, 0xb1, 0x06, 0x08, 0x00, 0x8e, 0x04, 0x94, 0x00, 0x99, 0x06, 0xd7, //  |...O............|
  	... // 66316 identical bytes
  	0x4b, 0xcc, 0x00, 0x07, 0x4c, 0x50, 0x00, 0x07, 0x4c, 0xf0, 0x00, 0x07, 0x4d, 0x78, 0x00, 0x07, //  |K...LP..L...Mx..|
  	0x4e, 0x08, 0x00, 0x07, 0x4e, 0x94, 0x00, 0x07, 0x4e, 0xf4, 0x00, 0x07, 0x4f, 0x54, 0x00, 0x01, //  |N...N...N...OT..|
- 	0x00, 0x00, 0x18, 0x6d,                                                                         // -|...m|
+ 	0x00, 0x00, 0x00, 0x0b,                                                                         // +|....|
  	0x03, 0x54, 0x00, 0x2b, 0x00, 0x68, 0x00, 0x0c, 0x00, 0x02, 0x00, 0x10, 0x00, 0x99, 0x00, 0x08, //  |.T.+.h..........|
  	0x00, 0x00, 0x04, 0x15, 0x00, 0x00, 0x00, 0x08, 0x00, 0x04, 0x00, 0x00, 0x00, 0x1a, 0x01, 0x3e, //  |...............>|
  	... // 15717 identical bytes
  	0x6e, 0x73, 0x00, 0x00, 0x43, 0x00, 0x6f, 0x00, 0x6e, 0x00, 0x64, 0x00, 0x65, 0x00, 0x6e, 0x00, //  |ns..C.o.n.d.e.n.|
  	0x73, 0x00, 0x65, 0x00, 0x64, 0x00, 0x00, 0x43, 0x6f, 0x6e, 0x64, 0x65, 0x6e, 0x73, 0x65, 0x64, //  |s.e.d..Condensed|
- 	0x00, 0x00, 0x02,                                                                               // -|...|
+ 	0x00, 0x00, 0x00,                                                                               // +|...|
  	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0xff, 0x7e, 0x00, 0x5a, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, //  |.......~.Z......|
  	0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x00, 0x18, 0x6d, //  |...............m|
  	... // 63402 identical bytes
  }
@vdobler vdobler changed the title panic: runtime error: index out of range when using AddUTF8FromBytes and generating 2 PDFs AddUTF8FromBytes modifies its argument utf8bytes resulting in a panic if you generate two PDFs wih the "same" font bytes. Oct 1, 2019
@jung-kurt
Copy link
Owner

Thanks for the well-written, concise report and demonstration code, @vdobler. There is no reason why a cached copy of the font should be modified. I will look into this and hope others can provide some insights too.

@ArtemKor
Copy link

ArtemKor commented Oct 4, 2019

I fixed it and create pull reguest.

@jung-kurt
Copy link
Owner

@vdobler -- can you confirm that the problem has been corrected?

@vdobler
Copy link
Author

vdobler commented Oct 7, 2019

I do not thing that the PR #318 is a proper fix for the underlying problem. Making a copy before modifying the byte slice hides the actual problem: somehow reading the font is not a read-only action.

This type of stopgap solution can be used without this "fix" (actually that's what I'm doing now: make a copy). But I think the actual modification should be fixed instead of hidden behind a copy.

@jung-kurt
Copy link
Owner

@vdobler -- all points well-taken.

@jacobalberty
Copy link
Contributor

The font gets modified in two places within utf8fontfile.go, line 225 in the splice function. Because you are working with slices the arrays backing them get modified. splice gets called by the calls to insertUint16 at lines 782, 786, and 790. If you simply modify splice to make a copy there before proceeding you need to be aware it will make copies for the call at insertUint16 as line 736 (which might not be a bad idea as it ultimately may modify some of the backing font data, but didn't in my quick tests tonight) and another copy at the calls to splice at lines 261 and 1031.

The second place the data gets modified is at line 1025.
data = append(data, []byte{0, 0, 0}...) once again this is a slice backed by an array and you're modifying the underlying array.
I've come up with two solutions aptly named badfix and worsefix

https://github.com/jacobalberty/gofpdf/tree/badfix
Badfix creates a copy within the GenerateCutFont function by appending the data with an empty byte array, this creates a new backing array that separates those changes entirely from the underlying font data. It might need to be applied to the getTableData call at like 698 as well to prevent the call to insertUint16 at line 736 from modifying the underlying font data but I never encountered that as an issue in my testing.

https://github.com/jacobalberty/gofpdf/tree/worsefix
This one takes the brute force method of just copying the slices into new arrays in the splice function and the getMetrics function. It's pretty much guaranteed safe but I'm fairly certain it copies more than the fix in PR #318

I am not sure either of these fixes are truly the correct answer as such I have not opened a PR, though badfix is probably the best one to pursue without major refactoring.

@jung-kurt
Copy link
Owner

Excellent insights and report, @jacobalberty. I will study your branches. How did you go about figuring how the backing array was modified?

@jacobalberty
Copy link
Contributor

@jung-kurt I modified a report generator I had to hash the font bytes before and after then backed out #318 to confirm that hashing the bytes would show the issue. Then went back to with #318 included and started by looking at the getTableData function, I checked everywhere it was called then looked at where the results went eventually tracing it down to those append() calls. When I saw the splice() function's append I asked in #go-nuts on freenode to get some insight on it.

Just FYI this very situation seems to be why they added 3 index slicing into go 1.2 from the design doc for 3 index slicing
The addition of append to the language made this somewhat more important, because append lets programmers overwrite entries between len and cap without realizing it or even mentioning cap.

@jacobalberty
Copy link
Contributor

jacobalberty commented Nov 6, 2019 via email

@jung-kurt
Copy link
Owner

Very impressive analysis, @jacobalberty. Thanks for delving into this and figuring out all of the implications.

jacobalberty added a commit to jacobalberty/gofpdf that referenced this issue Nov 7, 2019
jacobalberty added a commit to jacobalberty/gofpdf that referenced this issue Nov 7, 2019
@jung-kurt
Copy link
Owner

Many thanks, @jacobalberty for putting this issue to rest ( 47a2c24).

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

4 participants