Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Crash when WriteUsedFontsDefinitions #237

Closed
da-liii opened this issue Dec 25, 2023 · 17 comments · Fixed by #240
Closed

Crash when WriteUsedFontsDefinitions #237

da-liii opened this issue Dec 25, 2023 · 17 comments · Fixed by #240

Comments

@da-liii
Copy link
Contributor

da-liii commented Dec 25, 2023

  • pdfwriter version: v4.6.1
  • os: macOS 13
  • arch: arm64

I changed one of the font from Songti.ttc to NotoSerifCJK-Regular.ttc, and it crashes.

6   MoganResearch                 	       0x103374a70 CharStringType2Interpreter::InterpretCallGSubr(unsigned char*) + 92
7   MoganResearch                 	       0x1033728c0 CharStringType2Interpreter::Intepret(IndexElement const&, IType2InterpreterImplementation*) + 288
8   MoganResearch                 	       0x103361eb0 CFFFileInput::CalculateDependenciesForCharIndex(unsigned short, unsigned short, CharString2Dependencies&) + 140
9   MoganResearch                 	       0x103367010 CFFEmbeddedFontWriter::AddComponentGlyphs(unsigned int, std::__1::set<unsigned int, std::__1::less<unsigned int>, std::__1::allocator<unsigned int> >&, bool&) + 100
10  MoganResearch                 	       0x103365088 CFFEmbeddedFontWriter::AddDependentGlyphs(std::__1::vector<unsigned int, std::__1::allocator<unsigned int> >&) + 96
11  MoganResearch                 	       0x1033649c8 CFFEmbeddedFontWriter::CreateCFFSubset(FreeTypeFaceWrapper&, std::__1::vector<unsigned int, std::__1::allocator<unsigned int> > const&, std::__1::vector<unsigned short, std::__1::allocator<unsigned short> >*, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, bool&, MyStringBuf&) + 416
12  MoganResearch                 	       0x103364628 CFFEmbeddedFontWriter::WriteEmbeddedFont(FreeTypeFaceWrapper&, std::__1::vector<unsigned int, std::__1::allocator<unsigned int> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, ObjectsContext*, std::__1::vector<unsigned short, std::__1::allocator<unsigned short> >*, unsigned long&) + 152
13  MoganResearch                 	       0x103352a20 CFFDescendentFontWriter::WriteFont(unsigned long, std::__1::basic_string<char, std::__1::char_traits<char>, std::__1::allocator<char> > const&, FreeTypeFaceWrapper&, std::__1::vector<std::__1::pair<unsigned int, GlyphEncodingInfo>, std::__1::allocator<std::__1::pair<unsigned int, GlyphEncodingInfo> > > const&, ObjectsContext*, bool) + 740
14  MoganResearch                 	       0x103306290 CIDFontWriter::WriteFont(FreeTypeFaceWrapper&, WrittenFontRepresentation*, ObjectsContext*, IDescendentFontWriter*, bool) + 732
15  MoganResearch                 	       0x1033582d8 WrittenFontCFF::WriteFontDefinition(FreeTypeFaceWrapper&, bool) + 216
16  MoganResearch                 	       0x103349364 UsedFontsRepository::WriteUsedFontsDefinitions() + 72
17  MoganResearch                 	       0x103310fe8 PDFHummus::DocumentContext::FinalizeNewPDF() + 32
18  MoganResearch                 	       0x103336e34 PDFWriter::EndPDF() + 76
19  MoganResearch                 	       0x103191480 pdf_hummus_renderer_rep::~pdf_hummus_renderer_rep() + 1208 

I guess there are some thing wrong here:
https://github.com/XmacsLabs/mogan/blob/v1.2.2/src/Plugins/Pdf/pdf_hummus_renderer.cpp#L1207-L1245

I checked pdfWriter.GetFontForFile, for the same font file, it always return the same pointer.

@da-liii
Copy link
Contributor Author

da-liii commented Dec 25, 2023

Here is the way to reproduce it:
https://github.com/XmacsLabs/mogan/releases/tag/v1.2.2

build it and launch it and manually import the Noto CJK SC ttc (have to download the ttc first)

And then open the doc (https://github.com/XmacsLabs/mogan/blob/v1.2.2/TeXmacs/doc/about/mogan/research.zh.tm), and change the font to Noto Serif CJK SC, and then export it to PDF.

Here is the font I'm using:
NotoSerifCJK-Regular.ttc

@da-liii
Copy link
Contributor Author

da-liii commented Dec 25, 2023

A simplified way to reproduce it is:

  • Install NotoSerifCJK-Redular.ttc to ~/Library/Fonts via double click and install
  • Download and install Mogan Research 1.2.2 on macOS arm64 (because it is a release version, there are no debug info, the crash actually happen in PDF HUmmus)
  • Open Mogan Research and open the document below
  • Export it to PDF

You should save the following snippets into crash.tm and then open it.

<TeXmacs|2.1.2>

<style|<tuple|generic|chinese>>

<\body>
  \<#5E2E\>\<#52A9\>\<#83DC\>\<#5355\>\<#4E0B\>\<#7684\>\<#6587\>\<#6863\>\<#9ED8\>\<#8BA4\>\<#4F1A\>\<#663E\>\<#793A\>\<#4E2D\>\<#6587\>\<#6587\>\<#6863\>\<#FF0C\>\<#5F53\>\<#4E2D\>\<#6587\>\<#6587\>\<#6863\>\<#4E0D\>\<#5B58\>\<#5728\>\<#FF08\>\<#6BD4\>\<#5982\>\<#6CA1\>\<#6709\>\<#88AB\>\<#7FFB\>\<#8BD1\>\<#FF09\>
</body>

<\initial>
  <\collection>
    <associate|font|Noto CJK SC>
    <associate|font-family|rm>
    <associate|page-medium|paper>
  </collection>
</initial>

The only related font file is NotoSerifCJK-Regular.ttc via debugging info.

And the crash depends on the content of the document.

@da-liii
Copy link
Contributor Author

da-liii commented Dec 25, 2023

Another truth is that when using Songti SC, it works fine.

@da-liii
Copy link
Contributor Author

da-liii commented Dec 25, 2023

#150 might be the related issue.

@da-liii
Copy link
Contributor Author

da-liii commented Dec 25, 2023

Well, finally, I found that it will crash on the character

https://old.unicode-table.com/cn/9ED8/

here is the document to reproduce the bug:

<TeXmacs|2.1.2>

<style|<tuple|generic|chinese>>

<\body>
  \<#9ED8\>
</body>

<\initial>
  <\collection>
    <associate|font|Noto CJK SC>
    <associate|font-family|rm>
    <associate|page-medium|paper>
  </collection>
</initial>

@da-liii
Copy link
Contributor Author

da-liii commented Dec 25, 2023

Tried SourceHanSerifSC-Regular.otf in https://github.com/adobe-fonts/source-han-serif , it crashes too.

@da-liii
Copy link
Contributor Author

da-liii commented Dec 26, 2023

image
inFontIndex: 0
inCharStringIndex: 46956

// the dumped op and oprand
-44	16	363	29	96	30	6	29	227	30	7	-79
op 29
-73
op 29
33.5
op 11

op 18

op 11
84	55	-55	151	-125	27	55	43	-23	61	-20	61	-57	53	31	57	31	55	-29	54	142	63	-63	65	-59	65
op 20
-65
op 0

op 19
26
op 0
309	164
op 21
-14	-6
op 5
23	-37	23	-60	-47
op 26
50	-48	61	109	-143	89
op 8
100	29
op 21
-12	-7	27	-28	27	-48	4	-38
op 25
52	-43	54	108	-152	56
op 8
-196	-50
op 21
-15	-3	12	-45	8	-69	-10	-54
op 25
43	-59	77	110	-115	120
op 8
-85	-12
op 21
-18	1	1	-52	-29	-62	-26	-24
op 25
-19	-16	-10	-22	11	-20	14	-22	37	7	18	20	26	31	16	69	-21	90
op 8

op 19

op 0

op 0
40	573
op 21
-833
op 29

op 19
-75
op 0

op 11
-14	-4	17	-41	20	-64	1	-49
op 25

op 19

op 0

op 0
38	-41	47	90	-109	109
op 8
605	74
op 21
-11	-6	31	-39	38	-62	9	-49
op 25
61	-47	58	124	-186	79
op 8

op 19

op 10

@da-liii
Copy link
Contributor Author

da-liii commented Dec 26, 2023

This operators seq (without operand) occurs in several times.

op 19

op 10

op 0

https://learn.microsoft.com/en-us/typography/opentype/spec/cff2charstr#one-byte-cff2-operators

The logic to increase the ipc for hintmask (19) needs to be reviewed.

Byte* CharStringType2Interpreter::InterpretHintMask(Byte* inProgramCounter)
{
	mStemsCount+= (unsigned short)(mOperandStack.size() / 2);

	EStatusCode status = mImplementationHelper->Type2Hintmask(mOperandStack,inProgramCounter);
	if(status != PDFHummus::eSuccess)
		return NULL;

	ClearStack();
	return inProgramCounter+(mStemsCount/8 + (mStemsCount % 8 != 0 ? 1:0));
}

@galkahana
Copy link
Owner

@darcy-shen where did you get the op code dump from?
it seems to show an error in the glyph code...as far as i understand it. op 10, which is callsubr, is not having an operand. that operand would be the subr index. hence the problem. as far as i can see this is where pdfwriter fails (i got my own tracer for glyphs in CharStringType2Tracer. i'll see if there's an expected behavior in such a case (e.g. skip?) which might work better. and also consider your suggetion RE review hintmask...though i think it's ok...but i'll check.

@da-liii
Copy link
Contributor Author

da-liii commented Dec 26, 2023

where did you get the op code dump from?

I skipped the callsubr here when there are no oprands.

Byte* CharStringType2Interpreter::InterpretCallSubr(Byte* inProgramCounter)
{
	CharString* aCharString = NULL;
	if(mOperandStack.size() < 1)
-		return NULL;
+              return inProgramCounter;

@galkahana
Copy link
Owner

gotcha. does this help in displaying the glyph properly?

@da-liii
Copy link
Contributor Author

da-liii commented Dec 26, 2023

does this help in displaying the glyph properly?

The PDF is ok but the glyph is missing. The exported pdf in the master branch is corrupted.

@da-liii
Copy link
Contributor Author

da-liii commented Dec 26, 2023

Here is my debugging pr: #239

@galkahana
Copy link
Owner

cool. I got something similar with CharStringType2Tracer.
im comparing it with a reference implementaiton called ttx

[on a mac just go brew install fonttools and ttx is one of them.]
[would upload output but github doesnt allow. too bid]

might not make it to resolve this today, but tomorrow i got all day so i'll fairly confident i'll be able to come up with a correction.

i'll compare and see where the problem is....there's definitely something, here's what they got:

          -44 16 363 29 96 30 6 29 227 30 7 -79 callgsubr
          84 55 -55 151 -125 27 55 43 -23 61 -20 61 -57 53 31 57 31 55 -29 54 142 63 -63 65 -59 65 cntrmask 011010100100101000000000
          hintmask 100000001010010100000000
          309 164 rmoveto
          -14 -6 rlineto
          23 -37 23 -60 -47 vvcurveto
          50 -48 61 109 -143 89 rrcurveto
          100 29 rmoveto
          -12 -7 27 -28 27 -48 4 -38 rlinecurve
          52 -43 54 108 -152 56 rrcurveto
          -196 -50 rmoveto
          -15 -3 12 -45 8 -69 -10 -54 rlinecurve
          43 -59 77 110 -115 120 rrcurveto
          -85 -12 rmoveto
          -18 1 1 -52 -29 -62 -26 -24 rlinecurve
          -19 -16 -10 -22 11 -20 14 -22 37 7 18 20 26 31 16 69 -21 90 rrcurveto
          hintmask 000010010000000000000000
          40 573 rmoveto
          -833 callgsubr
          -14 -4 17 -41 20 -64 1 -49 rlinecurve
          hintmask 000010010000000000000000
          38 -41 47 90 -109 109 rrcurveto
          605 74 rmoveto
          -11 -6 31 -39 38 -62 9 -49 rlinecurve
          61 -47 58 124 -186 79 rrcurveto
          hintmask 001010100000101000000000
          -344 -96 rmoveto
          -67 23 -6 -40 -15 -79 -14 -49 rlinecurve
          13 -5 27 43 25 53 14 33 rlinecurve
          10 -1 8 2 5 4 rrcurveto
          -176 -119 262 119 vlineto
          -290 -321 rmoveto
          29 vlineto
          hintmask 011010100001001000000000
          114 -96 9857 callsubr
          -95 hlineto
          -92 -9 -76 -7 -44 -2 30 -78 rcurveline
          9 2 10 8 5 12 188 32 138 30 103 22 -3 16 rcurveline
          -207 -21 rlineto
          90 172 vlineto
          14 9 262 callsubr
          -39 -48 rlineto
          -87 96 115 -32 8 hlineto
          19 28 15 6 hvcurveto
          295 -833 callsubr
          -72 -907 callgsubr
          -275 -1129 callgsubr
          -398 9 vlineto
          hintmask 001010100000101000000000
          24 22 12 6 hvcurveto
          118 59 rmoveto
          -118 262 118 hlineto
          hintmask 000101000000000000100000
          624 -174 -33 callgsubr
          -123 hlineto
          4 80 1 87 1 97 23 3 10 11 3 -215 callgsubr
          -1 -113 0 -100 -4 -90 4729 callsubr
          -988 callgsubr
          141 hlineto
          -14 -252 -50 -169 -181 -138 15 -16 rcurveline
          -1111 callgsubr
          220 133 57 173 16 259 21 -245 51 -185 120 -123 15 29 23 15 26 1 4 10 rcurveline
          -135 101 -77 180 -27 227 rrcurveto
          6830 callsubr
          hintmask 000101000000000000100000
          -29 29 -48 -96 callgsubr
          endchar
        </CharString>

@galkahana
Copy link
Owner

yeah defo a bug there right at the start when reading how many stems there should be. this results in the hummus interpreter figuring there's 6 stems, while there are at least 17 which results in reading too few bytes later.

@galkahana
Copy link
Owner

galkahana commented Dec 26, 2023

ok. figured it out. and the glyph shows correctly.

MR.
will merge soon. hope it's ok.

i have a bug (for 13 year oh my) in the implementation of cntrmask operator. as opposed to a correct implementation of hintmask it was NOT CONSIDERING it might be used in a special way that optimizes not having to call vstem.
as a result it got lots of operands...but did nothing with them.

as a result all later hintmasks and cntrmasks calls did not read enough of the following bytes (1 instead of 3) and this resulted in a wonderful mass with creating random operands.

anyways, fixed the code of cntrmask and it seems ok now.

please let me know if this resolves the problem on your end too.

@da-liii
Copy link
Contributor Author

da-liii commented Dec 27, 2023

It fixed my bug, thanks a lot. It will benefit all Mogan/GNU TeXmacs users on Linux who are using CJK characters.

@da-liii da-liii closed this as completed Dec 27, 2023
@da-liii da-liii reopened this Dec 27, 2023
@da-liii da-liii closed this as completed Dec 27, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants