Text extraction issue with Inter v4.1 in XeLaTeX-generated PDFs #774

@igrmk

Description

Text copied from a XeLaTeX-produced PDF set in Inter v4.1 contains unexpected characters, while the same document set in v3.19 extracts correctly.

To Reproduce

  1. Install XeTeX, Poppler, curl, and unzip

  2. Run the script below in a dedicated directory. Both produced PDFs are attached for reference:

    mkdir -p fonts
    curl -s -L -O --output-dir fonts "https://github.com/rsms/inter/releases/download/v3.19/Inter-3.19.zip"
    curl -s -L -O --output-dir fonts "https://github.com/rsms/inter/releases/download/v4.1/Inter-4.1.zip"
    unzip -q -o -d fonts/Inter-3.19 fonts/Inter-3.19.zip
    unzip -q -o -d fonts/Inter-4.1 fonts/Inter-4.1.zip
    
    cat <<'EOF' > inter-3.19.tex
    \documentclass{article}
    \pagestyle{empty}
    \usepackage{fontspec}
    
    \setmainfont{Inter}[
        Path           = ./fonts/Inter-3.19/Inter Desktop/,
        Extension      = .otf,
        UprightFont    = *-Regular,
        BoldFont       = *-Bold,
        ItalicFont     = *-Italic,
        BoldItalicFont = *-BoldItalic
    ]
    
    \begin{document}
    (C++) (100\%)
    \end{document}
    EOF
    
    cat <<'EOF' > inter-4.1.tex
    \documentclass{article}
    \pagestyle{empty}
    \usepackage{fontspec}
    
    \setmainfont{Inter}[
        Path           = ./fonts/Inter-4.1/extras/otf/,
        Extension      = .otf,
        UprightFont    = *-Regular,
        BoldFont       = *-Bold,
        ItalicFont     = *-Italic,
        BoldItalicFont = *-BoldItalic
    ]
    
    \begin{document}
    (C++) (100\%)
    \end{document}
    EOF
    
    mkdir -p pdfs
    xelatex -interaction=batchmode -output-directory pdfs inter-3.19.tex > /dev/null
    xelatex -interaction=batchmode -output-directory pdfs inter-4.1.tex > /dev/null
    
    pdftotext pdfs/inter-3.19.pdf - | grep -v $'\f' | grep -v '^$'
    pdftotext pdfs/inter-4.1.pdf - | grep -v $'\f' | grep -v '^$'
    
  3. The script prints the following (first line from the v3.19 PDF, second from the v4.1 PDF), even though both PDFs look fine visually:

    (C++) (100%)
    ?C?????100%? <redacted due to smileys that cannot be pasted>
    

Expected behavior

I expect it to output:

    (C++) (100%)
    (C++) (100%)
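
For anyone re-running this on another machine, the comparison can be automated. The following is a minimal sketch and not part of the original report: `check_text` is a hypothetical helper name, and it assumes a POSIX shell with `tr`, `grep`, and `od` available. It reads extracted text on stdin, drops form feeds and blank lines, and compares what is left with the expected string. On a mismatch it dumps the extracted characters so the stray ones can be identified.

```shell
# Hypothetical helper (an illustration, not from the original report).
# Reads extracted text on stdin and compares it with the expected string.
check_text() {
    expected=$1
    # Strip form feeds and blank lines; "|| true" keeps an empty result
    # from aborting scripts that run under "set -e".
    actual=$(tr -d '\f' | grep -v '^$' || true)
    if [ "$actual" = "$expected" ]; then
        echo "OK"
    else
        echo "MISMATCH"
        # Dump what was actually extracted to make odd characters visible.
        printf '%s\n' "$actual" | od -c
        return 1
    fi
}
```

With the PDFs from step 2 in place, it could be used as `pdftotext pdfs/inter-4.1.pdf - | check_text '(C++) (100%)'`.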

Environment

  • OS: macOS 15.1.1, M2
  • XeTeX 3.141592653-2.6-0.999996 (TeX Live 2024)
  • Inter Regular 4.1

Additional notes
You can also reproduce the issue by copying text from the attached PDFs. The problem shows up at least in macOS Preview.

inter-3.19.pdf
inter-4.1.pdf
