
Map Chinese region codes (zh-TW, zh-CN) to script codes when passing lang to Pandoc #14416

@cderv

Description


When a user sets lang: zh-TW, Pandoc emits:

[WARNING] Could not load translations for zh-TW
  data file translations/zh.yaml not found

Pandoc ships only zh-Hans.yaml and zh-Hant.yaml (IETF-recommended script subtags for Chinese). Its fallback logic strips to the primary language subtag (zh), which also doesn't exist — hence the warning:

https://github.com/jgm/pandoc/blob/8799ad87b797e67577bc5368580685c3fa8e97fb/src/Text/Pandoc/Translations.hs#L58-L86
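To make the failure mode concrete, here is a simplified TypeScript model of the lookup described above. It is only a sketch of the behavior as summarized in this issue, not a port of Pandoc's Haskell code; `resolveTranslationFile` and `shippedChinese` are hypothetical names, and the file set lists only the two Chinese files mentioned above.

```typescript
// Simplified model of the translation lookup described in this issue.
// Assumption: try the full tag first, then the primary language subtag.
function resolveTranslationFile(
  lang: string,
  available: ReadonlySet<string>,
): string | undefined {
  const exact = `${lang}.yaml`;
  if (available.has(exact)) {
    return exact;
  }
  // Fallback: strip everything after the primary language subtag.
  const primary = `${lang.split("-")[0]}.yaml`;
  return available.has(primary) ? primary : undefined;
}

// Only the Chinese translation files named in this issue (illustrative subset).
const shippedChinese: ReadonlySet<string> = new Set([
  "zh-Hans.yaml",
  "zh-Hant.yaml",
]);

// zh-TW matches neither zh-TW.yaml nor zh.yaml, so nothing is found.
const missing = resolveTranslationFile("zh-TW", shippedChinese);
// zh-Hant matches directly.
const found = resolveTranslationFile("zh-Hant", shippedChinese);
```

Under this model a region-tagged value can never reach the script-tagged files, which is why the rewrite has to happen before the value is handed to Pandoc.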

Quarto already has one special case at Pandoc invocation, which rewrites bare zh to zh-Hans:

// If the user provides only `zh` as a lang, disambiguate to 'simplified'
if (pandocMetadata.lang === "zh") {
  pandocMetadata.lang = "zh-Hans";
}

Suggestion

We could extend the mapping in pandoc.ts with region → script:

  • zh-TW, zh-HK, zh-MO → pass zh-Hant to Pandoc
  • zh-CN, zh-SG → pass zh-Hans to Pandoc

This would only affect the value handed to Pandoc. Quarto's own file resolution in src/core/language.ts would be unaffected — _language-zh-TW.yml would still load for users writing lang: zh-TW.
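A minimal sketch of what the extended mapping could look like, folding in the existing bare-zh case. The names `chineseLangToScript` and `pandocLangFor` are hypothetical and not taken from pandoc.ts; matching case-insensitively is an assumption based on BCP 47 tags being case-insensitive.

```typescript
// Hypothetical lookup table: Chinese region tags (plus bare zh) mapped to
// the script tags Pandoc ships translations for. Keys are lowercase.
const chineseLangToScript = new Map<string, string>([
  ["zh", "zh-Hans"],
  ["zh-cn", "zh-Hans"],
  ["zh-sg", "zh-Hans"],
  ["zh-tw", "zh-Hant"],
  ["zh-hk", "zh-Hant"],
  ["zh-mo", "zh-Hant"],
]);

// Returns the value to hand to Pandoc. Anything not in the table,
// including tags that already carry a script, passes through unchanged.
function pandocLangFor(lang: string): string {
  return chineseLangToScript.get(lang.trim().toLowerCase()) ?? lang;
}

// Usage sketch: rewrite only the copy of the metadata passed to Pandoc.
const pandocMetadata: { lang?: string } = { lang: "zh-TW" };
if (pandocMetadata.lang !== undefined) {
  pandocMetadata.lang = pandocLangFor(pandocMetadata.lang);
}
```

Keeping the table separate from the lookup means further region tags can be added without touching the logic, and the user's original lang value stays available for Quarto's own _language-*.yml resolution.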

Repro

---
lang: zh-TW
---

Render to any format that uses Pandoc translations (e.g. LaTeX/PDF with chapters). The warning appears in the render log.

From #14409.


Labels: bug, pandoc
