---
title: Language and locale support for prebuilt models - Document Intelligence (formerly Form Recognizer)
titleSuffix: Azure AI services
description: Document Intelligence prebuilt / pretrained model language extraction and detection support.
author: laujan
manager: nitinme
ms.service: azure-ai-document-intelligence
ms.custom: ignite-2023
ms.topic: conceptual
ms.date: 02/29/2024
---

Language support: prebuilt models

::: moniker range="doc-intel-4.0.0"
[!INCLUDE applies to v4.0]
::: moniker-end

::: moniker range="doc-intel-3.1.0"
[!INCLUDE applies to v3.1]
::: moniker-end

::: moniker range="doc-intel-3.0.0"
[!INCLUDE applies to v3.0]
::: moniker-end

::: moniker range="doc-intel-2.1.0"
[!INCLUDE applies to v2.1]
::: moniker-end

Azure AI Document Intelligence models support multilingual document processing, so your applications can handle documents from users in many languages and regions. Prebuilt models let you add domain-specific document processing to your apps and flows without training and building your own models. The following tables list the available language and locale support by model and feature:

Business card

:::moniker range="doc-intel-4.0.0"

Important

Starting with Document Intelligence v4.0 (preview), the business card model (prebuilt-businessCard) is deprecated. To extract data from business cards, use an earlier version of the model (v3.1, v3.0, or v2.1).

| Feature | Version | Model ID |
|:---|:---|:---|
| Business card model | v3.1:2023-07-31 (GA), v3.0:2022-08-31 (GA), v2.1 (GA) | prebuilt-businessCard |
:::moniker-end

:::moniker range="doc-intel-3.1.0 || doc-intel-3.0.0"

Model ID: prebuilt-businessCard

| Language | Locale code |
|:---|:---|
| English (United States) | en-US |
| English (Australia) | en-AU |
| English (Canada) | en-CA |
| English (United Kingdom) | en-GB |
| English (India) | en-IN |
| English (Japan) | en-JP |
| Japanese (Japan) | ja-JP |

Default: Autodetected (en-US or ja-JP)

:::moniker-end

:::moniker range="doc-intel-2.1.0"

| Language | Locale code |
|:---|:---|
| English (United States) | en-US |
| English (Australia) | en-AU |
| English (Canada) | en-CA |
| English (United Kingdom) | en-GB |
| English (India) | en-IN |

Default: Autodetected

:::moniker-end
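
As an illustration of how an application might consult the business card locale lists above before submitting a document, here is a minimal Python sketch. The names `BUSINESS_CARD_LOCALES` and `is_supported_locale` are hypothetical and not part of any Document Intelligence SDK; the locale values are copied from the tables in this section, and the case-insensitive comparison is an assumption about how callers might write locale tags.

```python
# Hypothetical lookup built from the business card tables above;
# not part of any Document Intelligence SDK.
BUSINESS_CARD_LOCALES = {
    "v3.1": {"en-US", "en-AU", "en-CA", "en-GB", "en-IN", "en-JP", "ja-JP"},
    "v3.0": {"en-US", "en-AU", "en-CA", "en-GB", "en-IN", "en-JP", "ja-JP"},
    "v2.1": {"en-US", "en-AU", "en-CA", "en-GB", "en-IN"},
}


def is_supported_locale(version: str, locale: str) -> bool:
    """Return True if `locale` is listed for the business card model in `version`.

    Versions missing from the lookup (for example v4.0, where the model is
    deprecated) always return False.
    """
    listed = BUSINESS_CARD_LOCALES.get(version, set())
    # Compare case-insensitively so "ja-jp" and "ja-JP" are treated alike.
    return locale.lower() in {item.lower() for item in listed}


print(is_supported_locale("v3.1", "ja-jp"))  # True
print(is_supported_locale("v2.1", "ja-JP"))  # False
print(is_supported_locale("v4.0", "en-US"))  # False
```

Because both listed versions autodetect the locale by default, a check like this is only relevant when the caller wants to set a locale explicitly.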

Contract

:::moniker range="doc-intel-4.0.0 || doc-intel-3.1.0"

Model ID: prebuilt-contract

| Language | Locale code | Default |
|:---|:---|:---|
| English (United States) | en-US | en-US |

:::moniker-end

Health insurance card

:::moniker range=">=doc-intel-3.0.0"

Model ID: prebuilt-healthInsuranceCard.us

| Language | Locale code | Default |
|:---|:---|:---|
| English (United States) | en-US | en-US |

:::moniker-end

ID document

:::moniker range=">=doc-intel-3.0.0"

Model ID: prebuilt-idDocument

Supported document types

| Region | Document types |
|:---|:---|
| Worldwide | Passport Book, Passport Card |
| United States | Driver License, Identification Card, Residency Permit (Green card), Social Security Card, Military ID |
| Europe | Driver License, Identification Card, Residency Permit |
| India | Driver License, PAN Card, Aadhaar Card |
| Canada | Driver License, Identification Card, Residency Permit (Maple Card) |
| Australia | Driver License, Photo Card, Key-pass ID (including digital version) |

::: moniker-end

:::moniker range="doc-intel-2.1.0"

| Region | Document types |
|:---|:---|
| Worldwide | Passport Book, Passport Card |
| United States | Driver License, Identification Card |

:::moniker-end

Invoice

Model ID: prebuilt-invoice

:::moniker range=">=doc-intel-3.1.0"

| Languages | Details |
|:---|:---|
| Albanian (sq) | Albania (al) |
| Arabic (ar) | Arabic (ar) |
| Bulgarian (bg) | Bulgaria (bg) |
| Chinese (Simplified) (zh-hans) | China (zh-hans-cn) |
| Chinese (Traditional) (zh-hant) | Hong Kong SAR (zh-hant-hk), Taiwan (zh-hant-tw) |
| Croatian (hr) | Bosnia and Herzegovina (ba), Croatia (hr), Serbia (rs) |
| Czech (cs) | Czech Republic (cz) |
| Danish (da) | Denmark (dk) |
| Dutch (nl) | Netherlands (nl) |
| English (en) | United States (us), Australia (au), Canada (ca), United Kingdom (uk), India (in) |
| Estonian (et) | Estonia (ee) |
| Finnish (fi) | Finland (fi) |
| French (fr) | France (fr) |
| German (de) | Germany (de) |
| Greek (el) | Greece (el) |
| Hebrew (he) | Hebrew (he) |
| Hungarian (hu) | Hungary (hu) |
| Icelandic (is) | Iceland (is) |
| Italian (it) | Italy (it) |
| Japanese (ja) | Japan (ja) |
| Korean (ko) | Korea (kr) |
| Latvian (lv) | Latvia (lv) |
| Lithuanian (lt) | Lithuania (lt) |
| Macedonian (mk) | Macedonia (mk) |
| Malay (ms) | Malaysia (ms) |
| Norwegian (nb) | Norway (no) |
| Polish (pl) | Poland (pl) |
| Portuguese (pt) | Portugal (pt), Brazil (br) |
| Romanian (ro) | Romania (ro) |
| Russian (ru) | Russia (ru) |
| Serbian (Cyrillic) (sr-cyrl) | Serbia (sr) |
| Serbian (Latin) (sr-latn) | Serbia (latn-rs) |
| Slovak (sk) | Slovakia (sk) |
| Slovenian (sl) | Slovenia (sl) |
| Spanish (es) | Spain (es) |
| Swedish (sv) | Sweden (se) |
| Thai (th) | Thailand (th) |
| Turkish (tr) | Turkey (tr) |
| Ukrainian (uk) | Ukraine (uk) |
| Vietnamese (vi) | Vietnam (vi) |

| Currency Code | Details |
|:---|:---|
| ARS | Argentine Peso (ar) |
| AUD | Australian Dollar (au) |
| BAM | Bosnian Convertible Mark (ba) |
| BRL | Brazilian Real (br) |
| GBP | British Pound Sterling (gb) |
| BGN | Bulgarian Lev (bg) |
| CAD | Canadian Dollar (ca) |
| CLP | Chilean Peso (cl) |
| CNY | Chinese Yuan (cn) |
| COP | Colombian Peso (co) |
| CRC | Costa Rican Colón (cr) |
| CZK | Czech Koruna (cz) |
| DKK | Danish Krone (dk) |
| EUR | Euro (eu) |
| GGP | Guernsey Pound (gg) |
| HUF | Hungarian Forint (hu) |
| ISK | Icelandic Króna (is) |
| INR | Indian Rupee (in) |
| IDR | Indonesian Rupiah (id) |
| ILS | Israeli New Shekel (il) |
| JPY | Japanese Yen (jp) |
| MKD | Macedonian Denar (mk) |
| TWD | New Taiwan Dollar (tw) |
| NOK | Norwegian Krone (no) |
| PAB | Panamanian Balboa (pa) |
| PEN | Peruvian Sol (pe) |
| PLN | Polish Zloty (pl) |
| RON | Romanian Leu (ro) |
| RUB | Russian Ruble (ru) |
| RSD | Serbian Dinar (rs) |
| KRW | South Korean Won (kr) |
| SEK | Swedish Krona (se) |
| THB | Thai Baht (th) |
| TRY | Turkish Lira (tr) |
| UAH | Ukrainian Hryvnia (ua) |
| USD | United States Dollar (us) |
| VND | Vietnamese Dong (vn) |

:::moniker-end

:::moniker range="doc-intel-3.0.0"

| Supported languages | Details |
|:---|:---|
| English (en) | United States (us), Australia (au), Canada (ca), United Kingdom (uk), India (in) |
| Spanish (es) | Spain (es) |
| German (de) | Germany (de) |
| French (fr) | France (fr) |
| Italian (it) | Italy (it) |
| Portuguese (pt) | Portugal (pt), Brazil (br) |
| Dutch (nl) | Netherlands (nl) |

| Supported Currency Codes | Details |
|:---|:---|
| BRL | Brazilian Real (br) |
| GBP | British Pound Sterling (gb) |
| CAD | Canadian Dollar (ca) |
| EUR | Euro (eu) |
| GGP | Guernsey Pound (gg) |
| INR | Indian Rupee (in) |
| USD | United States Dollar (us) |

:::moniker-end

:::moniker range="doc-intel-2.1.0"

| Supported languages | Details |
|:---|:---|
| English (en) | United States (us) |
:::moniker-end
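
The invoice language tables above grow from one language in v2.1 to seven in v3.0 and forty in v3.1 and later. The following Python sketch shows one way an application might find the earliest version that lists a given language. `INVOICE_LANGUAGES` and `earliest_invoice_version` are hypothetical names introduced for illustration, and the language codes are transcribed from the tables in this section (lowercased for comparison).

```python
from typing import Optional

# Hypothetical summary of the invoice language tables above, ordered from the
# oldest version to the newest. The v3.1 entry also applies to later versions.
INVOICE_LANGUAGES = [
    ("v2.1", {"en"}),
    ("v3.0", {"en", "es", "de", "fr", "it", "pt", "nl"}),
    ("v3.1", {
        "sq", "ar", "bg", "zh-hans", "zh-hant", "hr", "cs", "da", "nl", "en",
        "et", "fi", "fr", "de", "el", "he", "hu", "is", "it", "ja",
        "ko", "lv", "lt", "mk", "ms", "nb", "pl", "pt", "ro", "ru",
        "sr-cyrl", "sr-latn", "sk", "sl", "es", "sv", "th", "tr", "uk", "vi",
    }),
]


def earliest_invoice_version(language_code: str) -> Optional[str]:
    """Return the earliest listed version that supports `language_code`, or None."""
    code = language_code.lower()
    for version, languages in INVOICE_LANGUAGES:
        if code in languages:
            return version
    return None


print(earliest_invoice_version("en"))  # v2.1
print(earliest_invoice_version("fr"))  # v3.0
print(earliest_invoice_version("ja"))  # v3.1
print(earliest_invoice_version("sw"))  # None
```

Keeping the list ordered from oldest to newest lets the function return on the first match without comparing version strings.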

Receipt

:::moniker range=">=doc-intel-3.0.0"

Model ID: prebuilt-receipt

| Language name | Language code | Language name | Language code |
|:---|:---|:---|:---|
| English | en | Lithuanian | lt |
| Afrikaans | af | Luxembourgish | lb |
| Akan | ak | Macedonian | mk |
| Albanian | sq | Malagasy | mg |
| Arabic | ar | Malay | ms |
| Azerbaijani | az | Maltese | mt |
| Bamanankan | bm | Maori | mi |
| Basque | eu | Marathi | mr |
| Belarusian | be | Maya, Yucatán | yua |
| Bhojpuri | bho | Mongolian | mn |
| Bosnian | bs | Nepali | ne |
| Bulgarian | bg | Norwegian | no |
| Catalan | ca | Nyanja | ny |
| Cebuano | ceb | Oromo | om |
| Corsican | co | Pashto | ps |
| Croatian | hr | Persian | fa |
| Czech | cs | Persian (Dari) | prs |
| Danish | da | Polish | pl |
| Dutch | nl | Portuguese | pt |
| Estonian | et | Punjabi | pa |
| Faroese | fo | Quechua | qu |
| Fijian | fj | Romanian | ro |
| Filipino | fil | Russian | ru |
| Finnish | fi | Samoan | sm |
| French | fr | Sanskrit | sa |
| Galician | gl | Scottish Gaelic | gd |
| Ganda | lg | Serbian (Cyrillic) | sr-cyrl |
| German | de | Serbian (Latin) | sr-latn |
| Greek | el | Sesotho | st |
| Guarani | gn | Sesotho sa Leboa | nso |
| Haitian Creole | ht | Shona | sn |
| Hawaiian | haw | Slovak | sk |
| Hebrew | he | Slovenian | sl |
| Hindi | hi | Somali (Latin) | so-latn |
| Hmong Daw | mww | Spanish | es |
| Hungarian | hu | Sundanese | su |
| Icelandic | is | Swedish | sv |
| Igbo | ig | Tahitian | ty |
| Iloko | ilo | Tajik | tg |
| Indonesian | id | Tamil | ta |
| Irish | ga | Tatar | tt |
| isiXhosa | xh | Tatar (Latin) | tt-latn |
| isiZulu | zu | Thai | th |
| Italian | it | Tongan | to |
| Japanese | ja | Turkish | tr |
| Javanese | jv | Turkmen | tk |
| Kazakh | kk | Ukrainian | uk |
| Kazakh (Latin) | kk-latn | Upper Sorbian | hsb |
| Kinyarwanda | rw | Uyghur | ug |
| Kiswahili | sw | Uyghur (Arabic) | ug-arab |
| Korean | ko | Uzbek | uz |
| Kurdish | ku | Uzbek (Latin) | uz-latn |
| Kurdish (Latin) | ku-latn | Vietnamese | vi |
| Kyrgyz | ky | Welsh | cy |
| Latin | la | Western Frisian | fy |
| Latvian | lv | Xitsonga | ts |
| Lingala | ln | | |

| Supported Languages | Language code |
|:---|:---|
| English (United States) | en-US |
| French | fr-FR |
| German | de-DE |
| Italian | it-IT |
| Japanese | ja-JP |
| Portuguese | pt-PT |
| Spanish | es-ES |

::: moniker-end

::: moniker range="doc-intel-2.1.0"

| Model | Language | Locale code |
|:---|:---|:---|
| Receipt | English (United States) | en-US |
| Receipt | English (Australia) | en-AU |
| Receipt | English (Canada) | en-CA |
| Receipt | English (United Kingdom) | en-GB |
| Receipt | English (India) | en-IN |

Default: Autodetected

::: moniker-end

Tax documents

:::moniker range="doc-intel-4.0.0"

| Model ID | Language | Locale code | Default |
|:---|:---|:---|:---|
| prebuilt-tax.us.w2 | English (United States) | en-US | en-US |
| prebuilt-tax.us.1098 | English (United States) | en-US | en-US |
| prebuilt-tax.us.1098E | English (United States) | en-US | en-US |
| prebuilt-tax.us.1098T | English (United States) | en-US | en-US |
| prebuilt-tax.us.1099 | English (United States) | en-US | en-US |
:::moniker-end

:::moniker range="doc-intel-3.1.0"

| Model ID | Language | Locale code | Default |
|:---|:---|:---|:---|
| prebuilt-tax.us.w2 | English (United States) | en-US | en-US |
| prebuilt-tax.us.1098 | English (United States) | en-US | en-US |
| prebuilt-tax.us.1098E | English (United States) | en-US | en-US |
| prebuilt-tax.us.1098T | English (United States) | en-US | en-US |
:::moniker-end

:::moniker range="doc-intel-3.0.0"

| Model ID | Language | Locale code | Default |
|:---|:---|:---|:---|
| prebuilt-tax.us.w2 | English (United States) | en-US | en-US |
:::moniker-end
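
The tax document tables above differ only in which model IDs each version lists. As a rough sketch, an application could record that availability and query it before choosing a version. `TAX_MODELS` and `versions_supporting` are hypothetical names, not SDK identifiers; the model IDs and versions are taken from the tables in this section.

```python
from typing import Dict, List

# Hypothetical availability map transcribed from the tax document tables above.
TAX_MODELS: Dict[str, List[str]] = {
    "v3.0": ["prebuilt-tax.us.w2"],
    "v3.1": [
        "prebuilt-tax.us.w2",
        "prebuilt-tax.us.1098",
        "prebuilt-tax.us.1098E",
        "prebuilt-tax.us.1098T",
    ],
    "v4.0": [
        "prebuilt-tax.us.w2",
        "prebuilt-tax.us.1098",
        "prebuilt-tax.us.1098E",
        "prebuilt-tax.us.1098T",
        "prebuilt-tax.us.1099",
    ],
}


def versions_supporting(model_id: str) -> List[str]:
    """Return the versions, oldest first, whose table lists `model_id`."""
    return [version for version, models in TAX_MODELS.items() if model_id in models]


print(versions_supporting("prebuilt-tax.us.w2"))    # ['v3.0', 'v3.1', 'v4.0']
print(versions_supporting("prebuilt-tax.us.1099"))  # ['v4.0']
```

All of the listed tax models support only English (United States), so the lookup does not need a locale dimension.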