explosion spaCy Language-support Discussions
🌍 Language Support Discussions
Discuss the language data and training models for new languages
Pinned to Language Support
🌍 Adding models for new languages master thread
Labels: enhancement (Feature requests and improvements), lang / all (Global language data), new language (Adding support for new languages to spaCy)
Discussions
🌍 Arabic language support
Labels: lang / ar (Arabic language data and models)
🌍 spaCy Turkish models are ready
Labels: lang / tr (Turkish language data and models)
🌍 creating Amharic model am_core_web_sm
Labels: lang / am (Amharic language data and models)
🌍 Development of tools for Sanskrit
Labels: lang / sa (Sanskrit language data and models), new language (Adding support for new languages to spaCy)
🌍 Moving English Dependency Parsing to Universal Dependencies
Labels: feat / parser (Feature: Dependency Parser)
🌍 French tokenization - inconsistent application of exceptions in FR_BASE_EXCEPTIONS & other unexpected tokenization
Labels: lang / fr (French language data and models), feat / tokenizer (Feature: Tokenizer)
🌍 xx_sent_ud_sm bad sentence split
Labels: models (Issues related to the statistical models), lang / zh (Chinese language data and models), lang / xx (Multi-language data and models), feat / senter (Feature: Sentence Recognizer)
🌍 Losing POS Tagging & Other Token Attributes when Segmenting with Jieba or Pkuseg
Labels: usage (General spaCy usage), feat / tokenizer (Feature: Tokenizer)
🌍 Amharic - አማርኛ (am-et) language support
Labels: lang / am (Amharic language data and models)
🌍 Why does the German sentence tokenizer consider a semicolon a sentence ending?
Labels: lang / de (German language data and models), feat / tokenizer (Feature: Tokenizer)
🌍 Other Languages Support
Labels: models (Issues related to the statistical models)
🌍 Portuguese words starting with a capital letter are not correctly lemmatized
Labels: lang / pt (Portuguese language data and models), feat / lemmatizer (Feature: Rule-based and lookup lemmatization)
🌍 Adding support for Tibetan in spaCy
Labels: new language (Adding support for new languages to spaCy)