An Integrated Corpus Tool With Multilingual Support for the Study of Language, Literature, and Translation
Updated Jun 21, 2024 - Python
A very simple news crawler with a funny name
Python & command-line tool to gather text on the Web: Crawling & scraping, content extraction, metadata. TXT, Markdown, CSV & XML output.
[ACL'24] MC^2: A Multilingual Corpus of Minority Languages in China (Tibetan, Uyghur, Kazakh, and Mongolian)
Cantonese text filter for screening corpus material
Weibo hot-search list crawler
📑 Galician corpus for misogyny detection
A corpus and models for the automated legal assessment of clauses in German consumer contracts.
L2SCA & LCA fork: cross-platform, GUI, without Java dependency
Estonian Grammatical Error Correction (GEC) test and development corpus that contains L2 learner texts error-annotated in the M2 format.
A parser for annotated MuseScore 3 files.
Radio Audio Corpus Collection Toolkit with HackRF One.
Chinese Language Understanding Evaluation Benchmark: datasets, baselines, pre-trained models, corpus and leaderboard
❤️ Emotional First Aid Dataset: a psychological counseling Q&A and chatbot corpus
🚁 Insurance industry corpus, chatbot