A project that extracts Honkai: Star Rail text corpus
Updated
Jul 12, 2024 - Python
Walkthrough for converting the PMC OAS Dataset into a text corpus
Walkthrough for converting WikiMedia into a text corpus
Search for a long list of names (patterns) in a large text corpus systematically and quickly
Expands sentences in a given text corpus. The code checks for named entities (NEs) in sentences and creates new sentences by injecting new NEs from an NE list.
A tool for indexing and querying a large XML-based text corpus using Elasticsearch.
Polish RoBERTa model trained on Polish literature, Wikipedia, and OSCAR. The major assumption is that quality text will give a good model.
Walkthrough for converting congressional roll call votes into a text corpus
MirasText
Walkthrough for converting Kaggle's COVID-19 Open Research Dataset Challenge into a text corpus
Crawls data from websites to build a text corpus
A model trained on Google handwritten fonts with a text corpus containing only the digits 0-9. The main aim was to recognize ICR sheets from this training data. The model achieved an accuracy of 94.6% using Tesseract version 4.
Statistical text analysis and semantic networks with Python
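As an illustration of the sentence-expansion idea described in the list above (find a named entity in a sentence, then generate new sentences by injecting other entities in its place), here is a minimal sketch. It is not taken from the listed repository, whose implementation is not shown here. The function name `expand_sentences` and its parameters are hypothetical, and plain whole-word string matching stands in for a real named-entity recognizer.

```python
import re


def expand_sentences(sentences, known_entities, new_entities):
    """Return new sentences made by swapping each known entity for each new one.

    Hypothetical helper: whole-word matching is an assumption standing in
    for a real named-entity recognizer.
    """
    expanded = []
    for sentence in sentences:
        for entity in known_entities:
            # Whole-word match so that "Ann" does not match inside "Anna".
            pattern = re.compile(r"\b" + re.escape(entity) + r"\b")
            if not pattern.search(sentence):
                continue
            for replacement in new_entities:
                if replacement == entity:
                    continue
                # A function replacement avoids backslash-escape processing.
                expanded.append(pattern.sub(lambda _m: replacement, sentence))
    return expanded


corpus = ["Alice visited Paris.", "The weather was fine."]
result = expand_sentences(corpus, ["Alice"], ["Bob", "Carol"])
for line in result:
    print(line)
```

Sentences without a known entity are skipped, so only the first sentence in `corpus` yields new variants. A real pipeline would likely also match entity types (person for person, place for place), which this sketch does not attempt.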