textwizard-dev
Repositories
- WizardHTML (Public)
- WizardExtract (Public)
- TextWizard (Public): TextWizard is a Python library to extract, clean, and analyze text from PDFs, Office docs, images, CSV, and HTML/XML. It provides local OCR (Tesseract) and Azure Document Intelligence, NER (spaCy/Stanza), language detection, spell check, lexical statistics, and HTML tools.
- WizardSpell (Public)
- WizardLangID (Public)
- WizardDocx (Public)
People
This organization has no public members.