Python-based open source ETL tools for file crawling, document processing (text extraction, OCR), content analysis (entity extraction and named entity recognition), and data enrichment (annotation) pipelines, plus an ingestor to a Solr or Elasticsearch index and a linked data graph database
This Python script reads a document, looks for a Base64-encoded PDF, decodes it, and writes the extracted PDF text to a file. The script uses the PyPDF2 library to extract text from the decoded PDF.
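The find-and-decode step described above might look roughly like the sketch below. It is not the repository's actual code. The function names `find_pdf_bytes` and `extract_text` are hypothetical, and the sketch assumes the Base64 payload appears as one contiguous run without line breaks. It relies on the fact that the Base64 encoding of the `%PDF-` magic bytes followed by a version digit begins with `JVBERi0`. The text extraction part assumes a PyPDF2 release that provides `PdfReader`.

```python
import base64
import binascii
import io
import re

# "%PDF-" followed by a version digit encodes to a string starting with "JVBERi0".
# Assumption: the payload is one unbroken run of Base64 characters.
B64_PDF = re.compile(r"JVBERi0[A-Za-z0-9+/]*={0,2}")


def find_pdf_bytes(document: str):
    """Return the decoded bytes of the first Base64-encoded PDF, or None."""
    match = B64_PDF.search(document)
    if match is None:
        return None
    payload = match.group(0)
    # Restore padding in case the trailing "=" characters were stripped.
    payload += "=" * (-len(payload) % 4)
    try:
        data = base64.b64decode(payload)
    except binascii.Error:
        return None
    return data if data.startswith(b"%PDF-") else None


def extract_text(pdf_bytes: bytes) -> str:
    """Extract text from every page of a PDF held in memory."""
    # Third-party dependency, imported lazily so the decode step works without it.
    from PyPDF2 import PdfReader

    reader = PdfReader(io.BytesIO(pdf_bytes))
    # extract_text() may return an empty result for image-only pages.
    return "\n".join(page.extract_text() or "" for page in reader.pages)
```

A caller would read the input document as text, pass it to `find_pdf_bytes`, and, if a PDF is found, write the result of `extract_text` to the output file.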
ZWSP-Tool is a toolkit for manipulating zero-width spaces quickly and easily. In particular, it can detect, clean, hide, extract, and brute-force text containing zero-width spaces.
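To illustrate what detecting, cleaning, hiding, and extracting zero-width characters can involve, here is a minimal sketch. It does not reproduce ZWSP-Tool's own encoding or interface. The function names and the bit mapping (U+200B for 0, U+200C for 1) are assumptions chosen only for this example.

```python
# Common zero-width code points and their Unicode names.
ZERO_WIDTH = {
    "\u200b": "ZERO WIDTH SPACE",
    "\u200c": "ZERO WIDTH NON-JOINER",
    "\u200d": "ZERO WIDTH JOINER",
    "\u2060": "WORD JOINER",
    "\ufeff": "ZERO WIDTH NO-BREAK SPACE",
}


def detect(text: str):
    """List (index, name) for every zero-width character in the text."""
    return [(i, ZERO_WIDTH[ch]) for i, ch in enumerate(text) if ch in ZERO_WIDTH]


def clean(text: str) -> str:
    """Remove every zero-width character from the text."""
    return "".join(ch for ch in text if ch not in ZERO_WIDTH)


def hide(cover: str, secret: str) -> str:
    """Embed the secret after the first character of the cover text.

    Hypothetical scheme: each UTF-8 bit becomes U+200B (0) or U+200C (1).
    """
    bits = "".join(f"{byte:08b}" for byte in secret.encode("utf-8"))
    payload = "".join("\u200c" if bit == "1" else "\u200b" for bit in bits)
    return cover[:1] + payload + cover[1:]


def extract(text: str) -> str:
    """Recover a secret embedded with hide(), ignoring incomplete trailing bits."""
    bits = "".join(
        "1" if ch == "\u200c" else "0" for ch in text if ch in ("\u200b", "\u200c")
    )
    usable = len(bits) - len(bits) % 8
    data = bytes(int(bits[i:i + 8], 2) for i in range(0, usable, 8))
    return data.decode("utf-8", errors="replace")
```

Brute-forcing, in this picture, would mean trying different character-to-bit mappings in `extract` until the output is readable. That step is left out here because the tool's actual search strategy is not described.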