Google Docs add-on that lets users extract entities, translate names, and research entities on Wikipedia from within a multilingual document.
Updated Jun 13, 2016 - JavaScript
Automatically extract relevant data from invoices by processing their .pdf/.xml files.
PerDa2Disco - Personal Data to Discovery
RL3 examples repository (information extraction, NER, NLP, web & text mining, etc).
Spark RDD transformations and actions for processing unstructured data.
Improves the quality and presentation of job listings on the Craigslist website by scraping Indeed's job listings and using them as training data, in order to enhance user experience, drive more traffic, and thus increase revenue.
The infoZilla unstructured software engineering data mining tool. It can find and extract source code regions, patches, stack traces, enumerations and itemizations from discussion threads.
Data engineering knowledge presented as a readable, collaboratively written tutorial.
Python code to access large text (at least 10 pages) from a .txt file, MS Word document, PDF file, Wikipedia page, or 500 tweets.
Web Data Frames
The RL3 Standard Library is a collection of modules accessible to an RL3 program that simplifies programming and removes the need to rewrite commonly used RL3 patterns and predicates.
PostVector: unstructured and vector retrieval database extension to PostgreSQL.
Documentation for the BigConnect platform
Extract tabular information from scanned documents (PDF to CSV)
Building Knowledge Graphs from Unstructured Text
This repository is all about Data Science and Machine Learning.
All course materials for ZTM ML on Udemy