Converting unstructured data into structured data remains a real challenge. In this project, we use Machine Learning and Natural Language Processing techniques to extract data from unstructured sources such as PDF and DOC files and convert it into structured data.
Specifically, we convert an unstructured PDF into an organized set of data, then apply machine learning to train the model that best extracts the data.
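To make the unstructured-to-structured idea concrete, here is a minimal sketch, not the project's actual code. It assumes the text has already been pulled out of a PDF and that the fields of interest appear as `Key: value` lines. The sample text, the `FIELD_PATTERN` regex, and the `extract_fields` helper are all hypothetical, and the sketch uses simple rules rather than the machine learning model described above.

```python
import re

# Hypothetical sample: text as it might look after extraction from a PDF.
RAW_TEXT = """\
Invoice No: 1042
Date: 2020-03-15
Total: 249.90
"""

# Hypothetical "Key: value" pattern; real documents would need their own rules.
FIELD_PATTERN = re.compile(r"^\s*([A-Za-z][A-Za-z ]*?)\s*:\s*(.+?)\s*$")


def extract_fields(text):
    """Turn 'Key: value' lines into a dict; lines that do not match are skipped."""
    record = {}
    for line in text.splitlines():
        match = FIELD_PATTERN.match(line)
        if match:
            # Normalize the key so it can serve as a column name later.
            key = match.group(1).lower().replace(" ", "_")
            record[key] = match.group(2)
    return record


record = extract_fields(RAW_TEXT)
print(record)
```

Each resulting dictionary is one structured record. A list of such records could then be used as a table of features or labels when training a model.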
- Clone the repository to your local computer.
- Open the main.py file (Spyder version 4 is recommended).
- The data is located in the data sub-folder.
Our Machine Learning model achieved very high accuracy.