NLP PDF Assignment

Objective

This project applies a sequence of Natural Language Processing (NLP) steps to the text of a PDF document, from extraction and cleaning through vectorization and visualization.

Tasks Performed

PDF Reading
Text Extraction
Lowercasing
Number Removal using Regex
Special Symbol Removal
Extra Space Removal
Tokenization
Stopword Removal
Stemming
Lemmatization
One Hot Encoding
TF-IDF
Plotly Visualization
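
The cleaning and vectorization steps listed above can be sketched with the Python standard library alone. This is a minimal illustration, not the notebook's code: the function names, the tiny STOPWORDS set, and the sample strings are hypothetical, and tf_idf uses the plain textbook formula rather than scikit-learn's smoothed and normalized variant. Stemming and lemmatization are left out because they depend on nltk and spacy.

```python
import math
import re
from collections import Counter

# Hypothetical, deliberately tiny stopword list for illustration only;
# a real run would likely use the fuller lists shipped with nltk or spacy.
STOPWORDS = {"a", "an", "and", "in", "is", "of", "the", "to"}


def clean_text(text):
    """Lowercase, then strip digits, special symbols, and extra spaces."""
    text = text.lower()
    text = re.sub(r"\d+", " ", text)          # remove numbers
    text = re.sub(r"[^a-z\s]", " ", text)     # remove anything that is not a-z or whitespace
    return re.sub(r"\s+", " ", text).strip()  # collapse runs of whitespace


def tokenize(text):
    """Split cleaned text on whitespace and drop stopwords."""
    return [tok for tok in text.split() if tok not in STOPWORDS]


def one_hot(tokens):
    """Return the sorted vocabulary and one 0/1 vector per token."""
    vocab = sorted(set(tokens))
    index = {word: i for i, word in enumerate(vocab)}
    vectors = [[1 if index[tok] == i else 0 for i in range(len(vocab))]
               for tok in tokens]
    return vocab, vectors


def tf_idf(docs):
    """Textbook TF-IDF: (count / doc length) * ln(N / document frequency)."""
    n_docs = len(docs)
    doc_freq = Counter(word for doc in docs for word in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({word: (count / len(doc)) * math.log(n_docs / doc_freq[word])
                       for word, count in counts.items()})
    return scores


tokens = tokenize(clean_text("Chapter 1: The Way of the Program!!  "))
print(tokens)           # ['chapter', 'way', 'program']
print(one_hot(tokens))  # (['chapter', 'program', 'way'], [[1, 0, 0], [0, 0, 1], [0, 1, 0]])
```

Note that clean_text also discards non-ASCII letters, which is acceptable for an English text but would need adjusting for other languages. A word that occurs in every document gets a TF-IDF score of 0 under this formula, since ln(N / N) = 0.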

Libraries Used

PyPDF2
nltk
spacy
scikit-learn
pandas
plotly

PDF Source

Think Python PDF: https://greenteapress.com/thinkpython2/thinkpython2.pdf

How to Run

Install the libraries listed under Libraries Used
Open the notebook (assignment.ipynb)
Run all cells
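
Before running the notebook, it can help to confirm that the required libraries are importable. The sketch below is one possible check, not part of the assignment itself: the REQUIRED mapping and the missing_packages helper are hypothetical names, and the only non-obvious entry is that scikit-learn is imported as sklearn.

```python
from importlib.util import find_spec

# pip package name -> import module name, for the libraries this README lists.
REQUIRED = {
    "PyPDF2": "PyPDF2",
    "nltk": "nltk",
    "spacy": "spacy",
    "scikit-learn": "sklearn",
    "pandas": "pandas",
    "plotly": "plotly",
}


def missing_packages(required=REQUIRED):
    """Return the pip names whose import module cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing, install with: pip install " + " ".join(missing))
    else:
        print("All listed libraries are importable.")
```

This only checks that the packages are installed. Depending on how the notebook is written, nltk and spacy may also need their data files or language models downloaded separately.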

Author

Sumiya Riaz
