NLP - Analyzing Peter Pan

This is a small project to kick off my natural language processing (NLP) journey. It is based on the project "Discover Insights into Classic Texts" from the Codecademy skill path "Apply Natural Language Processing with Python".

Goal

The goal of this project is to perform a simple analysis of the classic book Peter Pan and to discover which characters are mentioned most often. The file containing the book was downloaded from Project Gutenberg.

Installation

Create a virtual environment (optional)

virtualenv --python=/usr/bin/python3.6 ~/NLP
source ~/NLP/bin/activate

Clone the repository

In a terminal, clone this repository wherever you want:

git clone https://github.com/irenebosque/NLP-analyzing-Peter-Pan.git

Install additional requirements

Then, in the terminal, install NLTK and start Python to download the tokenizer and tagger data:

pip install nltk
python
import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

Run the code

To perform the analysis, run script.py:

python script.py

Analysis of the results

The most relevant final results are:

((('peter', 'NN'),), 319)
((('wendy', 'NN'),), 180)
((('hook', 'NN'),), 127)
((('john', 'NN'),), 116)
((('michael', 'NN'),), 71)

Looking at most_common_np_chunks, you can identify the important characters in the text, such as Peter, Wendy, Hook, John and Michael, based on how often they are mentioned.
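Each result pairs a chunk with its count, and a chunk is itself a tuple of (word, tag) pairs, which is why a single-word name shows up as (('peter', 'NN'),). NN is the Penn Treebank tag for a singular common noun. The sketch below is only an illustration, not the code in script.py: it uses the standard library alone, the sample_chunks list is hand-made rather than taken from the book, and chunk_text is a hypothetical helper. It shows how counting chunks produces entries of this shape, and how the results listed above can be flattened into a name-to-count table.

```python
from collections import Counter

# Hand-made stand-in for chunker output (not taken from the book):
# each chunk is a tuple of (word, tag) pairs.
sample_chunks = [
    (("peter", "NN"),),
    (("the", "DT"), ("window", "NN")),
    (("peter", "NN"),),
]

# Counting the chunks yields (chunk, count) pairs,
# the same shape as the results listed above.
sample_top = Counter(sample_chunks).most_common(1)

# The results listed above, copied into a list.
most_common_np_chunks = [
    ((("peter", "NN"),), 319),
    ((("wendy", "NN"),), 180),
    ((("hook", "NN"),), 127),
    ((("john", "NN"),), 116),
    ((("michael", "NN"),), 71),
]


def chunk_text(chunk):
    """Join the words of a chunk, dropping the part-of-speech tags."""
    return " ".join(word for word, _tag in chunk)


# Map each chunk's text to its number of occurrences.
mentions = {chunk_text(chunk): count for chunk, count in most_common_np_chunks}

for name, count in mentions.items():
    print(f"{name.title():<8}{count:>4}")
```

Keeping the tags inside each chunk, as the original output does, is useful when a chunk spans several words; dropping them, as chunk_text does, is only for display.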
