Analyzing "War and Peace, vol. 1" with Regular Expressions and NLTK

This Jupyter Notebook project uses Python's built-in re (regular expression) module and the nltk (Natural Language Toolkit) package to analyze volume 1 of Leo Tolstoy's classic novel "War and Peace" in the original Russian.

Getting Started

Before running this notebook on your local machine, you will need to clone this repository. You might also need to install the following packages:

nltk: for natural language processing (install with: pip install nltk)

The notebook contains several code cells that analyze the text of the novel using regular expressions and NLTK.

Why I made this

This project demonstrates how regular expressions and the NLTK package can be used to analyze the text of a classic novel in a foreign language. I used regular expressions to extract the words and NLTK to process them, which let me remove common stopwords, identify the most frequent words in the novel, and estimate the overall mood of the book. For the most part, though, I was just practicing regex here.
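
The word-extraction and word-counting steps described above might look roughly like the sketch below. It is an illustration only, not the notebook's actual code: the sample sentence, the tiny stopword set, and the helper names extract_words and top_words are all assumptions made for this example. To stay runnable without extra downloads it counts with collections.Counter; in the notebook, NLTK's Russian stopword list (nltk.corpus.stopwords.words("russian"), available after nltk.download("stopwords")) would be the natural replacement for the hand-picked set.

```python
import re
from collections import Counter

# Hypothetical sample sentence; the notebook works on the full novel text instead.
SAMPLE = (
    "Князь Андрей вошёл в комнату, и князь Василий посмотрел на него. "
    "Он не сказал ни слова."
)

# Tiny hand-picked stopword set, for illustration only (an assumption).
# NLTK's stopwords.words("russian") is a much fuller list.
STOPWORDS = {"в", "и", "на", "не", "ни", "он", "него"}

# Lowercase Cyrillic words, optionally hyphenated (e.g. "кто-то").
# "ё" is listed explicitly because it falls outside the а-я range.
WORD_RE = re.compile(r"[а-яё]+(?:-[а-яё]+)*")


def extract_words(text):
    """Return all Russian words in the text, lowercased."""
    return WORD_RE.findall(text.lower())


def top_words(text, n=3, stopwords=STOPWORDS):
    """Return the n most frequent words that are not stopwords."""
    words = [w for w in extract_words(text) if w not in stopwords]
    return Counter(words).most_common(n)


if __name__ == "__main__":
    print(top_words(SAMPLE, n=1))
```

Lowercasing before matching keeps the pattern simple and makes "Князь" and "князь" count as the same word.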

Repository: egorcherkasoff/regex-nlp-text-analysis
