markov_analysis

This project implements Markov analysis for text prediction from a given text file. It uses urllib.request to read the text of a book from Project Gutenberg. Punctuation is then stripped from the words, and a dictionary is built for the book: each unique word is a key, and its value is a list of the words that follow it. So if the word 'he' is followed at different points in the book by 'went', 'said', 'will', 'needs', 'went', 'said', 'said', and 'can', then the entry in our dictionary would be wordDic['he'] = ['went', 'said', 'will', 'needs', 'went', 'said', 'said', 'can']. Note that this is essentially a graph structure, with individual words as vertices and a directed edge drawn from each word to every word that follows it in the text.

To predict a sentence, each next word is chosen at random from the current word's value list. Because the list keeps duplicates, the probability of choosing a given word after 'he' is proportional to how often that word actually follows 'he' in the book. In the example above, 'said' appears 3 times out of 8, so it would be chosen with probability 3/8. If we wanted to predict a 10-word sentence starting from 'he', and the second word chosen is 'said', then the third word is chosen from the dictionary values for the key 'said'. So the sentence becomes 'he said' followed by some word, and so on up through ten words.
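The steps described above can be sketched roughly as follows. This is a minimal illustration, not the code in src.py: the function names `fetch_text`, `build_word_dict`, and `predict_sentence` are hypothetical, lowercasing the words is an added assumption, and the sample text stands in for a downloaded book so the example runs without a network connection.

```python
import random
import string
import urllib.request
from collections import defaultdict


def fetch_text(url):
    """Download a plain-text book (hypothetical helper; not called below)."""
    with urllib.request.urlopen(url) as response:
        return response.read().decode("utf-8")


def build_word_dict(text):
    """Map each word to the list of words that follow it, duplicates kept."""
    # Assumption: words are lowercased so 'He' and 'he' share one key.
    words = [w.strip(string.punctuation).lower() for w in text.split()]
    words = [w for w in words if w]  # drop tokens that were only punctuation
    word_dic = defaultdict(list)
    for current, following in zip(words, words[1:]):
        word_dic[current].append(following)
    return dict(word_dic)


def predict_sentence(word_dic, first_word, length=10, rng=random):
    """Walk the chain from first_word, picking each next word at random."""
    sentence = [first_word]
    while len(sentence) < length:
        followers = word_dic.get(sentence[-1])
        if not followers:  # e.g. the final word of the text has no successor
            break
        # Duplicates in the list make frequent followers more likely.
        sentence.append(rng.choice(followers))
    return " ".join(sentence)


sample = "He went home. He said hello, and he said goodbye. He will return."
word_dic = build_word_dict(sample)
print(word_dic["he"])  # ['went', 'said', 'said', 'will']
print(predict_sentence(word_dic, "he", length=10, rng=random.Random(0)))
```

Keeping duplicates in each list is what makes a uniform `random.choice` reproduce the observed frequencies. A sentence may come out shorter than the requested length if the walk reaches a word that is never followed by anything in the text.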


bpbirch/markov_analysis
