This Python program, ngram.py, learns an N-gram language model from an arbitrary number of plain text files and then generates a given number of sentences from that model.
The program works for any value of N and outputs as many sentences as the user requests. You can run it as follows:
ngram.py n m input-file/s
Here n is the order of the N-grams and m is the number of sentences to generate.
For example: ngram.py 3 10 'austen-emma.txt' 'austen-persuasion.txt'
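The command line above can be handled in a few lines. The sketch below is a hypothetical helper (the actual ngram.py may parse its arguments differently), assuming the fixed order n, m, then one or more file names:

```python
import sys

def parse_args(argv):
    """Parse arguments of the form: ngram.py n m input-file/s.
    Hypothetical helper; actual argument handling in ngram.py may differ."""
    if len(argv) < 4:
        raise SystemExit("usage: ngram.py n m input-file/s")
    n = int(argv[1])   # order of the N-grams
    m = int(argv[2])   # number of sentences to generate
    files = argv[3:]   # one or more plain text files
    return n, m, files

if __name__ == "__main__":
    n, m, files = parse_args(sys.argv)
```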
The .txt files used in this project are from http://www.gutenberg.org. You can choose from the following file names:
'austen-emma.txt', 'austen-persuasion.txt', 'austen-sense.txt', 'bible-kjv.txt', 'blake-poems.txt', 'bryant-stories.txt',
'burgess-busterbrown.txt', 'carroll-alice.txt', 'chesterton-ball.txt', 'chesterton-brown.txt', 'chesterton-thursday.txt',
'edgeworth-parents.txt', 'melville-moby_dick.txt', 'milton-paradise.txt', 'shakespeare-caesar.txt',
'shakespeare-hamlet.txt', 'shakespeare-macbeth.txt', 'whitman-leaves.txt'
Some of the code for fetching the files and computing the conditional frequency distribution is adapted from the NLTK Book: https://www.nltk.org/book/
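The conditional-frequency-distribution idea can be illustrated without NLTK using only the standard library. The sketch below is an assumption about the general approach, not the code in ngram.py: it builds a bigram (N=2) table mapping each word to a counter of its successors, then generates text greedily by always taking the most frequent successor.

```python
from collections import defaultdict, Counter

def build_cfd(words):
    """Conditional frequency distribution over bigrams: maps each word
    to a Counter of the words that follow it (a standard-library
    stand-in for nltk.ConditionalFreqDist over nltk.bigrams)."""
    cfd = defaultdict(Counter)
    for w1, w2 in zip(words, words[1:]):
        cfd[w1][w2] += 1
    return cfd

def generate(cfd, start, length):
    """Greedy generation: repeatedly emit the most frequent successor."""
    word, out = start, [start]
    for _ in range(length - 1):
        if not cfd[word]:
            break  # dead end: no observed successor for this word
        word = cfd[word].most_common(1)[0][0]
        out.append(word)
    return out

words = "the cat sat on the mat and the cat ran".split()
cfd = build_cfd(words)
print(" ".join(generate(cfd, "the", 4)))
```

A real N-gram generator would condition on the previous N-1 words and sample from the distribution instead of taking the argmax, but the table-building step is the same idea.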