HamidurRahman1/Statistical-Language-Modeling

What it is:

A statistical language model learns the probability of word occurrence based on examples of text. Simpler
models may look at the context of a short sequence of words, while larger models may work at the
level of sentences or paragraphs. Most commonly, language models operate at the level of words.
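
In symbols, a word-level model of this kind approximates the probability of a word sequence by conditioning each word on a short window of preceding words. For the bigram case this is the standard textbook factorization (not a formula taken from this repository's code):

P(w_1, \dots, w_n) \approx \prod_{i=1}^{n} P(w_i \mid w_{i-1})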

Models:

  • Unigram
  • Bigram
  • Bigram with Add-One smoothing
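
A minimal sketch of how these three estimates are commonly computed; the toy corpus and helper names below are assumptions for illustration, not code from this repository:

```python
from collections import Counter

# Toy corpus purely for illustration; the repository trains on its own TRAINING data.
tokens = "the cat sat on the mat the cat slept".split()

unigram_counts = Counter(tokens)
bigram_counts = Counter(zip(tokens, tokens[1:]))
total_tokens = len(tokens)
vocab_size = len(unigram_counts)  # V, used by Add-One (Laplace) smoothing

def p_unigram(w):
    # Maximum-likelihood unigram estimate: count(w) / N
    return unigram_counts[w] / total_tokens

def p_bigram(prev, w):
    # Maximum-likelihood bigram estimate: count(prev, w) / count(prev)
    # (zero for an unseen bigram, undefined for an unseen history)
    return bigram_counts[(prev, w)] / unigram_counts[prev]

def p_bigram_add_one(prev, w):
    # Add-One smoothing: (count(prev, w) + 1) / (count(prev) + V)
    return (bigram_counts[(prev, w)] + 1) / (unigram_counts[prev] + vocab_size)

print(p_unigram("the"))                # 3/9
print(p_bigram("the", "cat"))          # 2/3
print(p_bigram_add_one("the", "dog"))  # unseen bigram still gets probability mass
```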

How to run:

Just run the 'Main.py' file (e.g., 'python Main.py'). It will do its job and produce the answers to all questions in order.
Answers are explicitly marked, since there is a dedicated function for each question.

Design and Explanation:

The code follows an OOP approach. For example, PreProcess applies the same pre-processing to the TRAINING data
and the TEST data. At some points, pre-processing the TEST data depends on the already pre-processed TRAINING
data, which is why PreProcess optionally takes a pre-processed TRAINING object and does its comparisons against it.
The same approach is used for Unigram, Bigram, and BigramSmoothing (BigramAddOneSmoothing).
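
As a minimal sketch of what that design can look like (the constructors, attributes, and method names here are illustrative assumptions, not the repository's actual signatures):

```python
from collections import Counter

class PreProcess:
    # Illustrative only: the real PreProcess class may expose a different interface.
    def __init__(self, path, training=None):
        # 'training' is an optional, already pre-processed TRAINING object; when
        # given, the TEST tokens can be compared against its vocabulary.
        self.tokens = open(path).read().lower().split()
        if training is not None:
            known = set(training.tokens)
            self.tokens = [t if t in known else "<unk>" for t in self.tokens]

class Unigram:
    # Illustrative only: trains on a pre-processed object and exposes probabilities.
    def __init__(self, preprocessed):
        self.counts = Counter(preprocessed.tokens)
        self.total = len(preprocessed.tokens)

    def probability(self, word):
        return self.counts[word] / self.total

# Usage: pre-process TRAINING first, then TEST relative to it.
# train = PreProcess("train.txt")
# test = PreProcess("test.txt", training=train)
# model = Unigram(train)
```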
    
'Util.py' defines some functions that are used by all classes.
    
'QA.py' consists of 7 functions, each named qa*(...). Each function corresponds to one question,
takes the arguments it needs, and generates a nicely formatted answer.
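
For illustration, one such function might look roughly like this (the name qa1, its arguments, and the question text are placeholders, not the actual contents of 'QA.py'):

```python
def qa1(unigram_probs, test_tokens):
    # Placeholder example: print a clearly labeled answer for one question.
    print("----- Question 1: unigram probabilities of the test tokens -----")
    for token in test_tokens:
        print(f"P({token}) = {unigram_probs.get(token, 0.0):.6f}")

# Hypothetical usage with made-up probabilities:
qa1({"the": 0.3, "cat": 0.2}, ["the", "cat", "dog"])
```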

About

A statistical language modeling application
