Building-a-Language-Model

Setup

For this assignment I wrote the Python package LanguageModel. Code documentation and explanations are included as docstrings inside the code, and my particular coding and design choices are described in a Markdown cell under the heading Coding Decisions. I am using the Maltese corpus dataset [1] and Python 3.7 for this assignment.

I have also included an HTML file generated from the Jupyter notebook, and I recommend viewing that instead of running a Jupyter server. Alternatively, the JetBrains PyCharm IDE, which I used, also renders the Markdown components neatly.

The included requirements.txt lists the external libraries used in this assignment. To install them with pip, run:

sudo pip install -r requirements.txt

Omit sudo if you are using Windows.
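
If you would rather not depend on sudo or on which pip is first on your PATH, the same install can be driven from Python. The following is only a minimal sketch: the helper names pip_install_command and install_requirements are hypothetical and are not part of the LanguageModel package.

```python
import subprocess
import sys


def pip_install_command(requirements="requirements.txt"):
    """Build a pip command that targets the interpreter running this script."""
    # sys.executable is the path of the current Python interpreter, so the
    # libraries are installed into the same environment that runs the notebook.
    return [sys.executable, "-m", "pip", "install", "-r", requirements]


def install_requirements(requirements="requirements.txt"):
    """Run the install; raises CalledProcessError if pip exits with an error."""
    subprocess.check_call(pip_install_command(requirements))
```

Calling install_requirements() from the project root behaves like the command above, except that it uses the pip belonging to the active interpreter. This can be convenient inside a virtual environment, where sudo is normally unnecessary.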

The file structure is as follows:

Building a Language Model
|
+--Language Model
|       |
|       +-- __init__.py
|       +-- Corpus.py
|       +-- NGramCounts.py
|       +-- NGRamModel.py
+--Maltese
|       |
|       +-- various txt files (Not included in git/submission)
+--Religion
|       |
|       +-- two txt files (Not included in git/submission)
+--Sports
|       |
|       +-- two txt files (Not included in git/submission)
+--Test Corpus
|       |
|       +-- Test.txt
+--.gitignore
+--README.md
+--Building a Language Model.ipynb
+--Building a Language Model.html
+--Building a Language Model.pdf
+--Plagiarism form.pdf
+--requirements.txt
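
As a quick sanity check after cloning, the package part of the layout above can be verified with a few lines of Python. This is a sketch only: the helper missing_package_files is hypothetical, and the directory and file names are copied exactly as they are spelled in the tree.

```python
from pathlib import Path

# Names copied from the tree above; the helper below is illustrative only.
PACKAGE_DIR = "Language Model"
PACKAGE_FILES = ["__init__.py", "Corpus.py", "NGramCounts.py", "NGRamModel.py"]


def missing_package_files(root="."):
    """Return the package files from the tree that are absent under root."""
    package = Path(root) / PACKAGE_DIR
    # Keep the names whose path does not exist as a regular file.
    return [name for name in PACKAGE_FILES if not (package / name).is_file()]
```

Running missing_package_files() from the project root returns an empty list when all four package files are present. The corpus folders are deliberately not checked, since the tree notes that most of their text files are not included in the repository.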

This project has also been uploaded to GitHub at: https://github.com/AidenWilliams/Building-a-Language-Model