bettertogether

This app uses Machine Learning NLP/topic modeling/document similarity techniques to group OMSCS CS-6460 Fall 2018 students by interests based on their essays/writing assignments.

With a few clicks, you will see a ranking of whose work is most similar to yours.

The objective is to help you find people with similar interests who are working on the same topics you are, to facilitate team formation and collaboration. After all, learning is better together. Have fun!

Getting Started

These instructions will get you a copy of the project up and running on your local machine for development and testing purposes.

Prerequisites

I built this on macOS using Python 3.7.0. Check your Python version by running python -V. If you have an earlier version of Python installed, I suggest you upgrade by downloading the latest Python version.

On top of that, you will need pip for installing Python packages. pip is already installed if you are using a Python downloaded from python.org. Just make sure to upgrade pip.
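If you prefer to check both prerequisites from Python itself, a minimal sketch could look like the following. It treats 3.7 as the floor only because that is the version this project was built with; the function names are illustrative and not part of the repository.

```python
import importlib.util
import sys

# Assumption: 3.7 is used as the floor because the project was built on 3.7.0.
MINIMUM_PYTHON = (3, 7)


def meets_minimum(version_info=sys.version_info, minimum=MINIMUM_PYTHON):
    """Return True when the (major, minor) version is at least `minimum`."""
    return tuple(version_info[:2]) >= tuple(minimum)


def pip_available():
    """Return True when the pip package is importable by this interpreter."""
    return importlib.util.find_spec("pip") is not None


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    print("Meets 3.7 floor:", meets_minimum())
    print("pip importable:", pip_available())
```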

Installing

Clone this repository locally and install all requirements by running the following in a terminal:

git clone git@github.com:ucals/bettertogether.git
cd bettertogether
pip install -r requirements.txt

Downloading corpora and preparing to use TextRazor

To run the app properly, you will have to download all PDFs of the students' assignments. To do so, go to Canvas, click on Account -> Settings, scroll to the bottom of the page, and click + New Access Token. Copy the new token.

You will also have to get a free TextRazor API Key. After creating a free account, you will be redirected to a success page containing your API Key. Copy that as well.

Edit your ~/.bash_profile and add the following lines:

export CANVAS_API_KEY="your new Canvas token"
export TEXTRAZOR_API_KEY="your new TextRazor API key"

Replace "your new Canvas token" with the token you got from Canvas, and "your new TextRazor API key" with the API Key you got from TextRazor. Reload your profile by running source ~/.bash_profile from the terminal.
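To confirm that both variables are visible to Python after reloading your profile, a small check like the one below may help. It is only a sketch: the variable names come from the export lines above, while the function name and the placeholder detection are my own illustration and not part of the repository.

```python
import os

# Names taken from the export lines above.
REQUIRED_VARS = ("CANVAS_API_KEY", "TEXTRAZOR_API_KEY")
# Placeholder strings from the example, which should have been replaced.
PLACEHOLDERS = {"your new Canvas token", "your new TextRazor API key"}


def missing_keys(environ=None):
    """Return names of required variables that are unset, empty, or placeholders."""
    env = os.environ if environ is None else environ
    missing = []
    for name in REQUIRED_VARS:
        value = env.get(name, "").strip()
        if not value or value in PLACEHOLDERS:
            missing.append(name)
    return missing


if __name__ == "__main__":
    problems = missing_keys()
    if problems:
        print("Missing or placeholder values:", ", ".join(problems))
    else:
        print("Both API keys are set.")
```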

Edit pytest.ini and set download_all_assignments = True. Now, to download all PDFs, just run:

pytest -k TestPreProcess

This process will take some time. After it ends, you will have all PDFs from the students' assignments downloaded in the pdfs/ folder. Finally, I recommend setting download_all_assignments back to False in pytest.ini.
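As a quick sanity check that the download produced something, you could count the PDF files under pdfs/. This is a rough sketch; it assumes nothing about how the files are named or whether they sit in subfolders, so it searches recursively. The function name is hypothetical.

```python
from pathlib import Path


def count_pdfs(folder="pdfs"):
    """Count files with a .pdf extension (any case) anywhere under `folder`."""
    root = Path(folder)
    if not root.is_dir():
        # The folder does not exist yet, so nothing has been downloaded.
        return 0
    return sum(
        1 for path in root.rglob("*")
        if path.is_file() and path.suffix.lower() == ".pdf"
    )


if __name__ == "__main__":
    print("PDF files found:", count_pdfs())
```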

Running Tests and the Webserver

All main tests are located in test_api.py. To run them:

pytest -k TestApi

To run the webserver locally, run:

python main.py

You can access it by browsing to http://localhost:8080.
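If you would rather verify from a script that the server is answering, the sketch below sends one request to the address above using only the standard library. It makes no assumption about which routes the app serves: any HTTP response, even an error status, counts as "up". The function name is illustrative.

```python
import urllib.error
import urllib.request


def server_is_up(url="http://localhost:8080", timeout=3.0):
    """Return True if any HTTP response comes back from `url`."""
    # An empty ProxyHandler bypasses system proxy settings for this local check.
    opener = urllib.request.build_opener(urllib.request.ProxyHandler({}))
    try:
        with opener.open(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just with an error status such as 404 or 500.
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, and similar failures.
        return False


if __name__ == "__main__":
    print("Server reachable:", server_is_up())
```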

Training Models

I have included the trained models in the repository because training them takes about 10 minutes. If you want to train them yourself, install Jupyter Notebook and run tutorial.ipynb three times. Before each run, set the assignment variable in the second cell to "Assignment 2", "Assignment 3", and "Assignment 4" in turn. This procedure will generate the following files:

doc2vec_model_assignment_2
doc2vec_model_assignment_3
doc2vec_model_assignment_4

They will be located in the models/ folder.
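A Doc2Vec model represents each essay as a numeric vector, and a similarity ranking can then be obtained by comparing those vectors. The sketch below illustrates that idea and is not the project's code: it assumes cosine similarity as the measure, uses plain Python instead of a Doc2Vec library, and the three-dimensional vectors and student IDs are made up for the example.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def rank_by_similarity(query_id, vectors):
    """Return (doc_id, score) pairs for every other document, most similar first."""
    query = vectors[query_id]
    scores = [
        (doc_id, cosine_similarity(query, vec))
        for doc_id, vec in vectors.items()
        if doc_id != query_id
    ]
    return sorted(scores, key=lambda pair: pair[1], reverse=True)


if __name__ == "__main__":
    # Hypothetical document vectors; real ones would come from a trained model.
    toy_vectors = {
        "student_a": [0.9, 0.1, 0.0],
        "student_b": [0.8, 0.2, 0.1],
        "student_c": [0.0, 0.1, 0.9],
    }
    for doc_id, score in rank_by_similarity("student_a", toy_vectors):
        print(f"{doc_id}: {score:.3f}")
```

In this toy data, student_b points in nearly the same direction as student_a, so it ranks ahead of student_c.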

Built With

This code was built on top of the following code:

Author

I'm Carlos Souza, and I built this side project as part of the CS-6460 Education Technology course in the Master of Science in Computer Science program at the Georgia Institute of Technology. You can reach me at souza@gatech.edu or carlos@udacity.com.

License

This project is licensed under the MIT License - see the LICENSE.md file for details.

Acknowledgement

Quoc Le & Tomas Mikolov, thanks for this fantastic article!
RaRe Technologies, thanks for this great tutorial!
Skipgram, thanks for this amazing tutorial!
