Reddit/nosleep Recommender System

Working software can be viewed at

http://ec2-54-86-31-115.compute-1.amazonaws.com:8000/

Important

To run a search you need a story ID. The rationale is that if this system is hooked up to Reddit, the parameter passed to it will be a story ID, which is unique. We have a large dataset of story IDs spanning 01/01/2019 to 07/01/2019 (six months).
Here are some story IDs you can use to test:
abg1pv, abg4dj, abg8cw, abgcly, abgd7w, abgfyo, abgjjn, abgyue, abgzs9, abhcls, abhgjs

PS: The complete list of story IDs can be found in storyids.txt here: https://github.com/CSE6242TEAM135/Nosleep-Recommender-System/blob/master/storyids.txt
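
If you want to clean up a pasted list of story IDs before testing, a minimal sketch along these lines may help. The ID pattern (5 to 8 lowercase letters or digits) is an assumption based on the sample IDs above, and `parse_story_ids` is a hypothetical helper, not part of this repository. The layout of storyids.txt is not assumed: the parser splits on both commas and whitespace.

```python
import re

# Assumption: story IDs look like the samples above (short, lowercase letters and digits).
STORY_ID_PATTERN = re.compile(r"^[a-z0-9]{5,8}$")


def parse_story_ids(text):
    """Split text on commas/whitespace; keep well-formed IDs in order, without duplicates."""
    seen = set()
    ids = []
    for token in re.split(r"[,\s]+", text.strip()):
        if token and STORY_ID_PATTERN.match(token) and token not in seen:
            seen.add(token)
            ids.append(token)
    return ids


# The sample IDs listed above, pasted as-is (including the stray space).
sample = ("abg1pv , abg4dj, abg8cw, abgcly, abgd7w, abgfyo, "
          "abgjjn, abgyue, abgzs9, abhcls, abhgjs")
story_ids = parse_story_ids(sample)
print(len(story_ids), "story IDs parsed; first:", story_ids[0])
```

The same function can be applied to the contents of storyids.txt after reading the file into a string.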

The application is web based and assumes AWS infrastructure.

AWS components used
• S3
• DynamoDB
• EC2 (Virtual Machine – RedHat)

The following need to be installed on the RedHat machine:

• Python 3.7.4
• Django
• Pandas
• Boto3
• NLTK
• Wordcloud
• Plotly
• Networkx

To install any Python library, use this syntax (shown here for Plotly): python3 -m pip install --user plotly
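
To see which of the libraries listed above are still missing on the machine, a small check like the following could be run first. This is only a sketch: the import names in `REQUIRED_MODULES` are assumed to match the usual PyPI packages, and `find_missing` is a hypothetical helper.

```python
import importlib.util

# Assumption: these import names correspond to the libraries listed above.
REQUIRED_MODULES = ["django", "pandas", "boto3", "nltk", "wordcloud", "plotly", "networkx"]


def find_missing(modules):
    """Return the module names that cannot be found by this Python interpreter."""
    return [name for name in modules if importlib.util.find_spec(name) is None]


missing = find_missing(REQUIRED_MODULES)
if missing:
    print("Missing:", ", ".join(missing))
    print("Install with: python3 -m pip install --user " + " ".join(missing))
else:
    print("All required libraries are installed.")
```

Run it with the same python3 that will start the server, since libraries installed with --user are tied to a specific interpreter.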

In addition, the AWS CLI must be installed and configured with appropriate access keys, which allow the application to communicate with DynamoDB.
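
Once the access keys are configured, Boto3 picks them up automatically. The sketch below shows roughly what a lookup by story ID could look like. The table name `nosleep_stories`, the key name `story_id`, and the region `us-east-1` are assumptions for illustration only; check the actual table definition before using them. `InMemoryTable` is a stand-in so the helper can be tried without AWS access.

```python
def get_story(table, story_id):
    """Fetch one story by ID; return the item dict, or None if it is not stored."""
    # Assumption: the partition key is named "story_id".
    response = table.get_item(Key={"story_id": story_id})
    return response.get("Item")


def open_table(table_name="nosleep_stories", region_name="us-east-1"):
    """Open a DynamoDB table using the credentials configured for the AWS CLI."""
    import boto3  # imported here so get_story can be tried without Boto3 or AWS access
    return boto3.resource("dynamodb", region_name=region_name).Table(table_name)


class InMemoryTable:
    """Offline stand-in that mimics the shape of Table.get_item responses."""

    def __init__(self, items):
        self._items = items

    def get_item(self, Key):
        item = self._items.get(Key["story_id"])
        return {"Item": item} if item is not None else {}


# Placeholder data, not a real story record.
demo_table = InMemoryTable({"abg1pv": {"story_id": "abg1pv", "title": "placeholder title"}})
print(get_story(demo_table, "abg1pv"))
print(get_story(demo_table, "zzzzzz"))
```

Against the real service, `get_story(open_table(), "abg1pv")` would replace the stand-in, provided the assumed names match the deployed table.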

To install our software, log into EC2 and run these commands:

• git clone https://github.com/CSE6242TEAM135/Nosleep-Recommender-System.git
This pulls all the required files.
• Then start the server:
python3 Nosleep-Recommender-System/NoSleepRecommender_DJANGO/manage.py runserver 0.0.0.0:8000 &
• Then press Ctrl+A followed by D.
This is the GNU screen detach shortcut, so it assumes you are working inside a screen session. The server keeps running in the background and you can safely exit the CLI.
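
To confirm that the server actually came up after the steps above, a quick check from the same machine might look like this. It is a minimal sketch: `server_is_up` is a hypothetical helper, and the URL assumes the server is listening on port 8000 as in the runserver command.

```python
import urllib.error
import urllib.request


def server_is_up(url, timeout=5.0):
    """Return True if the URL answers with any HTTP response, False if it is unreachable."""
    try:
        with urllib.request.urlopen(url, timeout=timeout):
            return True
    except urllib.error.HTTPError:
        # The server answered, just with an error status (for example 404 or 500).
        return True
    except (urllib.error.URLError, OSError):
        # Connection refused, DNS failure, or timeout.
        return False


if server_is_up("http://127.0.0.1:8000/"):
    print("Server is responding on port 8000.")
else:
    print("No response on port 8000; check that runserver is still running.")
```

An HTTP error status still counts as "up" here, because it shows that Django is answering requests even if a particular page fails.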

Structure of Git:

• The models folder contains our machine learning models, which include topic modeling, sentiment analysis (using NLTK and VADER), and the scoring methodology.
• The NoSleepRecommender_DJANGO folder contains the Django web server and the word cloud and network graph files.
• storyids.txt contains the complete list of story IDs that can be fetched from AWS.

