Subreddit Recommender

Using natural language processing and deep feature embeddings to recommend subreddits.

Project Organization

├── README.md
├── data
│   ├── interim        <- Intermediate data that has been transformed.
│   ├── processed      <- The final, canonical data sets for modeling.
│   └── raw            <- The original, immutable data dump.
│
├── models             <- Trained and serialized models.
│
├── src
│   ├── __init__.py
│   │
│   ├── data           <- Scripts to download and generate data.
│   │
│   ├── features       <- Scripts to turn raw data into features for modeling.
│   │
│   └── models         <- Scripts to train models.
│
├── tests
│
├── requirements.txt
│
└── tox.ini

Installation

Create a virtual environment and install the dependencies.

virtualenv env
source env/bin/activate
pip install -r requirements.txt

Create a file called .env in the root of the project directory containing your Reddit API keys, one per line, in the format below. Downloading the data is slow, so the download script runs multithreaded, using as many keys as you provide.

CLIENT_0=api_key:api_id
CLIENT_1=api_key:api_id
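
As a rough illustration of how one key per thread could work, here is a minimal sketch. It is not the code in src/data/download_reddit_data.py: the names parse_client_keys, split_work, download_all and the fetch callback are hypothetical and exist only for this example. The sketch assumes each .env line has the form CLIENT_<n>=api_key:api_id as shown above.

```python
from concurrent.futures import ThreadPoolExecutor


def parse_client_keys(env_text):
    """Return (api_key, api_id) pairs from CLIENT_<n>=api_key:api_id lines."""
    clients = []
    for line in env_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        name, _, value = line.partition("=")
        if not name.startswith("CLIENT_"):
            continue  # ignore unrelated settings
        api_key, _, api_id = value.partition(":")
        clients.append((api_key, api_id))
    return clients


def split_work(items, n_workers):
    """Distribute items round-robin into n_workers buckets."""
    return [items[i::n_workers] for i in range(n_workers)]


def download_all(subreddits, clients, fetch):
    """Call fetch(client, subreddit) using one thread per credential pair.

    fetch is a hypothetical callback standing in for the real API request.
    """
    if not clients:
        raise ValueError("no CLIENT_<n> entries found in .env")
    buckets = split_work(subreddits, len(clients))

    def worker(client, bucket):
        return [fetch(client, name) for name in bucket]

    with ThreadPoolExecutor(max_workers=len(clients)) as pool:
        results = pool.map(worker, clients, buckets)
        return [item for bucket in results for item in bucket]
```

With two entries in .env, the subreddit list would be split into two buckets and fetched in parallel, each thread using its own credentials so that per-key rate limits apply separately.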

Run the data extraction scripts.

python src/data/make_subreddit_list.py
python src/data/download_reddit_data.py
