Merged
Binary file added .DS_Store
Binary file not shown.
Binary file added Reddit-scraping-and-flair-detection/.DS_Store
Binary file not shown.
1,015 changes: 1,015 additions & 0 deletions Reddit-scraping-and-flair-detection/Exploratory-Data-Analysis(EDA).ipynb

Large diffs are not rendered by default.

1,364 changes: 1,364 additions & 0 deletions Reddit-scraping-and-flair-detection/Modelling.ipynb

Large diffs are not rendered by default.

27 changes: 27 additions & 0 deletions Reddit-scraping-and-flair-detection/README.md
@@ -0,0 +1,27 @@
# Reddit Flair Detector
## Steps followed:

Each step is described, along with its code, in the notebooks.

### Step 1: Extraction of r/india data
Used the praw library for Python to extract the data.
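
A minimal sketch of what the extraction step might look like, not the notebook's actual code. The credentials are placeholders, and the helper names (`submission_to_row`, `write_rows`, `scrape`), the chosen fields, and the CSV column names are all assumptions made for illustration.

```python
import csv


def submission_to_row(submission):
    """Pick the fields of interest from one submission-like object."""
    return {
        "title": submission.title,
        "flair": submission.link_flair_text,
        "url": submission.url,
        "body": submission.selftext,
    }


def write_rows(rows, path):
    """Write the collected rows to a CSV file with a header line."""
    fieldnames = ["title", "flair", "url", "body"]
    with open(path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)


def scrape(limit=1000, path="data.csv"):
    """Fetch recent r/india posts; the credentials below are placeholders."""
    import praw  # imported here so the helpers above work without praw installed

    reddit = praw.Reddit(
        client_id="YOUR_CLIENT_ID",
        client_secret="YOUR_CLIENT_SECRET",
        user_agent="flair-detector-sketch",
    )
    submissions = reddit.subreddit("india").new(limit=limit)
    # Posts without a flair cannot serve as labelled examples, so skip them.
    rows = [submission_to_row(s) for s in submissions if s.link_flair_text]
    write_rows(rows, path)
    return rows
```

Calling `scrape()` requires valid Reddit API credentials; the two helper functions can be exercised without network access.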

### Step 2: Exploratory Data Analysis
Analysed the data using graphs, scatter plots and correlations, plotted with the matplotlib library.
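
As one illustration of this kind of analysis, the sketch below counts posts per flair and draws a bar chart. It is not taken from the EDA notebook: the `flair` column name, the function names, and the output file name are assumptions.

```python
import csv
from collections import Counter


def flair_counts(path, column="flair"):
    """Count how many rows of the CSV carry each flair."""
    with open(path, newline="", encoding="utf-8") as handle:
        return Counter(row[column] for row in csv.DictReader(handle))


def plot_flair_counts(counts, out_path="flair_counts.png"):
    """Save a bar chart of posts per flair (expects a non-empty Counter)."""
    import matplotlib

    matplotlib.use("Agg")  # render to a file, no display needed
    import matplotlib.pyplot as plt

    flairs, totals = zip(*counts.most_common())
    fig, ax = plt.subplots(figsize=(8, 4))
    ax.bar(flairs, totals)
    ax.set_xlabel("Flair")
    ax.set_ylabel("Number of posts")
    ax.tick_params(axis="x", rotation=45)
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)
```

A heavily skewed chart here would suggest that plain accuracy may overstate how well a classifier handles the rarer flairs.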

### Step 3: Made the Reddit Flair Detector. Performed the following steps:
- Preprocessed the data: Removed stopwords and performed stemming on the data.
- Dividing into training and test sets: Divided the dataset into training and test sets using a standard 0.7:0.3 split.
- Testing across classifiers: Tested three classifiers: Naive Bayes, SVM and Logistic Regression. Checked the accuracy of each classifier.
- Saving the model: Saved the model with the highest accuracy in a .sav file for use in prediction.
- Model testing: Takes an input URL from the user and returns the predicted and actual flairs. The saved model is called to produce the predicted flairs.
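
The steps above can be sketched with scikit-learn roughly as follows. This is an illustrative outline under stated assumptions, not the notebook's code: TF-IDF features, `LinearSVC` as the SVM variant, scikit-learn's built-in English stopword list, and `pickle` for the .sav file are all choices made here. Stemming is left as an optional callable (for example `nltk.stem.PorterStemmer().stem`) so that the sketch has no NLTK dependency.

```python
import pickle
import re

from sklearn.feature_extraction.text import ENGLISH_STOP_WORDS, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC


def preprocess(text, stem=None):
    """Lowercase, keep alphabetic tokens, drop stopwords, optionally stem."""
    tokens = re.findall(r"[a-z]+", text.lower())
    tokens = [t for t in tokens if t not in ENGLISH_STOP_WORDS]
    if stem is not None:
        tokens = [stem(t) for t in tokens]
    return " ".join(tokens)


def compare_classifiers(texts, flairs, seed=0):
    """Train three classifiers on a 0.7:0.3 split and report test accuracy."""
    cleaned = [preprocess(t) for t in texts]
    x_train, x_test, y_train, y_test = train_test_split(
        cleaned, flairs, test_size=0.3, random_state=seed
    )
    candidates = {
        "naive_bayes": MultinomialNB(),
        "svm": LinearSVC(),
        "logistic_regression": LogisticRegression(max_iter=1000),
    }
    scores, fitted = {}, {}
    for name, classifier in candidates.items():
        model = make_pipeline(TfidfVectorizer(), classifier)
        model.fit(x_train, y_train)
        scores[name] = model.score(x_test, y_test)  # mean accuracy
        fitted[name] = model
    best = max(scores, key=scores.get)
    return scores, best, fitted[best]


def save_model(model, path="model.sav"):
    """Persist the chosen pipeline so it can be reloaded for prediction."""
    with open(path, "wb") as handle:
        pickle.dump(model, handle)
```

Bundling the vectorizer and classifier in one pipeline means the saved file can be applied directly to preprocessed text at prediction time, without refitting the vectorizer separately.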

### How it works:
The model reads the URLs in the file line by line and predicts the flair for each.
- The predictions are stored in a JSON file.
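
A rough sketch of this flow, assuming the URL itself serves as the key in the JSON output. The function name `predict_flairs`, the file names, and the `predict` callable (which would wrap fetching the post and calling the saved model) are hypothetical.

```python
import json


def predict_flairs(url_file, predict, out_file="predictions.json"):
    """Read one URL per line, predict a flair for each, and save as JSON."""
    with open(url_file, encoding="utf-8") as handle:
        # Skip blank lines and strip trailing newlines.
        urls = [line.strip() for line in handle if line.strip()]
    predictions = {url: predict(url) for url in urls}
    with open(out_file, "w", encoding="utf-8") as handle:
        json.dump(predictions, handle, indent=2)
    return predictions
```

Passing the predictor in as a callable keeps the file handling separate from the model, so the same function works with any of the classifiers.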

### Output:

The output is a set of key-value pairs, with the predicted flair as the value for each key.


1,217 changes: 1,217 additions & 0 deletions Reddit-scraping-and-flair-detection/data.csv

Large diffs are not rendered by default.

Binary file added Reddit-scraping-and-flair-detection/title.png
Binary file added Reddit-scraping-and-flair-detection/value.png