shubhigupta991
diff --git a/.DS_Store
12 KB b/.DS_Store
diff --git a/Reddit-scraping-and-flair-detection/.DS_Store
6 KB b/Reddit-scraping-and-flair-detection/.DS_Store
diff --git a/Reddit-scraping-and-flair-detection/Exploratory-Data-Analysis(EDA).ipynb
Lines changed: 1015 additions & 0 deletions b/Reddit-scraping-and-flair-detection/Exploratory-Data-Analysis(EDA).ipynb
diff --git a/Reddit-scraping-and-flair-detection/Modelling.ipynb
Lines changed: 1364 additions & 0 deletions b/Reddit-scraping-and-flair-detection/Modelling.ipynb
diff --git a/Reddit-scraping-and-flair-detection/README.md
Lines changed: 27 additions & 0 deletions b/Reddit-scraping-and-flair-detection/README.md
diff --git a/Reddit-scraping-and-flair-detection/WebScrapping and PreProcessing.ipynb
Lines changed: 936 additions & 0 deletions b/Reddit-scraping-and-flair-detection/WebScrapping and PreProcessing.ipynb
diff --git a/Reddit-scraping-and-flair-detection/comms per flair.png
140 KB b/Reddit-scraping-and-flair-detection/comms per flair.png
diff --git a/Reddit-scraping-and-flair-detection/data.csv
Lines changed: 1217 additions & 0 deletions b/Reddit-scraping-and-flair-detection/data.csv
diff --git a/Reddit-scraping-and-flair-detection/final_model.pkl -1
6.72 MB b/Reddit-scraping-and-flair-detection/final_model.pkl -1
diff --git a/Reddit-scraping-and-flair-detection/lenth of words in body.png
58.3 KB b/Reddit-scraping-and-flair-detection/lenth of words in body.png
@@ -0,0 +1,27 @@
+# Reddit Flair Detector
+## Steps followed:
+
+Each step is described, along with its code, in the notebooks.
+
+### Step 1: Extraction of r/india data 
+Used Python's `praw` library to extract the data.
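
As a rough illustration of the extraction step, the sketch below collects posts from r/india one flair at a time. The helper names (`submission_to_row`, `collect_posts`), the chosen fields, and the per-flair search query are assumptions made for this example, not necessarily what the notebook does; `reddit` is expected to be an authenticated `praw.Reddit` instance.

```python
def submission_to_row(post):
    """Keep only the fields a flair classifier is likely to need."""
    return {
        "id": post.id,
        "title": post.title,
        "body": post.selftext,
        "url": post.url,
        "flair": post.link_flair_text,
        "num_comments": post.num_comments,
    }


def collect_posts(reddit, flairs, limit=100):
    """Search r/india once per flair and return one row per post.

    `reddit` should be an authenticated client, created for example with
    praw.Reddit(client_id=..., client_secret=..., user_agent=...).
    """
    subreddit = reddit.subreddit("india")
    rows = []
    for flair in flairs:
        # Reddit's search syntax supports filtering by flair text.
        for post in subreddit.search(f'flair:"{flair}"', limit=limit):
            rows.append(submission_to_row(post))
    return rows
```

The resulting list of dictionaries can be written to a CSV file with `csv.DictWriter` or loaded into a data frame for the later steps.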
+
+### Step 2: Exploratory Data Analysis
+Analysed the data using graphs, scatter plots, and correlations, all plotted with the matplotlib library.
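
One plot of this kind is a bar chart of total comments per flair. The sketch below is only an assumption about how such a chart could be produced: the column names `flair` and `num_comments`, both function names, and the sample rows are hypothetical, since the schema of `data.csv` is not shown here.

```python
from collections import defaultdict


def comments_per_flair(rows):
    """Sum the comment counts of posts grouped by flair."""
    totals = defaultdict(int)
    for row in rows:
        totals[row["flair"]] += int(row["num_comments"])
    return dict(totals)


def plot_totals(totals, out_path):
    """Draw a bar chart of the totals and save it as an image."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots(figsize=(8, 4))
    ax.bar(list(totals), list(totals.values()))
    ax.set_xlabel("Flair")
    ax.set_ylabel("Total comments")
    ax.tick_params(axis="x", rotation=45)
    fig.tight_layout()
    fig.savefig(out_path)
    plt.close(fig)


# Made-up rows for illustration; real rows could come from csv.DictReader.
rows = [
    {"flair": "Politics", "num_comments": "12"},
    {"flair": "Sports", "num_comments": "7"},
    {"flair": "Politics", "num_comments": "3"},
]
totals = comments_per_flair(rows)
```

Calling `plot_totals(totals, "comments_per_flair.png")` would then write the chart to disk.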
+
+### Step 3: Built the Reddit Flair Detector with the following steps:
+- Preprocessing the data: Removed stopwords and stemmed the text.
+- Dividing into training and test sets: Split the dataset into training and test sets using a standard 70:30 ratio.
+- Testing across classifiers: Tested three classifiers (Naive Bayes, SVM, and Logistic Regression) and checked the accuracy of each.
+- Saving the model: Saved the model with the highest accuracy in a .sav file for use in prediction.
+- Model testing: Takes an input URL from the user and returns the predicted and actual flairs, calling the saved model for the predicted flair.
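
The steps above can be sketched with scikit-learn roughly as follows. This is a minimal illustration under several assumptions: the toy texts are made up (the real project reads `data.csv`), TF-IDF features are assumed, stopword removal uses the vectorizer's built-in English list, stemming is left out to keep dependencies small, and the file name `flair_model.sav` is hypothetical.

```python
import os
import pickle
import tempfile

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline
from sklearn.svm import LinearSVC

# Toy data, made up for illustration only.
SAMPLES = {
    "Politics": ["election results announced", "parliament passes new bill",
                 "minister addresses the rally", "opposition questions government"],
    "Sports": ["cricket team wins series", "football league final tonight",
               "hockey captain scores twice", "olympic medal for wrestler"],
    "Food": ["best biryani recipe", "street food in delhi",
             "how to cook dal", "favourite mango dessert"],
}
texts = [t for items in SAMPLES.values() for t in items]
labels = [flair for flair, items in SAMPLES.items() for _ in items]

# 70:30 split, stratified so every flair appears in both parts.
X_train, X_test, y_train, y_test = train_test_split(
    texts, labels, test_size=0.3, stratify=labels, random_state=0
)

candidates = {
    "naive_bayes": MultinomialNB(),
    "svm": LinearSVC(),
    "logistic_regression": LogisticRegression(max_iter=1000),
}

scores, fitted = {}, {}
for name, clf in candidates.items():
    model = make_pipeline(TfidfVectorizer(stop_words="english"), clf)
    model.fit(X_train, y_train)
    scores[name] = accuracy_score(y_test, model.predict(X_test))
    fitted[name] = model

# Keep the most accurate model and persist it with pickle.
best_name = max(scores, key=scores.get)
model_path = os.path.join(tempfile.gettempdir(), "flair_model.sav")
with open(model_path, "wb") as fh:
    pickle.dump(fitted[best_name], fh)

# Reload the saved model, as a prediction script would.
with open(model_path, "rb") as fh:
    restored = pickle.load(fh)
```

Saving the whole pipeline, rather than only the classifier, keeps the vectorizer's vocabulary together with the model, so the reloaded object can be applied directly to raw text.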
+
+### How it works:
+The model reads all the URLs in the file line by line and predicts the flair of each.
+- The predictions are stored in a JSON file.
+
+### Output:
+
+The output maps each key to its predicted flair as the value.
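
A possible shape for this prediction loop is sketched below. The function names and the `fetch_text` callable (which turns a URL into the post text, for instance through `praw`) are hypothetical, and using the URL itself as the key is an assumption; `model` is any object with a scikit-learn style `predict` method.

```python
import json


def predict_flairs(url_lines, fetch_text, model):
    """Return a dict mapping each URL to its predicted flair."""
    predictions = {}
    for line in url_lines:
        url = line.strip()
        if not url:
            continue  # skip blank lines in the input file
        text = fetch_text(url)
        # predict() takes a batch, so wrap the single text in a list.
        predictions[url] = str(model.predict([text])[0])
    return predictions


def write_predictions(predictions, out_path):
    """Store the predictions as a JSON object."""
    with open(out_path, "w", encoding="utf-8") as fh:
        json.dump(predictions, fh, indent=2)
```

Because an open file yields its lines when iterated, `predict_flairs(open("urls.txt"), fetch_text, model)` would process the file line by line; `urls.txt` is a placeholder name.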
+
+