# Title
### Author [Github](Github link)

## Problem Statement

- What's my problem
- __How__ will I solve this problem?

## Executive Summary

## Table of Contents

## Loading packages and Data

In [43]:
import time
import numpy as np
import datetime as dt
import requests
import pandas as pd
pd.set_option("display.max_columns", None)
pd.set_option('display.max_colwidth', 200)
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer

In [44]:
df = pd.read_csv("../data/subreddits.csv")
df.head()

Unnamed: 0,title,selftext,subreddit,created_utc,author,num_comments,score,is_self,timestamp
0,LF Recommendation: Fantasy with long travel episode(s)?,Hi looking for recommendation for a top preferably well known / well-regarded fantasy novel that has as part of its story arc some sort of longer traveling section where character(s) have to trave...,Fantasy,1595295818,Overthrown77,16,5,True,2020-07-20
1,What is the best Fantasy book you've read that has three stars or less on Goodreads?,"usually I don't enjoy something below four stars, to be honest. But sometimes a 3 star book will leave me baffled at everybody's madness, wondering if I read a version from a different timeline be...",Fantasy,1595304562,OraclePreston,55,19,True,2020-07-21
2,Favorite fantasy names?,"I love fantasy, and I also have a thing for names. I’d love to hear everyone’s favorite character names from fantasy! It would also be cool to hear if anyone has named their children after their f...",Fantasy,1595305376,omnomenclature,59,17,True,2020-07-21
3,Anyone can recommend me a book without having a focus on the monarchy?,"Idk how else to put the title but I really want to get a good story of adventure on the perspective of a commoner or some mercenary, etc. It can have a monarchy in it just not have it be a focus o...",Fantasy,1595305716,UlyssesCourier,61,33,True,2020-07-21
4,It’s been 10 years since I read “The Way of Kings” the first time.,"And now, here I am, a 32 year old man, chuckling under my blanket at the mental image of 35 bridgemen marching tightly through a chaotic war amp in parshendi bone-armor. At 2 in the morning. I fee...",Fantasy,1595311158,Bock_Tea,65,126,True,2020-07-21


In [45]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 5057 entries, 0 to 5056
Data columns (total 9 columns):
 #   Column        Non-Null Count  Dtype 
---  ------        --------------  ----- 
 0   title         5057 non-null   object
 1   selftext      5057 non-null   object
 2   subreddit     5057 non-null   object
 3   created_utc   5057 non-null   int64 
 4   author        5057 non-null   object
 5   num_comments  5057 non-null   int64 
 6   score         5057 non-null   int64 
 7   is_self       5057 non-null   bool  
 8   timestamp     5057 non-null   object
dtypes: bool(1), int64(3), object(5)
memory usage: 321.1+ KB


## Data Dictionary

|Column Name|Data Type|Description|
|---|---|---|
|title |object|Title given to the post|
|selftext |object|Main body of the post|
|subreddit |object|The subreddit the post came from (our target)|
|created_utc |int|Date and time of the post in Unix epoch time (seconds)|
|author |object|The username of the redditor|
|num_comments |int|The number of comments made on the post|
|score |int|The net score of the post (upvotes minus downvotes)|
|is_self |bool|Whether the post is a text (self) post|
|timestamp |object|The date of the post in yyyy-mm-dd form|

## Data Cleaning

### Null Values

In [46]:
df.isna().sum()

title           0
selftext        0
subreddit       0
created_utc     0
author          0
num_comments    0
score           0
is_self         0
timestamp       0
dtype: int64

### Dropping Columns

In [47]:
df.drop(columns = ["created_utc", "score", "author", "num_comments"], inplace = True)
df.head(2)

Unnamed: 0,title,selftext,subreddit,is_self,timestamp
0,LF Recommendation: Fantasy with long travel episode(s)?,Hi looking for recommendation for a top preferably well known / well-regarded fantasy novel that has as part of its story arc some sort of longer traveling section where character(s) have to trave...,Fantasy,True,2020-07-20
1,What is the best Fantasy book you've read that has three stars or less on Goodreads?,"usually I don't enjoy something below four stars, to be honest. But sometimes a 3 star book will leave me baffled at everybody's madness, wondering if I read a version from a different timeline be...",Fantasy,True,2020-07-21


---

In [48]:
df.loc[df["subreddit"] == "Fantasy", ["title", "selftext", "subreddit"]]

Unnamed: 0,title,selftext,subreddit
0,LF Recommendation: Fantasy with long travel episode(s)?,Hi looking for recommendation for a top preferably well known / well-regarded fantasy novel that has as part of its story arc some sort of longer traveling section where character(s) have to trave...,Fantasy
1,What is the best Fantasy book you've read that has three stars or less on Goodreads?,"usually I don't enjoy something below four stars, to be honest. But sometimes a 3 star book will leave me baffled at everybody's madness, wondering if I read a version from a different timeline be...",Fantasy
2,Favorite fantasy names?,"I love fantasy, and I also have a thing for names. I’d love to hear everyone’s favorite character names from fantasy! It would also be cool to hear if anyone has named their children after their f...",Fantasy
3,Anyone can recommend me a book without having a focus on the monarchy?,"Idk how else to put the title but I really want to get a good story of adventure on the perspective of a commoner or some mercenary, etc. It can have a monarchy in it just not have it be a focus o...",Fantasy
4,It’s been 10 years since I read “The Way of Kings” the first time.,"And now, here I am, a 32 year old man, chuckling under my blanket at the mental image of 35 bridgemen marching tightly through a chaotic war amp in parshendi bone-armor. At 2 in the morning. I fee...",Fantasy
...,...,...,...
2512,"Hi. I'm Terry M. Dunn, author of the DANEBURY trilogy. Ask me anything !!!","I first launched 'DANNU'S MAN' two years ago on Kindle, followed by 'ONE MAN'S HONOUR' and have just added the final book 'A MAN of his TIME' in the last week, completing three years work for the ...",Fantasy
2513,"/r/Fantasy - Daily Recommendation Requests and Simple Questions Thread - May 13, 2020","This thread is to be used for recommendation requests or simple questions that are small/general enough that they won’t spark a full thread of discussion. \n\nAs usual, first have a look at the si...",Fantasy
2514,"/r/Fantasy Writing Wednesday - May 13, 2020",The weekly Writing Wednesday thread is the place to ask questions about writing. Wanna run an idea past someone? Looking for a beta reader? Have a question about publishing your first book? Need w...,Fantasy
2515,"Deals, Deals, Deals - Daily Sales Thread May 13, 2020","Due to the increased flood of sales posts, authors announcing sales of their own books should post in the daily sales thread. \n\n**This thread is for sales and deals only.** If your book is regul...",Fantasy


In [49]:
df.loc[df["subreddit"] == "scifi", ["title", "selftext", "subreddit"]]

Unnamed: 0,title,selftext,subreddit
2517,Help needed - I’m writing a whodunit detective story and I have a question on creating a piece of evidence.,"So this is my idea, At the place of crime, the detective finds out that the clock is running faster than the usual as it has been magnetized. As the story unfolds, he finds out that the Antagonist...",scifi
2518,Anyone else feel Robocop 2 is underrated?,"Okay, so don’t get me wrong, the original Robocop is unbeatable and clearly the best of the series but I think Robocop 2 has been unfairly lumped in with the other Robocop sequels and successors. ...",scifi
2519,Looking for an cancelled space sci-fi series,Hi.\n\nI'm thinking of an series where it seems to be earth vs an alien race war.\nOne of the.main actors is a high level military that runs one of the big starships.\nThe guy is an older black ma...,scifi
2520,Would a Halo Ring around earth work?,"In theory, if you built a halo ring around earth, then rotated it around earth, could you produce any electricity from induction?",scifi
2521,"Trying to remember something from an old book or short story - an insectoid alien who asked ""Does your dagger contain iron?""",I recall many years ago I read a book or a short story in which an insectoid alien (or an alien who was descended from insect-like ancestors) was escorting a human through her (I think?) hive. I ...,scifi
...,...,...,...
5052,Need Help Identifying 2 Book Titles,"I read two books (one may have been a short story) some time ago and I vaguely remember the premise but not the titles.\n\nOne, possibly a short story, was about a star's life cycle from the star'...",scifi
5053,Looking for recommendations for exploration themed novels.,Like the subject line says. I am looking for some new reading material that leans towards exploration and a sense of wonder play key parts in the story.\n\nI have avidly consumed the works of Jack...,scifi
5054,"I am not an expert, but this book should be “The Theory of Everything” :D","Even though the main part of the story was novel and capturing. I did enjoy by AI, definitely my favorite character. First time in Sci-Fi that it was not some banal AI revolution or some omnipoten...",scifi
5055,Goliath &amp; The Water Knife,"Prime's just released **GOLIATH** Season 3's plot focuses much of it's plot on water scarcity and associated rights. \n\n**The Water Knife** by *Paolo Bacigalupi,* based on his short story, **The...",scifi


In [50]:
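# Fit on one text column ("title" here; "selftext" works the same way).
# Passing the whole DataFrame would only tokenize its column names.
cvec = CountVectorizer()
cvec.fit(df["title"])

CountVectorizer()

In [51]:
# Sparse document-term matrix: one row per post, one column per token
df_vect = cvec.transform(df["title"])

In [None]:
# get_feature_names() was removed in scikit-learn 1.2; use get_feature_names_out()
vectorized_df = pd.DataFrame(df_vect.toarray(),
                             columns=cvec.get_feature_names_out())
vectorized_df.head()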

## Exploratory Data Analysis
This might lead back to data cleaning

## Model Prep
Establish X, y, train/test split
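
A minimal sketch of this step, assuming the post title is the only feature and `subreddit` is the target. The `toy_df` frame is a hypothetical stand-in for the cleaned `df`, and the `test_size`, `stratify`, and `random_state` values are assumptions rather than settings taken from this notebook.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical stand-in for the cleaned df (same column names as above).
toy_df = pd.DataFrame({
    "title": [
        "Favorite fantasy names?",
        "Best dragon rider books",
        "Epic fantasy with long journeys",
        "Magic systems you love",
        "Would a Halo Ring around earth work?",
        "Looking for a cancelled space series",
        "Anyone else feel Robocop 2 is underrated?",
        "Novels about first contact with aliens",
    ],
    "subreddit": ["Fantasy"] * 4 + ["scifi"] * 4,
})

X = toy_df["title"]      # 1-D Series of raw text, which is what a vectorizer expects
y = toy_df["subreddit"]  # target labels

# stratify=y keeps the class proportions the same in both splits.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

print(X_train.shape, X_test.shape)
```

Keeping `X` as raw text at this point means any vectorizer can be fit on `X_train` only, so no vocabulary leaks in from the test set.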

## Modeling

### Baseline

Every model needs these four things: a training score, a testing score, a cross-validation score, and an interpretation.

In [1]:
# Training Score

In [2]:
# Testing Score

In [3]:
# Crossval Score

__Interpretation:__ The training score is "blank" and the testing score is "blank".
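
One way the baseline and the three scores could be computed is sketched below. The toy titles are hypothetical stand-ins for the real posts, and the choice of `LogisticRegression` inside a pipeline is an assumption for illustration, not a model this notebook has committed to.

```python
import pandas as pd
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical stand-in data: four titles per subreddit.
X = pd.Series([
    "Favorite fantasy names?",
    "Best dragon rider books",
    "Epic fantasy with long journeys",
    "Magic systems you love",
    "Would a Halo Ring around earth work?",
    "Looking for a cancelled space series",
    "Anyone else feel Robocop 2 is underrated?",
    "Novels about first contact with aliens",
])
y = pd.Series(["Fantasy"] * 4 + ["scifi"] * 4)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Baseline: the accuracy of always predicting the most common class.
baseline = y_train.value_counts(normalize=True).max()

# The pipeline refits the vectorizer inside every CV fold,
# so the held-out fold never influences the vocabulary.
pipe = make_pipeline(CountVectorizer(), LogisticRegression(max_iter=1000))
pipe.fit(X_train, y_train)

train_score = pipe.score(X_train, y_train)                        # Training Score
test_score = pipe.score(X_test, y_test)                           # Testing Score
cv_score = cross_val_score(pipe, X_train, y_train, cv=3).mean()   # Crossval Score

print(f"baseline={baseline:.2f} train={train_score:.2f} "
      f"test={test_score:.2f} cv={cv_score:.2f}")
```

A model is only worth interpreting if its testing and cross-validation scores beat the baseline. A large gap between training and testing scores suggests overfitting.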

### Model 1

### Model 2

### Model 3

## Model Selection

Markdown Table

|Model Name|Training Score|Testing Score|
|---|---|---|
|Baseline|%|%|
|etc.|%|%|

Our best predictive model was the "blank" model; go into its performance.

Our best interpretive model was the "blank" model; explain it.
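
The comparison table can be assembled programmatically instead of by hand. The sketch below assumes hypothetical toy data and three illustrative candidates: a most-frequent-class baseline, plus `CountVectorizer` and `TfidfVectorizer` each paired with `LogisticRegression`. None of these are the notebook's final models.

```python
import pandas as pd
from sklearn.dummy import DummyClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline

# Hypothetical stand-in data: four titles per subreddit.
X = pd.Series([
    "Favorite fantasy names?",
    "Best dragon rider books",
    "Epic fantasy with long journeys",
    "Magic systems you love",
    "Would a Halo Ring around earth work?",
    "Looking for a cancelled space series",
    "Anyone else feel Robocop 2 is underrated?",
    "Novels about first contact with aliens",
])
y = pd.Series(["Fantasy"] * 4 + ["scifi"] * 4)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=42
)

# Illustrative candidates; swap in whichever models are actually built.
models = {
    "Baseline": make_pipeline(
        CountVectorizer(), DummyClassifier(strategy="most_frequent")
    ),
    "CountVectorizer + LogisticRegression": make_pipeline(
        CountVectorizer(), LogisticRegression(max_iter=1000)
    ),
    "TfidfVectorizer + LogisticRegression": make_pipeline(
        TfidfVectorizer(), LogisticRegression(max_iter=1000)
    ),
}

# Fit each candidate and collect one row of scores per model.
rows = []
for name, model in models.items():
    model.fit(X_train, y_train)
    rows.append({
        "Model Name": name,
        "Training Score": model.score(X_train, y_train),
        "Testing Score": model.score(X_test, y_test),
    })

results = pd.DataFrame(rows).set_index("Model Name")
print(results)
```

Scoring every model in the same loop guarantees they all use the same split, which keeps the rows of the table comparable.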

## Model Evaluation
residuals, LINEI, etc

## Conclusion & Recommendations
State your model performance explicitly and ANSWER THE PROBLEM!

## References
- [Human words](URL)