<a href="https://colab.research.google.com/github/techkratos/spookystories/blob/main/ai/horror_story_classifier.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Importing libraries

In [250]:
import praw
import pandas as pd
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer, TfidfVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.linear_model import LinearRegression
from sklearn.pipeline import Pipeline
from nltk.corpus import stopwords
from nltk.stem import PorterStemmer
from sklearn.feature_selection import SelectKBest, chi2
from sklearn.ensemble import RandomForestClassifier
import nltk
import re
nltk.download("stopwords")

[nltk_data] Downloading package stopwords to /root/nltk_data...
[nltk_data]   Package stopwords is already up-to-date!


True

# Data scraping and cleaning from the subreddits

In [240]:
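import os

# Initialize the Reddit client using praw.
# Credentials come from environment variables instead of being hard-coded;
# the variable names are assumptions, so set them before running this cell.
reddit = praw.Reddit(client_id=os.environ["REDDIT_CLIENT_ID"],
                     client_secret=os.environ["REDDIT_CLIENT_SECRET"],
                     user_agent="nosleep-scraper")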

#list of subreddits that we parse through
list_of_subreddits = [ "nosleep", "poetry", "funnystories", "scifiwriting" ]

In [241]:
stories_per_subreddit = 100

# Integer class label for each subreddit
class_labels = {"nosleep": 1, "poetry": 2, "funnystories": 3, "scifiwriting": 4}

# Iterate through each subreddit and collect its top 100 stories
stories = []
for subreddit in list_of_subreddits:
  submissions = reddit.subreddit(subreddit).top(limit=stories_per_subreddit)
  for submission in submissions:
    # Each row is [text, upvote score, class label]
    stories.append([submission.selftext, submission.score, class_labels[subreddit]])

# verifying iteration
len(stories)

400
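
Top submissions are not always usable text: link posts have an empty `selftext`, and removed or deleted posts typically carry `[removed]` or `[deleted]` as their body. Below is a minimal sketch of a filter that could run before the dataframe is built. The helper name `drop_unusable_stories` and the placeholder set are assumptions, not part of the original notebook.

```python
# Hypothetical helper (not in the original notebook): drop rows whose text is unusable.
PLACEHOLDERS = {"", "[removed]", "[deleted]"}  # assumed markers for a missing body

def drop_unusable_stories(rows):
    """Keep only [text, score, label] rows whose text has real content."""
    return [row for row in rows if row[0].strip() not in PLACEHOLDERS]

# Toy rows in the same [selftext, score, class] layout as `stories`
sample = [
    ["It was a dark night...", 120, 1],
    ["", 95, 2],             # link post: no self text
    ["[removed]", 80, 3],    # body no longer available
    ["Roses are red", 60, 2],
]
kept = drop_unusable_stories(sample)
print(len(kept))  # 2
```

Applied to the scraped data, this would be `stories = drop_unusable_stories(stories)`, after which the row count may fall below 400.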

In [243]:
#Creating a dataframe of the stories.
df = pd.DataFrame(stories,columns = ["story","score","class"])
df.head()

|   | story | score | class |
|---|---|---|---|
| 0 | His Tinder profile said he was 45, but he look... | 43647 | 1 |
| 1 | Every night, no matter the weather, something ... | 28828 | 1 |
| 2 | The poster read, “Happiness! Sold in Glass Jar... | 26919 | 1 |
| 3 | I don't know when you're going to read this, b... | 23260 | 1 |
| 4 | \nI moved in with my boyfriend yesterday. We’v... | 20653 | 1 |


# Text preprocessing

In [287]:
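def text_process(df):
  stemmer = PorterStemmer()
  words = set(stopwords.words("english"))
  # Lowercase before the stopword check so capitalized stopwords ("The") are removed too
  def clean(text):
    tokens = re.sub("[^a-zA-Z]", " ", text).lower().split()
    return " ".join(stemmer.stem(t) for t in tokens if t not in words)
  df['cleaned_story'] = df['story'].apply(clean)
  return df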

In [289]:
df = text_process(df)

In [290]:
vectorizer = TfidfVectorizer(min_df= 3, stop_words="english", sublinear_tf=True, norm='l2', ngram_range=(1, 2))
final_features = vectorizer.fit_transform(df['cleaned_story']).toarray()
final_features.shape

(400, 7556)
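
To make the vectorizer settings concrete, here is a dependency-free sketch of the weighting that `sublinear_tf=True` and `norm='l2'` select, combined with the smoothed IDF that scikit-learn applies by default. It handles unigrams only and ignores `min_df` and stop words, so it illustrates the formula and does not reproduce the notebook's feature matrix. The function name `tfidf_row` and the toy corpus are hypothetical.

```python
import math
from collections import Counter

def tfidf_row(doc_tokens, corpus_tokens):
    """TF-IDF weights for one document: sublinear tf, smoothed idf, L2 norm."""
    n_docs = len(corpus_tokens)
    # Document frequency: number of documents that contain each term
    df = Counter(term for doc in corpus_tokens for term in set(doc))
    weights = {}
    for term, count in Counter(doc_tokens).items():
        tf = 1.0 + math.log(count)                           # sublinear_tf=True
        idf = math.log((1 + n_docs) / (1 + df[term])) + 1.0  # smoothed idf
        weights[term] = tf * idf
    norm = math.sqrt(sum(w * w for w in weights.values()))   # norm='l2'
    if norm == 0:
        return {}
    return {term: w / norm for term, w in weights.items()}

# Toy corpus of already-stemmed tokens
corpus = [
    "ghost hous ghost night".split(),
    "rose night poem".split(),
    "robot ship night".split(),
]
row = tfidf_row(corpus[0], corpus)
print(sorted(row, key=row.get, reverse=True))  # ['ghost', 'hous', 'night']
```

The term `night` appears in every document, so it receives the lowest weight, while the repeated and distinctive `ghost` ranks first.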

# Model building

In [291]:
# Separating the train and test sets (random ~80/20 split)
msk = np.random.rand(len(df)) < 0.8
df_train = df[msk]
df_test = df[~msk]

X_train_class = df_train["cleaned_story"]
Y_train_class = df_train["class"]
X_test_class = df_test["cleaned_story"]
Y_test_class = df_test["class"]

X_train_score = X_train_class
X_test_score = X_test_class
Y_train_score = df_train["score"]
Y_test_score = df_test["score"] 
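
The random mask above yields a different split on every run and does not guarantee that each class keeps an 80/20 ratio. A dependency-free sketch of a seeded, stratified split follows. The function name `stratified_split` and the seed value are assumptions. Inside the notebook, scikit-learn's `train_test_split` with its `stratify` and `random_state` arguments would serve the same purpose.

```python
import random
from collections import defaultdict

def stratified_split(labels, train_frac=0.8, seed=42):
    """Return (train_idx, test_idx) with each class split at train_frac."""
    rng = random.Random(seed)  # fixed seed makes the split reproducible
    by_class = defaultdict(list)
    for idx, label in enumerate(labels):
        by_class[label].append(idx)
    train_idx, test_idx = [], []
    for indices in by_class.values():
        rng.shuffle(indices)
        cut = int(round(train_frac * len(indices)))
        train_idx.extend(indices[:cut])
        test_idx.extend(indices[cut:])
    return sorted(train_idx), sorted(test_idx)

# Same shape as the scraped data: 4 classes with 100 stories each
labels = [c for c in (1, 2, 3, 4) for _ in range(100)]
train_idx, test_idx = stratified_split(labels)
print(len(train_idx), len(test_idx))  # 320 80
```

With a dataframe, `df.iloc[train_idx]` and `df.iloc[test_idx]` would take the place of the mask-based selection.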


In [292]:
#Making pipelines for both the story classifier and the score regressor
text_clf_classifier = Pipeline([('vect', vectorizer),('chi',  SelectKBest(chi2, k=1200)),('clf', RandomForestClassifier()),])
score_clf_regressor = Pipeline([('vect',vectorizer),('clf', LinearRegression()),])

In [293]:
classifier_model = text_clf_classifier.fit(X_train_class,Y_train_class)
regression_model = score_clf_regressor.fit(X_train_score, Y_train_score)

In [294]:
np.mean(classifier_model.predict(X_test_class)== Y_test_class)

0.8076923076923077
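
Overall accuracy hides which subreddits get confused with one another. Below is a small sketch of per-class recall computed from true and predicted labels. The function name `per_class_recall` and the toy labels are illustrative and are not results from the notebook.

```python
from collections import Counter

def per_class_recall(y_true, y_pred):
    """Fraction of each true class that was predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in sorted(totals)}

# Toy labels only (1=nosleep, 2=poetry, 3=funnystories, 4=scifiwriting)
y_true = [1, 1, 1, 2, 2, 3, 3, 4, 4, 4]
y_pred = [1, 1, 4, 2, 2, 3, 1, 4, 4, 1]
recall = per_class_recall(y_true, y_pred)
print(recall)
```

On the notebook's data, `y_true` would be `Y_test_class` and `y_pred` the output of `classifier_model.predict(X_test_class)`.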

In [295]:
regression_model.score(X_test_score,Y_test_score)

0.8232507477142748
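
For a regression pipeline, `score` returns the coefficient of determination (R²), not an accuracy. Here is a minimal sketch of that formula in plain Python. `r_squared` is a hypothetical helper and the numbers are made up for illustration.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return 1.0 - ss_res / ss_tot

# Made-up scores, not data from the notebook
y_true = [100.0, 200.0, 300.0, 400.0]
y_pred = [110.0, 190.0, 310.0, 390.0]
print(r_squared(y_true, y_pred))
```

A value of 1.0 means perfect predictions, 0.0 matches always predicting the mean, and negative values are possible when the model does worse than that.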

In [313]:
def prediction(story):
  prediction_df = pd.DataFrame([[story,0,0]],columns = ["story","score","class"])
  prediction_df = text_process(prediction_df)
  classifier_prediction = classifier_model.predict(prediction_df["cleaned_story"])
  regressor_prediction  = regression_model.predict(prediction_df["cleaned_story"])
  return classifier_prediction[0],regressor_prediction[0]

In [314]:
prediction("I fucking hate Halloween. It’s when all the pricks come out to play; full moon and all that. I feel a special kind of abhorrence toward All Hallow’s Eve that has nothing to do with the actual holiday itself. You see, each year on October 31st, I receive a disturbingly unsettling phone call. I don’t know who it’s from nor do I have any indication as to why they are doing this to me. I’ve tried everything I could to find out - I’ve gone to the police, I’ve gone to my phone network company, hell, I even tried to hire a hacker. Nothing would work. I change my number yearly just before October but alas, when October 31st rolls around, I get the phone call.It all started about three years ago, back when I used to actually celebrate Halloween. I was in a pub with a few mates from work when my phone rang, I fumbled for it in my pocket to see who it was. It was an ‘Unknown Number’ so I hung up the call - thinking it was a crank call. It was Halloween after all. But it just wouldn’t stop. I could feel my phone vibrating incessantly in my pocket. Eventually, I excused myself and went outside to answer it, about to tell whoever it was to fuck off.“Hello?”“Is that...Matthew?”The voice was soft, mellow and velvety fucking smooth but it sent a shiver down my spine. I couldn’t figure out whether it was a man or woman - as weird as this sounds, it kind of sounded like both. They spoke so slowly, like they were choosing each and every syllable with extreme care. I felt the hairs on the back of my neck stand up.“Yes, it’s Matthew. Who am I speaking to?”“Someone who has been watching you for a long while, Matthew.”“Who the fuck is this?”“Who we are bears no meaning. It’s what we can do that should be important to you Matthew.”I hung up the phone. I didn’t realise then what a mistake that was going to be.When I got home that night, I found my cat Biggles dead on my front porch. His insides were spooned out; he was nothing but a meat suit. 
A note was stapled crudely to his matted fur.October 31st. It read.The following year, I had almost forgotten about it. On October 31st, I was at a house party when my phone rang at exactly 10.55pm. It was an unknown number again, I frowned feeling my stomach tie in multiple knots. It couldn’t be though, I had changed my number since then.Hesitantly, I answered.“Hello?”“Is that...Matthew?”The voice was different this time, more guttural but high pitched at the same time. It’s how I’d always thought a person who was being choked would sound like. Gasping for breath.“Please stop calling me.”“We have such wonderful things to show you Matthew.”“Show me what?”This thing, it laughed then. It was the most chilling laugh I’d ever heard in my life. It was a dry sort of cackle, no emotion behind it.“The beautiful, visceral and stretchy insides of people. Have you ever seen the inside of a person Matthew? It’s glorious. We can show you how.”“Fuck you.”I hung up.When I got home that night, my mother called me, she was hysterical. My grandad was dead. Murdered apparently - found all splayed and bloody. His insides were all fished out, there was nothing left. Just an empty sack of skin. My heart sank when she told me what was pinned to his chest. It was a note that read October 31st.The following year, they took my mum. She was found strung up in her bathroom; looked like a suicide. Except she didn’t have any insides, all her organs and flesh were gone. She was split open from throat to navel and hung out to dry. There wasn’t even any blood left by the time she was found. I started to think that they just wanted to torture me; kill my family; murder my friends. Sometimes evil motherfuckers don’t need a reason to be evil motherfuckers. I tried to kill myself after my mum died. It was the only way I could get myself out of this but I couldn’t do it. I just couldn’t go through with it.Whoever this was, they were not human. I knew that much. 
I felt hopeless, lost and scared shitless.It’s October 31st today and my phone is already ringing. I feel the familiar fear creep up on me like a slithering serpent that’s about to strike and fill me with poison. Only what was waiting for me on the other end of that phone was something so much more deadly, much more terrifying. I stare at my phone for a long time before I finally answer.“Hello?”“Is that...Matthew?”“Fuck you, you fucking know it’s Matthew.“Will you let us see what’s inside you Matthew? We need your meat suit.”“You’ve already taken everyone I care about. You can go fuck yourself now.”“Not...everyone Matthew. Did you know you have a baby?”My heart sank. Fuck, Natalie. We broke up about a year ago but I didn’t know she was pregnant.“Yes Matthew. You have a scrumptious baby, so young, so...ripe. The flesh so smooth.”I’m crying, sobbing really.“Give us your flesh and she will not have to suffer.”They’ve never asked me this before. I knew I had no choice. I couldn’t let the same thing happen to a small, innocent baby.“Okay.”The line went dead.Someone just slipped a note under my door. Along with a blood stained, mauve coloured blade.")

(1, 11698.69940436905)
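
The tuple above is the numeric class followed by the predicted score, and class 1 corresponds to `nosleep` in the labelling loop. Here is a small sketch that turns the pair into a readable string. `CLASS_NAMES` and `describe_prediction` are hypothetical names.

```python
# Mapping taken from the labels assigned while scraping
CLASS_NAMES = {1: "nosleep", 2: "poetry", 3: "funnystories", 4: "scifiwriting"}

def describe_prediction(class_id, score):
    """Format a (class, score) pair as a readable string."""
    name = CLASS_NAMES.get(int(class_id), "unknown")
    return f"r/{name}, predicted score ~{score:.0f}"

print(describe_prediction(1, 11698.69940436905))  # r/nosleep, predicted score ~11699
```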