# Wellness Squad: Model Playground
This notebook tests the utility functions in utils.py against our pretrained suicide classification models.
Note the following before running this notebook:
1. You must first generate the models by running the training programs in the Notebooks directory.
2. Scraping Reddit with our helper method does not work on Google Colaboratory, because Reddit blocks requests from Colab's IP addresses.

In [1]:
import joblib
from utils import predict_ideation, scrape_quora, scrape_reddit, scrape_youtube_transcript

Some weights of the model checkpoint at cardiffnlp/twitter-roberta-base-sentiment-latest were not used when initializing RobertaForSequenceClassification: ['roberta.pooler.dense.bias', 'roberta.pooler.dense.weight']
- This IS expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model trained on another task or with another architecture (e.g. initializing a BertForSequenceClassification model from a BertForPreTraining model).
- This IS NOT expected if you are initializing RobertaForSequenceClassification from the checkpoint of a model that you expect to be exactly identical (initializing a BertForSequenceClassification model from a BertForSequenceClassification model).


In [2]:
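# Load the pretrained classifiers produced by the training notebooks.
# Forward slashes keep the paths valid on Windows, macOS, and Linux.
rf = joblib.load('Models/rf_base_model.pkl')
lr = joblib.load('Models/lr_base_model.pkl')
gbc = joblib.load('Models/gbc_base_model.pkl')
vc = joblib.load('Models/vc_base_model.pkl')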
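
The loading cell above fails with a file-not-found error if the training notebooks have not been run yet. A minimal sketch of a pre-flight check follows; `missing_model_files` is a hypothetical helper (it is not part of utils.py), and the file names are simply the four used in the loading cell.

```python
from pathlib import Path

# File names mirror the loading cell above; the helper itself is hypothetical.
MODEL_FILES = [
    "rf_base_model.pkl",
    "lr_base_model.pkl",
    "gbc_base_model.pkl",
    "vc_base_model.pkl",
]


def missing_model_files(model_dir="Models", names=MODEL_FILES):
    """Return the expected model files that are absent from model_dir."""
    base = Path(model_dir)
    return [name for name in names if not (base / name).is_file()]


missing = missing_model_files()
if missing:
    print("Missing model files (run the training notebooks first):", missing)
```

Running this before `joblib.load` gives a clearer message than the raw exception, and it behaves the same on every operating system because `pathlib` handles the path separator.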

In [3]:
## Test: Predicting a list of strings {non-suicide, suicide}
print(predict_ideation(
    rf,
    ['The derivative of y=x^2 is 2x.',
     'Only optimists commit suicide, optimists who no longer succeed at being optimists. The others, having no reason to live, why would they have any to die?'],
    False
))

[Parallel(n_jobs=8)]: Using backend ThreadingBackend with 8 concurrent workers.
[Parallel(n_jobs=8)]: Done  34 tasks      | elapsed:    0.0s
[Parallel(n_jobs=8)]: Done 100 out of 100 | elapsed:    0.0s finished



Message The derivative of y=x^2 is 2x. has been labeled as: ['non-suicide']
	 Non-Suicide Approximation (%): 0.98
 	 Suicide Approximation (%): 0.02

Message Only optimists commit suicide, optimists who no longer succeed at being optimists. The others, having no reason to live, why would they have any to die? has been labeled as: ['suicide']
	 Non-Suicide Approximation (%): 0.47
 	 Suicide Approximation (%): 0.53
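
The second message above is labeled 'suicide' on a 0.53 versus 0.47 split, which is close to a coin flip. The sketch below shows one way such borderline predictions could be flagged for human review. It is an illustration only: `label_with_margin` and the `margin` value are assumptions, not something taken from utils.py, and how `predict_ideation` derives its label internally is not shown in this notebook.

```python
def label_with_margin(p_non_suicide, p_suicide, margin=0.10):
    """Return (label, is_borderline) for a pair of class probabilities.

    `margin` is an illustrative cutoff, not a value taken from utils.py.
    """
    label = "suicide" if p_suicide > p_non_suicide else "non-suicide"
    # A small gap between the two probabilities means the model is unsure.
    is_borderline = abs(p_suicide - p_non_suicide) < margin
    return label, is_borderline


print(label_with_margin(0.98, 0.02))  # ('non-suicide', False)
print(label_with_margin(0.47, 0.53))  # ('suicide', True)
```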



In [3]:
## Test: Classifying posts that were scraped from Reddit
print(scrape_reddit(
    rf,
    'yorku',
    'new'
))

Requesting information (json file) from https://www.reddit.com/r/yorku/new...
Number of scraped posted: 25


[Parallel(n_jobs=8)]: Using backend ThreadingBackend with 8 concurrent workers.
[Parallel(n_jobs=8)]: Done  34 tasks      | elapsed:    0.0s
[Parallel(n_jobs=8)]: Done 100 out of 100 | elapsed:    0.0s finished

---------- Analysis of Comment 1 ----------
Message ^ has been labeled as: ['non-suicide']
	 Non-Suicide Approximation (%): 0.759558802308802
 	 Suicide Approximation (%): 0.24044119769119765
---------- Analysis of Comment 2 ----------
Message I missed a test worth 25% for my Intro to Research Methods in Psychology course and I’m a second year student. Once I realized my mistake, I immediately submitted the pass/fail option. 

It said something about using available grades and since the test isn’t marked yet, do you think I’ll be able to pass?

I did one test already and got 56% and did an assignment and I can’t see the grade for that. 

Also, the 3rd test is in a week, so should I still do that or will I get the pass fail results before then?

Ya’ll I’m dying over here. I’m focusing on 6 courses and had too much to do. has been labeled as: ['non-suicide']
	 Non-Suicide Approximation (%): 0.55
 	 Suicide Approximation (%): 0.45
---------- Analysis of Comment 3 ----------
Message Bestie

[Parallel(n_jobs=8)]: Using backend ThreadingBackend with 8 concurrent workers.
[Parallel(n_jobs=8)]: Done  34 tasks      | elapsed:    0.0s
[Parallel(n_jobs=8)]: Done 100 out of 100 | elapsed:    0.0s finished
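
The log above indicates that the Reddit helper requests a JSON listing for the subreddit. As a rough sketch of what handling such a listing can look like, the code below builds a listing URL and extracts post text from an already-parsed listing dictionary. The function names are hypothetical and the real `scrape_reddit` may work differently; the sketch only assumes Reddit's public listing layout, where posts sit under `data.children[*].data`. No network request is made: `sample` is hand-written illustrative data.

```python
def listing_url(subreddit, sort="new"):
    """Build the JSON URL for a subreddit listing."""
    return f"https://www.reddit.com/r/{subreddit}/{sort}.json"


def extract_post_texts(listing):
    """Pull the title plus self-text from each post in a listing dict."""
    texts = []
    for child in listing.get("data", {}).get("children", []):
        post = child.get("data", {})
        parts = (post.get("title", ""), post.get("selftext", ""))
        text = " ".join(part for part in parts if part).strip()
        if text:  # skip posts with no text at all (e.g. image-only posts)
            texts.append(text)
    return texts


# Hand-written sample in the listing layout; not real scraped data.
sample = {"kind": "Listing", "data": {"children": [
    {"kind": "t3", "data": {"title": "Exam question", "selftext": "Is the final cumulative?"}},
    {"kind": "t3", "data": {"title": "", "selftext": ""}},
]}}

print(listing_url("yorku"))
print(extract_post_texts(sample))
```

Dropping empty posts before classification avoids scoring messages that carry no text, which is one plausible reason very short entries such as "^" receive low-information predictions.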


In [3]:
## Test: Classifying posts that were scraped from Quora
print(
    scrape_quora(rf, 'https://www.quora.com/What-is-the-meaning-of-life-179', 20, 50, 5)
)

[Parallel(n_jobs=8)]: Using backend ThreadingBackend with 8 concurrent workers.
[Parallel(n_jobs=8)]: Done  34 tasks      | elapsed:    0.0s
[Parallel(n_jobs=8)]: Done 100 out of 100 | elapsed:    0.0s finished


Sleeping for 5 seconds (page loading)
Post #1 has been expanded...
Post #2 has been expanded...
Post #3 has been expanded...
Post #4 has been expanded...
Post #5 has been expanded...
Post #6 has been expanded...
Post #7 has been expanded...
Post #8 has been expanded...
Post #9 has been expanded...
Post #10 has been expanded...
Post #11 has been expanded...
Post #12 has been expanded...
Post #13 has been expanded...
Post #14 has been expanded...
Post #15 has been expanded...
Post #16 has been expanded...
Post #17 has been expanded...
Post #18 has been expanded...
Post #19 has been expanded...
Post #20 has been expanded...
After pruning, 60 posts (text-only) remain.
Message Difficulties in your life don’t come to destroy you, but to help you realize your full potential.Life is meaningful when you have someone to cry together, laugh together and bear each other’s sorrows.Nothing is too difficult to try.Create your own little heaven right here. Then go change yourself such that you qualify

In [8]:
## Test: Classifying a provided YouTube Transcript
print(
    scrape_youtube_transcript(rf, 'https://www.youtube.com/watch?v=aCyGvGEtOwc')
)

[Parallel(n_jobs=8)]: Using backend ThreadingBackend with 8 concurrent workers.
[Parallel(n_jobs=8)]: Done  34 tasks      | elapsed:    0.0s
[Parallel(n_jobs=8)]: Done 100 out of 100 | elapsed:    0.0s finished



Message Music Music take it from the top got a body like an hour gotta sleep in like a clock his mind she caught it right now she buys a free I told Michael I was the only one Music Wow Music forgot Applause Music and picture the Music second chances theyll never matter people never change the lawyer nothing more sorry thatll never change and about forgiveness sorry Music Music Applause Music but God Music Applause Music it just feels soo Music washes water youre sure Music lets watch my wildest dreams come true you Music Applause Music  has been labeled as: ['non-suicide']
	 Non-Suicide Approximation (%): 0.88
 	 Suicide Approximation (%): 0.12
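
The transcript above is dominated by caption cues such as "Music" and "Applause", which add noise to the classifier input. Auto-generated captions typically mark these cues in square brackets (for example `[Music]`), so one option is to strip them from the raw transcript before any other cleaning. The sketch below assumes access to that raw, bracketed text; `strip_caption_cues` is a hypothetical helper and is not part of utils.py.

```python
import re

# Matches any bracketed cue such as [Music] or [Applause].
_CUE = re.compile(r"\[[^\]]*\]")


def strip_caption_cues(text):
    """Remove bracketed caption cues and collapse leftover whitespace."""
    without_cues = _CUE.sub(" ", text)
    return " ".join(without_cues.split())


print(strip_caption_cues("[Music] take it from the top [Applause] [Music]"))
# prints: take it from the top
```

Removing cues while the brackets are still present is safer than filtering the bare words later, since a spoken "music" in the lyrics or dialogue would otherwise be deleted as well.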



[Parallel(n_jobs=8)]: Using backend ThreadingBackend with 8 concurrent workers.
[Parallel(n_jobs=8)]: Done  34 tasks      | elapsed:    0.0s
[Parallel(n_jobs=8)]: Done 100 out of 100 | elapsed:    0.0s finished
