# Guidance
- `Runtime` -> `Run all`
- Go to [Prediction](#pred) and change the value of the `words` variable
- Run the last two cells

In [2]:
!git clone https://github.com/yyLeaves/depression-text-classification-model.git ./depression_checkpoint

Cloning into './depression_checkpoint'...
remote: Enumerating objects: 13, done.
remote: Counting objects: 100% (13/13), done.
remote: Compressing objects: 100% (10/10), done.
remote: Total 13 (delta 0), reused 0 (delta 0), pack-reused 0
Unpacking objects: 100% (13/13), done.


# BERT Model
## Load pre-trained BERT models
- https://www.sbert.net/docs/pretrained_models.html

In [3]:
!pip install sentence-transformers
import os
import re
import pandas as pd
from sentence_transformers import SentenceTransformer

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting sentence-transformers
  Downloading sentence-transformers-2.2.0.tar.gz (79 kB)
Collecting transformers<5.0.0,>=4.6.0
  Downloading transformers-4.19.4-py3-none-any.whl (4.2 MB)
Collecting sentencepiece
  Downloading sentencepiece-0.1.96-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (1.2 MB)
Collecting huggingface-hub
  Downloading huggingface_hub-0.7.0-py3-none-any.whl (86 kB)
Collecting pyyaml>=5.1
  Downloading PyYAML-6.0-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.manylinux_2_12_x86_64.manylinux2010_x86_64.whl (596 kB)
Collecting tokenizers!=0.11.3,<

In [4]:
bert_model = SentenceTransformer('all-mpnet-base-v2')

## Build Neural Network

In [5]:
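import tensorflow as tf
from tensorflow.keras import Sequential, layers, callbacks
def create_model():
  """Binary classifier head that sits on top of 768-dimensional sentence embeddings."""
  model = Sequential([
    # Dropout is only active during training; it is a no-op in model.predict().
    layers.Dropout(0.5, 
                   name='dropout'),
    # Fully connected hidden layer over the embedding vector.
    layers.Dense(1024, 
                 activation='relu', 
                 input_shape=(768,), 
                 name='fc'),
    layers.Dropout(0.5, 
                   name='dropout2'),
    # Single sigmoid unit: the output is a probability in [0, 1].
    layers.Dense(1, 
                 activation='sigmoid', 
                 name='output')
  ]
  )
  return model
model = create_model()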
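
As a rough sanity check on the size of this network, the trainable parameter count can be worked out by hand. The sketch below is an illustration only: it assumes the 768-dimensional input declared by `input_shape=(768,)` on the `fc` layer, and the helper name `dense_params` is hypothetical. Dropout layers contribute no parameters.

```python
# Layer widths taken from create_model(); the 768-dimensional input is an
# assumption based on input_shape=(768,) on the 'fc' layer.
layer_sizes = [768, 1024, 1]

def dense_params(n_in, n_out):
    """Weights plus biases of one fully connected layer (hypothetical helper)."""
    return n_in * n_out + n_out

# One entry per Dense layer: 'fc' first, then 'output'.
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes, layer_sizes[1:])]
total_params = sum(per_layer)
print(per_layer, total_params)  # [787456, 1025] 788481
```

Once the model has been built, `model.summary()` should report the same total if the assumption about the input size holds.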


## Load best model

In [6]:
checkpoint_path = './depression_checkpoint/checkpoint'
model.load_weights(checkpoint_path)

<tensorflow.python.training.tracking.util.CheckpointLoadStatus at 0x7f5decb105d0>
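
If `load_weights` fails, a first thing to check is whether the cloned files are actually where `checkpoint_path` points. The sketch below assumes the weights are stored in TensorFlow checkpoint format, where a prefix such as `./depression_checkpoint/checkpoint` corresponds to a `<prefix>.index` file plus one or more `<prefix>.data-*` shards. The helper `checkpoint_status` is hypothetical, and the actual file layout of the repository has not been verified here.

```python
import glob
import os

def checkpoint_status(prefix):
    """Report whether the index file exists and list the data shards for a prefix.

    Assumes TensorFlow checkpoint naming: '<prefix>.index' and '<prefix>.data-*'.
    """
    index_ok = os.path.exists(prefix + ".index")
    # glob.escape protects any special characters in the directory name.
    shards = sorted(glob.glob(glob.escape(prefix) + ".data-*"))
    return index_ok, shards

index_ok, shards = checkpoint_status('./depression_checkpoint/checkpoint')
print("index file found:", index_ok)
print("data shards:", shards)
```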

# Prediction
<a id='pred'></a>

In [7]:
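def process_data(sentences):
  """Lowercase the text, strip punctuation, and keep at most the first 200 words."""
  sentences = sentences.lower().replace('\n',' ')
  # Replace everything except letters, digits, and whitespace with a space.
  sentences = re.sub(r'[^a-zA-Z0-9\s]', ' ', sentences)
  sentences = ' '.join(sentences.split()[:200])
  return sentences
def process_predict(words):
  """Clean the text and return its sentence embedding as a batch of one."""
  word_embed = bert_model.encode([process_data(words)])
  return word_embed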
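
To see what the cleaning step does before any text reaches the encoder, here is a standalone sketch that needs only the standard library. `clean_text` is a hypothetical mirror of `process_data`, and the `max_words` parameter is added purely for illustration.

```python
import re

def clean_text(text, max_words=200):
    """Hypothetical mirror of process_data: lowercase, strip punctuation, truncate."""
    text = text.lower().replace('\n', ' ')
    # After lowercasing, only a-z, digits, and whitespace survive;
    # punctuation, emoji, and other symbols become spaces.
    text = re.sub(r'[^a-z0-9\s]', ' ', text)
    # split() collapses runs of whitespace; the slice caps the word count.
    return ' '.join(text.split()[:max_words])

print(clean_text("Relax.\nThis won't hurt!"))   # relax this won t hurt
print(len(clean_text("word " * 250).split()))   # 200
```

Note that apostrophes are removed too, so contractions such as "won't" are split into two tokens.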

In [8]:
# # https://www.stereogum.com/1799/hunter_s_thompsons_suicide_note/news/
# words = "No More Games. No More Bombs. No More Walking. No More Fun.\
#  No More Swimming. 67. That is 17 years past 50. 17 more than I needed or wanted. \
#  Boring. I am always bitchy. No Fun for anybody. 67. You are getting Greedy. \
#  Act your old age. Relax This won't hurt."

# words = "So regarding yesterdays race and RBs decision to put Perez on Intermediate tires... \
# I actually think this was not great strat call at all. Well, it is, given that it resulted in \
# Ferrari panicking, but as far as logic behind it...there wasnt much. It was kinda hail mary by \
# RB to use Sergio and get some kinda of movement ahead, because staying as is, they were on course for 3-4."

words = "Every day is divine"
# words = "Every day is suffering"

# words = """
# Very sad to hear the great Dr. Jian Sun has died (way too young). We are so grateful for ResNet, the workhorse of 
#  💔, and so many other incredible contributions. RIP
# """

words = "No More Games. No More Bombs. No More Walking. No More Fun.\
 No More Swimming. 67. That is 17 years past 50. 17 more than I needed or wanted. \
 Boring. I am always bitchy. No Fun for anybody. 67. You are getting Greedy. \
 Act your old age. Relax This won't hurt."
word_embed = process_predict(words)

In [9]:
prob = model.predict(word_embed)[0][0]
print(f"The probability of depression is {prob * 100:.2f}%")
if prob > 0.5:
  print(f"The text suggests a {prob * 100:.2f}% likelihood of depression. "
        "The writer shows at least some signs of depressed mood. Please check in with them!")
else:
  print(f"Relax! At a likelihood of {prob * 100:.2f}%, "
        "the writer is unlikely to be in a depressed mood.")

The probability of depression is 72.54%
The text suggests a 72.54% likelihood of depression. The writer shows at least some signs of depressed mood. Please check in with them!
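
The decision rule in the last cell can also be written as a small function, which makes the cut-off explicit and easy to change. This is a sketch only: `interpret` and its `threshold` parameter are hypothetical names, and 0.5 simply mirrors the `prob > 0.5` check above rather than a tuned value.

```python
def interpret(prob, threshold=0.5):
    """Map a sigmoid output to a label; mirrors the strict `prob > 0.5` check."""
    if not 0.0 <= prob <= 1.0:
        raise ValueError(f"expected a probability in [0, 1], got {prob}")
    return "depressed" if prob > threshold else "not depressed"

print(interpret(0.7254))  # depressed
print(interpret(0.5))     # not depressed (the comparison is strict)
```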
