<h1>Table of Contents<span class="tocSkip"></span></h1>
<div class="toc"><ul class="toc-item"><li><span><a href="#Generation-Notebook" data-toc-modified-id="Generation-Notebook-1"><span class="toc-item-num">1&nbsp;&nbsp;</span>Generation Notebook</a></span><ul class="toc-item"><li><span><a href="#Loading" data-toc-modified-id="Loading-1.1"><span class="toc-item-num">1.1&nbsp;&nbsp;</span>Loading</a></span></li><li><span><a href="#Patient-Case" data-toc-modified-id="Patient-Case-1.2"><span class="toc-item-num">1.2&nbsp;&nbsp;</span>Patient Case</a></span></li><li><span><a href="#Therapist-Response-Generation" data-toc-modified-id="Therapist-Response-Generation-1.3"><span class="toc-item-num">1.3&nbsp;&nbsp;</span>Therapist Response Generation</a></span></li></ul></li></ul></div>

# Generation Notebook
---
Running this notebook lets you generate therapist responses from a pre-trained model saved on Google Drive. The code is straightforward to follow, but you need access to a GPU machine. If you are running it in Colab, enable the GPU runtime before you run the notebook.
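
A quick way to confirm that a GPU is visible before loading the model is sketched below. The helper name `gpu_available` is hypothetical and is not part of this notebook's `resources` package. It assumes PyTorch is the backend (as it is for `GPT2LMHeadModel`) and simply reports `False` when PyTorch is not installed.

```python
import importlib.util


def gpu_available() -> bool:
    """Return True if PyTorch is installed and can see a CUDA device."""
    # Hypothetical helper: avoid an ImportError on machines without PyTorch.
    if importlib.util.find_spec("torch") is None:
        return False
    import torch

    return torch.cuda.is_available()


print("GPU available:", gpu_available())
```

If this prints `False` in Colab, the runtime type is most likely still set to CPU.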

In [0]:
!pip install transformers

In [2]:
from google.colab import drive
import sys
drive.mount('/gdrive')
sys.path.append('../gdrive/My Drive/')
from resources.model_utils import generate
from transformers import GPT2Tokenizer, GPT2LMHeadModel
from resources.general_utils import gpu_information_summary
from termcolor import colored

Mounted at /gdrive


## Loading
---
Here we are loading the pre-trained model saved in Google Drive.

In [0]:
fine_tuned_dir = "../gdrive/My Drive/fine_tuned/"
model = GPT2LMHeadModel.from_pretrained(fine_tuned_dir)
tokenizer = GPT2Tokenizer.from_pretrained(fine_tuned_dir)

## Patient Case
Below you can enter a patient case and see how the model performs. Note that we use `.` as a special token, so you cannot use it elsewhere in your input (i.e., there must be exactly one period, and it must come at the very end).

Your sentence should also not include any other **punctuation** marks as we purged them during model training.
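
The two input rules above can be enforced with a small helper before the text reaches the model. This is only a sketch: `normalize_patient_case` is a hypothetical name, and it assumes that "punctuation" means the ASCII characters in Python's `string.punctuation`, since the exact set purged during training is not stated here. Other characters, such as the typographic apostrophe in the example case, are left untouched.

```python
import string


def normalize_patient_case(text: str) -> str:
    """Drop ASCII punctuation and end the text with exactly one period."""
    # Assumption: the purged set matches string.punctuation (includes '.').
    cleaned = text.translate(str.maketrans("", "", string.punctuation))
    # Trim the ends, then append the single period used as the special token.
    return cleaned.strip() + "."


print(normalize_patient_case("I feel anxious, tired... Can counseling help?"))
# → I feel anxious tired Can counseling help.
```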


In [0]:
patient_case = "I have so many issues to address I have a history of sexual abuse I’m a breast cancer survivor and I am a lifetime insomniac  I have a long history of depression and I’m beginning to have anxiety I have low self esteem but I’ve been happily married for almost 35 years  I’ve never had counseling about any of this Do I have too many issues to address in counseling."

## Therapist Response Generation
Below we generate 3 candidate responses a therapist might give. The patient's case is printed in blue and each possible therapist response in green. Here is a cherry-picked example from the output below:
 * **Patient**: I have so many issues to address I have a history of sexual abuse I’m a breast cancer survivor and I am a lifetime insomniac  I have a long history of depression and I’m beginning to have anxiety I have low self esteem but I’ve been happily married for almost 35 years  I’ve never had counseling about any of this Do I have too many issues to address in counseling.
 * **Therapist**: Yes there are plenty of issues that you might want to address but perhaps the most important are the ones you have worked through before you have had counseling.

Here, in the reflection, the model recognizes that the patient has raised too many issues to address one by one and, rather than reiterating all of them, points this fact out.
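
The generation cell passes `temperature`, `k`, and `p` to `generate`. That helper's internals are not shown in this notebook, so the following is only a sketch of what these settings usually mean, assuming `k` and `p` are forwarded to top-k and nucleus (top-p) sampling as in Hugging Face Transformers. The function `top_k_top_p_filter` and the toy distribution are hypothetical and for illustration only.

```python
def top_k_top_p_filter(probs: dict, k: int, p: float) -> dict:
    """Keep the k likeliest tokens, then the smallest prefix with mass >= p."""
    # Top-k: keep only the k most probable tokens and renormalize them.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)[:k]
    top_k_total = sum(prob for _, prob in ranked)
    ranked = [(token, prob / top_k_total) for token, prob in ranked]

    # Top-p: walk down the ranking until the cumulative mass reaches p.
    kept, cumulative = [], 0.0
    for token, prob in ranked:
        kept.append((token, prob))
        cumulative += prob
        if cumulative >= p:
            break

    # Renormalize the survivors so they form a distribution to sample from.
    kept_total = sum(prob for _, prob in kept)
    return {token: prob / kept_total for token, prob in kept}


toy = {"yes": 0.5, "there": 0.3, "maybe": 0.15, "no": 0.05}
filtered = top_k_top_p_filter(toy, k=3, p=0.8)
print(sorted(filtered))  # → ['there', 'yes']
```

With `k=100` and `p=0.95`, as used in this notebook, far more of the vocabulary stays eligible at each step, and a temperature of 1 leaves the model's probabilities unscaled before filtering.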

In [17]:
input_generate = {
    "text": patient_case,
    "tokenizer": tokenizer,
    "model": model,
    "stop_token": None,
    "length": 1024,
    "num_return_sequences": 3,
    "temperature": 1,
    "k": 100,
    "p": 0.95,
}

results = generate(**input_generate)
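print("-" * 30)
# Each result is indexed as a pair: [0] patient case, [1] generated response.
for seqs in results:
  print(colored(seqs[0], "blue", attrs=["bold", "dark"]))   # patient case
  print(colored(seqs[1], "green", attrs=["bold", "dark"]))  # therapist response
  print("-" * 30)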

Setting `pad_token_id` to 50256 (first `eos_token_id`) to generate sequence


------------------------------
I have so many issues to address I have a history of sexual abuse I’m a breast cancer survivor and I am a lifetime insomniac  I have a long history of depression and I’m beginning to have anxiety I have low self esteem but I’ve been happily married for almost 35 years  I’ve never had counseling about any of this Do I have too many issues to address in counseling.
.Yes there are plenty of issues that you might want to address but perhaps the most important are the ones you have worked through before you have had counseling.
------------------------------
I have so many issues to address I have a history of sexual abuse I’m a breast cancer survivor and I am a lifetime insomniac  I have a long history of depression and I’m beginning to have anxiety I have low self esteem but I’ve been happily married for almost 35 years  I’ve never had counseling about any of this Do I have too many issues to address in counseli