In [1]:
import os
import requests

In [2]:
#Get PDF path
pdf_path='book.pdf'

In [3]:
import fitz
from tqdm.auto import tqdm

def text_formatter(text: str) -> str:
    "Perform minor formatting on text"
    cleaned_text = text.replace("\n",' ').strip()
    
    """ more formatting can go here """
    return cleaned_text

In [4]:
""" open pdf """
def open_and_read_pdf(pdf_path: str) ->str:
    doc = fitz.open(pdf_path)
    pages_and_text=[]
    for page_num, page in tqdm(enumerate(doc)):
        text = page.get_text()
        text = text_formatter(text=text)
        pages_and_text.append({"page_number": page_num-10,
                               "page_char_count": len(text),
                               "page_word_count": len(text.split(" ")),
                               "page_sentence_count": len(text.split(". ")),
                               "page_token_count": len(text)/4, #1 token ~ 4 chars
                               "text": text})
    return pages_and_text

In [5]:
#call the functio
pages_and_text = open_and_read_pdf(pdf_path=pdf_path)
pages_and_text[:2]

0it [00:00, ?it/s]

[{'page_number': -10,
  'page_char_count': 0,
  'page_word_count': 1,
  'page_sentence_count': 1,
  'page_token_count': 0.0,
  'text': ''},
 {'page_number': -9,
  'page_char_count': 24,
  'page_word_count': 3,
  'page_sentence_count': 1,
  'page_token_count': 6.0,
  'text': 'Essentials of Obstetrics'}]

In [6]:
import random
random.sample(pages_and_text,k=3)

[{'page_number': 252,
  'page_char_count': 719,
  'page_word_count': 93,
  'page_sentence_count': 15,
  'page_token_count': 179.75,
  'text': '252 Essentials of Obstetrics Case 2 1. Decrease uterine activity by stopping oxytocin  infusion. 2. Maternal and fetal oxygenation can be improved by  turning the mother to a lateral recumbent position and  by giving oxygen by mask. 3. 6JG\x02UECNR\x02UVKOWNCVKQP\x02VGUV\x02YKNN\x02GNKEKV\x02HGVCN\x02JGCTV\x02TCVG\x02 acceleration in a healthy fetus. It is a reassuring sign. Sample questions Long-answer questions 1. 9JCV\x02CTG\x02VJG\x02OGVJQFU\x02QH\x02CUUGUUKPI\x02HGVCN\x02YGNN\x0fDGKPI\x02 KP\x02NCDQT! 2. What are reassuring and nonreassuring signs of  HGVCN\x02UVCVWU!\x02*QY\x02CTG\x02%CVGIQT[\x02+++\x02VTCEKPIU\x02 OCPCIGF! Short-answer questions 1. Early, late, and variable decelerations 2. Fetal scalp blood sampling 3. Sinusoidal pattern 4. Fetal pulse oximetry 5. Admission test'},
 {'page_number': 659,
  'page_char_count': 3001,
  'page

In [7]:
#Converting the corpus into sentences (chunks / splits)
from spacy.lang.en import English #spacy lib used to process sentences based on some rules
nlp = English()

nlp.add_pipe("sentencizer")
doc = nlp("This is a sentence. This is another sentence")
assert len(list(doc.sents)) == 2

list(doc.sents)

[This is a sentence., This is another sentence]

In [8]:
for item in tqdm(pages_and_text):
    item['sentences']=list(nlp(item['text']).sents)
    #make sure all sentences are strings
    item['sentences'] = [str(sentence) for sentence in item['sentences']]
    #count the sentences
    item['page_sentence_count)spacy'] = len(item['sentences'])

  0%|          | 0/914 [00:00<?, ?it/s]

In [9]:
random.sample(pages_and_text,k=1)

[{'page_number': 784,
  'page_char_count': 3373,
  'page_word_count': 514,
  'page_sentence_count': 24,
  'page_token_count': 843.25,
  'text': "784 Essentials of Obstetrics cirrhosis. The disease is usually treated with  immunosuppressive drugs. The drugs used are  corticosteroids and azathioprine. The disease  can relapse in pregnancy. Risk of spontaneous  miscarriage and fetal demise is high. Cirrhosis and portal  hypertension Women with decompensated cirrhosis are often  anovulatory and rarely conceive. Women who  do conceive face increased maternal mortality.  Fetal and neonatal outcomes are also poor, with  increased rates of preterm deliveries, spontaneous  abortions, stillbirths, and neonatal mortality. Gallstones in pregnancy Gallstones (cholelithiasis) occur more com- monly during pregnancy due to decreased gall- bladder motility and increased cholesterol sat- uration of bile. Up to 10% of pregnant women  may develop stones or bile sludge during preg- nancy or in the immediat

In [10]:
#chunking our sentences together so that it can be fitted in to the embedding model and the LLM context window
num_sent_chunk_size=10

#function to split lists of texts, eg: [25] -> [10,10,5]
def split_list(input_list: list[str],slice_size: int = num_sent_chunk_size)-> list[list[str]]:
    return [input_list[i:i+ slice_size] for i in range(0,len(input_list),slice_size)]

text_list = list(range(25))
split_list(text_list)


[[0, 1, 2, 3, 4, 5, 6, 7, 8, 9],
 [10, 11, 12, 13, 14, 15, 16, 17, 18, 19],
 [20, 21, 22, 23, 24]]

In [11]:
#loop through pages and text and split into chunks
for item in tqdm(pages_and_text):
    item['sentence_chunks'] = split_list(input_list=item['sentences'],
    slice_size = num_sent_chunk_size)
    item['num_chunks']=len(item['sentence_chunks'])

  0%|          | 0/914 [00:00<?, ?it/s]

In [12]:
random.sample(pages_and_text,k=1)

[{'page_number': 392,
  'page_char_count': 2768,
  'page_word_count': 433,
  'page_sentence_count': 15,
  'page_token_count': 692.0,
  'text': '392 Essentials of Obstetrics utcomes of sterili ation The following are the outcomes of sterilization: • Pregnancy is uncommon after tubal steriliza- tion. It is highest after clip sterilization, and low- est after Pomeroy or modified Pomeroy method. • When pregnancy does occur, there is a greater  risk that it will be an ectopic pregnancy. equest for restoration of  fertility A couple may request reversal of sterilization.  This usually follows the loss of a child. The two  choices are as follows: • Tubal reanastomosis • In vitro fertilization (IVF) Both have approximately 60% chance of a  live pregnancy, although surgical reversal is less  expensive. Tubal reanastomosis roce ure o  reanastomosis Tubal reanastomosis includes the following  steps: • The technique involves microsuturing using  6-0 to 10-0 sutures. • The damaged and scarred porti

In [13]:
import re
#split each chunk into its own item
pages_and_chunks = []
for item in tqdm(pages_and_text):
    for sentence_chunk in item['sentence_chunks']:
        chunk_dict ={}
        chunk_dict['page_number']  =item['page_number']
        
        #join the sentence into a para
        joined_sentence_chunk = ''.join(sentence_chunk).replace("  "," ").strip()
        joined_sentence_chunk = re.sub(r'\.([A-Z])',r'.\1',joined_sentence_chunk) # ".A" -> ". A"
        chunk_dict['sentence_chunk']=joined_sentence_chunk
        
        #stats on chunks
        chunk_dict['chunk_char_count']=len(joined_sentence_chunk)
        chunk_dict['chunk_word_count']=len([word for word in joined_sentence_chunk.split(' ')])
        chunk_dict['chunk_token_count']=len(joined_sentence_chunk)/4
        pages_and_chunks.append(chunk_dict)
        

  0%|          | 0/914 [00:00<?, ?it/s]

In [14]:
random.sample(pages_and_chunks,k=1)

[{'page_number': 500,
  'sentence_chunk': 'Enough fluid is removed to achieve an SDP of 8 cm or AFI of 15 cm.However, not more than 5 L is removed at one sitting. •If the fluid stops draining, the needle tip is adjusted. • The fetal heart beat is documented at the end of the procedure. •Antibiotics and tocolytics are not required.Amniotic fluid volume should be monitored weekly following amniocentesis.If fluid accu- mulates again, the procedure may have to be repeated.Complications o amniocentesis Complications of amniocentesis are listed in Box 34.5.Medical management Mild idiopathic polyhydramnios may respond to medical therapy.In moderate-to-severe polyhy- dramnios, amnioreduction is performed before starting medical therapy.',
  'chunk_char_count': 695,
  'chunk_word_count': 103,
  'chunk_token_count': 173.75}]

In [15]:
import pandas as pd
df = pd.DataFrame(pages_and_chunks)

In [16]:

min_token_length = 30
for row in df[df['chunk_token_count']<=min_token_length].sample(5).iterrows():
    print(f'Chunk token count: {row[1]["chunk_token_count"]} | Text: {row[1]["sentence_chunk"]}')

Chunk token count: 25.0 | Text: Fetal presentation 6.Fetal position 7. &GUETKDGJQYGPICIGOGPVQHVJGRTGUGPVKPIRCTVKU determined
Chunk token count: 3.75 | Text: Self-Assessment
Chunk token count: 28.75 | Text: Delivery of placenta 5.Episiotomy 6.Perineal laceration 7.Caput succedaneum 8.Molding of fetal head Self-Assessment
Chunk token count: 10.25 | Text: Anthropoid pelvis 6.Waste space of Morris
Chunk token count: 8.0 | Text: Section 3 Intrapartum Management


In [17]:
#filter data for rows with under 30 tokens to filter out unwanted info
pages_and_chunks_over_min_token_len=df[df['chunk_token_count']>min_token_length].to_dict(orient='records')
pages_and_chunks_over_min_token_len[:2]

[{'page_number': -7,
  'sentence_chunk': 'Essentials of Obstetrics Dr Lakshmi Seshadri, md Senior Consultant in Obstetrics and Gynecology Thirumalai Mission Hospital, Vellore Formerly, Professor and Head of the Department Christian Medical College Hospital Vellore, Tamil Nadu Dr Gita Arjun, facog Director E. V. Kalyani Medical Foundation Pvt.Ltd. Chennai Formerly, Director, and Obstetrician and Gynecologist E.V. Kalyani Medical Centre Chennai, Tamil Nadu',
  'chunk_char_count': 416,
  'chunk_word_count': 56,
  'chunk_token_count': 104.0},
 {'page_number': -6,
  'sentence_chunk': 'Manager Commissioning: P. Sangeetha Consultant Editor: Dr Vallika Devi Katragadda Production Editor: Pooja Chauhan Asstt Manager Manufacturing: Sumit Johry Copyright © 2015 by Wolters Kluwer Health (India) 10th Floor, Tower C Building No.10 Phase – II DLF Cyber City Gurgaon Haryana - 122002 All rights reserved.This book is protected by copyright.No part of this book may be reproduced in any form or by any mean

In [18]:
random.sample(pages_and_chunks_over_min_token_len, k=1)

[{'page_number': 317,
  'sentence_chunk': 'The Abnormal Puerperium 317 Pulmonary embolism Pulmonary embolism is associated with high maternal mortality.It may follow deep vein thrombosis of the leg or can occur without any prior symptoms.Dyspnea, cough, and chest pain are the usual symptoms. Diagnosis requires a high index of suspicion. Chest X-ray, arterial blood gas analysis, ven- tilation/perfusion scan, and MRI are helpful. Treatment is by immediate anticoagulation. Intracaval filters may be inserted into the inferior vena cava to prevent recurrent emboli reaching the lungs from the legs or pelvis.Postpartum neuropathy Neuropathies occur in the postpartum period due to traction, compression, or vascular injury. The incidence is less in modern obstetric prac- tice.The weakness and paralysis are usually transient; most women recover within 72 hours.',
  'chunk_char_count': 820,
  'chunk_word_count': 119,
  'chunk_token_count': 205.0}]

In [19]:
#embedding the text chunks

from sentence_transformers import SentenceTransformer
embedding_model = SentenceTransformer(model_name_or_path='all-mpnet-base-v2', device='cpu')

#test with list of sentences
sentences =['The sentence transformer library provides easy way to create embeddingd.',
            'Sentences can be embedded one by one or in a list.',
            'I like Tokyo!']

embeddings = embedding_model.encode(sentences)
embeddings_dict = dict(zip(sentences, embeddings))

for sentence, embedding in embeddings_dict.items():
    print(f'Sentence: {sentence}')
    print(f'Embedding: {embedding}')
    print("")

Sentence: The sentence transformer library provides easy way to create embeddingd.
Embedding: [-2.75681205e-02  1.03441179e-02 -1.11789536e-02  7.39674270e-02
 -9.52574424e-03 -8.96938052e-03  2.86497851e-03 -6.45970553e-02
  2.01922990e-02 -2.65798662e-02  3.72373313e-02  6.21617623e-02
 -3.32638733e-02  8.78149830e-03  3.85674424e-02 -4.69840206e-02
  5.06647900e-02  1.50193032e-02 -5.09907212e-03 -1.26787776e-03
  3.59600931e-02  3.18777636e-02  2.23470479e-02  2.88969669e-02
 -2.01448370e-02 -4.15904261e-03 -3.98050109e-03 -4.45906855e-02
  6.35564178e-02 -7.89518375e-03 -2.10392401e-02 -1.21932030e-02
  6.21878617e-02 -6.09315978e-03  1.05285665e-06  1.17947450e-02
 -5.27662449e-02 -5.94704458e-03  3.20126638e-02  4.14350303e-03
  5.34314662e-02 -4.53464277e-02  7.83289410e-03  5.14237136e-02
 -3.97130400e-02 -1.21788522e-02  4.98625487e-02  1.77571680e-02
  8.35436210e-02  2.56222710e-02 -2.06516571e-02 -4.30853590e-02
 -9.25287080e-04 -1.01864040e-02 -4.08064313e-02  2.39221789e

In [20]:
embeddings[0].shape

(768,)

In [21]:
embedding_model.to('cuda')

#embed each chunk one by one
for item in tqdm ( pages_and_chunks_over_min_token_len):
    item['embedding'] = embedding_model.encode(item['sentence_chunk'])

  0%|          | 0/2321 [00:00<?, ?it/s]

In [22]:
text_chunks = [item['sentence_chunk'] for item in pages_and_chunks_over_min_token_len ]
text_chunks[419]

'166 Essentials of Obstetrics noticed that serum levels of a few analytes were at different levels in mothers carrying fetuses with Down syndrome when compared with those in the rest of the population.These differences are now used to screen for Down syndrome, trisomy 18 and trisomy 13.The analytes used for screening for Down syn- drome, trisomy 18, and trisomy 13 include the following: • First trimester \x03– E human chorionic gonadotropin (E hCG) – Pregnancy-associated plasma protein A (PAPP-A) • Second trimester – Unconjugated estriol (uE3) – Alpha fetoprotein (AFP) \x03– E hCG – Inhibin A The concentration of each serum marker is expressed as a multiple of the median (MoM) for unaffected pregnancies of the same gestational age.The serum marker is plotted on a graph, and whether it is higher or lower than the MoM of an unaffected pregnancy is calculated.In the first trimester, the level of E\x03hCG is ele- vated and that of PAPP-A is decreased in Down syndrome, but both are decrease

In [23]:
#embed all texts in batches

text_chunk_embeddings = embedding_model.encode(text_chunks, batch_size = 32, convert_to_tensor=True)
text_chunk_embeddings

tensor([[ 0.0044, -0.0619, -0.0407,  ...,  0.0496, -0.0433,  0.0100],
        [ 0.0238, -0.0939, -0.0238,  ...,  0.0238, -0.0796, -0.0013],
        [ 0.0295, -0.0710,  0.0020,  ...,  0.0324, -0.0912, -0.0615],
        ...,
        [ 0.0186, -0.0429, -0.0286,  ...,  0.0353, -0.0520, -0.0138],
        [-0.0084, -0.0046, -0.0285,  ...,  0.0587,  0.0112,  0.0164],
        [ 0.0099,  0.0038, -0.0281,  ...,  0.0140, -0.0102,  0.0551]],
       device='cuda:0')

In [24]:
#save the embeddings into a file

text_chunk_and_embeddings_df = pd.DataFrame(pages_and_chunks_over_min_token_len)
embedding_df_save_path = 'text_chunk_and_embeddings_df.csv'
text_chunk_and_embeddings_df.to_csv(embedding_df_save_path,index=False)

In [25]:
text_chunk_and_embeddings_df_load = pd.read_csv(embedding_df_save_path)
text_chunk_and_embeddings_df_load.head()

Unnamed: 0,page_number,sentence_chunk,chunk_char_count,chunk_word_count,chunk_token_count,embedding
0,-7,"Essentials of Obstetrics Dr Lakshmi Seshadri, ...",416,56,104.0,[ 4.36814316e-03 -6.18959144e-02 -4.07023057e-...
1,-6,Manager Commissioning: P. Sangeetha Consultant...,1568,219,392.0,[ 2.38161758e-02 -9.38664898e-02 -2.38494780e-...
2,-6,Care has been taken to confirm the accuracy of...,1370,187,342.5,[ 2.94985473e-02 -7.10000321e-02 1.95628474e-...
3,-5,A medical student is on a journey of discovery...,1143,180,285.75,[ 3.36700045e-02 -8.12965780e-02 1.46308048e-...
4,-5,Information is easily assimilated only when it...,1183,188,295.75,[ 2.23075021e-02 -9.31098014e-02 2.45771240e-...


In [26]:
#Rag Search and Retrieve answer - Similarity search
#Embeddings can be images, text, sound , etc

In [27]:
import random, torch
import numpy as np
import pandas as pd

device = 'cuda' if torch.cuda.is_available() else 'cpu'

#import text and embeddings df
text_chunk_and_embeddings_df = pd.read_csv('text_chunk_and_embeddings_df.csv')

# Convert embedding column back to np.array
text_chunk_and_embeddings_df['embedding']=text_chunk_and_embeddings_df['embedding'].apply(lambda x:np.fromstring(x.strip('[]'),sep=' '))

#convert embedding into torch.tensor
embeddings = torch.tensor(np.stack(text_chunk_and_embeddings_df['embedding'].tolist(),axis=0), dtype=torch.float32).to(device='cuda')

#convert text and embeddin df to list of dicts
pages_and_chunks = text_chunk_and_embeddings_df.to_dict(orient='records')



In [28]:
embeddings.shape

torch.Size([2321, 768])

Embedding Model

Note: to use dot product for comparision , ensure vector sized are of same shape and tensor are in same data type and device


In [29]:
#Embedding Model
from sentence_transformers import util, SentenceTransformer
embedding_model = SentenceTransformer(model_name_or_path = 'all-mpnet-base-v2',device=device)

In [30]:
embeddings[0].dtype

torch.float32

In [31]:
#Define query
query = 'Puerperal thromboembolism'

#embed query using the same model
query_embedding = embedding_model.encode(query, convert_to_tensor=True).to('cuda')

#Get similarity score with the dot product (use cosine similarity if models aren't normalized)
dot_scores = util.dot_score(a=query_embedding, b=embeddings)[0]

# get the top-k results
top_results_dot_product = torch.topk(dot_scores,k=5)
top_results_dot_product

torch.return_types.topk(
values=tensor([0.6298, 0.6275, 0.6223, 0.6056, 0.6010], device='cuda:0'),
indices=tensor([ 815,  802,  805, 2122, 2100], device='cuda:0'))

In [32]:
pages_and_chunks[815]

{'page_number': 319,
 'sentence_chunk': 'What is puerperium?Describe the complications of puerperium and their management.Short-answer questions 1.Secondary postpartum hemorrhage 2.Prevention of puerperal sepsis 3.Predisposing causes for puerperal sepsis 4.Septic pelvic thrombophlebitis 5.Deep vein thrombosis. Self-Assessment Key points Continued • Treatment is by antibiotics to cover polymicrobial infection.Cultures are not necessary.',
 'chunk_char_count': 398,
 'chunk_word_count': 45,
 'chunk_token_count': 99.5,
 'embedding': array([ 1.98454522e-02, -8.08149204e-03,  6.16196031e-03, -3.48575749e-02,
        -3.86560969e-02,  2.77512316e-02,  1.49132675e-02, -2.74066534e-02,
        -4.45624292e-02, -2.72775088e-02,  2.49636490e-02, -2.36333236e-02,
        -5.82235772e-03,  7.55929872e-02,  1.10855885e-03,  1.47300139e-02,
        -2.50956044e-02,  7.75746396e-03,  1.91731583e-02,  1.12128789e-02,
         2.79571814e-03, -7.84106459e-03, -5.89657202e-02, -2.03975234e-02,
        -2

In [33]:
larger_embeddings = torch.randn(500*embeddings.shape[0], 768).to(device)

dot_scores = util.dot_score(a=query_embedding, b=larger_embeddings)[0]

For 10M+ embeddings , we can use indexing using Faiss

In [34]:
import textwrap

def print_wrapped(text, wrap_length = 80):
    wrapped_text = textwrap.fill(text, wrap_length)
    print(wrapped_text)
    
for score, idx in zip(top_results_dot_product[0], top_results_dot_product[1]):
    print(f'Score: {score:.4f}')
    print("text:")
    print(pages_and_chunks[idx]['sentence_chunk'])
    print(f'Page number: {pages_and_chunks[idx]['page_number']}')
    print("\n")

Score: 0.6298
text:
What is puerperium?Describe the complications of puerperium and their management.Short-answer questions 1.Secondary postpartum hemorrhage 2.Prevention of puerperal sepsis 3.Predisposing causes for puerperal sepsis 4.Septic pelvic thrombophlebitis 5.Deep vein thrombosis. Self-Assessment Key points Continued • Treatment is by antibiotics to cover polymicrobial infection.Cultures are not necessary.
Page number: 319


Score: 0.6275
text:
Breast engorgement, mastitis, and breast abscess These are common causes of puerperal pyrexia and are discussed in Chapter 25, Lactation and breastfeeding.espiratory infections Respiratory infections are usually seen after cesarean section, especially if general anesthesia has been used.Stasis and aspiration are the lead- ing causes of pneumonitis. Differential diagnosis of puerperal pyrexia Arriving at the right diagnosis requires a careful history, physical examination, and appropriate investigations (Table 22.2).Fever, chills, malais

We can improve the order of these results with a reranking model. A model that has been trained specifically to take search results

For text similarity , we use cosine similarity

In [35]:
import torch

def dot_product(vector1,vector2):
    return torch.dot(vector1,vector2)

def cosine_sim(vector1, vector2):
    dot_product = torch.dot(vector1,vector2)
    
    #get euclidean/L2 norm
    norm_vector1 = torch.sqrt(torch.sum(vector1**2))
    norm_vector2 = torch.sqrt(torch.sum(vector2**2))
    
    return dot_product/ (norm_vector1 * norm_vector2)

In [36]:
def retrieve_relevant_resources(query:str,
                                embeddings: torch.tensor,
                                model: SentenceTransformer=embedding_model,
                                n_resources_to_return: int=5,
                                print_time: bool=True):
    #embed query
    query_embedding = model.encode(query, convert_to_tensor=True)
    
    #get dot product
    dot_scores = util.dot_score(query_embedding, embeddings)[0]
    
    scores, indices = torch.topk(input=dot_scores, k=n_resources_to_return)
    
    return scores, indices

def print_top_results_and_scores(query: str,
                                 embeddings: torch.tensor,
                                 pages_and_chunks: list[dict]=pages_and_chunks,
                                 n_resources_to_return:int=5):
    scores, indices = retrieve_relevant_resources(query=query,
                                                  embeddings=embeddings,
                                                  n_resources_to_return=n_resources_to_return)
    
    scores = scores.view(-1).tolist()
    indices = indices.view(-1).tolist()
    print(scores, indices)
    for score, idx in zip(scores, indices):
        print(f"Score: {score:.4f}")
        print("text")
        print_wrapped(pages_and_chunks[idx]['sentence_chunk'])
        print(f'Page Number: {pages_and_chunks[idx]['page_number']}')
        print("\n")

In [37]:
query='psychotic behavior in the postnatal period.'
#retrieve_relevant_resources(query=query, embeddings=embeddings)
print_top_results_and_scores(query=query, embeddings=embeddings)

[0.5791202783584595, 0.5307761430740356, 0.4602738320827484, 0.4580863118171692, 0.4565398097038269] [809, 808, 1386, 1400, 1399]
Score: 0.5791
text
Insomnia, low energy level, loss of appetite and weight, anger, feeling of being
overwhelmed or inadequate, and obsessional thoughts are the usual symptoms.
Treatment is by supportive therapy and antide- pressants.Selective serotonin
reuptake inhibitors are the drugs of choice.Postpartum psychosis Some women
develop delusions, hallucinations, and psychotic behavior in the postnatal
period. Usually, there is a past history of schizophrenia or bipolar
disorder.The condition is uncom- mon but can recur in a subsequent pregnancy.
Hospitalization, antipsychotic therapy, and occa- sionally electroconvulsive
therapy are required. Mental health issues in the puerperium are summarized in
Table 22.3.Box 22.15 Predisposing factors for mental health problems • History
of depression Ŧ Family history Ŧ Past history • Stressful environment Ŧ At home
Ŧ At

Getting LLM
#LLM depends on the VRAM available

In [38]:
!nvidia-smi


Wed Jun 11 21:44:55 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 576.52                 Driver Version: 576.52         CUDA Version: 12.9     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                  Driver-Model | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  NVIDIA GeForce RTX 3060 ...  WDDM  |   00000000:01:00.0 Off |                  N/A |
| N/A   63C    P0             46W /   65W |    4969MiB /   6144MiB |     94%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

Loading Model loading

the LLM token generation can be fastened using flash-attn

In [12]:
from transformers import AutoTokenizer, AutoModelForCausalLM


# Log in using your token (make sure it's kept secure)
token="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"

# Load tokenizer and model (no need for use_auth_token anymore)
tokenizer = AutoTokenizer.from_pretrained("google/gemma-2b-it", token= token)
model = AutoModelForCausalLM.from_pretrained("google/gemma-2b-it", token=token, torch_dtype=torch.bfloat16)


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

In [18]:
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM
from transformers.utils import is_flash_attn_2_available 

# 1. Create quantization config for smaller model loading (optional)

from transformers import BitsAndBytesConfig
quantization_config = BitsAndBytesConfig(load_in_4bit=True,
                                         bnb_4bit_compute_dtype=torch.float16)

# Bonus: Setup Flash Attention 2 for faster inference, default to "sdpa" or "scaled dot product attention" if it's not available

if (is_flash_attn_2_available()) and (torch.cuda.get_device_capability(0)[0] >= 8):
  attn_implementation = "flash_attention_2"
else:
  attn_implementation = "sdpa"
print(f"[INFO] Using attention implementation: {attn_implementation}")


model_id = 'google/gemma-2b-it'
print(f"[INFO] Using model_id: {model_id}")

# 3. Instantiate tokenizer (tokenizer turns text into numbers ready for the model) 
tokenizer = AutoTokenizer.from_pretrained(pretrained_model_name_or_path=model_id)

# 4. Instantiate the model
llm_model = AutoModelForCausalLM.from_pretrained(pretrained_model_name_or_path=model_id, 
                                                 torch_dtype=torch.float16, # datatype to use, we want float16
                                                 quantization_config=quantization_config if quantization_config else None,
                                                 low_cpu_mem_usage=False, # use full memory 
                                                 attn_implementation=attn_implementation) # which attention version to use

if not quantization_config: # quantization takes care of device setting automatically, so if it's not used, send model to GPU 
    llm_model.to("cuda")

[INFO] Using attention implementation: sdpa
[INFO] Using model_id: google/gemma-2b-it


Loading checkpoint shards:   0%|          | 0/2 [00:00<?, ?it/s]

In [19]:
llm_model


GemmaForCausalLM(
  (model): GemmaModel(
    (embed_tokens): Embedding(256000, 2048, padding_idx=0)
    (layers): ModuleList(
      (0-17): 18 x GemmaDecoderLayer(
        (self_attn): GemmaAttention(
          (q_proj): Linear4bit(in_features=2048, out_features=2048, bias=False)
          (k_proj): Linear4bit(in_features=2048, out_features=256, bias=False)
          (v_proj): Linear4bit(in_features=2048, out_features=256, bias=False)
          (o_proj): Linear4bit(in_features=2048, out_features=2048, bias=False)
        )
        (mlp): GemmaMLP(
          (gate_proj): Linear4bit(in_features=2048, out_features=16384, bias=False)
          (up_proj): Linear4bit(in_features=2048, out_features=16384, bias=False)
          (down_proj): Linear4bit(in_features=16384, out_features=2048, bias=False)
          (act_fn): GELUActivation()
        )
        (input_layernorm): GemmaRMSNorm((2048,), eps=1e-06)
        (post_attention_layernorm): GemmaRMSNorm((2048,), eps=1e-06)
      )
    )
    (n

Generate text with LLM

In [20]:
input_text = "psychotic behavior in the postnatal period."

#create prompt template
dialogue_template = [
    {"role":"user",
     "content":input_text}
]

#appy the chat template
prompt = tokenizer.apply_chat_template(conversation=dialogue_template,
                                       tokenize=False,
                                       add_generation_prompt=True)
print("prompt", prompt)

prompt <bos><start_of_turn>user
psychotic behavior in the postnatal period.<end_of_turn>
<start_of_turn>model



In [21]:
#tokenize the input text and send to GPU
input_ids = tokenizer(prompt,
                      return_tensors = 'pt').to('cuda')

#generate output from  local LLM
outputs = llm_model.generate(**input_ids,
                             max_new_tokens = 256)
print(outputs[0])

tensor([     2,      2,    106,   1645,    108,  46142,    802,   8409,    575,
           573, 204995,   4037, 235265,    107,    108,    106,   2516,    108,
           688,  46342,    802,   8409,    575,    573, 204995,   4037,    688,
           109,    688,  15085,  66058,    109,  46342,    802,   8409,  20604,
           577,    476,   3001,    576,  36796,    674,    708,  26050,    731,
           476,   8863,    576,   3127,  52722, 235269,    476,  72272,    604,
          6364,    578,   3127,  46076, 235269,    578,    476,   8863,    576,
         65654, 235265,   3766,  36796,    798,  17361,    575,   4282,   5742,
        235269,   3359,  53762, 235269,   2011, 235290, 115367, 235269,   4682,
        171540, 235269, 113122, 235269,    578,  15053,  57728,    675,   3588,
        235265,    109,    688,  54020,  66058,    109, 235287,   5231,  85890,
          7549,  66058,   5704,  15904,    674,  22033,   7549,   1554,    476,
          4937,   4731,    575,    573, 

In [22]:
#tokens to text
outputs_decoded = tokenizer.decode(outputs[0])
print(outputs_decoded)

<bos><bos><start_of_turn>user
psychotic behavior in the postnatal period.<end_of_turn>
<start_of_turn>model
**Psychotic behavior in the postnatal period**

**Definition:**

Psychotic behavior refers to a range of behaviors that are characterized by a lack of social competence, a disregard for rules and social norms, and a lack of empathy. These behaviors can manifest in various ways, including aggression, self-harm, tantrums, defiance, and difficulty interacting with others.

**Causes:**

* **Genetic factors:** Research suggests that genetic factors play a significant role in the development of psychotic behavior.
* **Environmental factors:** Trauma, abuse, neglect, and other stressful experiences in the postnatal period can increase the risk of developing psychotic symptoms.
* **Neurodevelopmental disorders:** Conditions such as autism spectrum disorder, attention-deficit hyperactivity disorder (ADHD), and fetal alcohol syndrome can increase the likelihood of psychotic behavior.
* **S