**Installing the Haystack library**

In [None]:
%%bash

pip install --upgrade pip
pip install "farm-haystack[colab,inference]"





**Creating a data folder and copying files from Google Drive**

---



In [None]:
import os
from google.colab import drive

# Mount Google Drive
drive.mount('/content/gdrive')

# Create a folder named 'data' in Google Colab
folder_name = "data"
folder_path = os.path.join('/content', folder_name)

# exist_ok=True makes this a no-op when the folder is already there
os.makedirs(folder_path, exist_ok=True)

print(f"The '{folder_name}' folder has been created at '{folder_path}'.")

# Specify the file names in your Google Drive
file1_name = "train.json"
file2_name = "test.json"

# Specify the paths to the files in your Google Drive
file1_drive_path = "/content/gdrive/MyDrive/" + file1_name
file2_drive_path = "/content/gdrive/MyDrive/" + file2_name

# Specify the paths to copy the files in Google Colab
file1_colab_path = os.path.join(folder_path, file1_name)
file2_colab_path = os.path.join(folder_path, file2_name)

# Copy the files from Google Drive to Google Colab
!cp "$file1_drive_path" "$file1_colab_path"
!cp "$file2_drive_path" "$file2_colab_path"

print(f"The files '{file1_name}' and '{file2_name}' have been copied to '{folder_path}'.")


Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).
The 'data' folder has been created at '/content/data'.
The files 'train.json' and 'test.json' have been copied to '/content/data'.
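
`FARMReader.train` reads SQuAD-style JSON, so it can help to sanity-check the copied files before training. The sketch below assumes the usual SQuAD layout (`data` → `paragraphs` → `qas` → `answers`) and only verifies that each `answer_start` offset really points at the answer text. The helper name `check_squad_dict` and the one-question sample are hypothetical and are not part of the original notebook.

```python
def check_squad_dict(squad: dict) -> list:
    """Return a list of problems found in a SQuAD-style dictionary."""
    problems = []
    for article in squad.get("data", []):
        for paragraph in article.get("paragraphs", []):
            context = paragraph["context"]
            for qa in paragraph.get("qas", []):
                for answer in qa.get("answers", []):
                    start = answer["answer_start"]
                    text = answer["text"]
                    # The answer must appear verbatim at the stated offset
                    if context[start:start + len(text)] != text:
                        problems.append(
                            f"{qa.get('id', '?')}: offset {start} does not match {text!r}"
                        )
    return problems


# Hypothetical one-question sample in the assumed layout
sample_context = "This is a 30-year-old female, who has been overweight for many years."
sample = {
    "data": [{
        "title": "note-1",
        "paragraphs": [{
            "context": sample_context,
            "qas": [{
                "id": "q1",
                "question": "What is the gender of the patient?",
                "is_impossible": False,
                "answers": [{
                    "text": "female",
                    "answer_start": sample_context.index("female"),
                }],
            }],
        }],
    }]
}

print(check_squad_dict(sample))
```

To check the real files, load them with `json.load(open("data/train.json"))` and pass the result to `check_squad_dict`; an empty list means every offset lines up.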


In [None]:
from haystack.telemetry import tutorial_running

tutorial_running(2)


**Model to use**

In [None]:
model_name = "dmis-lab/biobert-large-cased-v1.1-squad"

**Importing FARMReader and loading the model**

In [None]:
from haystack.nodes import FARMReader

reader = FARMReader(model_name_or_path=model_name, use_gpu=True)

The secret `HF_TOKEN` does not exist in your Colab secrets.
To authenticate with the Hugging Face Hub, create a token in your settings tab (https://huggingface.co/settings/tokens), set it as secret in your Google Colab and restart your session.
You will be able to reuse this secret in all of your notebooks.
Please note that authentication is recommended but still optional to access public models or datasets.


In [None]:
data_dir = "data/"

**Training Model**

In [None]:
reader.train(data_dir=data_dir, train_filename="train.json", use_gpu=True, n_epochs=4, save_dir="my_model")

Preprocessing dataset: 100%|██████████| 1/1 [00:00<00:00, 12.90 Dicts/s]
Train epoch 0/3 (Cur. train loss: 0.0000):   0%|          | 0/31 [00:00<?, ?it/s]

In [None]:
# Reload the fine-tuned reader from the directory written by reader.train(save_dir=...)
model = FARMReader(model_name_or_path="my_model")



**Evaluating the model**

In [None]:

reader_eval_results = model.eval_on_file("data/", "test.json", device="cuda")

- instead of giving you full control over which labels to use, this method always returns three types of metrics: combined (no suffix), text_answer ('_text_answer' suffix) and no_answer ('_no_answer' suffix) metrics.
- instead of comparing predictions with labels on a string level, this method compares them on a token-ID level. This makes it unable to do any string normalization (e.g. normalize whitespaces) beforehand.
Hence, results might slightly differ from those of `Pipeline.eval()`.
If you are just about starting to evaluate your model consider using `Pipeline.eval()` instead.
Preprocessing dataset: 100%|██████████| 1/1 [00:00<00:00, 34.52 Dicts/s]
Evaluating: 100%|██████████| 2/2 [00:03<00:00,  1.76s/it]
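
The warning above contrasts token-ID comparison with string-level comparison. As a rough illustration of the string-level side, here is a minimal sketch of SQuAD-style exact match and F1 with answer normalization (lowercasing, stripping punctuation and English articles, collapsing whitespace). It follows the widely used SQuAD evaluation recipe and is not Haystack's implementation; all function names are chosen for this example.

```python
import re
import string
from collections import Counter


def normalize_answer(text: str) -> str:
    """Lowercase, drop punctuation and articles, and collapse whitespace."""
    text = text.lower()
    text = "".join(ch for ch in text if ch not in string.punctuation)
    text = re.sub(r"\b(a|an|the)\b", " ", text)
    return " ".join(text.split())


def exact_match(prediction: str, truth: str) -> float:
    """1.0 if the normalized strings are identical, else 0.0."""
    return float(normalize_answer(prediction) == normalize_answer(truth))


def f1_score(prediction: str, truth: str) -> float:
    """Token-overlap F1 between the normalized prediction and truth."""
    pred_tokens = normalize_answer(prediction).split()
    truth_tokens = normalize_answer(truth).split()
    if not pred_tokens or not truth_tokens:
        # Both empty counts as a match (the no-answer case)
        return float(pred_tokens == truth_tokens)
    overlap = sum((Counter(pred_tokens) & Counter(truth_tokens)).values())
    if overlap == 0:
        return 0.0
    precision = overlap / len(pred_tokens)
    recall = overlap / len(truth_tokens)
    return 2 * precision * recall / (precision + recall)


print(exact_match("  The Female ", "female"))
print(f1_score("30-year-old female", "female"))
```

Because normalization happens before comparison, whitespace or casing differences do not count as errors here, which is one reason string-level scores can differ slightly from the token-ID scores reported by `eval_on_file`.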


**Scores**

In [None]:
reader_eval_results

{'EM': 50.0,
 'f1': 59.635416666666664,
 'top_n_accuracy': 81.25,
 'top_n': 4,
 'EM_text_answer': 33.33333333333333,
 'f1_text_answer': 46.18055555555556,
 'top_n_accuracy_text_answer': 75.0,
 'top_n_EM_text_answer': 58.333333333333336,
 'top_n_f1_text_answer': 71.18055555555554,
 'Total_text_answer': 12,
 'EM_no_answer': 100.0,
 'f1_no_answer': 100.0,
 'top_n_accuracy_no_answer': 100.0,
 'Total_no_answer': 4}
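
The combined scores can be cross-checked against the per-type scores. In this output they are consistent with a sample-weighted average over the 12 text-answer and 4 no-answer questions. The sketch below only verifies that arithmetic on the numbers printed above; it is an observation about this result, not a statement of how Haystack computes the metrics internally. The helper name `weighted_average` is hypothetical.

```python
# Values copied from the evaluation output above
results = {
    "EM": 50.0,
    "f1": 59.635416666666664,
    "top_n_accuracy": 81.25,
    "EM_text_answer": 33.33333333333333,
    "f1_text_answer": 46.18055555555556,
    "top_n_accuracy_text_answer": 75.0,
    "Total_text_answer": 12,
    "EM_no_answer": 100.0,
    "f1_no_answer": 100.0,
    "top_n_accuracy_no_answer": 100.0,
    "Total_no_answer": 4,
}


def weighted_average(metric: str, res: dict) -> float:
    """Average the text-answer and no-answer scores, weighted by sample count."""
    n_text = res["Total_text_answer"]
    n_none = res["Total_no_answer"]
    total = res[f"{metric}_text_answer"] * n_text + res[f"{metric}_no_answer"] * n_none
    return total / (n_text + n_none)


for metric in ("EM", "f1", "top_n_accuracy"):
    print(metric, round(weighted_average(metric, results), 4), round(results[metric], 4))
```

Read this way, the combined EM of 50.0 is lifted by the four no-answer questions, all of which are scored correct; on the 12 questions that have a text answer, EM is about 33.3.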

**Question Answering**

In [None]:
context = """PREOPERATIVE DIAGNOSIS: , Morbid obesity.,POSTOPERATIVE DIAGNOSIS:  ,Morbid obesity.,PROCEDURE: , Laparoscopic antecolic antegastric Roux-en-Y gastric bypass with EEA anastomosis.,ANESTHESIA: , General with endotracheal intubation.,INDICATION FOR PROCEDURE: , This is a 30-year-old female, who has been overweight for many years.  She has tried many different diets, but is unsuccessful.  She has been to our Bariatric Surgery Seminar, received some handouts, and signed the consent.  The risks and benefits of the procedure have been explained to the patient.,PROCEDURE IN DETAIL:  ,The patient was taken to the operating room and placed supine on the operating room table.  All pressure points were carefully padded.  She was given general anesthesia with endotracheal intubation.  SCD stockings were placed on both legs.  Foley catheter was placed for bladder decompression.  The abdomen was then prepped and draped in standard sterile surgical fashion.  Marcaine was then injected through umbilicus.  A small incision was made.  A Veress needle was introduced into the abdomen.  CO2 insufflation was done to a maximum pressure of 15 mmHg.  A 12-mm VersaStep port was placed through the umbilicus.  I then placed a 5-mm port just anterior to the midaxillary line and just subcostal on the right side.  I placed another 5-mm port in the midclavicular line just subcostal on the right side, a few centimeters below and medial to that, I placed a 12-mm VersaStep port.  On the left side, just anterior to the midaxillary line and just subcostal, I placed a 5-mm port.  A few centimeters below and medial to that, I placed a 15-mm port.  I began by lifting up the omentum and identifying the transverse colon and lifting that up and thereby identifying my ligament of Treitz.  I ran the small bowel down approximately 40 cm and divided the small bowel with a white load GIA stapler.  I then divided the mesentery all the way down to the base of the mesentery with a LigaSure device.  
I then ran the distal bowel down, approximately 100 cm, and at 100 cm, I made a hole at the antimesenteric portion of the Roux limb and a hole in the antimesenteric portion of the duodenogastric limb, and I passed a 45 white load stapler and fired a stapler creating a side-to-side anastomosis.  I reapproximated the edges of the defect.  I lifted it up and stapled across it with another white load stapler.  I then closed the mesenteric defect with interrupted Surgidac sutures.  I divided the omentum all the way down to the colon in order to create a passageway for my small bowel to go antecolic.  I then put the patient in reverse Trendelenburg.  I placed a liver retractor, identified, and dissected the angle of His.  I then dissected on the lesser curve, approximately 2.5 cm below the gastroesophageal junction, and got into a lesser space.  I fired transversely across the stomach with a 45 blue load stapler.  I then used two fires of the 60 blue load with SeamGuard to go up into my angle of His, thereby creating my gastric pouch.  I then made a hole at the base of the gastric pouch and had Anesthesia remove the bougie and place the OG tube connected to the anvil.  I pulled the anvil into place, and I then opened up my 15-mm port site and passed my EEA stapler.  I passed that in the end of my Roux limb and had the spike come out antimesenteric.  I joined the spike with the anvil and fired a stapler creating an end-to-side anastomosis, then divided across the redundant portion of my Roux limb with a white load GI stapler, and removed it with an Endocatch bag.  I put some additional 2-0 Vicryl sutures in the anastomosis for further security.  I then placed a bowel clamp across the bowel.  I went above and passed an EGD scope into the mouth down to the esophagus and into the gastric pouch.  I distended gastric pouch with air.  There was no air leak seen.  I could pass the scope easily through the anastomosis.  There was no bleeding seen through the scope.  
We closed the 15-mm port site with interrupted 0 Vicryl suture utilizing Carter-Thomason.  I copiously irrigated out that incision with about 2 L of saline.  I then closed the skin of all incisions with running Monocryl.  Sponge, instrument, and needle counts were correct at the end of the case.  The patient tolerated the procedure well without any complications."""
ques = 'Does the patient have any complaints?'
ans = model.predict_on_texts(ques, [context])
ans['answers']

Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.60 Batches/s]


[<Answer {'answer': 'overweight', 'type': 'extractive', 'score': 0.9938992261886597, 'context': 'NDICATION FOR PROCEDURE: , This is a 30-year-old female, who has been overweight for many years.  She has tried many different diets, but is unsuccess', 'offsets_in_document': [{'start': 303, 'end': 313}], 'offsets_in_context': [{'start': 70, 'end': 80}], 'document_ids': ['78d4422fa129d25a94d7ed4250af016d'], 'meta': {}}>,
 <Answer {'answer': 'reverse Trendelenburg', 'type': 'extractive', 'score': 1.990109922189731e-05, 'context': 'y for my small bowel to go antecolic.  I then put the patient in reverse Trendelenburg.  I placed a liver retractor, identified, and dissected the ang', 'offsets_in_document': [{'start': 2612, 'end': 2633}], 'offsets_in_context': [{'start': 65, 'end': 86}], 'document_ids': ['78d4422fa129d25a94d7ed4250af016d'], 'meta': {}}>,
 <Answer {'answer': 'bladder decompression.  The abdomen was then prepped and draped in standard sterile surgical fashion.', 'type': 'extracti

In [None]:
ques2 = 'What is the gender of the Patient'
ans = model.predict_on_texts(ques2, [context])
ans['answers']

Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.62 Batches/s]


[<Answer {'answer': 'female', 'type': 'extractive', 'score': 0.9382225275039673, 'context': 'otracheal intubation.,INDICATION FOR PROCEDURE: , This is a 30-year-old female, who has been overweight for many years.  She has tried many different ', 'offsets_in_document': [{'start': 282, 'end': 288}], 'offsets_in_context': [{'start': 72, 'end': 78}], 'document_ids': ['78d4422fa129d25a94d7ed4250af016d'], 'meta': {}}>,
 <Answer {'answer': '.', 'type': 'extractive', 'score': 7.0621713348373305e-06, 'context': 'essure of 15 mmHg.  A 12-mm VersaStep port was placed through the umbilicus.  I then placed a 5-mm port just anterior to the midaxillary line and just', 'offsets_in_document': [{'start': 1199, 'end': 1200}], 'offsets_in_context': [{'start': 75, 'end': 76}], 'document_ids': ['78d4422fa129d25a94d7ed4250af016d'], 'meta': {}}>,
 <Answer {'answer': '-', 'type': 'extractive', 'score': 5.905879334022757e-06, 'context': '.  I joined the spike with the anvil and fired a stapler creating an end-t
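
Only the first answer in each list carries a meaningful score; the rest are near zero and include punctuation-only spans such as `'.'` and `'-'`. A small filter can drop those before the answers are shown. The sketch below assumes each answer object exposes `.answer` and `.score` attributes, as the output above suggests; the `SimpleNamespace` objects are stand-ins so the example runs without Haystack, and the 0.5 threshold is an arbitrary illustrative choice.

```python
from types import SimpleNamespace


def confident_answers(answers, min_score=0.5):
    """Keep answers scoring at least min_score that contain a letter or digit."""
    return [
        a for a in answers
        if a.score >= min_score and any(ch.isalnum() for ch in a.answer)
    ]


# Stand-ins mirroring the answers and scores printed above
answers = [
    SimpleNamespace(answer="female", score=0.9382225275039673),
    SimpleNamespace(answer=".", score=7.0621713348373305e-06),
    SimpleNamespace(answer="-", score=5.905879334022757e-06),
]

for a in confident_answers(answers):
    print(a.answer, round(a.score, 4))
```

In the notebook the same helper could be applied as `confident_answers(ans['answers'])`. A fixed threshold is a blunt tool, so it is worth tuning against the test set rather than trusting 0.5 as given.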

**Making our Pipeline**

In [None]:

from haystack import Pipeline, Document
from haystack.utils import print_answers

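# A single-node pipeline: the fine-tuned reader receives the query directly
p = Pipeline()
p.add_node(component=model, name="Reader", inputs=["Query"])
# There is no retriever, so the document is passed in explicitly
res = p.run(
    query=ques2, documents=[Document(content=context)]
)
print_answers(res, details="medium")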

Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.59 Batches/s]

'Query: What is the gender of the Patient'
'Answers:'
[   {   'answer': 'female',
        'context': 'otracheal intubation.,INDICATION FOR PROCEDURE: , This is '
                   'a 30-year-old female, who has been overweight for many '
                   'years.  She has tried many different ',
        'score': 0.9382225275039673},
    {   'answer': '.',
        'context': 'essure of 15 mmHg.  A 12-mm VersaStep port was placed '
                   'through the umbilicus.  I then placed a 5-mm port just '
                   'anterior to the midaxillary line and just',
        'score': 7.0621713348373305e-06},
    {   'answer': '-',
        'context': '.  I joined the spike with the anvil and fired a stapler '
                   'creating an end-to-side anastomosis, then divided across '
                   'the redundant portion of my Roux lim',
        'score': 5.905879334022757e-06}]



