# Physician Notetaker

Link to Assessment: https://sunset-parrot-38b.notion.site/Physician-Notetaker-1ac7fca3d41680849078deb25228b34f

## Task 1: Medical NLP Summarization

**Task:** Implement an NLP pipeline to **extract medical details** from the transcribed conversation.

### **📍 Deliverables:**

1. **Named Entity Recognition (NER):** Extract **Symptoms, Treatment, Diagnosis, Prognosis** using `spaCy` or `transformers`.
2. **Text Summarization:** Convert the transcript into a **structured medical report**.
3. **Keyword Extraction:** Identify **important medical phrases** (e.g., "whiplash injury," "physiotherapy sessions").

**📍 Sample Input (Raw Transcript):**

```
Doctor: How are you feeling today?
Patient: I had a car accident. My neck and back hurt a lot for four weeks.
Doctor: Did you receive treatment?
Patient: Yes, I had ten physiotherapy sessions, and now I only have occasional back pain.

```

**📍 Expected Output (Structured Summary in JSON Format):**

```json
{
  "Patient_Name": "Janet Jones",
  "Symptoms": ["Neck pain", "Back pain", "Head impact"],
  "Diagnosis": "Whiplash injury",
  "Treatment": ["10 physiotherapy sessions", "Painkillers"],
  "Current_Status": "Occasional backache",
  "Prognosis": "Full recovery expected within six months"
}

```

**📍 Questions:**

- How would you handle **ambiguous or missing medical data** in the transcript?
- What **pre-trained NLP models** would you use for medical summarization?
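
One possible answer to the first question, as a minimal sketch rather than a definitive approach: keep every field of the report in the output, fill gaps with an explicit sentinel instead of guessing, and record which fields were absent so that a clinician can review them. The function name `fill_missing_fields`, the sentinel text, and the `Missing_Fields` key are illustrative assumptions; only the field names come from the expected output above.

```python
import json

# Field names taken from the expected JSON output; list-valued fields default to [].
REPORT_FIELDS = {
    "Patient_Name": str,
    "Symptoms": list,
    "Diagnosis": str,
    "Treatment": list,
    "Current_Status": str,
    "Prognosis": str,
}

NOT_MENTIONED = "Not mentioned"  # hypothetical sentinel for absent text fields


def fill_missing_fields(extracted: dict) -> dict:
    """Return a complete report, flagging fields the transcript did not cover."""
    report = {}
    missing = []
    for field, kind in REPORT_FIELDS.items():
        value = extracted.get(field)
        if value:  # non-empty string or non-empty list
            report[field] = value
        else:
            report[field] = [] if kind is list else NOT_MENTIONED
            missing.append(field)
    report["Missing_Fields"] = missing  # hypothetical key, kept for human review
    return report


partial = {
    "Symptoms": ["Neck pain", "Back pain"],
    "Treatment": ["10 physiotherapy sessions"],
    "Current_Status": "Occasional back pain",
}
print(json.dumps(fill_missing_fields(partial), indent=2))
```

Using a visible sentinel keeps the JSON schema stable for downstream consumers and makes it obvious that a value was never stated, rather than inferred.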



## Task 1 Using spaCy

In [1]:
# !pip uninstall spacy
# !pip uninstall en-core-web-sm
!pip install spacy==3.5.3
!python -m spacy download en_core_web_sm

Defaulting to user installation because normal site-packages is not writeable
Collecting spacy==3.5.3
  Downloading spacy-3.5.3-cp311-cp311-win_amd64.whl.metadata (25 kB)
Downloading spacy-3.5.3-cp311-cp311-win_amd64.whl (12.2 MB)

ERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.
en-core-sci-md 0.5.3 requires spacy<3.7.0,>=3.6.1, but you have spacy 3.5.3 which is incompatible.
en-core-sci-sm 0.5.1 requires spacy<3.5.0,>=3.4.1, but you have spacy 3.5.3 which is incompatible.
en-core-web-sm 3.6.0 requires spacy<3.7.0,>=3.6.0, but you have spacy 3.5.3 which is incompatible.
scispacy 0.5.5 requires spacy<3.8.0,>=3.7.0, but you have spacy 3.5.3 which is incompatible.


Defaulting to user installation because normal site-packages is not writeable


Collecting en-core-web-sm==3.5.0
  Downloading https://github.com/explosion/spacy-models/releases/download/en_core_web_sm-3.5.0/en_core_web_sm-3.5.0-py3-none-any.whl (12.8 MB)

In [2]:
import spacy

# Load the small general-purpose English pipeline
nlp = spacy.load('en_core_web_sm')

print(nlp)  # Confirm that the pipeline object was created

<spacy.lang.en.English object at 0x000001BF53048210>


In [3]:
def sanitize_conversation(text: str) -> str:
    """
    Clean the raw conversation text by removing the '*' and '>' symbols.
    """
    text = text.replace("*", "")
    text = text.replace(">", "")
    return text.strip()


# Read explicitly as UTF-8; the platform default encoding can garble
# curly apostrophes and dashes in the transcript.
with open("convo.txt", "r", encoding="utf-8") as f:
    text = f.read()
cleaned_text = sanitize_conversation(text)
print(cleaned_text)

# Save the cleaned text to a file. This only needs to run once
# to generate a tidy conversation file.
with open("cleaned_convo.txt", "w", encoding="utf-8") as f:
    f.write(cleaned_text)

doc = nlp(cleaned_text)
print(doc)

Physician: Good morning, Ms. Jones. How are you feeling today?
 
 
 Patient: Good morning, doctor. I’m doing better, but I still have some discomfort now and then.
 
 Physician: I understand you were in a car accident last September. Can you walk me through what happened?
 
 Patient: Yes, it was on September 1st, around 12:30 in the afternoon. I was driving from Cheadle Hulme to Manchester when I had to stop in traffic. Out of nowhere, another car hit me from behind, which pushed my car into the one in front.
 
 Physician: That sounds like a strong impact. Were you wearing your seatbelt?
 
 Patient: Yes, I always do.
 
 Physician: What did you feel immediately after the accident?
 
 Patient: At first, I was just shocked. But then I realized I had hit my head on the steering wheel, and I could feel pain in my neck and back almost right away.
 
 Physician: Did you seek medical attention at that time?
 
 Patient: Yes, I went to Moss Bank Accident and Emergency. They checked me over and 
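
Because the cleaned transcript labels each turn with `Physician:` or `Patient:`, it can help to split it into speaker turns before extracting entities, so that what the patient reports is kept apart from the physician's questions. The sketch below uses only the standard library; `split_turns` and the sample string are illustrative assumptions, not part of the original notebook.

```python
import re

# Matches a speaker label at the start of a line, e.g. "Physician:" or "Patient:".
SPEAKER_RE = re.compile(r"^\s*(Physician|Patient|Doctor):\s*(.*)$")


def split_turns(transcript: str) -> list[tuple[str, str]]:
    """Split a transcript into (speaker, utterance) pairs.

    Lines without a speaker label are appended to the previous turn.
    """
    turns = []
    for line in transcript.splitlines():
        line = line.strip()
        if not line:
            continue  # skip the blank separator lines between turns
        match = SPEAKER_RE.match(line)
        if match:
            turns.append((match.group(1), match.group(2).strip()))
        elif turns:
            speaker, utterance = turns[-1]
            turns[-1] = (speaker, f"{utterance} {line}".strip())
    return turns


sample = """Physician: Did you seek medical attention at that time?

 Patient: Yes, I went to Moss Bank Accident and Emergency.
"""
for speaker, utterance in split_turns(sample):
    print(f"{speaker}: {utterance}")
```

In the notebook, `split_turns(cleaned_text)` would give one pair per turn, and the NER pipeline could then be run on the patient's utterances only.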

In [None]:
# Let's try downloading a pretrained model for medical use cases.

# Uncomment the scispacy line below to install it on your system.

# !pip install scispacy
!pip install "https://s3-us-west-2.amazonaws.com/ai2-s2-scispacy/releases/v0.5.1/en_core_sci_scibert-0.5.1.tar.gz"
# en_core_sci_scibert is trained mainly on biomedical text and entities.
# Note: the next cell loads en_core_sci_sm, which is a separate package and must also be installed.

In [6]:
import scispacy
import spacy
# en_core_web_sm is not well suited to clinical text, so let's use en_core_sci_sm here
nlp = spacy.load("en_core_sci_sm")
doc = nlp(cleaned_text)
print(doc)

  


Physician: Good morning, Ms. Jones. How are you feeling today?
 
 
 Patient: Good morning, doctor. I’m doing better, but I still have some discomfort now and then.
 
 Physician: I understand you were in a car accident last September. Can you walk me through what happened?
 
 Patient: Yes, it was on September 1st, around 12:30 in the afternoon. I was driving from Cheadle Hulme to Manchester when I had to stop in traffic. Out of nowhere, another car hit me from behind, which pushed my car into the one in front.
 
 Physician: That sounds like a strong impact. Were you wearing your seatbelt?
 
 Patient: Yes, I always do.
 
 Physician: What did you feel immediately after the accident?
 
 Patient: At first, I was just shocked. But then I realized I had hit my head on the steering wheel, and I could feel pain in my neck and back almost right away.
 
 Physician: Did you seek medical attention at that time?
 
 Patient: Yes, I went to Moss Bank Accident and Emergency. They checked me over and 

In [8]:
print(doc.ents)

(Physician, Good morning, Ms. Jones, Patient, Good morning, doctor, I, discomfort, car accident, happened, Patient, afternoon, I, Cheadle, Manchester, I, stop, traffic, nowhere, car, behind, car, Physician, sounds, impact, seatbelt, Patient, accident, Patient, I, realized I, I, pain, neck, right away, Physician, medical attention, Patient, Moss Bank Accident, Emergency, whiplash injury, X-rays, advice, sent, progress, Patient, weeks, neck, back pain, bad—I, sleeping, painkillers, regularly, improving, sessions, physiotherapy, stiffness, discomfort, sense, pain, Patient, It’s not constant, occasional backaches, That’, hear, effects, anxiety, driving, difficulty, Patient, nervous driving, I haven’t, emotional issues, accident, impacted, hobbies, Patient, week, I, routine, doing, anything, That’s encouraging, ahead, physical examination, check, mobility, lingering pain, Physical Examination Conducted]
 
 Physician, Everything looks, neck, back, movement, there’s no tenderness,
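
The entity list above mixes clinically relevant spans (`whiplash injury`, `back pain`, `physiotherapy`) with noise such as speaker labels, pronouns, and greetings. One way to narrow it down, sketched here under stated assumptions, is a small keyword lexicon applied to the entity texts. The lexicon contents, the category order, and the `categorize_entities` helper are hypothetical; in the notebook the input would be `[ent.text for ent in doc.ents]`.

```python
# Hypothetical keyword lexicon; a real pipeline would use a curated vocabulary.
# More specific categories come first so "painkillers" is not caught by "pain".
LEXICON = {
    "Diagnosis": ("whiplash",),
    "Treatment": ("physiotherapy", "painkiller"),
    "Symptoms": ("pain", "ache", "stiffness", "discomfort"),
}


def categorize_entities(entity_texts):
    """Group entity strings by category, dropping anything that matches no keyword."""
    grouped = {category: [] for category in LEXICON}
    for text in entity_texts:
        lowered = text.lower()
        for category, keywords in LEXICON.items():
            if any(keyword in lowered for keyword in keywords):
                if text not in grouped[category]:  # skip repeated mentions
                    grouped[category].append(text)
                break  # first matching category wins
    return grouped


# A few entity strings copied from the output above.
entities = ["Physician", "I", "discomfort", "whiplash injury", "back pain",
            "painkillers", "physiotherapy", "occasional backaches", "discomfort"]
print(categorize_entities(entities))
```

Substring matching is deliberately crude and will misfire on unrelated words that contain a keyword, so it is a starting point for inspection rather than a replacement for a trained clinical NER model.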