# Healthcare Lab (Prompt Engineering)

**Learning Objectives:**
  * Practice basic prompt engineering
  * Gain exposure to healthcare-related datasets

## Context of the dataset

### 1. The dataset consists of records corresponding to medical events.
### 2. Each medical event is uniquely identified by `MedicalClaim`.
### 3. A given medical event might involve several medical procedures.
### 4. Each medical procedure is uniquely identified by `ClaimItem`.
### 5. A given medical procedure is characterized by `PrincipalDiagnosisDesc`, `PrincipalDiagnosis`, `RevenueCodeDesc`, `RevenueCode`, `TypeFlag` and `TotalExpenses`.

### 6. Each medical procedure involves: `MemberName`, `MemberID`, `County`, `HospitalName`, `HospitalType`, `StartDate`, `EndDate`.
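The claim/item hierarchy described above can be sketched on a toy table. The rows below are made-up values used only for illustration; only the column names `MedicalClaim`, `ClaimItem` and `TotalExpenses` come from the dataset.

```python
import pandas as pd

# Hypothetical rows: two medical events, the first with two procedures.
toy = pd.DataFrame({
    "MedicalClaim": ["claim_a", "claim_a", "claim_b"],
    "ClaimItem": [1, 2, 1],
    "TotalExpenses": [100.0, 50.0, 75.0],
})

# One row per medical event: number of procedures and total cost.
per_claim = toy.groupby("MedicalClaim").agg(
    n_items=("ClaimItem", "nunique"),
    total_expenses=("TotalExpenses", "sum"),
)
print(per_claim)
```

Grouping by `MedicalClaim` is one way to move from procedure-level rows to event-level summaries.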


## 0. Library Import

In [1]:
!pip install openai

Collecting openai
  Downloading openai-1.30.3-py3-none-any.whl (320 kB)
Collecting httpx<1,>=0.23.0 (from openai)
  Downloading httpx-0.27.0-py3-none-any.whl (75 kB)
Collecting httpcore==1.* (from httpx<1,>=0.23.0->openai)
  Downloading httpcore-1.0.5-py3-none-any.whl (77 kB)

In [2]:
import pandas as pd
import warnings
import numpy as np
import json

In [3]:
from openai import OpenAI
import os

In [4]:
warnings.simplefilter('ignore')

## 1. Get your [OpenAI API Key](https://platform.openai.com/account/api-keys)

In [5]:
# Used by the OpenAI client created in the next cell
os.environ["OPENAI_API_KEY"] = 'YOU-NEED-YOUR-OWN-KEY'
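Hard-coding a key in a notebook makes it easy to leak when the notebook is shared. A minimal sketch of an alternative, assuming the key is either already exported as `OPENAI_API_KEY` or typed in at run time; the helper name `resolve_api_key` and the `ask` parameter are hypothetical, introduced only for this example.

```python
import os
from getpass import getpass

PLACEHOLDER = "YOU-NEED-YOUR-OWN-KEY"  # placeholder string used in this lab

def resolve_api_key(env=os.environ, ask=getpass):
    """Return the key from the environment, or ask for it interactively."""
    key = env.get("OPENAI_API_KEY", "")
    if not key or key == PLACEHOLDER:
        # Nothing usable in the environment: prompt without echoing the input.
        key = ask("OpenAI API key: ")
    return key.strip()
```

It could be used as `os.environ["OPENAI_API_KEY"] = resolve_api_key()` before creating the client.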

In [6]:
MODEL="gpt-4o"

client = OpenAI(
  api_key=os.environ['OPENAI_API_KEY'],  # this is also the default, it can be omitted
)

## 2. Data loading and DataFrame creation
#### We sample only 15 rows at random to limit the number of queries sent to OpenAI

In [7]:
HealthCareDataSet=pd.read_csv("https://github.com/thousandoaks/Python4DS-I/raw/main/datasets/HealthcareDataset_PublicRelease.csv",sep=',',parse_dates=['StartDate','EndDate','BirthDate']).sample(15)
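Because `.sample(15)` draws different rows on every run, the outputs shown in this lab will not repeat exactly. A small sketch of how a fixed seed makes the sample reproducible; the toy frame and the seed value 42 are assumptions for illustration.

```python
import pandas as pd

# Hypothetical stand-in for the full dataset.
full = pd.DataFrame({"Id": range(100)})

# A fixed seed (the value 42 is arbitrary) returns the same rows on every run.
first = full.sample(15, random_state=42)
second = full.sample(15, random_state=42)
print(first.index.equals(second.index))
```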

In [8]:
HealthCareDataSet

Unnamed: 0,Id,MemberName,MemberID,County,MedicalClaim,ClaimItem,HospitalName,HospitalType,StartDate,EndDate,PrincipalDiagnosisDesc,PrincipalDiagnosis,RevenueCodeDesc,RevenueCode,TypeFlag,BirthDate,TotalExpenses
48814,732871,535d5e17,1a988644,fd218584,1a67afec8a9d75cc,2,446442f4,HOSPITAL,2020-12-11,2020-12-11,Peptic ulcer site unspeci,K27.9,PHARMACY,250.0,ER,1974-09-10,41.139
35269,702853,11c6a7e1,d7bb9874,425a37b2,bbf0c47a0609a3db,5,2148dc02,HOSPITAL,2020-08-14,2020-08-17,Benign neoplasm of transv,D12.3,MEDICAL/SURGICAL SUPPLIES,270.0,INP,1944-09-06,584.5
3165,639718,35831bdf,9903809d,adb3fb00,cfa5263d429f0142,7,ae2f2d9e,HOSPITAL,2020-01-18,2020-01-18,Strain of muscle fascia a,S16.1XXA,EMERGENCY ROOM,450.0,ER,1944-08-21,2097.508
3712,640645,893a4f12,1751db10,02af982d,bed8cceb123d329c,21,38018d16,HOSPITAL,2020-01-31,2020-02-04,Traumatic subdural hemorr,S06.5X0A,CARDIOLOGY: ECHOCARDIOLOGY,483.0,INP,1928-06-06,1179.5
19416,670774,bd771f85,f1651c52,6f943458,2bea0433753012e0,10,6407c38a,HOSPITAL,2020-05-07,2020-05-10,Viral pneumonia unspecifi,J12.9,LABORATORY - CLINICAL DIAGNOSTIC: HEMATOLOGY,305.0,INP,1926-03-31,554.4
13334,658949,ac0ae6c3,3416eff5,02af982d,f8438bd419d5b179,10,ae46acbf,HOSPITAL,2020-03-24,2020-03-26,Poisoning by other drugs,T50.991A,EMERGENCY ROOM,450.0,INP,1963-06-17,3044.972
20707,673253,351727c1,d11ebf84,02af982d,775e09f7d4c40d6e,7,ae46acbf,HOSPITAL,2020-05-19,2020-05-19,Low back pain,M54.5,LABORATORY - CLINICAL DIAGNOSTIC: HEMATOLOGY,305.0,ER,1964-12-02,524.993
47280,729831,fd9b8efc,93e2730e,425a37b2,302fec6dd5442c4c,7,a9bf1474,HOSPITAL,2020-11-08,2020-11-08,Constipation unspecified,K59.00,CT SCAN,350.0,ER,1974-09-26,10712.8
28525,689507,5f1ddbe8,3eae7881,b021dd12,002fd7d73d8060f1,2,cf2a3695,HOSPITAL,2020-07-17,2020-07-23,Trigeminal neuralgia,G50.0,INTERMEDIATE ICU,206.0,INP,1945-09-05,12377.12
5314,644391,d554f613,371a731c,ea48569b,b16447a0f160c2e9,26,761ae146,HOSPITAL,2020-01-07,2020-01-10,Other displaced fracture,S42.492A,EKG/ECG,730.0,INP,1949-09-30,194.642


In [9]:
HealthCareDataSet.info()

<class 'pandas.core.frame.DataFrame'>
Index: 15 entries, 48814 to 33051
Data columns (total 17 columns):
 #   Column                  Non-Null Count  Dtype         
---  ------                  --------------  -----         
 0   Id                      15 non-null     int64         
 1   MemberName              15 non-null     object        
 2   MemberID                15 non-null     object        
 3   County                  15 non-null     object        
 4   MedicalClaim            15 non-null     object        
 5   ClaimItem               15 non-null     int64         
 6   HospitalName            15 non-null     object        
 7   HospitalType            15 non-null     object        
 8   StartDate               15 non-null     datetime64[ns]
 9   EndDate                 15 non-null     datetime64[ns]
 10  PrincipalDiagnosisDesc  15 non-null     object        
 11  PrincipalDiagnosis      15 non-null     object        
 12  RevenueCodeDesc         15 non-null     object    

## 3. Generative AI assisted processing

### 3.1. Classification.
#### Given a Principal Diagnosis Description, we need to map it to its corresponding ICD-10 code (https://www.icd10data.com/ICD10CM/Codes)

In [10]:
HealthCareDataSet[['PrincipalDiagnosisDesc','PrincipalDiagnosis']]

Unnamed: 0,PrincipalDiagnosisDesc,PrincipalDiagnosis
48814,Peptic ulcer site unspeci,K27.9
35269,Benign neoplasm of transv,D12.3
3165,Strain of muscle fascia a,S16.1XXA
3712,Traumatic subdural hemorr,S06.5X0A
19416,Viral pneumonia unspecifi,J12.9
13334,Poisoning by other drugs,T50.991A
20707,Low back pain,M54.5
47280,Constipation unspecified,K59.00
28525,Trigeminal neuralgia,G50.0
5314,Other displaced fracture,S42.492A


In [11]:
def icd10_encoder(text):
    """Ask the model for the ICD-10 code matching a diagnosis description."""
    prompt= f"""
    Given the following text, find the corresponding medical ICD10 code: ```{text}```
    """


    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=MODEL,
        messages=messages,
        temperature=0
    )
    return response.choices[0].message.content

In [12]:

HealthCareDataSet['PrincipalDiagnosis_ICD10']=HealthCareDataSet['PrincipalDiagnosisDesc'].apply(icd10_encoder)

In [13]:
pd.options.display.max_colwidth = 250
HealthCareDataSet[['PrincipalDiagnosisDesc','PrincipalDiagnosis_ICD10']]

Unnamed: 0,PrincipalDiagnosisDesc,PrincipalDiagnosis_ICD10
48814,Peptic ulcer site unspeci,"The text ""Peptic ulcer site unspeci"" appears to be an incomplete description of a medical condition. However, based on the information provided, it seems to refer to a peptic ulcer with an unspecified site. The corresponding ICD-10 code for a pep..."
35269,Benign neoplasm of transv,"The text ""Benign neoplasm of transv"" appears to be an incomplete description of a medical condition. However, based on the given information, it seems to refer to a benign neoplasm (non-cancerous tumor) of a specific location. If we assume ""trans..."
3165,Strain of muscle fascia a,"The text ""Strain of muscle fascia"" corresponds to the ICD-10 code **S39.012**. This code is used for ""Strain of muscle, fascia and tendon of lower back."" If the strain is located in a different part of the body, the specific code may vary. For a ..."
3712,Traumatic subdural hemorr,"The corresponding ICD-10 code for ""Traumatic subdural hemorrhage"" is **S06.5X0**. This code is used to classify traumatic subdural hemorrhage without loss of consciousness. If there are additional details such as the duration of loss of conscious..."
19416,Viral pneumonia unspecifi,"The corresponding ICD-10 code for ""Viral pneumonia, unspecified"" is **J12.9**."
13334,Poisoning by other drugs,"The ICD-10 code for ""Poisoning by other drugs"" is **T50.9**. This code falls under the category of ""Poisoning by, adverse effect of and underdosing of other and unspecified drugs, medicaments and biological substances."""
20707,Low back pain,"The corresponding ICD-10 code for ""Low back pain"" is **M54.5**."
47280,Constipation unspecified,"The corresponding ICD-10 code for ""Constipation, unspecified"" is **K59.00**."
28525,Trigeminal neuralgia,"The corresponding ICD-10 code for ""Trigeminal neuralgia"" is **G50.0**."
5314,Other displaced fracture,"The corresponding ICD-10 code for ""Other displaced fracture"" is **S82.899**. This code is used for unspecified fractures of the lower leg, including the tibia and fibula, that are displaced. However, the exact code may vary depending on the speci..."
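The answers above are free text, so the code itself still has to be pulled out before it can be compared with `PrincipalDiagnosis`. A rough sketch using a regular expression; the pattern is a simplified assumption about what an ICD-10 code looks like, and `extract_icd10` is a hypothetical helper name.

```python
import re

# Rough shape of an ICD-10-CM code: letter, digit, alphanumeric, then an
# optional dot followed by up to four alphanumerics. A simplifying assumption.
ICD10_PATTERN = re.compile(r"\b[A-Z]\d[0-9A-Z](?:\.[0-9A-Z]{1,4})?\b")

def extract_icd10(answer):
    """Return the first ICD-10-looking token in a free-text answer, or None."""
    match = ICD10_PATTERN.search(answer)
    return match.group(0) if match else None

answer = 'The corresponding ICD-10 code for "Low back pain" is **M54.5**.'
print(extract_icd10(answer))
```

Comparing the extracted value with the recorded code would show, for instance, that the truncated description "Other displaced fracture" led the model to a different code than the one in the dataset.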


### 3.2. Classification.
#### Given a Principal Diagnosis code, we want to determine the severity of the medical condition on a 0 to 5 scale.

In [14]:
HealthCareDataSet[['PrincipalDiagnosisDesc','PrincipalDiagnosis']]

Unnamed: 0,PrincipalDiagnosisDesc,PrincipalDiagnosis
48814,Peptic ulcer site unspeci,K27.9
35269,Benign neoplasm of transv,D12.3
3165,Strain of muscle fascia a,S16.1XXA
3712,Traumatic subdural hemorr,S06.5X0A
19416,Viral pneumonia unspecifi,J12.9
13334,Poisoning by other drugs,T50.991A
20707,Low back pain,M54.5
47280,Constipation unspecified,K59.00
28525,Trigeminal neuralgia,G50.0
5314,Other displaced fracture,S42.492A


In [15]:
def severity_classifier(text):

    prompt= f"""
    The following text represents a given medical condition coded in ICD-10 format, your task is to determine on a scale from 0 to 5 the level of severity: ```{text}```
    """


    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=MODEL,
        messages=messages,
        temperature=0
    )
    return response.choices[0].message.content

In [16]:
severity_classifier("R55")

'The ICD-10 code "R55" corresponds to "Syncope and collapse." Syncope, commonly known as fainting, is a temporary loss of consciousness usually related to insufficient blood flow to the brain. It is often a benign condition but can sometimes indicate a more serious underlying issue.\n\nOn a scale from 0 to 5, where 0 represents no severity and 5 represents extreme severity, I would rate the severity of "R55" as follows:\n\n**Severity Level: 2**\n\n**Rationale:**\n- **Mild to Moderate Severity:** Syncope can be alarming and may require medical evaluation to rule out serious conditions, but it is often not life-threatening on its own.\n- **Potential for Serious Underlying Conditions:** While syncope itself is usually not severe, it can be a symptom of more serious cardiovascular or neurological conditions that may require further investigation.\n- **Impact on Daily Life:** Frequent episodes of syncope can impact daily activities and quality of life, necessitating medical attention to man

In [17]:
severity_classifier('N13.2')

'ICD-10 code N13.2 corresponds to "Hydronephrosis with renal and ureteral calculous obstruction." This condition involves the swelling of a kidney due to a build-up of urine, which is caused by an obstruction from kidney stones in both the kidney and the ureter.\n\nOn a scale from 0 to 5, where 0 represents no severity and 5 represents the highest severity, I would rate this condition as a **3**. \n\nHere\'s the reasoning:\n- **Moderate Severity**: Hydronephrosis with renal and ureteral calculous obstruction can cause significant pain and discomfort, and if left untreated, it can lead to serious complications such as kidney damage or infection.\n- **Potential for Serious Complications**: While it is a serious condition, it is often treatable with medical intervention such as the removal of the obstruction, pain management, and sometimes surgery.\n- **Impact on Health**: The condition can have a considerable impact on a patient\'s health and quality of life, but with appropriate treatme

In [18]:
def severity_classifier_json(text):

    prompt= f"""
      You are a medical service AI assistant.
      Your tasks are:

      First, determine the level of severity, on a scale from 0 to 5, of the given medical condition coded in ICD-10 format: ```{text}```
      Second, provide an explanation of the given severity level.

      Structure the response as a JSON object with two fields: level_of_severity (an integer) and explanation (a string)
      """


    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=MODEL,
        messages=messages,
        temperature=0,
        response_format={ "type": "json_object" }
    )
    return json.loads(response.choices[0].message.content)

In [19]:
severity_classifier_json('R07.89')

{'level_of_severity': 2,
 'explanation': "The ICD-10 code R07.89 corresponds to 'Other chest pain.' This condition can vary in severity depending on the underlying cause. While it is not immediately life-threatening like a heart attack (which would be coded differently), it can still be a sign of a potentially serious condition such as angina, pleuritis, or musculoskeletal issues. Therefore, it warrants medical evaluation but is generally considered moderate in severity unless accompanied by other alarming symptoms."}
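JSON mode returns syntactically valid JSON, but the field names and value ranges are still up to the model. A minimal validation sketch, assuming the two field names seen in the output above; `parse_severity` is a hypothetical helper, not part of the OpenAI library.

```python
import json

def parse_severity(raw):
    """Parse a JSON answer and check the two fields the lab relies on."""
    data = json.loads(raw)
    # Raises KeyError if the model used a different field name.
    level = int(data["level_of_severity"])
    if not 0 <= level <= 5:
        raise ValueError(f"severity {level} is outside the 0-5 scale")
    return {
        "level_of_severity": level,
        "explanation": str(data.get("explanation", "")),
    }

print(parse_severity('{"level_of_severity": 2, "explanation": "Other chest pain"}'))
```

Failing loudly on an unexpected shape tends to be easier to debug than a `KeyError` several cells later.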

In [20]:
HealthCareDataSet['PrincipalDiagnosis_Severity']=HealthCareDataSet['PrincipalDiagnosis'].apply(lambda x:severity_classifier_json(x))
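Applying the classifier row by row sends one request per row, even when the same code appears several times. One possible way to reduce queries is to classify each distinct code once and map the answers back. The sketch below uses a stub in place of `severity_classifier_json` so it runs without an API key; the stub and the sample codes are assumptions.

```python
import pandas as pd

calls = []

def fake_classifier(code):
    # Hypothetical stand-in for severity_classifier_json; records each call.
    calls.append(code)
    return {"level_of_severity": 2, "explanation": f"stub for {code}"}

codes = pd.Series(["M54.5", "J12.9", "M54.5", "M54.5"])

# Query each distinct code once, then map the answers back onto every row.
answers = {code: fake_classifier(code) for code in codes.unique()}
severity = codes.map(answers.get)
print(len(calls), len(severity))
```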

In [21]:
HealthCareDataSet['PrincipalDiagnosis_Severity']

48814    {'level_of_severity': 3, 'explanation': 'The ICD-10 code K27.9 refers to a 'Peptic ulcer, site unspecified, unspecified as acute or chronic, without hemorrhage or perforation.' This condition involves the development of ulcers in the stomach lini...
35269    {'level_of_severity': 2, 'explanation': 'The ICD-10 code D12.3 refers to a benign neoplasm of the transverse colon. This condition is generally considered to be of low to moderate severity. While benign neoplasms are non-cancerous and typically d...
3165     {'level_of_severity': 2, 'explanation': 'The ICD-10 code S16.1XXA refers to 'Strain of muscle, fascia and tendon at neck level, initial encounter.' This condition typically involves a muscle strain in the neck area, which can cause pain and disco...
3712     {'level_of_severity': 4, 'explanation': 'The ICD-10 code S06.5X0A refers to a 'Traumatic subdural hemorrhage without loss of consciousness, initial encounter.' This condition involves bleeding between the brain and

In [22]:

HealthCareDataSet['PrincipalDiagnosis_Severity'].apply(lambda x:x['level_of_severity'])

48814    3
35269    2
3165     2
3712     4
19416    3
13334    3
20707    2
47280    1
28525    2
5314     3
50538    5
41205    2
3953     1
24052    2
33051    3
Name: PrincipalDiagnosis_Severity, dtype: int64
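Instead of extracting one field at a time, the dictionaries can be expanded into regular columns in a single step. A sketch on a toy frame; the two rows are illustrative values shaped like the answers above, not actual model output.

```python
import pandas as pd

# Hypothetical answers shaped like the output of severity_classifier_json.
df = pd.DataFrame({
    "PrincipalDiagnosis": ["K59.00", "S06.5X0A"],
    "PrincipalDiagnosis_Severity": [
        {"level_of_severity": 1, "explanation": "low"},
        {"level_of_severity": 4, "explanation": "high"},
    ],
}, index=[47280, 3712])

# Build one column per JSON field, keeping the original row index.
fields = pd.DataFrame(df["PrincipalDiagnosis_Severity"].tolist(), index=df.index)
df = df.join(fields)
print(df[["PrincipalDiagnosis", "level_of_severity"]])
```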

In [23]:

HealthCareDataSet['PrincipalDiagnosis_Severity'].apply(lambda x:x['explanation'])

48814    The ICD-10 code K27.9 refers to a 'Peptic ulcer, site unspecified, unspecified as acute or chronic, without hemorrhage or perforation.' This condition involves the development of ulcers in the stomach lining or the upper part of the small intesti...
35269    The ICD-10 code D12.3 refers to a benign neoplasm of the transverse colon. This condition is generally considered to be of low to moderate severity. While benign neoplasms are non-cancerous and typically do not spread to other parts of the body, ...
3165     The ICD-10 code S16.1XXA refers to 'Strain of muscle, fascia and tendon at neck level, initial encounter.' This condition typically involves a muscle strain in the neck area, which can cause pain and discomfort but is generally not life-threateni...
3712     The ICD-10 code S06.5X0A refers to a 'Traumatic subdural hemorrhage without loss of consciousness, initial encounter.' This condition involves bleeding between the brain and its outermost covering (the dura), which

### 3.3. Translation
#### Given a Revenue Code Description we want to have it translated into Spanish

In [24]:
HealthCareDataSet

Unnamed: 0,Id,MemberName,MemberID,County,MedicalClaim,ClaimItem,HospitalName,HospitalType,StartDate,EndDate,PrincipalDiagnosisDesc,PrincipalDiagnosis,RevenueCodeDesc,RevenueCode,TypeFlag,BirthDate,TotalExpenses,PrincipalDiagnosis_ICD10,PrincipalDiagnosis_Severity
48814,732871,535d5e17,1a988644,fd218584,1a67afec8a9d75cc,2,446442f4,HOSPITAL,2020-12-11,2020-12-11,Peptic ulcer site unspeci,K27.9,PHARMACY,250.0,ER,1974-09-10,41.139,"The text ""Peptic ulcer site unspeci"" appears to be an incomplete description of a medical condition. However, based on the information provided, it seems to refer to a peptic ulcer with an unspecified site. The corresponding ICD-10 code for a pep...","{'level_of_severity': 3, 'explanation': 'The ICD-10 code K27.9 refers to a 'Peptic ulcer, site unspecified, unspecified as acute or chronic, without hemorrhage or perforation.' This condition involves the development of ulcers in the stomach lini..."
35269,702853,11c6a7e1,d7bb9874,425a37b2,bbf0c47a0609a3db,5,2148dc02,HOSPITAL,2020-08-14,2020-08-17,Benign neoplasm of transv,D12.3,MEDICAL/SURGICAL SUPPLIES,270.0,INP,1944-09-06,584.5,"The text ""Benign neoplasm of transv"" appears to be an incomplete description of a medical condition. However, based on the given information, it seems to refer to a benign neoplasm (non-cancerous tumor) of a specific location. If we assume ""trans...","{'level_of_severity': 2, 'explanation': 'The ICD-10 code D12.3 refers to a benign neoplasm of the transverse colon. This condition is generally considered to be of low to moderate severity. While benign neoplasms are non-cancerous and typically d..."
3165,639718,35831bdf,9903809d,adb3fb00,cfa5263d429f0142,7,ae2f2d9e,HOSPITAL,2020-01-18,2020-01-18,Strain of muscle fascia a,S16.1XXA,EMERGENCY ROOM,450.0,ER,1944-08-21,2097.508,"The text ""Strain of muscle fascia"" corresponds to the ICD-10 code **S39.012**. This code is used for ""Strain of muscle, fascia and tendon of lower back."" If the strain is located in a different part of the body, the specific code may vary. For a ...","{'level_of_severity': 2, 'explanation': 'The ICD-10 code S16.1XXA refers to 'Strain of muscle, fascia and tendon at neck level, initial encounter.' This condition typically involves a muscle strain in the neck area, which can cause pain and disco..."
3712,640645,893a4f12,1751db10,02af982d,bed8cceb123d329c,21,38018d16,HOSPITAL,2020-01-31,2020-02-04,Traumatic subdural hemorr,S06.5X0A,CARDIOLOGY: ECHOCARDIOLOGY,483.0,INP,1928-06-06,1179.5,"The corresponding ICD-10 code for ""Traumatic subdural hemorrhage"" is **S06.5X0**. This code is used to classify traumatic subdural hemorrhage without loss of consciousness. If there are additional details such as the duration of loss of conscious...","{'level_of_severity': 4, 'explanation': 'The ICD-10 code S06.5X0A refers to a 'Traumatic subdural hemorrhage without loss of consciousness, initial encounter.' This condition involves bleeding between the brain and its outermost covering (the dur..."
19416,670774,bd771f85,f1651c52,6f943458,2bea0433753012e0,10,6407c38a,HOSPITAL,2020-05-07,2020-05-10,Viral pneumonia unspecifi,J12.9,LABORATORY - CLINICAL DIAGNOSTIC: HEMATOLOGY,305.0,INP,1926-03-31,554.4,"The corresponding ICD-10 code for ""Viral pneumonia, unspecified"" is **J12.9**.","{'level_of_severity': 3, 'explanation': 'The ICD-10 code J12.9 corresponds to 'Viral pneumonia, unspecified'. Pneumonia is an infection that inflames the air sacs in one or both lungs, which can fill with fluid or pus. Viral pneumonia can range f..."
13334,658949,ac0ae6c3,3416eff5,02af982d,f8438bd419d5b179,10,ae46acbf,HOSPITAL,2020-03-24,2020-03-26,Poisoning by other drugs,T50.991A,EMERGENCY ROOM,450.0,INP,1963-06-17,3044.972,"The ICD-10 code for ""Poisoning by other drugs"" is **T50.9**. This code falls under the category of ""Poisoning by, adverse effect of and underdosing of other and unspecified drugs, medicaments and biological substances.""","{'level_of_severity': 3, 'explanation': 'The ICD-10 code T50.991A refers to 'Poisoning by other drugs, medicaments and biological substances, accidental (unintentional), initial encounter.' This condition indicates an accidental poisoning event, ..."
20707,673253,351727c1,d11ebf84,02af982d,775e09f7d4c40d6e,7,ae46acbf,HOSPITAL,2020-05-19,2020-05-19,Low back pain,M54.5,LABORATORY - CLINICAL DIAGNOSTIC: HEMATOLOGY,305.0,ER,1964-12-02,524.993,"The corresponding ICD-10 code for ""Low back pain"" is **M54.5**.","{'level_of_severity': 2, 'explanation': 'The ICD-10 code M54.5 corresponds to 'Low back pain.' This condition is generally considered to be of moderate severity. While it can cause significant discomfort and impact daily activities, it is often m..."
47280,729831,fd9b8efc,93e2730e,425a37b2,302fec6dd5442c4c,7,a9bf1474,HOSPITAL,2020-11-08,2020-11-08,Constipation unspecified,K59.00,CT SCAN,350.0,ER,1974-09-26,10712.8,"The corresponding ICD-10 code for ""Constipation, unspecified"" is **K59.00**.","{'level_of_severity': 1, 'explanation': 'The ICD-10 code K59.00 corresponds to 'Constipation, unspecified.' This condition is generally considered to be of low severity. While it can cause discomfort and may require treatment, it is typically not..."
28525,689507,5f1ddbe8,3eae7881,b021dd12,002fd7d73d8060f1,2,cf2a3695,HOSPITAL,2020-07-17,2020-07-23,Trigeminal neuralgia,G50.0,INTERMEDIATE ICU,206.0,INP,1945-09-05,12377.12,"The corresponding ICD-10 code for ""Trigeminal neuralgia"" is **G50.0**.","{'level_of_severity': 2, 'explanation': 'The ICD-10 code G50.0 corresponds to Trigeminal Neuralgia. This condition is characterized by sudden, severe, and stabbing pain in the distribution of the trigeminal nerve, typically affecting one side of ..."
5314,644391,d554f613,371a731c,ea48569b,b16447a0f160c2e9,26,761ae146,HOSPITAL,2020-01-07,2020-01-10,Other displaced fracture,S42.492A,EKG/ECG,730.0,INP,1949-09-30,194.642,"The corresponding ICD-10 code for ""Other displaced fracture"" is **S82.899**. This code is used for unspecified fractures of the lower leg, including the tibia and fibula, that are displaced. However, the exact code may vary depending on the speci...","{'level_of_severity': 3, 'explanation': 'The ICD-10 code S42.492A refers to a 'Displaced fracture of lower end of left humerus, initial encounter for closed fracture.' This type of fracture is considered moderately severe because it involves a di..."


In [25]:
def universal_translator(text, language):

    prompt= f"""
    Translate the following text to {language}: ```{text}```
    """


    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=MODEL,
        messages=messages,
        temperature=0
    )
    return response.choices[0].message.content

In [26]:
HealthCareDataSet['RevenueCodeDesc'].apply(lambda x:universal_translator(x,'Spanish'))

48814                                                                  ```FARMACIA```
35269                                           ```SUMINISTROS MÉDICOS/QUIRÚRGICOS```
3165                                                        ```SALA DE EMERGENCIAS```
3712                                                      CARDIOLOGÍA: ECOCARDIOLOGÍA
19416                                  LABORATORIO - DIAGNÓSTICO CLÍNICO: HEMATOLOGÍA
13334                                                       ```SALA DE EMERGENCIAS```
20707                                  LABORATORIO - DIAGNÓSTICO CLÍNICO: HEMATOLOGÍA
47280    La traducción de "CT SCAN" al español es "TAC" o "Tomografía Computarizada".
28525                                                            ```UCI INTERMEDIA```
5314                                                                    ```ECG/EKG```
50538                                         TERAPIA FÍSICA: EVALUACIÓN/REEVALUACIÓN
41205                                     LABORATORIO 
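Several translations above come back wrapped in the same triple backticks used to delimit the input. A small post-processing sketch that strips them; `clean_translation` is a hypothetical helper. It does not handle sentence-style replies such as the one returned for "CT SCAN", which would more likely need a stricter prompt.

```python
def clean_translation(answer):
    """Remove surrounding whitespace and echoed backtick fences."""
    return answer.strip().strip("`").strip()

print(clean_translation("```FARMACIA```"))
```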

### 3.4. Text Processing
#### Given a birth date, we need to compute the current age of the patient

In [27]:
HealthCareDataSet['BirthDate']

48814   1974-09-10
35269   1944-09-06
3165    1944-08-21
3712    1928-06-06
19416   1926-03-31
13334   1963-06-17
20707   1964-12-02
47280   1974-09-26
28525   1945-09-05
5314    1949-09-30
50538   1943-07-18
41205   1938-09-24
3953    1954-10-22
24052   1931-09-01
33051   1947-04-21
Name: BirthDate, dtype: datetime64[ns]

In [28]:
def age_calculator(birthdate):

    prompt= f"""
    Given the following birthdate compute the current age of the person: ```{birthdate}```

    """


    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=MODEL,
        messages=messages,
        temperature=0
    )
    return response.choices[0].message.content

In [29]:
age_calculator('1963-01-01')

"To compute the current age of a person born on January 1, 1963, you can follow these steps:\n\n1. Determine the current date.\n2. Subtract the birth year from the current year.\n3. Adjust for whether the person has already had their birthday this year.\n\nLet's assume today's date is October 5, 2023.\n\n1. Current date: October 5, 2023\n2. Birthdate: January 1, 1963\n\nFirst, calculate the difference in years:\n2023 - 1963 = 60 years\n\nNext, check if the person has already had their birthday this year. Since January 1 has already passed by October 5, the person has had their birthday this year.\n\nTherefore, the person is 60 years old."

In [30]:
def age_calculator(birthdate):

    prompt= f"""
    Given the following birthdate compute the current age of the person: ```{birthdate}```
    Provide just the figure
    """


    messages = [{"role": "user", "content": prompt}]
    response = client.chat.completions.create(
        model=MODEL,
        messages=messages,
        temperature=0
    )
    return response.choices[0].message.content

In [31]:
age_calculator('1963-01-01')

'60'
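The model has no access to a clock, so "current age" depends on whatever date it assumes (October 5, 2023 in the verbose answer earlier). One option is to state the reference date in the prompt. A sketch of such a prompt builder; `build_age_prompt` and its `today` parameter are hypothetical names for this example.

```python
from datetime import date

def build_age_prompt(birthdate, today=None):
    """Build a prompt that states the reference date explicitly."""
    today = today or date.today()
    return (
        f"Today's date is {today.isoformat()}. "
        "Given the following birthdate compute the current age of the person: "
        f"```{birthdate}```\n"
        "Provide just the figure"
    )

print(build_age_prompt("1963-01-01", today=date(2023, 10, 5)))
```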

In [32]:
HealthCareDataSet['BirthDate'].apply(lambda x:age_calculator(x))

48814    49
35269    79
3165     79
3712     95
19416    97
13334    60
20707    58
47280    49
28525    78
5314     74
50538    80
41205    85
3953     69
24052    92
33051    76
Name: BirthDate, dtype: object
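Age is plain date arithmetic, so the model's answers can be checked (or replaced) with a deterministic computation. A minimal sketch; `age_on` is a hypothetical helper, and the reference date is an assumption chosen to match the date the model assumed in its verbose answer.

```python
import pandas as pd

def age_on(birthdate, today):
    """Full years elapsed between birthdate and today."""
    birthdate, today = pd.Timestamp(birthdate), pd.Timestamp(today)
    # Subtract one year if this year's birthday has not happened yet.
    before_birthday = (today.month, today.day) < (birthdate.month, birthdate.day)
    return today.year - birthdate.year - int(before_birthday)

# Reference date assumed to be the one the model used: October 5, 2023.
print(age_on("1963-01-01", "2023-10-05"))
```

Applied to a column, this could look like `HealthCareDataSet['BirthDate'].apply(lambda d: age_on(d, pd.Timestamp.today()))`, which returns integers rather than strings.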