<h2>Features 🚀</h2>
<ul>
  <li>🧙‍♀️ NLP in 2 lines of code with no training data required</li>
  <li>🔨 Easily add one-shot, two-shot, or few-shot examples to the prompt</li>
  <li>✌ Output always provided as a Python object (e.g. list, dictionary) for easy parsing and filtering</li>
  <li>💥 Custom examples and samples can be easily added to the prompt</li>
  <li>💰 Optimized prompts to reduce OpenAI token costs (coming soon)</li>
</ul>

## Load Azure OpenAI API Credentials

In [1]:
# Import necessary modules
import os
from dotenv import load_dotenv, find_dotenv

# Load environment variables from .env file
load_dotenv(find_dotenv())

# Retrieve API credentials from environment variables
api_key = os.environ.get("OPENAI_API_KEY")
api_base = os.environ.get("OPENAI_API_BASE")
api_version = os.environ.get("OPENAI_API_VERSION")
api_type = os.environ.get("OPENAI_API_TYPE")

# Confirm the credentials loaded without printing the secret key
print(f'API_KEY: {"set" if api_key else "missing"}')
print(f'API BASE: {api_base}')
print(f'API VERSION: {api_version}')
print(f'API TYPE: {api_type}')

API_KEY: set
API BASE: https://openailx.openai.azure.com/
API VERSION: 2023-08-01-preview
API TYPE: azure
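
A missing variable in `.env` only shows up later as a failed API call, so it can help to check all four up front and show just the tail of the key. A minimal sketch using only the standard library; `missing_vars` and `mask` are hypothetical helper names, not part of Promptify or python-dotenv:

```python
import os

# The four variables this notebook reads from the environment
REQUIRED_VARS = (
    "OPENAI_API_KEY",
    "OPENAI_API_BASE",
    "OPENAI_API_VERSION",
    "OPENAI_API_TYPE",
)


def missing_vars(env=None, required=REQUIRED_VARS):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


def mask(value, visible=4):
    """Hide all but the last `visible` characters of a secret."""
    if not value:
        return "<not set>"
    if len(value) <= visible:
        return "*" * len(value)
    return "*" * (len(value) - visible) + value[-visible:]


print("Missing:", missing_vars())
print("API_KEY:", mask(os.environ.get("OPENAI_API_KEY")))
```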


## Define any LLM model (such as GPT-3) ✅

In [2]:
from promptify import Prompter, OpenAI, Pipeline, Azure
import json
import pprint

# Create an instance of the Azure OpenAI model
model = Azure(api_key=api_key, api_base=api_base, api_version=api_version, api_type=api_type, engine='gpt-35-turbo')

# Example sentence for demonstration
sent = "The patient is a 93-year-old female with a medical history of chronic right hip pain, \
osteoporosis, hypertension, depression, and chronic atrial fibrillation admitted for evaluation \
and management of severe nausea and vomiting and urinary tract infection"
print(sent)

The patient is a 93-year-old female with a medical history of chronic right hip pain, osteoporosis, hypertension, depression, and chronic atrial fibrillation admitted for evaluation and management of severe nausea and vomiting and urinary tract infection


### 1: MultiLabel Text Classification Example in 2 Lines of code, with no training data required 🚀

In [3]:
prompter = Prompter('multilabel_classification.jinja')
pipe = Pipeline(prompter, model)
result = pipe.fit(domain          = 'clinical', # any domain works, e.g. financial, education, biomedical
                  text_input      = sent,
                  labels          = None,
                  n_output_labels = 3
                )
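try:
    # Swap single quotes for double quotes so the text parses as JSON
    output = result[0].get('text').replace("'", '"')
    output_dict = json.loads(output)
except Exception as e:
    print(f"An unexpected error occurred: {e}")
else:
    # Output (printed only when parsing succeeded, so a stale value is never shown)
    pprint.pprint(output_dict)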

100%|██████████| 1/1 [00:03<00:00,  3.39s/it]

[{'1': 'Gastrointestinal Symptoms',
  '2': 'Nausea and Vomiting',
  'branch': 'Evaluation and Management',
  'group': 'Admission',
  'main class': 'Symptoms and Complaints'},
 {'1': 'Musculoskeletal Diseases',
  '2': 'Bone Diseases',
  'branch': 'Medical History',
  'group': 'Chronic',
  'main class': 'Diseases and Disorders'},
 {'1': 'Circulatory and Respiratory Diseases',
  '2': 'Cardiovascular Diseases',
  '3': 'Arrhythmias, Cardiac',
  'branch': 'Medical History',
  'group': 'Chronic',
  'main class': 'Diseases and Disorders'}]
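
Replacing every single quote with a double quote breaks as soon as a value contains an apostrophe (for example `Crohn's disease`). A minimal sketch of a more forgiving parser, assuming the model returns either JSON or a Python literal; `parse_model_output` is a hypothetical helper, not part of Promptify:

```python
import ast
import json


def parse_model_output(text):
    """Parse model text into a Python object, or return None if it cannot be parsed."""
    if not isinstance(text, str):
        return None
    try:
        return json.loads(text)        # strict JSON (double-quoted strings)
    except json.JSONDecodeError:
        pass
    try:
        return ast.literal_eval(text)  # Python literal (single-quoted strings allowed)
    except (ValueError, SyntaxError):
        return None


# A value with an apostrophe, which the quote-swapping approach cannot handle
sample = "[{'E': \"Crohn's disease\", 'T': 'medical_condition'}]"
print(parse_model_output(sample))
```

`ast.literal_eval` only evaluates literals (strings, numbers, lists, dicts, and so on), so it does not execute arbitrary code from the model's response.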





### 2: MultiLabel Text Classification with Custom Classes 🚀

In [4]:
# Case 2
# Perform multilabel text classification restricted to custom classes (handling out-of-bounds predictions)

classes = ['Medicine','Oncology','Metastasis','Breast cancer','Lung cancer','Cerebrospinal fluid','Tumor microenvironment','Single-cell RNA sequencing','Idiopathic intracranial hypertension']

result = pipe.fit(n_output_labels = len(classes),
                  domain          = 'clinical',
                  text_input      = sent,
                  labels          = classes
                )
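try:
    # Swap single quotes for double quotes so the text parses as JSON
    output = result[0].get('text').replace("'", '"')
    output_dict = json.loads(output)
except Exception as e:
    print(f"An unexpected error occurred: {e}")
else:
    # Output (printed only when parsing succeeded, so a stale value is never shown)
    pprint.pprint(output_dict)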

100%|██████████| 1/1 [00:02<00:00,  2.68s/it]

[{'1': 'Geriatrics',
  '2': 'Chronic pain',
  'branch': 'Evaluation and management',
  'group': 'Admission',
  'main class': 'Medicine'},
 {'1': 'Geriatrics',
  '2': 'Osteoporosis',
  'branch': 'Evaluation and management',
  'group': 'Admission',
  'main class': 'Medicine'},
 {'1': 'Cardiology',
  '2': 'Atrial fibrillation',
  'branch': 'Evaluation and management',
  'group': 'Admission',
  'main class': 'Medicine'},
 {'1': 'Cardiology',
  '2': 'Hypertension',
  'branch': 'Evaluation and management',
  'group': 'Admission',
  'main class': 'Medicine'},
 {'1': 'Gastroenterology',
  '2': 'Nausea and vomiting',
  'branch': 'Evaluation and management',
  'group': 'Admission',
  'main class': 'Medicine'},
 {'1': 'Urology',
  '2': 'Urinary tract infection',
  'branch': 'Evaluation and management',
  'group': 'Admission',
  'main class': 'Medicine'}]
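
In the output above, only `Medicine` comes from `classes`; labels such as `Geriatrics` and `Urology` are out of bounds. A minimal sketch of a post-hoc check, assuming that `branch` and `group` hold metadata rather than labels; `split_labels` is a hypothetical helper, and the sample data is copied from this cell:

```python
# Allowed labels, copied from the cell above
classes = ['Medicine', 'Oncology', 'Metastasis', 'Breast cancer', 'Lung cancer',
           'Cerebrospinal fluid', 'Tumor microenvironment',
           'Single-cell RNA sequencing', 'Idiopathic intracranial hypertension']

# Two entries copied from the output above
predictions = [
    {'1': 'Geriatrics', '2': 'Chronic pain', 'branch': 'Evaluation and management',
     'group': 'Admission', 'main class': 'Medicine'},
    {'1': 'Urology', '2': 'Urinary tract infection', 'branch': 'Evaluation and management',
     'group': 'Admission', 'main class': 'Medicine'},
]

META_KEYS = {'branch', 'group'}  # assumption: these keys are metadata, not labels


def split_labels(prediction, allowed):
    """Return (in_bounds, out_of_bounds) label lists for one prediction dict."""
    allowed_lower = {label.lower() for label in allowed}
    labels = [value for key, value in prediction.items() if key not in META_KEYS]
    in_bounds = [label for label in labels if label.lower() in allowed_lower]
    out_of_bounds = [label for label in labels if label.lower() not in allowed_lower]
    return in_bounds, out_of_bounds


for prediction in predictions:
    kept, dropped = split_labels(prediction, classes)
    print(f"kept={kept} dropped={dropped}")
```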





### 3: MultiLabel Text Classification with a One-Shot Example 🚀

In [5]:
# Case 3
# Perform multilabel text classification with a one-shot example added to the prompt
# Observe the changes in the model's output
# Output will be a Python object -> [[{"main class": main classification category, "1": 1st-level category, "2": 2nd-level category, ..., "branch": sentence branch, "group": group of sentence}]]

one_shot = "Leptomeningeal metastases (LM) occur in patients with breast cancer (BC) and lung cancer (LC). The cerebrospinal fluid (CSF) tumour microenvironment (TME) of LM patients is not well defined at a single-cell level. We did an analysis based on single-cell RNA sequencing (scRNA-seq) data and four patient-derived CSF samples of idiopathic intracranial hypertension (IIH)"
one_shot = [[one_shot, {'main class': 'Health', '1': 'Medicine', '2': 'Oncology', '3': 'Metastasis', '4': 'Breast cancer', '5': 'Lung cancer', '6': 'Cerebrospinal fluid', '7': 'Tumor microenvironment', '8': 'Single-cell RNA sequencing', '9': 'Idiopathic intracranial hypertension', 'branch': 'Health', 'group': 'Clinical medicine'}]]

classes = ['Medicine','Oncology','Metastasis','Breast cancer','Lung cancer','Cerebrospinal fluid','Tumor microenvironment','Single-cell RNA sequencing','Idiopathic intracranial hypertension']

result = pipe.fit(n_output_labels = len(classes),
                  domain          = 'clinical',
                  text_input      = sent,
                  examples        = one_shot,
                  labels          = classes)
try:
    output = result[0].get('text').replace("'", '"')
    output_dict = json.loads(output)
except Exception as e:
    print(f"An unexpected error occurred: {e}")
else:
    # Output (printed only on success, so the previous cell's result is never shown)
    pprint.pprint(output_dict)

  0%|          | 0/1 [00:22<?, ?it/s]

Error in model execution: RetryError[<Future at 0x7fdd672ae740 state=finished raised RateLimitError>]
An unexpected error occurred: 'NoneType' object is not subscriptable
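
The `RateLimitError` in this cell means the request was throttled even after the library's own retries. One option is to wait longer between attempts. A minimal sketch, assuming the wrapped call returns `None` on failure (as `pipe.fit` appears to here, given the `'NoneType' object is not subscriptable` message); `call_with_backoff` is a hypothetical helper, and the stub below stands in for `pipe.fit`:

```python
import time


def call_with_backoff(fn, max_attempts=4, base_delay=2.0, sleep=time.sleep):
    """Call fn() until it returns a non-None result, doubling the wait each time."""
    for attempt in range(max_attempts):
        try:
            result = fn()
        except Exception as exc:  # broad on purpose: the wrapped call is opaque
            print(f"Attempt {attempt + 1} failed: {exc}")
            result = None
        if result is not None:
            return result
        if attempt < max_attempts - 1:
            sleep(base_delay * 2 ** attempt)  # waits 2s, 4s, 8s, ...
    return None


# Demo with a stub that fails twice, then succeeds (waits are recorded, not slept)
attempts = {"count": 0}


def flaky_call():
    attempts["count"] += 1
    return None if attempts["count"] < 3 else [{"text": "[]"}]


waits = []
result = call_with_backoff(flaky_call, sleep=waits.append)
print(result, waits)
```

With the notebook's objects, the call would be wrapped as `call_with_backoff(lambda: pipe.fit(...))`, keeping the same keyword arguments as in the cell.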
  




### 4: MultiLabel Text Classification with some Domain Knowledge 🚀

In [6]:
# Case 4
# Add domain knowledge and a description to the prompt to improve the output

one_shot = "Leptomeningeal metastases (LM) occur in patients with breast cancer (BC) and lung cancer (LC). The cerebrospinal fluid (CSF) tumour microenvironment (TME) of LM patients is not well defined at a single-cell level. We did an analysis based on single-cell RNA sequencing (scRNA-seq) data and four patient-derived CSF samples of idiopathic intracranial hypertension (IIH)"
one_shot = [[one_shot, {'main class': 'Health', '1': 'Medicine', '2': 'Oncology', '3': 'Metastasis', '4': 'Breast cancer', '5': 'Lung cancer', '6': 'Cerebrospinal fluid', '7': 'Tumor microenvironment', '8': 'Single-cell RNA sequencing', '9': 'Idiopathic intracranial hypertension', 'branch': 'Health', 'group': 'Clinical medicine'}]]
classes = ['Medicine','Oncology','Metastasis','Breast cancer','Lung cancer','Cerebrospinal fluid','Tumor microenvironment','Single-cell RNA sequencing','Idiopathic intracranial hypertension']

result = pipe.fit(n_output_labels = len(classes),
                  domain          = 'clinical',
                  text_input      = sent,
                  examples        = one_shot,
                  description     = "The paragraph below is from a patient's discharge summary. It describes the patient's condition and symptoms.",
                  labels          = classes)
try:
    output = result[0].get('text').replace("'", '"')
    output_dict = json.loads(output)
except Exception as e:
    print(f"An unexpected error occurred: {e}")
else:
    # Output (printed only on success, so the previous cell's result is never shown)
    pprint.pprint(output_dict)

  0%|          | 0/1 [00:09<?, ?it/s]

Error in model execution: RetryError[<Future at 0x7fdd913ab940 state=finished raised RateLimitError>]
An unexpected error occurred: 'NoneType' object is not subscriptable
  




### Named Entity Recognition (NER) Example

In [7]:
# NER example

text_input = """The patient is a 93-year-old female with a medical \
history of chronic right hip pain, osteoporosis, hypertension, depression, and chronic atrial \
fibrillation admitted for evaluation and management of severe nausea and vomiting and urinary tract \
infection"""

prompter = Prompter('ner.jinja')
pipe = Pipeline(prompter, model)
result = pipe.fit(text_input = text_input,
                  domain = 'medical',
                  labels = None
                )
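try:
    # Swap single quotes for double quotes so the text parses as JSON
    output = result[0].get('text').replace("'", '"')
    output_dict = json.loads(output)
except Exception as e:
    print(f"An unexpected error occurred: {e}")
else:
    # Output (printed only when parsing succeeded, so a stale value is never shown)
    pprint.pprint(output_dict)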

100%|██████████| 1/1 [00:26<00:00, 26.88s/it]

[{'E': '93', 'T': 'age'},
 {'E': 'female', 'T': 'gender'},
 {'E': 'chronic right hip pain', 'T': 'medical_condition'},
 {'E': 'osteoporosis', 'T': 'medical_condition'},
 {'E': 'hypertension', 'T': 'medical_condition'},
 {'E': 'depression', 'T': 'medical_condition'},
 {'E': 'chronic atrial fibrillation', 'T': 'medical_condition'},
 {'E': 'severe nausea', 'T': 'symptom'},
 {'E': 'vomiting', 'T': 'symptom'},
 {'E': 'urinary tract infection', 'T': 'medical_condition'},
 {'branch': 'medical', 'group': 'patient'}]
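
Because the result is a plain list of dictionaries, it can be filtered and regrouped with ordinary Python. A minimal sketch that groups the entities above by type, assuming each entity entry has both an `E` and a `T` key; `group_entities` is a hypothetical helper:

```python
from collections import defaultdict

# Output copied from the cell above
entities = [
    {'E': '93', 'T': 'age'},
    {'E': 'female', 'T': 'gender'},
    {'E': 'chronic right hip pain', 'T': 'medical_condition'},
    {'E': 'osteoporosis', 'T': 'medical_condition'},
    {'E': 'hypertension', 'T': 'medical_condition'},
    {'E': 'depression', 'T': 'medical_condition'},
    {'E': 'chronic atrial fibrillation', 'T': 'medical_condition'},
    {'E': 'severe nausea', 'T': 'symptom'},
    {'E': 'vomiting', 'T': 'symptom'},
    {'E': 'urinary tract infection', 'T': 'medical_condition'},
    {'branch': 'medical', 'group': 'patient'},
]


def group_entities(items):
    """Map each entity type to the list of entity strings with that type."""
    grouped = defaultdict(list)
    for item in items:
        if 'E' in item and 'T' in item:  # skip the trailing branch/group entry
            grouped[item['T']].append(item['E'])
    return dict(grouped)


grouped = group_entities(entities)
for entity_type, values in grouped.items():
    print(f"{entity_type}: {values}")
```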





### MultiClass Text Classification Example

In [8]:
# Multiclass text classification example
# Pass labels as a list: a set raises "TypeError: Object of type set is not JSON serializable"
labels = ['surprise', 'neutral', 'hate', 'joy', 'worry', 'sadness']

prompter = Prompter('multiclass_classification.jinja')
pipe = Pipeline(prompter, model)
result = pipe.fit(text_input = "The customer service is pretty good but it can be better.",
                  labels = labels
                )
try:
    output = result[0].get('text').replace("'", '"')
    output_dict = json.loads(output)
except Exception as e:
    print(f"An unexpected error occurred: {e}")
else:
    # Output (printed only when parsing succeeded)
    pprint.pprint(output_dict)
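
Python's standard `json` module cannot serialize a `set`, which is why `labels` needs to be a list in this cell. A minimal reproduction using only the standard library (it does not call Promptify):

```python
import json

labels = {'surprise', 'neutral', 'hate', 'joy', 'worry', 'sadness'}

# Serializing a set fails with a TypeError
try:
    json.dumps({"labels": labels})
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)

# Converting to a sorted list makes the payload serializable and deterministic
payload = json.dumps({"labels": sorted(labels)})
print(payload)
```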

### Binary Text Classification Example

In [9]:
prompter = Prompter('binary_classification.jinja')
pipe = Pipeline(prompter, model)
# Do not pass model_name here: the call already receives it internally, and passing it again raises
# "TypeError: Prompter.generate() got multiple values for argument 'model_name'"
result = pipe.fit(text_input = "The customer service is pretty good but it can be better.",
                  label_0 = "positive",
                  label_1 = "negative"
                )
try:
    output = result[0].get('text').replace("'", '"')
    output_dict = json.loads(output)
except Exception as e:
    print(f"An unexpected error occurred: {e}")
else:
    # Output (printed only when parsing succeeded)
    pprint.pprint(output_dict)
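
A "got multiple values for argument" error occurs when a wrapper fills a parameter itself and the caller also passes that name as a keyword. A minimal reproduction with hypothetical stand-in functions; `generate` and `fit` here are illustrations, not Promptify's actual implementation:

```python
def generate(text_input, model_name, **kwargs):
    """Stand-in for a function that takes model_name as a named parameter."""
    return f"{model_name}: {text_input}"


def fit(text_input, **kwargs):
    """Stand-in for a wrapper that supplies model_name and forwards the rest."""
    return generate(text_input, "pipeline-model", **kwargs)


# Works: the wrapper is the only source of model_name
print(fit("hello"))

# Fails: model_name arrives both positionally and through **kwargs
try:
    fit("hello", model_name="gpt-3.5-turbo")
    error_message = None
except TypeError as exc:
    error_message = str(exc)
print(error_message)
```

The fix on the caller's side is to drop the duplicated keyword and let the wrapper choose the model.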