<a href="https://colab.research.google.com/github/imaabay/CA2_Repository/blob/main/method2/gpt_experiments/ST02A_v2.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [None]:
!pip install langchain langchain-community langchain-openai


In [None]:
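from langchain_openai import OpenAI, ChatOpenAI
# Import from langchain_core; the top-level `from langchain import PromptTemplate` path is deprecated.
from langchain_core.prompts import PromptTemplate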
import os
from google.colab import userdata

os.environ["OPENAI_API_KEY"] = userdata.get('OPENAI_API_KEY')

In [None]:
os.environ["LANGCHAIN_TRACING_V2"] = "true"
os.environ["LANGCHAIN_ENDPOINT"] = "https://api.smith.langchain.com"
os.environ["LANGCHAIN_API_KEY"] = userdata.get("LANGCHAIN_API_KEY")

## Setup Evaluation Metrics

In [None]:
from langchain_openai import ChatOpenAI # Try ChatAnthropic as well
from langchain_core.prompts.prompt import PromptTemplate
from langsmith.evaluation import LangChainStringEvaluator

_PROMPT_TEMPLATE = """
  You are an expert tasked with evaluating the explainability of large language model-generated answers for medical diagnoses.
  Your role is to assess whether the given answers provide sufficient explanation and clarity for a user to understand the medical diagnosis.
  You are assessing the following question:
  {query}
  Here is the real answer:
  {context}
  You are assessing the following predicted answer:
  {result}
"""

PROMPT = PromptTemplate(
    input_variables=["query", "context", "result"],
    template=_PROMPT_TEMPLATE,
)

eval_llm = ChatOpenAI(temperature=0)

evaluators = [
  LangChainStringEvaluator("context_qa", config={"llm": eval_llm, "prompt": PROMPT}),
  LangChainStringEvaluator("labeled_criteria", config={"criteria": "conciseness"}),
  LangChainStringEvaluator("labeled_criteria", config={"criteria": "coherence"}),
  LangChainStringEvaluator("labeled_criteria", config={"criteria": "detail"}),
  LangChainStringEvaluator("labeled_criteria", config={"criteria": "helpfulness"}),
  LangChainStringEvaluator("labeled_criteria", config={"criteria": "depth"}),
  LangChainStringEvaluator("labeled_criteria", config={"criteria": "insensitivity"}),
  LangChainStringEvaluator("labeled_criteria", config={"criteria": "harmfulness"}),
]
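
Before running the evaluators, it can help to preview what the grading LLM will actually receive. The sketch below is a stand-in only: it uses plain `str.format` rather than `PromptTemplate`, a shortened copy of the template, and made-up sample values (`preview_prompt`, `PREVIEW_TEMPLATE` and the sample strings are hypothetical names introduced here). No LLM call is made.

```python
# Stand-in preview: str.format fills the same {query}/{context}/{result}
# placeholders that the evaluation prompt declares. No LLM call is made.
PREVIEW_TEMPLATE = """
  You are assessing the following question:
  {query}
  Here is the real answer:
  {context}
  You are assessing the following predicted answer:
  {result}
"""


def preview_prompt(template: str, query: str, context: str, result: str) -> str:
    """Return the template with all three placeholders filled in."""
    return template.format(query=query, context=context, result=result)


# Hypothetical sample values, for illustration only.
rendered = preview_prompt(
    PREVIEW_TEMPLATE,
    query="What is the most likely diagnosis?",
    context="Reference answer goes here.",
    result="Model answer goes here.",
)
print(rendered)
```

Reading the rendered text also makes it visible that the template, as written, ends after the predicted answer and does not ask the grader for a verdict or output format.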

## Templates

In [None]:
template_1 = """
You are an AI chatbot designed to assist doctors in diagnosing patients using causal and counterfactual reasoning methods. Your
    role is to provide accurate diagnoses based on the information provided.

    Reason through this step by step based on the following:
      1. Analyse the underlying causes of symptoms while also exploring counterfactuals: what might have occurred
        under different circumstances or in the absence of certain symptoms.
      2. Clearly explain the reasoning behind each diagnosis, highlighting cause-and-effect relationships and any relevant
        counterfactual scenarios.
      3. For the final diagnosis, present the one with the highest probability and include the reasoning.

Always ensure that the information you provide is truthful, reliable, and based on established medical knowledge.
"""

In [None]:
template_2 = """
    You are an AI assistant designed to help doctors diagnose patients using causal and counterfactual reasoning methods. Your role
    is to provide accurate diagnoses based on the information provided, following these steps:
    1. Analyse the symptoms:
      a. List the patient's symptoms and relevant medical history
      b. Identify potential underlying causes for each symptom.
      c. Consider how these symptoms might interact or influence each other.
    2. Explore causal relationships:
      a. Create a simple causal graph showing how symptoms and potential causes might be linked.
    3. Consider counterfactuals:
      a. Analyse the underlying causes of symptoms while also exploring counterfactuals: what might have occurred
        under different circumstances or in the absence of certain symptoms.
    4. Develop potential diagnoses:
      a. Based on the causal analysis and counterfactual reasoning, list possible diagnoses.
      b. For each diagnosis, explain why it fits the symptoms and causal relationships observed.
    5. Present the most likely diagnosis:
      a. Identify the diagnosis with the highest probability.
      b. Clearly explain why this diagnosis is most likely, referring back to the causal relationships and counterfactual scenarios discussed earlier.
    6. Suggest next steps:
      a. Recommend any additional tests or examinations that could confirm or rule out the diagnosis.
      b. If applicable, suggest potential treatment options, explaining how they address the causal factors identified.

    Remember to always base your analysis on established medical knowledge and current best practices. If there is any uncertainty, or if multiple diagnoses seem equally likely, clearly state this and explain why further investigation might be needed.

    Reason through this step by step. Please present your answer clearly, using distinct titles and subtitles.

"""

In [None]:
template_3 = """
   You are an AI chatbot designed to help doctors in diagnosing patients. Your
    role is to provide accurate diagnoses based on the information provided,
    following Pearl's three-layer causal hierarchy.

    Reason through this step by step:

    1. Association (Level 1):
     a. What symptoms does the patient present?
     b. What conditions are commonly associated with these symptoms?
     c. What is the likelihood of each potential diagnosis given the symptoms?

    2. Intervention (Level 2):
     a. What diagnostic tests or interventions would you recommend to confirm or rule
       out potential diagnoses?
     b. How would the results of these tests affect the likelihood of each diagnosis?
     c. What treatments would you consider, and how might they impact the patient's condition?

    3. Counterfactuals (Level 3):
     a. If a certain symptom were absent, how would that change the diagnosis?
     b. What if the patient has risk factors?
     c. How would the outcome differ if an alternative treatment were chosen?

   Always ensure that the information you provide is truthful, reliable, and based on established medical knowledge.
   Explain your reasoning at each step to make the diagnosis process more transparent and explainable.

"""

## Run Evaluation

In [None]:
import openai

openai_client = openai.Client()

def my_app_v1(question):
    return openai_client.chat.completions.create(
        model="gpt-4o-mini",
        temperature=0,
        messages=[
            {
                "role": "system",
                "content": template_3,
            },
            {
                "role": "user",
                "content": question,
            }
        ],
    ).choices[0].message.content

def langsmith_app(inputs):
    output = my_app_v1(inputs["question"])
    return {"output": output}
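
The app above hard-codes `template_3`. One possible way to evaluate all three templates without copying the function is a small factory that takes the system prompt as a parameter. This is only a sketch: `make_target`, `fake_complete` and the short placeholder prompts are hypothetical names introduced here, and the stub backend stands in for the real `openai_client.chat.completions.create(...)` call so that the example runs without an API key.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def make_target(
    system_prompt: str,
    complete: Callable[[List[Message]], str],
) -> Callable[[dict], dict]:
    """Build a target function: dataset inputs in, {"output": ...} out."""

    def target(inputs: dict) -> dict:
        messages = [
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": inputs["question"]},
        ]
        return {"output": complete(messages)}

    return target


# Stub backend for illustration; in the notebook this would wrap the
# chat completion call and return the message content as a string.
def fake_complete(messages: List[Message]) -> str:
    return f"[{len(messages)} messages] {messages[-1]['content']}"


# Hypothetical short prompts standing in for template_1..template_3.
templates = {"template1": "Prompt A", "template2": "Prompt B", "template3": "Prompt C"}
targets = {name: make_target(text, fake_complete) for name, text in templates.items()}

print(targets["template3"]({"question": "Patient has a fever."}))
```

Each entry in `targets` has the same input and output shape as `langsmith_app`, so under this assumption it could be passed to the evaluation call with a matching experiment prefix.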

In [None]:
from langsmith import evaluate

experiment_results = evaluate(
    langsmith_app,  # AI system under evaluation
    data="XAI_EVAL",
    evaluators=evaluators,
    experiment_prefix="causal-ai-template3-evaluation-gpt-4o-mini"
)

View the evaluation results for experiment: 'causal-ai-template3-evaluation-gpt-4o-mini-7e8b526f' at:
https://smith.langchain.com/o/683c5cb9-3b64-5127-a5a8-405b032642f2/datasets/b7b8dceb-8703-42f2-92ba-cfa8aad7542b/compare?selectedSessions=8017ef0d-c71f-44e7-959d-833c951f53ed




0it [00:00, ?it/s]

ERROR:langsmith.evaluation._runner:Error running target function: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/langsmith/evaluation/_runner.py", line 1629, in _forward
    fn(
  File "/usr/local/lib/python3.10/dist-packages/langsmith/run_helpers.py", line 614, in wrapper
    raise e
  File "/usr/local/lib/python3.10/dist-packages/langsmith/run_helpers.py", line 611, in wrapper
    function_result = run_container["context"].run(func, *args, **kwargs)
  File "<ipython-input-6-3bf539279eca>", line 22, in langsmith_app
    output = my_app_v1(inputs["question"])
  File "<ipython-input-6-3bf539279eca>", line 6, in my_app_v1
    return openai_client.

KeyboardInterrupt: 
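
The run above stopped on a 429 `insufficient_quota` response. Retrying does not help when the quota is exhausted, but it can help with short-lived rate limits, so a wrapper may want to treat the two cases differently. The following is a minimal sketch using only the standard library: `QuotaExhausted`, `TransientRateLimit` and `call_with_backoff` are hypothetical names, and mapping the real client's errors onto these two classes (for example by inspecting the error payload) is left as an assumption.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class QuotaExhausted(Exception):
    """Hypothetical stand-in for a non-retryable 'insufficient_quota' error."""


class TransientRateLimit(Exception):
    """Hypothetical stand-in for a retryable rate-limit error."""


def call_with_backoff(
    fn: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn, retrying on TransientRateLimit with exponential backoff.

    QuotaExhausted is not caught, so it propagates on the first failure.
    """
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    attempt = 0
    while True:
        try:
            return fn()
        except TransientRateLimit:
            attempt += 1
            if attempt >= max_attempts:
                raise
            # Delays grow as base_delay * 1, 2, 4, ...
            sleep(base_delay * 2 ** (attempt - 1))


# Demonstration with a fake call that fails twice, then succeeds.
delays = []
state = {"calls": 0}


def flaky() -> str:
    state["calls"] += 1
    if state["calls"] <= 2:
        raise TransientRateLimit()
    return "ok"


result = call_with_backoff(flaky, sleep=delays.append)
print(result, delays)
```

Passing `sleep=delays.append` records the waits instead of actually sleeping, which keeps the demonstration instant.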