In [None]:
import google.generativeai as genai
import time

# configure() sets the API key globally and returns None, so there is no client object to keep
genai.configure(api_key='***')

model = genai.GenerativeModel(
  model_name="gemini-2.5-flash-preview-05-20",
  system_instruction="""
You are a biomedical language expert tasked with converting paragraphs from PubMed articles into a structured, intermediate causal representation using JSON.
This representation will capture all causal relationships found in the text.

Extraction Rules:

1. Output Structure:
   - Each causal relation must be represented as a separate JSON object.
   - Each relation must contain:
     - A single effect (as a string)
     - One or more causes (as a list of strings)

2. Cause Normalization:
   - Do NOT differentiate between causes and conditional features. For example, phrases like “during fasting” or “in presence of infection” must be merged with their associated cause.
   - Treat temporality, context, and conditions as integral parts of the cause if they appear together.

3. Logic Operators:
   Each relation must include a "logic" field describing interaction between causes:
   - "-" : for a single direct cause
   - "NOT" : for a single cause with negation
   - "AND", "OR", "XOR", "NAND", "NOR", "NXOR" : for multiple interacting causes
   Operator Meanings:
   - "-" (No operator): Use this when there is only one cause (i.e., direct causality without logical interaction).
   - "NOT": The single cause must be **negated** — meaning the effect occurs **when the cause is absent or false**.
   - "AND": The effect occurs **only when all causes are present** simultaneously.
   - "OR": The effect occurs **if at least one of the causes is present**.
   - "XOR" (exclusive OR): The effect occurs **if exactly one cause is present**, but **not more than one**.
   - "NAND": The effect occurs **unless all causes are present together** (i.e., **inhibition** of the effect when all causes are true).
   - "NOR": The effect occurs **only when none of the causes are present**.
   - "NXOR" (exclusive NOR): The effect occurs **when all causes are either present or all absent** — i.e., the causes must be logically aligned in state.
   Additional Notes:
   - When using any negation-based logic ("NOT", "NAND", "NOR", "NXOR"), each event must still be positively defined. Capture negative relations with those logic operators.
   - For example, use "elevated levels of ALP" as a cause, and apply "NOT" logic — instead of writing "non-elevated ALP" or "absence of ALP elevation" directly in the cause.
   - Do not encode negation or absence directly in the cause text — negation must always be expressed using the "logic" field.

4. Probability Reference:
   - If the sentence includes a word or phrase expressing likelihood or certainty (such as: may, might, likely, suggests, probably, certainly, often, etc.), put it in the "linguistic_probability_reference" field.
   - If no such expression exists, write "-" for the field.
   - Do not write causative expressions as probability references, since a relation (the JSON object) carries that meaning by itself.
   - Do not write any expression without a probabilistic meaning as a probability reference.

4a. Probability Scope Precision:
   - When a probability term (e.g., “often”, “likely”, “may”) is present, apply it only to the cause(s) directly modified by that expression in the sentence.
   - Do NOT assign a shared probability to a group of causes unless all of them are clearly under the same probabilistic modifier.
   - If only one of the causes is modified by a probability term and others are not, split them into separate JSON objects.

5. Effect Decomposition:
   - If a sentence includes multiple effects (e.g., a list of outcomes), decompose them into atomic parts and create a separate JSON object for each effect.
   - Do NOT include multiple effects in a single JSON object.

6. Hidden Meaning Inference:
   - Do not extract surface mentions like “bilirubin” or “ALT” without context.
   - Instead, rewrite both causes and effects to reflect their full inferred meaning. For example, if the sentence says “bilirubin”, and context implies elevation, extract “raised levels of bilirubin”.
   - Apply the same principle for both causes and effects.

7. Event Discrimination and Atomic Causal Elements:
   - Every item in the "causes" list and the "effect" field must represent a **single, atomic causal event**.
   - If a sentence contains multiple events (including measurements of different variables or a relationship between them), each event must be **explicitly separated** and encoded as its own causal element.
   - For example, if a phrase expresses three atomic events, all three should be captured as distinct elements in the "causes" list.
   - Avoid merging multiple events into a single string. Use logic gates (e.g., "AND") to connect atomic events, not to merge partial descriptions.

8. Effect Atomicity and Relation Duplication:
   - The "effect" field must always contain a single, atomic causal event.
   - If the effect describes a pattern involving multiple variables or outcomes (e.g., “ALP and GGT fluctuate”), split it into separate causal relations, one per variable or outcome.
   - Unlike "causes", "effect" cannot contain logic gates or multiple event references. Duplicate the relation with the same causes and probabilistic reference but separate effects.

Output Schema:
Each relation must be formatted as:

{
  "causes": ["descriptive cause phrase", ...],
  "logic": "AND | OR | NOT | XOR | NOR | NAND | NXOR | -",
  "linguistic_probability_reference": "e.g., may | likely | certainly | -",
  "effect": "descriptive effect phrase"
}

Example 1 Input Sentence:
"Mechanical biliary obstruction results in raised levels of ALP, GGT and often bilirubin."

Example 1 Output:

{
  "causes": ["Mechanical biliary obstruction"],
  "logic": "-",
  "linguistic_probability_reference": "-",
  "effect": "raised levels of ALP"
},
{
  "causes": ["Mechanical biliary obstruction"],
  "logic": "-",
  "linguistic_probability_reference": "-",
  "effect": "raised levels of GGT"
},
{
  "causes": ["Mechanical biliary obstruction"],
  "logic": "-",
  "linguistic_probability_reference": "often",
  "effect": "raised levels of bilirubin"
}

Example 2 Input Sentence:
"Elevation of troponin levels is more likely in patients with myocardial infarction during early reperfusion."

Example 2 Output:

{
  "causes": ["myocardial infarction", "early reperfusion phase"],
  "logic": "AND",
  "linguistic_probability_reference": "more likely",
  "effect": "elevated levels of troponin"
}

Task:
Given the following paragraph from a PubMed article, extract all causal relations and represent them as independent JSON objects following the rules above.

Input Text:
"""
)
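
The operator meanings listed in the system instruction can be read as truth tables over the causes. The sketch below is one possible reading of those definitions, not part of the notebook's pipeline; the function name `evaluate_logic` and the convention of passing cause states as booleans are assumptions made for illustration.

```python
from typing import Sequence


def evaluate_logic(logic: str, states: Sequence[bool]) -> bool:
    """Return True if the effect is expected, given which causes are present.

    `states` holds one boolean per entry of the "causes" list
    (hypothetical convention, not defined in the prompt itself).
    """
    n_true = sum(bool(s) for s in states)
    total = len(states)

    # Single-cause operators
    if logic in ("-", "NOT"):
        if total != 1:
            raise ValueError(f"{logic!r} expects exactly one cause")
        return n_true == 1 if logic == "-" else n_true == 0

    # Multi-cause operators
    if total < 2:
        raise ValueError(f"{logic!r} expects at least two causes")
    if logic == "AND":
        return n_true == total          # all causes present
    if logic == "OR":
        return n_true >= 1              # at least one cause present
    if logic == "XOR":
        return n_true == 1              # exactly one cause present
    if logic == "NAND":
        return n_true < total           # anything except all present
    if logic == "NOR":
        return n_true == 0              # no cause present
    if logic == "NXOR":
        return n_true in (0, total)     # all present or all absent
    raise ValueError(f"unknown logic operator: {logic!r}")
```

For instance, `evaluate_logic("AND", [True, True])` corresponds to Example 2 in the prompt, where both myocardial infarction and the early reperfusion phase are present.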

In [68]:
response = model.generate_content("""Alcohol induces hepatic enzymes leading to a raised GGT
with an ALP which may be normal, or disproportionately
lower than the GGT. A GGT:ALP ratio >2.5 in association
with jaundice suggests alcohol as a cause of liver disease.10;11
The presence of a macrocytosis, due to either an associated
dietary deficiency of folate or B12, or due to a direct
suppression of bone marrow by alcohol is supportive of the
diagnosis of alcoholic liver disease. A raised GGT is not
diagnostic of alcohol abuse, with research showing it remains
high in former drinkers as well as current drinkers. In men,
the highest levels of GGT occur in those who drink daily. In
women, binge drinkers and those consuming alcohol without
food will have especially high levels. The level of GGT is
loosely dose dependant, with those in the top two quartiles
of alcohol intake having the highest titres.""")
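
The call above is issued once, with no handling for transient failures such as rate limits. A minimal sketch of a retry wrapper with exponential backoff follows; `call_with_retry` and its parameters are hypothetical helpers, not part of `google.generativeai`, and catching every `Exception` is a simplification for illustration.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def call_with_retry(
    fn: Callable[[], T],
    retries: int = 3,
    base_delay: float = 2.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call fn(), retrying on any exception with exponential backoff.

    The delay doubles after each failed attempt: base_delay, 2*base_delay, ...
    The last failure is re-raised unchanged.
    """
    for attempt in range(retries):
        try:
            return fn()
        except Exception:
            if attempt == retries - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise ValueError("retries must be at least 1")


# Possible usage with the model defined earlier (paragraph is a placeholder name):
# response = call_with_retry(lambda: model.generate_content(paragraph))
```

In practice one would likely narrow the `except` clause to the specific error types raised for quota or network problems.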

In [69]:
import re
import json

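# Step 1: Strip the Markdown json code fence the model may wrap around its answer
raw_json = response.text
raw_json = re.sub(r"^```json\n|\n```$", "", raw_json.strip())

# Step 2: Replace Python tuples `()` with valid JSON arrays `[]`
# (note: this also rewrites any parentheses that appear inside string values)
raw_json = raw_json.replace("(", "[").replace(")", "]")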

# Step 3: Parse into a JSON object
try:
    parsed_json = json.loads(raw_json)
    print(json.dumps(parsed_json, indent=4))  # Pretty print JSON
except json.JSONDecodeError as e:
    print("JSON Decode Error:", e)
    print("Fixed JSON String:", raw_json)

[
    {
        "causes": [
            "Alcohol consumption"
        ],
        "logic": "-",
        "linguistic_probability_reference": "-",
        "effect": "induction of hepatic enzymes"
    },
    {
        "causes": [
            "induction of hepatic enzymes"
        ],
        "logic": "-",
        "linguistic_probability_reference": "-",
        "effect": "raised levels of GGT"
    },
    {
        "causes": [
            "Alcohol consumption"
        ],
        "logic": "-",
        "linguistic_probability_reference": "may",
        "effect": "normal levels of ALP"
    },
    {
        "causes": [
            "Alcohol consumption"
        ],
        "logic": "-",
        "linguistic_probability_reference": "may",
        "effect": "disproportionately lower levels of ALP compared to GGT"
    },
    {
        "causes": [
            "GGT:ALP ratio >2.5",
            "presence of jaundice"
        ],
        "logic": "AND",
        "linguistic_probability_reference": "suggests
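
The parsed output can also be checked against the schema stated in the prompt before it is used further. The sketch below is an assumption-laden illustration: `parse_relations` and `validate_relation` are hypothetical helpers, and the fallback that wraps bare comma-separated objects in brackets is a guess at one way the model might format its answer (the prompt's own examples show objects without an enclosing array).

```python
import json
import re

ALLOWED_LOGIC = {"-", "NOT", "AND", "OR", "XOR", "NAND", "NOR", "NXOR"}
REQUIRED_KEYS = {"causes", "logic", "linguistic_probability_reference", "effect"}


def parse_relations(text: str) -> list:
    """Strip an optional Markdown fence and parse the text into a list of relations."""
    body = re.sub(r"^```(?:json)?\s*|\s*```$", "", text.strip())
    try:
        data = json.loads(body)
    except json.JSONDecodeError:
        # Fallback: comma-separated objects without an enclosing array
        data = json.loads("[" + body + "]")
    return data if isinstance(data, list) else [data]


def validate_relation(rel: dict) -> list:
    """Return a list of human-readable problems; an empty list means the relation is valid."""
    missing = REQUIRED_KEYS - set(rel)
    if missing:
        return [f"missing keys: {sorted(missing)}"]

    problems = []
    causes = rel["causes"]
    causes_ok = (
        isinstance(causes, list)
        and len(causes) > 0
        and all(isinstance(c, str) for c in causes)
    )
    if not causes_ok:
        problems.append("causes must be a non-empty list of strings")

    logic = rel["logic"]
    if logic not in ALLOWED_LOGIC:
        problems.append(f"unknown logic operator: {logic!r}")
    elif causes_ok:
        if logic in ("-", "NOT") and len(causes) != 1:
            problems.append(f"{logic!r} requires exactly one cause")
        if logic not in ("-", "NOT") and len(causes) < 2:
            problems.append(f"{logic!r} requires at least two causes")

    if not isinstance(rel["effect"], str) or not rel["effect"].strip():
        problems.append("effect must be a non-empty string")
    if not isinstance(rel["linguistic_probability_reference"], str):
        problems.append("linguistic_probability_reference must be a string")
    return problems
```

Running `validate_relation` over every element returned by `parse_relations(response.text)` would flag, for example, an "AND" relation that lists only one cause.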

In [None]:
# List the names of the models available to this API key
for m in genai.list_models():
    print(m.name)

models/embedding-gecko-001
models/gemini-1.5-pro-latest
models/gemini-1.5-pro-002
models/gemini-1.5-pro
models/gemini-1.5-flash-latest
models/gemini-1.5-flash
models/gemini-1.5-flash-002
models/gemini-1.5-flash-8b
models/gemini-1.5-flash-8b-001
models/gemini-1.5-flash-8b-latest
models/gemini-2.5-pro-preview-03-25
models/gemini-2.5-flash-preview-05-20
models/gemini-2.5-flash
models/gemini-2.5-flash-lite-preview-06-17
models/gemini-2.5-pro-preview-05-06
models/gemini-2.5-pro-preview-06-05
models/gemini-2.5-pro
models/gemini-2.0-flash-exp
models/gemini-2.0-flash
models/gemini-2.0-flash-001
models/gemini-2.0-flash-exp-image-generation
models/gemini-2.0-flash-lite-001
models/gemini-2.0-flash-lite
models/gemini-2.0-flash-preview-image-generation
models/gemini-2.0-flash-lite-preview-02-05
models/gemini-2.0-flash-lite-preview
models/gemini-2.0-pro-exp
models/gemini-2.0-pro-exp-02-05
models/gemini-exp-1206
models/gemini-2.0-flash-thinking-exp-01-21
models/gemini-2.0-flash-thinking-exp
models/ge