# Python LLaMA-3 (Ollama)

In [1]:
import ollama

response = ollama.chat(
    model="llama3",
    messages=[{"role": "user", "content": "Explain gene–disease association in one line"}]
)

print(response["message"]["content"])

Gene-disease association refers to the identification of a specific genetic variation (e.g. single nucleotide polymorphism or mutation) that is strongly linked to an increased risk or susceptibility to developing a particular disease, such as cancer, diabetes, or Alzheimer's.


# Python Chat Example (System + User + Assistant): LLaMA-3 (Ollama)

In [2]:
import ollama

messages = [
    {
        "role": "system",
        "content": "You are a helpful biomedical research assistant."
    },
    {
        "role": "user",
        "content": "Explain gene–disease association in simple terms."
    },
    {
        "role": "assistant",
        "content": "A gene–disease association means that changes in a gene can increase the risk of developing a specific disease."
    },
    {
        "role": "user",
        "content": "Give one real example."
    }
]

response = ollama.chat(
    model="llama3",
    messages=messages
)

print(response["message"]["content"])

One well-known example is the BRCA1 and BRCA2 genes, which are associated with an increased risk of breast and ovarian cancer.

Normal copies of these genes help to repair DNA damage in cells. However, when there's a mutation (change) in the BRCA1 or BRCA2 gene, it can impair this DNA repair process. As a result, cells that have this faulty gene may become more prone to developing cancerous tumors.

Women who inherit a mutated copy of the BRCA1 or BRCA2 gene are at a higher risk of developing breast and ovarian cancers compared to those without these mutations. In fact, studies have shown that women with a BRCA1 mutation have about a 65% chance of developing breast cancer by age 70, while those with a BRCA2 mutation have around a 45% chance.

This is just one example of how gene–disease associations can help us understand the connection between genetic changes and the development of specific diseases.
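The `messages` list above is the entire conversation state: the model sees earlier turns only because they are sent again with each call. Below is a minimal sketch of keeping that history up to date across turns. `continue_chat`, `ollama_chat_fn`, and `echo_fn` are hypothetical helpers, not part of the `ollama` package; `echo_fn` is a stand-in so the example runs without a local model.

```python
from typing import Callable, Dict, List

Message = Dict[str, str]


def continue_chat(
    history: List[Message],
    user_text: str,
    chat_fn: Callable[[List[Message]], str],
) -> str:
    """Append a user turn, ask chat_fn for a reply, and record the reply."""
    history.append({"role": "user", "content": user_text})
    reply = chat_fn(history)
    history.append({"role": "assistant", "content": reply})
    return reply


def ollama_chat_fn(messages: List[Message]) -> str:
    """Backend that calls a local Ollama server (assumes `llama3` is pulled)."""
    import ollama  # imported lazily so the helper works without Ollama installed

    response = ollama.chat(model="llama3", messages=messages)
    return response["message"]["content"]


def echo_fn(messages: List[Message]) -> str:
    """Stand-in backend for trying the helper without a model."""
    return f"(reply to: {messages[-1]['content']})"


history = [
    {"role": "system", "content": "You are a helpful biomedical research assistant."}
]
continue_chat(history, "Give one real example.", echo_fn)
print([m["role"] for m in history])  # ['system', 'user', 'assistant']
```

Passing `ollama_chat_fn` instead of `echo_fn` sends the accumulated history to the model, so a follow-up such as "Give one real example." is answered in the context of the earlier turns.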


# Moderation + LLaMA-3 (Ollama)

In [3]:
import ollama

# Simple keyword-based moderation
BANNED_KEYWORDS = [
    "kill", "bomb", "terrorist", "suicide", "rape", "murder"
]

def is_safe(text):
    text_lower = text.lower()
    for word in BANNED_KEYWORDS:
        if word in text_lower:
            return False
    return True


messages = [
    {"role": "system", "content": "You are a helpful assistant."}
]

user_input = "I want to build a bomb"

# Moderation check
if not is_safe(user_input):
    print("❌ Content blocked due to safety policy")
else:
    messages.append({"role": "user", "content": user_input})

    response = ollama.chat(
        model="llama3",
        messages=messages
    )

    print("Assistant:", response["message"]["content"])


❌ Content blocked due to safety policy
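The substring test in `is_safe` also fires inside longer words: "therapeutic" contains "rape" and "skill" contains "kill", so harmless biomedical text would be blocked. The sketch below shows one possible refinement that matches whole words only. `BANNED_PATTERN` and `is_safe_word_boundary` are hypothetical names introduced here, and this remains a toy filter, not a real moderation system.

```python
import re

BANNED_KEYWORDS = ["kill", "bomb", "terrorist", "suicide", "rape", "murder"]

# \b anchors each keyword at word boundaries, so "rape" no longer
# matches inside "therapeutic" and "kill" no longer matches "skill".
BANNED_PATTERN = re.compile(
    r"\b(?:" + "|".join(re.escape(word) for word in BANNED_KEYWORDS) + r")\b",
    re.IGNORECASE,
)


def is_safe_word_boundary(text: str) -> bool:
    """Return True when no banned keyword appears as a whole word."""
    return BANNED_PATTERN.search(text) is None


print(is_safe_word_boundary("I want to build a bomb"))        # False
print(is_safe_word_boundary("A therapeutic target for TP53"))  # True
```

The trade-off is that inflected forms such as "bombs" or "killing" are no longer caught; whether that is acceptable depends on how the filter is used.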


In [4]:
import ollama

MODEL = "llama3"

def ask_llama(system_prompt, user_prompt):
    response = ollama.chat(
        model=MODEL,
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": user_prompt}
        ]
    )
    return response["message"]["content"]


# 🔹 Product details
product_details = """
Products available:
1. Smart TV – 55 inch, 4K, Android TV, Dolby Audio
2. Washing Machine – 7 kg, Front Load, Inverter Motor
3. Refrigerator – 300 L, Double Door, Energy Efficient
"""

# 🔹 Customer information
customer_info = """
Customer type: Family of 4
Budget: Medium
Needs:
- Good quality
- Energy saving
- Long warranty
"""

# 1️⃣ Step 1: Understand customer needs
customer_analysis = ask_llama(
    system_prompt="You are a retail sales assistant. Analyze customer needs.",
    user_prompt=customer_info
)

# 2️⃣ Step 2: Recommend best product
product_recommendation = ask_llama(
    system_prompt="You are an electronics expert. Recommend the best product.",
    user_prompt=f"""
Customer needs:
{customer_analysis}

Available products:
{product_details}
"""
)

# 3️⃣ Step 3: Create selling points
selling_points = ask_llama(
    system_prompt="You are a sales executive. Create simple selling points.",
    user_prompt=f"""
Recommended product:
{product_recommendation}
"""
)

# 4️⃣ Step 4: Handle objections
objection_handling = ask_llama(
    system_prompt="You are a sales expert. Handle customer objections.",
    user_prompt=f"""
Selling points:
{selling_points}

Possible objection:
'Price is high'
"""
)

# 5️⃣ Step 5: Final closing pitch
final_pitch = ask_llama(
    system_prompt="You are a friendly shop salesperson. Close the sale politely.",
    user_prompt=f"""
Selling points:
{selling_points}

Objection handling:
{objection_handling}
"""
)

print("=== FINAL SALES TALK ===")
print(final_pitch)


=== FINAL SALES TALK ===
I think you've made an excellent decision! You're going to love having all three products in your home. Let me just ring up the sale for you... (rings up the purchase)

So, that's a total of $X, but with our special promotion, you'll get 10% off the entire package, plus free installation and setup for each product. That brings the total down to $Y.

Would you like to pay today or would you prefer to set up a payment plan? We offer flexible financing options to fit your budget.

And as a special thank you for choosing our products, I'd like to throw in a complimentary Smart TV wall mount and a 5-year warranty on all three products. That's a $50 value absolutely free!

So, what do you say? Are you ready to take home your new Smart TV, Washing Machine, and Refrigerator today?
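The final pitch above sells all three products and mentions a discount and a warranty that appear nowhere in `product_details`. The last step receives only the generated selling points and objection text, so nothing ties it back to the original facts. One way to make such a chain easier to inspect is to describe the steps as data and keep every intermediate result. The sketch below assumes an `ask(system_prompt, user_prompt)` callable with the same shape as `ask_llama`; `run_chain` and `fake_ask` are hypothetical names, and `fake_ask` exists only so the example runs without a model.

```python
from typing import Callable, Dict, List, Tuple

# (result name, system prompt, user prompt template)
Step = Tuple[str, str, str]


def run_chain(
    steps: List[Step],
    context: Dict[str, str],
    ask: Callable[[str, str], str],
) -> Dict[str, str]:
    """Run the steps in order; each template may use any earlier result."""
    results = dict(context)
    for name, system_prompt, template in steps:
        user_prompt = template.format(**results)
        results[name] = ask(system_prompt, user_prompt)
    return results


def fake_ask(system_prompt: str, user_prompt: str) -> str:
    """Stand-in for ask_llama so the chain can be tried offline."""
    return f"<{user_prompt.strip()}>"


steps = [
    ("analysis", "Analyze customer needs.", "{customer_info}"),
    (
        "recommendation",
        "Recommend the best product.",
        "Needs: {analysis}\nProducts: {product_details}",
    ),
]

out = run_chain(
    steps,
    {"customer_info": "Family of 4", "product_details": "Smart TV"},
    fake_ask,
)
for name, value in out.items():
    print(f"{name}: {value!r}")
```

Because `results` keeps the original context alongside every step's output, a later template can include `{product_details}` again, which gives the closing step the actual product list to work from.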


In [7]:
from transformers import AutoTokenizer, AutoModelForTokenClassification
from transformers import pipeline

# Load a biomedical NER model (d4data/biomedical-ner-all is a DistilBERT-based checkpoint, not BioBERT)
tokenizer = AutoTokenizer.from_pretrained("d4data/biomedical-ner-all")
model = AutoModelForTokenClassification.from_pretrained("d4data/biomedical-ner-all")

ner_pipeline = pipeline(
    "ner",
    model=model,
    tokenizer=tokenizer,
    aggregation_strategy="simple"
)

def extract_entities(text):
    entities = ner_pipeline(text)
    return [(e["word"], e["entity_group"]) for e in entities]


Device set to use cpu


In [8]:
import ollama

# ---- Moderation ----
BANNED_KEYWORDS = ["kill", "bomb", "terrorist", "suicide", "rape", "murder"]

def is_safe(text):
    return not any(word in text.lower() for word in BANNED_KEYWORDS)


# ---- User Input ----
user_input = "TP53 mutation is associated with breast cancer"

if not is_safe(user_input):
    print("❌ Content blocked due to safety policy")
else:
    # ---- Biomedical NER extraction ----
    entities = extract_entities(user_input)

    # ---- Send structured info to LLaMA-3 ----
    prompt = f"""
You are a biomedical research assistant.

Extracted biomedical entities:
{entities}

Explain the gene–disease relationship in simple terms.
"""

    response = ollama.chat(
        model="llama3",
        messages=[
            {"role": "system", "content": "You are a biomedical expert."},
            {"role": "user", "content": prompt}
        ]
    )

    print("Assistant:", response["message"]["content"])


Assistant: So, based on the extracted entities, we have a few key points to discuss. 

Firstly, there's the concept of 'mutation'. Think of it like a typo in our DNA code. A mutation is when one or more nucleotides (the building blocks of DNA) are changed. This change can be tiny or significant, depending on where and how many nucleotides are altered.

In the case of breast cancer, a specific type of '##53 mutation' is mentioned. This means there's a faulty gene that plays a role in the development of this disease. The more we learn about these genetic changes, the better equipped we'll be to understand how breast cancer arises and how we can treat it effectively.

Now, let's talk about the relationship between genes and diseases. In simple terms, our genes contain instructions for various biological processes. When there's a mutation or error in these instructions (like that '##53 mutation'), it can lead to an imbalance or malfunction within cells, tissues, or even organs. This can ul
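The `'##53 mutation'` in the answer above suggests that the prompt contained a WordPiece fragment: `##` marks a sub-word continuation, so the gene name "TP53" apparently reached LLaMA-3 only in part. A possible workaround is to ignore `e["word"]` and slice the original text with the character offsets instead. The sketch below assumes each entity dict carries `start` and `end` keys, which the token-classification pipeline provides when a fast tokenizer is used. `surface_forms` is a hypothetical helper, and the entity dict in the example is hand-written for illustration, not real model output.

```python
from typing import Dict, List, Tuple


def surface_forms(text: str, entities: List[Dict]) -> List[Tuple[str, str]]:
    """Return (span_text, label) pairs, widening each span to whole words."""
    pairs = []
    for e in entities:
        start, end = e.get("start"), e.get("end")
        if start is None or end is None:
            # No offsets available: fall back to the tokenizer's string.
            pairs.append((e["word"], e["entity_group"]))
            continue
        # Grow the span left and right until it touches whitespace,
        # so a fragment like "53" becomes the full token "TP53".
        while start > 0 and not text[start - 1].isspace():
            start -= 1
        while end < len(text) and not text[end].isspace():
            end += 1
        pairs.append((text[start:end], e["entity_group"]))
    return pairs


text = "TP53 mutation is associated with breast cancer"

# Hand-written example entity; the label name is a placeholder.
example_entities = [
    {"word": "##53 mutation", "entity_group": "EXAMPLE_LABEL", "start": 2, "end": 13}
]

print(surface_forms(text, example_entities))
# [('TP53 mutation', 'EXAMPLE_LABEL')]
```

Widening to whitespace is a blunt rule: it will also pull in trailing punctuation such as a final period, so a real pipeline may want to strip that afterwards.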

In [9]:
from transformers import AutoTokenizer, AutoModelForTokenClassification
from transformers import pipeline

# Biomedical NER model (genes, diseases, chemicals, etc.)
MODEL_NAME = "d4data/biomedical-ner-all"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForTokenClassification.from_pretrained(MODEL_NAME)

ner = pipeline(
    "ner",
    model=model,
    tokenizer=tokenizer,
    aggregation_strategy="simple"
)

text = "TP53 mutation is strongly associated with breast cancer."

entities = ner(text)

gene_entities = []
disease_entities = []

for e in entities:
    if e["entity_group"] == "GENE":
        gene_entities.append(e["word"])
    if e["entity_group"] == "DISEASE":
        disease_entities.append(e["word"])

print("Genes:", gene_entities)
print("Diseases:", disease_entities)


Device set to use cpu


Genes: []
Diseases: []
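Both lists are empty, which suggests that this checkpoint does not emit labels named exactly `GENE` or `DISEASE`. The labels a model can produce are listed in `model.config.id2label`, so it is worth printing that mapping before hard-coding names. The sketch below groups entities by whatever label they carry instead of filtering on fixed strings. `group_by_label` is a hypothetical helper, and the entity dicts in the example are made up for illustration.

```python
from collections import defaultdict
from typing import Dict, List


def group_by_label(entities: List[Dict]) -> Dict[str, List[str]]:
    """Collect entity strings under the label the model actually returned."""
    grouped = defaultdict(list)
    for e in entities:
        grouped[e["entity_group"]].append(e["word"])
    return dict(grouped)


# In the notebook, inspect the real label set first:
#     print(model.config.id2label)
# and then call group_by_label(ner(text)).

# Made-up entities with placeholder labels, so this runs without the model.
example_entities = [
    {"entity_group": "LABEL_A", "word": "breast cancer"},
    {"entity_group": "LABEL_B", "word": "mutation"},
    {"entity_group": "LABEL_A", "word": "diabetes"},
]

print(group_by_label(example_entities))
# {'LABEL_A': ['breast cancer', 'diabetes'], 'LABEL_B': ['mutation']}
```

Once the real label names are known, the `GENE` and `DISEASE` filters in the cell above can be replaced with the matching keys, or with a different checkpoint if this one has no gene label at all.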


In [12]:
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

MODEL_NAME = "microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract"

tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
model = AutoModelForSequenceClassification.from_pretrained(MODEL_NAME)

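def classify_relation(sentence):
    # The classification head of this checkpoint is randomly initialized
    # (see the "newly initialized" warning in the output), so the returned
    # probabilities are not meaningful until the model is fine-tuned.
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():  # inference only, no gradients needed
        outputs = model(**inputs)
    probs = torch.softmax(outputs.logits, dim=1)
    return probs.numpy()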

sentence = "TP53 mutation is associated with breast cancer."
scores = classify_relation(sentence)

print("Relation confidence scores:", scores)


Some weights of BertForSequenceClassification were not initialized from the model checkpoint at microsoft/BiomedNLP-PubMedBERT-base-uncased-abstract and are newly initialized: ['classifier.bias', 'classifier.weight']
You should probably TRAIN this model on a down-stream task to be able to use it for predictions and inference.
Asking to truncate to max_length but no maximum length is provided and the model has no predefined maximum length. Default to no truncation.


Relation confidence scores: [[0.4881448  0.51185524]]
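The two scores are close to 0.5 each, which is consistent with the warning that `classifier.weight` and `classifier.bias` were newly initialized: an untrained two-class head carries no information about gene-disease relations. For a two-class softmax the gap between the logits can be recovered as `ln(p1 / p0)`. The sketch below, using the hypothetical helper `logit_gap`, shows how small that gap is for the printed scores.

```python
import math
from typing import Sequence


def logit_gap(probs: Sequence[float]) -> float:
    """Logit difference (class 1 minus class 0) implied by a 2-class softmax."""
    p0, p1 = probs
    return math.log(p1 / p0)


# Scores copied from the output above.
gap = logit_gap([0.4881448, 0.51185524])
print(f"implied logit gap: {gap:.3f}")  # roughly 0.047
```

A gap this small means the head barely separates the two classes for this sentence. Treating the numbers as "relation confidence" would only make sense after fine-tuning the model on a labeled relation dataset.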
