## Parallel Running ##
- Create three LLM clients, each backed by a different model.
- Ask the same question to two of the models in parallel, then have the third model compare their answers.
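
Before wiring up real models, the overall flow can be sketched in plain Python. This is only a rough illustration under stated assumptions: `model_a`, `model_b`, and `judge` are hypothetical stand-ins for the three LLMs, the "longer answer wins" rule is a placeholder for an actual LLM comparison, and the thread pool is conceptually similar to what a parallel runnable does, not the library's implementation.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical stand-ins for the two answering models (no LLM is called here).
def model_a(question: str) -> str:
    return f"A's answer to: {question}"

def model_b(question: str) -> str:
    return f"B's longer answer to: {question}"

def judge(answer1: str, answer2: str) -> str:
    # Placeholder rule: prefer the longer answer. A real judge would be an LLM call.
    return answer1 if len(answer1) >= len(answer2) else answer2

def fan_out(question: str) -> dict:
    """Send the same question to every branch concurrently; collect results by name."""
    branches = {"model_a": model_a, "model_b": model_b}
    with ThreadPoolExecutor(max_workers=len(branches)) as pool:
        futures = {name: pool.submit(fn, question) for name, fn in branches.items()}
        return {name: future.result() for name, future in futures.items()}

answers = fan_out("What is retrieval-augmented generation?")
best = judge(answers["model_a"], answers["model_b"])
print(answers)
print(best)
```

The result is a dictionary keyed by branch name, which is the same shape the notebook later reads from `RunnableParallel`.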

In [1]:
import os
from dotenv import load_dotenv
from langchain_ollama import ChatOllama

# Load environment variables (e.g. LangSmith settings) from the parent directory's .env file
load_dotenv('./../.env')

# Debug: print to verify the values were loaded
print(f"Endpoint: {os.getenv('LANGSMITH_ENDPOINT')}")

# Note: ChatOllama limits output length via num_predict;
# max_tokens is not one of its documented parameters.
llm1 = ChatOllama(
    base_url="http://localhost:11434",
    model="qwen2.5:latest",
    temperature=0.7,
    num_predict=1024
)
llm2 = ChatOllama(
    base_url="http://localhost:11434",
    model="mistral:latest",
    temperature=0.7,
    num_predict=1024
)
llm3 = ChatOllama(
    base_url="http://localhost:11434",
    model="DeepSeek-R1",
    temperature=0.7,
    num_predict=1024
)




Endpoint: https://api.smith.langchain.com


In [3]:
from langchain_core.prompts import ChatPromptTemplate
from langchain_core.output_parsers import StrOutputParser
from langchain_core.runnables import RunnableParallel

prompt_template1 = ChatPromptTemplate([
    ("system","You are an LLM expert"),
    ("user","I want to ask a question to two different LLMs and then compare the results. Can you generate that question? Please return only the question so that I can pass it as a parameter.")])

chain1 = prompt_template1 | llm3 | StrOutputParser()
question = chain1.invoke({}) 
print("Q:"+question)

prompt_template2 = ChatPromptTemplate([
    ("system","You are an LLM expert"),
    ("user","{question}")])

chain2 = prompt_template2 | llm1 | StrOutputParser()
chain3 = prompt_template2 | llm2 | StrOutputParser()

# Create parallel runnable with meaningful names
parallelRunnable = RunnableParallel(
    qwen_answer=chain2, 
    mistral_answer=chain3
)

# Run both chains in parallel
response = parallelRunnable.invoke({"question": question})

# Extract individual responses
answer1 = response["qwen_answer"]
answer2 = response["mistral_answer"]

print(f"\nAnswer from qwen2.5: {answer1}")
print(f"\nAnswer from mistral: {answer2}")

Q:Okay, here is a comparison question formatted for your parameters:

`Please ask this question: "What are the key principles of sustainable urban planning and how do they address climate change?"`

Answer from qwen2.5: Certainly! Here's how you can frame the question based on your provided format:

**Question:** What are the key principles of sustainable urban planning and how do they address climate change?

Answer from mistral:  What are the essential principles of sustainable urban planning and in what ways do they contribute to addressing climate change?

Sustainable urban planning seeks to create cities that are livable, resilient, and environmentally friendly. Here are some key principles:

1. Compact city form: This principle advocates for densely populated cities with a mix of land uses (residential, commercial, and recreational) within walking distance, reducing the need for cars and thus lowering greenhouse gas emissions.

2. Mixed-use development: This involves combining 
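
In the recorded output above, the generated question came back wrapped in extra text (`Please ask this question: "..."`), and qwen2.5 responded by rephrasing the question instead of answering it. One possible mitigation, sketched here as an assumption rather than part of the original notebook, is to extract the bare question before passing it on. `extract_question` is a hypothetical helper, and the "last quoted span ending in a question mark" heuristic will not suit every model's output.

```python
import re

def extract_question(raw: str) -> str:
    """Return the last double-quoted span ending in '?', else the stripped text."""
    candidates = re.findall(r'"([^"]+\?)"', raw)
    return candidates[-1].strip() if candidates else raw.strip()

# Sample input copied from the recorded output of the question-generation step.
raw_output = (
    "Okay, here is a comparison question formatted for your parameters:\n\n"
    '`Please ask this question: "What are the key principles of sustainable '
    'urban planning and how do they address climate change?"`'
)

clean_question = extract_question(raw_output)
print(clean_question)
```

In the notebook, this could be applied as `question = extract_question(chain1.invoke({}))` before the parallel step, so both models receive only the question text.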

In [4]:
# Now compare the two answers using the third LLM
prompt_template3 = ChatPromptTemplate([
    ("system","You are an LLM expert"),
    ("user","Compare the two answers for {question} and give me the better one. \n Answer1: {Answer1} \n Answer2: {Answer2}")])    

chain4 = prompt_template3 | llm3 | StrOutputParser()
final_answer = chain4.invoke({
    "question": question, 
    "Answer1": answer1, 
    "Answer2": answer2
})
print(f"\nFinal comparison result:\n{final_answer}")


Final comparison result:
Okay, here is the comparison:

`Please ask this question: "What are the key principles of sustainable urban planning and how do they address climate change?"`

The **better answer** is **Answer 1**, primarily based on its direct adherence to the user's requested format.

Here's a breakdown justifying why Answer 1 is better (according to my role as an LLM expert analyzing the provided response):

| Feature          | Answer 1                                    | Answer 2                                                      |
| :--------------- | :------------------------------------------ | :------------------------------------------------------------ |
| **Directness**    | Directly answers the user's question format. | Does not directly answer the "how" part of the question in the requested format; instead, provides a pre-answered example that doesn't incorporate the specific principles it mentions into climate change explanations. |
| **Accuracy to Request**

## Alternative Ways to Use RunnableParallel

Here are different patterns for collecting responses:

In [None]:
# Method 1: Direct dictionary access by key
parallel_chains = RunnableParallel(
    model_a=chain2,
    model_b=chain3
)
results = parallel_chains.invoke({"question": question})
model_a_response, model_b_response = results["model_a"], results["model_b"]

print("Method 1 - Dictionary access:")
print(f"Model A: {model_a_response}")
print(f"Model B: {model_b_response}")

# Method 2: Using get() method for safer access
model_a_safe = results.get("model_a", "No response")
model_b_safe = results.get("model_b", "No response")

print("\nMethod 2 - Safe access:")
print(f"Model A: {model_a_safe}")
print(f"Model B: {model_b_safe}")

# Method 3: Iterate through all results
print("\nMethod 3 - Iterate through all:")
# Use a distinct loop variable so the earlier `response` dict is not overwritten
for model_name, model_response in results.items():
    print(f"{model_name}: {model_response}")

## Advanced RunnableParallel Patterns

In [None]:
# Pattern 1: More complex parallel processing
from langchain_core.runnables import RunnableLambda

def add_metadata(response):
    return {
        "content": response,
        "length": len(response),
        "word_count": len(response.split())
    }

enhanced_parallel = RunnableParallel(
    qwen_enhanced=chain2 | RunnableLambda(add_metadata),
    mistral_enhanced=chain3 | RunnableLambda(add_metadata)
)

enhanced_results = enhanced_parallel.invoke({"question": question})

print("Enhanced results with metadata:")
for model, data in enhanced_results.items():
    print(f"\n{model}:")
    print(f"  Content: {data['content'][:100]}...")
    print(f"  Length: {data['length']} chars")
    print(f"  Words: {data['word_count']} words")

# Pattern 2: Nesting one RunnableParallel inside another to group results under a single key
combined_parallel = RunnableParallel(
    responses=RunnableParallel(
        qwen=chain2,
        mistral=chain3
    )
)

nested_results = combined_parallel.invoke({"question": question})
print(f"\nNested structure: {list(nested_results.keys())}")
print(f"Nested responses: {list(nested_results['responses'].keys())}")
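
The `add_metadata` helper from Pattern 1 is plain Python, so its output shape can be checked on a fixed string without calling any model. The sample text below is made up for illustration.

```python
def add_metadata(response: str) -> dict:
    """Wrap a model response with simple size statistics."""
    return {
        "content": response,
        "length": len(response),             # character count
        "word_count": len(response.split())  # whitespace-separated tokens
    }

# Hypothetical sample response, not produced by any model.
sample = add_metadata("Compact city form reduces car use")
print(sample)
```

Here `length` is 33 and `word_count` is 6. Note that `word_count` splits on whitespace only, so it is a rough measure rather than a tokenizer-accurate count.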