## Welcome to the Second Lab - Week 1, Day 3

Today we will work with lots of models! This is a way to get comfortable with APIs.

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Important point - please read</h2>
            <span style="color:#ff7800;">The way I collaborate with you may be different to other courses you've taken. I prefer not to type code while you watch. Rather, I execute Jupyter Labs, like this, and give you an intuition for what's going on. My suggestion is that you carefully execute this yourself, <b>after</b> watching the lecture. Add print statements to understand what's going on, and then come up with your own variations.<br/><br/>If you have time, I'd love it if you submit a PR for changes in the community_contributions folder - instructions in the resources. Also, if you have a GitHub account, use this to showcase your variations. Not only is this essential practice, but it demonstrates your skills to others, including perhaps future clients or employers...
            </span>
        </td>
    </tr>
</table>

In [1]:
# Start with imports - ask ChatGPT to explain any package that you don't know

import os
import json
from dotenv import load_dotenv
from openai import OpenAI
from anthropic import Anthropic
from IPython.display import Markdown, display

In [2]:
# Always remember to do this! override=True makes values from the .env file replace any environment variables that are already set
load_dotenv(override=True)

True

In [3]:
# Print the key prefixes to help with any debugging

openai_api_key = os.getenv('OPENAI_API_KEY')
anthropic_api_key = os.getenv('ANTHROPIC_API_KEY')
google_api_key = os.getenv('GOOGLE_API_KEY')
deepseek_api_key = os.getenv('DEEPSEEK_API_KEY')
groq_api_key = os.getenv('GROQ_API_KEY')

if openai_api_key:
    print(f"OpenAI API Key exists and begins {openai_api_key[:8]}")
else:
    print("OpenAI API Key not set")
    
if anthropic_api_key:
    print(f"Anthropic API Key exists and begins {anthropic_api_key[:7]}")
else:
    print("Anthropic API Key not set (and this is optional)")

if google_api_key:
    print(f"Google API Key exists and begins {google_api_key[:2]}")
else:
    print("Google API Key not set (and this is optional)")

if deepseek_api_key:
    print(f"DeepSeek API Key exists and begins {deepseek_api_key[:3]}")
else:
    print("DeepSeek API Key not set (and this is optional)")

if groq_api_key:
    print(f"Groq API Key exists and begins {groq_api_key[:4]}")
else:
    print("Groq API Key not set (and this is optional)")

OpenAI API Key exists and begins sk-proj-
Anthropic API Key not set (and this is optional)
Google API Key not set (and this is optional)
DeepSeek API Key not set (and this is optional)
Groq API Key not set (and this is optional)
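
The five near-identical if/else blocks above could be collapsed into a loop. A minimal sketch, assuming a hypothetical helper `describe_key` (not part of the lab) and the same prefix lengths used above:

```python
import os

# (display name, environment variable, prefix length to show, optional?)
# Prefix lengths mirror the cell above; describe_key is a hypothetical helper.
KEY_SPECS = [
    ("OpenAI", "OPENAI_API_KEY", 8, False),
    ("Anthropic", "ANTHROPIC_API_KEY", 7, True),
    ("Google", "GOOGLE_API_KEY", 2, True),
    ("DeepSeek", "DEEPSEEK_API_KEY", 3, True),
    ("Groq", "GROQ_API_KEY", 4, True),
]


def describe_key(label, value, prefix_len, optional):
    """Return a status line that reveals only the first few characters of a key."""
    if value:
        return f"{label} API Key exists and begins {value[:prefix_len]}"
    suffix = " (and this is optional)" if optional else ""
    return f"{label} API Key not set{suffix}"


for label, env_var, prefix_len, optional in KEY_SPECS:
    print(describe_key(label, os.getenv(env_var), prefix_len, optional))
```

Keeping the formatting in one function means a new provider only needs one extra row in `KEY_SPECS`.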


In [4]:
request = "Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. "
request += "Answer only with the question, no explanation."
messages = [{"role": "user", "content": request}]

In [7]:
messages

[{'role': 'user',
  'content': 'Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. Answer only with the question, no explanation.'}]

In [5]:
openai = OpenAI()
response = openai.chat.completions.create(
    model="gpt-4o-mini",
    messages=messages,
)
question = response.choices[0].message.content
print(question)


How would you approach the task of reconciling conflicting ethical principles in a scenario where a utilitarian outcome clashes with deontological obligations, and what decision-making framework would you apply to arrive at a resolution?


In [6]:
competitors = []
answers = []
messages = [{"role": "user", "content": question}]

In [7]:
# The API we know well

model_name = "gpt-4o-mini"

response = openai.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

Reconciling conflicting ethical principles, such as utilitarian outcomes and deontological obligations, is a complex task that involves careful consideration of both ethical frameworks. Here’s how I would approach this task and a decision-making framework that can facilitate resolution:

### Step-by-Step Approach

1. **Identify the Ethical Principles**: Begin by clearly outlining the conflicting principles. For example, utilitarianism focuses on maximizing overall happiness or welfare for the greatest number, while deontology emphasizes adherence to moral duties and rules regardless of consequences.

2. **Clarify the Context**: Understand the specific scenario and the stakeholders involved. Who will be affected? What are the short-term and long-term implications of each ethical principle on these stakeholders?

3. **Analyze the Options**: Consider the choices available and how each aligns with the conflicting principles. Gather evidence and data to understand the potential outcomes of each option, especially considering utilitarian analysis concerning the potential benefits and harms.

4. **Evaluate the Deontological Obligations**: Determine the moral duties involved in the scenario. Assess whether any duties are absolute (e.g., honesty) or context-dependent. This might require engagement with the relevant ethical codes or principles that guide the involved parties.

5. **Consider the Greater Good**: Reflect on how adherence to deontological obligations might lead to greater trust, integrity, or social capital in the long run, which could also be in line with utilitarian principles. This might involve negotiating or modifying the obligations without undermining their core intent.

6. **Seek Middle Ground**: Identify solutions that could satisfy both principles. This might involve creative compromises or alternative actions that uphold the deontological duties while still promoting a beneficial outcome.

7. **Stakeholder Engagement**: Involve stakeholders in the decision-making process where appropriate. Their perspectives can provide insights that help balance the ethical considerations and promote transparency.

8. **Make a Decision**: After thorough analysis and engagement, choose the course of action that best balances both ethical frameworks. Document the reasoning behind the decision to clarify how the conflict was resolved.

9. **Implement and Reflect**: After implementing the decision, monitor the outcomes and gather feedback from stakeholders. Reflect on whether the resolution was effective and how similar conflicts might be approached in the future.

### Decision-Making Framework

A useful framework for this scenario is the **Ethical Decision-Making Model**, which often includes steps like:

- **Recognize an Ethical Issue**: Acknowledge the conflict between utilitarian outcomes and deontological principles.
- **Gather Facts**: Collect all relevant information related to the situation.
- **Evaluate Alternative Actions**: Use both consequentialist and deontological perspectives to assess the implications of each potential course of action.
- **Make a Decision and Test It**: Consider how the decision aligns with both ethical frameworks. Ask questions: "Would I be comfortable if this decision were made public?" and "Does it respect the rights and dignity of all stakeholders?"
- **Implement the Decision**: Carry out the course of action while ensuring that ethical considerations remain at the forefront.
- **Reflect on the Outcomes**: After the decision is made and actions are implemented, reflect on the results and consider how this experience can inform future ethical decision-making.

### Conclusion

The reconciliation of conflicting ethical principles requires a thoughtful and structured approach. By understanding the nuances of both utilitarian and deontological ethics, engaging stakeholders, and utilizing a decision-making framework, you can arrive at a resolution that strives to respect both the duties involved and the broader consequences of the action.

In [8]:
# Since I only have an API key for OpenAI, I'm going to use another model from their lineup


model_name = "gpt-5-nano"

response = openai.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

Short answer: use a structured, multi-layered decision process that starts from deontological commitments, weighs consequences only after those commitments are respected as far as possible, and then uses additional frameworks (like double effect, rule utilitarianism, and fairness/virtue considerations) to break ties or find workable compromises.

A practical framework you can apply

1) Frame and identify the conflicts
- State the two (or more) ethical claims in plain terms: e.g., “X would maximize happiness/lessen harm,” vs. “Y is a duty we must uphold (truth-telling, keeping promises, not harming innocents, rights, etc.).”
- Identify all stakeholders and the rights, duties, and interests at stake.

2) classify duties as absolute or prima facie
- If you’re using Ross’s prima facie duties (fidelity, non-maleficence, justice, beneficence, etc.), note which ones are likely to be binding in this case and which can be overridden, if at all. If a duty is truly absolute (e.g., a categorical moral prohibition in your framework), treat it as non-negotiable unless you have a compelling, universally justifiable reason to depart.

3) check for universalizable or contractual constraints
- Consider whether the action could be universalized (would it be acceptable if everyone acted this way?) or if it would undermine trust or fairness in a social contract. This helps to detect hidden conflicts between utilitarian gains and long-term rights/justice.

4) apply the doctrine of double effect (when relevant)
- Use this tool when an action with a good outcome also causes a bad effect:
  - The intention must be the good effect, not the bad.
  - The bad effect must be foreseen but not the means to the good.
  - There must be a legitimate, proportional reason to perform the action.
  - No less-harmful means should be available.
- If the action would violate a deontological duty in a means-ends way, double effect often blocks the action.

5) weigh consequences with a constrained utilitarian lens
- Only after you’ve handled absolute/prima facie duties and considered double effect should you assess consequences.
- Do a careful, transparent utilitarian calculation for viable options, but constrain it by the duties you determined as binding.
- Prefer options that maximize good while avoiding the violation of duties that are non-negotiable or prima facie weighted more heavily in the context.

6) look for non-violating alternatives and compromise
- Before concluding that you must violate a duty, exhaust options that preserve duties while achieving similar utilitarian gains (e.g., less intrusive means, safeguards, temporary measures, consent where possible, delegees, or shifting the timing).

7) use a tie-breaker framework (optional but helpful)
If there’s still no clear winner, you can apply one or more of:
- Rule utilitarianism: prefer rules that, if followed generally, maximize good while respecting duties.
- Virtue/agency ethics: what would a prudent, conscientious agent do in this situation? This can help with moral character and long-term trust considerations.
- Justice/fairness lens: prioritize options that distribute burdens and benefits fairly and minimize harm to vulnerable groups.
- Public-consultation or contractarian check: would stakeholders consent to the chosen approach if they could negotiate the rules?

8) decide, justify, and implement with accountability
- Choose the option that preserves non-negotiable duties, minimizes rights violations, and, after considering consequences, yields the greatest net good.
- Document the reasoning clearly, including how duties were weighed, how exceptions were handled, and what safeguards are in place.
- Plan for review and appeals if new information arises.

9) reflect and learn
- After implementation, examine outcomes, check for unanticipated harms, and adjust policies or rules to reduce future conflicts.

A concrete mini-example (illustrative)
- Scenario: A city must decide whether to compel vaccination during a severe outbreak.
  - Utilitarian pull: higher vaccination rate saves lives.
  - Deontological pull: there is a strong duty to consent and to respect bodily autonomy; mandates could violate individual rights.
  - Approach: 
    - Identify duties: autonomy/consent vs public safety.
    - Decide if autonomy is absolute or prima facie here (likely prima facie, but with strong protections).
    - Apply double effect: the good (saving lives) is intended; bad effect is restricted autonomy; ensure least-restrictive means and time-limited mandate.
    - Look for alternatives: education, outreach, incentives, exemptions for medical contraindications, and process safeguards.
    - If no acceptable non-coercive route achieves the needed protection, assess a narrowly tailored temporary mandate with robust oversight, sunset clause, and independent review.
    - Document reasoning and plan for ongoing evaluation.

Which decision-making framework to pick
- A robust, widely applicable approach combines: deontological analysis (absolute vs prima facie duties), the doctrine of double effect (when action has both good and bad effects), and a utilitarian consequence check restricted by duties. Add tie-breakers from rule utilitarianism, Rawlsian/contractarian fairness, and virtue considerations as needed.
- In bioethics or medicine, you can layer principlism (autonomy, beneficence, non-maleficence, justice) on top of the framework to ensure a comprehensive view.

Key sources you can draw from
- Deontology: Kantian ethics; Kant’s Categorical Imperative.
- Duty-based pluralism: W. D. Ross’s prima facie duties (fidelity, reparation, justice, beneficence, non-maleficence, self-improvement, gratitude).
- Consequential ethics: utilitarianism (act vs rule utilitarianism) and the principle of double effect (often used in medical ethics).
- Fairness/justice: Rawlsian contractarian perspective.
- Practical ethics: Beauchamp & Childress principlism (autonomy, beneficence, non-maleficence, justice) as a usable heuristic in applied contexts.

If you want, I can tailor this to a specific scenario you have in mind and walk through the steps with concrete options and a recommended course of action.

In [14]:
# Anthropic has a slightly different API, and Max Tokens is required

model_name = "claude-3-7-sonnet-latest"

claude = Anthropic()
response = claude.messages.create(model=model_name, messages=messages, max_tokens=1000)
answer = response.content[0].text

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

TypeError: "Could not resolve authentication method. Expected either api_key or auth_token to be set. Or for one of the `X-Api-Key` or `Authorization` headers to be explicitly omitted"
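
The `TypeError` above appears to come from constructing the Anthropic client without credentials, which matches the earlier output showing that `ANTHROPIC_API_KEY` is not set. One way to keep the notebook running end to end is to skip providers whose key is missing. A minimal sketch, assuming a hypothetical helper `has_key` (not part of the lab):

```python
import os


def has_key(env_var, env=None):
    """Return True when env_var is set to a non-empty value.

    `env` defaults to os.environ; passing a plain dict makes the check easy to test.
    """
    env = os.environ if env is None else env
    return bool(env.get(env_var))


# Intended use in the notebook (left commented so this sketch makes no API calls):
# if has_key("ANTHROPIC_API_KEY"):
#     claude = Anthropic()
#     ...  # create the message and append to competitors/answers as above
# else:
#     print("Skipping Claude: ANTHROPIC_API_KEY not set")
```

Because skipped providers never reach `competitors.append(...)`, the two lists stay aligned for the judging step.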

In [None]:
gemini = OpenAI(api_key=google_api_key, base_url="https://generativelanguage.googleapis.com/v1beta/openai/")
model_name = "gemini-2.0-flash"

response = gemini.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

In [None]:
deepseek = OpenAI(api_key=deepseek_api_key, base_url="https://api.deepseek.com/v1")
model_name = "deepseek-chat"

response = deepseek.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

In [None]:
groq = OpenAI(api_key=groq_api_key, base_url="https://api.groq.com/openai/v1")
model_name = "llama-3.3-70b-versatile"

response = groq.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)
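
The three cells above differ only in API key, base URL, and model name, because each provider is reached through an OpenAI-compatible endpoint. The sketch below gathers those settings in one table. The `PROVIDERS` dict and `client_kwargs` helper are hypothetical names; the URLs and model names are copied from the cells above.

```python
# Settings copied from the Gemini, DeepSeek and Groq cells above.
PROVIDERS = {
    "gemini": {
        "env_var": "GOOGLE_API_KEY",
        "base_url": "https://generativelanguage.googleapis.com/v1beta/openai/",
        "model": "gemini-2.0-flash",
    },
    "deepseek": {
        "env_var": "DEEPSEEK_API_KEY",
        "base_url": "https://api.deepseek.com/v1",
        "model": "deepseek-chat",
    },
    "groq": {
        "env_var": "GROQ_API_KEY",
        "base_url": "https://api.groq.com/openai/v1",
        "model": "llama-3.3-70b-versatile",
    },
}


def client_kwargs(provider, api_key):
    """Build the keyword arguments to pass to OpenAI(...) for a known provider."""
    if provider not in PROVIDERS:
        raise KeyError(f"Unknown provider: {provider}")
    return {"api_key": api_key, "base_url": PROVIDERS[provider]["base_url"]}


# Intended use in the notebook (commented out so the sketch makes no API calls):
# groq = OpenAI(**client_kwargs("groq", groq_api_key))
# response = groq.chat.completions.create(model=PROVIDERS["groq"]["model"], messages=messages)
```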


## For the next cell, we will use Ollama

Ollama runs a local web service that gives an OpenAI compatible endpoint,  
and runs models locally using high performance C++ code.

If you don't have Ollama, install it here by visiting https://ollama.com then pressing Download and following the instructions.

After it's installed, you should be able to visit here: http://localhost:11434 and see the message "Ollama is running"

You might need to restart Cursor (and maybe reboot). Then open a Terminal (control+\`) and run `ollama serve`

Useful Ollama commands (run these in the terminal, or with an exclamation mark in this notebook):

`ollama pull <model_name>` downloads a model locally  
`ollama ls` lists all the models you've downloaded  
`ollama rm <model_name>` deletes the specified model from your downloads
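
To check from Python whether the local service is up, rather than opening the URL in a browser, something like the following could work. It is a sketch using only the standard library; `ollama_is_running` is a hypothetical helper, and it assumes the default address mentioned above.

```python
import http.client
import urllib.error
import urllib.request


def ollama_is_running(base_url="http://localhost:11434", timeout=2.0):
    """Return True if the Ollama root endpoint answers with its status message."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            body = response.read().decode("utf-8", errors="replace")
    except (urllib.error.URLError, OSError, http.client.HTTPException):
        # Connection refused, timeout, DNS failure, malformed reply, etc.
        return False
    return "Ollama is running" in body


# Intended use in the notebook:
# if not ollama_is_running():
#     print("Start the service first with: ollama serve")
```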

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/stop.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Super important - ignore me at your peril!</h2>
            <span style="color:#ff7800;">The model called <b>llama3.3</b> is FAR too large for home computers - it's not intended for personal computing and will consume all your resources! Stick with the nicely sized <b>llama3.2</b> or <b>llama3.2:1b</b> and if you want larger, try llama3.1 or smaller variants of Qwen, Gemma, Phi or DeepSeek. See the <a href="https://ollama.com/models">Ollama models page</a> for a full list of models and sizes.
            </span>
        </td>
    </tr>
</table>

In [9]:
!ollama pull llama3.2

pulling manifest

In [10]:
ollama = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')
model_name = "llama3.2"

response = ollama.chat.completions.create(model=model_name, messages=messages)
answer = response.choices[0].message.content

display(Markdown(answer))
competitors.append(model_name)
answers.append(answer)

Reconciling conflicting ethical principles can be challenging, but there are several steps that can help. Here's an approach I'd take:

1. **Clarify the ethical principles**: Start by clearly articulating both the utilitarian outcome and deontological obligations. Make sure you understand the underlying values, rules, or norms that each principle represents.
2. **Identify the conflict**: Analyze how these two ethical principles intersect and identify specific areas of conflict. Ask yourself questions like:
	* Which principle takes precedence in this situation?
	* How do these conflicting principles align with the core values of each?
3. **Consider the context**: Take into account the specifics of the scenario, including any relevant facts, consequences, or power dynamics at play.
4. **Weigh the competing values**: Evaluate which value (utilitarian or deontological) is more important in this situation. You may need to consider how each value impacts different stakeholders and overall well-being.

To arrive at a resolution, I'd apply a decision-making framework that combines elements of:

1. **Principle-Based Ethics**: This approach emphasizes the importance of upholding moral principles regardless of utilitarian considerations.
2. **Consequentialism**: Consider the potential consequences of each course of action and their impact on overall well-being.
3. **Rights Framework**: Assess whether there are any inherent rights or duties at stake that need to be considered.

**Decision-making process:**

1. **Identify key questions**: What are the primary concerns or conflicting values being debated?
2. **Apply ethical theories**: Utilize principle-based ethics, consequentialism, and a rights framework to analyze each question.
3. **Consider counter-examples**: Think of scenarios where either utilitarian or deontological principles seem more compelling, and assess how these might align with the main principles at play.
4. **Find common ground**: Seek areas of convergence between utilitarian outcomes and deontological obligations. Are there some value commitments shared by both?
5. **Reflect on intuitions**: Engage your ethical intuition and consider what you would do in a similar situation, taking into account both ethical principles.
6. **Justify the decision**: Use logical reasoning to support any conclusions or decisions reached during this process.

**Example application:**

Suppose a hospital administrator faces a conflict between:

* Utilitarian principle: maximizing overall patient well-being
* Deontological principle: upholding confidentiality and informed consent

Scenario: A patient requests removal of their pacemaker without disclosing the life-threatening risks associated with such an action. The administrator must decide whether to comply with the patient's request and potentially harm them, or maintain confidentiality while possibly denying the patient's autonomy.

Decision-making framework:

1. Clarify principles: Identify the values at stake (well-being vs. confidentiality).
2. Consider context: Account for the potential consequences of each option.
3. Weigh competing values: Assess which value takes precedence.
4. Apply ethical theories:
* Principle-Based Ethics: Uphold confidentiality as a moral obligation.
* Consequentialism: Prevent harm to patients by maintaining confidentiality.
* Rights Framework: Respect patients' autonomy and right to informed consent.

Decision: While upholding confidentiality is a fundamental principle, the right of informed consent takes precedence. The administrator should inform the patient about the risks associated with pacemaker removal but maintain confidentiality regarding their life-threatening risks without disclosing them to unauthorized individuals.

This decision-taking approach acknowledges that both principles have value in this situation and strives for consistency with widely accepted medical ethics.

In [11]:
# So where are we?

print(competitors)
print(answers)


['gpt-4o-mini', 'gpt-5-nano', 'llama3.2']
['Reconciling conflicting ethical principles, such as utilitarian outcomes and deontological obligations, is a complex task that involves careful consideration of both ethical frameworks. Here’s how I would approach this task and a decision-making framework that can facilitate resolution:\n\n### Step-by-Step Approach\n\n1. **Identify the Ethical Principles**: Begin by clearly outlining the conflicting principles. For example, utilitarianism focuses on maximizing overall happiness or welfare for the greatest number, while deontology emphasizes adherence to moral duties and rules regardless of consequences.\n\n2. **Clarify the Context**: Understand the specific scenario and the stakeholders involved. Who will be affected? What are the short-term and long-term implications of each ethical principle on these stakeholders?\n\n3. **Analyze the Options**: Consider the choices available and how each aligns with the conflicting principles. Gather eviden

In [12]:
# It's nice to know how to use "zip"
for competitor, answer in zip(competitors, answers):
    print(f"Competitor: {competitor}\n\n{answer}")


Competitor: gpt-4o-mini

Reconciling conflicting ethical principles, such as utilitarian outcomes and deontological obligations, is a complex task that involves careful consideration of both ethical frameworks. Here’s how I would approach this task and a decision-making framework that can facilitate resolution:

### Step-by-Step Approach

1. **Identify the Ethical Principles**: Begin by clearly outlining the conflicting principles. For example, utilitarianism focuses on maximizing overall happiness or welfare for the greatest number, while deontology emphasizes adherence to moral duties and rules regardless of consequences.

2. **Clarify the Context**: Understand the specific scenario and the stakeholders involved. Who will be affected? What are the short-term and long-term implications of each ethical principle on these stakeholders?

3. **Analyze the Options**: Consider the choices available and how each aligns with the conflicting principles. Gather evidence and data to understand t

In [13]:
# Let's bring this together - note the use of "enumerate"

together = ""
for index, answer in enumerate(answers):
    together += f"# Response from competitor {index+1}\n\n"
    together += answer + "\n\n"

In [14]:
print(together)

# Response from competitor 1

Reconciling conflicting ethical principles, such as utilitarian outcomes and deontological obligations, is a complex task that involves careful consideration of both ethical frameworks. Here’s how I would approach this task and a decision-making framework that can facilitate resolution:

### Step-by-Step Approach

1. **Identify the Ethical Principles**: Begin by clearly outlining the conflicting principles. For example, utilitarianism focuses on maximizing overall happiness or welfare for the greatest number, while deontology emphasizes adherence to moral duties and rules regardless of consequences.

2. **Clarify the Context**: Understand the specific scenario and the stakeholders involved. Who will be affected? What are the short-term and long-term implications of each ethical principle on these stakeholders?

3. **Analyze the Options**: Consider the choices available and how each aligns with the conflicting principles. Gather evidence and data to underst

In [15]:
judge = f"""You are judging a competition between {len(competitors)} competitors.
Each model has been given this question:

{question}

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}}

Here are the responses from each competitor:

{together}

Now respond with the JSON with the ranked order of the competitors, nothing else. Do not include markdown formatting or code blocks."""
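
The doubled braces in the prompt above are deliberate: inside an f-string, `{{` and `}}` produce literal `{` and `}`, while single braces interpolate a value. A small illustration, with a made-up variable `n`:

```python
n = 3  # illustrative value, standing in for len(competitors)

# Single braces interpolate; doubled braces become literal braces in the output.
snippet = f'You are judging {n} competitors. Reply as {{"results": ["1", "2", "3"]}}'
print(snippet)
```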


In [16]:
print(judge)

You are judging a competition between 3 competitors.
Each model has been given this question:

How would you approach the task of reconciling conflicting ethical principles in a scenario where a utilitarian outcome clashes with deontological obligations, and what decision-making framework would you apply to arrive at a resolution?

Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}

Here are the responses from each competitor:

# Response from competitor 1

Reconciling conflicting ethical principles, such as utilitarian outcomes and deontological obligations, is a complex task that involves careful consideration of both ethical frameworks. Here’s how I would approach this task and a decision-making framework that can facilitate resolution:

### Step-by-

In [17]:
judge_messages = [{"role": "user", "content": judge}]

In [18]:
# Judgement time!

openai = OpenAI()
response = openai.chat.completions.create(
    model="o3-mini",
    messages=judge_messages,
)
results = response.choices[0].message.content
print(results)


{"results": ["2", "1", "3"]}


In [19]:
# OK let's turn this into results!

results_dict = json.loads(results)
ranks = results_dict["results"]
for index, result in enumerate(ranks):
    competitor = competitors[int(result)-1]
    print(f"Rank {index+1}: {competitor}")

Rank 1: gpt-5-nano
Rank 2: gpt-4o-mini
Rank 3: llama3.2
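
The cell above assumes the judge returns bare JSON containing a valid ranking. A more defensive parse might look like the sketch below. `parse_ranking` is a hypothetical helper, not part of the lab, and the fence-stripping is an assumption about how a model might ignore the formatting instruction.

```python
import json

FENCE = "`" * 3  # markdown code-fence marker


def parse_ranking(raw, competitors):
    """Turn the judge's JSON reply into competitor names ordered best to worst."""
    text = raw.strip()
    # Assumption: a model may wrap its JSON in a markdown code fence.
    if text.startswith(FENCE):
        text = text.strip("`").strip()
        if text.lower().startswith("json"):
            text = text[4:]
    ranks = [int(r) for r in json.loads(text)["results"]]
    # Every competitor number must appear exactly once.
    if sorted(ranks) != list(range(1, len(competitors) + 1)):
        raise ValueError(f"Ranking {ranks} is not a permutation of 1..{len(competitors)}")
    return [competitors[r - 1] for r in ranks]


names = ["gpt-4o-mini", "gpt-5-nano", "llama3.2"]
for position, name in enumerate(parse_ranking('{"results": ["2", "1", "3"]}', names)):
    print(f"Rank {position + 1}: {name}")
```

Validating the ranking up front turns a silent mis-ranking (for example, a repeated competitor number) into an explicit error.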


<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/exercise.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Exercise</h2>
            <span style="color:#ff7800;">Which pattern(s) did this use? Try updating this to add another Agentic design pattern.
            </span>
        </td>
    </tr>
</table>

We used the orchestrator-worker pattern.

<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 150px; height: 150px; vertical-align: middle;">
            <img src="../assets/business.png" width="150" height="150" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#00bfff;">Commercial implications</h2>
            <span style="color:#00bfff;">These kinds of patterns - to send a task to multiple models, and evaluate results,
            are common where you need to improve the quality of your LLM response. This approach can be universally applied
            to business projects where accuracy is critical.
            </span>
        </td>
    </tr>
</table>

In [49]:
## Let's set up a competition between two groups:
## One group proposes a single end-to-end deep learning architecture for autonomous-vehicle trajectory planning
## The other group tries to find the best modular approach to trajectory planning
## Each group consists of 3 members (the same members in each group) and a manager that combines their solutions into a final solution, which is submitted to a judge (another LLM) to see who wins

Task: Provide a complete trajectory-planning solution for an autonomous vehicle. Inputs: (1) HD map data, (2) time-series histories of other road users, and (3) outputs from the perception stack (e.g., object detections, classifications, and semantic information).
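
The plan above (three members per group, a manager that merges their proposals, and a judge that picks a winner) could be structured roughly as follows. This is a sketch only: `ask`, `run_group`, `fake_ask`, and the prompt wording are hypothetical, and `ask(model, prompt)` stands in for whichever client call you use.

```python
def run_group(task, members, manager, ask):
    """Collect one proposal per member, then have the manager merge them.

    ask(model, prompt) -> str is injected so the flow can be tried without API calls.
    """
    proposals = [ask(member, task) for member in members]
    combined = ""
    for index, proposal in enumerate(proposals):
        combined += f"# Proposal from member {index + 1}\n\n{proposal}\n\n"
    manager_prompt = (
        f"You manage a team working on this task:\n\n{task}\n\n"
        f"Combine the proposals below into one final solution.\n\n{combined}"
    )
    return ask(manager, manager_prompt)


def fake_ask(model, prompt):
    """Stand-in for a real model call; reports which model was asked."""
    return f"[{model}] answer ({len(prompt)} chars of prompt)"


final = run_group("Plan a trajectory.", ["m1", "m2", "m3"], "manager", fake_ask)
print(final)
```

Running `run_group` once per group and passing both final answers to a judge prompt, in the same style as the earlier `judge` cell, would complete the competition.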

In [50]:

task = "Task: Provide a complete trajectory-planning solution for an autonomous vehicle. Inputs: (1) HD map data, (2) time-series histories of other road users, and (3) outputs from the perception stack (e.g., object detections, classifications, and semantic information)."

group_number_1_task = task + " Your solution should consist of a single end-to-end network that takes perception inputs and map data as input and outputs a planned trajectory. No code, only an explanation of what you suggest doing. The proposed approach should be competitive enough to win a benchmark competition."

group_number_2_task = task + " Your solution should follow a modular approach, providing separate networks for prediction, path planning, and velocity planning. These components should integrate into a multi-path planning and scoring system that selects the optimal trajectory. No code, only an explanation of what you suggest doing and why it is good. The proposed approach should be competitive enough to win a benchmark competition."

In [51]:
messages_group_1 = [{"role": "user", "content": group_number_1_task}]
messages_group_2 = [{"role": "user", "content": group_number_2_task}]

In [52]:
group_1_answers = []
group_2_answers = []
model_names = []

In [53]:
ollama = OpenAI(base_url='http://localhost:11434/v1', api_key='ollama')
model_name = "llama3.2"

## Group 1 
response = ollama.chat.completions.create(model=model_name, messages=messages_group_1)
answer_g1 = response.choices[0].message.content


display(Markdown(answer_g1))



Based on the given inputs, I propose a complete trajectory-planning solution for an autonomous vehicle that consists of a single end-to-end network architecture. Here's the step-by-step explanation:

**Network Architecture:**

1. **Perception Input Fusion**: The perception input from the perception stack (e.g., object detections, classifications, and semantic information) is fused with the HD map data to provide contextual information about the environment.
2. **Feature Extraction**: A combination of feature extraction techniques are used:
	* 2D LiDAR point cloud features (e.g., SURF or SIFT) for static obstacles
	* 3D object detection features (e.g., IOU-based classification)
	* Semantic segmentation features (e.g., class labels and confidence scores)
3. **Graph-Based Planning**: The fused feature set is used to construct a graph representation of the environment, including:
	* Road segments as nodes
	* Obstacles as edges with corresponding weights (distance, velocity, or speed)
	* Lane markings and other contextual information for better planning
4. **Trajectory Optimization**: A trajectory optimization algorithm (e.g., Model Predictive Control or Motion Quadrature) takes the graph representation and generates a sequence of optimal control actions (steering angles, acceleration/deceleration).
5. **Dynamic Programming**: The trajectory optimization output is then used to apply dynamic programming (DP) for optimizing the entire trajectory, considering constraints such as:
	* Minimum time-to-arrival
	* Average velocity and smoothness
	* Lane changing and merging maneuvers

**Additional Components:**

1. **High-Level Decision Making**: A high-level decision-making model can be integrated with the trajectory planning network to make decisions based on the planned trajectory, such as deciding when to change lanes or navigate towards a destination.
2. **Motion Model Updating**: The trajectory is continuously updated and refined using the output of the trajectory planning network.

**Competitive Enabling Factors:**

1. **Integration of heterogeneous data**: Combines perception inputs with HD map data for comprehensive contextual information.
2. **Graph-based planning**: Represents the environment as a graph, enabling efficient navigation through occlusions or static obstacles.
3. **Trajectory optimization using DP**: Ensures smooth and safe trajectory by considering multiple possibilities.

By combining these components, I believe this solution would be competitive enough to win a benchmark competition for autonomous vehicle trajectory planning.

In [54]:

## Group 2
response = ollama.chat.completions.create(model=model_name, messages=messages_group_2)
answer_g2 = response.choices[0].message.content
display(Markdown(answer_g2))

Here's a proposed solution for a complete trajectory-planning system for an autonomous vehicle, incorporating modular components and multi-path planning:

**Modular Architecture:**

1. **Prediction Network:** This network takes the HD map data, current vehicle state, and object detection information from the perception stack as input. It generates a prediction of future states based on complex dynamics models (e.g., physics, motion primitives).
2. **Path Planning Network:** This network receives the predicted states from the prediction network and outputs a feasible path using graph-based methods or optimization algorithms (e.g., Dijkstra's algorithm, A\* searching).
3. **Velocity Planning Network:** This network takes the predicted path and time constraints as input and outputs a velocity profile that satisfies acceleration limitations.

**Multi-Path Planning:**

1.  Given the current state of the vehicle and the HD map, divide the road into many small segments (N).
2.  For each segment, run the prediction network to generate a set of possible states at different points in time up to a certain horizon.
3.  Use path planning algorithm from N possible states through the generation of "cost paths," that can calculate costs based on time, distance, obstacle constraints.

**Scoring and Decision Making:**

1.  Evaluate each cost-effective path according to user requirements (e.g., minimize travel time, ensure safety).
2.  The best-performing path is chosen as optimal in this scenario.
3.  Incorporate machine learning algorithms such as reinforcement learning or graph neural networks into the solution if the benchmark competition allows.

**Why it's Good:**

1.  Modular Design: Each network function independently to maintain stability and facilitate easier training.
2.  Scalability: If the size of the road grows, only add more prediction components when possible; the rest remains relatively unchanged.
3.  Adaptation : This architecture can incorporate a variety of inputs like speed limits, traffic signs, traffic signals or other factors that may change often.

**Benefits for Benchmark Competition:**

1.  Robustness: By addressing various aspects of autonomous driving in parallel, this system increases its chances of performing well compared with approaches that focus on one component at a time.
2.  Flexibility: This approach can accommodate novel challenges arising during competition by extending the scope of the prediction models to more complex situations without too much rework on other parts of the system.
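
The response above names graph-based path search (Dijkstra's algorithm, A\*) without showing what it involves. Here is a minimal sketch of Dijkstra's algorithm, assuming a toy lane graph whose nodes are road segments and whose edge weights are travel costs. The graph, the node names, and the `shortest_path` helper are hypothetical illustrations, not part of the model's answer:

```python
import heapq

def shortest_path(graph, start, goal):
    """Dijkstra's algorithm over a dict-of-dicts graph with non-negative edge costs."""
    # Priority queue of (cost so far, node, path taken to reach it)
    queue = [(0.0, start, [start])]
    visited = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == goal:
            return cost, path
        if node in visited:
            continue
        visited.add(node)
        for neighbor, edge_cost in graph.get(node, {}).items():
            if neighbor not in visited:
                heapq.heappush(queue, (cost + edge_cost, neighbor, path + [neighbor]))
    # The goal cannot be reached from the start node
    return float("inf"), []

# Hypothetical lane graph: keys are road segments, values map successors to travel cost
lane_graph = {
    "A": {"B": 4.0, "C": 2.0},
    "B": {"D": 5.0},
    "C": {"B": 1.0, "D": 8.0},
    "D": {},
}
cost, path = shortest_path(lane_graph, "A", "D")
print(cost, path)
```

A real planner would derive the graph from HD map lane connectivity and fold predicted obstacle positions into the edge costs; this sketch only shows the search step.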

In [55]:
model_names.append(model_name)
group_1_answers.append(answer_g1)
group_2_answers.append(answer_g2)

In [56]:
model_name = "gpt-4o-mini"
openai = OpenAI()
response = openai.chat.completions.create(
    model=model_name,
    messages=messages_group_1,
)
answer_g1 = response.choices[0].message.content
display(Markdown(answer_g1))


To develop a complete trajectory-planning solution for an autonomous vehicle that can compete in a benchmark competition, we would design an end-to-end neural network system that effectively integrates all available input data, processes it, and generates a safe and efficient trajectory. Here’s a step-by-step outline of the proposed approach:

### 1. Input Data Fusion
The first step is to perform input data fusion where we combine the three main input types:
- **HD Map Data**: This includes static features like lanes, traffic signs, barriers, drivable areas, and other high-definition map details. A feature extraction module can leverage CNN architectures to transform this map data into a more manageable set of features that describe road geometry and restrictions.
- **Time-Series Histories of Other Road Users**: This data informs the behavior prediction of moving agents, such as other vehicles and pedestrians. Recurrent Neural Networks (RNNs) or Long Short-Term Memory (LSTM) networks can be used to process this temporal data, capturing the dynamics and intentions of these road users.
- **Perception Stack Outputs**: The perception data, including object detections and classifications, provides real-time awareness of the vehicle's surroundings. A backbone network (e.g., a CNN or Transformer) can be employed to process the images to extract relevant features, yielding information about the positions, velocities, and categories of detected objects.

### 2. Feature Representation
Following input fusion, the combined inputs should be converted into a coherent feature representation. This could be accomplished through:
- **Feature Encoding**: All extracted features from the HD maps, time-series data, and perception outputs can be concatenated into a single, extensive feature vector. This would be enriched with encoding positional context (e.g., relative distances, angles) and temporal context (e.g., temporal smoothing of behaviors).

### 3. Behavior Prediction Module
Given the immediate environment, this module predicts the future behavior of dynamic road users. A combination of:
- **Multi-agent Interaction Networks**: Use graph neural networks (GNNs) to model interactions and dependencies between the vehicle and other agents, enabling a richer understanding of potential future movements. This predicts the trajectory of other road users based on their historical data and possible interactions.

### 4. Trajectory Planning
Once the context and behaviors are understood, the trajectory planning module generates a trajectory for the autonomous vehicle:
- **Sequence Generation**: Use a recurrent neural network (RNN) or Transformer to generate the trajectory sequence over time. This output would be a series of waypoints or control commands (like steering angles and acceleration) that define the intended trajectory of the vehicle.
- **Safety and Object Constraints**: Integrate a set of rules to ensure the planned trajectory adheres to road rules, avoids collisions, and respects the dynamics of the vehicle. This could involve a collision avoidance layer that uses predicted positions of other road users to dynamically adjust the planned trajectory in real-time.

### 5. Optimization and Refinement
The final trajectories generated from the previous step should go through an optimization phase:
- **Reinforcement Learning**: Employ a reinforcement learning approach to fine-tune the trajectory based on reward signals such as adherence to traffic rules, comfort of the ride, and efficiency (minimized travel time). This enables the network to continuously improve trajectory generation based on cumulative continuous observations and experiences.
- **Trajectory Smoothing**: Implement a post-processing step to refine the trajectory to ensure smoothness and feasibility with respect to vehicle kinematics.

### 6. Output
The final step of the network outputs the planned trajectory, represented as a series of waypoints or control signals (acceleration and steering commands), which can be sent to the vehicle's control system.

### Evaluation and Benchmarking
Validate the network using real-world driving scenarios, including edge cases (e.g., unexpected pedestrians, other vehicles entering the road). The evaluation metrics would consider safety, comfort, efficiency, and adherence to traffic laws. Participation in competitions should involve testing with diverse datasets and under unpredictable environmental conditions to ensure robustness.

### Conclusion
This proposed approach seeks to leverage deep learning techniques to create a unified trajectory-planning network that intelligently recognizes the dynamic aspects of driving, incorporates HD map information, and outputs a verified plan that can adapt to real-world complexities, making it competitive in benchmark competitions.

In [57]:
response = openai.chat.completions.create(
    model=model_name,
    messages=messages_group_2,
)
answer_g2 = response.choices[0].message.content
display(Markdown(answer_g2))


Creating a robust trajectory-planning solution for an autonomous vehicle involves designing a modular system consisting of several key components: prediction, path planning, and velocity planning. Here’s an overview of each component, how they integrate, and why this approach is competitive:

### 1. **Input Data Preprocessing**

#### a. **HD Map Data Integration**
- **Function**: Provide high-definition, detailed information about the road network, including lanes, traffic signals, road signs, and static obstacles.
- **Reason**: Maps are crucial for understanding the environment, defining boundaries, and identifying road geometry, which dramatically influences path planning.

#### b. **Time-Series Histories of Road Users**
- **Function**: Analyze the historical movement patterns of nearby vehicles and pedestrians to anticipate future behavior.
- **Reason**: Predicting the behavior of dynamic objects is essential for safe and efficient trajectory planning.

#### c. **Perception Outputs**
- **Function**: Use outputs from perception systems to detect and classify surrounding objects, detect semantic information (e.g., lane markings), and recognize the operational context.
- **Reason**: This allows the system to incorporate real-time data about the environment, augmenting the map information with live observations for better decision-making.

### 2. **Prediction Module**
- **Design**: Employ a recurrent neural network (RNN) or Long Short-Term Memory (LSTM) network to process the time-series data of other road users combined with perception data.
- **Function**: This module predicts the future positions and trajectories of road users based on their historical movement patterns and current states.
- **Benefit**: A solid prediction model improves the vehicle’s situational awareness and helps to avoid potential collisions by planning informed paths relative to where other agents are likely to move.

### 3. **Path Planning Module**
- **Design**: A combination of graph-based methods (like A* or Dijkstra's) and sampling-based methods (like RRT or PRM) could be employed to generate candidate paths.
- **Function**: Generate possible paths for the vehicle to follow considering the HD map, predicted trajectories of other road users, and environmental constraints (e.g., road boundaries).
- **Benefit**: The graph-based approach provides guarantees on optimality, while sampling methods are flexible for complex environments, allowing for rapid re-planning when unexpected obstacles arise.

### 4. **Velocity Planning Module**
- **Design**: Use a combination of proportional-integral-derivative (PID) control and a reinforcement learning-based approach to determine the optimal speed along the selected path.
- **Function**: Generate velocity profiles for the planned path, taking into account dynamic obstacles, traffic rules, and vehicle capabilities.
- **Benefit**: Ensures the vehicle safely navigates at a suitable speed, enhancing both comfort and safety. The PID control offers deterministic performance while reinforcement learning optimizes the velocity based on diverse scenarios.

### 5. **Multi-Path Planning and Scoring System**
- **Integration of Modules**: Create a framework to evaluate multiple paths and their corresponding velocity profiles.
- **Function**: Score each trajectory based on safety metrics (clearance from obstacles), comfort (smoothness of speed changes), and efficiency (travel time and energy consumption).
- **Selection Strategy**: Implement a weighted scoring system to rank trajectories according to predefined criteria (e.g., safety first, followed by efficiency).
- **Benefit**: This system allows for real-time decision-making and adaptability to changing environments or dynamic agents, thus providing a holistic solution to trajectory planning.

### 6. **Validation and Continuous Learning**
- **Function**: Implement a feedback loop that collects data from real-world scenarios, allowing the system to learn from previous experiences and improve continuously.
- **Benefit**: The ability to adapt and learn enhances long-term performance and safety, making the solution resilient against a variety of unexpected scenarios.

### How This Approach is Competitive
- **Modularity**: Separating the components allows for specialized optimizations, easier debugging, and incremental improvements.
- **Data-Driven Decisions**: Leveraging historical data and real-time inputs ensures the vehicle can adapt to dynamic conditions, providing robust responses to a wide array of scenarios.
- **Balanced Approach**: The multi-path planning and scoring system ensures that the best trajectory is chosen among several candidates, leading to a balance of safety, efficiency, and comfort.
- **Acceleration in Performance**: Given the combination of machine learning predictions with traditional planning techniques, the vehicle can quickly adapt to changes while maintaining high-performance standards.

In conclusion, this modular and integrative approach provides a comprehensive solution to trajectory planning for autonomous vehicles, emphasizing safety, adaptability, and efficiency, which are key to succeeding in competitive benchmarks.
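
The weighted scoring system in section 5 of the response above can be made concrete with a small sketch. It assumes each candidate trajectory has already been reduced to three summary numbers (clearance, peak jerk, travel time). The `Candidate` class, the metric names, and the weights are all hypothetical choices for illustration, not values from the model's answer:

```python
from dataclasses import dataclass

@dataclass
class Candidate:
    name: str
    min_clearance_m: float   # smallest distance to any obstacle (higher is safer)
    max_jerk: float          # peak jerk in m/s^3 (lower is more comfortable)
    travel_time_s: float     # time to traverse the trajectory (lower is more efficient)

# Hypothetical weights: safety first, then efficiency, then comfort
WEIGHTS = {"safety": 0.6, "efficiency": 0.25, "comfort": 0.15}

def normalize(values, higher_is_better):
    """Map raw values to [0, 1] so that 1 is always the best candidate."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [1.0 for _ in values]
    scaled = [(v - lo) / (hi - lo) for v in values]
    return scaled if higher_is_better else [1.0 - s for s in scaled]

def rank(candidates):
    """Return (candidate, score) pairs sorted from best to worst."""
    safety = normalize([c.min_clearance_m for c in candidates], True)
    comfort = normalize([c.max_jerk for c in candidates], False)
    efficiency = normalize([c.travel_time_s for c in candidates], False)
    scores = [
        WEIGHTS["safety"] * s + WEIGHTS["comfort"] * c + WEIGHTS["efficiency"] * e
        for s, c, e in zip(safety, comfort, efficiency)
    ]
    return sorted(zip(candidates, scores), key=lambda pair: pair[1], reverse=True)

candidates = [
    Candidate("keep_lane", min_clearance_m=2.0, max_jerk=0.5, travel_time_s=12.0),
    Candidate("overtake", min_clearance_m=0.8, max_jerk=2.0, travel_time_s=9.0),
    Candidate("slow_follow", min_clearance_m=3.0, max_jerk=0.3, travel_time_s=15.0),
]
ranking = rank(candidates)
for candidate, score in ranking:
    print(f"{candidate.name}: {score:.2f}")
```

With safety weighted most heavily, the cautious `slow_follow` candidate ranks first here; shifting weight toward efficiency would favor `overtake`, which is the trade-off the scoring layer is meant to expose.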

In [58]:
model_names.append(model_name)
group_1_answers.append(answer_g1)
group_2_answers.append(answer_g2)
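
The cells above repeat the same pattern for every model: send each group's prompt, pull out the text, and append it to the lists. One way to cut the repetition is a small helper like the sketch below. `ask_both_groups` and the `FakeCompletions` stand-in are hypothetical names introduced here; the fake client only mimics the `client.chat.completions.create(...)` shape used in this notebook so the example runs without an API key:

```python
from types import SimpleNamespace

def ask_both_groups(client, model_name, messages_group_1, messages_group_2):
    """Send both group prompts to one model and return the two answers as text."""
    answers = []
    for messages in (messages_group_1, messages_group_2):
        response = client.chat.completions.create(model=model_name, messages=messages)
        answers.append(response.choices[0].message.content)
    return answers[0], answers[1]

class FakeCompletions:
    """Stand-in for client.chat.completions so the sketch runs offline."""
    def create(self, model, messages):
        text = f"[{model}] reply to: {messages[0]['content']}"
        message = SimpleNamespace(content=text)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])

fake_client = SimpleNamespace(chat=SimpleNamespace(completions=FakeCompletions()))

demo_g1, demo_g2 = ask_both_groups(
    fake_client,
    "demo-model",
    [{"role": "user", "content": "single network"}],
    [{"role": "user", "content": "modular"}],
)
print(demo_g1)
print(demo_g2)
```

With a real client, a call such as `ask_both_groups(openai, "gpt-4o-mini", messages_group_1, messages_group_2)` would stand in for the two request cells for that model.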

Now that two group members have answered the same question, a manager combines their answers into a final proposal, which will then be judged.

In [59]:
together_group_1 = ""
for index, answer in enumerate(group_1_answers):
    together_group_1 += f"# Response from employee {index+1}\n\n"
    together_group_1 += answer + "\n\n"

together_group_2 = ""
for index, answer in enumerate(group_2_answers):
    together_group_2 += f"# Response from employee {index+1}\n\n"
    together_group_2 += answer + "\n\n"

In [60]:
group_1_message_to_manager = f"""You are the manager of two employees.  
Each employee has been given the following question:  

{messages_group_1}  

Your task is to combine their answers into a single proposal, which will be evaluated by a judge.  
The employees’ responses are as follows:  
{together_group_1}"""

In [62]:
print(group_1_message_to_manager)

You are the manager of two employees.  
Each employee has been given the following question:  

[{'role': 'user', 'content': 'Task: Provide a complete trajectory-planning solution for an autonomous vehicle. Inputs: (1) HD map data, (2) time-series histories of other road users, and (3) outputs from the perception stack (e.g., object detections, classifications, and semantic information). Your solution should consist of a single end-to-end network that takes perception inputs and map data as input and outputs a planned trajectory.No code only explanation what you suggest to do. The proposed approach should be competitive enough to win a benchmark competition.'}]  

Your task is to combine their answers into a single proposal, which will be evaluated by a judge.  
The employees’ responses are as follows:  
# Response from employee 1

Based on the given inputs, I propose a complete trajectory-planning solution for an autonomous vehicle that consists of a single end-to-end network archi

In [63]:
group_2_message_to_manager = f"""You are the manager of two employees.  
Each employee has been given the following question:  

{messages_group_2}  

Your task is to combine their answers into a single proposal, which will be evaluated by a judge.  
The employees’ responses are as follows:  
{together_group_2}"""
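
The two manager prompts differ only in the question and the collected answers, so they could be built by one function. A minimal sketch, assuming the question is passed as plain text rather than as the full `messages` list; `combine_answers` and `build_manager_prompt` are hypothetical helper names, and the prompt wording is paraphrased from the cells above:

```python
def combine_answers(answers):
    """Join employee answers under numbered headings."""
    sections = [
        f"# Response from employee {index}\n\n{answer}"
        for index, answer in enumerate(answers, start=1)
    ]
    return "\n\n".join(sections)

def build_manager_prompt(question, answers):
    """Build the manager prompt for one group from its question and answers."""
    return (
        f"You are the manager of {len(answers)} employees.\n"
        "Each employee has been given the following question:\n\n"
        f"{question}\n\n"
        "Your task is to combine their answers into a single proposal, "
        "which will be evaluated by a judge.\n"
        "The employees' responses are as follows:\n"
        f"{combine_answers(answers)}"
    )

demo_prompt = build_manager_prompt("Plan a trajectory.", ["Answer A", "Answer B"])
print(demo_prompt)
```

Passing the question text instead of the `messages` list keeps the Python list syntax (`[{'role': 'user', ...}]`) out of the prompt the manager model sees.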

In [67]:
model_name = "gpt-5-nano"
openai = OpenAI()

messages_to_manager_1 = [{"role": "user", "content":group_1_message_to_manager}]
response = openai.chat.completions.create(
    model=model_name,
    messages=messages_to_manager_1,
)
answer_manager_group_1 = response.choices[0].message.content
display(Markdown(answer_manager_group_1))



Unified End-to-End Trajectory Planning Network (UE-ETPN)

Overview
- Goal: provide a single end-to-end neural network that ingests HD map data, perception stack outputs, and time-series histories of other road users, and outputs a planned trajectory (waypoints or control commands) for an autonomous vehicle.
- Key idea: fuse static map context with dynamic scene information, reason about interactions via a graph, and generate smooth, safe trajectories through an integrated sequence decoder with a differentiable planning layer. This combines the strengths of both employees: end-to-end learning with rich multimodal fusion (employee 2) and graph-based, constraint-aware planning stages (employee 1), augmented by modern behavior- and optimization-informed components.

1) Input modalities and data flow
- HD Map Data
  - Lane graphs (lanes, connectivity, speed limits), road topology, shoulders, barriers, stop lines, intersections, traffic rules.
  - Drivable area, restricted zones, exit/entry ramps.
  - Processed into a lane-graph representation with geometric attributes (curvature, width) and semantic attributes (speed limits, permissible maneuvers).
- Perception Outputs
  - Detected objects: type/class, 3D position, velocity, heading, extent, confidence.
  - Semantic context: drivable surface classification, road furniture, traffic signs.
- Time-Series Histories
  - Past trajectories and states of surrounding agents (positions, velocities, headings, past maneuvers).
  - Temporal window chosen to capture short- to mid-term intent (e.g., last 1–2 seconds or longer, depending on dataset/latency).
- Outputs to the network
  - A consolidated scene representation that blends static map context, dynamic agent states, and perception-derived object features.

2) Core architecture (end-to-end network with differentiable planning)
- Multimodal Encoders
  - Map Encoder: converts the HD map lane graph into embeddings. Uses a graph neural network (GNN) over the lane graph to capture connectivity, lane geometry, and allowable maneuvers.
  - Perception Encoder: processes object detections and semantics via a backbone (CNN/Transformer) to produce object embeddings (type, pose, velocity, confidence).
  - Temporal Dynamics Encoder: processes time-series histories of other road users with a temporal model (LSTM/Temporal Transformer) to capture motion patterns and intent cues.
- Scene Interaction Module (Graph-Based Reasoning)
  - Ego-vehicle node connected to detected agents and relevant map features.
  - Message passing across agents and map elements to reason about interactions (e.g., potential occlusions, yielding behavior, lane-change intents).
  - Outputs scene-aware agent embeddings and interaction-aware likelihoods of future positions.
- Fusion and Scene Context
  - A fusion layer combines map embeddings, perception embeddings, and temporal embeddings into a unified scene representation. Includes positional encoding (relative distances, angles to ego) and confidence-aware weighting.
- Trajectory Decoder (End-to-End Sequence Generator)
  - A sequence generator (RNN or Transformer) that outputs a sequence of future ego states, represented as either:
    - Waypoints in ego-centered coordinates, or
    - Steering/acceleration commands, aligned to a fixed horizon and timestep.
  - Attends to the scene representation and agent interaction embeddings, ensuring planned motion respects map constraints and predicted behaviors of other agents.
- Differentiable Planning Layer (Constraint-Aware Optimization)
  - A differentiable MPC-like layer or a motion-quadrature style optimization that refines the decoder outputs to satisfy kinematic/dynamic constraints and safety margins.
  - Costs/constraints include: collision avoidance, lane-keeping, speed limits, comfort (minimize jerk), progress toward goal (time-to-arrival), and adherence to traffic rules.
  - This layer provides a global feasibility and smoothness correction to the raw decoder outputs, acting as a bridge between purely learned prediction and model-based feasibility.
- Optional Dynamic Programming (DP) Module
  - For long-horizon planning, a lightweight DP pass operates on the graph-derived plan to further optimize multi-step tradeoffs (e.g., preferred lane sequences, overtaking opportunities, merging strategies) while enforcing hard constraints. This is designed to be differentiable or to run as a fast, deterministic refinement layer during inference.

3) Behavior modeling and safety
- Behavior-aware conditioning
  - The interaction graph yields predicted behaviors for other agents (e.g., likely trajectories, lane-change intentions) and conditions the ego plan to avoid conflicts.
- Safety guarantees and margins
  - Incorporate explicit minimum clearance buffers around obstacles and predicted agents.
  - Enforce hard constraints at the planning layer for staying within drivable areas, respecting lane boundaries, and obeying traffic signals.
- Rule-based override (optional)
  - A lightweight, differentiable “safety filter” or override module can be activated in edge cases to guarantee lane-keeping, stop requirements, or MPC hard constraints, ensuring fallback safety.

4) Training strategy
- Data and supervision
  - Imitation learning: train to reproduce expert-like trajectories from datasets with ground-truth demonstrations.
  - Behavior prediction signals: train the interaction module to predict likely future positions of other agents (slower/longer-horizon predictions can be auxiliary outputs).
  - Map and perception consistency: encourage coherent representations between map features and perception outputs.
- Loss components (multi-task)
  - Trajectory fidelity: mean-squared or Huber loss between predicted ego states and expert trajectories.
  - Collision risk penalty: soft collision cost with dynamic agents to discourage dangerous plans.
  - Comfort and feasibility: penalize high jerk, high acceleration, or infeasible accelerations relative to vehicle dynamics.
  - Constraint adherence: penalties for leaving drivable areas, violating lane boundaries, or ignoring traffic rules.
  - Behavior consistency: align predicted agent behaviors with observed maneuvers.
- Reinforcement learning and fine-tuning
  - Optional RL stage to optimize long-horizon objectives (safety, efficiency, comfort) using simulated scenarios and domain randomization.
- Data augmentation
  - Occlusions, sensor noise, weather variations, and synthetic rare scenarios to improve robustness.

5) Inference, real-time considerations, and robustness
- Real-time requirements
  - Modular design with efficient graph processing and attention-based decoding. The differentiable planning layer is designed for low-latency execution, with configurable horizon and timestep.
- Robustness
  - Uncertainty-aware decoding: propagate perception and behavior uncertainty into the planning layer to reflect risk in the output trajectory.
  - Ensemble or Monte Carlo sampling (optional) for critical scenarios to assess plan robustness.
- Computational considerations
  - Parameter sharing and lightweight graph computations to fit on embedded hardware. Real-time profiling and pruning of attention layers to maintain target latency.

6) Evaluation plan aligned with benchmarks
- Metrics
  - Safety: collision rate, near-misses, time-to-collision proxies.
  - Comfort: jerk, acceleration profiles, longitudinal/sideload smoothness.
  - Efficiency: time-to-goal, progress rate, lane-change efficiency.
  - Compliance: rule adherence, lane-keeping accuracy, adherence to speed limits.
  - Robustness: performance under perception noise, occlusions, and adversarial-like perturbations.
- Validation strategy
  - Benchmarking on standard datasets (e.g., nuScenes, Waymo Open) and additional synthetic benchmarks emphasizing occlusions, dense traffic, and complex interchanges.
  - Ablations to quantify the contribution of:
    - Graph-based planning vs pure sequence decoding
    - End-to-end planning with differentiable MPC vs post-hoc optimization
    - Impact of temporal history window size
    - Effectiveness of the behavior prediction module

7) Risks and mitigations
- Data distribution shift
  - Mitigation: continual learning, domain adaptation, and ongoing fine-tuning on new scenarios.
- Safety guarantees
  - Mitigation: a fallback safety filter or a hard-constrained planner when uncertainty is high or perception outputs are unreliable.
- Interpretability
  - Mitigation: expose attention maps or attention weights to provide insight into which map elements or agents influenced the plan; provide deterministic fail-safe modes.
- Computational load
  - Mitigation: scalable graph sparsification, horizon tuning, and optional regional planning refinements depending on latency budget.

8) Roadmap and milestones
- Phase 1: Pretraining and modular validation
  - Train map encoder, perception encoder, and temporal encoder separately on available datasets.
  - Validate graph-based interaction reasoning with synthetic and real data.
- Phase 2: End-to-end integration
  - Combine encoders with the decoder and differentiable planning layer.
  - Run offline evaluations against expert trajectories and safety metrics.
- Phase 3: Reinforcement learning and refinement
  - Apply RL with simulated scenarios to improve long-horizon performance and robustness.
- Phase 4: Real-time deployment testing
  - Optimize for latency on target hardware, perform field tests with safety overrides in controlled environments.
- Phase 5: Benchmark-ready
  - Demonstrate competitive performance on standard benchmarks with comprehensive ablations and robustness analyses.

Why this blended approach is competitive
- Combines end-to-end learning with explicit graph-based reasoning about map context and multi-agent interactions, addressing occlusions and complex scene structures.
- Uses a differentiable planning layer to enforce feasibility and safety while maintaining the benefits of neural optimization for adaptability and performance.
- Leverages temporal histories to predict and respond to agent behavior, improving anticipatory planning in dynamic environments.
- Provides a clear path to safety guarantees via hard constraints, safety filters, and robust validation, which is crucial for benchmark credibility and real-world deployment.

If you’d like, I can tailor this proposal to a specific benchmark (e.g., nuScenes-like or Waymo-like tasks), specify exact input formats, or draft concrete loss terms and training schedules aligned with your preferred datasets.
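
The "Loss components" list in section 4 of the proposal above combines a trajectory-fidelity term (mean-squared or Huber) with a comfort penalty on jerk. A minimal pure-Python sketch of those two terms follows, assuming 1-D positions sampled at a fixed timestep. The function names, the `jerk_weight` value, and the sample trajectories are hypothetical, and a real training setup would use a tensor library and the other loss terms as well:

```python
def huber(error, delta=1.0):
    """Huber loss for one scalar error: quadratic near zero, linear beyond delta."""
    magnitude = abs(error)
    if magnitude <= delta:
        return 0.5 * magnitude ** 2
    return delta * (magnitude - 0.5 * delta)

def trajectory_loss(predicted, expert, dt=0.1, jerk_weight=0.01, delta=1.0):
    """Mean Huber fidelity term plus a mean squared-jerk comfort penalty."""
    fidelity = sum(huber(p - e, delta) for p, e in zip(predicted, expert)) / len(predicted)
    # Jerk is the third time derivative of position; approximate it with third finite differences
    jerks = [
        (predicted[i + 3] - 3 * predicted[i + 2] + 3 * predicted[i + 1] - predicted[i]) / dt ** 3
        for i in range(len(predicted) - 3)
    ]
    comfort = sum(j ** 2 for j in jerks) / len(jerks) if jerks else 0.0
    return fidelity + jerk_weight * comfort

expert = [0.0, 1.0, 2.0, 3.0, 4.0]    # constant-velocity reference
smooth = [0.0, 1.0, 2.0, 3.0, 4.0]    # matches the reference exactly
jittery = [0.0, 1.2, 1.8, 3.2, 3.8]   # same trend, but oscillating
print(trajectory_loss(smooth, expert))
print(trajectory_loss(jittery, expert))
```

The oscillating trajectory is penalized twice: once for deviating from the expert and once for its high jerk, which is the behavior a multi-task loss of this kind is meant to produce.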

In [68]:
messages_to_manager_2 = [{"role": "user", "content":group_2_message_to_manager}]
response = openai.chat.completions.create(
    model=model_name,
    messages=messages_to_manager_2,
)
answer_manager_group_2 = response.choices[0].message.content
display(Markdown(answer_manager_group_2))


Unified Trajectory Planning Proposal (combining both employees’ approaches)

Executive summary
This proposal combines the strengths of both responses into a cohesive, modular trajectory-planning system for an autonomous vehicle. The architecture exposes three dedicated networks (Prediction, Path Planning, Velocity Planning) plus a multi-path planning and scoring layer that evaluates and selects the best trajectory in real time. It leverages rich inputs (HD maps, time-series histories of other road users, and perception outputs) and blends classical planning methods (graph-based and sampling-based) with data-driven models (RNN/LSTM-based prediction, learned cost/score components, and optional RL/MPC components). The result is a robust, scalable, and competitive framework suitable for benchmark-style evaluations.

1) System architecture and data flow
- Inputs
  - HD map data: road geometry, lanes, boundaries, traffic controls, static obstacles.
  - Time-series histories: past trajectories and states of nearby road users (vehicles, pedestrians, cyclists).
  - Perception outputs: detections, classifications, semantic scene information (lane markings, occlusions, drivable area, signals).
- Modules (modular, operating in a pipeline with optional feedback loops)
  - Prediction Module
    - Purpose: forecast future states of dynamic agents (other road users) with uncertainty.
    - Inputs: HD map, ego state, perception outputs, and time-series histories.
    - Output: a set of future trajectories for each agent (multi-hypothesis or probabilistic forecast) with confidence/uncertainty (e.g., per-hypothesis probability, occupancy likelihoods, or a probabilistic occupancy grid over horizon).
  - Path Planning Module
    - Purpose: generate feasible geometric paths that respect map constraints and predicted dynamics.
    - Inputs: HD map, ego state, predicted trajectories of other agents, semantic information (lanes, signals).
    - Outputs: a set of candidate paths (path primitives, segments or full-path trajectories) with associated feasibility and safety attributes.
  - Velocity Planning Module
    - Purpose: compute a time-parameterized velocity profile that follows a chosen path while satisfying dynamics, comfort, and rules.
    - Inputs: chosen path, horizon time, predicted obstacle constraints, vehicle dynamics model.
    - Outputs: velocity schedule along the path (and a corresponding acceleration profile), ready to execute.
  - Multi-Path Planning and Scoring System
    - Purpose: evaluate multiple path hypotheses and velocity plans, select the best overall trajectory.
    - Inputs: all candidate paths and their velocity profiles, predicted future states of dynamic objects, safety and comfort criteria.
    - Outputs: the selected trajectory (path + velocity) to execute, plus a re-plan trigger flag (if the environment changes significantly).
- Feedback and re-planning
  - The system is capable of re-planning at high refresh rates when predictions change or new perception data arrives, ensuring responsiveness to dynamic scenes.

2) Prediction Module (why this design)
- Approach
  - Use a learned prediction model (RNN/LSTM-based or transformer-based) to process time-series data of other road users along with perception and map information.
  - Produce multiple future hypotheses per agent (multi-hypothesis forecasting) with associated probabilities or uncertainties.
- Why it’s good
  - Captures temporal dependencies and interactions among agents.
  - Provides uncertainty quantification, enabling safer planning under ambiguity.
  - Easily extensible to incorporate additional inputs (e.g., intention signals, traffic rules, semantics).
- Alternatives and compatibility
  - If bandwidth or latency is tight, switch to a lighter probabilistic model or horizon-limited predictions, still feeding the Path Planning Module with uncertainty metrics.

3) Path Planning Module (why this design)
- Approach
  - Combine graph-based planning (A*, Dijkstra) for optimality guarantees with sampling-based methods (RRT*, PRM) to handle complex, non-convex constraints and dynamic gaps.
  - Leverage the HD map for global geometry and lane connectivity, and the predicted trajectories of other agents to carve out safe corridors.
  - Produce multiple candidate paths (path hypotheses) that explore different routes and ways to negotiate dynamic obstacles.
- Why it’s good
  - Guarantees on optimality where feasible (graph-based) and flexibility in complex environments (sampling-based).
  - Multi-path generation improves robustness to prediction errors and occlusions; enables better re-planning decisions.
  - Compatible with both classical optimization and learning-based cost components.
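
The graph-based half of this approach can be illustrated with a minimal A* search over a toy lane graph. This is only a sketch under stated assumptions: the node names, coordinates, and edge costs below are hypothetical, and a real planner would build the graph from the HD map's lane connectivity.

```python
import heapq
import math


def a_star(graph, coords, start, goal):
    """Return (path, cost) for the cheapest route from start to goal, or None."""

    def heuristic(node):
        # Straight-line distance to the goal; admissible as long as every
        # edge cost is at least the Euclidean distance between its endpoints.
        (x1, y1), (x2, y2) = coords[node], coords[goal]
        return math.hypot(x2 - x1, y2 - y1)

    # Heap entries: (estimated total cost, cost so far, node, path so far)
    open_heap = [(heuristic(start), 0.0, start, [start])]
    best_cost = {start: 0.0}

    while open_heap:
        _, cost, node, path = heapq.heappop(open_heap)
        if node == goal:
            return path, cost
        if cost > best_cost.get(node, math.inf):
            continue  # stale entry: a cheaper route to this node was found
        for neighbor, edge_cost in graph.get(node, []):
            new_cost = cost + edge_cost
            if new_cost < best_cost.get(neighbor, math.inf):
                best_cost[neighbor] = new_cost
                estimate = new_cost + heuristic(neighbor)
                heapq.heappush(open_heap, (estimate, new_cost, neighbor, path + [neighbor]))
    return None


# Hypothetical lane graph: node -> list of (successor, edge cost in metres)
coords = {"A": (0, 0), "B": (10, 0), "C": (10, 5), "D": (20, 0)}
graph = {
    "A": [("B", 10.0), ("C", 12.0)],
    "B": [("D", 10.0)],
    "C": [("D", 11.5)],
    "D": [],
}

result = a_star(graph, coords, "A", "D")
print(result)  # (['A', 'B', 'D'], 20.0)
```

To obtain several candidate paths rather than one, a planner could rerun the search with penalised edges or switch to a k-shortest-paths variant; that choice is not specified in the proposal.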

4) Velocity Planning Module (why this design)
- Approach
  - Primary method: Model Predictive Control (MPC) or optimization-based velocity planning along a fixed path, enforcing dynamic limits, comfort, and rule compliance.
  - Secondary enhancement: PID baseline for deterministic, fast-response control; optional reinforcement learning (RL) to adapt velocity behavior across varied scenarios (subject to safety constraints and validation).
- Why it’s good
  - Ensures safe, smooth, and efficient motion along the selected path.
  - MPC accounts for future predicted obstacles and ego dynamics, improving safety margins.
  - RL can improve comfort and energy efficiency in complex scenarios when properly constrained and validated.
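
As a rough illustration of velocity planning along a fixed path, the sketch below uses a forward-backward pass over per-point speed caps. This is a much simpler stand-in for the MPC formulation named above, not an implementation of it, and the function name, parameters, and numbers are all assumptions made for the example.

```python
import math


def velocity_profile(speed_caps, ds, v_start, a_max, d_max):
    """Limit speeds at equally spaced path points by acceleration and braking.

    speed_caps: maximum allowed speed at each point (m/s), e.g. from speed
                limits or predicted obstacles (0 means "must stop here").
    ds:         spacing between consecutive points (m).
    a_max:      maximum acceleration (m/s^2); d_max: maximum deceleration (m/s^2).
    """
    v = list(speed_caps)
    v[0] = min(v[0], v_start)

    # Forward pass: the vehicle cannot gain speed faster than a_max allows
    # (v_next^2 = v^2 + 2 * a * ds).
    for i in range(1, len(v)):
        v[i] = min(v[i], math.sqrt(v[i - 1] ** 2 + 2 * a_max * ds))

    # Backward pass: the vehicle must already be slow enough to reach each
    # upcoming cap while braking no harder than d_max.
    for i in range(len(v) - 2, -1, -1):
        v[i] = min(v[i], math.sqrt(v[i + 1] ** 2 + 2 * d_max * ds))

    return v


# Hypothetical case: 10 m/s cap, with a required stop at the last point.
profile = velocity_profile([10.0, 10.0, 10.0, 0.0], ds=5.0, v_start=10.0,
                           a_max=2.0, d_max=4.0)
print([round(s, 2) for s in profile])  # [10.0, 8.94, 6.32, 0.0]
```

If the backward pass lowers the first entry below the current speed, the stop cannot be reached within `d_max`, which is one possible signal for triggering a re-plan.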

5) Multi-Path Planning and Scoring System (why this design)
- Concept
  - For each segment of the map and each time horizon, generate a diverse set of candidate paths (N paths) and associated velocity profiles (M profiles per path).
  - For each candidate, compute a composite cost that encodes safety, time/effort, comfort (jerk, acceleration continuity), consistency with traffic rules, and energy usage.
  - Weigh costs with a predefined scheme or learnable weights, ensuring safety is primary.
  - Rank and select the top trajectory to execute. Trigger replanning if new data significantly changes costs or feasibility.
- Why it’s good
  - Real-time resilience: multiple viable trajectories offer alternatives if obstacles appear or predictions shift.
  - Balances competing objectives (safety, efficiency, comfort) via a flexible scoring system.
  - Compatible with benchmarking requirements that reward robustness and performance across scenarios.
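
The scoring idea can be sketched as a weighted sum with a hard safety gate. The cost terms, weights, clearance threshold, and candidate values below are illustrative assumptions, not figures from the proposal.

```python
import math
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    min_clearance_m: float  # closest predicted distance to any obstacle
    travel_time_s: float
    max_jerk: float         # comfort proxy (m/s^3)


# Hypothetical weights; safety is weighted most heavily.
WEIGHTS = {"safety": 10.0, "time": 1.0, "comfort": 0.5}
MIN_SAFE_CLEARANCE_M = 1.0  # assumed hard constraint


def composite_cost(c):
    """Lower is better; candidates violating the hard constraint cost infinity."""
    if c.min_clearance_m < MIN_SAFE_CLEARANCE_M:
        return math.inf
    safety_term = 1.0 / c.min_clearance_m  # smaller clearance -> higher cost
    return (WEIGHTS["safety"] * safety_term
            + WEIGHTS["time"] * c.travel_time_s
            + WEIGHTS["comfort"] * c.max_jerk)


def select_best(candidates):
    """Return the cheapest feasible candidate, or None if all are unsafe."""
    feasible = [c for c in candidates if not math.isinf(composite_cost(c))]
    return min(feasible, key=composite_cost) if feasible else None


candidates = [
    Candidate("keep_lane", min_clearance_m=2.0, travel_time_s=12.0, max_jerk=1.0),
    Candidate("overtake", min_clearance_m=0.6, travel_time_s=9.0, max_jerk=2.5),
    Candidate("slow_follow", min_clearance_m=4.0, travel_time_s=15.0, max_jerk=0.5),
]

best = select_best(candidates)
print(best.name, composite_cost(best))  # keep_lane 17.5
```

Treating the clearance threshold as a gate rather than just a large weight keeps an unsafe but fast candidate such as `overtake` from ever winning on time alone.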

6) Data preprocessing and integration (practical considerations)
- HD map
  - Normalize coordinate frames, extract lane graph, road boundaries, intersections, signals, and stop lines.
- Time-series histories
  - Build agent-specific motion histories and interactions (position, velocity, yaw, type).
  - Generate features such as relative distances, closure rates, and behavioral priors.
- Perception outputs
  - Fuse detections with semantic labels (e.g., lane markings, drivable area, crosswalks) to refine planning constraints.
- Real-time constraints
  - Maintain bounded latency budgets for each module; implement asynchronous processing where possible.
  - Use uncertainty-aware representations (e.g., probabilistic occupancy, per-agent confidence) to influence planning.
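
Two of the features mentioned above, relative distance and closure rate, can be computed from positions and velocities in a shared 2D frame. The sketch below assumes closure rate means the rate at which the gap shrinks (positive when the agent and ego are getting closer); the function name and example values are hypothetical.

```python
import math


def relative_features(ego_pos, ego_vel, agent_pos, agent_vel):
    """Return (distance in m, closure rate in m/s) between ego and one agent."""
    dx, dy = agent_pos[0] - ego_pos[0], agent_pos[1] - ego_pos[1]
    dvx, dvy = agent_vel[0] - ego_vel[0], agent_vel[1] - ego_vel[1]
    distance = math.hypot(dx, dy)
    if distance == 0.0:
        return 0.0, 0.0  # coincident points: direction is undefined
    # Rate of change of distance is (relative position . relative velocity) / distance;
    # negate it so that a shrinking gap gives a positive closure rate.
    closure_rate = -(dx * dvx + dy * dvy) / distance
    return distance, closure_rate


# Ego at 10 m/s, lead vehicle 30 m ahead at 6 m/s: the gap closes at 4 m/s.
features = relative_features((0.0, 0.0), (10.0, 0.0), (30.0, 0.0), (6.0, 0.0))
print(features)  # (30.0, 4.0)
```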

7) Training, validation, and benchmarking
- Prediction module
  - Train with multi-task losses (position, velocity, orientation, and uncertainty) on diverse driving datasets.
  - Use offline evaluation with metrics such as root-mean-square error, final-state error, and probabilistic calibration (proper confidence estimates).
- Planning and scoring
  - Validate the multi-path planner in simulation with diverse scenarios (dense traffic, occlusions, complex intersections).
  - Use metric suites: safety (minimum distance to obstacles, collision rate in simulation), efficiency (average travel time, path length), comfort (jerk, acceleration profiles), and robustness (re-planning frequency, failure rates).
- Continuous learning
  - Optional feedback loop to refine prediction and scoring weights from real-world data, ensuring safe adaptation over time.
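
The offline prediction metrics named above can be sketched in a few lines. This assumes trajectories are equal-length lists of (x, y) points and interprets "final-state error" as the position error at the last timestep; both are assumptions for illustration.

```python
import math


def rmse(predicted, actual):
    """Root-mean-square position error over all timesteps."""
    squared = [(px - ax) ** 2 + (py - ay) ** 2
               for (px, py), (ax, ay) in zip(predicted, actual)]
    return math.sqrt(sum(squared) / len(squared))


def final_state_error(predicted, actual):
    """Euclidean position error at the final timestep."""
    (px, py), (ax, ay) = predicted[-1], actual[-1]
    return math.hypot(px - ax, py - ay)


# Hypothetical trajectories: the prediction drifts 2 m sideways by the last step.
predicted = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
actual = [(0.0, 0.0), (1.0, 0.0), (2.0, 2.0)]

print(round(rmse(predicted, actual), 3))     # 1.155
print(final_state_error(predicted, actual))  # 2.0
```

For multi-hypothesis forecasts, a common variant scores only the best of the K predicted trajectories per agent; whether to do so is a benchmark-specific choice.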

8) Safety, reliability, and real-time considerations
- Safety-first design
  - Always enforce hard safety constraints in the velocity planning and path planning modules (keep clearances, legal speed limits, stop requirements).
  - Use conservative defaults when uncertainty is high; prefer re-planning over risky execution.
- Real-time viability
  - Parallelize modules where possible; fix upper bounds on per-module latency; reduce horizon adaptively in dense traffic to maintain refresh rates.
- Validation and governance
  - Rigorously validate all changes in simulation before deployment to real-world environments.
  - Use formal safety cases or fail-safes to ensure a safe fallback mode if perception or prediction fails.

9) Competitive advantages for benchmarks
- Modularity and specialization
  - Separate prediction, planning, and velocity control allows targeted optimization, better generalization, and easier debugging.
- Multi-path exploration
  - Generating and evaluating multiple trajectories improves robustness to perception/prediction errors, a key advantage in benchmarks with varied scenarios.
- Blend of learning and classical methods
  - Combines the strengths of data-driven predictions with proven planning algorithms, offering both performance and reliability.
- Data richness and adaptability
  - The system readily accommodates richer inputs (traffic signals, road regulations, V2X data) and can adapt to new benchmarks with minimal architectural changes.

10) Implementation plan and milestones (high level)
- Phase 1: Baseline modular pipeline
  - Implement prediction (multi-hypothesis), graph-based path planning, and MPC-based velocity planning with a fixed set of weights for scoring.
- Phase 2: Multi-path generation and scoring
  - Add multiple path hypotheses, implement scoring function, and validate against simulated benchmarks.
- Phase 3: Advanced integration and learning
  - Introduce RL/MPC enhancements for velocity, refine costs through offline/online learning, and calibrate uncertainty handling.
- Phase 4: Validation and deployment readiness
  - Comprehensive offline validation, closed-loop simulations, and safety certification activities.

11) Potential risks and mitigations
- Prediction errors leading to unsafe plans
  - Mitigation: maintain high-level safety envelopes, conservative re-planning, and uncertainty-aware planning.
- Latency and real-time constraints
  - Mitigation: optimize code paths, parallelization, horizon adaptivity, and lightweight prediction configurations when needed.
- Overfitting to benchmarks
  - Mitigation: test on diverse simulated scenarios, enforce cross-dataset validation, and maintain a strong safety-first baseline.

Conclusion
This unified proposal preserves the modular, data-driven spirit of both employees’ ideas while delivering a coherent system that is robust, scalable, and competitive for benchmark environments. By separating prediction, path planning, and velocity planning, and by employing a multi-path planning and scoring framework, the vehicle can make safe, efficient, and comfortable decisions in a wide range of traffic conditions. The approach is adaptable to richer inputs and can incorporate learning-based refinements over time, all while maintaining the reliability expected in real-world autonomous driving deployments.

In [69]:
both_proposals = f"""You are given two proposals from two groups for solving the following task: {task}. You are the boss who decides where we should put our effort, so your goal is to decide which proposal is the best one to go with. Once you decide, explain why you chose it. The proposals are the following. The proposal of group 1: {answer_manager_group_1}.

The proposal of group 2: {answer_manager_group_2}.
"""




In [70]:
# Ask a reasoning model to act as the decision-maker and pick between the two proposals

openai = OpenAI()
both_proposals_message = [{"role": "user", "content": both_proposals}]
response = openai.chat.completions.create(
    model="o3-mini",
    messages=both_proposals_message,
)
results = response.choices[0].message.content
display(Markdown(results))


I would choose Group 1’s “Unified End-to-End Trajectory Planning Network (UE-ETPN)” proposal. Here’s why:

1. Integrated Multimodal Fusion with Explicit Interaction Modeling
 • Group 1’s approach carefully fuses rich map data, perception outputs, and time-series histories through dedicated encoders (e.g., a graph neural network for the lane graph and temporal models for agent states). This kind of tightly coupled, end-to-end scheme makes it easier to learn complex, high-level representations of both static (map, rules) and dynamic (other agents’ behaviors) aspects of the scene.
 • The explicit “scene interaction module” that reasons via message passing (similar to a graph-based interaction model) is highly attractive, as it naturally captures inter-agent interactions along with map-based context (e.g., occlusions, lane-change intents). This joint reasoning is essential for safe and smooth trajectory planning.

2. Differentiable Planning for Safety and Feasibility
 • The integrated differentiable planning layer (akin to a differentiable MPC layer) is a very appealing design choice. It helps bridge the gap between pure learning and classic control, by allowing the network to correct its raw outputs based on known kinematic/dynamic constraints and safety margins.
 • Because the planning constraints (collision avoidance, lane adherence, comfort metrics, etc.) are baked into the network training, the output trajectory can be fine-tuned to satisfy hard constraints – a big plus for real-world deployment and benchmark credibility.

3. End-to-End Training and Real-Time Considerations
 • An end-to-end design—with clear steps from inputs through risk-aware behavior prediction and constraint enforcement—not only simplifies the training pipeline (imitation plus optional reinforcement learning) but also ensures that gradient feedback touches every stage of the system. This can be key for coordinated state-of-the-art performance.
 • The proposal also addresses practical issues such as latency, uncertainty propagation, and optional safety filters which are critical in real-world scenarios.

4. Simplicity, Interpretability, and Scalability
 • Although Group 2’s modular approach is very comprehensive, its breaking up the problem into distinct networks (prediction, path planning, velocity planning) could lead to increased system complexity and more intricate integration challenges.
 • The unified approach of Group 1, by embedding a holistic planning solution in one end-to-end differentiable pipeline, is generally easier to debug, tune, and adapt to different benchmarks (e.g., nuScenes or Waymo) through a single training process.
 • Furthermore, having a “differentiable planning” module ensures that the network’s behavior is directly optimized for both performance and safety constraints, which is key when your benchmark metrics span safety, comfort, and efficiency.

In summary, Group 1’s proposal offers a compelling blend of end-to-end learning, explicit graph-based reasoning, and differentiable constraint handling in one coherent framework. This integrated solution seems best positioned to meet the evolving requirements of both benchmark evaluations and real-world autonomous driving challenges while keeping the design relatively streamlined.