### Import required packages

In [1]:
# import the required packages

import os
import json
from openai import OpenAI
from groq import Groq
from dotenv import load_dotenv
from IPython.display import display, Markdown

### 1. Setting up the environment and API clients

In [2]:
# load the api keys
load_dotenv(override=True)
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
GOOGLE_API_KEY = os.getenv("GOOGLE_API_KEY")
GROQ_API_KEY   = os.getenv("GROQ_API_KEY")
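
Before creating the clients, it can help to confirm that each key was actually loaded, since `os.getenv` returns `None` for a missing variable and the failure otherwise only shows up at the first API call. The following is a minimal sketch; `describe_key` is a hypothetical helper, not part of `dotenv` or any client library, and it prints only the first few characters so the key is not exposed in the notebook output.

```python
import os


def describe_key(name, value, visible=4):
    """Return a short status line for an API key without revealing it."""
    if not value:
        return f"{name}: not set"
    return f"{name}: set (begins {value[:visible]}...)"


# Report on the three keys this notebook expects to find in the environment.
for name in ("OPENAI_API_KEY", "GOOGLE_API_KEY", "GROQ_API_KEY"):
    print(describe_key(name, os.getenv(name)))
```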

In [3]:
# message for the LLMs
question  = "Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. "
question += "Only provide the question, strictly do not add any other explanations."
message   = [{"role": "user", "content": question}]
message


[{'role': 'user',
  'content': 'Please come up with a challenging, nuanced question that I can ask a number of LLMs to evaluate their intelligence. Only provide the question, strictly do not add any other explanations.'}]

In [4]:
# set up the API clients; Gemini and Groq are reached through their OpenAI-compatible endpoints
openai_client = OpenAI(api_key=OPENAI_API_KEY)
gemini_client = OpenAI(api_key=GOOGLE_API_KEY, base_url="https://generativelanguage.googleapis.com/v1beta/openai/")
groq_client   = OpenAI(api_key=GROQ_API_KEY, base_url="https://api.groq.com/openai/v1")

In [5]:
# call OpenAI gpt-4o-mini to generate the question
model = "gpt-4o-mini"
response = openai_client.chat.completions.create(model=model,messages=message)
question = response.choices[0].message.content
display(Markdown(question))


If a person can only make decisions based on binary choices, how would you evaluate the complexity of their decision-making process when faced with a situation that involves multiple conflicting values or priorities?

### 2. Comparing responses from multiple models

In [6]:
# collect each model's answer and name in parallel lists for the later comparison
responses  = []
model_name = []
message = [{"role": "user", "content": question}]
print(message)

[{'role': 'user', 'content': 'If a person can only make decisions based on binary choices, how would you evaluate the complexity of their decision-making process when faced with a situation that involves multiple conflicting values or priorities?'}]


In [7]:
model ="gpt-4o-mini"
response = openai_client.chat.completions.create(model=model,messages=message)
answer = response.choices[0].message.content
display(Markdown(answer))
responses.append(answer)
model_name.append(model)

Evaluating the complexity of a person's decision-making process when they can only make binary choices—particularly in situations involving multiple conflicting values or priorities—requires an understanding of both the cognitive constraints of binary decision-making and the nuances of navigating complex scenarios.

1. **Binary Decision-Making Limitation**: A person restricted to binary choices (yes/no, true/false, accept/reject) faces inherent limitations in expressing preferences or weighing options. In situations with conflicting values, this binary framework may oversimplify the range of possibilities and nuances, leading to potential dissatisfaction or regret after the decision.

2. **Complexity of Conflicting Values**: When faced with multiple conflicting values, the decision-maker must navigate trade-offs. For example, if one value prioritizes personal gain while another emphasizes social responsibility, the binary choice may force them to prioritize one over the other without fully accounting for the implications. This creates a tension in their decision-making process, as they may struggle to align their choice with their broader values.

3. **Cognitive Dissonance**: The requirement to choose between conflicting values can lead to cognitive dissonance, where the individual feels discomfort or conflict regarding their decision. They might rationalize their choice post-hoc or experience guilt or uncertainty about their decision's alignment with their values.

4. **Sequential Decision-Making**: To cope with binary limitations in complex scenarios, the person might employ sequential decision-making. They could approach the situation step-by-step, making a series of binary choices that cumulatively reflect their priorities. This adds layers to the decision process, as each choice can influence subsequent options.

5. **Contextual Factors**: The decision's context, including external pressures (time constraints, social expectations), can add to its complexity. A binary decision in a high-stakes context may evoke emotional responses that complicate the process further.

6. **Impacts of Simplification**: The oversimplification of choices can lead to missed opportunities for more thoughtful considerations. Ideal decision-making would incorporate a spectrum of options, allowing for nuanced outcomes. However, with a binary limitation, outcomes may lean towards being polarized rather than balanced.

In summary, while binary decision-making simplifies the process in one sense, it introduces complexity when navigating multifaceted, conflicting values. The person may employ various strategies, including sequential decision-making and rationalization, to manage these conflicts. The evaluation of their decision-making complexity lies in understanding how these factors interplay and influence the eventual choice, despite the binary framing.

In [8]:
# comparing multiple models
model ="qwen/qwen3-32b"
response = groq_client.chat.completions.create(model=model,messages=message)
answer = response.choices[0].message.content
display(Markdown(answer))
responses.append(answer)
model_name.append(model)

<think>
Okay, so the user is asking about evaluating the complexity of a person's decision-making process when the person can only make binary choices, but they're faced with a situation that has multiple conflicting values or priorities. Hmm. Let me break this down.

First, I need to understand what it means to make decisions based on binary choices. Binary here probably means that each decision is between two options, like yes/no, on/off, etc. But when there are multiple values in conflict, how does reducing each decision to binary steps affect the complexity?

Let me think. If someone has to navigate several conflicting priorities, maybe each decision they make is a step that resolves part of the conflict, but the overall process might involve more steps. But since each step is binary, maybe the complexity is in the number of binary steps needed to address all conflicts. Alternatively, maybe the process becomes more complex because each binary decision might not fully resolve the issue, leading to a need for more decisions or a chain of binary choices.

I should consider how binary choices compose into a more complex decision tree. For example, if there are three conflicting values, each with their own priorities, the decision tree might branch out into multiple layers of binary decisions. The complexity could be measured in terms of the depth of the decision tree or the number of possible paths through these binary decisions.

Wait, but how does the conflict between values play into this? If there are multiple priorities, each binary decision might only address one aspect, leading to a situation where the person has to make a series of binary choices, each resolving part of the conflict. The complexity arises from the number of such decisions and how they interact.

Another angle: if all the decisions are binary, maybe the person has to prioritize one value over another at each step, leading to a series of trade-offs. The challenge is in how these trade-offs are structured. For example, using a decision matrix or a weighted scoring system, but since it's binary, maybe they have to iteratively eliminate options or choose between pairs until a final decision is made.

Wait, but in binary decision-making, each decision is a yes/no. So maybe they have to pair off the different priorities and decide which one to prioritize in each pair. The complexity would then depend on the number of such pairings and the order in which they're made. If the priorities are independent, the order might not matter much, but if they're interdependent, the sequence of decisions could affect subsequent choices, leading to different outcomes.

I also need to consider heuristics or algorithms. For example, the person might use a simple majority rule, always choosing the majority of their priorities, but that could ignore conflicting ones. Alternatively, they might have a hierarchy, where certain priorities take precedence, leading to a more structured approach with a fixed order of decision-making.

But the user is asking about the complexity. So maybe the key is to model this as a decision tree or a binary decision diagram. The complexity in terms of computational complexity would depend on the number of nodes or the depth. However, the user is probably asking more about the cognitive load or the number of steps involved, rather than computational complexity per se.

Another thought: when dealing with multiple conflicting values, each binary decision could be considered a constraint. The person is essentially constructing a series of constraints through their binary choices. The complexity would then be the number of constraints needed to resolve all conflicts, which might be non-trivial.

But how can we quantify this? If there are n conflicts, each resolved by a binary choice, the complexity could be exponential in the worst case if each decision leads to two branches. However, if the person is using a decision-making strategy that reduces the problem step-wise, maybe the complexity is more linear or polynomial.

Wait, maybe using game theory or multi-attribute utility theory as a framework. If each binary decision corresponds to a trade-off between attributes, the complexity is the number of such trade-offs required to reach a final decision that balances all the conflicting values. 

I should also think about examples. Suppose someone has to decide between job A and job B, considering salary, location, and benefits. If they must make binary choices, they might first compare job A vs. Job B on salary, then on location, then on benefits. Each binary decision reduces the options. But how does this handle conflicts where one job better on salary but worse on location?

In this case, the person might have to make a series of binary choices. For instance, first decide which attribute is more important (salary vs. location), then within each attribute compare the options. But if the decision is constrained to binary each time, the process becomes a chain of yes/no questions. The complexity here would be the number of binary decisions required to resolve all conflicts, which could be several steps if attributes are interdependent.

Alternatively, maybe the person uses a binary approach by creating pairwise comparisons. For example, in the case of multiple priorities, they might compare each pair of priorities and assign which one should be given more weight. This could be modeled similarly to a tournament in a sports bracket, where each priority is a team and each binary decision is a match between two.

The complexity here would be the number of such pairwise comparisons needed. For m priorities, the number could be m(m-1)/2 in the worst case if each pair has to be compared. That's quadratic complexity. But maybe there's a smarter way to structure the comparisons, reducing the number.

However, the user is asking how to evaluate the complexity. So, the evaluation would involve determining the number of binary decisions required, the structure of the decision process (like whether it's a binary tree or a chain), and how these decisions resolve the conflicts. Additionally, there might be the complexity of ensuring that all conflicts are addressed and that the final decision is consistent with all the binary choices made along the way.

Another aspect is the potential for intransitivity. If binary choices are made in a way that doesn't consider the overall picture, the person might end up with conflicting decisions. For example, choosing A over B, B over C, but then finding C better than A. This inconsistency would imply that the binary decision-making process might not adequately handle transitivity, adding to the complexity as the person might have to backtrack or adjust previous decisions.

Moreover, when there are multiple conflicting values, the person might need to iterate through the binary decisions multiple times, adjusting their preferences or priorities based on the outcomes of previous decisions. This could lead to an iterative process where each binary decision is followed by a reassessment, increasing the overall complexity.

In summary, the complexity arises from the number of binary decisions required, the structure of their interactions (whether they form a tree, a chain, or a web), the potential for inconsistency or intransitivity, and the need for iterative refinement. Evaluating this would involve analyzing the decision process in terms of these factors.
</think>

Evaluating the complexity of decision-making when limited to **binary choices** amidst **multiple conflicting values** involves analyzing how binary decisions interact with hierarchical priorities, constraints, and trade-offs. Here's a structured breakdown:

---

### **1. Key Drivers of Complexity**
- **Number of Binary Decisions Required**:  
  Each binary choice resolves a part of the problem, but multiple decisions may be necessary to address all conflicts. For example, if there are $ n $ values/priorities, the number of pairwise comparisons or trade-offs (using a binary elimination process) could grow non-linearly (e.g., $ O(n^2) $ for all pairwise comparisons).

- **Interdependence of Decisions**:  
  Binary choices might cascade: one decision affects the options or weights of subsequent choices. This creates a **decision tree** where each node splits into two branches, increasing complexity exponentially with depth.

- **Conflict Resolution Strategies**:  
  - **Hierarchical Prioritization**: Decisions might prioritize values in a fixed order (e.g., "safety > cost > convenience"), reducing complexity by linearizing trade-offs.  
  - **Iterative Backtracking**: If conflicts arise (e.g., inconsistency between earlier and later decisions), the person may need to revisit prior choices, increasing cognitive load.

- **Intransitivity Risks**:  
  Binary choices can lead to paradoxes (e.g., $ A > B $, $ B > C $, but $ C > A $), requiring additional decisions to resolve inconsistencies. This adds complexity.

---

### **2. Frameworks for Evaluation**
#### **a. Decision Tree Analysis**
- **Depth**: The number of binary decisions needed to reach a conclusion (e.g., depth $ d $).  
- **Branching Factor**: Each decision splits into two paths, leading to $ 2^d $ possible outcomes if all branches are explored.  
- **Efficiency**: Heuristics (e.g., pruning unlikely branches) reduce complexity, similar to binary search algorithms.

#### **b. Pairwise Comparison Modeling**
- For $ n $ values, **$ \frac{n(n-1)}{2} $** pairwise binary decisions may be required (to rank all values). Complexity is **quadratic** in the worst case.  
- If values can be grouped hierarchically (e.g., using the **Analytic Hierarchy Process**), the number of decisions decreases logarithmically.

#### **c. Logical Constraints**
- **Satisfiability Problems**: Binary choices can be modeled as logical constraints (e.g., "If priority A is chosen, then constraint B must be satisfied"). Complexity depends on how many constraints exist and their dependencies.

---

### **3. Example: Job Selection Dilemma**
Consider choosing between three jobs (A, B, C) with conflicting priorities:  
- **Salary**, **Location**, **Professional Development**.

#### **Binary Decision Steps**:  
1. **Salary vs. Location**: "Is salary more important than location?"  
2. **Professional Development vs. Salary**: "Is professional development more important than salary?"  
3. **Pairwise Job Comparisons**:  
   - "Is Job A better than Job B based on priority X?"  
   - "Is the result consistent with earlier decisions?"  

#### **Complexity Evaluation**:  
- **Quadratic** in pairwise comparisons (e.g., 3 jobs, 3 priorities → 6 comparisons).  
- **Risk of Intransitivity**: If decisions lead to $ A > B $, $ B > C $, but $ C > A $, complexity increases due to backtracking.

---

### **4. Cognitive Load and Heuristics**
- **Simplification Strategies**:  
  - **Thresholds**: Convert multi-criteria decisions into binary by setting thresholds (e.g., "I require salary > $X" or "Location must be within 1 hour").  
  - **Default Choices**: Prioritize one dominant value over others (e.g., "Salary overrides all other factors").  

- **Emotional or Contextual Factors**: Binary choices may become more complex when emotions or urgency override structured trade-offs.

---

### **5. Quantifying Complexity**
| **Metric**                | **Description**                                                                 |
|----------------------------|---------------------------------------------------------------------------------|
| **Number of Decisions**     | Linear/quadratic/worst-case growth based on problem size.                      |
| **Path Diversity**          | Branching decisions leading to $ 2^d $ possible outcomes.                      |
| **Consistency Checks**      | Re-evaluating prior decisions to resolve intransitivities or contradictions.    |
| **Hierarchical Depth**      | Efficiency gains from reducing multi-attribute problems to step-wise binary choices. |

---

### **6. Conclusion**
The complexity of binary decision-making in multi-value conflicts is **non-trivial** and depends on:  
- **How decisions interact** (independently, hierarchically, or iteratively).  
- **The structure of the problem** (number of values/priorities, their interdependencies).  
- **Strategies employed** (e.g., pruning redundant decisions, using heuristics).  

In practice, this process mirrors **computational complexity in constrained optimization**, where binary decisions aim to approximate a multi-dimensional solution space using incremental, dichotomous steps. The challenge lies in balancing **completeness** (addressing all conflicts) with **cognitive/decisional efficiency**.
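
The `qwen/qwen3-32b` answer above arrives with its reasoning wrapped in a `<think>` block, and that whole string is what gets appended to `responses` and later shown to the judge. If only the final answer should be compared, the reasoning can be removed first. This is a rough sketch under that assumption; `strip_reasoning` is a hypothetical helper name, and it relies only on the tag format visible in the output above.

```python
import re

# Non-greedy match so several <think> blocks are removed independently.
THINK_BLOCK = re.compile(r"<think>.*?</think>\s*", flags=re.DOTALL)


def strip_reasoning(text):
    """Remove <think>...</think> blocks and return the remaining answer."""
    return THINK_BLOCK.sub("", text).strip()


raw = "<think>\nLet me break this down.\n</think>\n\nFinal answer."
print(strip_reasoning(raw))
```

In the cell above, this would mean appending `strip_reasoning(answer)` rather than `answer`; whether the judge should see the reasoning is a design choice, not something the notebook prescribes.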

In [9]:
model = "gpt-5-nano"
response = openai_client.chat.completions.create(model=model, messages=message)
answer = response.choices[0].message.content
display(Markdown(answer))
responses.append(answer)
model_name.append(model)

Treat the situation as a Boolean (binary) decision problem. Each conflicting value or priority can be viewed as a binary factor: it is either satisfied (1) or not satisfied (0) by a given choice. The more independent conflicting values you have, the larger the space of possible trade-offs becomes, and the more complex the decision process tends to be.

How to think about and quantify the complexity

- Model and size
  - Let m be the number of independent conflicting values/priorities the person must balance.
  - Each decision option can be described by a vector in {0,1}^m indicating which values it satisfies.
  - If the person asks binary questions sequentially to narrow down the best option, the decision tree can have up to 2^m leaves (one for each possible pattern of satisfied/unsatisfied values) in the worst case.
  - The height of the decision tree is the maximum number of binary questions asked; at most m if you check one value at a time, though smarter questioning could prune more quickly.

- Complexity measures you can compute
  - Decision depth: the number of binary questions asked from start to finish.
  - Tree size: total number of internal nodes and leaves in the decision tree.
  - Number of possible outcome paths: up to 2^m in the worst case.
  - Information content: if you have a prior distribution over which trade-offs are likely, you can compute the entropy of the decision path to gauge cognitive load.
  - Computational/algorithmic complexity: if you formalize the problem as finding an option that maximizes a weighted sum of satisfied values, it can be polynomial in the number of options (O(nm) to score options, for fixed weights). If, however, you require a globally optimal compromise among many interacting binary constraints, the problem can become NP-hard in general (similar to 0-1 optimization or SAT variants) when m grows and the constraints interdepend.

- What drives practical complexity
  - Independence of values: If values are highly interdependent (satisfying one makes others easier or harder), the effective complexity can be lower than the raw 2^m bound.
  - Tie-breaking rules: A fixed rule (e.g., lexicographic prioritization) often reduces cognitive load to a predictable, linear number of checks.
  - Heuristics and structure: In real people, heuristics (rules of thumb, defaults, defaults by priority) dramatically reduce the effective depth. Without such rules, complexity grows quickly with m.

- Example to illustrate
  - Suppose there are 4 conflicting values: safety (S), privacy (P), cost (C), speed (Sp) → m = 4.
  - If you had to consider all possible patterns of which values are satisfied, there are up to 2^4 = 16 patterns.
  - With a simple tie-breaking rule (e.g., rank values by priority and accept the first option that satisfies the top k values), you might resolve the choice with far fewer binary checks (depth ≤ 4 in the worst case under one-item-at-a-time checking).
  - If you instead attempt to maximize a weighted sum of satisfied values and must compare many options to find the best trade-off, you’ve turned it into a small optimization problem. For moderate m and a modest number of options, this can be solved efficiently; for large, highly constrained systems, it can become computationally expensive.

Practical takeaways

- Expect combinatorial growth: with m independent conflicting values, the space of trade-offs grows roughly exponentially (up to 2^m possible patterns).
- Use structure to keep it manageable: impose a clear priority order (lexicographic), use thresholds, or group values into higher-level categories to reduce the effective m.
- If you model it algorithmically, be aware of NP-hardness in the most general forms. In practice, simple heuristics or hierarchical rules keep the process tractable.
- When evaluating a real person’s decision-making, you can measure cognitive load empirically by the depth of their reasoning (how many binary checks or considerations they go through) and the time taken, rather than assuming a fixed worst-case complexity.

In short: the complexity of binary-based decision-making under multiple conflicting values tends to scale exponentially with the number of independent conflicts unless a guiding rule or structure (priority order, grouping, heuristics) significantly reduces the effective number of binary checks.

In [10]:
model ="llama-3.3-70b-versatile"
response = groq_client.chat.completions.create(model=model,messages=message)
answer = response.choices[0].message.content
display(Markdown(answer))
responses.append(answer)
model_name.append(model)

Evaluating the complexity of a decision-making process based on binary choices can be quite challenging, especially when faced with situations involving multiple conflicting values or priorities. Here's a step-by-step approach to assess the complexity:

1. **Identify the key values or priorities**: Break down the situation into its core components, and list the relevant values or priorities at play. This could include factors like personal interests, moral principles, social norms, or practical considerations.
2. **Determine the number of relevant binary choices**: For each key value or priority, identify the binary choices that the person must make. For example, "Should I prioritize my personal interests (A) or the well-being of others (B)?" or "Should I choose option X (A) or option Y (B)?" Count the number of these binary choices.
3. **Assess the degree of conflict between values or priorities**: Evaluate how much each binary choice conflicts with other values or priorities. For instance, if the person must choose between two options that both have significant consequences for different stakeholders, the conflict is higher. You can use a scale, such as:
	* Low conflict: The binary choices have minimal impact on other values or priorities.
	* Medium conflict: The binary choices have some impact, but there are still clear priorities.
	* High conflict: The binary choices have significant, competing implications for multiple values or priorities.
4. **Evaluate the decision-making process's hierarchical structure**: Consider how the person's binary choices are organized. Are they:
	* Linear: Each binary choice is made independently, with no dependencies between them.
	* Hierarchical: Binary choices are nested, with earlier choices influencing later ones.
	* Interconnected: Binary choices are interconnected, with each choice affecting multiple other values or priorities.
5. **Calculate the complexity score**: Assign a complexity score based on the factors above. You can use a formula like:

Complexity score = (Number of binary choices) × (Average conflict level) × (Hierarchical structure factor)

Where:

* Number of binary choices is the total count from step 2.
* Average conflict level is the average conflict score from step 3.
* Hierarchical structure factor is:
	+ 1 for linear structures
	+ 1.5 for hierarchical structures
	+ 2 for interconnected structures

Example:
Suppose a person faces a situation with 5 key values or priorities, each with 2 binary choices. The conflict level is medium (score: 0.5) for 3 of the choices and high (score: 1) for the other 2. The decision-making process has a hierarchical structure. The complexity score would be:

Complexity score = (10 binary choices) × (0.6 average conflict level) × (1.5 hierarchical structure factor) = 9

6. **Interpret the complexity score**: A higher complexity score indicates a more complex decision-making process. This can help you understand the person's challenges in making a decision and the potential for trade-offs or compromised solutions.

Keep in mind that this framework is a simplification, and the actual complexity of the decision-making process may be higher or lower, depending on the specific context and the individual's cognitive abilities.
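
Cells 7 through 10 repeat the same steps: call `chat.completions.create`, read `choices[0].message.content`, then append to `responses` and `model_name`. One way to cut that repetition is a small wrapper. The sketch below is an assumption about how the cells could be refactored; `ask`, `collect`, and the stub client are hypothetical names, and the stub exists only so the example runs without network access or API keys.

```python
from types import SimpleNamespace


def ask(client, model, messages):
    """Send messages to an OpenAI-compatible client and return the reply text."""
    response = client.chat.completions.create(model=model, messages=messages)
    return response.choices[0].message.content


def collect(client, model, messages, responses, model_name):
    """Ask one model and record its answer and name in the parallel lists."""
    answer = ask(client, model, messages)
    responses.append(answer)
    model_name.append(model)
    return answer


class _StubCompletions:
    """Offline stand-in that mimics the shape of a chat completion response."""

    def create(self, model, messages):
        text = f"[{model}] echo: {messages[-1]['content']}"
        message = SimpleNamespace(content=text)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


stub_client = SimpleNamespace(chat=SimpleNamespace(completions=_StubCompletions()))

demo_responses, demo_names = [], []
demo_message = [{"role": "user", "content": "hi"}]
print(collect(stub_client, "demo-model", demo_message, demo_responses, demo_names))
```

With the real clients, each cell would reduce to a call such as `collect(groq_client, "llama-3.3-70b-versatile", message, responses, model_name)` followed by `display(Markdown(...))`.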

### 3. Finding the best response

In [None]:
command = f"""Consider yourself a judge; you are judging the responses from {len(responses)} competitors.
The competitors were each given this question: {question}
Your job is to evaluate each response for clarity and strength of argument, and rank them in order of best to worst.
Respond with JSON, and only JSON, with the following format:
{{"results": ["best competitor number", "second best competitor number", "third best competitor number", ...]}}

Here are the responses from each competitor:

{responses}

Now respond with the JSON with the ranked order of the competitors, nothing else. Do not include markdown formatting or code blocks.
"""

In [14]:
judge_command = [{"role":"user", "content":f"{command}"}]
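
The prompt interpolates `{responses}` directly, which inserts the Python list representation, so the judge has to infer which answer belongs to competitor 1, 2, and so on. Labelling each answer explicitly may make the ranking easier to trust. The sketch below shows one way to do that; `format_competitors` is a hypothetical helper, and the heading text it emits is an arbitrary choice.

```python
def format_competitors(responses):
    """Label each response with a 1-based competitor number."""
    sections = []
    for index, answer in enumerate(responses, start=1):
        sections.append(f"# Response from competitor {index}\n\n{answer}")
    return "\n\n".join(sections)


sample = ["First answer.", "Second answer."]
print(format_competitors(sample))
```

The result could then replace `{responses}` in the prompt, for example as `{format_competitors(responses)}`.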

In [15]:
model ="gpt-4o-mini"
response = openai_client.chat.completions.create(model=model,messages=judge_command)
judgement = response.choices[0].message.content
display(Markdown(judgement))

{"results":["1","4","3","2"]}

In [36]:
# parse the judge's JSON and map each competitor number back to its model name
results = json.loads(judgement)
result_list = results["results"]
for rank, competitor in enumerate(result_list, start=1):
    print(f"-> rank {rank} : {model_name[int(competitor) - 1]}")

-> rank 1 : gpt-4o-mini
-> rank 2 : llama-3.3-70b-versatile
-> rank 3 : gpt-5-nano
-> rank 4 : qwen/qwen3-32b
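
The parsing step above trusts the judge to return clean JSON containing every competitor number exactly once. A model can ignore the formatting instruction, wrap the JSON in a code fence, or repeat a number, so a defensive parser can be worthwhile. This is a sketch under those assumptions; `parse_ranking` is a hypothetical helper, not part of the `json` module or any client library.

```python
import json

FENCE = "`" * 3  # markdown code-fence marker, built here to keep this block fence-safe


def parse_ranking(judgement, n_competitors):
    """Parse the judge's JSON and return 0-based competitor indices, best first."""
    text = judgement.strip()
    if text.startswith(FENCE):
        # Drop the opening fence line and, if present, the closing fence line.
        lines = text.splitlines()[1:]
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]
        text = "\n".join(lines)
    ranks = [int(r) for r in json.loads(text)["results"]]
    # Every competitor number from 1..n must appear exactly once.
    if sorted(ranks) != list(range(1, n_competitors + 1)):
        raise ValueError(f"ranking is not a permutation of 1..{n_competitors}: {ranks}")
    return [r - 1 for r in ranks]


print(parse_ranking('{"results":["1","4","3","2"]}', 4))
```

The returned indices can be used directly on `model_name`, for example `[model_name[i] for i in parse_ranking(judgement, len(model_name))]`.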


<table style="margin: 0; text-align: left; width:100%">
    <tr>
        <td style="width: 600px; height: 406px; vertical-align: middle;">
            <img src="assets/orchestrator_workflow.png" style="display: block;" />
        </td>
        <td>
            <h2 style="color:#ff7800;">Orchestrator Worker Workflow</h2>
            <div style="color:#ff7800;">
                The orchestrator worker workflow is a dynamic task orchestration pattern used in AI applications. It consists of three key components:
                <br /><br />
                <strong style="color:white;">Orchestrator:</strong> A high-level coordinator that analyzes complex tasks and breaks them down into subtasks, assigning them to specialized workers.
                <br /><br />
                <strong style="color:white;">Workers:</strong> Specialized agents that execute individual tasks as assigned by the orchestrator, focusing on specific aspects of the overall task.
                <br /><br />
                <strong style="color:white;">Synthesizer:</strong> The final step where the outputs from all workers are collected and combined into a cohesive response for the user.
            </div>
        </td>
    </tr>
</table>
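
The three roles described above can be sketched with plain Python callables. This is only an illustration of the control flow, assuming that the orchestrator returns a list of `(worker_kind, payload)` subtasks; `orchestrate`, `plan`, `workers`, and `synthesize` are hypothetical names, and in a real application each would typically wrap an LLM call rather than a string operation.

```python
def orchestrate(task, plan, workers, synthesize):
    """Run the orchestrator, dispatch subtasks to workers, then synthesize."""
    subtasks = plan(task)  # orchestrator: break the task into subtasks
    outputs = [workers[kind](payload) for kind, payload in subtasks]  # workers
    return synthesize(outputs)  # synthesizer: combine the worker outputs


def plan(task):
    """Toy orchestrator: send the same text to two different workers."""
    return [("upper", task), ("count", task)]


# Toy workers keyed by the kind of subtask they handle.
workers = {
    "upper": str.upper,
    "count": lambda text: f"{len(text.split())} words",
}


def synthesize(outputs):
    """Toy synthesizer: join the worker outputs into one string."""
    return " | ".join(outputs)


print(orchestrate("compare three models", plan, workers, synthesize))
```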
