## Loops (for, while)

### **1. Loops (for, while)**

Loops allow you to **repeat a block of code** multiple times.

- **`for` loop** → Iterates over any iterable (list, tuple, dictionary, set, string, `range`).
- **`while` loop** → Runs as long as a condition is true.

In [4]:
# for-loop 
for i in range(3):
    print(i)

for num in range(5):
    print(f"Num-{num+1}")

0
1
2
Num-1
Num-2
Num-3
Num-4
Num-5


In [6]:
# while loop
count = 0
while count < 3:
    print(f"Run {count+1}")
    count += 1

Run 1
Run 2
Run 3
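
The `for` examples above only use `range`. As a minimal sketch of the other iterables named earlier, the block below walks a list, a dictionary, and a string. The prompts and scores are made-up illustration data, not results from a real evaluation.

```python
# Illustration data (hypothetical prompts and scores)
prompts = ["What is AI?", "Define ML"]
scores = {"What is AI?": 0.05, "Define ML": 0.1}

# List: enumerate() yields (index, item) pairs
numbered = []
for index, prompt in enumerate(prompts, start=1):
    numbered.append(f"{index}. {prompt}")

# Dictionary: .items() yields (key, value) pairs
for prompt, score in scores.items():
    print(f"{prompt} -> {score}")

# String: the loop yields one character at a time
letters = []
for char in "AI":
    letters.append(char)

print(numbered)
print(letters)
```

Looping over a dictionary directly (without `.items()`) yields only its keys.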


### **2. Explanation Specific to AI Evaluation/Testing**

In AI evaluation:

- **for loops** → iterate over prompts, outputs, or test cases to run evaluations.
- **while loops** → useful for retrying model calls until a valid output is received or a retry limit is reached.

Example:

In [1]:
# Loop through multiple test cases
prompts = [
    "What is AI?",
    "Define ML",
    "What is the capital of Australia?"
]

for prompt in prompts:
    print(f"Evaluating: {prompt}")

Evaluating: What is AI?
Evaluating: Define ML
Evaluating: What is the capital of Australia?
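
The `while` bullet above can be sketched in the same spirit. This is a mocked example under stated assumptions: `call_model` and `is_valid` are hypothetical helpers, the canned responses stand in for a real model, and "valid" is assumed to mean "parses as a JSON object".

```python
import json

# Hypothetical canned responses standing in for a real model call
_mock_responses = iter(["not json", "{broken", '{"answer": "Canberra"}'])

def call_model(prompt):
    """Mock model call: returns the next canned response."""
    return next(_mock_responses)

def is_valid(output):
    """Assumed validity rule: the output parses as a JSON object."""
    try:
        return isinstance(json.loads(output), dict)
    except json.JSONDecodeError:
        return False

max_attempts = 5
attempts = 0
output = None

# Retry until an output is valid or the attempt limit is reached
while attempts < max_attempts:
    attempts += 1
    candidate = call_model("What is the capital of Australia?")
    if is_valid(candidate):
        output = candidate
        break
    print(f"Attempt {attempts}: invalid output, retrying")

print(f"Attempts: {attempts} | Output: {output}")
```

The attempt limit keeps the loop from running forever if the model never returns a valid output; in that case `output` stays `None`.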


### **3. AI Evaluation/Testing Exercise**

**Goal:**

Iterate through test cases, run a mock hallucination score check, and print pass/fail.

**Code:**

In [2]:
# Mock Test Data
eval_results = [
    {"prompt": "What is AI?", "hallucination_score": 0.05},
    {"prompt": "Define machine learning", "hallucination_score": 0.1},
    {"prompt": "Capital of Japan", "hallucination_score": 0.25},
]
# Threshold
hallucination_threshold = 0.2

# Using a for loop
for case in eval_results:
    passes_test = case["hallucination_score"] <= hallucination_threshold
    print(f"Prompt: {case['prompt']} | Pass: {passes_test}")



Prompt: What is AI? | Pass: True
Prompt: Define machine learning | Pass: True
Prompt: Capital of Japan | Pass: False


In [4]:
# Using a while loop for retrying (mocked example)
import random
prompt = "What is AI?"
score = 1.0
attempts = 0

# Threshold
hallucination_threshold = 0.2

while score > hallucination_threshold and attempts < 3:
    score = round(random.uniform(0, 1), 2)  # mock score generation
    attempts += 1
    print(f"Attempt {attempts} | Score: {score}")

Attempt 1 | Score: 0.41
Attempt 2 | Score: 0.2
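
The retry loop above can stop for two different reasons: the score dropped to the threshold, or the attempts ran out. A minimal sketch of telling the two apart follows. It swaps the random scores for a fixed, made-up list so the run is repeatable; `max_attempts`, `mock_scores`, `passed`, and `status` are names introduced here for illustration.

```python
hallucination_threshold = 0.2
max_attempts = 3

# Fixed mock scores (hypothetical) so the result is repeatable
mock_scores = iter([0.41, 0.35, 0.27])

score = 1.0
attempts = 0
while score > hallucination_threshold and attempts < max_attempts:
    score = next(mock_scores)
    attempts += 1

# The loop condition alone does not say why it ended, so check the score
passed = score <= hallucination_threshold
status = "passed" if passed else "gave up"
print(f"{status} after {attempts} attempts | last score: {score}")
```

With these scores none of the three attempts reaches the threshold, so the loop ends on the attempt limit and the case should be reported as a failure rather than silently accepted.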


### **4. DeepEval/RAGAs Context**

When integrating with **DeepEval** or **RAGAs**, loops are used to:

- Feed multiple prompt-output pairs into metrics.
- Aggregate results into a single report.
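
The aggregation step can be sketched without either library. The block below is only an illustration: the `results` list, its scores, and the `report` layout are hypothetical, and it assumes a higher score is better (as with a faithfulness-style metric).

```python
# Hypothetical per-case scores, e.g. collected from a metric inside a loop
results = [
    {"prompt": "What is AI?", "score": 0.9},
    {"prompt": "Define ML", "score": 0.85},
    {"prompt": "Capital of Japan?", "score": 0.6},
]
threshold = 0.8  # assumed pass mark: score >= threshold passes

passed = 0
total_score = 0.0
failures = []
for result in results:
    total_score += result["score"]
    if result["score"] >= threshold:
        passed += 1
    else:
        failures.append(result["prompt"])

# Collapse the per-case numbers into one summary dictionary
report = {
    "total": len(results),
    "passed": passed,
    "pass_rate": passed / len(results),
    "mean_score": round(total_score / len(results), 3),
    "failed_prompts": failures,
}
print(report)
```

Keeping the failed prompts in the report makes it easier to inspect the weak cases afterwards than a bare pass rate would.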

Example with DeepEval:

In [None]:
#!pip install deepeval

In [None]:
import os
# Enter your OpenAI API key to run the cells below
# os.environ["OPENAI_API_KEY"] = "sk-proj-"

In [3]:
from deepeval.metrics import FaithfulnessMetric
from deepeval.test_case import LLMTestCase

faithfulness_metric = FaithfulnessMetric(threshold=0.8)

test_cases = [
    LLMTestCase(
        input="What is AI?",
        actual_output="AI stands for Artificial Intelligence",
        retrieval_context=["AI stands for Artificial Intelligence"],
    ),
    LLMTestCase(
        input="Capital of Japan?",
        actual_output="Tokyo",
        retrieval_context=["Tokyo is the capital of Japan"],
    ),
]

for case in test_cases:
    faithfulness_metric.measure(case)
    print(f"Score: {faithfulness_metric.score} | Pass: {faithfulness_metric.is_successful()}")


Score: 1.0 | Pass: True


Score: 1.0 | Pass: True
