# Scoring Simulator

This notebook allows you to simulate scoring for a conversation by providing:
- `conversation_id`: The UUID of the conversation to score
- `user_id`: The UUID of the user (for verification)

It will compute and display:
1. **Conversation Scores**: Fillerwords, Clarity, Participation, Key Themes, Index of Questions, Rhythm, and Objective
2. **Profile Scores**: Prospection, Empathy, Technical Domain, Negotiation, and Resilience

## Imports

In [10]:
# Import necessary libraries and services
import asyncio
import sys
import os
import time
import numpy as np
import pandas as pd
from pathlib import Path
from uuid import UUID
from dotenv import load_dotenv
from IPython.display import display, HTML
from collections import defaultdict

# Add parent directory to path to import app modules
project_root = Path().resolve().parent
sys.path.insert(0, str(project_root))

# Load environment variables
load_dotenv()

# Import services
from app.services.messages_service import get_conversation_transcript
from app.services.conversations_service import get_conversation_details
from app.services.db import execute_query_one
from scoring_scripts.get_conver_scores import get_conver_scores
from scoring_scripts.get_conver_skills import get_conver_skills

print("✅ Imports successful!")

✅ Imports successful!


## Auxiliary functions

In [31]:
async def get_conversation_with_stage(conversation_id: UUID):
    """Get conversation details including course_id and stage_id with all feedback"""
    query = """
    SELECT 
        c.conversation_id, 
        c.user_id,
        c.course_id, 
        c.stage_id,
        c.start_timestamp, 
        c.end_timestamp, 
        c.status,
        sbc.general_score,
        sbc.fillerwords_scoring,
        sbc.clarity_scoring,
        sbc.participation_scoring,
        sbc.keythemes_scoring,
        sbc.indexofquestions_scoring,
        sbc.rhythm_scoring,
        sbc.is_accomplished,
        sbc.fillerwords_feedback,
        sbc.clarity_feedback,
        sbc.participation_feedback,
        sbc.keythemes_feedback,
        sbc.indexofquestions_feedback,
        sbc.rhythm_feedback, 
        cs.stage_objectives, 
        cs.key_themes
    FROM conversaapp.conversations c
    LEFT JOIN conversaapp.scoring_by_conversation sbc ON c.conversation_id = sbc.conversation_id
    LEFT JOIN conversaconfig.course_stages cs ON c.course_id = cs.course_id AND c.stage_id = cs.stage_id
    WHERE c.conversation_id = $1
    """
    
    result = await execute_query_one(query, conversation_id)
    return dict(result) if result else None


In [37]:
async def fetch_conversation_details(conversation_id: UUID, user_id: str = None):
    """Fetch conversation details and return them"""
    conversation_details = await get_conversation_with_stage(conversation_id)

    if not conversation_details:
        print(f"❌ Conversation {conversation_id} not found!")
        raise ValueError("Conversation not found")

    # Extract required IDs
    course_id = conversation_details.get("course_id")
    stage_id = conversation_details.get("stage_id")
    conv_user_id = conversation_details.get("user_id")
    stage_objectives = conversation_details.get("stage_objectives")
    key_themes = conversation_details.get("key_themes")

    print("📋 Conversation Details:")
    print(f"   User ID: {conv_user_id}")
    print(f"   Course ID: {course_id}")
    print(f"   Stage ID: {stage_id}")
    print(f"   Status: {conversation_details.get('status')}")
    print(f"   Start: {conversation_details.get('start_timestamp')}")
    print(f"   End: {conversation_details.get('end_timestamp')}")
    print("-" * 100)
    print(f"   Key themes: {key_themes}")
    print("-" * 100)
    print(f"   Stage Objectives: {stage_objectives}")
    print("-" * 100)

    # Verify user_id if provided
    if user_id and str(conv_user_id) != user_id:
        print(f"⚠️  Warning: Provided user_id ({user_id}) doesn't match conversation user_id ({conv_user_id})")

    if not course_id or not stage_id:
        print("❌ Missing course_id or stage_id in conversation!")
        raise ValueError("Conversation missing required course_id or stage_id")
    
    return {
        'conversation_details': conversation_details,
        'course_id': course_id,
        'stage_id': stage_id,
        'user_id': conv_user_id,
        'stage_objectives': stage_objectives
    }

In [13]:
def display_transcript(transcript):
    if not transcript:
        print("❌ No transcript found for this conversation!")
    else:
        # Prepare data for table
        transcript_data = []
        for idx, turn in enumerate(transcript, 1):
            speaker = turn.get('speaker', 'Unknown')
            text = turn.get('text', '')
            duration = turn.get('duracion', 'N/A')
            
            # Format speaker name (any speaker other than "vendedor" is shown as the client)
            speaker_display = "👤 Vendedor" if speaker == "vendedor" else "🤖 Cliente"
            
            # Count words
            word_count = len(text.split()) if text else 0
            
            transcript_data.append({
                '#': idx,
                'Speaker': speaker_display,
                'Text': text,
                'Words': word_count,
                'Duration (s)': duration if isinstance(duration, (int, float)) else 'N/A'
            })
        
        # Create DataFrame
        df_transcript = pd.DataFrame(transcript_data)
        
        # Calculate summary statistics
        total_turns = len(transcript_data)
        vendedor_turns = sum(1 for t in transcript_data if 'Vendedor' in t['Speaker'])
        cliente_turns = sum(1 for t in transcript_data if 'Cliente' in t['Speaker'])
        total_words = sum(t['Words'] for t in transcript_data)
        vendedor_words = sum(t['Words'] for t in transcript_data if 'Vendedor' in t['Speaker'])
        cliente_words = sum(t['Words'] for t in transcript_data if 'Cliente' in t['Speaker'])
        
        # Display summary
        print("=" * 100)
        print("📝 CONVERSATION TRANSCRIPT")
        print("=" * 100)
        print("\n📊 Summary:")
        print(f"   Total Turns: {total_turns}")
        print(f"   Vendedor Turns: {vendedor_turns} ({vendedor_turns/total_turns*100:.1f}%)")
        print(f"   Cliente Turns: {cliente_turns} ({cliente_turns/total_turns*100:.1f}%)")
        print(f"   Total Words: {total_words}")
        # Avoid a ZeroDivisionError when every turn has empty text
        word_base = total_words or 1
        print(f"   Vendedor Words: {vendedor_words} ({vendedor_words/word_base*100:.1f}%)")
        print(f"   Cliente Words: {cliente_words} ({cliente_words/word_base*100:.1f}%)")
        print()
        
        # Display transcript table with HTML for better formatting
        display(HTML(df_transcript.to_html(
            index=False, 
            escape=False, 
            classes='table table-striped table-hover',
            table_id='transcript_table'
        )))

In [14]:
async def show_conver_scores(transcript, course_id, stage_id):
    """Compute and display conversation scores"""
    print("🔄 Computing conversation scores...")
    print("=" * 60)

    # Compute scores directly using the transcript passed as parameter
    scoring_results = await get_conver_scores(transcript, course_id, stage_id)

    # Extract scores and feedback from the results
    conversation_scores_direct = {
        'general_score': scoring_results.get('puntuacion_global', 'N/A'),
        'fillerwords_scoring': scoring_results.get('detalle', {}).get('muletillas_pausas', 'N/A'),
        'clarity_scoring': scoring_results.get('detalle', {}).get('claridad', 'N/A'),
        'participation_scoring': scoring_results.get('detalle', {}).get('participacion', 'N/A'),
        'keythemes_scoring': scoring_results.get('detalle', {}).get('cobertura', 'N/A'),
        'indexofquestions_scoring': scoring_results.get('detalle', {}).get('preguntas', 'N/A'),
        'rhythm_scoring': scoring_results.get('detalle', {}).get('ppm', 'N/A'),
        'is_accomplished': scoring_results.get('objetivo', {}),
        'fillerwords_feedback': scoring_results.get('feedback', {}).get('muletillas_pausas', 'N/A'),
        'clarity_feedback': scoring_results.get('feedback', {}).get('claridad', 'N/A'),
        'participation_feedback': scoring_results.get('feedback', {}).get('participacion', 'N/A'),
        'keythemes_feedback': scoring_results.get('feedback', {}).get('cobertura', 'N/A'),
        'indexofquestions_feedback': scoring_results.get('feedback', {}).get('preguntas', 'N/A'),
        'rhythm_feedback': scoring_results.get('feedback', {}).get('ppm', 'N/A'),
        'objetivo': scoring_results.get('objetivo', {}),
    }

    # Store in global variable for use in other cells
    global updated_scores
    updated_scores = conversation_scores_direct

    print("\n" + "=" * 60)
    print("📊 CONVERSATION SCORES SUMMARY (COMPUTED)")
    print("=" * 60)
    print(f"General Score: {updated_scores.get('general_score', 'N/A')}")
    print(f"Fillerwords: {updated_scores.get('fillerwords_scoring', 'N/A')}")
    print(f"Clarity: {updated_scores.get('clarity_scoring', 'N/A')}")
    print(f"Participation: {updated_scores.get('participation_scoring', 'N/A')}")
    print(f"Key Themes: {updated_scores.get('keythemes_scoring', 'N/A')}")
    print(f"Index of Questions: {updated_scores.get('indexofquestions_scoring', 'N/A')}")
    print(f"Rhythm: {updated_scores.get('rhythm_scoring', 'N/A')}")
    print(f"Objective Accomplished: {updated_scores.get('is_accomplished', 'N/A')}")

    print("\n📝 FEEDBACK:")
    # str() keeps the slice safe if a feedback value is None or not a string
    print(f"Fillerwords: {str(updated_scores.get('fillerwords_feedback', 'N/A'))[:150]}...")
    print(f"Clarity: {str(updated_scores.get('clarity_feedback', 'N/A'))[:150]}...")
    print(f"Participation: {str(updated_scores.get('participation_feedback', 'N/A'))[:150]}...")
    print(f"Key Themes: {str(updated_scores.get('keythemes_feedback', 'N/A'))[:150]}...")
    print(f"Index of Questions: {str(updated_scores.get('indexofquestions_feedback', 'N/A'))[:150]}...")
    print(f"Rhythm: {str(updated_scores.get('rhythm_feedback', 'N/A'))[:150]}...")
    
    return updated_scores

async def show_conver_profiles(transcript):
    """Compute and display profile scores"""
    # Declare global variable first
    global profiling_results
    
    print("🔄 Computing profile scores...")
    print("=" * 60)

    # Compute profiles directly using the transcript passed as parameter
    profiling_results_computed = await get_conver_skills(transcript)

    # Extract scores and feedback from the computed results
    profiling_results = {
        'prospection_scoring': profiling_results_computed.get('prospection', {}).get('score', 'N/A'),
        'empathy_scoring': profiling_results_computed.get('empathy', {}).get('score', 'N/A'),
        'technical_domain_scoring': profiling_results_computed.get('technical_domain', {}).get('score', 'N/A'),
        'negotiation_scoring': profiling_results_computed.get('negociation', {}).get('score', 'N/A'),
        'resilience_scoring': profiling_results_computed.get('resilience', {}).get('score', 'N/A'),
        'prospection_feedback': profiling_results_computed.get('prospection', {}).get('justification', 'N/A'),
        'empathy_feedback': profiling_results_computed.get('empathy', {}).get('justification', 'N/A'),
        'technical_domain_feedback': profiling_results_computed.get('technical_domain', {}).get('justification', 'N/A'),
        'negotiation_feedback': profiling_results_computed.get('negociation', {}).get('justification', 'N/A'),
        'resilience_feedback': profiling_results_computed.get('resilience', {}).get('justification', 'N/A'),
    }

    print("\n" + "=" * 60)
    print("👥 PROFILE SCORES SUMMARY (COMPUTED)")
    print("=" * 60)
    print(f"Prospection: {profiling_results.get('prospection_scoring', 'N/A')}")
    print(f"Empathy: {profiling_results.get('empathy_scoring', 'N/A')}")
    print(f"Technical Domain: {profiling_results.get('technical_domain_scoring', 'N/A')}")
    print(f"Negotiation: {profiling_results.get('negotiation_scoring', 'N/A')}")
    print(f"Resilience: {profiling_results.get('resilience_scoring', 'N/A')}")

    print("\n📝 FEEDBACK:")
    # str() keeps the slice safe if a feedback value is None or not a string
    print(f"Prospection: {str(profiling_results.get('prospection_feedback', 'N/A'))[:150]}...")
    print(f"Empathy: {str(profiling_results.get('empathy_feedback', 'N/A'))[:150]}...")
    print(f"Technical Domain: {str(profiling_results.get('technical_domain_feedback', 'N/A'))[:150]}...")
    print(f"Negotiation: {str(profiling_results.get('negotiation_feedback', 'N/A'))[:150]}...")
    print(f"Resilience: {str(profiling_results.get('resilience_feedback', 'N/A'))[:150]}...")
    
    return profiling_results
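
The nested `.get(...).get(...)` lookups above repeat the same pattern for every skill. A minimal sketch of how they could be driven by a single mapping; `SKILL_KEYS` and `flatten_skills` are hypothetical names introduced here for illustration, and the result shape (`score` and `justification` per skill) is assumed from the lookups in the cell above:

```python
# Maps the key used in the computed results to the prefix used in the notebook.
# "negociation" is the source key spelling used by the lookups above.
SKILL_KEYS = {
    "prospection": "prospection",
    "empathy": "empathy",
    "technical_domain": "technical_domain",
    "negociation": "negotiation",
    "resilience": "resilience",
}


def flatten_skills(results, default="N/A"):
    """Flatten {skill: {"score", "justification"}} into *_scoring / *_feedback keys."""
    flat = {}
    for source_key, name in SKILL_KEYS.items():
        entry = results.get(source_key) or {}  # tolerate a missing or None entry
        flat[f"{name}_scoring"] = entry.get("score", default)
        flat[f"{name}_feedback"] = entry.get("justification", default)
    return flat


# Hypothetical input, shaped like the results read in the cell above
example = {"negociation": {"score": 2, "justification": "Short answer."}}
flat = flatten_skills(example)
print(flat["negotiation_scoring"], flat["empathy_scoring"])
```

Keeping the key mapping in one place also makes the `negociation` / `negotiation` rename explicit instead of scattering it across several lines.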

In [15]:
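def display_results():
    """Display conversation and profile scores, with feedback, as tables.

    Reads the notebook-level `updated_scores` and `profiling_results`, so the
    cells that call `show_conver_scores` and `show_conver_profiles` must run first.
    """
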
    # Prepare comprehensive data with scores and feedback
    conversation_data = []
    conversation_metrics = [
        ("General Score", "general_score", None),
        ("Fillerwords", "fillerwords_scoring", "fillerwords_feedback"),
        ("Clarity", "clarity_scoring", "clarity_feedback"),
        ("Participation", "participation_scoring", "participation_feedback"),
        ("Key Themes", "keythemes_scoring", "keythemes_feedback"),
        ("Index of Questions", "indexofquestions_scoring", "indexofquestions_feedback"),
        ("Rhythm", "rhythm_scoring", "rhythm_feedback"),
        ("Objective Accomplished", "is_accomplished", None)
    ]

    for metric_name, score_key, feedback_key in conversation_metrics:
        score_value = updated_scores.get(score_key, 'N/A')
        if score_key == "is_accomplished":
            score_value = "Yes" if score_value else "No"
        feedback_value = updated_scores.get(feedback_key, 'N/A') if feedback_key else 'N/A'
        conversation_data.append({
            "Category": "Conversation",
            "Metric": metric_name,
            "Score": score_value,
            "Feedback": feedback_value if feedback_value != 'N/A' else '-'
        })

    # Prepare profile data with scores and feedback
    profile_data = []
    profile_skills = [
        ("Prospection", "prospection_scoring", "prospection_feedback"),
        ("Empathy", "empathy_scoring", "empathy_feedback"),
        ("Technical Domain", "technical_domain_scoring", "technical_domain_feedback"),
        ("Negotiation", "negotiation_scoring", "negotiation_feedback"),
        ("Resilience", "resilience_scoring", "resilience_feedback")
    ]

    if profiling_results:
        for skill_name, score_key, feedback_key in profile_skills:
            score_value = profiling_results.get(score_key, 'N/A')
            feedback_value = profiling_results.get(feedback_key, 'N/A')
            profile_data.append({
                "Category": "Profile",
                "Metric": skill_name,
                "Score": score_value,
                "Feedback": feedback_value if feedback_value != 'N/A' else '-'
            })

    # Combine all data
    all_data = conversation_data + profile_data

    # Create comprehensive DataFrame
    df_all = pd.DataFrame(all_data)

    # Configure pandas display options for better formatting
    pd.set_option('display.max_colwidth', None)
    pd.set_option('display.width', None)
    pd.set_option('display.max_columns', None)

    # Display the comprehensive table
    print("=" * 100)
    print("📊 COMPLETE SCORING RESULTS WITH FEEDBACK")
    print("=" * 100)
    print()

    # Display with HTML for better formatting in Jupyter
    display(HTML(df_all.to_html(index=False, escape=False, classes='table table-striped table-hover')))


    # Create separate tables for better readability
    print("\n" + "=" * 100)
    print("🎯 CONVERSATION SCORES:")
    print("=" * 100)
    df_conv = pd.DataFrame(conversation_data)
    display(HTML(df_conv.to_html(index=False, escape=False)))

    print("\n" + "=" * 100)
    print("👥 PROFILE SCORES:")
    print("=" * 100)
    if profile_data:
        df_prof = pd.DataFrame(profile_data)
        display(HTML(df_prof.to_html(index=False, escape=False)))
    else:
        print("No profile scores available")

    print("\n" + "=" * 100)

In [16]:
async def stress_test(iterations=10, transcript=None):
    """Score the same transcript `iterations` times and collect every run.

    Uses the notebook-level COURSE_ID and STAGE_ID set in "Get Conversation Details".
    """
    print(f"🧪 STRESS TEST: Computing scores {iterations} times...")
    print("=" * 100)
    print("This will help us understand the variance in LLM-based scoring.\n")

    # Store all runs
    all_conversation_runs = []
    all_profile_runs = []

    # Run the requested number of iterations
    num_runs = iterations
    for run_num in range(1, num_runs + 1):
        print(f"🔄 Run {run_num}/{num_runs}...")
        start_time = time.time()
        
        # Compute conversation scores
        scoring_results = await get_conver_scores(transcript, COURSE_ID, STAGE_ID)
        
        # Extract conversation scores
        conv_run = {
            'run': run_num,
            'general_score': scoring_results.get('puntuacion_global', 'N/A'),
            'fillerwords_scoring': scoring_results.get('detalle', {}).get('muletillas_pausas', 'N/A'),
            'clarity_scoring': scoring_results.get('detalle', {}).get('claridad', 'N/A'),
            'participation_scoring': scoring_results.get('detalle', {}).get('participacion', 'N/A'),
            'keythemes_scoring': scoring_results.get('detalle', {}).get('cobertura', 'N/A'),
            'indexofquestions_scoring': scoring_results.get('detalle', {}).get('preguntas', 'N/A'),
            'rhythm_scoring': scoring_results.get('detalle', {}).get('ppm', 'N/A'),
            'is_accomplished': scoring_results.get('objetivo', {}),
        }
        all_conversation_runs.append(conv_run)
        
        # Compute profile scores
        profiling_results_computed = await get_conver_skills(transcript)
        
        # Extract profile scores
        prof_run = {
            'run': run_num,
            'prospection_scoring': profiling_results_computed.get('prospection', {}).get('score', 'N/A'),
            'empathy_scoring': profiling_results_computed.get('empathy', {}).get('score', 'N/A'),
            'technical_domain_scoring': profiling_results_computed.get('technical_domain', {}).get('score', 'N/A'),
            'negotiation_scoring': profiling_results_computed.get('negociation', {}).get('score', 'N/A'),
            'resilience_scoring': profiling_results_computed.get('resilience', {}).get('score', 'N/A'),
        }
        all_profile_runs.append(prof_run)
        
        elapsed = time.time() - start_time
        print(f"   ✅ Completed in {elapsed:.2f}s\n")

    print("=" * 100)
    print(f"✅ All {num_runs} runs completed!")
    print("=" * 100)

    return all_conversation_runs, all_profile_runs

In [17]:
def display_stress_test(all_conversation_runs, all_profile_runs):

    df_conv_runs = pd.DataFrame(all_conversation_runs)
    df_prof_runs = pd.DataFrame(all_profile_runs)

    # Calculate statistics for each metric
    def calculate_stats(values):
        """Calculate statistics for a list of numeric values"""
        numeric_values = [v for v in values if isinstance(v, (int, float)) and not isinstance(v, bool)]
        if not numeric_values:
            return {'mean': 'N/A', 'std': 'N/A', 'min': 'N/A', 'max': 'N/A', 'range': 'N/A'}
        
        mean_val = np.mean(numeric_values)
        std_val = np.std(numeric_values)
        min_val = np.min(numeric_values)
        max_val = np.max(numeric_values)
        range_val = max_val - min_val
        
        return {
            'mean': round(mean_val, 2),
            'std': round(std_val, 2),
            'min': min_val,
            'max': max_val,
            'range': round(range_val, 2)
        }

    # Conversation scores statistics
    conv_stats = {}
    conv_metrics = [
        col for col in df_conv_runs.columns 
        if any(key in col for key in [
            'general_score', 'fillerwords_scoring', 'clarity_scoring', 
            'participation_scoring', 'keythemes_scoring', 'indexofquestions_scoring', 
            'rhythm_scoring'
        ])
    ]
    for metric in conv_metrics:
        # skip if not present
        if metric not in df_conv_runs:
            continue
        values = df_conv_runs[metric].tolist()
        conv_stats[metric] = calculate_stats(values)

    # Profile scores statistics
    prof_stats = {}
    prof_metrics = [
        col for col in df_prof_runs.columns
        if any(key in col for key in [
            'prospection_scoring', 'empathy_scoring', 'technical_domain_scoring',
            'negotiation_scoring', 'resilience_scoring'
        ])
    ]
    for metric in prof_metrics:
        if metric not in df_prof_runs:
            continue
        values = df_prof_runs[metric].tolist()
        prof_stats[metric] = calculate_stats(values)

    # Display all runs, one row per run
    print("\n" + "=" * 100)
    print("📋 ALL RUNS - CONVERSATION SCORES")
    print("=" * 100)
    display(HTML(df_conv_runs.to_html(index=False, escape=False, classes='table table-striped table-hover')))

    print("\n" + "=" * 100)
    print("📋 ALL RUNS - PROFILE SCORES")
    print("=" * 100)
    display(HTML(df_prof_runs.to_html(index=False, escape=False, classes='table table-striped table-hover')))

## Input Parameters

Enter the `conversation_id` and `user_id` below:

In [38]:
# Input parameters
CONVERSATION_ID = "744cb3bd-58f7-4731-bd59-f884f6e1c31f"  # Replace with your conversation_id
USER_ID = "your-user-id-here"  # Replace with your user_id

# Validate UUIDs
try:
    conv_uuid = UUID(CONVERSATION_ID)
    user_uuid = UUID(USER_ID) if USER_ID != "your-user-id-here" else None
    print(f"✅ Conversation ID: {CONVERSATION_ID}")
    if user_uuid:
        print(f"✅ User ID: {USER_ID}")
    else:
        print("⚠️  User ID not provided (will be extracted from conversation)")
except ValueError as e:
    print(f"❌ Invalid UUID format: {e}")
    raise

✅ Conversation ID: 744cb3bd-58f7-4731-bd59-f884f6e1c31f
⚠️  User ID not provided (will be extracted from conversation)
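
The placeholder check appears twice in this notebook (here and in the call to `fetch_conversation_details`). A minimal sketch of a helper that could centralize it; `parse_optional_uuid` and `PLACEHOLDER_USER_ID` are hypothetical names, not part of the app modules:

```python
from typing import Optional
from uuid import UUID

# Same placeholder string as in the input cell above
PLACEHOLDER_USER_ID = "your-user-id-here"


def parse_optional_uuid(value: str, placeholder: str = PLACEHOLDER_USER_ID) -> Optional[UUID]:
    """Return a UUID, or None when the value is empty or still the placeholder."""
    if not value or value == placeholder:
        return None
    return UUID(value)  # raises ValueError on a malformed string


print(parse_optional_uuid("your-user-id-here"))
print(parse_optional_uuid("744cb3bd-58f7-4731-bd59-f884f6e1c31f"))
```

With a helper like this, a missing user ID and a malformed one stay distinguishable: the first yields `None`, the second still raises `ValueError`.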


## Get Conversation Details

First, we'll fetch the conversation details to get `course_id` and `stage_id`:

In [39]:
# Fetch conversation details using await (Jupyter supports top-level await)
conv_info = await fetch_conversation_details(conv_uuid, USER_ID if USER_ID != "your-user-id-here" else None)

# Extract the IDs for use in subsequent cells
COURSE_ID = conv_info['course_id']
STAGE_ID = conv_info['stage_id']
CONV_USER_ID = conv_info['user_id']

conversation_details = conv_info['conversation_details']

📋 Conversation Details:
   User ID: ae1ed3b5-0ea0-4ad5-9565-fec1e4ef1ee7
   Course ID: 3eeeda53-7dff-40bc-b036-b608acb89e6f
   Stage ID: afa3a709-7c7b-4a64-9642-3d6eec19cc53
   Status: FINISHED
   Start: 2026-02-10 17:28:23.314834
   End: 2026-02-10 17:33:08.437490
----------------------------------------------------------------------------------------------------
   Key themes: Calidad de Soldadura: Garantía de un acabado limpio, sin proyecciones y listo para pasar inspecciones exigentes (ultrasonidos) gracias a la tecnología de arco pulsado.

Precisión Técnica: Funcionamiento constante y sin interrupciones ni fallos en el hilo, asegurado por el potente sistema de tracción de 4 ruletas.

Cumplimiento Normativo: Seguridad legal y técnica mediante el servicio de calibración certificado que valida los equipos según la normativa vigente.
----------------------------------------------------------------------------------------------------
   Stage Objectives: Cerrar un presupuest

## Conversation Transcript

Displaying the full conversation transcript:

In [20]:
# Get the conversation transcript
transcript = await get_conversation_transcript(conv_uuid)
display_transcript(transcript)

📝 CONVERSATION TRANSCRIPT

📊 Summary:
   Total Turns: 23
   Vendedor Turns: 11 (47.8%)
   Cliente Turns: 12 (52.2%)
   Total Words: 528
   Vendedor Words: 160 (30.3%)
   Cliente Words: 368 (69.7%)

| # | Speaker | Text | Words | Duration (s) |
|---|---------|------|-------|--------------|
| 1 | 🤖 Cliente | ¡Hola!, ¿qué tal? | 3 | |
| 2 | 👤 Vendedor | [sonido de golpe] Hola, Roberto. Soy Daniel de Galagar. Venía a venderte la nueva gama de productos de soldadura que ofrecemos aquí en Galagar. [sonido de golpe] ¿Me oyes, Roberto? | 30 | 16.783 |
| 3 | 🤖 Cliente | Sí, te oigo perfectamente, Daniel. Gracias por venir. Como sabes, en SoldaQueen valoramos mucho la eficiencia y la fiabilidad de nuestros equipos. Me interesa saber más sobre vuestra nueva gama, pero tengo algunas preguntas técnicas importantes. | 36 | |
| 4 | 👤 Vendedor | ... | 1 | 3.167 |
| 5 | 🤖 Cliente | Daniel, ¿sigues ahí? | 3 | |
| 6 | 👤 Vendedor | Dime. | 1 | 1.787 |
| 7 | 🤖 Cliente | Necesito trabajar en turnos intensivos. ¿Realmente la máquina de 500A aguanta o se calienta? | 14 | |
| 8 | 👤 Vendedor | Aguanta, aguanta perfectamente Roberto. Tenemos unos técnicos que corroboran que tiene mucha resistencia. | 13 | 15.971 |
| 9 | 🤖 Cliente | Entiendo, pero necesito saber el Factor de Marcha de la máquina de 500A al ochenta por ciento para asegurarme de su rendimiento en condiciones intensivas. ¿Podrías proporcionarme ese dato? | 29 | |
| 10 | 👤 Vendedor | El factor de marcha, Roberto, es de 1,6. | 8 | 10.33 |
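
The word percentages in the summary are each speaker's share of all transcribed words; for the salesperson that is 160 / 528 ≈ 30.3%. A minimal sketch of the same calculation outside the display function, assuming turns are dicts with `speaker` and `text` keys as read by `display_transcript`. The `cliente` speaker label and the `word_share` helper are assumptions made for illustration:

```python
from collections import Counter


def word_share(transcript):
    """Return {speaker: (word_count, percent_of_all_words)} for a list of turns."""
    counts = Counter()
    for turn in transcript:
        text = turn.get("text") or ""
        counts[turn.get("speaker", "unknown")] += len(text.split())
    total = sum(counts.values())
    return {
        speaker: (n, round(n / total * 100, 2) if total else 0.0)
        for speaker, n in counts.items()
    }


# Two turns taken from the transcript above; "cliente" is an assumed label
sample = [
    {"speaker": "cliente", "text": "Daniel, ¿sigues ahí?"},
    {"speaker": "vendedor", "text": "Dime."},
]
shares = word_share(sample)
print(shares)
```

Words are counted by whitespace splitting, as in `display_transcript`, so punctuation attached to a word does not add to the count.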


## Compute Conversation Scores

Computing all conversation scores with `get_conver_scores` (called through `show_conver_scores`):

In [21]:
# Compute and display conversation scores
updated_scores = await show_conver_scores(transcript, COURSE_ID, STAGE_ID)

🔄 Computing conversation scores...

📊 CONVERSATION SCORES SUMMARY (COMPUTED)
General Score: 37.8
Fillerwords: 75
Clarity: 60
Participation: 90
Key Themes: 100
Index of Questions: 0
Rhythm: 20
Objective Accomplished: False

📝 FEEDBACK:
Fillerwords: El porcentaje de muletillas empleadas es 3.12%, siendo las muletillas mas repetidas: pues, este ...
Clarity: Deberías usar términos precisos y verificar los datos técnicos antes de responder. Evita respuestas vagas y asegúrate de entender exactamente lo que p...
Participation: Tu porcentaje de participación ha sido del 30.30%. Has mostrado escucha activa en 2 ocasiones....
Key Themes: Has cubierto todos los temas clave de manera adecuada y desarrollada, felicitaciones por la cobertura completa....
Index of Questions: Métrica desactivada temporalmente...
Rhythm: Velocidad de habla demasiado lenta (65.9 PPM). Intenta aumentar el ritmo para mantener la atención del cliente....
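
The rhythm feedback reports speaking speed in PPM (words per minute). How `get_conver_scores` derives that figure is not shown in this notebook, so the following is only a sketch under one plausible assumption: total words divided by total duration over the salesperson's turns that carry a numeric `duracion` (seconds). The `words_per_minute` helper is hypothetical:

```python
def words_per_minute(turns, speaker="vendedor"):
    """Words per minute over the speaker's turns that have a numeric duration."""
    words = 0
    seconds = 0.0
    for turn in turns:
        duration = turn.get("duracion")
        # Skip other speakers and turns without a usable duration
        if turn.get("speaker") != speaker or not isinstance(duration, (int, float)):
            continue
        words += len((turn.get("text") or "").split())
        seconds += duration
    return words / seconds * 60 if seconds else 0.0


# Two salesperson turns from the transcript above, plus a client turn without duration
sample_turns = [
    {"speaker": "vendedor", "text": "Dime.", "duracion": 1.787},
    {"speaker": "cliente", "text": "Daniel, ¿sigues ahí?"},
    {"speaker": "vendedor", "text": "El factor de marcha, Roberto, es de 1,6.", "duracion": 10.33},
]
wpm = words_per_minute(sample_turns)
print(f"{wpm:.1f} PPM")
```

The real metric may weight turns differently or exclude pauses, so this sketch is not expected to reproduce the 65.9 PPM reported above.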


## Compute Profile Scores

Computing all profile scores with `get_conver_skills` (called through `show_conver_profiles`):

In [22]:
profiling_results = await show_conver_profiles(transcript)

🔄 Computing profile scores...

👥 PROFILE SCORES SUMMARY (COMPUTED)
Prospection: 4
Empathy: 2
Technical Domain: 2
Negotiation: 2
Resilience: 3

📝 FEEDBACK:
Prospection: Mostraste investigación previa, abordando temas específicos como 'factor de marcha' y 'certificación ISO', además de hacer preguntas técnicas detallad...
Empathy: El vendedor valida algunas preocupaciones y mantiene contacto, pero no profundiza ni demuestra empatía genuina, especialmente al no reconocer la impor...
Technical Domain: Responde de forma superficial a preguntas técnicas específicas como el ciclo de trabajo y el sistema de alimentación, sin datos precisos ni explicacio...
Negotiation: El vendedor aborda algunas objeciones con respuestas técnicas, como al mencionar que la máquina 'aguanta perfectamente' y que el 'factor de marcha' es...
Resilience: El vendedor mantiene un tono positivo y enérgico, recuperándose tras objeciones con frases como 'Creo que no nos estamos entendiendo' y '¿Est

## Complete Results Summary

Displaying all scores in a formatted table:

In [23]:
display_results()

📊 COMPLETE SCORING RESULTS WITH FEEDBACK

| Category | Metric | Score | Feedback |
|----------|--------|-------|----------|
| Conversation | General Score | 37.8 | - |
| Conversation | Fillerwords | 75 | El porcentaje de muletillas empleadas es 3.12%, siendo las muletillas mas repetidas: pues, este |
| Conversation | Clarity | 60 | Deberías usar términos precisos y verificar los datos técnicos antes de responder. Evita respuestas vagas y asegúrate de entender exactamente lo que pregunta el cliente. Para mejorar, prepara información concreta y específica sobre factor de marcha y calibraciones, en lugar de generalidades. Por ejemplo, no digas 'Creo que es del 80%', sino confirma con datos verificables. Organiza tus ideas para no desviarte y responder de forma coherente a cada consulta técnica. |
| Conversation | Participation | 90 | Tu porcentaje de participación ha sido del 30.30%. Has mostrado escucha activa en 2 ocasiones. |
| Conversation | Key Themes | 100 | Has cubierto todos los temas clave de manera adecuada y desarrollada, felicitaciones por la cobertura completa. |
| Conversation | Index of Questions | 0 | Métrica desactivada temporalmente |
| Conversation | Rhythm | 20 | Velocidad de habla demasiado lenta (65.9 PPM). Intenta aumentar el ritmo para mantener la atención del cliente. |
| Conversation | Objective Accomplished | No | - |
| Profile | Prospection | 4 | Mostraste investigación previa, abordando temas específicos como 'factor de marcha' y 'certificación ISO', además de hacer preguntas técnicas detalladas. Sin embargo, no mencionaste datos concretos sobre la empresa o eventos recientes, limitando la personalización. |
| Profile | Empathy | 2 | El vendedor valida algunas preocupaciones y mantiene contacto, pero no profundiza ni demuestra empatía genuina, especialmente al no reconocer la importancia de las certificaciones ni responder con empatía a las dudas técnicas del cliente. |

🎯 CONVERSATION SCORES:

| Category | Metric | Score | Feedback |
|----------|--------|-------|----------|
| Conversation | General Score | 37.8 | - |
| Conversation | Fillerwords | 75 | El porcentaje de muletillas empleadas es 3.12%, siendo las muletillas mas repetidas: pues, este |
| Conversation | Clarity | 60 | Deberías usar términos precisos y verificar los datos técnicos antes de responder. Evita respuestas vagas y asegúrate de entender exactamente lo que pregunta el cliente. Para mejorar, prepara información concreta y específica sobre factor de marcha y calibraciones, en lugar de generalidades. Por ejemplo, no digas 'Creo que es del 80%', sino confirma con datos verificables. Organiza tus ideas para no desviarte y responder de forma coherente a cada consulta técnica. |
| Conversation | Participation | 90 | Tu porcentaje de participación ha sido del 30.30%. Has mostrado escucha activa en 2 ocasiones. |
| Conversation | Key Themes | 100 | Has cubierto todos los temas clave de manera adecuada y desarrollada, felicitaciones por la cobertura completa. |
| Conversation | Index of Questions | 0 | Métrica desactivada temporalmente |
| Conversation | Rhythm | 20 | Velocidad de habla demasiado lenta (65.9 PPM). Intenta aumentar el ritmo para mantener la atención del cliente. |
| Conversation | Objective Accomplished | No | - |

👥 PROFILE SCORES:

| Category | Metric | Score | Feedback |
|----------|--------|-------|----------|
| Profile | Prospection | 4 | Mostraste investigación previa, abordando temas específicos como 'factor de marcha' y 'certificación ISO', además de hacer preguntas técnicas detalladas. Sin embargo, no mencionaste datos concretos sobre la empresa o eventos recientes, limitando la personalización. |
| Profile | Empathy | 2 | El vendedor valida algunas preocupaciones y mantiene contacto, pero no profundiza ni demuestra empatía genuina, especialmente al no reconocer la importancia de las certificaciones ni responder con empatía a las dudas técnicas del cliente. |
| Profile | Technical Domain | 2 | Responde de forma superficial a preguntas técnicas específicas como el ciclo de trabajo y el sistema de alimentación, sin datos precisos ni explicaciones detalladas, demostrando conocimiento técnico limitado. |
| Profile | Negotiation | 2 | El vendedor aborda algunas objeciones con respuestas técnicas, como al mencionar que la máquina 'aguanta perfectamente' y que el 'factor de marcha' es de '1,6', pero no justifica el valor ni propone pasos claros tras las objeciones, además cede en aspectos técnicos sin profundizar ni ofrecer soluciones específicas o próximos pasos. |
| Profile | Resilience | 3 | El vendedor mantiene un tono positivo y enérgico, recuperándose tras objeciones con frases como 'Creo que no nos estamos entendiendo' y '¿Estás ahí, Roberto?', demostrando resiliencia. Aunque muestra cierta frustración al no entender algunos conceptos técnicos, continúa intentando responder y finalizar sin rendirse, pero no mantiene energía constante tras rechazos directos. |





## Stress Test: LLM Scoring Consistency

This cell runs the scoring and profiling functions repeatedly on the same transcript (**10 times** in the run below, set by `iterations`) to measure how consistent the LLM-based scores are. Because LLM output is non-deterministic, the spread across runs shows how much a score can vary for identical input.

In [24]:
all_conversation_runs, all_profile_runs = await stress_test(iterations=10, transcript=transcript)

🧪 STRESS TEST: Computing scores 5 times...
This will help us understand the variance in LLM-based scoring.

🔄 Run 1/10...
   ✅ Completed in 23.35s

🔄 Run 2/10...
llamando a gpt otra vez porque no daba un JSON bien formado... (intento 1/10)
   ✅ Completed in 20.55s

🔄 Run 3/10...
   ✅ Completed in 20.80s

🔄 Run 4/10...
   ✅ Completed in 22.10s

🔄 Run 5/10...
   ✅ Completed in 18.50s

🔄 Run 6/10...
   ✅ Completed in 19.24s

🔄 Run 7/10...
   ✅ Completed in 21.45s

🔄 Run 8/10...
   ✅ Completed in 18.30s

🔄 Run 9/10...
   ✅ Completed in 19.53s

🔄 Run 10/10...
   ✅ Completed in 23.38s

✅ All 5 runs completed!


In [25]:
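display_stress_test(all_conversation_runs, all_profile_runs)

📋 ALL RUNS - CONVERSATION SCORES

| run | general_score | fillerwords_scoring | clarity_scoring | participation_scoring | keythemes_scoring | indexofquestions_scoring | rhythm_scoring | is_accomplished |
|-----|---------------|---------------------|-----------------|-----------------------|-------------------|--------------------------|----------------|-----------------|
| 1 | 38.5 | 75 | 70 | 90 | 100 | 0 | 20 | False |
| 2 | 38.5 | 75 | 70 | 90 | 100 | 0 | 20 | False |
| 3 | 37.8 | 75 | 60 | 90 | 100 | 0 | 20 | False |
| 4 | 39.2 | 75 | 80 | 90 | 100 | 0 | 20 | False |
| 5 | 39.2 | 75 | 80 | 90 | 100 | 0 | 20 | False |
| 6 | 39.2 | 75 | 80 | 90 | 100 | 0 | 20 | False |
| 7 | 37.8 | 75 | 60 | 90 | 100 | 0 | 20 | False |
| 8 | 37.8 | 75 | 60 | 90 | 100 | 0 | 20 | False |
| 9 | 39.2 | 75 | 80 | 90 | 100 | 0 | 20 | False |
| 10 | 37.8 | 75 | 60 | 90 | 100 | 0 | 20 | False |

📋 ALL RUNS - PROFILE SCORES

| run | prospection_scoring | empathy_scoring | technical_domain_scoring | negotiation_scoring | resilience_scoring |
|-----|---------------------|-----------------|--------------------------|---------------------|--------------------|
| 1 | 4 | 4 | 3 | 2 | 3 |
| 2 | 4 | 2 | 2 | 2 | 3 |
| 3 | 4 | 2 | 2 | 2 | 3 |
| 4 | 4 | 2 | 2 | 2 | 4 |
| 5 | 3 | 3 | 2 | 2 | 4 |
| 6 | 3 | 4 | 2 | 2 | 3 |
| 7 | 4 | 4 | 2 | 2 | 2 |
| 8 | 4 | 3 | 3 | 3 | 5 |
| 9 | 4 | 4 | 3 | 3 | 3 |
| 10 | 4 | 5 | 2 | 2 | 4 |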
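
The run tables show the raw scores only; the per-metric statistics that `display_stress_test` computes are not printed. A minimal, standard-library sketch of how the spread could be summarized for two of the columns above. `summarize` is a hypothetical helper that mirrors `calculate_stats`, using `statistics.pstdev` because `np.std` defaults to the population standard deviation:

```python
from statistics import mean, pstdev


def summarize(values):
    """Mean, population std, min, max and range of the numeric values (bools excluded)."""
    numeric = [v for v in values if isinstance(v, (int, float)) and not isinstance(v, bool)]
    if not numeric:
        return None
    return {
        "mean": round(mean(numeric), 2),
        "std": round(pstdev(numeric), 2),
        "min": min(numeric),
        "max": max(numeric),
        "range": round(max(numeric) - min(numeric), 2),
    }


# Columns copied from the two run tables above
general_score = [38.5, 38.5, 37.8, 39.2, 39.2, 39.2, 37.8, 37.8, 39.2, 37.8]
empathy_scoring = [4, 2, 2, 2, 3, 4, 4, 3, 4, 5]

general_stats = summarize(general_score)
empathy_stats = summarize(empathy_scoring)
print("general_score:", general_stats)
print("empathy_scoring:", empathy_stats)
```

In these ten runs the general score moves only with clarity (60, 70 or 80), while the empathy score spans 2 to 5 on the same transcript, which suggests the profile skills are the less stable of the two score families here.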
