Este pipeline toma historias de usuario del dataset salony_train.csv y las descompone
en tareas de desarrollo

In [16]:
import pandas as pd
import argparse
from pathlib import Path
from typing import Dict

from simple_pipeline import SimplePipeline
from simple_pipeline.steps import LoadDataFrame, OllamaLLMStep, OllamaJudgeStep, AddColumn

In [17]:
def create_task_generation_prompt(row: Dict) -> str:
    """
    Crea el prompt para generar tareas a partir de una historia de usuario del dataset Salony.
    
    Args:
        row: Fila del DataFrame con la columna 'input' que contiene la historia
    
    Returns:
        Prompt formateado
    """
    user_story = row['input'].strip()
    
    prompt = f"""Below is an instruction that describes a task, paired with an input that provides a user story.

Write a response that appropriately completes the request.


Instruction:

Break this user story into smaller development tasks to help the developers implement it efficiently. You can divide this user story into as many tasks as needed, depending on its complexity. Each task must be unique, actionable, and non-overlapping.

Use the following format for the response:

1. summary: ‚Äπtask summary 1‚Ä∫
description: ‚Äπtask description 1‚Ä∫
2. summary: ‚Äπtask summary 2‚Ä∫
description: ‚Äπtask description 2‚Ä∫

N. summary: ‚Äπtask summary N‚Ä∫
description: ‚Äπtask description N‚Ä∫


Input:

{user_story}


Response:"""
    
    return prompt

In [18]:
def run_salony_pipeline(
    output_csv: str,
    model_name: str = "llama3.1:8b",
    judge_model_name: str = "llama3.1:8b", 
    batch_size: int = 2,
    temperature: float = 0.3,
    num_predict: int = 1000,
    sample_size: int = None,
    use_judge: bool = True,
    judge_threshold: float = 35.0
):
    """
    Ejecuta el pipeline de generaci√≥n de tareas para historias de usuario Salony.
    
    Args:
        output_csv: Ruta donde guardar el resultado
        model_name: Modelo de Ollama a usar para generaci√≥n de tareas
        judge_model_name: Modelo de Ollama a usar para validaci√≥n (juez)
        batch_size: N√∫mero de historias a procesar simult√°neamente
        temperature: Temperatura para generaci√≥n
        num_predict: Tokens m√°ximos a generar
        sample_size: Si se especifica, procesa solo N historias (para pruebas)
        use_judge: Si activar validaci√≥n con LLM juez
        judge_threshold: Umbral de aprobaci√≥n del juez (0-50)
    """
    
    print(f"\n{'='*80}")
    print("üöÄ SALONY USER STORIES TO TASKS PIPELINE")
    if use_judge:
        print("üîç CON VALIDACI√ìN LLM JUEZ ACTIVADA")
    print(f"{'='*80}\n")
    
    # Cargar datos - Usar ruta relativa desde el notebook
    input_csv = Path("../data/salony_train.csv")
    print(f"üì• Cargando datos desde: {input_csv}")
    
    if not input_csv.exists():
        raise FileNotFoundError(f"No se encontr√≥ el archivo: {input_csv}")
    
    df = pd.read_csv(input_csv)
    
    # Eliminar la primera columna si es un √≠ndice
    if df.columns[0] == 'Unnamed: 0' or df.columns[0] == '':
        df = df.iloc[:, 1:]
    
    print(f"   ‚úì {len(df)} historias cargadas")
    
    # Verificar columna 'input'
    if 'input' not in df.columns:
        raise ValueError("El CSV debe tener una columna 'input' con las historias de usuario")
    
    # Aplicar sampling si se solicita
    if sample_size:
        df = df.head(sample_size)
        print(f"   ‚ÑπÔ∏è  Procesando solo {sample_size} historias (modo muestra)")
    
    # Limpiar datos
    df = df.dropna(subset=['input'])
    df['input'] = df['input'].str.strip()
    
    # Crear pipeline
    print(f"\n‚öôÔ∏è Configurando pipeline:")
    print(f"   Modelo generador: {model_name}")
    if use_judge:
        print(f"   Modelo juez: {judge_model_name}")
        print(f"   Umbral de aprobaci√≥n: {judge_threshold}/50")
    print(f"   Batch size: {batch_size}")
    print(f"   Temperature: {temperature}")
    print(f"   Historias a procesar: {len(df)}")
    
    pipeline = SimplePipeline(
        name="salony-tasks-pipeline-with-judge",
        description="Pipeline para generar y validar tareas de desarrollo del dataset Salony"
    )
    
    # Paso 1: Cargar datos
    pipeline.add_step(
        LoadDataFrame(name="load", df=df)
    )
    
    # Paso 2: Agregar columna con nombre del modelo generador
    pipeline.add_step(
        AddColumn(
            name="add_generator_model",
            input_columns=[],  # No necesita columnas de entrada
            output_column="generator_model_name",
            func=lambda: model_name
        )
    )
    
    # Paso 3: Generar tareas
    pipeline.add_step(
        OllamaLLMStep(
            name="generate_tasks",
            model_name=model_name,
            prompt_column="input",
            output_column="tasks",
            prompt_template=create_task_generation_prompt,
            system_prompt="You are an expert software development lead who excels at breaking down user stories into clear, actionable development tasks.",
            batch_size=batch_size,
            generation_kwargs={
                "temperature": temperature,
                "num_predict": num_predict
            },
        )
    )
    
    # Paso 4: Validar tareas con LLM juez (opcional)
    if use_judge:
        # Agregar columna con nombre del modelo juez
        pipeline.add_step(
            AddColumn(
                name="add_judge_model",
                input_columns=[],  # No necesita columnas de entrada
                output_column="judge_model_name",
                func=lambda: judge_model_name
            )
        )
        
        pipeline.add_step(
            OllamaJudgeStep(
                name="validate_tasks",
                model_name=judge_model_name,
                historia_usuario_column="input",
                tareas_generadas_column="tasks",
                approval_threshold=judge_threshold,
                batch_size=max(1, batch_size // 2),  # Batch m√°s peque√±o para juez
                generation_kwargs={
                    "temperature": 0.2,  # Temperatura baja para juez m√°s consistente
                    "num_predict": 800
                }
            )
        )
    
    # Ejecutar
    print(f"\nüîÑ Procesando historias...\n")
    result_df = pipeline.run(use_cache=False)
    
    # Guardar
    print(f"\nüíæ Guardando resultados...")
    result_df.to_csv(output_csv, index=False)
    print(f"   ‚úì CSV guardado: {output_csv}")
    print(f"   ‚úì {len(result_df)} historias procesadas")
    
    # Mostrar estad√≠sticas de validaci√≥n si se us√≥ juez
    if use_judge and 'validacion_aprobado' in result_df.columns:
        aprobadas = result_df['validacion_aprobado'].sum()
        total = len(result_df)
        print(f"\nüìä Estad√≠sticas de validaci√≥n:")
        print(f"   ‚úÖ Aprobadas: {aprobadas}/{total} ({aprobadas/total*100:.1f}%)")
        print(f"   ‚ùå Rechazadas: {total-aprobadas}/{total} ({(total-aprobadas)/total*100:.1f}%)")
        
        if 'validacion_total' in result_df.columns:
            avg_score = result_df['validacion_total'].mean()
            print(f"   üìà Puntuaci√≥n promedio: {avg_score:.1f}/50")
    
    print(f"\n{'='*80}")
    print("‚úÖ PIPELINE COMPLETADO EXITOSAMENTE")
    print(f"{'='*80}\n")
    
    # Mostrar ejemplo
    print("üìã Ejemplo de resultado (primeras 2 filas):\n")
    for idx, row in result_df.head(2).iterrows():
        print(f"üîπ Historia #{idx}:")
        print(f"   Input: {row['input'][:100]}...")
        if 'tasks' in row and pd.notna(row['tasks']):
            print(f"   Tasks: {row['tasks'][:150]}...")
        if 'model_generador' in row:
            print(f"   Modelo generador: {row['model_generador']}")
        if use_judge and 'model_juez' in row:
            print(f"   Modelo juez: {row['model_juez']}")
        if use_judge and 'validacion_aprobado' in row:
            status = "‚úÖ APROBADO" if row['validacion_aprobado'] else "‚ùå RECHAZADO"
            print(f"   Validaci√≥n: {status} (Score: {row.get('validacion_total', 'N/A')}/50)")
        print()
    
    return result_df

In [19]:
# Ejemplo 1: Pipeline completo con validaci√≥n LLM juez
result_df = run_salony_pipeline(
    output_csv="salony_tasks_with_validation.csv",
    model_name="llama3.1:8b",
    judge_model_name="llama3.1:8b", 
    batch_size=2,
    temperature=0.3,
    num_predict=1000,
    sample_size=3,
    use_judge=True,
    judge_threshold=35.0
)

2025-11-11 12:41:07 - SimplePipeline.salony-tasks-pipeline-with-judge - INFO - Added step: load
2025-11-11 12:41:07 - SimplePipeline.salony-tasks-pipeline-with-judge - INFO - Added step: add_generator_model
2025-11-11 12:41:07 - SimplePipeline.salony-tasks-pipeline-with-judge - INFO - Added step: generate_tasks
2025-11-11 12:41:07 - SimplePipeline.salony-tasks-pipeline-with-judge - INFO - Added step: add_judge_model
2025-11-11 12:41:07 - SimplePipeline.salony-tasks-pipeline-with-judge - INFO - Added step: validate_tasks
2025-11-11 12:41:07 - SimplePipeline.salony-tasks-pipeline-with-judge - INFO - Starting pipeline: salony-tasks-pipeline-with-judge
2025-11-11 12:41:07 - SimplePipeline.salony-tasks-pipeline-with-judge - INFO - Number of steps: 5
2025-11-11 12:41:07 - SimplePipeline.salony-tasks-pipeline-with-judge - INFO - Executing generator step: load
2025-11-11 12:41:07 - SimplePipeline.salony-tasks-pipeline-with-judge - INFO - Executing step: add_generator_model
2025-11-11 12:41:07 


üöÄ SALONY USER STORIES TO TASKS PIPELINE
üîç CON VALIDACI√ìN LLM JUEZ ACTIVADA

üì• Cargando datos desde: ../data/salony_train.csv
   ‚úì 1999 historias cargadas
   ‚ÑπÔ∏è  Procesando solo 3 historias (modo muestra)

‚öôÔ∏è Configurando pipeline:
   Modelo generador: llama3.1:8b
   Modelo juez: llama3.1:8b
   Umbral de aprobaci√≥n: 35.0/50
   Batch size: 2
   Temperature: 0.3
   Historias a procesar: 3

üîÑ Procesando historias...



Processing generate_tasks:   0%|          | 0/3 [00:00<?, ?it/s]2025-11-11 12:41:23 - OllamaLLMStep.generate_tasks - INFO - Generation for row 0: 1. summary: Retrieve Transaction History Data
description: Develop an API endpoint to fetch the user's transaction history from the database, including date, amount, and description.

2. summary: Display Transaction History on User Interface
description: Create a UI component to display the retrieved transaction history in a readable format, allowing users to scroll through their past transactions.

3. summary: Implement Record-Keeping Functionality
description: Develop a feature that allows users to mark specific transactions as "recorded" or "archived," enabling them to easily identify and access important transactions later.

4. summary: Store Recorded Transactions in Database
description: Update the database schema to store recorded transactions separately from regular transaction history, ensuring easy retrieval of marked transactions.




üíæ Guardando resultados...
   ‚úì CSV guardado: salony_tasks_with_validation.csv
   ‚úì 3 historias procesadas

üìä Estad√≠sticas de validaci√≥n:
   ‚úÖ Aprobadas: 3/3 (100.0%)
   ‚ùå Rechazadas: 0/3 (0.0%)
   üìà Puntuaci√≥n promedio: 44.0/50

‚úÖ PIPELINE COMPLETADO EXITOSAMENTE

üìã Ejemplo de resultado (primeras 2 filas):

üîπ Historia #0:
   Input: As a user, I want to be able to check transaction history and keep a record of it, so that I can go ...
   Tasks: 1. summary: Retrieve Transaction History Data
description: Develop an API endpoint to fetch the user's transaction history from the database, includin...
   Validaci√≥n: ‚úÖ APROBADO (Score: 44/50)

üîπ Historia #1:
   Input: As a researcher, I want to have the ability to insert Greek symbols into my logbook entries....
   Tasks: 1. summary: Research and Document Required Greek Symbols
description: Identify all necessary Greek symbols that will be supported by the system, inclu...
   Validaci√≥n: ‚úÖ APROBADO (Score: