# Customize output format in prompt

## User prompt configuration

### [X] Create prompt template

The prompt template is defined in the file:

[/workspace/data/prompts/futbol/liga_jornada/template.md](/workspace/data/prompts/futbol/liga_jornada/template.md)

It is loaded as follows:

In [15]:
name_prompt = 'futbol'  # TODO: set your prompt folder
name_template = 'liga_jornada'  # TODO: set your template folder

In [16]:
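from pathlib import Path

# Folder that holds template.md for the chosen prompt and template
folder_template = f'{name_prompt}/{name_template}'
folder = Path('/workspace/data/prompts') / folder_template

# Read the template text (UTF-8 is assumed for the file encoding)
path = folder / 'template.md'
template = path.read_text(encoding='utf-8')

template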

'Quiero informacion sobre los partidos de la liga {LIGA} en la jornada {JORNADA}.\n'
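The `{LIGA}` and `{JORNADA}` markers in the template are ordinary brace fields. As a minimal sketch of what substitution produces, the snippet below fills them with plain `str.format`; this is an assumption for illustration only, since the notebook delegates substitution to the prompt object built later, and the values are examples.

```python
# Template text as loaded from template.md
template = 'Quiero informacion sobre los partidos de la liga {LIGA} en la jornada {JORNADA}.\n'

# Illustrative values; str.format stands in for the prompt object's own substitution
filled = template.format(LIGA='Liga Espanola', JORNADA='2025-03-01')
print(filled)
```

A missing field raises `KeyError`, which is a quick way to spot a placeholder name that does not match the input data.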

### [X] Define custom output format and import it

The output format is a class defined in the file:

[/workspace/data/prompts/futbol/output_parser.py](/workspace/data/prompts/futbol/output_parser.py)

In [17]:
output_class_name = 'ListaPartidos'  # TODO: set your class name

In [18]:
from importlib import import_module

# Import data/prompts/<name_prompt>/output_parser.py and fetch the class by name
module = import_module(f'data.prompts.{name_prompt}.output_parser')
OutputParser = getattr(module, output_class_name)

In [19]:
OutputParser

data.prompts.futbol.output_parser.ListaPartidos
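The class is looked up by name at run time, so changing `name_prompt` or `output_class_name` is enough to switch formats. The sketch below shows the same `import_module` plus `getattr` pattern with a clearer error for a misspelled class name. The helper `load_attribute` is hypothetical, and a standard-library module stands in for `data.prompts.<name_prompt>.output_parser`, which presumably needs `/workspace` on the import path.

```python
from importlib import import_module


def load_attribute(module_path: str, attribute_name: str):
    """Import module_path and return the attribute with the given name."""
    module = import_module(module_path)
    try:
        return getattr(module, attribute_name)
    except AttributeError as exc:
        # Report a missing class as an import problem with a readable message
        raise ImportError(
            f'{module_path!r} has no attribute {attribute_name!r}'
        ) from exc


# Standard-library stand-in for the notebook's output_parser module
OrderedDict = load_attribute('collections', 'OrderedDict')
print(OrderedDict)
```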

## Combine template and output format

In [20]:
from modules.prompt import CustomPrompt

custom_prompt = CustomPrompt(template, OutputParser)
prompt = custom_prompt.get_prompt()
prompt

PromptTemplate(input_variables=['JORNADA', 'LIGA'], input_types={}, partial_variables={'FORMAT_INSTRUCTIONS': 'The output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}\nthe object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.\n\nHere is the output schema:\n```\n{"$defs": {"Partido": {"properties": {"liga": {"description": "Nombre de la liga de fútbol", "title": "Liga", "type": "string"}, "jornada": {"description": "Número de la jornada en la temporada", "title": "Jornada", "type": "integer"}, "fecha": {"description": "Fecha en la que se juega el partido", "format": "date", "title": "Fecha", "type": "string"}, "equipo_local": {"description": "Nombre del equipo local", "title": "Equipo 
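The repr above lists `JORNADA` and `LIGA` as input variables and `FORMAT_INSTRUCTIONS` as a partial variable. The internals of `CustomPrompt` are not shown here, so the following is only a plain-Python sketch of that split. It assumes the format instructions are appended to the template as one more brace field, and it uses a short placeholder string instead of the real instructions.

```python
from string import Formatter

template = 'Quiero informacion sobre los partidos de la liga {LIGA} en la jornada {JORNADA}.\n'

# Assumption: the instructions are appended as an extra placeholder
full_template = template + '\n{FORMAT_INSTRUCTIONS}'

# Placeholder text standing in for the real (much longer) format instructions
partial_variables = {'FORMAT_INSTRUCTIONS': 'Return JSON that conforms to the schema.'}

# Collect every brace field found in the template
field_names = {name for _, name, _, _ in Formatter().parse(full_template) if name}

# Fields without a pre-filled value must be supplied at invoke time
input_variables = sorted(field_names - set(partial_variables))
print(input_variables)  # ['JORNADA', 'LIGA']

# Pre-filled and caller-supplied values together produce the final prompt text
final_prompt = full_template.format(
    LIGA='Liga Espanola', JORNADA='2025-03-01', **partial_variables
)
```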

## Chain

### Define model

In [21]:
from langchain_openai import ChatOpenAI
model = ChatOpenAI(model="gpt-4o-search-preview")

model

ChatOpenAI(client=<openai.resources.chat.completions.completions.Completions object at 0x71e8b2affe00>, async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x71e8b2944cb0>, root_client=<openai.OpenAI object at 0x71e8b28fe270>, root_async_client=<openai.AsyncOpenAI object at 0x71e8b290e5d0>, model_name='gpt-4o-search-preview', model_kwargs={}, openai_api_key=SecretStr('**********'))

### Compose chain

In [22]:
chain = prompt | model | custom_prompt.parser
chain

PromptTemplate(input_variables=['JORNADA', 'LIGA'], input_types={}, partial_variables={'FORMAT_INSTRUCTIONS': 'The output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}\nthe object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.\n\nHere is the output schema:\n```\n{"$defs": {"Partido": {"properties": {"liga": {"description": "Nombre de la liga de fútbol", "title": "Liga", "type": "string"}, "jornada": {"description": "Número de la jornada en la temporada", "title": "Jornada", "type": "integer"}, "fecha": {"description": "Fecha en la que se juega el partido", "format": "date", "title": "Fecha", "type": "string"}, "equipo_local": {"description": "Nombre del equipo local", "title": "Equipo 
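The `|` operator feeds the output of each step into the next one: prompt, then model, then parser. The toy sketch below mimics that idea in plain Python. The `Step` class and the fake model are hypothetical and are not how LangChain implements its chains; they only show how overloading `__or__` yields a left-to-right pipeline.

```python
import json


class Step:
    """Wrap a function so that steps can be chained with the | operator."""

    def __init__(self, func):
        self.func = func

    def __or__(self, other):
        # The combined step runs self first, then passes its result to other
        return Step(lambda value: other.invoke(self.invoke(value)))

    def invoke(self, value):
        return self.func(value)


# Stand-ins for prompt, model, and parser
fill_prompt = Step(lambda values: 'Matches of {LIGA} on {JORNADA}'.format(**values))
fake_model = Step(lambda text: json.dumps({'prompt_length': len(text)}))
parse_json = Step(json.loads)

toy_chain = fill_prompt | fake_model | parse_json
result = toy_chain.invoke({'LIGA': 'Liga Espanola', 'JORNADA': '2025-03-01'})
print(result)
```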

### [X] Invoke chain

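To get the parsed response, invoke the chain with a value for each input variable of the prompt (`LIGA` and `JORNADA`). Here `JORNADA` is given as a date rather than a matchday number; the output below reports it as matchday 26.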

In [24]:
output = chain.invoke({
    'LIGA': 'Liga Espanola',
    'JORNADA': '2025-03-01'
})

## Output

### JSON

In [25]:
data = output.model_dump()
data

{'partidos': [{'liga': 'LaLiga Española',
   'jornada': 26,
   'fecha': datetime.date(2025, 3, 1),
   'equipo_local': 'Girona FC',
   'equipo_visitante': 'RC Celta de Vigo',
   'goles_local': 2,
   'goles_visitante': 2,
   'estadio': 'Municipal de Montilivi',
   'url': 'https://as.com/futbol/primera/girona-celta-a-que-hora-es-canal-tv-donde-y-como-ver-laliga-ea-sports-online-hoy-n/'},
  {'liga': 'LaLiga Española',
   'jornada': 26,
   'fecha': datetime.date(2025, 3, 1),
   'equipo_local': 'Rayo Vallecano',
   'equipo_visitante': 'Sevilla FC',
   'goles_local': 1,
   'goles_visitante': 1,
   'estadio': 'Estadio de Vallecas',
   'url': 'https://espndeportes.espn.com/futbol/resultados/_/liga/esp.1/fecha/20250301'},
  {'liga': 'LaLiga Española',
   'jornada': 26,
   'fecha': datetime.date(2025, 3, 1),
   'equipo_local': 'Real Betis',
   'equipo_visitante': 'Real Madrid',
   'goles_local': 2,
   'goles_visitante': 1,
   'estadio': 'Benito Villamarín',
   'url': 'https://as.com/futbol/primer
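Note that `model_dump()` returns a Python dict whose `fecha` values are `datetime.date` objects, which the standard `json` module cannot serialize directly. The sketch below shows one way to produce JSON text, using a single record copied from the output above and a hypothetical helper named `encode_date`. Pydantic v2 also offers `model_dump(mode='json')` and `model_dump_json()`, which may be simpler if the object is still at hand.

```python
import json
from datetime import date

# One record shaped like the model_dump() output (values copied from it)
data = {
    'partidos': [
        {
            'liga': 'LaLiga Española',
            'jornada': 26,
            'fecha': date(2025, 3, 1),
            'equipo_local': 'Girona FC',
            'equipo_visitante': 'RC Celta de Vigo',
            'goles_local': 2,
            'goles_visitante': 2,
        }
    ]
}


def encode_date(value):
    """Convert date objects to ISO strings; reject any other unknown type."""
    if isinstance(value, date):
        return value.isoformat()
    raise TypeError(f'Cannot serialize {type(value).__name__}')


# ensure_ascii=False keeps characters such as "ñ" readable in the output
json_text = json.dumps(data, default=encode_date, ensure_ascii=False, indent=2)
print(json_text)
```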

### DataFrame

In [26]:
import pandas as pd

# 'partidos' is the only top-level key of the dumped model
data_values = data['partidos']
df = pd.DataFrame(data_values)
df.style

| | liga | jornada | fecha | equipo_local | equipo_visitante | goles_local | goles_visitante | estadio | url |
|---|---|---|---|---|---|---|---|---|---|
| 0 | LaLiga Española | 26 | 2025-03-01 | Girona FC | RC Celta de Vigo | 2 | 2 | Municipal de Montilivi | https://as.com/futbol/primera/girona-celta-a-que-hora-es-canal-tv-donde-y-como-ver-laliga-ea-sports-online-hoy-n/ |
| 1 | LaLiga Española | 26 | 2025-03-01 | Rayo Vallecano | Sevilla FC | 1 | 1 | Estadio de Vallecas | https://espndeportes.espn.com/futbol/resultados/_/liga/esp.1/fecha/20250301 |
| 2 | LaLiga Española | 26 | 2025-03-01 | Real Betis | Real Madrid | 2 | 1 | Benito Villamarín | https://as.com/futbol/primera/ya-hay-horarios-para-la-jornada-26-el-betis-madrid-el-domingo-2-a-las-1630-n/ |
| 3 | LaLiga Española | 26 | 2025-03-01 | Atlético de Madrid | Athletic Club | 1 | 0 | Riyadh Air Metropolitano | https://as.com/us/futbol/atletico-de-madrid-athletic-club-horario-tv-como-y-donde-ver-laliga-ea-sports-en-usa-n/ |


### Export to Excel and CSV

In [27]:
from datetime import datetime

# Create a folder named after the current date and time
current_datetime = datetime.now().strftime('%Y%m%d_%H%M%S')
output_folder = folder / 'outputs' / current_datetime
output_folder.mkdir(parents=True, exist_ok=True)

# Save the files in the newly created folder
# (writing .xlsx requires an Excel engine such as openpyxl)
df.to_excel(output_folder / 'output.xlsx', index=False)
df.to_csv(output_folder / 'output.csv', index=False)
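Each run writes into its own timestamped folder, so earlier exports are never overwritten. As a rough standard-library sketch of the same layout, the snippet below writes one illustrative row to a temporary directory and reads it back. The row and the temporary location are assumptions for the example; the notebook itself writes under the template folder with pandas.

```python
import csv
import tempfile
from datetime import datetime
from pathlib import Path

# Illustrative row; the date is already a string, as it would be in a CSV file
rows = [
    {'equipo_local': 'Girona FC', 'equipo_visitante': 'RC Celta de Vigo', 'fecha': '2025-03-01'}
]

with tempfile.TemporaryDirectory() as tmp:
    # Same naming scheme as above: outputs/<YYYYmmdd_HHMMSS>/output.csv
    stamp = datetime.now().strftime('%Y%m%d_%H%M%S')
    output_folder = Path(tmp) / 'outputs' / stamp
    output_folder.mkdir(parents=True, exist_ok=True)

    csv_path = output_folder / 'output.csv'
    with open(csv_path, 'w', newline='', encoding='utf-8') as file:
        writer = csv.DictWriter(file, fieldnames=list(rows[0]))
        writer.writeheader()
        writer.writerows(rows)

    # Read the file back to confirm the round trip
    with open(csv_path, newline='', encoding='utf-8') as file:
        restored = list(csv.DictReader(file))

print(restored)
```

Because the timestamp runs from year down to second, sorting the folder names alphabetically also sorts the runs chronologically.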