# Customize output format in prompt

## User prompt configuration

### [X] Create prompt template

The prompt template is defined in the file:

[/workspace/data/prompts/weather/template.md](/workspace/data/prompts/weather/template.md)

It is loaded as follows:

In [10]:
name_prompt = 'weather' #TODO: define your prompt folder
name_template = 'date_range' #TODO: define your template subfolder

In [11]:
from pathlib import Path

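# Templates live under /workspace/data/prompts/<name_prompt>/<name_template>/
folder_template = f'{name_prompt}/{name_template}'
folder = Path(f'/workspace/data/prompts/{folder_template}')

# Read the template text; placeholders such as {LOCATION} are filled in later
path = folder / 'template.md'
template = path.read_text(encoding='utf-8')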

template

'Find the weather forecast for {LOCATION}, starting from {DATE_START} and ending on {DATE_END}.\n\nGive me the URL of the weather forecast.'
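Before wiring the template into a chain, it can help to check which placeholders it expects. The following is a minimal sketch using only the standard library; the helper name `placeholders` is hypothetical, and the template string is copied from the output above.

```python
from string import Formatter

# Template text copied from the cell output above
template = (
    "Find the weather forecast for {LOCATION}, starting from {DATE_START} "
    "and ending on {DATE_END}.\n\nGive me the URL of the weather forecast."
)


def placeholders(text: str) -> list[str]:
    """Return the named placeholders of a format string, in order of first appearance."""
    names = []
    for _, field_name, _, _ in Formatter().parse(text):
        if field_name and field_name not in names:
            names.append(field_name)
    return names


required = placeholders(template)
print(required)

# Fill the placeholders with plain str.format, only to preview the final text
preview = template.format(
    LOCATION="New York", DATE_START="2025-04-12", DATE_END="2025-04-18"
)
print(preview)
```

The names listed here are the keys that must be passed when the chain is invoked.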

### [X] Define custom output format and import it

[/workspace/data/prompts/weather/output_format.py](/workspace/data/prompts/weather/output_format.py)

In [12]:
output_class_name = 'WeatherList' #TODO: define your class

In [13]:
from importlib import import_module
# Load the class named by output_class_name from the prompt's output_parser module
module = import_module(f'data.prompts.{name_prompt}.output_parser')
OutputParser = getattr(module, output_class_name)

In [14]:
OutputParser

data.prompts.weather.output_parser.WeatherList
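The module itself is not shown in this notebook. Judging from the schema and the output further down, `WeatherList` could look roughly like the Pydantic v2 sketch below. The field names and the first three descriptions come from the notebook outputs; the remaining descriptions are assumptions, since the printed schema is truncated.

```python
import datetime as dt

from pydantic import BaseModel, Field


class Weather(BaseModel):
    """This is the output format for the weather prompt."""

    location: str = Field(description="location of the weather")
    # dt.date is used (not a bare `date` import) so the type does not clash with the field name
    date: dt.date = Field(description="date of the weather")
    description: str = Field(description="description of the weather")
    # Assumed descriptions: the schema printed in the notebook is cut off here
    temperature_min: float = Field(description="minimum temperature")
    temperature_max: float = Field(description="maximum temperature")
    temperature_avg: float = Field(description="average temperature")
    url: str = Field(description="URL of the weather forecast")


class WeatherList(BaseModel):
    """Wrapper with a single field, so the model can return several days at once."""

    weathers: list[Weather]


# Validate a JSON string shaped like the model's answer
raw = (
    '{"weathers": [{"location": "New York City", "date": "2025-04-12", '
    '"description": "Rain", "temperature_min": 8.0, "temperature_max": 16.0, '
    '"temperature_avg": 12.0, "url": "https://meteum.ai/new-york/month/april"}]}'
)
parsed = WeatherList.model_validate_json(raw)
print(parsed.weathers[0].date)
```

Wrapping the list in a single-field model keeps the top level of the JSON an object, which is why the dump later has one key, `weathers`.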

## Combine template and output format

In [15]:
from modules.prompt import CustomPrompt

custom_prompt = CustomPrompt(template, OutputParser)
prompt = custom_prompt.get_prompt()
prompt

PromptTemplate(input_variables=['DATE_END', 'DATE_START', 'LOCATION'], input_types={}, partial_variables={'FORMAT_INSTRUCTIONS': 'The output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}\nthe object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.\n\nHere is the output schema:\n```\n{"$defs": {"Weather": {"description": "This is the output format for the weather prompt.", "properties": {"location": {"description": "location of the weather", "title": "Location", "type": "string"}, "date": {"description": "date of the weather", "format": "date", "title": "Date", "type": "string"}, "description": {"description": "description of the weather", "title": "Description", "type": "string"}, "tempe

## Chain

### Define model

In [16]:
from langchain_openai import ChatOpenAI
model = ChatOpenAI(model="gpt-4o-search-preview")

model

ChatOpenAI(client=<openai.resources.chat.completions.completions.Completions object at 0x73f865a927b0>, async_client=<openai.resources.chat.completions.completions.AsyncCompletions object at 0x73f865a96cf0>, root_client=<openai.OpenAI object at 0x73f865a90170>, root_async_client=<openai.AsyncOpenAI object at 0x73f865a92a80>, model_name='gpt-4o-search-preview', model_kwargs={}, openai_api_key=SecretStr('**********'))

### Compose chain

In [17]:
chain = prompt | model | custom_prompt.parser
chain

PromptTemplate(input_variables=['DATE_END', 'DATE_START', 'LOCATION'], input_types={}, partial_variables={'FORMAT_INSTRUCTIONS': 'The output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nAs an example, for the schema {"properties": {"foo": {"title": "Foo", "description": "a list of strings", "type": "array", "items": {"type": "string"}}}, "required": ["foo"]}\nthe object {"foo": ["bar", "baz"]} is a well-formatted instance of the schema. The object {"properties": {"foo": ["bar", "baz"]}} is not well-formatted.\n\nHere is the output schema:\n```\n{"$defs": {"Weather": {"description": "This is the output format for the weather prompt.", "properties": {"location": {"description": "location of the weather", "title": "Location", "type": "string"}, "date": {"description": "date of the weather", "format": "date", "title": "Date", "type": "string"}, "description": {"description": "description of the weather", "title": "Description", "type": "string"}, "tempe

### [X] Invoke chain

To get the response, invoke the chain with a value for each template variable. The parser at the end of the chain returns the parsed output object.

In [18]:
output = chain.invoke({
    'LOCATION': 'New York',
    'DATE_START': '2025-04-12',
    'DATE_END': '2025-04-18'
})
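Since the dates are passed as plain strings, a typo or a reversed range would only show up in the model's answer. A small helper can catch that before the call. This is a sketch; `build_inputs` is a hypothetical name, not part of the project.

```python
from datetime import date


def build_inputs(location: str, start: date, end: date) -> dict[str, str]:
    """Validate the date range and return the dict expected by the prompt template."""
    if not location.strip():
        raise ValueError("location must not be empty")
    if start > end:
        raise ValueError(f"DATE_START {start} is after DATE_END {end}")
    return {
        "LOCATION": location,
        "DATE_START": start.isoformat(),  # ISO format, e.g. 2025-04-12
        "DATE_END": end.isoformat(),
    }


inputs = build_inputs("New York", date(2025, 4, 12), date(2025, 4, 18))
print(inputs)
# output = chain.invoke(inputs)  # `chain` is the object composed earlier in the notebook
```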

## Output

### JSON

In [19]:
data = output.model_dump()
data

{'weathers': [{'location': 'New York City',
   'date': datetime.date(2025, 4, 12),
   'description': 'Rain',
   'temperature_min': 8.0,
   'temperature_max': 16.0,
   'temperature_avg': 12.0,
   'url': 'https://meteum.ai/new-york/month/april'},
  {'location': 'New York City',
   'date': datetime.date(2025, 4, 13),
   'description': 'Clear',
   'temperature_min': 8.0,
   'temperature_max': 17.0,
   'temperature_avg': 12.5,
   'url': 'https://meteum.ai/new-york/month/april'},
  {'location': 'New York City',
   'date': datetime.date(2025, 4, 14),
   'description': 'Clear',
   'temperature_min': 9.0,
   'temperature_max': 17.0,
   'temperature_avg': 13.0,
   'url': 'https://meteum.ai/new-york/month/april'},
  {'location': 'New York City',
   'date': datetime.date(2025, 4, 15),
   'description': 'Rain',
   'temperature_min': 9.0,
   'temperature_max': 17.0,
   'temperature_avg': 13.0,
   'url': 'https://meteum.ai/new-york/month/april'},
  {'location': 'New York City',
   'date': datetime.da
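Note that `model_dump()` returns a Python dict that still contains `datetime.date` objects, so it cannot be passed to `json.dumps` directly. With Pydantic v2, `output.model_dump(mode='json')` or `output.model_dump_json()` avoid the problem at the source. For a dict that already exists, a minimal standard-library sketch (the helper name `encode_dates` is hypothetical; the record is the first one from the output above):

```python
import datetime
import json

# First record from the output above
data = {
    "weathers": [
        {
            "location": "New York City",
            "date": datetime.date(2025, 4, 12),
            "description": "Rain",
            "temperature_min": 8.0,
            "temperature_max": 16.0,
            "temperature_avg": 12.0,
            "url": "https://meteum.ai/new-york/month/april",
        }
    ]
}


def encode_dates(value):
    """Fallback for json.dumps: turn date/datetime objects into ISO strings."""
    if isinstance(value, (datetime.date, datetime.datetime)):
        return value.isoformat()
    raise TypeError(f"Cannot serialize {type(value).__name__}")


text = json.dumps(data, default=encode_dates, indent=2)
print(text)
```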

### DataFrame

In [20]:
import pandas as pd

# The dump has a single top-level key ('weathers' here); take its list of records
data_values = next(iter(data.values()))
df = pd.DataFrame(data_values)
df.style

| | location | date | description | temperature_min | temperature_max | temperature_avg | url |
|---|---|---|---|---|---|---|---|
| 0 | New York City | 2025-04-12 | Rain | 8.0 | 16.0 | 12.0 | https://meteum.ai/new-york/month/april |
| 1 | New York City | 2025-04-13 | Clear | 8.0 | 17.0 | 12.5 | https://meteum.ai/new-york/month/april |
| 2 | New York City | 2025-04-14 | Clear | 9.0 | 17.0 | 13.0 | https://meteum.ai/new-york/month/april |
| 3 | New York City | 2025-04-15 | Rain | 9.0 | 17.0 | 13.0 | https://meteum.ai/new-york/month/april |
| 4 | New York City | 2025-04-16 | Clear | 8.0 | 17.0 | 12.5 | https://meteum.ai/new-york/month/april |
| 5 | New York City | 2025-04-17 | Rain | 9.0 | 17.0 | 13.0 | https://meteum.ai/new-york/month/april |
| 6 | New York City | 2025-04-18 | Clear | 8.0 | 16.0 | 12.0 | https://meteum.ai/new-york/month/april |
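Because the rows come from a language model, a quick sanity check on the DataFrame is cheap insurance. The sketch below, under the assumption that one row per requested day is expected, looks for missing days and for averages that fall outside the min/max range. It uses the first three rows of the table above, with dates written as strings for brevity.

```python
import pandas as pd

# First three rows from the table above
rows = [
    {"date": "2025-04-12", "temperature_min": 8.0, "temperature_max": 16.0, "temperature_avg": 12.0},
    {"date": "2025-04-13", "temperature_min": 8.0, "temperature_max": 17.0, "temperature_avg": 12.5},
    {"date": "2025-04-14", "temperature_min": 9.0, "temperature_max": 17.0, "temperature_avg": 13.0},
]
df = pd.DataFrame(rows)
df["date"] = pd.to_datetime(df["date"])  # also works for datetime.date objects

# Every day of the requested range should be present
expected_days = pd.date_range("2025-04-12", "2025-04-14", freq="D")
missing_days = expected_days.difference(pd.DatetimeIndex(df["date"]))

# The average must lie between the minimum and the maximum
out_of_range = df[
    (df["temperature_avg"] < df["temperature_min"])
    | (df["temperature_avg"] > df["temperature_max"])
]

print("missing days:", list(missing_days))
print("rows with inconsistent temperatures:", len(out_of_range))
```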


### Export to Excel and CSV

In [21]:
from datetime import datetime

# Create a folder named after the current datetime
current_datetime = datetime.now().strftime('%Y%m%d_%H%M%S')
output_folder = folder / 'outputs' / current_datetime
output_folder.mkdir(parents=True, exist_ok=True)

# Save the files in the newly created folder
# (writing .xlsx requires an Excel engine such as openpyxl to be installed)
df.to_excel(output_folder / 'output.xlsx', index=False)
df.to_csv(output_folder / 'output.csv', index=False)