# Chapter 08: Integrating LLMs with Forecasting Models

Integrating LLMs with forecasting models lets a system both produce forecasts and explain them. LLMs are good at turning unstructured requests into structured inputs and at turning numeric results into readable insights, so pairing one with a forecasting model enables automatic forecast interpretation and action recommendations. Common applications include demand forecasting and machine degradation prediction; this chapter uses earthquake forecasting as its running example and walks through the practical steps of the integration.

In [1]:
from assets.tools.earthquake import count_earthquakes, query_earthquakes, USGeopoliticalSurveyEarthquakeAPI
from assets.tools.forecasting import forecast_earthquakes, get_regions
from pydantic import BaseModel, Field, EmailStr
from language_models.models.llm import OpenAILanguageModel
from language_models.agent import Agent, Workflow, WorkflowAgentStep, WorkflowFunctionStep, OutputType, PromptingStrategy
from language_models.tools import Tool, current_date
from language_models.proxy_client import ProxyClient
from language_models.settings import settings

In [2]:
proxy_client = ProxyClient(
    client_id=settings.CLIENT_ID,
    client_secret=settings.CLIENT_SECRET,
    auth_url=settings.AUTH_URL,
    api_base=settings.API_BASE,
)

## Use Case: Automating Earthquake Forecast Inquiries

Inquiries about earthquake forecasts are currently handled manually: each response is drafted and sent individually. To automate this process, we design a workflow that starts with the receipt of an email and ends with a generated email response, so that replies are accurate and timely.

**Step 1: Extracting relevant information**

The email body is unstructured, so we first use an LLM to extract the key details: the regions and the forecasting horizon. The LLM also rewrites each region name to match the dataset’s spelling, using a tool that returns the valid regions. This step is crucial because it turns free-form text into structured data that the forecasting model can consume.

In [3]:
get_regions_tool = Tool(
    function=get_regions,
    name="Get Valid Regions",
    description="Use this tool to access the valid regions that can be used for forecasting",
)

In [4]:
system_prompt = """You are tasked with responding to an inquiry about earthquake forecasts.

Use the tool called Get Valid Regions to validate and standardize the spelling of the region names.

Your response should include:
- horizon: Provide the forecasting horizon as specified in the inquiry.
- regions: List the names of the regions, ensuring that the spelling of each region follows the standardized format provided by the Get Valid Regions tool.

The email may contain misspellings or abbreviations.

Additionally, the USA has many regions. In this case, only provide the name of the region and not the country name. For example:
AK (USA) -> Alaska
Idaho, USA -> Idaho
OK, US -> Oklahoma"""

llm = OpenAILanguageModel(
    proxy_client=proxy_client,
    model='gpt-4',
    max_tokens=250,
    temperature=0.2,
)

class Forecast(BaseModel):
    horizon: int = Field(description="The number of days to forecast for")
    regions: list[str] = Field(description="The regions to forecast for")

extract_regions = Agent.create(
    llm=llm,
    system_prompt=system_prompt,
    prompt="{email.body}",
    prompt_variables=["email"],
    tools=[get_regions_tool],
    output_type=OutputType.OBJECT,
    output_schema=Forecast,
    prompting_strategy=PromptingStrategy.SINGLE_COMPLETION,
    verbose=True,
)

extract_regions_step = WorkflowAgentStep(name="extract_regions", agent=extract_regions)
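
Because the prompt delegates spelling normalization to the LLM, it can be useful to check the extracted names against the valid list before forecasting. The following is a minimal sketch of such a check using the standard library's `difflib`. `VALID_REGIONS` and `normalize_region` are hypothetical names, and the list only stands in for whatever `get_regions` actually returns.

```python
from difflib import get_close_matches
from typing import Optional

# Hypothetical stand-in for the list returned by get_regions().
VALID_REGIONS = ["Alaska", "California", "Greece", "Idaho", "Japan", "Oklahoma"]


def normalize_region(name: str, valid_regions: list[str], cutoff: float = 0.8) -> Optional[str]:
    """Return the valid region closest to `name`, or None if nothing is close enough."""
    # Compare case-insensitively, but return the dataset's own spelling.
    lookup = {region.lower(): region for region in valid_regions}
    matches = get_close_matches(name.strip().lower(), list(lookup), n=1, cutoff=cutoff)
    return lookup[matches[0]] if matches else None


print(normalize_region("Califronia", VALID_REGIONS))
print(normalize_region("Atlantis", VALID_REGIONS))
```

Fuzzy matching of this kind catches misspellings but not abbreviations such as `JP` or `CA, USA`, which is why the extraction step itself relies on the LLM.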

**Step 2: Forecasting earthquakes**

With standardized region names, the workflow calls the forecasting model once per region to obtain predicted earthquake magnitudes and depths over the requested forecasting horizon.

In [5]:
class ExtractedRegions(BaseModel):
    extract_regions: Forecast

class RegionForecast(BaseModel):
    region: str
    forecast: list[dict]

def forecast_earthquakes_for_regions(extract_regions: Forecast) -> list[dict]:
    forecasts = []
    for region in extract_regions.regions:
        forecast = forecast_earthquakes(region, extract_regions.horizon)
        forecasts.append(RegionForecast(region=region, forecast=forecast))
    return [forecast.model_dump() for forecast in forecasts]

forecast_earthquakes_step = WorkflowFunctionStep(name="forecast_earthquakes", inputs=ExtractedRegions, function=forecast_earthquakes_for_regions)
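
To experiment with the shape of this step's output without the `assets` package, the per-region fan-out can be sketched with a placeholder forecaster. Everything here is an assumption for illustration: `stub_forecast` is a hypothetical stand-in for `forecast_earthquakes`, the zero values are placeholders rather than predictions, and plain dataclasses replace the pydantic models. The row keys mirror those in the workflow output shown later.

```python
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class Forecast:
    horizon: int
    regions: list[str]


def stub_forecast(region: str, horizon: int, start: date) -> list[dict]:
    """Hypothetical placeholder for forecast_earthquakes: one row per forecast day."""
    return [
        {"Date": start + timedelta(days=day), "Magnitude Forecast": 0.0, "Depth Forecast": 0.0}
        for day in range(horizon)
    ]


def forecast_for_regions(request: Forecast, start: date) -> list[dict]:
    # One entry per region, each carrying its own list of daily rows.
    return [
        {"region": region, "forecast": stub_forecast(region, request.horizon, start)}
        for region in request.regions
    ]


result = forecast_for_regions(Forecast(horizon=5, regions=["California", "Japan"]), date(2024, 8, 15))
print(len(result), len(result[0]["forecast"]))
```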

**Step 3: Writing the email**

Once the forecast data is available, the LLM drafts a response that covers the predicted magnitudes and depths, an assessment of whether each region is in danger, and recommended actions. In the intended process, this draft is reviewed and then sent back to the inquirer via the Email Management System; the workflow in this chapter only produces the draft.

In [6]:
system_prompt = """Create an email regarding the earthquake forecasts based on the specified regions.

The email should include:
- details of the earthquake forecast
- whether the region is in danger
- actions that you recommend to take

Use the closing signature:
Best regards,
Earthquake Forecasting Team"""

prompt = """Respond to this email: {email.sender}

Forecast horizon: {extract_regions.horizon}

Forecasts:
{forecast_earthquakes}"""

llm = OpenAILanguageModel(
    proxy_client=proxy_client,
    model='gpt-4',
    max_tokens=1000,
    temperature=0.2,
)

# Named EmailResponse so it does not clash with the input Email model defined later.
class EmailResponse(BaseModel):
    to: EmailStr
    subject: str
    body: str

create_email = Agent.create(
    llm=llm,
    system_prompt=system_prompt,
    prompt=prompt,
    prompt_variables=["email", "extract_regions", "forecast_earthquakes"],
    tools=[get_regions_tool],
    output_type=OutputType.OBJECT,
    output_schema=EmailResponse,
    prompting_strategy=PromptingStrategy.SINGLE_COMPLETION,
    verbose=True,
)

create_email_step = WorkflowAgentStep(name="create_email", agent=create_email)

**Step 4: Creating the workflow**

Here, we define the workflow input, which wraps the details of the incoming email. We then create the workflow, which handles earthquake forecast inquiries by chaining the three steps defined above: extracting regions, generating forecasts, and drafting the email response.

In [7]:
class Email(BaseModel):
    sender: EmailStr = Field(description="Address of the person who sent the email")
    subject: str = Field(description="Subject of the email")
    body: str = Field(description="Body of the email")

class WorkflowInput(BaseModel):
    email: Email

workflow = Workflow(
    name="Automate Earthquake Forecast Inquiries",
    description="Allows you to respond to inquiries/questions about earthquake forecasts",
    inputs=WorkflowInput,
    output="create_email",
    steps=[extract_regions_step, forecast_earthquakes_step, create_email_step],
    verbose=True,
)
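
As a mental model, the workflow can be thought of as a sequential pipeline in which each step's result is stored under the step's name, so that later steps and prompts can refer to it (for example `{extract_regions.horizon}`). The sketch below illustrates that idea in plain Python. It is not the `language_models` implementation; `run_steps` and the lambda steps are hypothetical.

```python
from typing import Any, Callable

Step = tuple[str, Callable[[dict[str, Any]], Any]]


def run_steps(inputs: dict[str, Any], steps: list[Step], output: str) -> Any:
    """Run steps in order; each step's result is stored under the step's name."""
    context = dict(inputs)
    for name, function in steps:
        context[name] = function(context)
    # The workflow result is the value produced by the step named in `output`.
    return context[output]


steps: list[Step] = [
    ("extract_regions", lambda ctx: {"horizon": 5, "regions": ["Japan"]}),
    ("forecast_earthquakes", lambda ctx: [{"region": r} for r in ctx["extract_regions"]["regions"]]),
    ("create_email", lambda ctx: f"Forecast for {len(ctx['forecast_earthquakes'])} region(s)"),
]

print(run_steps({"email": "..."}, steps, output="create_email"))
```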

**Step 5: Running the workflow**

Finally, we create an email with specified sender, subject, and body content. We then invoke the workflow using this email as input to process the inquiry and generate the appropriate output.

In [8]:
email = Email(
    sender="jennifer.smith@researchinstitute.org",
    subject="Request for 5-Day Earthquake Forecast for Selected Regions",
    body="""Dear Earthquake Forecasting Team,

I hope this email finds you well.

I am reaching out to request a 5-day earthquake forecast for the following regions:
- CA, USA
- JP
- GR

Could you please provide the predicted magnitudes and depths for these regions over the next five days? Additionally, any relevant context or insights related to these forecasts would be very helpful.

Thank you very much for your assistance.

Best regards,
Jennifer Smith
Research Scientist
Research Institute
jennifer.smith@researchinstitute.org""",
)

output = workflow.invoke({"email": email})

[1m[38;2;45;114;210mRunning Step[0m[1m[0m: extract_regions
[1m[38;2;236;154;60mAgent Input[0m[1m[0m: {'email': Email(sender='jennifer.smith@researchinstitute.org', subject='Request for 5-Day Earthquake Forecast for Selected Regions', body='Dear Earthquake Forecasting Team,\n\nI hope this email finds you well.\n\nI am reaching out to request a 5-day earthquake forecast for the following regions:\n- CA, USA\n- JP\n- GR\n\nCould you please provide the predicted magnitudes and depths for these regions over the next five days? Additionally, any relevant context or insights related to these forecasts would be very helpful.\n\nThank you very much for your assistance.\n\nBest regards,\nJennifer Smith\nResearch Scientist\nResearch Institute\njennifer.smith@researchinstitute.org')}


[1m[38;2;50;164;103mFinal Answer[0m[1m[0m: horizon=5 regions=['California', 'Japan', 'Greece']
[1m[38;2;236;154;60mAgent Output[0m[1m[0m: horizon=5 regions=['California', 'Japan', 'Greece']
[1m[38;2;45;114;210mRunning Step[0m[1m[0m: forecast_earthquakes
[1m[38;2;236;154;60mFunction Input[0m[1m[0m: {'extract_regions': Forecast(horizon=5, regions=['California', 'Japan', 'Greece'])}
[1m[38;2;236;154;60mFunction Output[0m[1m[0m: [{'region': 'California', 'forecast': [{'Date': Timestamp('2024-08-15 00:00:00'), 'Magnitude Forecast': 1.1931233305441462, 'Depth Forecast': 7.249995970823374}, {'Date': Timestamp('2024-08-16 00:00:00'), 'Magnitude Forecast': 1.1256780698750068, 'Depth Forecast': 7.452276402910943}, {'Date': Timestamp('2024-08-17 00:00:00'), 'Magnitude Forecast': 1.1024515783958497, 'Depth Forecast': 7.573951044932844}, {'Date': Timestamp('2024-08-18 00:00:00'), 'Magnitude Forecast': 1.0377529478345109, 'Depth Forecast': 7.423034704682362}, {'Date': Timest

In [9]:
print(f"Email to: {output.output.to}")
print(f"Email subject: {output.output.subject}")
print(f"Email body: {output.output.body}")

Email to: jennifer.smith@researchinstitute.org
Email subject: Earthquake Forecast for the Next 5 Days
Email body: Dear Jennifer,

We are writing to provide you with the earthquake forecast for the next 5 days in California, Japan, and Greece.

In California, the forecasted magnitudes range from 1.037 to 1.193, with depths ranging from 7.25 to 7.57. These are relatively low magnitudes and depths, so the region is not in immediate danger. However, we recommend monitoring seismic activity and ensuring that earthquake safety measures are in place.

In Japan, the forecasted magnitudes range from 4.487 to 4.828, with depths ranging from 67.94 to 87.03. These are moderate magnitudes and depths, which could potentially cause damage. We recommend that the region prepares for potential earthquakes, including reviewing and implementing earthquake safety measures.

In Greece, the forecasted magnitudes range from 4.305 to 4.697, with depths ranging from 16.54 to 28.62. These are moderate magnitudes
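
The generated email quotes minimum and maximum values that the LLM derived from the raw forecast rows. One option, sketched here under assumptions, is to compute those ranges in code and pass them to the prompt so the figures do not depend on the model's arithmetic. `summarize_ranges` is a hypothetical helper, and the sample rows are the first two California rows from the function output above.

```python
def summarize_ranges(region_forecasts: list[dict]) -> list[dict]:
    """Reduce each region's daily rows to (min, max) ranges for magnitude and depth."""
    summaries = []
    for entry in region_forecasts:
        magnitudes = [row["Magnitude Forecast"] for row in entry["forecast"]]
        depths = [row["Depth Forecast"] for row in entry["forecast"]]
        summaries.append({
            "region": entry["region"],
            "magnitude_range": (round(min(magnitudes), 3), round(max(magnitudes), 3)),
            "depth_range": (round(min(depths), 2), round(max(depths), 2)),
        })
    return summaries


sample = [{
    "region": "California",
    "forecast": [
        {"Magnitude Forecast": 1.1931233305441462, "Depth Forecast": 7.249995970823374},
        {"Magnitude Forecast": 1.1256780698750068, "Depth Forecast": 7.452276402910943},
    ],
}]

print(summarize_ranges(sample))
```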

In [10]:
query_earthquakes_tool = Tool(
    function=query_earthquakes,
    name="Query Earthquakes",
    description="Use this tool to search recent earthquakes",
    args_schema=USGeopoliticalSurveyEarthquakeAPI,
)

count_earthquakes_tool = Tool(
    function=count_earthquakes,
    name="Count Earthquakes",
    description="Use this tool to count and aggregate recent earthquakes",
    args_schema=USGeopoliticalSurveyEarthquakeAPI,
)

In [11]:
system_prompt = "You are a United States Geological Survey expert who can answer questions regarding earthquakes."

agent = Agent.create(
    llm=llm,
    system_prompt=system_prompt,
    prompt="{question}",
    prompt_variables=["question"],
    output_type=OutputType.STRING,
    tools=[current_date, count_earthquakes_tool, query_earthquakes_tool, workflow.as_tool()],
    prompting_strategy=PromptingStrategy.CHAIN_OF_THOUGHT,
    verbose=True,
)

In [12]:
prompt = """I received this email:

Sender: exampleuser@example.com
Subject: Request for 5-Day Earthquake Forecast for Selected Regions

Dear Earthquake Forecasting Team,

I hope this message finds you well.

I am writing to request a 5-day earthquake forecast for the following regions:

1. **California (CA)**
2. **Japan (JP)**
3. **Greece (GR)**

Could you please provide the predicted magnitudes and depths for these regions over the next five days? Any additional context or insights related to these forecasts would be greatly appreciated.

Thank you for your assistance.

Best regards,
[Your Name]
exampleuser@example.com

Can you reply to this email."""

output = agent.invoke({"question": prompt})

[1m[38;2;45;114;210mThought[0m[1m[0m: The request is for a 5-day earthquake forecast for California, Japan, and Greece. However, it's important to note that precise earthquake forecasting is currently not possible. We can provide information on the seismic activity in these regions and the general risk, but we cannot predict specific earthquakes. I will use the Automate Earthquake Forecast Inquiries tool to generate a response.
[1m[38;2;236;154;60mTool[0m[1m[0m: Automate Earthquake Forecast Inquiries
[1m[38;2;236;154;60mTool Input[0m[1m[0m: {'email': AttributedDict([('sender', 'exampleuser@example.com'), ('subject', 'Re: Request for 5-Day Earthquake Forecast for Selected Regions'), ('body', 'Dear [Your Name],\n\nThank you for your email and interest in earthquake forecasting. At this time, precise earthquake prediction is not yet possible. While we can provide information on the seismic activity in a region and the general risk, we cannot predict specific earthquakes, th

In [13]:
print(output.final_answer)

The email has been successfully responded to with the following content:

Dear User,

We are writing to provide you with the earthquake forecast for the next 5 days in California, Japan, and Greece.

In California, the forecasted magnitudes range from 1.037 to 1.193, with depths ranging from 7.25 to 7.57. These are relatively low magnitudes and depths, so the region is not in immediate danger. However, we recommend monitoring seismic activity and ensuring that earthquake safety measures are in place.

In Japan, the forecasted magnitudes range from 4.487 to 4.828, with depths ranging from 67.94 to 87.03. These are moderate magnitudes and depths, which could potentially cause damage. We recommend that the region prepares for potential earthquakes, including reviewing and implementing earthquake safety measures.

In Greece, the forecasted magnitudes range from 4.305 to 4.697, with depths ranging from 16.54 to 28.62. These are moderate magnitudes and depths, which could potentially cause d

In [17]:
output = agent.invoke({"question": "How many earthquakes occurred today?"})

[1m[38;2;45;114;210mThought[0m[1m[0m: To find out how many earthquakes occurred today, I can use the "Count Earthquakes" tool. I will need to set the start time to the beginning of today and the end time to the current time.
[1m[38;2;236;154;60mTool[0m[1m[0m: Current Date
[1m[38;2;236;154;60mTool Input[0m[1m[0m: {}
[1m[38;2;236;154;60mTool Output[0m[1m[0m: 2024-08-14 09:03:20.548516
[1m[38;2;45;114;210mThought[0m[1m[0m: The current date is 2024-08-14. I'll use the "Count Earthquakes" tool to find out how many earthquakes occurred today. The start time will be the beginning of today (2024-08-14 00:00:00) and the end time will be the current time (2024-08-14 09:03:20).
[1m[38;2;236;154;60mTool[0m[1m[0m: Count Earthquakes
[1m[38;2;236;154;60mTool Input[0m[1m[0m: {'start_time': '2024-08-14T00:00:00', 'end_time': '2024-08-14T09:03:20'}
[1m[38;2;236;154;60mTool Output[0m[1m[0m: {'count': 99, 'maxAllowed': 20000}
[1m[38;2;45;114;210mThought[0m[1m[0

In [18]:
print(output.final_answer)

There have been 99 earthquakes today.
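
In the trace above, the agent turns the current date into a time window running from midnight to the current moment. The same window can be built deterministically, as in this small sketch. `today_window` is a hypothetical helper; the `start_time` and `end_time` keys are taken from the tool input shown in the trace.

```python
from datetime import datetime


def today_window(now: datetime) -> dict[str, str]:
    """Build ISO 8601 timestamps covering midnight through `now`."""
    start = now.replace(hour=0, minute=0, second=0, microsecond=0)
    return {
        "start_time": start.isoformat(timespec="seconds"),
        "end_time": now.isoformat(timespec="seconds"),
    }


# The timestamp returned by the Current Date tool in the trace above.
print(today_window(datetime(2024, 8, 14, 9, 3, 20, 548516)))
```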
