<a href="https://www.kaggle.com/code/brijeshkachalia/tornadoes-landfall-analyzer?scriptVersionId=234721835" target="_blank"><img align="left" alt="Kaggle" title="Open in Kaggle" src="https://kaggle.com/static/images/open-in-kaggle.svg"></a>

# Tornadoes Landfall Analyzer
---
This notebook analyzes tornado occurrences for a Kaggle competition using Google Search grounding with the Gemini API. It asks a model to assess the likelihood of upcoming tornadoes and to identify the states most at risk, then retrieves and displays the grounding data behind each answer, including citations to web sources. Tornado statistics are visualized with Seaborn charts. Combining a language model with live search results keeps the analysis tied to current, citable sources.

### Key APIs and Libraries
___
1. types.Tool: Declares a tool for the model; here it enables Google Search.

2. types.GenerateContentConfig: Holds the configuration for a generate-content API call.

3. client.models.generate_content(): Generates content from a query and configuration, using the specified model.

4. client.chats.create(): Initializes a chat session with the specified model.

5. chat.send_message(): Sends a message to the chat model and retrieves the response.

6. Markdown(): Renders text in Markdown format for display.

7. Model: gemini-2.0-flash

### Key Functionalities:
---
**Search Grounding**: Enables the model to use real-time Google Search data to enhance the accuracy of responses. 

**Retry Mechanism**: Ensures that if grounding data is incomplete, the query is retried until valid data is obtained. 

**Footnote Generation**: Creates citations for the supported text, linking back to the original sources. 

**Data Visualization**: Utilizes Seaborn to plot charts based on tornado data, enhancing the analysis with visual representation. 


# Use the API
Start by installing and importing the Gemini API Python SDK.

In [1]:
# Uninstall packages from Kaggle base image that are not needed.
!pip uninstall -qy jupyterlab jupyterlab-lsp
# Install the google-genai SDK for this codelab.
!pip install -qU 'google-genai==1.7.0'

In [2]:
from google import genai
from google.genai import types

from IPython.display import Markdown, HTML, display

genai.__version__

'1.7.0'

In [3]:
from kaggle_secrets import UserSecretsClient

GOOGLE_API_KEY = UserSecretsClient().get_secret("GOOGLE_API_KEY")

client = genai.Client(api_key=GOOGLE_API_KEY)

In [4]:
# Configure the model to ground its answers with Google Search.
config_with_search = types.GenerateContentConfig(
    tools=[types.Tool(google_search=types.GoogleSearch())],
)

def query_with_grounding():
    response = client.models.generate_content(
        model='gemini-2.0-flash',
        contents="What is the likelihood of upcoming tornadoes, and in which states?",
        config=config_with_search,
    )
    return response.candidates[0]


rc = query_with_grounding()
Markdown(rc.content.parts[0].text)

Based on the weather reports from April 18, 2025, here's a breakdown of the tornado likelihood and affected areas:

*   **Immediate Threat (April 18, 2025):**
    *   **Western Oklahoma:** Supercells developing in the evening could produce a couple of tornadoes.
*   **Later in the Weekend (April 19-20, 2025):**
    *   **Ozarks, Texarkana, and parts of the southern Plains:** Ingredients are being monitored for potential tornadoes.
    *   **West-Central Texas into Southern Missouri:** There is a slight risk of severe thunderstorms, and initial supercells could pose a risk for a tornado or two.
*   **Other areas with severe weather risks:**
    *   **Southern Wisconsin to Missouri:** Thunderstorms are expected, with damaging winds being the primary concern. Cities like St. Louis, Milwaukee, Chicago, Indianapolis, and Detroit are included in this threat area.

It's important to stay updated with the latest forecasts from the National Weather Service and local news outlets.


In [5]:
while not rc.grounding_metadata.grounding_supports or not rc.grounding_metadata.grounding_chunks:
    # If incomplete grounding data was returned, retry.
    rc = query_with_grounding()

chunks = rc.grounding_metadata.grounding_chunks
for chunk in chunks:
    print(f'{chunk.web.title}: {chunk.web.uri}')

foxweather.com: https://vertexaisearch.cloud.google.com/grounding-api-redirect/AWQVqALS8928-7XWFncAEzJVhhpIYAd4D-UdKQWR1KZ1bieqvZEOIeoZFgnWc9VSEaVtAkIGUeklemC3c6NicBVc_LcMHRTLcCCbWOBgrD7ald-0WmYS1EVz6IvzXoHjjYrvZ3-U032NS3ZXJM4hJKrKJlQf7MIN2Z8CFosNqlvkjJDA7Xrt7Gk=
severeweatheroutlook.com: https://vertexaisearch.cloud.google.com/grounding-api-redirect/AWQVqAKoPRX-uG61qc2PwWh54lp9iK4n6yj6XmQW_e29uEUPMTs1HkDZPX5IZ_EKE5BU-AIbrIGZroyiehR94wATfuQ1wqo7RebEJ7r_ahWveFDo5SkXjU5x6tc3Bqujx07CrGMiwuGkUIn-sQvL
noaa.gov: https://vertexaisearch.cloud.google.com/grounding-api-redirect/AWQVqAJek4NQKRVhzVRbAvv_V9NdqXXGA9Bcx9KInIY_HLymb4pNLrDSvGLGe96QuDptK5YPYVSYfkcK_4rqcEPhTtoyI0HTZF_SrcHmDYSwIUQPyO7Z1NOd0IPaPmPzX74q1xN0oJPHQvU5UPT-T14DQWPWjg==
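The retry loop above repeats the query until grounding data arrives, with no upper limit on attempts. A minimal sketch of a bounded variant follows; `retry_until_valid` and the fake responses are hypothetical names used only for illustration, and the real fetch function would be `query_with_grounding`.

```python
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def retry_until_valid(
    fetch: Callable[[], T],
    is_valid: Callable[[T], bool],
    max_attempts: int = 5,
) -> Optional[T]:
    """Call fetch() until is_valid(result) is true or attempts run out."""
    for _ in range(max_attempts):
        result = fetch()
        if is_valid(result):
            return result
    return None  # The caller decides how to handle a missing result.


# Stand-in for query_with_grounding(): the first two calls return no sources.
fake_responses = iter([[], [], ["foxweather.com"]])
sources = retry_until_valid(lambda: next(fake_responses), is_valid=bool)
print(sources)
```

In the notebook, `is_valid` could check that both `grounding_supports` and `grounding_chunks` are non-empty, so a run of incomplete responses ends after a fixed number of API calls instead of looping indefinitely.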


In [6]:
HTML(rc.grounding_metadata.search_entry_point.rendered_content)

# Tornadoes Grounding Metadata

In [7]:
from pprint import pprint

supports = rc.grounding_metadata.grounding_supports
for support in supports:
    pprint(support.to_json_dict())

{'confidence_scores': [0.9432088],
 'grounding_chunk_indices': [0],
 'segment': {'end_index': 262,
             'start_index': 163,
             'text': '*   **Western Oklahoma:** Supercells developing in the '
                     'evening could produce a couple of tornadoes.'}}
{'confidence_scores': [0.9808925],
 'grounding_chunk_indices': [0],
 'segment': {'end_index': 434,
             'start_index': 317,
             'text': '*   **Ozarks, Texarkana, and parts of the southern '
                     'Plains:** Ingredients are being monitored for potential '
                     'tornadoes.'}}
{'confidence_scores': [0.9128719],
 'grounding_chunk_indices': [1],
 'segment': {'end_index': 600,
             'start_index': 439,
             'text': '*   **West-Central Texas into Southern Missouri:** There '
                     'is a slight risk of severe thunderstorms, and initial '
                     'supercells could pose a risk for a tornado or two.'}}
{'confidence_scores': [0.9706

# Tornadoes Footnotes

In [8]:
import io

markdown_buffer = io.StringIO()

# Print the text with footnote markers.
markdown_buffer.write("Supported text:\n\n")
for support in supports:
    markdown_buffer.write(" * ")
    markdown_buffer.write(
        rc.content.parts[0].text[support.segment.start_index : support.segment.end_index]
    )

    # Add one footnote marker per source chunk that supports this segment.
    for i in support.grounding_chunk_indices:
        markdown_buffer.write(f"<sup>[{i+1}]</sup>")

    markdown_buffer.write("\n\n")


# And print the footnotes.
markdown_buffer.write("Citations:\n\n")
for i, chunk in enumerate(chunks, start=1):
    markdown_buffer.write(f"{i}. [{chunk.web.title}]({chunk.web.uri})\n")


Markdown(markdown_buffer.getvalue())

Supported text:

 * *   **Western Oklahoma:** Supercells developing in the evening could produce a couple of tornadoes.<sup>[1]</sup>

 * *   **Ozarks, Texarkana, and parts of the southern Plains:** Ingredients are being monitored for potential tornadoes.<sup>[1]</sup>

 * *   **West-Central Texas into Southern Missouri:** There is a slight risk of severe thunderstorms, and initial supercells could pose a risk for a tornado or two.<sup>[2]</sup>

 * *   **Southern Wisconsin to Missouri:** Thunderstorms are expected, with damaging winds being the primary concern.<sup>[1]</sup>

 * Cities like St. Louis, Milwaukee, Chicago, Indianapolis, and Detroit are included in this threat area.<sup>[3]</sup><sup>[1]</sup>

Citations:

1. [foxweather.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AWQVqALS8928-7XWFncAEzJVhhpIYAd4D-UdKQWR1KZ1bieqvZEOIeoZFgnWc9VSEaVtAkIGUeklemC3c6NicBVc_LcMHRTLcCCbWOBgrD7ald-0WmYS1EVz6IvzXoHjjYrvZ3-U032NS3ZXJM4hJKrKJlQf7MIN2Z8CFosNqlvkjJDA7Xrt7Gk=)
2. [severeweatheroutlook.com](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AWQVqAKoPRX-uG61qc2PwWh54lp9iK4n6yj6XmQW_e29uEUPMTs1HkDZPX5IZ_EKE5BU-AIbrIGZroyiehR94wATfuQ1wqo7RebEJ7r_ahWveFDo5SkXjU5x6tc3Bqujx07CrGMiwuGkUIn-sQvL)
3. [noaa.gov](https://vertexaisearch.cloud.google.com/grounding-api-redirect/AWQVqAJek4NQKRVhzVRbAvv_V9NdqXXGA9Bcx9KInIY_HLymb4pNLrDSvGLGe96QuDptK5YPYVSYfkcK_4rqcEPhTtoyI0HTZF_SrcHmDYSwIUQPyO7Z1NOd0IPaPmPzX74q1xN0oJPHQvU5UPT-T14DQWPWjg==)
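The footnote logic can also be written as a pure function, which makes it easier to check without calling the API. The sketch below is an assumption-laden illustration: `Support` is a hypothetical stand-in for a grounding support entry, the example URLs are placeholders, and sorting and de-duplicating the markers is a design choice that the cell above does not make.

```python
from dataclasses import dataclass


@dataclass
class Support:
    """Hypothetical stand-in for one grounding support entry."""
    start: int
    end: int
    chunk_indices: list[int]


def build_footnotes(
    text: str, supports: list[Support], sources: list[tuple[str, str]]
) -> str:
    """Return Markdown with footnote markers followed by a citation list."""
    lines = ["Supported text:", ""]
    for s in supports:
        # Sort and de-duplicate so markers always read [1][2], never [2][1].
        markers = "".join(
            f"<sup>[{i + 1}]</sup>" for i in sorted(set(s.chunk_indices))
        )
        lines.append(f" * {text[s.start:s.end]}{markers}")
        lines.append("")
    lines.extend(["Citations:", ""])
    for n, (title, uri) in enumerate(sources, start=1):
        lines.append(f"{n}. [{title}]({uri})")
    return "\n".join(lines)


first = "Supercells could produce tornadoes."
second = "Damaging winds are the primary concern."
text = first + " " + second
supports = [
    Support(0, len(first), [0]),
    Support(len(first) + 1, len(text), [1, 0]),
]
# Placeholder URLs, not real grounding redirects.
sources = [
    ("foxweather.com", "https://example.com/a"),
    ("noaa.gov", "https://example.com/b"),
]
result = build_footnotes(text, supports, sources)
print(result)
```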


In [9]:
from IPython.display import display, Image, Markdown

def show_response(response):
    for p in response.candidates[0].content.parts:
        if p.text:
            display(Markdown(p.text))
        elif p.inline_data:
            display(Image(p.inline_data.data))
        else:
            print(p.to_json_dict())
    
        display(Markdown('----'))

# Tornadoes Stats for the year 2024

In [10]:
config_with_search = types.GenerateContentConfig(
    tools=[types.Tool(google_search=types.GoogleSearch())],
    temperature=0.0,
)

chat = client.chats.create(model='gemini-2.0-flash')

response = chat.send_message(
    message="What were the tornado tallies for the top 10 states from Jan 2024 to Dec 2024?",
    config=config_with_search,
)

show_response(response)

It appears that the final tornado counts for 2024 are still preliminary as of January 2025. Here's a summary of the top states based on available information, keeping in mind that these numbers might change slightly as data is finalized:

1.  **Texas:** 169
2.  **Nebraska:** 131
3.  **Iowa:** 131
4.  **Illinois:** 126
5.  **Missouri:** 105
6.  **Florida:** 103
7.  **Louisiana:** 92
8.  **Oklahoma:** 91
9.  **Kansas:** 89
10. **Ohio:** 74

It's worth noting that several states, including Illinois, Iowa, New York, Ohio, Oklahoma, and West Virginia, experienced their busiest tornado years on record in 2024.


----
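To chart these tallies, the numbered list in the response first has to become structured data. The following is a rough sketch that assumes the model keeps the `N. **State:** count` layout seen above; `parse_tallies` and the regular expression are illustrative, and a differently formatted response would need a different pattern.

```python
import re

# Matches lines such as "1.  **Texas:** 169".
TALLY_LINE = re.compile(
    r"^\s*\d+\.\s+\*\*(?P<state>[^:*]+):\*\*\s+(?P<count>\d+)\s*$"
)


def parse_tallies(markdown_text: str) -> list[tuple[str, int]]:
    """Extract (state, count) pairs from a numbered Markdown list."""
    rows = []
    for line in markdown_text.splitlines():
        match = TALLY_LINE.match(line)
        if match:
            rows.append((match["state"].strip(), int(match["count"])))
    return rows


sample = """1.  **Texas:** 169
2.  **Nebraska:** 131
10. **Ohio:** 74
Some trailing commentary that is not a tally line."""
tallies = parse_tallies(sample)
print(tallies)  # [('Texas', 169), ('Nebraska', 131), ('Ohio', 74)]
```

The resulting pairs can be loaded into a DataFrame and passed to a bar-chart function; lines that do not match the pattern are skipped rather than raising an error.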

# EF3 Category Tornadoes & Insert statements for function calling 

In [12]:
config_with_search = types.GenerateContentConfig(
    tools=[types.Tool(google_search=types.GoogleSearch())],
    temperature=0.0,
)

chat = client.chats.create(model='gemini-2.0-flash')

response = chat.send_message(
    message="How many EF3 tornadoes occurred in 2023? Count tornadoes per state, provide counts only with no descriptions, and output the INSERT statements for table tornado_categories with state and count fields and a last column set to 'EF3'.",
    config=config_with_search,
)

show_response(response)

Based on the search results, here's a breakdown of EF3 tornadoes in 2023 by state, focusing on counts where available. Note that complete and final official counts for the entire year of 2023 are difficult to ascertain from the provided search results, and some information is preliminary.

Here's a summary of EF3 tornado counts by state based on the provided context:

*   **Alabama:** 1
*   **Arkansas:** 2
*   **Delaware:** 1
*   **Illinois:** 3
*   **Indiana:** 2
*   **Iowa:** 1
*   **Mississippi:** 1
*   **Tennessee:** 2
*   **Texas:** 2

Here are the corresponding SQL insert statements:



----

```sql
INSERT INTO tornado_categories (state, count, category) VALUES ('Alabama', 1, 'EF3');
INSERT INTO tornado_categories (state, count, category) VALUES ('Arkansas', 2, 'EF3');
INSERT INTO tornado_categories (state, count, category) VALUES ('Delaware', 1, 'EF3');
INSERT INTO tornado_categories (state, count, category) VALUES ('Illinois', 3, 'EF3');
INSERT INTO tornado_categories (state, count, category) VALUES ('Indiana', 2, 'EF3');
INSERT INTO tornado_categories (state, count, category) VALUES ('Iowa', 1, 'EF3');
INSERT INTO tornado_categories (state, count, category) VALUES ('Mississippi', 1, 'EF3');
INSERT INTO tornado_categories (state, count, category) VALUES ('Tennessee', 2, 'EF3');
INSERT INTO tornado_categories (state, count, category) VALUES ('Texas', 2, 'EF3');
```

----

In [14]:
# Define a retry policy. The model might make several consecutive calls automatically
# for a complex query; this ensures the client retries if it hits quota limits.
from google.api_core import retry

is_retriable = lambda e: (isinstance(e, genai.errors.APIError) and e.code in {429, 503})

if not hasattr(genai.models.Models.generate_content, '__wrapped__'):
  genai.models.Models.generate_content = retry.Retry(
      predicate=is_retriable)(genai.models.Models.generate_content)
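To make the effect of a predicate-based retry policy concrete, here is a plain-Python sketch of the same idea with exponential backoff. It does not use `google.api_core`; `FakeAPIError`, `call_with_retry`, and the delay schedule are assumptions made for illustration, not the library's behavior.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


class FakeAPIError(Exception):
    """Hypothetical stand-in for an API error carrying an HTTP status code."""

    def __init__(self, code: int):
        super().__init__(f"HTTP {code}")
        self.code = code


def is_retriable(exc: Exception) -> bool:
    # Retry only on rate limiting (429) and service unavailable (503).
    return isinstance(exc, FakeAPIError) and exc.code in {429, 503}


def call_with_retry(
    func: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call func(), retrying retriable errors with exponential backoff."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")
    for attempt in range(max_attempts):
        try:
            return func()
        except Exception as exc:
            # Re-raise non-retriable errors and the final failed attempt.
            if not is_retriable(exc) or attempt == max_attempts - 1:
                raise
            sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


attempts = {"count": 0}


def flaky() -> str:
    """Fail twice with a 429, then succeed."""
    attempts["count"] += 1
    if attempts["count"] < 3:
        raise FakeAPIError(429)
    return "ok"


delays: list[float] = []
result = call_with_retry(flaky, sleep=delays.append)  # record delays, no waiting
print(result, delays)  # ok [1.0, 2.0]
```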

In [16]:
%load_ext sql
%sql sqlite:///sample.db

In [17]:
%%sql

-- Create the 'tornado_categories' table
CREATE TABLE IF NOT EXISTS tornado_categories  (
  	tc_id INTEGER PRIMARY KEY AUTOINCREMENT,
  	state VARCHAR(255) NOT NULL,
  	count INTEGER NOT NULL,
    category VARCHAR(255) NOT NULL
  	
  );


 * sqlite:///sample.db
Done.


[]

Database connection shared by all of the functions below

In [19]:
import sqlite3

db_file = "sample.db"
db_conn = sqlite3.connect(db_file)

In [20]:
def describe_table(table_name: str) -> list[tuple[str, str]]:
    """Look up the table schema.

    Returns:
      List of columns, where each entry is a tuple of (column, type).
    """
    print(f' - DB CALL: describe_table({table_name})')

    cursor = db_conn.cursor()

    cursor.execute(f"PRAGMA table_info({table_name});")

    schema = cursor.fetchall()
    # [column index, column name, column type, ...]
    return [(col[1], col[2]) for col in schema]


describe_table("tornado_categories")

 - DB CALL: describe_table(tornado_categories)


[('tc_id', 'INTEGER'),
 ('state', 'VARCHAR(255)'),
 ('count', 'INTEGER'),
 ('category', 'VARCHAR(255)')]
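Because `describe_table` can be called by the model with any string, interpolating the table name into the `PRAGMA` statement is worth a second look. One possible alternative, sketched here against an in-memory database, is SQLite's `pragma_table_info` table-valued function, which accepts a bound parameter; `describe_table_safe` is a hypothetical name, and this assumes a reasonably recent SQLite build.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE tornado_categories (
           tc_id INTEGER PRIMARY KEY AUTOINCREMENT,
           state VARCHAR(255) NOT NULL,
           count INTEGER NOT NULL,
           category VARCHAR(255) NOT NULL
       )"""
)


def describe_table_safe(
    connection: sqlite3.Connection, table_name: str
) -> list[tuple[str, str]]:
    """Return (column, type) pairs, passing the table name as a parameter."""
    rows = connection.execute(
        "SELECT name, type FROM pragma_table_info(?)", (table_name,)
    ).fetchall()
    return [(name, col_type) for name, col_type in rows]


schema = describe_table_safe(conn, "tornado_categories")
print(schema)
```

An unknown table name simply yields an empty list, so the caller can tell "no such table" apart from a malformed statement.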

Tornadoes EF3 statistical data

In [21]:
%%sql
INSERT INTO tornado_categories (state, count, category) VALUES ('Alabama', 1, 'EF3');
INSERT INTO tornado_categories (state, count, category) VALUES ('Arkansas', 2, 'EF3');
INSERT INTO tornado_categories (state, count, category) VALUES ('Delaware', 1, 'EF3');
INSERT INTO tornado_categories (state, count, category) VALUES ('Illinois', 3, 'EF3');
INSERT INTO tornado_categories (state, count, category) VALUES ('Indiana', 2, 'EF3');
INSERT INTO tornado_categories (state, count, category) VALUES ('Iowa', 1, 'EF3');
INSERT INTO tornado_categories (state, count, category) VALUES ('Mississippi', 1, 'EF3');
INSERT INTO tornado_categories (state, count, category) VALUES ('Tennessee', 2, 'EF3');
INSERT INTO tornado_categories (state, count, category) VALUES ('Texas', 1, 'EF3');
INSERT INTO tornado_categories (state, count, category) VALUES ('Oklahoma', 1, 'EF3')

 * sqlite:///sample.db
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.
1 rows affected.


[]
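The rows above were pasted in as literal `INSERT` statements. If the counts are already available as Python data, a parameterized `executemany` call is one way to load them without building SQL strings by hand. This is a sketch against an in-memory database using a subset of the counts shown above; the variable names are illustrative.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """CREATE TABLE tornado_categories (
           tc_id INTEGER PRIMARY KEY AUTOINCREMENT,
           state VARCHAR(255) NOT NULL,
           count INTEGER NOT NULL,
           category VARCHAR(255) NOT NULL
       )"""
)

# A subset of the EF3 counts from the cell above.
ef3_counts = [("Alabama", 1), ("Arkansas", 2), ("Illinois", 3)]

# "with conn" commits on success and rolls back if an insert fails.
with conn:
    conn.executemany(
        "INSERT INTO tornado_categories (state, count, category) VALUES (?, ?, ?)",
        [(state, count, "EF3") for state, count in ef3_counts],
    )

total = conn.execute(
    "SELECT COUNT(*), SUM(count) FROM tornado_categories"
).fetchone()
print(total)  # (3, 6)
```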

In [31]:
def execute_query(sql: str) -> list[list[str]]:
    """Execute an SQL statement, returning the results."""
    print(f' - DB CALL: execute_query({sql})')

    cursor = db_conn.cursor()

    cursor.execute(sql)
    return cursor.fetchall()


execute_query("select * from tornado_categories")

 - DB CALL: execute_query(select * from tornado_categories)


[(1, 'Alabama', 1, 'EF3'),
 (2, 'Arkansas', 2, 'EF3'),
 (3, 'Delaware', 1, 'EF3'),
 (4, 'Illinois', 3, 'EF3'),
 (5, 'Indiana', 2, 'EF3'),
 (6, 'Iowa', 1, 'EF3'),
 (7, 'Mississippi', 1, 'EF3'),
 (8, 'Tennessee', 2, 'EF3'),
 (9, 'Texas', 1, 'EF3'),
 (10, 'Oklahoma', 1, 'EF3')]
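`execute_query` runs whatever SQL it receives, while the system instruction used later asks the model to issue only `SELECT` queries. One way to enforce that at the database level, sketched here on an in-memory database, is SQLite's `PRAGMA query_only`; `execute_select` is a hypothetical helper, not part of the notebook.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tornado_categories (state TEXT, count INTEGER, category TEXT)"
)
conn.execute("INSERT INTO tornado_categories VALUES ('Illinois', 3, 'EF3')")
conn.commit()

# From here on, this connection rejects statements that modify the database.
conn.execute("PRAGMA query_only = ON")


def execute_select(sql: str) -> list[tuple]:
    """Run a statement on the read-only connection and return all rows."""
    return conn.execute(sql).fetchall()


print(execute_select("SELECT state, count FROM tornado_categories"))

try:
    execute_select("DELETE FROM tornado_categories")
    write_blocked = False
except sqlite3.OperationalError:
    # SQLite reports an attempt to write a read-only database.
    write_blocked = True
print(write_blocked)
```

Enabling the pragma on the connection handed to the model-facing tools keeps setup cells free to write through a separate connection.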

In [27]:
def list_tables() -> list[str]:
    """Retrieve the names of all tables in the database."""
    # Include print logging statements so you can see when functions are being called.
    print(' - DB CALL: list_tables()')

    cursor = db_conn.cursor()

    # Fetch the table names.
    cursor.execute("SELECT name FROM sqlite_master WHERE type='table';")

    tables = cursor.fetchall()
    return [t[0] for t in tables]


list_tables()

 - DB CALL: list_tables()


['tornado_categories', 'sqlite_sequence']

In [35]:
# These are the Python functions defined above.
db_tools = [list_tables, describe_table, execute_query]

instruction = """You are a helpful chatbot that can interact with an SQL database
for a Tornadoes Landfall Analyzer. You will take the user's questions and turn them into SQL
queries using the tools available. Once you have the information you need, you will
answer the user's question using the data returned.

Use list_tables to see what tables are present, describe_table to understand the
schema, and execute_query to issue an SQL SELECT query."""

client = genai.Client(api_key=GOOGLE_API_KEY)

# Start a chat with automatic function calling enabled.
chat = client.chats.create(
    model="gemini-2.0-flash",
    config=types.GenerateContentConfig(
        system_instruction=instruction,
        tools=db_tools,
    ),
)

# Function calling to analyze EF3 tornadoes by state using a chat session

In [36]:
resp = chat.send_message("Which state has the most EF3 tornadoes?")
print(f"\n{resp.text}")

 - DB CALL: list_tables()
 - DB CALL: describe_table(tornado_categories)
 - DB CALL: execute_query(SELECT state, SUM(count) AS total_count FROM tornado_categories WHERE category = 'EF3' GROUP BY state ORDER BY total_count DESC LIMIT 1)

Illinois has the highest number of EF3 tornadoes, with a count of 3.
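The model's answer can be checked by running the logged query directly. The sketch below rebuilds the table in memory from the ten rows shown earlier and executes the same aggregate statement; only the in-memory setup is an assumption.

```python
import sqlite3

# The ten rows stored in tornado_categories earlier in the notebook.
rows = [
    ("Alabama", 1), ("Arkansas", 2), ("Delaware", 1), ("Illinois", 3),
    ("Indiana", 2), ("Iowa", 1), ("Mississippi", 1), ("Tennessee", 2),
    ("Texas", 1), ("Oklahoma", 1),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE tornado_categories (state TEXT, count INTEGER, category TEXT)"
)
conn.executemany("INSERT INTO tornado_categories VALUES (?, ?, 'EF3')", rows)

# The query the model issued through execute_query.
top = conn.execute(
    """SELECT state, SUM(count) AS total_count
       FROM tornado_categories
       WHERE category = 'EF3'
       GROUP BY state
       ORDER BY total_count DESC
       LIMIT 1"""
).fetchall()
print(top)  # [('Illinois', 3)]
```

Note that `LIMIT 1` would hide a tie for first place; dropping it returns the full ranking, which is a safer basis for an answer when several states share the top count.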

