# Kaggle Competition Assistant Demo

This notebook demonstrates the capabilities of the Kaggle Competition Assistant agent. The agent uses the Kaggle API and Google Gemini to help you explore competitions, find winning solutions, and analyze code.

## Prerequisites
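1.  Ensure you have a `kaggle.json` file in `~/.kaggle/` (or the equivalent location for your OS) for Kaggle API authentication. In a Kaggle notebook, store `kaggle_username` and `kaggle_key` as secrets instead; the setup cell below copies them into `KAGGLE_USERNAME` and `KAGGLE_KEY`.
2.  Ensure `GOOGLE_API_KEY` is set in your environment, in a `.env` file, or as a Kaggle secret with that label.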
3.  Install the required dependencies.
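
The checks above can be scripted before running anything else. The following is a minimal sketch, not part of the repository: the helper name `check_prerequisites` is hypothetical, and it only looks at the default `~/.kaggle/kaggle.json` location and at the environment variables used later in this notebook.

```python
import os
from pathlib import Path


def check_prerequisites(env=None, home=None):
    """Return a list of human-readable problems; an empty list means ready."""
    env = os.environ if env is None else env
    home = Path.home() if home is None else Path(home)
    problems = []

    # Kaggle credentials: either kaggle.json or both environment variables.
    has_json = (home / ".kaggle" / "kaggle.json").is_file()
    has_env = bool(env.get("KAGGLE_USERNAME")) and bool(env.get("KAGGLE_KEY"))
    if not (has_json or has_env):
        problems.append("Kaggle credentials not found")

    # The Gemini key is read from the environment in the setup cell.
    if not env.get("GOOGLE_API_KEY"):
        problems.append("GOOGLE_API_KEY is not set")

    return problems


if __name__ == "__main__":
    for problem in check_prerequisites():
        print("Missing:", problem)
```

In a Kaggle notebook the key may live only in Kaggle Secrets, in which case this check reports `GOOGLE_API_KEY` as missing even though the setup cell can still load it.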

## 1. Setup and Initialization

In [2]:
!git clone https://github.com/lmassaron/Agentic-AI-Kaggle-Competition-Assistant.git

Cloning into 'Agentic-AI-Kaggle-Competition-Assistant'...
remote: Enumerating objects: 58, done.
remote: Counting objects: 100% (58/58), done.
remote: Compressing objects: 100% (41/41), done.
remote: Total 58 (delta 24), reused 50 (delta 16), pack-reused 0 (from 0)
Receiving objects: 100% (58/58), 260.96 KiB | 10.04 MiB/s, done.
Resolving deltas: 100% (24/24), done.


In [5]:
import os
from kaggle_secrets import UserSecretsClient

user_secrets = UserSecretsClient()

try:
    my_username = user_secrets.get_secret("kaggle_username")
    my_key = user_secrets.get_secret("kaggle_key")

    # The Kaggle API client reads these environment variables for authentication.
    os.environ["KAGGLE_USERNAME"] = my_username
    os.environ["KAGGLE_KEY"] = my_key
    
    print("Kaggle Credentials set successfully!")

except Exception as e:
    print(f"Error retrieving secrets: {e}")
    print("Please check the 'Add-ons -> Secrets' menu to verify your secret labels.")

Kaggle Credentials set successfully!


In [6]:
import sys
import os
import glob
import importlib

project_root = os.path.abspath('Agentic-AI-Kaggle-Competition-Assistant')

if project_root not in sys.path:
    sys.path.append(project_root)

src_folder = os.path.join(project_root, 'src')
py_files = glob.glob(os.path.join(src_folder, '*.py'))

for file_path in py_files:
    filename = os.path.basename(file_path)[:-3]
    
    if filename == "__init__":
        continue
        
    module_name = f"src.{filename}"
    
    print(f"Importing {module_name}...")
    try:
        module = importlib.import_module(module_name)
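        # Optional: expose the module's public names in the notebook namespace.
        # Underscore-prefixed names are skipped so __name__, __file__, etc. are not overwritten.
        globals().update(
            {k: v for k, v in vars(module).items() if not k.startswith("_")}
        )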
    except Exception as e:
        print(f"Failed to import {module_name}: {e}")

Importing src.kaggle_api...
Importing src.built_in_tools...
Importing src.agent...
Importing src.tools...


In [7]:
import os
import sys
import json
from dotenv import load_dotenv

# Ensure the 'src' module is in the python path
sys.path.append(os.getcwd())

from src.agent import KaggleAgent

# Load environment variables
load_dotenv()
google_api_key = os.getenv("GOOGLE_API_KEY")

if not google_api_key:
    # Fall back to Kaggle secrets when running in a Kaggle notebook
    try:
        from kaggle_secrets import UserSecretsClient
        user_secrets = UserSecretsClient()
        google_api_key = user_secrets.get_secret("GOOGLE_API_KEY")
        print("API Key loaded from Kaggle Secrets.")
    except Exception as e:
        # ImportError outside Kaggle; other errors if the secret is missing or not attached
        print(f"Error: GOOGLE_API_KEY not found ({e}).")
        print("Please set it in .env, environment variables, or Kaggle Secrets.")

# Initialize the Agent
if google_api_key:
    agent = KaggleAgent(api_key=google_api_key)
    print("Agent initialized successfully.")

API Key loaded from Kaggle Secrets.
Agent initialized successfully.


## 2. Demonstrate Capabilities

### 2.1 Find Similar Competitions
The agent can search for competitions by keyword and, as the log line below shows, optionally by metric.

In [8]:
response = agent.run("Find competitions similar to 'titanic' and tell me about them.")
print(response)

Tool call: find_similar_competitions(<proto.marshal.collections.maps.MapComposite object at 0x7db3dc57ea50>)
Finding similar competitions for query: 'titanic', metric: 'None'
No similar competitions were found for 'titanic'.
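
The `Tool call:` line above prints the argument object's default repr, which hides the actual arguments. Below is a minimal sketch of a more readable log line, assuming the argument object behaves like a mapping (exposes `.items()`). The helper `format_tool_call` is hypothetical and not part of the repository, and the `query` argument name is taken from the log output rather than from the tool's signature.

```python
def format_tool_call(name, args):
    """Render a tool call as name(key='value', ...) instead of an object repr."""
    if not hasattr(args, "items"):
        # Not mapping-like: fall back to the plain repr.
        return f"{name}({args!r})"
    rendered = ", ".join(f"{key}={value!r}" for key, value in sorted(args.items()))
    return f"{name}({rendered})"


# Illustrative arguments; a plain dict stands in for the real argument object.
print(format_tool_call("find_similar_competitions", {"query": "titanic"}))
```

Sorting the keys keeps the log line stable across runs, which makes it easier to compare calls.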


### 2.2 Get Winning Solution Writeups
The agent can find discussion posts or kernels describing winning approaches.

In [9]:
response = agent.run("What are the winning solutions for the 'titanic' competition?")
print(response)

Tool call: get_winning_solution_writeups(<proto.marshal.collections.maps.MapComposite object at 0x7db3dc57dd10>)
Getting winning solution write-ups for competition: titanic
Here are some winning solutions for the 'titanic' competition:

*   'Exercise: Arithmetic and Variables' by Alexis Cook (https://www.kaggle.com/alexisbcook/exercise-arithmetic-and-variables)
*   'Titanic Data Science Solutions' by Manav Sehgal (https://www.kaggle.com/startupsci/titanic-data-science-solutions)
*   'A Data Science Framework: To Achieve 99% Accuracy' by LD Freeman (https://www.kaggle.com/ldfreeman3/a-data-science-framework-to-achieve-99-accuracy)
*   'Exploring Survival on the Titanic' by Meg Risdal (https://www.kaggle.com/mrisdal/exploring-survival-on-the-titanic)
*   'Titanic Survival Predictions (Beginner)' by Nadin Tamer (https://www.kaggle.com/nadintamer/titanic-survival-predictions-beginner)


### 2.3 Get Top Scoring Kernels
The agent can list the top public kernels for a competition, filtered by language and sorted by a chosen criterion (votes in the run below).

In [10]:
response = agent.run("Show me the top scoring Python kernels for 'titanic'.")
print(response)

Tool call: get_top_scoring_kernels(<proto.marshal.collections.maps.MapComposite object at 0x7db3d39feb50>)
Getting top scoring kernels for competition: titanic, language: 'Python', sort_by: 'Votes'
Here are the top scoring Python kernels for the 'titanic' competition:

*   'Exercise: Arithmetic and Variables' by Alexis Cook (Score: 448633) - https://www.kaggle.com/alexisbcook/exercise-arithmetic-and-variables
*   'Titanic Tutorial' by Alexis Cook (Score: 54246) - https://www.kaggle.com/alexisbcook/titanic-tutorial
*   'Titanic Data Science Solutions' by Manav Sehgal (Score: 39569) - https://www.kaggle.com/startupsci/titanic-data-science-solutions
*   'Introduction to Ensembling/Stacking in Python' by Anisotropic (Score: 15407) - https://www.kaggle.com/arthurtok/introduction-to-ensembling-stacking-in-python
*   'A Data Science Framework: To Achieve 99% Accuracy' by LD Freeman (Score: 13815) - https://www.kaggle.com/ldfreeman3/a-data-science-framework-to-achieve-99-accuracy


### 2.4 Analyze Tech Stack
The agent can analyze the imports in top kernels to see what libraries are popular.

In [11]:
response = agent.run("What libraries are most commonly used in the 'titanic' competition?")
print(response)

Tool call: analyze_tech_stack(<proto.marshal.collections.maps.MapComposite object at 0x7db3d3ccdd10>)
Analyzing tech stack for competition: titanic
I'm sorry, I was unable to retrieve the information about the most commonly used libraries for the 'titanic' competition.
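
Counting imports is one way such an analysis could work. The sketch below is an assumption about the general technique, not the repository's `analyze_tech_stack` implementation: it parses Python source strings with the standard `ast` module and counts each top-level package once per source. The example sources are made-up strings, not real kernel code.

```python
import ast
from collections import Counter


def count_imports(sources):
    """Count top-level package names imported across Python source strings."""
    counts = Counter()
    for source in sources:
        try:
            tree = ast.parse(source)
        except SyntaxError:
            continue  # skip sources that are not valid Python (e.g. notebook magics)
        packages = set()
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                packages.update(alias.name.split(".")[0] for alias in node.names)
            elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
                # level == 0 excludes relative imports such as "from . import x"
                packages.add(node.module.split(".")[0])
        counts.update(packages)  # each source counts a package at most once
    return counts


# Made-up sources for illustration only.
example_sources = [
    "import pandas as pd\nfrom sklearn.ensemble import RandomForestClassifier\n",
    "import numpy as np\nimport pandas as pd\n",
]
print(count_imports(example_sources).most_common(3))
```

Parsing with `ast` rather than matching text avoids counting imports that appear only in comments or strings.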


### 2.5 Search Code Snippets
The agent can search inside the code of top kernels for specific keywords.

In [12]:
response = agent.run("Search for code snippets using 'RandomForestClassifier' in the 'titanic' competition.")
print(response)

Tool call: search_code_snippets(<proto.marshal.collections.maps.MapComposite object at 0x7db3dc696a10>)
Searching for code snippets with keywords: 'RandomForestClassifier', in competition: titanic
I found several code snippets that use `RandomForestClassifier` in the 'titanic' competition. Here are some of them:

*   **Titanic Tutorial** by Alexis Cook: This kernel demonstrates building a random forest model with features like 'Pclass', 'Sex', 'SibSp', and 'Parch' to predict survival. You can find it at: https://www.kaggle.com/alexisbcook/titanic-tutorial

*   **Titanic Data Science Solutions** by Manav Sehgal: This notebook uses `RandomForestClassifier` as one of the models for predicting survival. It's a comprehensive workflow for data science competitions. You can find it at: https://www.kaggle.com/startupsci/titanic-data-science-solutions

*   **Introduction to Ensembling/Stacking in Python** by Anisotropic: This kernel uses `RandomForestClassifier` as one of the base models in an 

### 2.6 Identify Competition from URL
The agent can parse a URL to identify the competition.

In [13]:
response = agent.run("What is the competition slug for 'https://www.kaggle.com/c/titanic'?")
print(response)

Tool call: get_competition_id_from_url(<proto.marshal.collections.maps.MapComposite object at 0x7db3d3c6e710>)
Extracted slug: titanic
The competition slug for 'https://www.kaggle.com/c/titanic' is 'titanic'.
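
Slug extraction of this kind can be done with the standard library alone. This is a minimal sketch, not the repository's `get_competition_id_from_url` implementation. The function name `competition_slug_from_url` is hypothetical, and it assumes competition pages live under `/c/<slug>` or `/competitions/<slug>`.

```python
from urllib.parse import urlparse


def competition_slug_from_url(url):
    """Return the competition slug from a Kaggle competition URL, or None."""
    parsed = urlparse(url)
    if parsed.netloc.lower() not in {"kaggle.com", "www.kaggle.com"}:
        return None
    parts = [part for part in parsed.path.split("/") if part]
    # Competition pages use /c/<slug> or /competitions/<slug>.
    if len(parts) >= 2 and parts[0] in {"c", "competitions"}:
        return parts[1]
    return None


print(competition_slug_from_url("https://www.kaggle.com/c/titanic"))  # prints: titanic
```

Returning `None` for non-Kaggle hosts lets the caller decide whether to fall back to a generic URL summary.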


### 2.7 Summarize URL Content (Advanced)
For Kaggle competition URLs, the agent triggers a specialized summary workflow, gathering solutions, kernels, and tech stack info.

In [14]:
response = agent.run("Summarize this competition: https://www.kaggle.com/c/titanic")
print(response)

Tool call: summarize_url_content(<proto.marshal.collections.maps.MapComposite object at 0x7db3d39a1550>)
Fetching and summarizing URL: https://www.kaggle.com/c/titanic
Kaggle URL detected. Using specialized tools for a detailed summary.
Extracted slug: titanic
I apologize, but I was unable to summarize the competition content from the provided URL using the available tools. While I know the competition slug is 'titanic', the `summarize_url_content` function encountered an issue.
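
A multi-step summary like this can be made more resilient by isolating each step, so that one failing tool still leaves a partial summary. The sketch below illustrates that pattern under stated assumptions. It does not describe how the repository's `summarize_url_content` works or why the call above failed, and `summarize_competition` and the stand-in steps are hypothetical.

```python
def summarize_competition(slug, steps):
    """Run each named step for a slug and collect results and failures separately."""
    results, failures = {}, {}
    for name, step in steps.items():
        try:
            results[name] = step(slug)
        except Exception as exc:  # one failing step should not sink the summary
            failures[name] = str(exc)
    return {"slug": slug, "results": results, "failures": failures}


# Stand-in steps for illustration only; the real tools live in the repository.
def fake_kernels(slug):
    return [f"top kernel for {slug}"]


def fake_tech_stack(slug):
    raise RuntimeError("no kernel sources available")


summary = summarize_competition(
    "titanic", {"kernels": fake_kernels, "tech_stack": fake_tech_stack}
)
print(summary)
```

Keeping failures in their own dictionary lets the caller report which parts of the summary are missing instead of returning a single generic error.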


## 3. Session Statistics
Review the agent's performance and usage.

In [15]:
if 'agent' in globals():
    print(json.dumps(agent.get_session_stats(), indent=2))

{
  "agent_stats": {
    "queries_processed": 7,
    "tools_called": 7,
    "errors": 0
  },
  "memory_stats": {
    "total_messages": 14,
    "user_messages": 7,
    "assistant_messages": 7
  },
  "logger_stats": {
    "total_logs": 29,
    "info_count": 29,
    "error_count": 0
  }
}
