# Player Performance Predictor - Instant Prediction Tester

**Goal:** This notebook is for testing our saved models. It does **not** perform any training. It loads our pre-trained models and makes instant predictions.

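**How to Use:** Run all the cells, then change the `player`, `venue`, and `opposition` variables in the last cell to test a scenario. A name that does not match a column the model was trained on falls back to the corresponding `Other_Player`, `Other_Venue`, or `Other_Team` category.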

### Step 1: Load Libraries and All Required Files

In [1]:
import pandas as pd
import sqlite3
import joblib

# --- Load the Saved Models and Columns --- #
try:
    saved_batting_model = joblib.load('../models/batting_predictor_model.joblib')
    saved_batting_columns = joblib.load('../models/batting_model_columns.joblib')
    saved_bowling_model = joblib.load('../models/bowling_predictor_model.joblib')
    saved_bowling_columns = joblib.load('../models/bowling_model_columns.joblib')
    print("Models and column lists loaded successfully.")
except FileNotFoundError:
    print("ERROR: Model files not found. Please run the '2_Model_Training.ipynb' notebook first.")
    raise  # Stop here: the cells below cannot run without the models.

# --- Load the full raw data (needed to calculate a player's form) --- #
db_path = '../cricket_data.db'
conn = sqlite3.connect(db_path)
try:
    batting_df = pd.read_sql_query("SELECT * FROM batting_innings", conn)
    bowling_df = pd.read_sql_query("SELECT * FROM bowling_innings", conn)
finally:
    conn.close()  # Release the connection even if a query fails.
batting_df['date'] = pd.to_datetime(batting_df['date'])
bowling_df['date'] = pd.to_datetime(bowling_df['date'])
print("Full historical data loaded for form calculation.")

Models and column lists loaded successfully.
Full historical data loaded for form calculation.
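The historical tables are loaded because a player's "form" is derived from them: the mean of the last five innings, ordered by date. Below is a minimal sketch of that calculation on toy data. The helper name `recent_form` and the toy rows are assumptions made for illustration; only the `player`, `date`, and `runs` column names come from this notebook.

```python
import pandas as pd

def recent_form(history, player_name, value_col, n=5):
    """Mean of value_col over the player's last n innings by date, or None if no history."""
    rows = history[history["player"] == player_name].sort_values(by="date")
    if rows.empty:
        return None
    return rows[value_col].tail(n).mean()

# Hypothetical toy data standing in for the batting_innings table.
toy_batting = pd.DataFrame({
    "player": ["A Batter"] * 6 + ["B Batter"],
    "date": pd.to_datetime([
        "2023-01-01", "2023-01-08", "2023-01-15",
        "2023-01-22", "2023-01-29", "2023-02-05",
        "2023-01-10",
    ]),
    "runs": [100, 10, 20, 30, 40, 50, 7],
})

form_a = recent_form(toy_batting, "A Batter", "runs")        # oldest innings (100) is dropped
form_b = recent_form(toy_batting, "B Batter", "runs")        # fewer than 5 innings: mean of what exists
form_missing = recent_form(toy_batting, "C Batter", "runs")  # no history at all
print(form_a, form_b, form_missing)
```

Note that a player with fewer than five innings still gets a form value, averaged over however many innings exist.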


### Step 2: Define the Prediction Function

In [2]:
def predict_performance(player_name, venue, opposition, full_batting_df, full_bowling_df):
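    """
    Predicts both runs and wickets for a player given the venue and opposition.

    Returns a tuple (predicted_runs, predicted_wickets) of strings. Each is
    either the prediction formatted to two decimals or a message explaining
    why no prediction could be made.
    """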
    
    # --- Batting Prediction ---
    try:
        player_batting_history = full_batting_df[full_batting_df['player'] == player_name].sort_values(by='date')
        if player_batting_history.empty:
            predicted_runs = "No batting history found for this player."
        else:
            last_5_innings_form = player_batting_history['runs'].tail(5).mean()
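            # Build the all-zero feature row directly; filling an empty object-dtype
            # frame with fillna(0) triggers a pandas downcasting FutureWarning.
            batting_input = pd.DataFrame(0, index=[0], columns=saved_batting_columns)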
            
            player_col = f"player_reduced_{player_name}"
            venue_col = f"venue_reduced_{venue}"
            opposition_col = f"against_team_reduced_{opposition}"

            if player_col in batting_input.columns:
                batting_input.loc[0, player_col] = 1
            else:
                batting_input.loc[0, 'player_reduced_Other_Player'] = 1

            if venue_col in batting_input.columns:
                batting_input.loc[0, venue_col] = 1
            else:
                batting_input.loc[0, 'venue_reduced_Other_Venue'] = 1
            
            if opposition_col in batting_input.columns:
                batting_input.loc[0, opposition_col] = 1
            else:
                batting_input.loc[0, 'against_team_reduced_Other_Team'] = 1

            batting_input['form_last_5_innings'] = last_5_innings_form
            prediction = saved_batting_model.predict(batting_input)
            predicted_runs = f"{prediction[0]:.2f}"
    except Exception as e:
        predicted_runs = f"An error occurred during batting prediction: {e}"

    # --- Bowling Prediction ---
    try:
        player_bowling_history = full_bowling_df[full_bowling_df['player'] == player_name].sort_values(by='date')
        if player_bowling_history.empty:
            predicted_wickets = "No bowling history found for this player."
        else:
            last_5_wickets_form = player_bowling_history['wickets'].tail(5).mean()
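            # Build the all-zero feature row directly; filling an empty object-dtype
            # frame with fillna(0) triggers a pandas downcasting FutureWarning.
            bowling_input = pd.DataFrame(0, index=[0], columns=saved_bowling_columns)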
            
            player_col_bowl = f"player_reduced_{player_name}"
            venue_col_bowl = f"venue_reduced_{venue}"
            opposition_col_bowl = f"against_team_reduced_{opposition}"

            if player_col_bowl in bowling_input.columns:
                bowling_input.loc[0, player_col_bowl] = 1
            else:
                bowling_input.loc[0, 'player_reduced_Other_Player'] = 1

            if venue_col_bowl in bowling_input.columns:
                bowling_input.loc[0, venue_col_bowl] = 1
            else:
                bowling_input.loc[0, 'venue_reduced_Other_Venue'] = 1

            if opposition_col_bowl in bowling_input.columns:
                bowling_input.loc[0, opposition_col_bowl] = 1
            else:
                bowling_input.loc[0, 'against_team_reduced_Other_Team'] = 1

            bowling_input['form_last_5_wickets'] = last_5_wickets_form
            prediction_wickets = saved_bowling_model.predict(bowling_input)
            predicted_wickets = f"{prediction_wickets[0]:.2f}"
    except Exception as e:
        predicted_wickets = f"An error occurred during bowling prediction: {e}"

    return predicted_runs, predicted_wickets

print("Prediction function is ready.")

Prediction function is ready.
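The batting and bowling branches build their input row the same way: start from zeros, switch on one column each for player, venue, and opposition (falling back to the `Other_*` column when the name is unknown), then set the form feature. A minimal sketch of that shared step as a single helper follows. The name `build_feature_row` and the toy column list are assumptions for illustration, not part of the saved models; the column prefixes and fallback names are the ones used in the function above.

```python
import pandas as pd

def build_feature_row(columns, player_name, venue, opposition, form_col, form_value):
    """Return a one-row frame of one-hot features in the given column order."""
    row = pd.DataFrame(0.0, index=[0], columns=list(columns))
    choices = [
        ("player_reduced_", player_name, "Other_Player"),
        ("venue_reduced_", venue, "Other_Venue"),
        ("against_team_reduced_", opposition, "Other_Team"),
    ]
    for prefix, value, fallback in choices:
        col = f"{prefix}{value}"
        if col not in row.columns:
            # Unknown name: use the catch-all category instead.
            col = f"{prefix}{fallback}"
        row.loc[0, col] = 1.0
    row[form_col] = form_value
    return row

# Hypothetical column list; the real one comes from batting_model_columns.joblib.
toy_columns = [
    "form_last_5_innings",
    "player_reduced_RG Sharma",
    "player_reduced_Other_Player",
    "venue_reduced_Eden Gardens, Kolkata",
    "venue_reduced_Other_Venue",
    "against_team_reduced_Australia",
    "against_team_reduced_Other_Team",
]

row = build_feature_row(
    toy_columns, "RG Sharma", "Unknown Ground", "Australia",
    "form_last_5_innings", 31.4,
)
print(row.T)
```

With such a helper, each branch of `predict_performance` would reduce to computing the form value, building the row, and calling the model's `predict`. The sketch assumes the fallback and form columns are present in the column list, as they are in the toy example.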


### Step 3: Test a Prediction!

In [3]:
# --- Change the values below to test any scenario ---
player = "RG Sharma"
venue = "Eden Gardens, Kolkata"
opposition = "Australia"

predicted_score, predicted_wickets = predict_performance(player, venue, opposition, batting_df, bowling_df)

print("--- Prediction Result ---")
print(f"Player: {player}")
print(f"Venue: {venue}")
print(f"Opposition: {opposition}")
print(f"\nPredicted Runs: {predicted_score}")
print(f"Predicted Wickets: {predicted_wickets}")

--- Prediction Result ---
Player: RG Sharma
Venue: Eden Gardens, Kolkata
Opposition: Australia

Predicted Runs: 26.09
Predicted Wickets: 1.04


[Parallel(n_jobs=12)]: Using backend ThreadingBackend with 12 concurrent workers.
[Parallel(n_jobs=12)]: Done  17 tasks      | elapsed:    0.0s
[Parallel(n_jobs=12)]: Done 100 out of 100 | elapsed:    0.0s finished
[Parallel(n_jobs=12)]: Using backend ThreadingBackend with 12 concurrent workers.
[Parallel(n_jobs=12)]: Done  17 tasks      | elapsed:    0.0s
[Parallel(n_jobs=12)]: Done 100 out of 100 | elapsed:    0.0s finished
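Because an unrecognized player, venue, or opposition silently falls back to an `Other_*` category, a misspelled name still produces a prediction, just not for the scenario you intended. One way to check the spelling before predicting is to list the names that have their own column. This is a sketch under assumptions: `known_values` and `uses_fallback` are hypothetical helpers, and the toy list stands in for `saved_batting_columns`.

```python
def known_values(columns, prefix):
    """Return the category names that have a dedicated one-hot column."""
    return sorted(c[len(prefix):] for c in columns if c.startswith(prefix))

def uses_fallback(columns, prefix, value):
    """True if this value has no column of its own and would map to Other_*."""
    return f"{prefix}{value}" not in set(columns)

# Hypothetical column list standing in for saved_batting_columns.
toy_columns = [
    "form_last_5_innings",
    "venue_reduced_Eden Gardens, Kolkata",
    "venue_reduced_Other_Venue",
    "against_team_reduced_Australia",
    "against_team_reduced_Other_Team",
]

venues = known_values(toy_columns, "venue_reduced_")
short_name_falls_back = uses_fallback(toy_columns, "venue_reduced_", "Eden Gardens")
full_name_falls_back = uses_fallback(toy_columns, "venue_reduced_", "Eden Gardens, Kolkata")
print(venues, short_name_falls_back, full_name_falls_back)
```

In the notebook itself, the same calls could be made against `saved_batting_columns` and `saved_bowling_columns` with the prefixes `player_reduced_`, `venue_reduced_`, and `against_team_reduced_`.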
