
***WEEK 3***
1. Cricket Fielding Analysis Data Collection
Objective:
As a budding sports analyst with an interest in cricket, your task is to
conduct a detailed fielding performance analysis for three players of
your choice from any innings of a T20 match. This analysis will help to
gauge individual fielding contributions and their impact on the team's
defensive play. Detailed sample data and a sample performance matrix
are attached to the email containing the task list.
Dataset Features:
- Match No.: Identifier for the match.
- Innings: Which innings the data is being recorded for.
- Team: The team in the field.
- Player Name: The fielder involved in the action.
- Ballcount: Sequence number of the ball in the over.
- Position: Fielding position of the player at the time of the ball.
- Short Description: Brief description of the fielding event.
- Pick: Categorize the pick-up as clean pick, good throw, fumble, bad
throw, catch, or drop catch.
- Throw: Classify the throw as run out, missed stumping, missed run
out, or stumping.
- Runs: Enter the number of runs saved (+) or conceded (-) through the
fielding effort.
- Overcount: The over number in which the event occurred.
- Venue: Location of the match.
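
One way to keep entries consistent with the feature list above is to validate each record before it goes into the spreadsheet. The sketch below is illustrative only: the `FieldingEvent` class, the `validate_event` helper, the lowercase category spellings, and the use of an empty string for "no throw" are assumptions, not part of the task statement.

```python
from dataclasses import dataclass

# Category labels copied from the feature list; lowercase spelling is an assumption.
PICK_CATEGORIES = {"clean pick", "good throw", "fumble", "bad throw", "catch", "drop catch"}
THROW_CATEGORIES = {"run out", "missed stumping", "missed run out", "stumping"}


@dataclass
class FieldingEvent:
    """One fielding action, with one field per dataset feature (hypothetical schema)."""
    match_no: str
    innings: int
    team: str
    player_name: str
    ballcount: int
    position: str
    short_description: str
    pick: str
    throw: str  # empty string when no throw was made (assumption)
    runs: int   # positive = runs saved, negative = runs conceded
    overcount: int
    venue: str


def validate_event(event):
    """Return a list of problems found in one record (empty if none were found)."""
    problems = []
    if event.innings not in (1, 2):
        problems.append("innings must be 1 or 2")
    if event.pick and event.pick.lower() not in PICK_CATEGORIES:
        problems.append(f"unknown pick category: {event.pick!r}")
    if event.throw and event.throw.lower() not in THROW_CATEGORIES:
        problems.append(f"unknown throw category: {event.throw!r}")
    return problems


event = FieldingEvent("IPL2367", 1, "Delhi Capitals", "Phil Salt", 2,
                      "wicket keeper", "Clean pick", "clean pick", "", -1, 0, "Delhi")
print(validate_event(event))  # an empty list means no problems were found
```

Returning a list of problems rather than raising on the first one lets a data collector see every issue in a row at once.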

Task 3 (Advanced Level): Continued...
Performance Metrics Formula:
To assess the fielding performance, use the following formula:
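$$
PS = (CP \times W_{CP}) + (GT \times W_{GT}) + (C \times W_{C}) + (DC \times W_{DC}) + (ST \times W_{ST}) + (RO \times W_{RO}) + (MRO \times W_{MRO}) + (DH \times W_{DH}) + RS
$$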
Where:
PS: Performance Score
CP: Clean Picks
GT: Good Throws
C: Catches
DC: Dropped Catches
ST: Stumpings
RO: Run Outs
MRO: Missed Run Outs
DH: Direct Hits
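RS: Runs Saved (positive for runs saved, negative for runs conceded)
WCP, WGT, WC, WDC, WST, WRO, WMRO, WDH: Weights applied to the corresponding metrics (W subscripted with the metric's abbreviation)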
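
To make the formula concrete, here is a small worked example. The weight values are the ones used by the scoring code later in this notebook; the task statement itself does not fix them. The `performance_score` helper and the example counts are hypothetical.

```python
# Weights mirror those in this notebook's scoring code (an assumption about the
# intended values; the task statement leaves them unspecified).
WEIGHTS = {"CP": 1, "GT": 1, "C": 3, "DC": -3, "ST": 3, "RO": 3, "MRO": -2, "DH": 2}


def performance_score(counts, runs_saved, weights=WEIGHTS):
    """PS = sum(count_k * W_k) + RS; metrics missing from `counts` count as 0."""
    return sum(counts.get(metric, 0) * w for metric, w in weights.items()) + runs_saved


# Hypothetical fielder: 1 clean pick, 2 good throws, 2 catches,
# 1 missed run out, and 3 runs saved.
example = {"CP": 1, "GT": 2, "C": 2, "MRO": 1}
print(performance_score(example, runs_saved=3))  # 1 + 2 + 6 - 2 + 3 = 10
```

Because dropped catches and missed run outs carry negative weights, a single error can cancel out several routine clean picks.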
Task Instructions:
1. Data Collection: For each ball bowled in the match, record the fielding
effort according to the dataset features outlined above. Pay close attention
to the effectiveness of fielding actions and their outcomes.
2. Analysis Preparation: Your collected data will be used for advanced
fielding analysis, identifying key areas of improvement and fielding strengths
within the team.
Deliverable: A well-organized spreadsheet or database containing the
complete fielding data for the match.
This task requires meticulous attention to detail and an understanding of
cricket fielding dynamics. Your analysis will contribute to strategic fielding
placements and improvements in team performance.
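
The data-collection step could be sketched as follows, writing one row per fielding event to a CSV file that opens in any spreadsheet tool. The `write_events` helper is hypothetical, and the example row reuses values from the sample data that appears later in this notebook. An in-memory buffer stands in for a real file here.

```python
import csv
import io

# Column order follows the dataset features listed in the task statement.
COLUMNS = ["Match No.", "Innings", "Team", "Player Name", "Ballcount", "Position",
           "Short Description", "Pick", "Throw", "Runs", "Overcount", "Venue"]


def write_events(events, handle):
    """Write fielding events (dicts keyed by COLUMNS) as CSV with a header row."""
    writer = csv.DictWriter(handle, fieldnames=COLUMNS)
    writer.writeheader()
    for event in events:
        # DictWriter raises ValueError if an event has a key outside COLUMNS.
        writer.writerow(event)


events = [{
    "Match No.": "IPL2367", "Innings": 1, "Team": "Delhi Capitals",
    "Player Name": "Phil Salt", "Ballcount": 0.2, "Position": "wicket keeper",
    "Short Description": "Clean pick", "Pick": "Y", "Throw": "Y",
    "Runs": -1, "Overcount": 0, "Venue": "Delhi",
}]

buffer = io.StringIO()  # swap for open("fielding_data.csv", "w", newline="") to write a file
write_events(events, buffer)
print(buffer.getvalue())
```

Fixing the column list in one place keeps every collected row aligned with the header, which matters once the data is loaded for analysis.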

In [3]:
# Cricket Fielding Performance Analysis - Complete Solution
# This code provides BOTH data collection template AND analysis

import pandas as pd
import numpy as np
from datetime import datetime

# DATA COLLECTION TEMPLATE CREATION
def create_data_collection_template():
    """
    Creates a structured template for collecting fielding data ball-by-ball
    """
    template_columns = [
        'Match No.',
        'Innings',
        'Team',
        'Player Name',
        'Ballcount',
        'Position',
        'Short Description',
        'Pick',  # Codes: Y = Clean Pick, N = Fumble, C = Catch, DC = Drop Catch
        'Throw',  # Codes: Y = Good Throw, N = Bad Throw, DH = Direct Hit, RO = Run Out, MR = Missed Run Out, S = Stumping
        'Runs',  # Positive for saved, negative for conceded
        'Overcount',
        'Venue',
        'Stadium'
    ]

    # Create empty DataFrame with proper structure
    df_template = pd.DataFrame(columns=template_columns)

    # Per-column entry instructions, returned alongside the template (not written into the CSV)
    instructions = {
        'Match No.': 'Enter match identifier (e.g., IPL2367)',
        'Innings': 'Enter 1 or 2',
        'Team': 'Team name in the field',
        'Player Name': 'Fielder name',
        'Ballcount': 'Ball number in over (e.g., 0.1, 0.2)',
        'Position': 'Fielding position',
        'Short Description': 'Brief description of fielding event',
        'Pick': 'Y=Clean Pick, N=Fumble, C=Catch, DC=Drop Catch',
        'Throw': 'Y=Good Throw, N=Bad Throw, DH=Direct Hit, RO=Run Out, MR=Missed Run Out, S=Stumping',
        'Runs': 'Enter + for saved, - for conceded',
        'Overcount': 'Over number',
        'Venue': 'Match location',
        'Stadium': 'Stadium name'
    }

    return df_template, instructions

# DATA PROCESSING AND CATEGORIZATION
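def process_fielding_data(df):
    """
    Process raw fielding data and categorize actions into performance metrics.
    Returns a new DataFrame; the input is left unmodified.
    """
    # Work on a copy so the caller's DataFrame is not mutated
    df = df.copy()

    # Initialize performance metrics columns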
    df['Clean Picks (CP)'] = 0
    df['Good Throws (GT)'] = 0
    df['Catches (C)'] = 0
    df['Dropped Catches (DC)'] = 0
    df['Stumpings (ST)'] = 0
    df['Run Outs (RO)'] = 0
    df['Missed Run Outs (MRO)'] = 0
    df['Direct Hits (DH)'] = 0
    df['Runs Saved (RS)'] = 0

    # Process Pick column
    df.loc[df['Pick'] == 'Y', 'Clean Picks (CP)'] = 1
    df.loc[df['Pick'] == 'C', 'Catches (C)'] = 1
    df.loc[df['Pick'] == 'DC', 'Dropped Catches (DC)'] = 1

    # Process Throw column
    df.loc[df['Throw'] == 'Y', 'Good Throws (GT)'] = 1
    df.loc[df['Throw'] == 'DH', 'Direct Hits (DH)'] = 1
    df.loc[df['Throw'] == 'RO', 'Run Outs (RO)'] = 1
    df.loc[df['Throw'] == 'MR', 'Missed Run Outs (MRO)'] = 1
    df.loc[df['Throw'] == 'S', 'Stumpings (ST)'] = 1

    # Process Runs column
    df['Runs Saved (RS)'] = pd.to_numeric(df['Runs'], errors='coerce').fillna(0)

    return df

# PERFORMANCE SCORE CALCULATION
def calculate_performance_scores(df, selected_players=None):
    """
    Calculate Performance Score (PS) for players based on the formula:
    PS = (CP×1) + (GT×1) + (C×3) + (DC×-3) + (ST×3) + (RO×3) + (MRO×-2) + (DH×2) + RS

    Args:
        df: Processed fielding DataFrame
        selected_players: List of 3 player names to analyze (as per requirement)
    """
    # Define weights
    weights = {
        'CP': 1,   # Clean Picks
        'GT': 1,   # Good Throws
        'C': 3,    # Catches
        'DC': -3,  # Dropped Catches
        'ST': 3,   # Stumpings
        'RO': 3,   # Run Outs
        'MRO': -2, # Missed Run Outs
        'DH': 2    # Direct Hits
    }

    # Filter for selected players if specified
    if selected_players:
        df_analysis = df[df['Player Name'].isin(selected_players)].copy()
    else:
        df_analysis = df.copy()

    # Aggregate metrics by player
    player_summary = df_analysis.groupby('Player Name').agg({
        'Clean Picks (CP)': 'sum',
        'Good Throws (GT)': 'sum',
        'Catches (C)': 'sum',
        'Dropped Catches (DC)': 'sum',
        'Stumpings (ST)': 'sum',
        'Run Outs (RO)': 'sum',
        'Missed Run Outs (MRO)': 'sum',
        'Direct Hits (DH)': 'sum',
        'Runs Saved (RS)': 'sum'
    }).reset_index()

    # Calculate Performance Score
    player_summary['Performance Score (PS)'] = (
        player_summary['Clean Picks (CP)'] * weights['CP'] +
        player_summary['Good Throws (GT)'] * weights['GT'] +
        player_summary['Catches (C)'] * weights['C'] +
        player_summary['Dropped Catches (DC)'] * weights['DC'] +
        player_summary['Stumpings (ST)'] * weights['ST'] +
        player_summary['Run Outs (RO)'] * weights['RO'] +
        player_summary['Missed Run Outs (MRO)'] * weights['MRO'] +
        player_summary['Direct Hits (DH)'] * weights['DH'] +
        player_summary['Runs Saved (RS)']
    )

    # Sort by Performance Score
    player_summary = player_summary.sort_values('Performance Score (PS)', ascending=False)

    return player_summary

# MAIN EXECUTION FUNCTION
def main():
    """
    Main function to execute the complete fielding analysis workflow
    """
    print("="*70)
    print("CRICKET FIELDING PERFORMANCE ANALYSIS")
    print("="*70)

    # Step 1: Create data collection template
    print("\n[STEP 1] Creating data collection template...")
    df_template, instructions = create_data_collection_template()
    df_template.to_csv('fielding_data_collection_template.csv', index=False)
    print("✓ Template saved as 'fielding_data_collection_template.csv'")
    print("\nData Collection Instructions:")
    for col, instruction in instructions.items():
        print(f"  • {col}: {instruction}")

    # Step 2: Load and process your actual data
    print("\n[STEP 2] Loading and processing fielding data...")

    # For demonstration, build a small dataset modeled on the sample data supplied with the task
    sample_data = {
        'Match No.': ['IPL2367'] * 12,
        'Innings': [1] * 12,
        'Team': ['Delhi Capitals'] * 12,
        'Player Name': ['Rilee russouw', 'Phil Salt', 'Yash Dhull', 'Axer Patel',
                       'Lalit yadav', 'Aman Khan', 'Kuldeep yadav', 'Rilee russouw',
                       'Phil Salt', 'Yash Dhull', 'Yash Dhull', 'Lalit yadav'],
        'Ballcount': [0.1, 0.2, 0.3, 0.4, 0.6, 1.1, 1.3, 0.5, 1.0, 1.2, 1.4, 1.5],
        'Position': ['Short mid wicket', 'wicket keeper', 'covers', 'point',
                    'cover point', 'long off', 'Short mid wicket', 'point',
                    'wicket keeper', 'covers', 'covers', 'bowler'],
        'Short Description': ['Clean pick'] * 12,
        'Pick': ['Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'C', 'DC', 'C', 'C', 'Y'],
        'Throw': ['Y', 'Y', 'Y', 'Y', 'Y', 'Y', 'N', 'DH', 'RO', 'Y', 'MR', 'DH'],
        'Runs': [2, -1, 3, 0, -2, 1, 4, 0, 0, 0, 0, 0],
        'Overcount': [0, 0, 0, 0, 0, 1, 1, 0, 1, 1, 1, 1],
        'Venue': ['Delhi'] * 12,
        'Stadium': ['Arun Jaitley Stadium'] * 12
    }

    df = pd.DataFrame(sample_data)

    # Process the data
    df_processed = process_fielding_data(df)
    print("✓ Data processed successfully")

    # Step 3: Select THREE players for analysis (as per requirement)
    print("\n[STEP 3] Selecting THREE players for detailed analysis...")
    all_players = df_processed['Player Name'].unique()

    # Select first 3 players (you can modify this selection)
    selected_players = list(all_players[:3])
    print(f"Selected players: {', '.join(selected_players)}")

    # Step 4: Calculate performance scores
    print("\n[STEP 4] Calculating Performance Scores...")
    performance_summary = calculate_performance_scores(df_processed, selected_players)

    print("\n" + "="*70)
    print("PERFORMANCE SUMMARY FOR SELECTED PLAYERS")
    print("="*70)
    print(performance_summary.to_string(index=False))

    # Step 5: Save deliverables
    print("\n[STEP 5] Saving analysis deliverables...")
    df_processed.to_csv('complete_fielding_data.csv', index=False)
    performance_summary.to_csv('performance_analysis.csv', index=False)

    print("\n✓ Deliverables saved:")
    print("  • complete_fielding_data.csv - Full ball-by-ball data")
    print("  • performance_analysis.csv - Performance summary")

    # Step 6: Generate insights
    print("\n" + "="*70)
    print("KEY INSIGHTS")
    print("="*70)

    for _, row in performance_summary.iterrows():
        print(f"\n{row['Player Name']}:")
        print(f"  Performance Score: {row['Performance Score (PS)']}")
        print("  Strengths: ", end="")
        strengths = []
        if row['Catches (C)'] > 0:
            strengths.append(f"{int(row['Catches (C)'])} catches")
        if row['Direct Hits (DH)'] > 0:
            strengths.append(f"{int(row['Direct Hits (DH)'])} direct hits")
        if row['Runs Saved (RS)'] > 0:
            strengths.append(f"{int(row['Runs Saved (RS)'])} runs saved")
        print(", ".join(strengths) if strengths else "Clean fielding")

        if row['Dropped Catches (DC)'] > 0 or row['Missed Run Outs (MRO)'] > 0:
            print("  Areas for improvement: ", end="")
            improvements = []
            if row['Dropped Catches (DC)'] > 0:
                improvements.append(f"Catching ({int(row['Dropped Catches (DC)'])} drops)")
            if row['Missed Run Outs (MRO)'] > 0:
                improvements.append("Run out accuracy")
            print(", ".join(improvements))

    print("\n" + "="*70)

# RUN THE COMPLETE ANALYSIS

if __name__ == "__main__":
    main()

CRICKET FIELDING PERFORMANCE ANALYSIS

[STEP 1] Creating data collection template...
✓ Template saved as 'fielding_data_collection_template.csv'

Data Collection Instructions:
  • Match No.: Enter match identifier (e.g., IPL2367)
  • Innings: Enter 1 or 2
  • Team: Team name in the field
  • Player Name: Fielder name
  • Ballcount: Ball number in over (e.g., 0.1, 0.2)
  • Position: Fielding position
  • Short Description: Brief description of fielding event
  • Pick: Y=Clean Pick, N=Fumble, C=Catch, DC=Drop Catch
  • Throw: Y=Good Throw, N=Bad Throw, DH=Direct Hit, RO=Run Out, MR=Missed Run Out, S=Stumping
  • Runs: Enter + for saved, - for conceded
  • Overcount: Over number
  • Venue: Match location
  • Stadium: Stadium name

[STEP 2] Loading and processing fielding data...
✓ Data processed successfully

[STEP 3] Selecting THREE players for detailed analysis...
Selected players: Rilee russouw, Phil Salt, Yash Dhull

[STEP 4] Calculating Performance Scores...

PERFORMANCE SUMMARY FOR 