# IPL Data Analysis

## Overview

This notebook analyzes IPL cricket data, covering team and player performance, match outcomes, and key batting and bowling metrics.

## Data Cleaning and Exploration

The datasets were loaded, cleaned, and inspected to ensure they were ready for analysis. No missing values or duplicates were found.
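
The cleaning checks themselves are not shown in this notebook. A minimal sketch of how missing values and duplicates could be counted with pandas is given below. The helper name `summarize_quality` and the sample frame are hypothetical and use made-up values, not rows from the IPL files.

```python
import pandas as pd


def summarize_quality(df: pd.DataFrame) -> dict:
    """Count rows, missing cells, and fully duplicated rows in a DataFrame."""
    return {
        "rows": len(df),
        "missing_cells": int(df.isna().sum().sum()),
        "duplicate_rows": int(df.duplicated().sum()),
    }


# Tiny illustrative frame (made-up values, not taken from the IPL datasets)
sample = pd.DataFrame({
    "batsmanName": ["A", "B", "B", None],
    "runs": [10, 25, 25, 7],
})

report = summarize_quality(sample)
print(report)
```

Applied to each of the four loaded DataFrames, a report with zero `missing_cells` and zero `duplicate_rows` would correspond to the statement above.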

## Initial Data Analysis

### Top 10 Batsmen by Runs Scored

### Top 10 Bowlers by Wickets Taken

### Team Performance by Number of Wins

## Additional Analysis

### Enhanced Match Outcome Analysis

### Detailed Batting Metrics by Over

### Detailed Bowling Metrics

In [None]:
import pandas as pd

# Load the datasets
fact_bowling = pd.read_csv('fact_bowling_summary.csv')
fact_batting = pd.read_csv('fact_bating_summary.csv')
dim_players = pd.read_csv('dim_players.csv')
dim_match_summary = pd.read_csv('dim_match_summary.csv')

# Initial Data Analysis
# Top Batsmen Analysis
top_batsmen = fact_batting.groupby('batsmanName').agg({
    'runs': 'sum',
    'balls': 'sum',
    '4s': 'sum',
    '6s': 'sum'
}).reset_index()
top_batsmen['SR'] = (top_batsmen['runs'] / top_batsmen['balls']) * 100
top_batsmen = top_batsmen.sort_values(by='runs', ascending=False).head(10)

# Top Bowlers Analysis
top_bowlers = fact_bowling.groupby('bowlerName').agg({
    'wickets': 'sum',
    'overs': 'sum',
    'runs': 'sum'
}).reset_index()
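# NOTE: this treats 'overs' as a plain decimal. If the column uses cricket notation
# (e.g. 3.4 = 3 overs and 4 balls), convert overs to balls before summing.
top_bowlers['economy'] = top_bowlers['runs'] / top_bowlers['overs']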
top_bowlers = top_bowlers.sort_values(by='wickets', ascending=False).head(10)

# Team Performance Analysis
# size() counts every match per winner, even when 'margin' is missing for a row
team_performance = dim_match_summary.groupby('winner').size().reset_index(name='Wins')
team_performance.columns = ['Team', 'Wins']
team_performance = team_performance.sort_values(by='Wins', ascending=False)


In [None]:
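# Enhanced Match Outcome Analysis
# Head-to-head results: one row per (team1, team2) pairing, one column per winner
win_loss_analysis = dim_match_summary.groupby(['team1', 'team2', 'winner']).size().unstack(fill_value=0)
# Keep a copy of the winner counts so Total_Matches is not included in the row-wise max
winner_counts = win_loss_analysis.copy()
win_loss_analysis['Total_Matches'] = winner_counts.sum(axis=1)
# Share of matches in each pairing won by its more successful side
win_loss_analysis['Win_Ratio'] = winner_counts.max(axis=1) / win_loss_analysis['Total_Matches']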

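# Detailed Batting Metrics by Over
# NOTE: if 'balls' is the number of balls faced per innings, this value groups innings
# by length (1 = 0-5 balls, 2 = 6-11 balls, ...) and is not the over of the match
fact_batting['over'] = fact_batting['balls'] // 6 + 1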
batting_by_over = fact_batting.groupby(['batsmanName', 'over']).agg({
    'runs': 'sum',
    'balls': 'sum',
    '4s': 'sum',
    '6s': 'sum'
}).reset_index()

# Detailed Bowling Metrics
bowling_detailed_metrics = fact_bowling.groupby('bowlerName').agg({
    '0s': 'sum',
    'wides': 'sum',
    'noBalls': 'sum',
    'wickets': 'sum'
}).reset_index()

# Displaying the results
win_loss_analysis.head(), batting_by_over.head(), bowling_detailed_metrics.head()
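
The economy rate computed for the top bowlers divides runs by the summed `overs` column. Scorecards often record overs in cricket notation, where `3.4` means 3 overs and 4 balls, and summing such values as decimals distorts the result. The sketch below assumes that notation (this notebook does not confirm it). The helper `overs_to_balls` and the sample spells are hypothetical, with made-up values.

```python
import pandas as pd


def overs_to_balls(overs: float) -> int:
    """Convert cricket overs notation (3.4 = 3 overs + 4 balls) to legal balls."""
    overs = float(overs)
    whole = int(overs)
    # The digit after the decimal point is a ball count (0-5), not a fraction
    extra = int(round((overs - whole) * 10))
    return whole * 6 + extra


# Made-up bowling spells for illustration only
spells = pd.DataFrame({
    "bowlerName": ["X", "X", "Y"],
    "overs": [3.4, 2.2, 4.0],
    "runs": [30, 12, 24],
})

spells["balls"] = spells["overs"].apply(overs_to_balls)
totals = spells.groupby("bowlerName", as_index=False)[["balls", "runs"]].sum()

# Economy = runs conceded per six legal balls
totals["economy"] = totals["runs"] / (totals["balls"] / 6)
print(totals)
```

For bowler X, the two spells total 36 balls (6 overs) for 42 runs, an economy of 7.0. Summing the raw `overs` values would give 5.6 overs and an economy of 7.5.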