# NPL Season 1 (2024) Analysis
The Nepal Premier League (NPL) is a T20 cricket league in Nepal, founded by the Cricket Association of Nepal (CAN) in 2024. The inaugural season ran from November 30, 2024 (15 Mangsir 2081 BS) to December 21, 2024 (6 Poush 2081 BS). The league features eight teams based on provinces and cities. The tournament follows a round-robin format, with the top teams advancing to the playoffs.


![Image](https://wicketnepal.com/wp-content/uploads/2024/11/468719762_122127843350455690_7601835368897814823_n.jpg)

In [7]:
#importing necessary libraries
import os
import pandas as pd
import plotly.express as px
import plotly.graph_objects as go

## Batting Analysis 

In [8]:
# Load the main batting data for analysis
batting_file = os.path.join(os.getcwd(), "Batting Records", "most_runs.csv")
df = pd.read_csv(batting_file)


In [9]:
df.head()

Unnamed: 0,player,span,matches_played,innings_batted,not_out,runs,highest_inns_score,batting_average,balls_faced,strike_rate,hundreds_scored,fifties_scored,ducks_scored,boundary_fours,boundary_sixes
0,BKEL Milantha (JAB),2024-2024,10,10,1,293,87,32.55,224,130.8,-,2,-,35,9
1,RS Bopara (CHR),2024-2024,8,8,1,286,59*,40.85,211,135.54,-,3,-,24,11
2,RK Paudel (LUL),2024-2024,7,7,-,279,95,39.85,212,131.6,-,2,-,26,11
3,SA Zaib (SPR),2024-2024,9,9,-,275,90,30.55,202,136.13,-,2,-,24,12
4,JDS Neesham (JAB),2024-2024,9,9,2,247,65,35.28,126,196.03,-,1,-,14,23


### Top 10 run scorers

In [10]:
# Top 10 run scorers
top_runs = df.copy()
top_runs['runs'] = pd.to_numeric(top_runs['runs'], errors='coerce')
top_runs = top_runs.sort_values('runs', ascending=False).head(10)
fig_top_runs = px.bar(
    top_runs,
    x='player',
    y='runs',
    title='Top 10 Run Scorers'
)
fig_top_runs.update_layout(xaxis_title='Player', yaxis_title='Runs', xaxis_tickangle=-45)
fig_top_runs.show()

Based on the "Top 10 Run Scorers" graph, the leading run scorers in the Nepal Premier League 2024 are:

1. **BKEL Milantha (JAB)** – 293 runs
2. **RS Bopara (CHR)** – 286 runs
3. **RK Paudel (LUL)** – 279 runs
4. **SA Zaib (SPR)** – 275 runs
5. **JDS Neesham (JAB)** – 247 runs
6. **B Bhandari (SPR)** – 235 runs
7. **DS Airee (SPR)** – 227 runs
8. **AGS Gous (POA)** – 224 runs
9. **Kushal Malla (CHR)** – 223 runs
10. **B McMullen (SPR)** – 201 runs

**BKEL Milantha (JAB)** finished as the tournament's top scorer, followed closely by RS Bopara (CHR) and RK Paudel (LUL).


### Top 10 Batting Averages

In [11]:
df_avg = df[pd.to_numeric(df['batting_average'], errors='coerce').notnull()].copy()
df_avg['batting_average'] = df_avg['batting_average'].astype(float)
top_avg = df_avg.sort_values('batting_average', ascending=False).head(10)
fig_avg = px.bar(top_avg, x='player', y='batting_average', title='Top 10 Batting Averages')
fig_avg.update_layout(xaxis_title='Player', yaxis_title='Average', xaxis_tickangle=-45)
fig_avg.show()

The "Top 10 Batting Averages" bar graph highlights the most consistent batsmen in the Nepal Premier League 2024. The leading players by batting average are:

1. **B Yadav (LUL)** – 50.00
2. **S Dhawan (KAY)** – 45.33
3. **AGS Gous (POA)** – 44.80
4. **RS Bopara (CHR)** – 40.85
5. **RK Paudel (LUL)** – 39.85
6. **WG Bosisto (KAY)** – 38.80
7. **DS Airee (SPR)** – 37.83
8. **RA Reifer (POA)** – 36.50
9. **Basir Ahamad (BIK)** – 36.20
10. **JDS Neesham (JAB)** – 35.28

**B Yadav (LUL)** leads with an impressive average of 50.00, indicating high consistency and reliability at the crease. The presence of multiple players from different teams in the top 10 reflects a competitive tournament with several standout performers.
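
The averages above follow the usual cricket definition, runs divided by dismissals, so a few not-outs in a short season can lift a figure considerably. The following is a minimal sketch of that calculation using the five rows shown in the `df.head()` output. It assumes the `-` placeholder means zero; the frame name `sample` and the `*_calc` columns are illustrative and not part of the notebook.

```python
import pandas as pd

# Five rows copied from the df.head() output; '-' is assumed to mean zero.
sample = pd.DataFrame({
    "player": ["BKEL Milantha (JAB)", "RS Bopara (CHR)", "RK Paudel (LUL)",
               "SA Zaib (SPR)", "JDS Neesham (JAB)"],
    "innings_batted": [10, 8, 7, 9, 9],
    "not_out": ["1", "1", "-", "-", "2"],
    "runs": [293, 286, 279, 275, 247],
    "balls_faced": [224, 211, 212, 202, 126],
})

# Dismissals = innings batted minus not-outs.
not_outs = pd.to_numeric(sample["not_out"], errors="coerce").fillna(0)
dismissals = sample["innings_batted"] - not_outs

# Batting average = runs per dismissal; strike rate = runs per 100 balls.
sample["average_calc"] = sample["runs"] / dismissals
sample["strike_rate_calc"] = 100 * sample["runs"] / sample["balls_faced"]

print(sample[["player", "average_calc", "strike_rate_calc"]].round(2))
```

The recomputed values agree with the published columns to within 0.01. The published figures appear to be truncated rather than rounded, so the last digit can differ.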

### Top 10 Strike Rates

In [62]:
# Top 10 strike rates
df['strike_rate'] = pd.to_numeric(df['strike_rate'], errors='coerce')
top_sr = df.sort_values('strike_rate', ascending=False).head(10)
fig_sr = px.bar(
    top_sr,
    x='player',
    y='strike_rate',
    title='Top 10 Strike Rates'
)
fig_sr.update_layout(xaxis_title='Player', yaxis_title='Strike Rate', xaxis_tickangle=-45)
fig_sr.show()

The "Top 10 Strike Rates" bar chart highlights the most explosive batsmen in the Nepal Premier League 2024. JDS Neesham (JAB) leads with a remarkable strike rate, followed by NB Sarki (CHR) and AGS Gous (POA), showcasing their ability to score runs quickly and put pressure on the opposition.
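
A raw strike-rate ranking is sensitive to sample size, because a batter who faced only a handful of balls can outrank established hitters. The sketch below applies a minimum-balls filter before ranking. The 50-ball cut-off, the `toy` frame, and the player labels are all hypothetical and chosen only for illustration.

```python
import pandas as pd

# Hypothetical rows: "Batter B" has a huge strike rate from only 8 balls.
toy = pd.DataFrame({
    "player": ["Batter A", "Batter B", "Batter C", "Batter D"],
    "balls_faced": [126, 8, 211, 60],
    "strike_rate": [196.03, 250.00, 135.54, 150.00],
})

MIN_BALLS = 50  # assumed qualification threshold, not taken from the notebook

# Keep only batters with a meaningful sample, then rank by strike rate.
qualified = toy[toy["balls_faced"] >= MIN_BALLS]
top_sr_qualified = qualified.sort_values("strike_rate", ascending=False)

print(top_sr_qualified)
```

The same filter could be applied to `df` before building `top_sr`, provided `balls_faced` has first been converted with `pd.to_numeric`.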

### Most Hundreds

In [13]:
df['hundreds_scored'] = pd.to_numeric(df['hundreds_scored'], errors='coerce').fillna(0)
top_hundreds = df.sort_values('hundreds_scored', ascending=False).head(10)

In [63]:
fig_hundreds = px.bar(top_hundreds, x='player', y='hundreds_scored', title='Most Hundreds')
fig_hundreds.update_layout(xaxis_title='Player', yaxis_title='Hundreds', xaxis_tickangle=-45)
fig_hundreds.show()

The "Most Hundreds" bar chart displays the players who have scored the highest number of centuries in the Nepal Premier League 2024. **AGS Gous (POA)** stands out as the only player with a century, highlighting the rarity of hundreds in this tournament. Other top performers have contributed consistently but have not reached the three-figure mark.

### Most Fifties

In [15]:
df['fifties_scored'] = pd.to_numeric(df['fifties_scored'], errors='coerce').fillna(0)
top_fifties = df.sort_values('fifties_scored', ascending=False).head(10)

In [64]:
fig_fifties = px.bar(top_fifties, x='player', y='fifties_scored', title='Most Fifties')
fig_fifties.update_layout(xaxis_title='Player', yaxis_title='Fifties', xaxis_tickangle=-45)
fig_fifties.show()


The "Most Fifties" bar chart showcases the players with the highest number of half-centuries in the Nepal Premier League 2024. **RS Bopara (CHR)** leads with 3 fifties, while several others, including SA Zaib (SPR), RK Paudel (LUL), and BKEL Milantha (JAB), have contributed multiple fifties, highlighting their consistency and key roles for their teams.

### Most Boundaries

In [65]:
df['boundary_fours'] = pd.to_numeric(df['boundary_fours'], errors='coerce').fillna(0)
df['boundary_sixes'] = pd.to_numeric(df['boundary_sixes'], errors='coerce').fillna(0)
df['total_boundaries'] = df['boundary_fours'] + df['boundary_sixes']
top_boundaries = df.sort_values('total_boundaries', ascending=False).head(10)
fig_boundaries = px.bar(top_boundaries, x='player', y='total_boundaries', title='Most Boundaries (Fours + Sixes)')
fig_boundaries.update_layout(xaxis_title='Player', yaxis_title='Total Boundaries', xaxis_tickangle=-45)
fig_boundaries.show()

The "Most Boundaries (Fours + Sixes)" bar chart highlights the players who have hit the highest number of boundaries in the Nepal Premier League 2024. BKEL Milantha (JAB) leads the chart, followed by AGS Gous (POA) and RK Paudel (LUL), showcasing their aggressive batting and ability to find the fence consistently.
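
Counting fours and sixes together treats both kinds of boundary as equal. A complementary view weights them by runs. The sketch below does this for the five rows shown in the `df.head()` output; the `boundary_runs` and `boundary_share` column names are illustrative assumptions.

```python
import pandas as pd

# Rows copied from the df.head() output shown earlier.
sample = pd.DataFrame({
    "player": ["BKEL Milantha (JAB)", "RS Bopara (CHR)", "RK Paudel (LUL)",
               "SA Zaib (SPR)", "JDS Neesham (JAB)"],
    "runs": [293, 286, 279, 275, 247],
    "boundary_fours": [35, 24, 26, 24, 14],
    "boundary_sixes": [9, 11, 11, 12, 23],
})

# Runs scored in boundaries: four per four, six per six.
sample["boundary_runs"] = 4 * sample["boundary_fours"] + 6 * sample["boundary_sixes"]

# Share of each player's total runs that came from boundaries.
sample["boundary_share"] = sample["boundary_runs"] / sample["runs"]

print(sample[["player", "boundary_runs", "boundary_share"]].round(3))
```

On these rows Milantha hits more boundaries by count (44 against 37), but Neesham scores the same number of boundary runs from a smaller run total, so his boundary share is higher.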

### Top Batting Runs

In [66]:
df['runs'] = pd.to_numeric(df['runs'], errors='coerce')
fig = px.bar(
    df,
    x='player',
    y='runs',
    title='Top Batting Runs'
)
fig.update_layout(xaxis_title='Player', yaxis_title='Runs', xaxis_tickangle=-45)
fig.show()

Unlike the top-10 chart above, this chart plots total runs for every player in the dataset, giving a quick comparison of individual batting contributions across the whole tournament.

### Run Vs Strike Rate

In [68]:
df_short = df.copy()  # Work on a copy so the main DataFrame is left unchanged

# Use the first run of letters in the player name (usually the initials) as a short label
df_short['short_name'] = df_short['player'].str.extract(r'([A-Za-z]+)')

fig_runs_sr = px.scatter(
    df_short,
    x='strike_rate',
    y='runs',
    text='short_name',
    title='Runs vs Strike Rate (All Players)',
    labels={'strike_rate': 'Strike Rate', 'runs': 'Total Runs'},
    hover_data=['player', 'matches_played', 'batting_average']
)
fig_runs_sr.update_traces(textposition='top center')
fig_runs_sr.update_layout(xaxis_tickangle=-45)
fig_runs_sr.show()

The "Runs vs Strike Rate (All Players)" scatter plot provides a comprehensive visualization of batting performance in the Nepal Premier League 2024. Each point represents a player, with the x-axis showing their strike rate and the y-axis displaying total runs scored. Player short names are annotated for quick identification.

- **High Strike Rate & High Runs:** Players in the upper-right quadrant, such as JDS Neesham (JAB), combine aggressive scoring with consistency, making them valuable assets for their teams.
- **High Runs, Moderate Strike Rate:** Players like BKEL Milantha (JAB) and RS Bopara (CHR) have accumulated significant runs, even if their strike rates are not the highest, indicating reliability and the ability to anchor innings.
- **High Strike Rate, Lower Runs:** Some players exhibit explosive batting in limited opportunities, reflected by high strike rates but fewer total runs.
- **Distribution Insight:** The spread of points highlights the diversity in batting roles—some players focus on quick scoring, while others prioritize building larger totals.

This visualization helps identify not just the top scorers, but also the most impactful and dynamic batsmen in the tournament, offering insights into team strategies and individual contributions.

### Distribution of Batting Averages

In [71]:
fig_avg_dist = px.histogram(
    df_avg,
    x='batting_average',
    nbins=15,
    title='Distribution of Batting Averages',
    labels={'batting_average': 'Batting Average'}
)
fig_avg_dist.show()

The "Distribution of Batting Averages" histogram provides an overview of how batting averages are spread among players in the Nepal Premier League 2024. Most players have averages clustered in the lower to mid-range, with only a few achieving high consistency. This visualization highlights the rarity of exceptionally high averages and the overall competitive balance among batsmen in the tournament.

### KDE of Batting Averages

In [72]:
fig_avg_kde = px.density_contour(
    df_avg,
    x='batting_average',
    title='KDE of Batting Averages',
    labels={'batting_average': 'Batting Average'}
)
fig_avg_kde.update_traces(contours_coloring="fill", contours_showlabels = True)
fig_avg_kde.show()


The "KDE of Batting Averages" plot provides a smoothed visualization of the distribution of batting averages among all players in the Nepal Premier League 2024. Unlike a histogram, the Kernel Density Estimate (KDE) highlights the underlying probability density, making it easier to identify where most players' averages are concentrated and to spot outliers.

- **Main Cluster:** The majority of players have batting averages in the lower to mid-range, indicating that high consistency is rare and most batsmen face challenges in maintaining high averages.
- **Long Tail:** There are a few players with exceptionally high averages, as shown by the tail on the right side of the plot, representing standout performers.
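
`px.density_contour` draws contours of binned counts rather than fitting a kernel, so a one-dimensional kernel density estimate may be worth computing directly as a cross-check. The following is a minimal sketch of a Gaussian KDE written with NumPy. The function name `gaussian_kde_1d`, the normal-reference bandwidth rule, and the five sample averages (taken from the `df.head()` output) are assumptions made for illustration.

```python
import numpy as np

def gaussian_kde_1d(samples, grid, bandwidth=None):
    """Evaluate a Gaussian kernel density estimate of `samples` on `grid`."""
    samples = np.asarray(samples, dtype=float)
    grid = np.asarray(grid, dtype=float)
    n = samples.size
    if bandwidth is None:
        # Normal-reference rule of thumb: h = 1.06 * sigma * n^(-1/5)
        bandwidth = 1.06 * samples.std(ddof=1) * n ** (-1 / 5)
    # Standardised distance from every grid point to every sample.
    z = (grid[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * z ** 2) / np.sqrt(2 * np.pi)
    # Average the kernels and rescale so the curve integrates to one.
    return kernel.sum(axis=1) / (n * bandwidth)

# Batting averages of the five players shown in df.head().
averages = [32.55, 40.85, 39.85, 30.55, 35.28]
grid = np.linspace(0, 80, 801)
density = gaussian_kde_1d(averages, grid)

# Location of the highest density on the grid.
peak = grid[density.argmax()]
print(f"Density peaks near an average of {peak:.1f}")
```

The resulting `density` array can be drawn with `px.line(x=grid, y=density)`. With the full `df_avg['batting_average']` column as input, the curve would show the main cluster and the right-hand tail described above.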

### Distribution of Fifties Scored

In [73]:
fifty_bins = pd.cut(df['fifties_scored'], bins=[-0.1, 0.9, 1.9, 2.9, df['fifties_scored'].max()], labels=['0', '1', '2', '3+'])
fifty_counts = fifty_bins.value_counts().sort_index()
fig_fifties_pie = px.pie(
    names=fifty_counts.index,
    values=fifty_counts.values,
    title='Distribution of Fifties Scored'
)
fig_fifties_pie.show()

The "Distribution of Fifties Scored" pie chart illustrates how many players have scored 0, 1, 2, or 3+ fifties in the Nepal Premier League 2024. The majority of players (53 out of 75) have not scored a fifty, while only one player has achieved 3 or more. This highlights the rarity of multiple fifties and the competitive nature of batting performances in the tournament.
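
The binning in the cell above uses the observed maximum as its last edge, which works only when at least one player has three or more fifties; otherwise the edges are no longer increasing and `pd.cut` raises an error. The sketch below uses an open-ended last bin instead. The `fifties` series is toy data, not the tournament figures.

```python
import pandas as pd

# Toy counts of fifties per player (hypothetical values).
fifties = pd.Series([0, 0, 1, 2, 3, 0, 1, 0])

# An infinite upper edge keeps the "3+" bin valid whatever the maximum is.
bins = [-0.1, 0.9, 1.9, 2.9, float("inf")]
labels = ["0", "1", "2", "3+"]

fifty_groups = pd.cut(fifties, bins=bins, labels=labels)

# Count players per bin, keeping the bins in their natural order.
counts = fifty_groups.value_counts().sort_index()
print(counts)
```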

### Top 10 Players with Most Ducks

In [75]:
df['ducks_scored'] = pd.to_numeric(df['ducks_scored'], errors='coerce').fillna(0)
top_ducks = df.sort_values('ducks_scored', ascending=False).head(10)
fig_ducks = px.bar(
    top_ducks,
    x='player',
    y='ducks_scored',
    title='Top 10 Players with Most Ducks',
)
fig_ducks.update_layout(xaxis_title='Player', yaxis_title='Ducks', xaxis_tickangle=-45)
fig_ducks.show()

The "Top 10 Players with Most Ducks" bar chart highlights the batsmen who have been dismissed for zero the most times in the Nepal Premier League 2024. 

- **Saad Bin Zafar (LUL)** leads the list with 3 ducks, indicating a challenging season with the bat.
- Several players, including Gulsan Jha (KAY), D Khanal (KAY), S Malla (JAB), and K Bhurtel (POA), have each recorded 2 ducks.
- The remaining players in the top 10 have 1 duck each, showing that while ducks are not uncommon, only a few players have struggled repeatedly.

This visualization provides insight into which players have faced difficulties getting off the mark, which can impact both individual confidence and team performance.

### Top 10 Strike Rates (Lollipop Chart)

In [76]:
fig_lollipop = go.Figure()
fig_lollipop.add_trace(go.Scatter(
    x=top_sr['strike_rate'],
    y=top_sr['player'],
    mode='markers+lines',
    marker=dict(size=12, color=top_sr['strike_rate'], colorscale='Blues', showscale=True),
    line=dict(color='gray', width=2),
    name='Strike Rate'
))
fig_lollipop.update_layout(
    title='Top 10 Strike Rates (Lollipop Chart)',
    xaxis_title='Strike Rate',
    yaxis_title='Player',
    yaxis=dict(categoryorder='total ascending')
)
fig_lollipop.show()

The lollipop chart visualizes the top 10 players with the highest strike rates in the Nepal Premier League 2024. It highlights the most aggressive batsmen, making it easy to compare their ability to score runs quickly. JDS Neesham (JAB) leads the chart, followed by NB Sarki (CHR) and AGS Gous (POA), showcasing their explosive batting performances.

## Bowling Analysis

In [77]:
# Load the main bowling data for analysis
bowling_file = os.path.join(os.getcwd(), "Bowling Records", "most_wickets.csv")
df_bowl = pd.read_csv(bowling_file)

In [78]:
df_bowl.head()

Unnamed: 0,player,span,matches_played,innings_bowled,balls,overs,maidens_earned,conceded,wicket,best_innings_bowling,bowling_average,economy_rate,strike_rate,four_wickets,five_wickets
0,SC Kuggeleijn (SPR),2024-2024,9,9,173,28.5,1,210,17,5/18,12.35,7.28,10.17,-,1
1,LN Rajbanshi (JAB),2024-2024,10,10,240,40.0,5,213,17,3/8,12.52,5.32,14.11,-,-
2,Sohail Tanvir (CHR),2024-2024,8,8,182,30.2,1,174,14,5/21,12.42,5.73,13.0,-,1
3,K Mahato (JAB),2024-2024,10,10,198,33.0,1,265,14,4/35,18.92,8.03,14.14,1,-
4,BP Sharma (KAY),2024-2024,8,8,174,29.0,2,171,13,3/17,13.15,5.89,13.38,-,-


### Top 10 Wicket Takers

In [79]:
df_bowl['wicket'] = pd.to_numeric(df_bowl['wicket'], errors='coerce')
top_wickets = df_bowl.sort_values('wicket', ascending=False).head(10)
fig_bowl_wickets = px.bar(
    top_wickets,
    x='player',
    y='wicket',
    title='Top 10 Wicket Takers',
    labels={'player': 'Player', 'wicket': 'Wickets'}
)
fig_bowl_wickets.update_layout(xaxis_tickangle=-45)
fig_bowl_wickets.show()

The bar chart above highlights the top 10 wicket-takers in the Nepal Premier League 2024. SC Kuggeleijn (SPR) and LN Rajbanshi (JAB) lead the list with 17 wickets each, followed by Sohail Tanvir (CHR) and K Mahato (JAB) with 14 wickets each. The visualization provides a clear comparison of the most successful bowlers in the tournament.

### Best Economy Rates

In [80]:
# Best economy rates (top 10, minimum 20 overs bowled)
if 'overs' in df_bowl.columns and 'economy_rate' in df_bowl.columns:
    df_bowl['overs'] = pd.to_numeric(df_bowl['overs'], errors='coerce')
    df_bowl['economy_rate'] = pd.to_numeric(df_bowl['economy_rate'], errors='coerce')
    econ_bowlers = df_bowl[df_bowl['overs'] >= 20].sort_values('economy_rate').head(10)
    fig_bowl_economy = px.bar(
        econ_bowlers,
        x='economy_rate',
        y='player',
        orientation='h',
        title='Top 10 Best Economy Rates (Min 20 Overs)',
        labels={'player': 'Player', 'economy_rate': 'Economy Rate'}
    )
    fig_bowl_economy.show()

The bar chart above highlights the top 10 bowlers with the best economy rates (minimum 20 overs bowled) in the Nepal Premier League 2024. It showcases the most economical bowlers who have effectively restricted the scoring rate, making them valuable assets for their teams.
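
The `overs` column uses cricket notation, where `28.5` means 28 overs and 5 balls rather than 28.5 overs. That is harmless for the 20-over threshold above, but it matters when deriving rates. The sketch below converts the notation to balls and recomputes the economy rate for the five rows shown in `df_bowl.head()`. The helper `overs_to_balls` and the `*_calc` columns are illustrative assumptions, and six-ball overs are assumed.

```python
import pandas as pd

def overs_to_balls(overs: float) -> int:
    """Convert cricket overs notation (28.5 = 28 overs, 5 balls) to balls."""
    whole, extra = divmod(round(overs * 10), 10)
    return whole * 6 + extra

# Rows copied from the df_bowl.head() output.
bowl = pd.DataFrame({
    "player": ["SC Kuggeleijn (SPR)", "LN Rajbanshi (JAB)", "Sohail Tanvir (CHR)",
               "K Mahato (JAB)", "BP Sharma (KAY)"],
    "overs": [28.5, 40.0, 30.2, 33.0, 29.0],
    "conceded": [210, 213, 174, 265, 171],
})

bowl["balls_calc"] = bowl["overs"].apply(overs_to_balls)

# Economy rate = runs conceded per six-ball over.
bowl["economy_calc"] = bowl["conceded"] / (bowl["balls_calc"] / 6)

print(bowl[["player", "balls_calc", "economy_calc"]].round(2))
```

The converted ball counts match the `balls` column in the source table, and the recomputed economy rates agree with `economy_rate` to within 0.01.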

### Best Bowling Averages

In [81]:
if 'bowling_average' in df_bowl.columns:
    df_bowl['bowling_average'] = pd.to_numeric(df_bowl['bowling_average'], errors='coerce')
    avg_bowlers = df_bowl[df_bowl['wicket'] >= 5].sort_values('bowling_average').head(10)
    fig_bowl_average = px.bar(
        avg_bowlers,
        x='player',
        y='bowling_average',
        title='Top 10 Best Bowling Averages (Min 5 Wickets)',
        labels={'player': 'Player', 'bowling_average': 'Bowling Average'}
    )
    fig_bowl_average.update_layout(xaxis_tickangle=-45)
    fig_bowl_average.show()

The chart displays the top 10 bowlers with the lowest bowling averages (minimum 5 wickets) in the Nepal Premier League 2024, emphasizing the most efficient wicket-takers of the tournament.

### Most Four and Five Wicket Hauls in an Innings

In [83]:
# Most four-wicket hauls in an innings
if 'four_wickets' in df_bowl.columns:
    df_bowl['four_wickets'] = pd.to_numeric(df_bowl['four_wickets'], errors='coerce').fillna(0)
    top_4w_innings = df_bowl.sort_values('four_wickets', ascending=False).head(5)
    fig_4w_innings = px.bar(
        top_4w_innings,
        x='player',
        y='four_wickets',
        title='Most Four-Wicket Hauls in an innings',
        labels={'player': 'Player', 'four_wickets': '4W Hauls'}
    )
    fig_4w_innings.update_layout(xaxis_tickangle=-45)
    fig_4w_innings.update_layout(xaxis_title='Player', yaxis_title='4W Hauls (Innings)')
    fig_4w_innings.show()

# Most five-wicket hauls in an innings
if 'five_wickets' in df_bowl.columns:
    df_bowl['five_wickets'] = pd.to_numeric(df_bowl['five_wickets'], errors='coerce').fillna(0)
    top_5w_innings = df_bowl.sort_values('five_wickets', ascending=False).head(5)
    fig_5w_innings = px.bar(
        top_5w_innings,
        x='player',
        y='five_wickets',
        title='Most Five-Wicket Hauls in an innings',
        labels={'player': 'Player', 'five_wickets': '5W Hauls'}
    )
    fig_5w_innings.update_layout(xaxis_tickangle=-45)
    fig_5w_innings.update_layout(xaxis_title='Player', yaxis_title='5W Hauls (Innings)')
    fig_5w_innings.show()


The bar charts above showcase the bowlers with the most four-wicket and five-wicket hauls in an innings during the Nepal Premier League 2024. Pratis GC (BIK) leads with two four-wicket hauls, while SC Kuggeleijn (SPR), Sohail Tanvir (CHR), H Thaker (JAB), WG Bosisto (KAY), and N Saud (SPR) each have a five-wicket haul, highlighting their match-winning performances with the ball.

## Fielding Analysis

In [84]:
# Load the most_catches.csv fielding data
fielding_file = os.path.join(os.getcwd(), "Fielding Records", "most_catches.csv")
df_field = pd.read_csv(fielding_file)

In [85]:
df_field.head()

Unnamed: 0,player,Span,matches_played,innings,catches,Maximum_innings_catches,Catches_per_innings
0,AK Sah (JAB),2024-2024,7,7,12,5,1.714
1,B McMullen (SPR),2024-2024,9,9,10,2,1.111
2,NB Budayair (SPR),2024-2024,9,9,9,3,1.0
3,Saad Bin Zafar (LUL),2024-2024,7,7,6,2,0.857
4,Gulsan Jha (KAY),2024-2024,9,9,6,4,0.666


### Top 10 Fielders by Catches

In [87]:
top_catchers = df_field.sort_values('catches', ascending=False).head(10)
fig_catchers = px.bar(
    top_catchers,
    x='player',
    y='catches',
    title='Top 10 Fielders by Catches',
    labels={'player': 'Player', 'catches': 'Catches'}
)
fig_catchers.update_layout(xaxis_tickangle=-45)
fig_catchers.show()

The bar chart above highlights the top 10 fielders with the most catches in the Nepal Premier League 2024. 

- **AK Sah (JAB)** leads the list with 12 catches in 7 matches, followed by **B McMullen (SPR)** with 10 and **NB Budayair (SPR)** with 9 catches in 9 matches.
- Several players, including **Saad Bin Zafar (LUL)**, Gulsan Jha (KAY), JDS Neesham (JAB), and I Pandey (SPR), follow with 6 catches each, showcasing their sharp reflexes and fielding skills.
- The chart also features fielders from a range of teams, reflecting the importance of fielding contributions across the tournament.

This visualization provides a quick comparison of the leading fielders, emphasizing those who have made the most impact with their catching ability and underlining the value of strong fielding in T20 cricket.

### Distribution of Catches among Players

In [88]:
fig_catch_dist = px.histogram(
    df_field,
    x='catches',
    nbins=10,
    title='Distribution of Catches Among Players',
    labels={'catches': 'Number of Catches'},
    color_discrete_sequence=['#ff6f61']  
)
fig_catch_dist.update_traces(marker_color='rgba(0,123,255,0.8)', selector=dict(type='bar'))
fig_catch_dist.update_layout(
    plot_bgcolor='rgba(245,245,255,1)',
    paper_bgcolor='rgba(230,240,255,1)',
    font=dict(family='Segoe UI', size=14, color='#222'),
    title_font=dict(size=22, family='Segoe UI', color='#007bff'),
    xaxis=dict(gridcolor='rgba(200,220,255,0.5)'),
    yaxis=dict(gridcolor='rgba(200,220,255,0.5)'),
    bargap=0.2
)
fig_catch_dist.show()

The histogram above illustrates the distribution of catches among all fielders in the Nepal Premier League 2024. Most players have taken between 1 and 4 catches, while only a few have achieved higher totals, highlighting the standout fielders in the tournament.

### Catches vs Matches Played

In [89]:
if 'matches_played' in df_field.columns:
    fig_catch_scatter = px.scatter(
        df_field,
        x='matches_played',
        y='catches',
        color='player',
        symbol='player',
        title='Catches vs Matches Played',
        labels={'matches_played': 'Matches Played', 'catches': 'Catches'},
        hover_data=['player']
    )
    # Markers only: player names are available on hover and in the legend
    fig_catch_scatter.update_layout(legend_title_text='Player')
    fig_catch_scatter.show()

The "Catches vs Matches Played" scatter plot visualizes the relationship between the number of matches played and catches taken by each fielder in the Nepal Premier League 2024. It highlights the top fielders who have consistently contributed with catches, allowing for quick identification of standout performers in the field.

## Partnership Analysis

In [90]:
# Load partnership records by runs and wickets
partnership_runs_file = os.path.join(os.getcwd(), "Partnership Records", "highest_partnerships_by_runs.csv")
partnership_wickets_file = os.path.join(os.getcwd(), "Partnership Records", "highest_partnerships_by_wickets.csv")

In [91]:
df_partnership_runs = pd.read_csv(partnership_runs_file)
df_partnership_wickets = pd.read_csv(partnership_wickets_file)

In [92]:
df_partnership_runs.head()

Unnamed: 0,partners,runs,wickets,team,opposition,ground,match_date
0,"D Kharel, AGS Gous",176*,1st,Pokhara,v Lumbini,Kirtipur,6 Dec 2024
1,"SA Zaib, B Bhandari",117,1st,S Paschim,v Janakpur,Kirtipur,21 Dec 2024
2,"H Thaker, BKEL Milantha",116*,3rd,Janakpur,v Biratnagar,Kirtipur,30 Nov 2024
3,"Saad Bin Zafar, RK Paudel",114,5th,Lumbini,v Chitwan,Kirtipur,10 Dec 2024
4,"Kushal Malla, RS Bopara",113,4th,Chitwan,v Janakpur,Kirtipur,14 Dec 2024


In [93]:
df_partnership_wickets.head()

Unnamed: 0,wicketes,runs,partners,team,opposition,ground,match_date
0,1st,176*,"D Kharel, AGS Gous",Pokhara,v Lumbini,Kirtipur,6 Dec 2024
1,2nd,86,"B McMullen, B Bhandari",S Paschim,v Kathmandu,Kirtipur,5 Dec 2024
2,3rd,116*,"H Thaker, BKEL Milantha",Janakpur,v Biratnagar,Kirtipur,30 Nov 2024
3,4th,113,"Kushal Malla, RS Bopara",Chitwan,v Janakpur,Kirtipur,14 Dec 2024
4,5th,114,"Saad Bin Zafar, RK Paudel",Lumbini,v Chitwan,Kirtipur,10 Dec 2024


In [94]:
print("Columns in partnership by runs:", df_partnership_runs.columns.tolist())
print("Columns in partnership by wickets:", df_partnership_wickets.columns.tolist())

Columns in partnership by runs: ['partners', 'runs', 'wickets', 'team', 'opposition', 'ground', 'match_date']
Columns in partnership by wickets: ['wicketes', 'runs', 'partners', 'team', 'opposition', 'ground', 'match_date']


In [96]:
# remove asterisk from runs column
df_partnership_runs["runs"] = df_partnership_runs["runs"].astype(str).str.replace('*', '', regex=False)
df_partnership_wickets["runs"] = df_partnership_wickets["runs"].astype(str).str.replace('*', '', regex=False)
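
Stripping the asterisk discards the information that a partnership was unbroken. The sketch below keeps that information in a separate boolean column before converting the runs to numbers. The `unbroken` column name and the `partnerships` frame are illustrative assumptions; the raw values are the five shown in the `df_partnership_runs.head()` output.

```python
import pandas as pd

# Raw 'runs' values as they appear in df_partnership_runs.head().
raw_runs = pd.Series(["176*", "117", "116*", "114", "113"])

# A trailing asterisk marks an unbroken (not out) partnership.
unbroken = raw_runs.str.endswith("*")

# Remove the marker, then convert to numbers.
runs_numeric = pd.to_numeric(raw_runs.str.rstrip("*"), errors="coerce")

partnerships = pd.DataFrame({"runs": runs_numeric, "unbroken": unbroken})
print(partnerships)
```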

### Top 10 Highest Partnerships by Runs

In [97]:
df_partnership_runs['runs'] = pd.to_numeric(df_partnership_runs['runs'], errors='coerce')

fig_partnership_runs = px.bar(
    df_partnership_runs.head(10),
    x='partners',
    y='runs',
    color='wickets',
    title='Top 10 Highest Partnerships by Runs',
    labels={'partners': 'Partnership', 'runs': 'Runs', 'wickets': 'Wickets'}
)
fig_partnership_runs.update_layout(xaxis_tickangle=-45)
fig_partnership_runs.show()

The "Top 10 Highest Partnerships by Runs" bar chart highlights the most prolific batting pairs in the Nepal Premier League 2024. It showcases which partnerships contributed the most runs, emphasizing the importance of strong collaborations at the crease for building big totals and shifting match momentum.

### Distribution of Partnership Runs

In [98]:
fig_runs_dist = px.histogram(
    df_partnership_runs,
    x='runs',
    nbins=20,
    title='Distribution of Partnership Runs',
    labels={'runs': 'Runs'}
)
fig_runs_dist.show()


The "Distribution of Partnership Runs" histogram visualizes how frequently different run totals occur in partnerships during the Nepal Premier League 2024. Most partnerships contribute modest runs, while high-scoring stands are less common, highlighting the challenge of building big partnerships in the tournament.

### Distribution of Partnership Runs by Team

In [99]:
if 'team' in df_partnership_runs.columns:
    fig_team = px.box(
        df_partnership_runs,
        x='team',
        y='runs',
        title='Distribution of Partnership Runs by Team',
        labels={'team': 'Team', 'runs': 'Runs'}
    )
    fig_team.update_layout(xaxis_tickangle=-45)
    fig_team.show()


The "Distribution of Partnership Runs by Team" box plot visualizes how partnership run totals vary across different teams in the Nepal Premier League 2024. It highlights the range, median, and spread of partnership contributions for each team, allowing for quick comparison of team batting depth and consistency.

### Frequency of Partnerships by Wicket

In [100]:
fig_wicket_freq = px.histogram(
    df_partnership_runs,
    x='wickets',
    title='Frequency of Partnerships by Wicket',
    labels={'wickets': 'Wicket'}
)
fig_wicket_freq.show()

The "Frequency of Partnerships by Wicket" histogram shows how often partnerships occur at each wicket position in the Nepal Premier League 2024. It highlights which wickets see the most and least frequent partnerships, providing insight into team batting stability and collapse points.

## Wicketkeeping Analysis

In [101]:
# Load the most_dismissals.csv wicketkeeping data
dismissals_file = os.path.join(os.getcwd(), "WicketKeepingRecords", "most_dismissals.csv")
df_dismissals = pd.read_csv(dismissals_file)


In [102]:
df_dismissals.head()

Unnamed: 0,player,span,matches_played,innings_as_keeper,dismissed,Caught_as_a_keeper,stumpings,maximum_dismissals_per_innings,dismissials_per_innings
0,BKEL Milantha (JAB),2024-2024,10,8,5,1,4,2 (1ct 1st),0.625
1,B Rawal (CHR),2024-2024,8,8,5,5,-,1 (1ct 0st),0.625
2,A Saud (LUL),2024-2024,6,6,4,1,3,2 (0ct 2st),0.666
3,B Bhandari (SPR),2024-2024,9,9,4,1,3,2 (1ct 1st),0.444
4,CAK Walton (KAY),2024-2024,9,9,4,4,-,1 (1ct 0st),0.444


In [103]:
# Convert the count columns to numeric; '-' placeholders become NaN
for col in ['dismissed', 'matches_played', 'Caught_as_a_keeper', 'stumpings']:
    if col in df_dismissals.columns:
        df_dismissals[col] = pd.to_numeric(df_dismissals[col], errors='coerce')

### Top 10 Wicketkeepers by Dismissals

In [104]:
top_dismissals = df_dismissals.sort_values('dismissed', ascending=False).head(10)
fig_dismissals = px.bar(
    top_dismissals,
    x='player',
    y='dismissed',
    title='Top 10 Wicketkeepers by Total Dismissals',
    labels={'player': 'Player', 'dismissed': 'Total Dismissals'}
)
fig_dismissals.update_layout(xaxis_tickangle=-45)
fig_dismissals.show()

The bar chart displays the top 10 wicketkeepers with the most dismissals in the Nepal Premier League 2024. BKEL Milantha (JAB) and B Rawal (CHR) lead with 5 dismissals each, followed by A Saud (LUL), B Bhandari (SPR), and CAK Walton (KAY) with 4 each. This highlights the key contributors behind the stumps, emphasizing their impact on the fielding side’s success.
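
A keeper's total dismissals should equal catches plus stumpings, and the per-innings figure is that total divided by innings kept. The sketch below checks both relationships on the five rows shown in `df_dismissals.head()`. It assumes `-` means zero; the `keepers` frame and the `*_calc` columns are illustrative names.

```python
import pandas as pd

# Rows copied from the df_dismissals.head() output; '-' is assumed to mean zero.
keepers = pd.DataFrame({
    "player": ["BKEL Milantha (JAB)", "B Rawal (CHR)", "A Saud (LUL)",
               "B Bhandari (SPR)", "CAK Walton (KAY)"],
    "innings_as_keeper": [8, 8, 6, 9, 9],
    "dismissed": [5, 5, 4, 4, 4],
    "Caught_as_a_keeper": ["1", "5", "1", "1", "4"],
    "stumpings": ["4", "-", "3", "3", "-"],
})

catches = pd.to_numeric(keepers["Caught_as_a_keeper"], errors="coerce").fillna(0)
stumpings = pd.to_numeric(keepers["stumpings"], errors="coerce").fillna(0)

# Total dismissals rebuilt from their two components.
keepers["dismissed_calc"] = (catches + stumpings).astype(int)

# Dismissals per innings kept.
keepers["per_innings_calc"] = keepers["dismissed"] / keepers["innings_as_keeper"]

print(keepers[["player", "dismissed", "dismissed_calc", "per_innings_calc"]].round(3))
```

On these rows the rebuilt totals match the `dismissed` column exactly, which suggests that treating `-` as zero is a reasonable reading of the source data.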

### Distribution of Dismissals Among Wicketkeepers

In [105]:

fig_dismissals_dist = px.histogram(
    df_dismissals,
    x='dismissed',
    nbins=10,
    title='Distribution of Dismissals Among Wicketkeepers',
    labels={'dismissed': 'Number of Dismissals'}
)
fig_dismissals_dist.show()

The graph displays the distribution of dismissals among wicketkeepers in the Nepal Premier League 2024. Most wicketkeepers have a low number of dismissals, with only a few achieving higher totals, highlighting the standout performers behind the stumps.

### Top 10 Wicketkeepers by Catches

In [106]:
# Top 10 wicketkeepers by catches
if 'Caught_as_a_keeper' in df_dismissals.columns:
    # Rank all keepers by catches, not only the top 10 by total dismissals
    df_dismissals['Caught_as_a_keeper'] = pd.to_numeric(df_dismissals['Caught_as_a_keeper'], errors='coerce')
    fig_keeper_catches = px.bar(
        df_dismissals.sort_values('Caught_as_a_keeper', ascending=False).head(10),
        x='player',
        y='Caught_as_a_keeper',
        title='Top 10 Wicketkeepers by Catches',
        labels={'player': 'Player', 'Caught_as_a_keeper': 'Catches as Keeper'}
    )
    fig_keeper_catches.update_layout(xaxis_tickangle=-45)
    fig_keeper_catches.show()


The bar chart above displays the top 10 wicketkeepers with the most catches taken specifically as a keeper in the Nepal Premier League 2024. 

- **B Rawal (CHR)** leads with 5 catches as a wicketkeeper, demonstrating sharp reflexes and reliable glovework behind the stumps.
- Other notable performers include **Aasif Sheikh (JAB)** and **SA Edwards (BIK)**, each contributing multiple catches as keepers for their teams.
- The chart highlights the importance of wicketkeeping skills in contributing to team fielding success, with several players making key contributions through catches behind the wicket.

This visualization provides a clear comparison of the leading wicketkeepers in terms of their catching ability, emphasizing their impact on match outcomes through crucial dismissals.


### Top 10 Wicketkeepers by Stumpings

In [107]:
# Top 10 wicketkeepers by stumpings
if 'stumpings' in df_dismissals.columns:
    # Rank all keepers by stumpings, not only the top 10 by total dismissals
    df_dismissals['stumpings'] = pd.to_numeric(df_dismissals['stumpings'], errors='coerce')
    fig_keeper_stumpings = px.bar(
        df_dismissals.sort_values('stumpings', ascending=False).head(10),
        x='player',
        y='stumpings',
        title='Top 10 Wicketkeepers by Stumpings',
        labels={'player': 'Player', 'stumpings': 'Stumpings'}
    )
    fig_keeper_stumpings.update_layout(xaxis_tickangle=-45)
    fig_keeper_stumpings.show()

The bar chart above highlights the top 10 wicketkeepers with the most stumpings in the Nepal Premier League 2024. Stumpings are a key indicator of a wicketkeeper's agility and quick reflexes behind the stumps. The visualization showcases those keepers who have made the greatest impact by effecting stumpings, underlining their crucial role in supporting the bowling attack and creating wicket-taking opportunities. This comparison helps identify the most alert and skillful wicketkeepers of the tournament.

In [108]:
# Load ball-by-ball stats for the final
final_ball_by_ball_file = os.path.join(os.getcwd(), "Final Tables", "npl_final.csv")
df_final_ball = pd.read_csv(final_ball_by_ball_file)

In [109]:
print("Columns in final ball-by-ball data:", df_final_ball.columns.tolist())


Columns in final ball-by-ball data: ['match_id', 'inning', 'batting_team', 'bowling_team', 'ball_over', 'ball_result', 'bowler', 'batsman', 'non_striker', 'shot_direction', 'player_dismissed', 'dismissal_kind', 'fielder', 'batsman_runs', 'wide_runs', 'bye_runs', 'legbye_runs', 'noball_runs', 'extra_runs', 'total_runs']
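
Before plotting, it can help to check what the file covers and whether the run columns agree with each other. The sketch below uses a few made-up rows with the column names printed above. It assumes that `total_runs` equals `batsman_runs + extra_runs` on every delivery, which the source does not state, so treat it as a check to try rather than a guaranteed property of the data.

```python
import pandas as pd

# Toy rows using the column names printed above; the values are invented.
sample = pd.DataFrame({
    "match_id": [1, 1, 1, 2],
    "inning": [1, 1, 2, 1],
    "batsman_runs": [4, 0, 1, 6],
    "extra_runs": [0, 1, 0, 0],
    "total_runs": [4, 1, 1, 6],
})

# Number of distinct matches in the file; a single match should give 1.
n_matches = sample["match_id"].nunique()

# Assumption: total_runs = batsman_runs + extra_runs on every delivery.
expected_total = sample["batsman_runs"] + sample["extra_runs"]
mismatches = sample[sample["total_runs"] != expected_total]

print(f"matches: {n_matches}, inconsistent rows: {len(mismatches)}")
```

Running the same two checks on `df_final_ball` shows whether the per-innings charts below describe one match or several.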


## Match Summary Visualizations

### Run Progression by Innings

In [110]:
# Run progression over the innings (worm plot)
if {'ball_over', 'total_runs', 'inning'}.issubset(df_final_ball.columns):
    # Running total within each innings; rows are assumed to be in delivery order
    df_final_ball['cumulative_runs'] = df_final_ball.groupby(['inning'])['total_runs'].cumsum()

    fig_worm = px.line(
        df_final_ball,
        x='ball_over',
        y='cumulative_runs',
        color='inning',
        title='Run Progression by Innings',
        labels={'ball_over': 'Over.Ball', 'cumulative_runs': 'Cumulative Runs', 'inning': 'Inning'}
    )
    fig_worm.update_traces(mode='lines+markers')
    fig_worm.update_layout(
        xaxis=dict(tickmode='linear', dtick=1),
        plot_bgcolor='rgba(245,255,245,1)',
        paper_bgcolor='rgba(230,255,230,1)',
        font=dict(family='Segoe UI', size=14, color='#222'),
        title_font=dict(size=22, family='Segoe UI', color='#28a745')
    )
    fig_worm.show()


The run progression chart illustrates the contrasting innings in the final match:

- First Innings: Steady accumulation throughout with late acceleration, reaching 163-7 in 20 overs
- Second Innings: Strong start but wickets in middle overs led to declining run rate, ending at 149-9 in 19.2 overs
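
The worm plot uses the `over.ball` value directly as the x-coordinate, so ball 0.6 and ball 1.1 sit half an over apart even though they are consecutive deliveries. A minimal sketch of one way to get an evenly spaced axis is shown below. The toy data is invented, and the sketch assumes `ball_over` starts at 0.1 and that each over has six numbered balls; extra deliveries such as wides may be numbered differently in the real file.

```python
import pandas as pd

# Hypothetical deliveries; ball_over uses over.ball notation (0.1 = first ball).
toy = pd.DataFrame({
    "match_id": [1] * 8,
    "inning": [1, 1, 1, 1, 2, 2, 2, 2],
    "ball_over": [0.5, 0.6, 1.1, 1.2, 0.1, 0.2, 0.3, 0.4],
    "total_runs": [1, 4, 0, 2, 6, 0, 1, 1],
})

# Split over.ball into its parts; round() guards against float error
# (for example, (1.2 - 1) * 10 evaluates to 1.9999999999999996).
over = toy["ball_over"].astype(int)
ball = ((toy["ball_over"] - over) * 10).round().astype(int)

# Evenly spaced x-axis in overs: 0.6 becomes 1.0 and 1.1 becomes about 1.167.
toy["overs_bowled"] = over + ball / 6

# Cumulate within each match and innings so separate innings never mix.
toy["cumulative_runs"] = toy.groupby(["match_id", "inning"])["total_runs"].cumsum()

print(toy[["inning", "ball_over", "overs_bowled", "cumulative_runs"]])
```

Grouping by `match_id` as well as `inning` makes no difference for a single match, but it keeps the running totals correct if the file holds more than one.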


### Runs Scored Per Over

In [111]:
# Runs scored per over (Bar Chart with Highlight)
if {'ball_over', 'total_runs', 'inning'}.issubset(df_final_ball.columns):
    df_final_ball['over'] = df_final_ball['ball_over'].astype(int)

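    runs_per_over = df_final_ball.groupby(['inning', 'over'])['total_runs'].sum().reset_index()
    # Cast to string so Plotly treats innings as discrete categories
    # (a numeric column would be mapped to a continuous color scale instead)
    runs_per_over['inning'] = runs_per_over['inning'].astype(str)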
    fig_runs_over = px.bar(
        runs_per_over,
        x='over',
        y='total_runs',
        color='inning',
        barmode='group',
        title='Runs Scored Per Over - Final',
        labels={'over': 'Over', 'total_runs': 'Runs', 'inning': 'Innings'},
        color_discrete_sequence=px.colors.sequential.Rainbow
    )
    fig_runs_over.update_layout(
        plot_bgcolor='rgba(255,245,245,1)',
        paper_bgcolor='rgba(255,230,230,1)',
        font=dict(family='Segoe UI', size=14, color='#222'),
        title_font=dict(size=22, family='Segoe UI', color='#e83e8c')
    )
    fig_runs_over.show()


The bar chart of runs per over reveals:

- **First Innings**: Notable acceleration in death overs (17-20), with the 18th over being most productive
- **Second Innings**: Strong early momentum with high-scoring powerplay, but declining run rate after over 15
- **Key Periods**: 
    - Highest scoring: Over 18 (First innings)
    - Lowest scoring: Over 19 (Second innings)
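
The phase comparisons above can be made explicit by grouping overs into powerplay, middle, and death phases and computing a run rate for each. The sketch below uses invented per-over totals for one innings. The phase boundaries (overs 1-6, 7-15, 16-20) and the 1-based `over` column are assumptions; the notebook's `astype(int)` conversion yields 0-based overs if `ball_over` starts at 0.1, in which case 1 would need to be added first.

```python
import pandas as pd

# Hypothetical runs per over for one innings; 'over' is 1-based here (1..20).
runs_per_over = pd.DataFrame({
    "over": list(range(1, 21)),
    "total_runs": [8, 6, 10, 7, 9, 8,              # overs 1-6
                   5, 6, 4, 7, 6, 5, 8, 6, 7,      # overs 7-15
                   9, 12, 15, 10, 11],             # overs 16-20
})

# Assumed phase boundaries: (0, 6], (6, 15], (15, 20].
runs_per_over["phase"] = pd.cut(
    runs_per_over["over"],
    bins=[0, 6, 15, 20],
    labels=["powerplay", "middle", "death"],
)

# Total runs and number of overs per phase, then runs per over.
phase_summary = (
    runs_per_over.groupby("phase", observed=True)["total_runs"]
    .agg(runs="sum", overs="count")
)
phase_summary["run_rate"] = phase_summary["runs"] / phase_summary["overs"]

print(phase_summary)
```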

### Boundary Frequency Heatmap

In [112]:
# Boundary frequency heatmap (Over vs Ball)
if {'ball_over', 'batsman_runs', 'inning'}.issubset(df_final_ball.columns):
    # Extract over and ball from ball_over
    df_final_ball['over'] = df_final_ball['ball_over'].astype(int)
    df_final_ball['ball'] = ((df_final_ball['ball_over'] - df_final_ball['over']) * 10).round().astype(int)

    df_final_ball['is_boundary'] = df_final_ball['batsman_runs'].isin([4, 6])
    
    boundary_heatmap = df_final_ball[df_final_ball['is_boundary']] \
        .groupby(['inning', 'over', 'ball']).size().reset_index(name='boundary_count')

    fig_boundary_heat = px.density_heatmap(
        boundary_heatmap,
        x='over',
        y='ball',
        z='boundary_count',
        facet_col='inning',
        color_continuous_scale='Viridis',
        title='Boundary Frequency Heatmap (Over vs Ball)',
        labels={'over': 'Over', 'ball': 'Ball', 'boundary_count': 'Boundary Count'}
    )

    fig_boundary_heat.update_layout(
        plot_bgcolor='rgba(245,245,255,1)',
        paper_bgcolor='rgba(230,240,255,1)',
        font=dict(family='Segoe UI', size=14, color='#222'),
        title_font=dict(size=22, family='Segoe UI', color='#007bff')
    )

    fig_boundary_heat.show()


The boundary frequency heatmap visualizes the timing and frequency of boundary hits (4s and 6s) across both innings in the final match:

- **First Innings**:
    - Higher boundary concentration in middle overs (8-12)
    - Strong finish with multiple boundaries in death overs
    
- **Second Innings**:
    - Early boundaries in powerplay overs
    - Decreased frequency in latter half
    - Few boundaries after over 15

The visualization reveals how batting momentum shifted through different phases of each innings.
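
The counts behind the heatmap can also be inspected as a plain table, which makes sparse cells easier to read than colors. This is a minimal sketch on invented deliveries; it reuses the notebook's definition of a boundary (`batsman_runs` of 4 or 6) and assumes `over` and `ball` have already been extracted as integers.

```python
import pandas as pd

# Hypothetical deliveries with over and ball already split out.
toy = pd.DataFrame({
    "over": [0, 0, 0, 1, 1, 2],
    "ball": [1, 2, 3, 1, 4, 6],
    "batsman_runs": [4, 1, 6, 0, 4, 6],
})

# Same boundary definition as the notebook: a 4 or a 6 off the bat.
toy["is_boundary"] = toy["batsman_runs"].isin([4, 6])
boundaries = toy[toy["is_boundary"]]

# Rows are ball numbers, columns are overs, cells are boundary counts.
grid = pd.crosstab(boundaries["ball"], boundaries["over"])

# Share of all deliveries that went for a boundary.
boundary_pct = toy["is_boundary"].mean() * 100

print(grid)
print(f"boundary deliveries: {boundary_pct:.1f}%")
```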

### Top Batsmen - Run Contribution

In [113]:
# Top 6 batsmen by runs off the bat (extras are excluded by using batsman_runs)
top_batsmen = df_final_ball.groupby('batsman')['batsman_runs'].sum().sort_values(ascending=False).head(6)
fig_pie = px.pie(
    names=top_batsmen.index,
    values=top_batsmen.values,
    title='Top Batsmen - Run Contribution',
    hole=0.4,  # for donut style
    color_discrete_sequence=px.colors.sequential.RdBu
)
fig_pie.update_traces(textinfo='percent+label')
fig_pie.show()



The donut chart shows the six leading run-scorers in the ball-by-ball data. The leading totals match the season aggregates in the batting records, so they are tournament runs rather than runs from a single match. Each percentage is the player's share of the six players' combined 1,615 runs, which is what the slices of the chart represent:

- Lahiru Milantha led with 293 runs (18.1%)
- Ravi Bopara followed with 286 runs (17.7%)
- Rohit Paudel scored 279 runs (17.3%)
- Saif Zaib contributed 275 runs (17.0%)
- James Neesham made 247 runs (15.3%)
- Binod Bhandari added 235 runs (14.6%)

The shares are fairly even, ranging from about 15% to 18%, so no single batsman dominates the top six.
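
The slices of a pie or donut chart always sum to 100%, because each slice is one value divided by the total of the plotted values. A minimal check using the six run totals listed above:

```python
# Run totals as listed above for the six batsmen in the chart.
runs = {
    "Lahiru Milantha": 293,
    "Ravi Bopara": 286,
    "Rohit Paudel": 279,
    "Saif Zaib": 275,
    "James Neesham": 247,
    "Binod Bhandari": 235,
}

# Each slice is the player's runs divided by the combined total of all six.
total = sum(runs.values())
shares = {name: round(100 * r / total, 1) for name, r in runs.items()}

for name, pct in shares.items():
    print(f"{name}: {runs[name]} runs ({pct}%)")
print(f"combined total: {total}")
```

Note that these are shares of the top-six total only. A share of all runs scored would need the sum over every batsman as the denominator.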


### Wicket Fall Timeline

In [114]:
wickets = df_final_ball[df_final_ball['player_dismissed'].notna()].copy()

wickets['over'] = wickets['ball_over'].astype(int)
wickets['ball'] = ((wickets['ball_over'] - wickets['over']) * 10).round().astype(int)

fig_wickets = px.scatter(
    wickets,
    x='ball_over',
    y='inning',
    color='dismissal_kind',
    symbol='dismissal_kind',
    hover_data=['player_dismissed', 'bowler', 'fielder'],
    title='Wicket Fall Timeline',
    labels={'ball_over': 'Over.Ball', 'inning': 'Inning'}
)
fig_wickets.update_traces(marker_size=12)
fig_wickets.show()

#### First Innings
- Early wickets: Powerplay (overs 1-6) saw multiple dismissals
- Mid-innings collapse: Cluster of wickets between overs 11-15
- Late wickets: Regular fall of wickets in death overs (16-20)

#### Second Innings
- Steady start: Few wickets in powerplay
- Middle overs pressure: Multiple wickets between overs 8-12
- Final collapse: Frequent dismissals in death overs leading to innings end

#### Dismissal Types
- Bowled and LBW dismissals dominated the early overs
- Caught dismissals were most frequent throughout
- Run outs increased in frequency during death overs
- Several stumping dismissals highlighted wicketkeeping impact
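
The timeline shows when wickets fell but not the score at each dismissal. One way to derive a conventional fall-of-wickets line from ball-by-ball data is sketched below. The rows and player names are invented, and the sketch assumes deliveries are in bowling order and that `player_dismissed` is empty when no wicket fell.

```python
import pandas as pd

# Hypothetical deliveries for one innings; names are placeholders.
toy = pd.DataFrame({
    "inning": [1, 1, 1, 1, 1],
    "ball_over": [0.1, 0.2, 0.3, 0.4, 0.5],
    "total_runs": [4, 0, 1, 0, 6],
    "player_dismissed": [None, "Batter A", None, "Batter B", None],
})

# Team score after each delivery, per innings.
toy["score"] = toy.groupby("inning")["total_runs"].cumsum()

# Keep only deliveries on which a wicket fell and number them per innings.
fow = toy[toy["player_dismissed"].notna()].copy()
fow["wicket_no"] = fow.groupby("inning").cumcount() + 1

# Conventional "wicket-score" notation, e.g. "1-4" for first wicket at 4 runs.
fow["fall_of_wicket"] = fow["wicket_no"].astype(str) + "-" + fow["score"].astype(str)

print(fow[["inning", "ball_over", "player_dismissed", "fall_of_wicket"]])
```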


### Frequency of Shot Directions

In [115]:
# Frequency of shot directions
shot_freq = df_final_ball['shot_direction'].value_counts().reset_index()
shot_freq.columns = ['shot_direction', 'count']

# line_close=True below already joins the last point back to the first,
# so no duplicate row is needed to close the loop

# Radar chart
fig_radar = px.line_polar(
    shot_freq,
    r='count',
    theta='shot_direction',
    line_close=True,
    title='Shot Direction Frequency',
    color_discrete_sequence=['#00b894']
)
fig_radar.update_traces(fill='toself')
fig_radar.show()



The radar chart shows how often each shot direction appears in the ball-by-ball data. "Covers" is the most frequent direction (1334 shots), followed by "mid wicket" (1193 shots), while "fine leg" and "third man" are the least frequent, which suggests batsmen preferred to play in front of the wicket. Counts of this size are far more than the roughly 240 deliveries in a single T20 match, so they appear to span more than the final alone.
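
Because `value_counts()` sorts directions by frequency, the angular axis of the radar chart follows that order rather than positions on the field. A minimal sketch of reordering the counts around the ground is shown below. The direction labels and their order in `field_order` are assumptions for illustration; the actual values in `shot_direction` may be spelled or grouped differently.

```python
import pandas as pd

# Assumed order around the ground; adjust to the labels in the real data.
field_order = [
    "third man", "point", "covers", "long off",
    "long on", "mid wicket", "square leg", "fine leg",
]

# Hypothetical shot directions for a handful of deliveries.
shots = pd.Series(
    ["covers", "mid wicket", "covers", "fine leg", "point", "covers"],
    name="shot_direction",
)

# Count each direction, then reindex so rows follow field_order;
# directions that never occur get a count of 0.
shot_freq = (
    shots.value_counts()
    .reindex(field_order, fill_value=0)
    .rename_axis("shot_direction")
    .reset_index(name="count")
)

print(shot_freq)
```

Passing a frame ordered this way to `px.line_polar` keeps neighbouring fielding zones next to each other on the chart, which makes off-side and leg-side patterns easier to see.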
