In [1]:
! pip install python-espncricinfo



There are 4 primary modules in espncricinfo:
1. match
2. player
3. series
4. summary

Let's explore them one by one.

In [2]:
from espncricinfo.match import Match

To acquire a match id, search for any match you want and open its page on the ESPNcricinfo site.
The match id is part of the URL.

The following is the URL of the 2011 ODI World Cup final between India and Sri Lanka:

https://www.espncricinfo.com/series/icc-cricket-world-cup-2010-11-381449/india-vs-sri-lanka-final-433606/full-scorecard

Here 433606 is the match id.

Similarly, for the T20 World Cup 2024 final between India and South Africa,

https://www.espncricinfo.com/series/icc-men-s-t20-world-cup-2024-1411166/india-vs-south-africa-final-1415755/full-scorecard

1415755 is the match id.
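
Copying the id by hand works, but it can also be pulled out of the URL programmatically. The following is a minimal sketch, assuming the URL ends in `/full-scorecard` as in the two examples above; `match_id_from_url` is a hypothetical helper, not part of espncricinfo.

```python
import re

def match_id_from_url(url: str) -> int:
    """Return the match id embedded in an ESPNcricinfo scorecard URL."""
    # The match slug is the path segment right before "/full-scorecard"
    # and ends with "-<digits>"; those digits are the match id.
    found = re.search(r"-(\d+)/full-scorecard", url)
    if found is None:
        raise ValueError(f"no match id found in {url!r}")
    return int(found.group(1))

url = ("https://www.espncricinfo.com/series/icc-men-s-t20-world-cup-2024-1411166/"
       "india-vs-south-africa-final-1415755/full-scorecard")
print(match_id_from_url(url))  # 1415755
```

The returned integer can then be passed to `Match(...)`.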

In [3]:
# Let's work with the T20 World Cup Final

WC_final_T20_2024 = Match(1415755)

In [4]:
# The match data is exposed through several attributes. Let's check some of them

series = WC_final_T20_2024.series_name
print(series)

South Africa tour of United States of America and West Indies


The series_name attribute seems to join the host country (or countries) with the visiting country. Here the final is
between SA and IND, while the World Cup was hosted jointly by USA and WI. This is the reason we are not getting the
desired output, T20 World Cup 2024.

In [5]:
import pprint

# To get the match format (Test, ODI, T20), use .match_class
print("The format of this match is :", WC_final_T20_2024.match_class)

# The scheduled overs for different formats of the game
print(WC_final_T20_2024.scheduled_overs)

# To get the ground name 
print("The match is held at", WC_final_T20_2024.ground_name)

# To get just the teams in the innings and their metadata, use
pprint.pprint(WC_final_T20_2024.innings_list)

# To get a brief scorecard excluding individual scores, use
pprint.pprint(WC_final_T20_2024.innings)

The format of this match is : T20I
20
The match is held at Kensington Oval, Bridgetown, Barbados
[{'current': 0,
  'description': 'India innings',
  'descriptoin_short': 'IND Inns',
  'innings_number': 1,
  'selected': 0,
  'team_id': 6},
 {'current': 1,
  'description': 'South Africa innings',
  'descriptoin_short': 'SA Inns',
  'innings_number': 2,
  'selected': 1,
  'team_id': 3}]
[{'ball_limit': 120,
  'balls': 120,
  'batted': 1,
  'batting_team_id': 6,
  'bowling_team_id': 3,
  'bpo': 6,
  'byes': 0,
  'event': 5,
  'event_name': 'complete',
  'extras': 7,
  'innings_number': '1',
  'innings_numth': '1st',
  'lead': 176,
  'legbyes': 0,
  'live_current': 0,
  'live_current_name': None,
  'minutes': None,
  'noballs': 1,
  'old_penalty_or_bonus': 0,
  'over_limit': '20.0',
  'over_limit_run_rate': 8.8,
  'over_split_limit': '0.0',
  'overs': '20.0',
  'overs_docked': 0,
  'penalties': 0,
  'penalties_field_end': 0,
  'penalties_field_start': 0,
  'run_rate': 8.8,
  'runs': 176,
  

In [6]:
pprint.pprint(WC_final_T20_2024.team_1)


pprint.pprint(WC_final_T20_2024.team_1_players)

{'batsmen_in_side': 11,
 'content_id': 6,
 'country_id': 6,
 'fielders_in_side': 11,
 'image_id': 381895,
 'logo_alt_id': '',
 'logo_espncdn': 'Y',
 'logo_height': 500,
 'logo_image_height': 500,
 'logo_image_path': '/db/PICTURES/CMS/381800/381895.png',
 'logo_image_width': 500,
 'logo_object_id': 1436347,
 'logo_path': '/db/PICTURES/CMS/381800/381895.png',
 'logo_width': 500,
 'object_id': 6,
 'player': [{'age_days': 60,
             'age_years': 37,
             'alpha_name': 'SHARMA,RG',
             'batting_hand': 'right-hand batter',
             'batting_style': 'rhb',
             'batting_style_long': 'right-hand bat',
             'bowling_hand': 'right-arm bowler',
             'bowling_pacespin': 'spin bowler',
             'bowling_style': 'ob',
             'bowling_style_long': 'right-arm offbreak ',
             'captain': 1,
             'card_long': 'RG Sharma',
             'card_qualifier': '',
             'card_short': 'Sharma',
             'dob': '1987-04-30',
 

Both .team_1 and .team_1_players tell us about the players. team_1_players contains some additional data.
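
As a rough illustration of how this data can be used, the sketch below pulls player names and the captain out of a team dictionary. It assumes the dictionary has a `'player'` list whose entries carry `'card_long'` and `'captain'` keys, as in the output above. The helper names and `sample_team` (a cut-down copy of that output) are made up for this example.

```python
def player_names(team: dict) -> list:
    """Return the scorecard names of every player listed in a team dict."""
    return [player["card_long"] for player in team.get("player", [])]

def captain_name(team: dict):
    """Return the name of the player flagged as captain, or None if absent."""
    for player in team.get("player", []):
        if player.get("captain") == 1:
            return player["card_long"]
    return None

# Cut-down sample shaped like the .team_1 output shown above
sample_team = {
    "object_id": 6,
    "player": [{"card_long": "RG Sharma", "captain": 1, "batting_style": "rhb"}],
}

print(player_names(sample_team))
print(captain_name(sample_team))
```

With the real object, `player_names(WC_final_T20_2024.team_1)` should work the same way, provided the structure matches the output above.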

# Methods in espncricinfo

### 1. get_json()
To get the match commentary, there is a method called get_json, but it contains a lot of additional data. It is wise to use only a part of it.

NOTE: The output is truncated to the most recent overs. We can't get the full commentary.
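
Since the returned dictionary is large and deeply nested, it helps to locate a key before printing everything. The sketch below is a generic helper (not part of espncricinfo) that walks any nested dict/list structure and reports where a key such as `'comms'` lives. The `sample` data is hypothetical and only imitates the shape of the real output.

```python
def find_key_paths(data, target, path=()):
    """Yield every path (a tuple of keys and list indexes) that ends at `target`."""
    if isinstance(data, dict):
        for key, value in data.items():
            if key == target:
                yield path + (key,)
            # Keep descending: the same key may also appear deeper down
            yield from find_key_paths(value, target, path + (key,))
    elif isinstance(data, list):
        for index, item in enumerate(data):
            yield from find_key_paths(item, target, path + (index,))

# Hypothetical miniature of the get_json() result
sample = {"centre": {"batting": [{"known_as": "Heinrich Klaasen"}]}, "comms": []}

print(list(find_key_paths(sample, "known_as")))
print(list(find_key_paths(sample, "comms")))
```

Once the path is known, only that part of the dictionary needs to be printed or stored.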

In [7]:
results_in_json = WC_final_T20_2024.get_json()
pprint.pprint(results_in_json)

# check for 'comms' to see the ball-by-ball commentary
# a similar type of result is obtained using the get_html() method

{'centre': {'batting': [{'balls_faced': '27',
                         'batting_style': 'rhb',
                         'control_percentage': 72,
                         'dismissal_name': 'caught',
                         'dot_ball_percentage': 33,
                         'known_as': 'Heinrich Klaasen',
                         'live_current_name': '',
                         'match_award': 0,
                         'notout': 0,
                         'player_id': 61634,
                         'popular_name': 'Klaasen',
                         'preferred_shot': {'balls_faced': '8',
                                            'runs': '12',
                                            'runs_summary': ['3',
                                                             '2',
                                                             '2',
                                                             '0',
                                                             '0',
            

### 2. _legacy_scorecard_url()

Use this to get the URL of the scorecard that we usually see at the end of a match. The scorecard also contains the innings progression (a brief overview of both teams' batting performances).

In [8]:
score_card_url = WC_final_T20_2024._legacy_scorecard_url()

print("The scorecard is available at", score_card_url)

The scorecard is available at https://static.espncricinfo.com/db/ARCHIVE/2024/ICC-WORLD-2020/SCORECARDS/IND_RSA_ICC-WORLD-2020_29JUN2024


### 3. _officials()

The umpires are the officials of the match. We can obtain their data using this method.

In [9]:
umpires = WC_final_T20_2024._officials()
pprint.pprint(umpires)

[{'age_days': 212,
  'age_years': 48,
  'alpha_name': 'GAFFANEY,CB',
  'batting_hand': 'right-hand batter',
  'bowling_hand': 'unknown arm',
  'bowling_pacespin': 'mixture/unknown',
  'card_long': 'CB Gaffaney',
  'card_qualifier': '',
  'card_short': 'CB Gaffaney',
  'dob': '1975-11-30',
  'image_id': None,
  'known_as': 'Chris Gaffaney',
  'match_player_id': 3019671,
  'mobile_name': 'Gaffaney',
  'object_id': 37097,
  'player_id': '9709',
  'player_name_id': 30233,
  'player_type': 2,
  'player_type_name': 'umpire',
  'popular_name': 'Gaffaney',
  'portrait_alt_id': '1',
  'portrait_object_id': 953899,
  'status_id': 3,
  'team_abbreviation': 'NZ',
  'team_id': 5,
  'team_name': 'New Zealand',
  'team_short_name': 'New Zealand'},
 {'age_days': 311,
  'age_years': 60,
  'alpha_name': 'ILLINGWORTH,RK',
  'batting_hand': 'right-hand batter',
  'bowling_hand': 'left-arm bowler',
  'bowling_pacespin': 'spin bowler',
  'card_long': 'RK Illingworth',
  'card_qualifier': '',
  'card_short':
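
Each official is a fairly large dictionary, so a one-line summary per official is often enough. This is a small sketch assuming the `'known_as'`, `'team_name'` and `'player_type_name'` keys seen in the output above. `official_summaries` and `sample_officials` are illustrative names, and the sample holds only the first official from that output.

```python
def official_summaries(officials: list) -> list:
    """Return 'Name (Country, role)' strings for a list of official dicts."""
    return [
        f"{official['known_as']} ({official['team_name']}, {official['player_type_name']})"
        for official in officials
    ]

# Cut-down sample taken from the first entry of the _officials() output
sample_officials = [
    {"known_as": "Chris Gaffaney", "team_name": "New Zealand", "player_type_name": "umpire"},
]

print(official_summaries(sample_officials))  # ['Chris Gaffaney (New Zealand, umpire)']
```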

# Player() module

This module extracts key details about players. Let's go through some of its main methods.

In [10]:
# Now let's test the next one

from espncricinfo.player import Player


# Let's look for Virat Kohli, player id = 253802

Virat_Kohli = Player(253802)

### a. get_career_averages()

will create a CSV file in your current directory whose name starts with the player_id. It holds very brief stats of the player.

NOTE: The cell won't produce any output

In [11]:
Virat_Kohli.get_career_averages()
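
Since the cell prints nothing, one way to check the result is to look for CSV files whose names start with the player id and read them back. The sketch below uses only the standard library. The exact file name is not assumed, only the player_id prefix described above. `load_player_csvs` is a hypothetical helper.

```python
import csv
import glob
import os

def load_player_csvs(player_id, directory="."):
    """Read every CSV in `directory` whose file name starts with `player_id`."""
    pattern = os.path.join(directory, f"{player_id}*.csv")
    tables = {}
    for path in sorted(glob.glob(pattern)):
        with open(path, newline="", encoding="utf-8") as handle:
            # Each table is stored as a list of rows (lists of strings)
            tables[os.path.basename(path)] = list(csv.reader(handle))
    return tables

tables = load_player_csvs(253802)
for name, rows in tables.items():
    print(name, "->", len(rows), "rows")
```

If no matching file exists, the function returns an empty dictionary, which is a quick sign that the export did not run.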

### b. get_json()

gives an overview of his career, but not his stats

In [12]:
full_profile = Virat_Kohli.get_json()
pprint.pprint(full_profile)

{'$ref': 'http://core.espnuk.org/v2/sports/cricket/athletes/253802',
 'active': True,
 'age': 35,
 'battingName': 'V Kohli',
 'birthPlace': {},
 'country': 6,
 'dateOfBirth': '1988-11-05T00:00Z',
 'dateOfBirthStr': '1988-11-05T00:00Z',
 'dateOfDeath': '',
 'dateOfDeathStr': '',
 'debutYear': None,
 'debuts': [{'$ref': 'http://core.espnuk.org/v2/sports/cricket/leagues/489197/events/489226'},
            {'$ref': 'http://core.espnuk.org/v2/sports/cricket/leagues/489202/events/489226'},
            {'$ref': 'http://core.espnuk.org/v2/sports/cricket/leagues/343726/events/343732'},
            {'$ref': 'http://core.espnuk.org/v2/sports/cricket/leagues/343727/events/343732'},
            {'$ref': 'http://core.espnuk.org/v2/sports/cricket/leagues/452142/events/452153'},
            {'$ref': 'http://core.espnuk.org/v2/sports/cricket/leagues/452143/events/452153'},
            {'$ref': 'http://core.espnuk.org/v2/sports/cricket/leagues/254435/events/263077'},
            {'$ref': 'http://core.es

### c. get_career_summary() 

gives us the series-wise stats, which will be useful for any model based on numerical features. The output is written to a separate CSV file.

In [13]:
Virat_Kohli.get_career_summary()

# Series() module

We often need to look up all the matches within a
series, so let's explore this module.

In [14]:
from espncricinfo.series import Series

WorldCup_24 = Series(1411166)

### a. _get_seasons()

generates several URLs. Right now I can't make much sense of them.
They all look similar, with the series id and some more URLs for
video and audio. This needs further exploration.

It becomes clearer after using the second method.

In [15]:
WorldCup_24._get_seasons()

['http://core.espnuk.org/v2/sports/cricket/leagues/1411166/seasons/2024',
 'http://core.espnuk.org/v2/sports/cricket/leagues/1298134/seasons/2022',
 'http://core.espnuk.org/v2/sports/cricket/leagues/1267897/seasons/2021',
 'http://core.espnuk.org/v2/sports/cricket/leagues/901359/seasons/2016',
 'http://core.espnuk.org/v2/sports/cricket/leagues/628368/seasons/2014',
 'http://core.espnuk.org/v2/sports/cricket/leagues/531597/seasons/2012',
 'http://core.espnuk.org/v2/sports/cricket/leagues/412671/seasons/2010',
 'http://core.espnuk.org/v2/sports/cricket/leagues/335113/seasons/2009',
 'http://core.espnuk.org/v2/sports/cricket/leagues/286109/seasons/2007']

### b. _get_years_from_seasons()

returns all the years in which this series was played. Taking the two methods together, _get_seasons() gives us the URL for each season, so we can reach every match that belongs to this particular series.

In [16]:
WorldCup_24._get_years_from_seasons()

['2024', '2022', '2021', '2016', '2014', '2012', '2010', '2009', '2007']
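
The two outputs can be tied together by parsing the season URLs. The sketch below assumes the `.../leagues/<id>/seasons/<year>` layout seen in the `_get_seasons()` output. `season_index` is a made-up helper, and only the first three URLs are copied here to keep it short.

```python
def season_index(urls: list) -> dict:
    """Map each season year to the league id found in its URL."""
    index = {}
    for url in urls:
        parts = url.rstrip("/").split("/")
        # Layout: .../leagues/<league id>/seasons/<year>
        league_id = int(parts[parts.index("leagues") + 1])
        year = parts[parts.index("seasons") + 1]
        index[year] = league_id
    return index

season_urls = [
    "http://core.espnuk.org/v2/sports/cricket/leagues/1411166/seasons/2024",
    "http://core.espnuk.org/v2/sports/cricket/leagues/1298134/seasons/2022",
    "http://core.espnuk.org/v2/sports/cricket/leagues/1267897/seasons/2021",
]

print(season_index(season_urls))
# {'2024': 1411166, '2022': 1298134, '2021': 1267897}
```

Note that each season carries a different id, and the 2024 one is the series id used to create `WorldCup_24`.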

### c. _get_events()

returned only a single event, the final match. I assume I made a mistake somewhere, such as in how I obtained the series.

I may get more info as I work on this.

In [17]:
WorldCup_24._get_events()

[{'$ref': 'http://core.espnuk.org/v2/sports/cricket/leagues/1411166/events/1415755'}]
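
The trailing number in the event URL is the same match id that was used with `Match` earlier (1415755). A small sketch for extracting it, assuming every event is a dict with a `'$ref'` URL ending in the id, as in the output above. `event_ids` is a hypothetical helper.

```python
def event_ids(events: list) -> list:
    """Return the numeric id at the end of each event's '$ref' URL."""
    return [int(event["$ref"].rstrip("/").rsplit("/", 1)[-1]) for event in events]

events = [
    {"$ref": "http://core.espnuk.org/v2/sports/cricket/leagues/1411166/events/1415755"},
]

print(event_ids(events))  # [1415755]
```

Each id could then be passed to `Match(...)` to load that match.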

In [19]:
! pip install BeautifulSoup4

