# Cricket Statistics and Python 

Inspired by an enthralling Ashes series, I wanted to play around with the cricket data available from various free and commercial sources. Despite a bit of trawling, I couldn't find a single document detailing what was available.

## So what sources are there? 

I did a bit of Googling and looked at what some of the obvious sites offered. I also looked for existing Python libraries and played around with them. I was more interested in crunching stats than in, say, creating widgets that show live scores, but I have looked at everything.

### Cricinfo

Every web-literate person with even a passing interest in cricket will know Cricinfo. It has been owned by ESPN for years now, and various old questions on Stack Exchange and Quora suggest they used to have an API for developers but that it was discontinued. That said, there are still some options for grabbing their data by leveraging feeds or scraping pages.

#### RSS Feeds

The page below lists the available RSS feeds. These can be parsed to get information on current matches and recent results.

http://www.espncricinfo.com/ci/content/rss/feeds_rss_cricket.html

#### Python libraries

Searching on GitHub returns [quite a few results](https://github.com/search?l=Python&q=cricinfo&type=Repositories) but most are old and unloved. A couple stood out, though there are probably more of interest:

##### Cricpy

Provides a wealth of functions for scraping HTML pages and seems to be under active development. It has low-level functions that pull data and dump it to CSV for local analysis, as well as functions to create charts and perform statistical analysis. It seems geared towards people doing statistical analysis rather than creating apps. I definitely need to explore it in more depth.

https://github.com/tvganesh/cricpy

##### python-espncricinfo

This is a simpler library that combines scraping with some (as far as I can tell) undocumented JSON files available on Cricinfo. It doesn't seem to be under active development, though: some functions have bugs and some don't work at all. That said, it gave me an interesting insight into some data I didn't realise was easily accessible.

https://github.com/dwillis/python-espncricinfo


### Cricbuzz

I'll confess to never really having looked at Cricbuzz, but it is another popular site dedicated to cricket news. There is a Python library for accessing it.

#### pycricbuzz

I haven't really taken a look at this yet, but it does seem to have been updated recently and has quite a few stargazers and forks.

https://github.com/codophobia/pycricbuzz


### Cricapi

This is a freemium API offering a free key that grants you 100 calls per day. It gives useful info about current matches and individual player stats but is probably geared more toward app developers than people just wanting to do analysis.

https://www.cricapi.com/
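
With only 100 free calls a day, it's easy to burn through the quota while experimenting. A minimal sketch of a local counter that refuses further calls once the budget is spent might look like this. The `DailyCallBudget` class and its method names are hypothetical (they are not part of Cricapi or pycricapi), and it assumes the quota resets at midnight local time, which the provider may well handle differently.

```python
import datetime as dt


class DailyCallBudget:
    """Count API calls made today against a fixed daily limit (hypothetical helper)."""

    def __init__(self, limit=100):
        self.limit = limit
        self.day = None   # the date the current count belongs to
        self.used = 0

    def try_spend(self, today=None):
        """Count one call and return True if budget remains, otherwise return False."""
        today = today or dt.date.today()
        if today != self.day:
            # Assumption: the quota resets when the local calendar day changes
            self.day = today
            self.used = 0
        if self.used >= self.limit:
            return False
        self.used += 1
        return True

    @property
    def remaining(self):
        """Calls left for the day most recently seen by try_spend()."""
        return self.limit - self.used


budget = DailyCallBudget(limit=100)
if budget.try_spend():
    pass  # make the actual HTTP request here
```

Checking `try_spend()` before every request keeps the bookkeeping in one place, so the same object could be reused for any of the rate-limited services on this page by changing `limit`.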

#### pycricapi

There is a library here that provides a basic wrapper for each endpoint.

https://github.com/KarthikGangadhar/pycricapi

### Rapid API

A few things turn up when searching Rapid API but I haven't really looked at them so far.

https://rapidapi.com/search/cricket

The one linked below seems to offer 2,500 hits per day and a relatively cheap pricing model after that. Again, it's geared toward app developers rather than statisticians.

https://rapidapi.com/dev132/api/cricket-live-scores/


### Sportradar

Sportradar offer a number of different APIs for different sports. It's a premium service (i.e. quite expensive) but does provide some pretty useful stuff. I took advantage of their free trial and played around with it a bit, but the trial limits me to 1,000 calls so I haven't explored too much.

#### SportradarAPIs (python wrapper)

A wrapper already existed but didn't include a class to access the cricket endpoints so I added them. 

https://github.com/johnwmillr/SportradarAPIs

### Sportmonks

Sportmonks are another service providing APIs for all sorts of sports. They offer a free trial for 14 days. I haven't taken them up on it yet, but I might revisit this when I have a bit of time to make the most of it.

https://www.sportmonks.com/cricket-api - 14 day trial available

### Cricket API

I also found this one, another premium service, but with no option for a free trial that I could see. I've not looked at it beyond that.

https://www.cricketapi.com/



```python
# Trying to do something useful with the Cricinfo RSS feeds

import requests
import xml.etree.ElementTree as ET

r = requests.get('http://static.cricinfo.com/rss/livescores.xml')
r.raise_for_status()  # stop early if the feed could not be fetched
root = ET.fromstring(r.content)  # parse the raw bytes straight into the root element

# Each <item> under <channel> describes one live match
for entry in root.findall('channel/item'):
    score = entry.findtext('title')
    # Pull the numeric match id out of the guid URL
    match_id = entry.findtext('guid').split('.')[2].split('/')[4]
    print("{0} - {1}".format(score, match_id))
```


These are the games in progress at the time of writing, along with their match IDs:

```text
Balochistan v Central Punjab (Pakistan) - 1199401
Northern (Pakistan) v Sindh - 1199402
Khyber Pakhtunkhwa v Southern Punjab (Pakistan) - 1199403
New Zealand Under-19s v Bangladesh Under-19s - 1202033
Victoria 168/10 * v Queensland 322/9  - 1196098
South Australia 228/9  v Tasmania 229/5 * - 1196099
Malaysia 134/9 * v Vanuatu 151/5  - 1202007
Andhra 38 * v Chhattisgarh 268/9  - 1200635
Goa 266/8  v Jharkhand 37/2 * - 1200768
Hyderabad (India) 11/1 * v Kerala 226/9  - 1200766
Baroda v Punjab - 1200769
Haryana v Uttar Pradesh - 1200771
Himachal Pradesh v Maharashtra - 1200638
Chandigarh v Manipur - 1200765
Mizoram v Nagaland - 1200692
Sri Lanka A v Bangladesh A - 1201025
Australia Women 217/4  v Sri Lanka Women 176/7 * - 1183510
United Services Recreation Club 67/10  v Hong Kong Cricket Club 69/4 * - 1197541
Pakistan Association of Hong Kong 79/4 * v Kowloon Cricket Club 140/10  - 1197542
Singapore v Zimbabwe - 1201682
India Wome
```
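
The feed titles pack both teams, their scores and a `*` marker into a single string, which is awkward for analysis. Here is a rough sketch for splitting them up, based only on the title shapes visible in the output above. The function names are my own, the guid format (`.../match/<id>.html`) is an assumption inferred from the splitting logic used earlier, and anything beyond a single score per side (multi-innings scores, for example) is not handled.

```python
import re

# One side of a title: team name, optional score such as '168/10' or '38', optional '*'
SIDE = re.compile(r'^(?P<team>.*?)(?:\s+(?P<score>\d+(?:/\d+)?))?\s*(?P<star>\*)?\s*$')


def parse_side(text):
    """Split e.g. 'Victoria 168/10 *' into team, score and whether it carries a '*'."""
    m = SIDE.match(text.strip())
    return {
        'team': m.group('team'),
        'score': m.group('score'),          # None when the match has no score yet
        'starred': m.group('star') is not None,
    }


def parse_title(title):
    """Split 'Team A ... v Team B ...' into two parsed sides.

    Raises ValueError if the title does not contain a ' v ' separator.
    """
    first, second = re.split(r'\s+v\s+', title.strip(), maxsplit=1)
    return parse_side(first), parse_side(second)


def match_id_from_guid(guid):
    """Return the last path segment without its extension (assumed guid layout)."""
    last_segment = guid.rstrip('/').rsplit('/', 1)[-1]
    return last_segment.split('.', 1)[0]


first, second = parse_title('Victoria 168/10 * v Queensland 322/9 ')
print(first)
print(second)
```

Returning plain dicts keeps the rows easy to dump to CSV or load into a DataFrame later on.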

```python
# Let's pick one of these matches and pull the JSON file for it

MATCH_URL = 'http://www.espncricinfo.com/matches/engine/match/{0}.json'

r = requests.get(MATCH_URL.format(1196099))
r.raise_for_status()  # stop early if the match file could not be fetched
r.json()
```
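
The response is deeply nested: the raw output further down shows a `comms` list of overs, each holding a `ball` list. For crunching stats, a flat list of rows is easier to work with. The sketch below flattens a hand-trimmed sample copied from that output. `flatten_balls` is a name I made up, only keys visible in the output are used, and it assumes the `players` string reads "bowler to batter". Since the endpoint is undocumented, the structure could differ for other matches.

```python
def flatten_balls(match):
    """Yield one flat dict per delivery from the nested 'comms' -> 'ball' lists."""
    for over in match.get('comms', []):
        for ball in over.get('ball', []):
            # Assumption: 'players' reads "<bowler> to <batter>"
            bowler, _, batter = ball['players'].partition(' to ')
            yield {
                'innings': int(ball['innings_number']),
                'over': ball['overs_actual'],
                'sort_key': float(ball['overs_unique']),
                'bowler': bowler,
                'batter': batter,
                'event': ball['event'],
            }


# Hand-trimmed sample using values copied from the real response
sample = {
    'comms': [
        {'innings_number': '2', 'over_number': '47', 'ball': [
            {'innings_number': '2', 'overs_actual': '46.2', 'overs_unique': '46.02',
             'players': 'Valente to Faulkner', 'event': '1 run'},
            {'innings_number': '2', 'overs_actual': '46.1', 'overs_unique': '46.01',
             'players': 'Valente to McDermott', 'event': '1 run'},
        ]},
        {'ball': [
            {'innings_number': '2', 'overs_actual': '45.6', 'overs_unique': '45.07',
             'players': 'Pope to McDermott', 'event': '1 run'},
        ]},
    ]
}

# In the sample the latest ball comes first, so sort into chronological order
rows = sorted(flatten_balls(sample), key=lambda r: (r['innings'], r['sort_key']))
for row in rows:
    print(row['over'], row['bowler'], 'to', row['batter'], '-', row['event'])
```

With the real response, `flatten_balls(r.json())` should produce the same kind of rows, ready to be written out with the `csv` module.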

{'centre': {},
 'comms': [{'ball': [{'comms_id': '',
     'dismissal': '',
     'event': '1 run',
     'innings_number': '2',
     'is_tweet': '',
     'over_number': '47',
     'overs_actual': '46.2',
     'overs_unique': '46.02',
     'players': 'Valente to Faulkner',
     'speed_kph': '',
     'speed_mph': '',
     'text': ''},
    {'comms_id': '',
     'dismissal': '',
     'event': '1 run',
     'innings_number': '2',
     'is_tweet': '',
     'over_number': '47',
     'overs_actual': '46.1',
     'overs_unique': '46.01',
     'players': 'Valente to McDermott',
     'speed_kph': '',
     'speed_mph': '',
     'text': ''}],
   'innings_number': '2',
   'over_number': '47'},
  {'ball': [{'comms_id': '',
     'dismissal': '',
     'event': '1 run',
     'innings_number': '2',
     'is_tweet': '',
     'over_number': '46',
     'overs_actual': '45.6',
     'overs_unique': '45.07',
     'players': 'Pope to McDermott',
     'speed_kph': '',
     'speed_mph': '',
     'text': ''},
    {'