# Batting collapse frequency. Are England unique?

### Research Questions:
- How often do batting collapses happen?
- Does England collapse more often than other teams?
- Is this statement true? "Joe Root rarely stops a collapse, and is often a part of it"
- Which players are best at stopping a collapse?


### Methodology:
- Create a fall-of-wickets table with the following columns:
    - MatchID
    - Date
    - Batting team
    - Bowling team
    - Match type
    - Innings
    - Fall of Wicket 1 (runs, batsman)
    - Fall of Wicket 2 (runs, batsman)
    - etc.
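
The wide layout described above could be derived from long-format rows (one row per wicket) with a pandas reshape. This is a minimal sketch, not the project's actual implementation: the sample rows are the first three wickets of match 2422 shown later in this notebook, the `FoW{n}_Runs` / `FoW{n}_Player` column names are hypothetical, and the date, bowling team and match type columns are left out for brevity.

```python
import pandas as pd

# Long format: one row per wicket (first three wickets of match 2422).
fow = pd.DataFrame({
    "MatchId": [2422, 2422, 2422],
    "MatchInnings": [1, 1, 1],
    "Team": ["Sri Lanka", "Sri Lanka", "Sri Lanka"],
    "Wicket": [1, 2, 3],
    "Runs": [19, 71, 71],
    "Player": ["Karunaratne", "Perera", "Mendis"],
})

# Move the wicket number from the rows into the columns.
keys = ["MatchId", "MatchInnings", "Team"]
wide = fow.set_index(keys + ["Wicket"])[["Runs", "Player"]].unstack("Wicket")

# Flatten the (value, wicket) column pairs into hypothetical names
# such as FoW1_Runs and FoW1_Player.
wide.columns = [f"FoW{wicket}_{name}" for name, wicket in wide.columns]
wide = wide.reset_index()

print(wide)
```

Keeping the scraped data in long format and reshaping only for presentation tends to make per-wicket calculations (such as runs added between wickets) simpler.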


### Problem breakdown:
- Data Source: howstat cricket scorecards
- Extract fall of wickets (FoW) from a single game
- Extract FoW from multiple games
- Extract FoW from all (relevant) games

In [1]:
from cricsheet.io_html.BaseParser import BaseParser
from cricsheet.io_html.MatchParser import MatchParser

import logging
import time
import sys

In [2]:
logging.basicConfig(level=logging.DEBUG,
                    format='[%(levelname)s]: %(asctime)s: %(message)s',
                    #datefmt='%m-%d %H:%M',
                    filename='../logs/scrapeTests.log',
                    filemode='w')

# define a Handler which writes INFO messages or higher to sys.stdout
console = logging.StreamHandler(sys.stdout)
console.setLevel(logging.INFO)

# set a format which is simpler for console use
#formatter = logging.Formatter('%(name)-12s: %(levelname)-8s: %(message)s')
formatter = logging.Formatter('[%(levelname)-8s]: %(asctime)s: %(message)s')
# tell the handler to use this format
console.setFormatter(formatter)

# add the handler to the root logger
logging.getLogger('').addHandler(console)

In [5]:
#logging.basicConfig(level=logging.INFO, format='[%(levelname)s]: %(asctime)s: %(message)s')
#log = logging.getLogger('cricsheet')

#start_time = time.time()

# Scrape a small range of match IDs, pausing for 10 seconds after every 50 matches
j = 0
for i in range(2420,2423):
    j += 1
    if j % 50 == 0:
        logging.info('Sleeping for 10 seconds')
        time.sleep(10)
        
    s_id = str(i).zfill(4)
    match = MatchParser(s_id)
    logging.debug(f'Loaded match {s_id}')
    
    try:
        match.execute()
        logging.info(f'Parsed data from match {s_id}')
        match.scorecards.to_csv(f'../data/processed/howstat/scorecards/scorecard_{i}.csv')
        match.fall_of_wickets.to_csv(f'../data/processed/howstat/fall_of_wickets/fow_{i}.csv')
        logging.info(f'Saved data from Match: {s_id}')        
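    except Exception as err:
        # Include the match ID so a failure can be traced back to a specific scorecard
        logging.error(f'Failed to process match {s_id}: {err}')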
    
    


[INFO    ]: 2021-01-05 17:50:51,811: Parsing Match: 2420
[INFO    ]: 2021-01-05 17:50:53,114: Parsed data from match 2420
[INFO    ]: 2021-01-05 17:50:53,133: Saved data from Match: 2420
[INFO    ]: 2021-01-05 17:50:53,135: Parsing Match: 2421
[INFO    ]: 2021-01-05 17:50:54,447: Parsed data from match 2421
[INFO    ]: 2021-01-05 17:50:54,460: Saved data from Match: 2421
[INFO    ]: 2021-01-05 17:50:54,462: Parsing Match: 2422
[INFO    ]: 2021-01-05 17:50:56,044: Parsed data from match 2422
[INFO    ]: 2021-01-05 17:50:56,060: Saved data from Match: 2422
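
Once each match has been saved to its own file, the per-match CSVs need to be combined before any cross-match analysis. The sketch below is one possible way to do that, assuming the `fow_*.csv` naming used in the loop above; `load_fall_of_wickets` is a hypothetical helper, not part of the `cricsheet` package. Because `to_csv` writes the DataFrame index by default, reading with `index_col=0` avoids an extra `Unnamed: 0` column.

```python
import glob
import os

import pandas as pd


def load_fall_of_wickets(directory):
    """Combine every fow_*.csv file in `directory` into one DataFrame."""
    paths = sorted(glob.glob(os.path.join(directory, "fow_*.csv")))
    # The first CSV column is the index written by to_csv, so read it as such.
    frames = [pd.read_csv(path, index_col=0) for path in paths]
    if not frames:
        # No files found: return an empty frame rather than raising.
        return pd.DataFrame()
    return pd.concat(frames, ignore_index=True)
```

For example, `load_fall_of_wickets('../data/processed/howstat/fall_of_wickets')` would return one row per wicket across all saved matches.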


In [6]:
match.fall_of_wickets

Unnamed: 0,MatchId,MatchDate,MatchInnings,Team,TeamInnings,Wicket,Runs,Player
0,2422,2021-01-03,1,Sri Lanka,1st,1,19,Karunaratne
1,2422,2021-01-03,1,Sri Lanka,1st,2,71,Perera
2,2422,2021-01-03,1,Sri Lanka,1st,3,71,Mendis
3,2422,2021-01-03,1,Sri Lanka,1st,4,80,Thirimanne
4,2422,2021-01-03,1,Sri Lanka,1st,5,84,Bhanuka
5,2422,2021-01-03,1,Sri Lanka,1st,6,93,Dickwella
6,2422,2021-01-03,1,Sri Lanka,1st,7,110,Shanaka
7,2422,2021-01-03,1,Sri Lanka,1st,8,149,Silva
8,2422,2021-01-03,1,Sri Lanka,1st,9,151,Chameera
9,2422,2021-01-03,1,Sri Lanka,1st,10,157,Fernando
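
With the data in this shape, collapses can be flagged directly from the `Runs` column. The notebook does not define a collapse, so the sketch below uses an assumed definition: `n_wickets` wickets falling while the team score moves by at most `max_runs` (3 wickets within 30 runs by default). Both thresholds and the `flag_collapses` helper are hypothetical and should be tuned to whatever definition the analysis settles on.

```python
import pandas as pd


def flag_collapses(fow, n_wickets=3, max_runs=30):
    """Mark wickets that complete a run of `n_wickets` falling within `max_runs`."""
    fow = fow.sort_values(["MatchId", "MatchInnings", "Wicket"]).copy()
    runs = fow.groupby(["MatchId", "MatchInnings"])["Runs"]
    # Score difference between this wicket and the wicket (n_wickets - 1) earlier
    # in the same innings; NaN for the first wickets of each innings.
    fow["RunsSpan"] = runs.diff(periods=n_wickets - 1)
    # NaN compares as False, so early wickets are never flagged.
    fow["CollapseEnd"] = fow["RunsSpan"] <= max_runs
    return fow


# Sri Lanka's first innings from match 2422, as shown above.
innings = pd.DataFrame({
    "MatchId": [2422] * 10,
    "MatchInnings": [1] * 10,
    "Wicket": list(range(1, 11)),
    "Runs": [19, 71, 71, 80, 84, 93, 110, 149, 151, 157],
})

flagged = flag_collapses(innings)
print(flagged.loc[flagged["CollapseEnd"], ["Wicket", "Runs", "RunsSpan"]])
```

Under this assumed definition, overlapping windows are each flagged, so consecutive flagged wickets would need to be merged into a single collapse before counting frequencies per team.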
