# Batting collapse frequency. Are England unique?

### Research Questions:
- How often do batting collapses happen?
- Does England collapse more often than other teams?
- Is this statement true? "Joe Root rarely stops a collapse, and is often a part of it"
- Which players are best at stopping a collapse?
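
Answering these questions needs a working definition of a collapse. The sketch below assumes one possible definition: any window of `wickets` consecutive wickets in which the team score advanced by at most `max_runs` between the first and last wicket of the window. The function name `find_collapses`, the default thresholds (3 wickets for 30 runs) and the example scores are illustrative assumptions, not part of the analysis itself.

```python
from typing import List, Tuple


def find_collapses(
    fow_scores: List[int], wickets: int = 3, max_runs: int = 30
) -> List[Tuple[int, int]]:
    """Return (first_wicket, last_wicket) for every window of `wickets`
    consecutive wickets that fell within `max_runs` runs.

    `fow_scores[k]` is the team score when wicket k+1 fell.
    Overlapping windows are reported separately.
    """
    collapses = []
    for start in range(len(fow_scores) - wickets + 1):
        end = start + wickets - 1
        # Runs added between the first and last wicket of the window.
        if fow_scores[end] - fow_scores[start] <= max_runs:
            collapses.append((start + 1, end + 1))  # 1-based wicket numbers
    return collapses


# Made-up innings: team score at the fall of wickets 1 to 10.
example_innings = [12, 45, 120, 125, 131, 140, 210, 230, 231, 240]
print(find_collapses(example_innings))
```

With a definition like this in place, collapse frequency per team is a count over innings, and a batsman "stops" a collapse if the window ends before his wicket falls. Other definitions (for example, counting runs from the partnership before the first wicket) would give different numbers.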


### Methodology:
- Create a fall-of-wickets table with one row per innings and the following columns:
    - MatchID
    - Date
    - Batting team
    - Bowling team
    - Match type
    - Innings
    - Fall of Wicket 1 (runs, batsman)
    - Fall of Wicket 2 (runs, batsman)
    - etc.
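
A minimal sketch of one row of such a table, assuming a wide layout with a `runs` and a `batsman` column per wicket. The helper `build_fow_row`, the column names (`FoW_1_runs`, `FoW_1_batsman`, ...) and the example values are hypothetical; the columns actually produced by the parser may differ.

```python
from typing import Dict, List, Optional, Tuple, Union

Cell = Union[str, int, None]


def build_fow_row(
    match_id: str,
    date: str,
    batting_team: str,
    bowling_team: str,
    match_type: str,
    innings: int,
    wickets: List[Tuple[int, str]],
    max_wickets: int = 10,
) -> Dict[str, Cell]:
    """Flatten one innings into a single wide row.

    `wickets` holds (team score, dismissed batsman) in the order the wickets fell.
    Wickets that did not fall are filled with None.
    """
    row: Dict[str, Cell] = {
        "MatchID": match_id,
        "Date": date,
        "BattingTeam": batting_team,
        "BowlingTeam": bowling_team,
        "MatchType": match_type,
        "Innings": innings,
    }
    for n in range(1, max_wickets + 1):
        runs: Optional[int] = None
        batsman: Optional[str] = None
        if n <= len(wickets):
            runs, batsman = wickets[n - 1]
        row[f"FoW_{n}_runs"] = runs
        row[f"FoW_{n}_batsman"] = batsman
    return row


# Made-up example values, for illustration only.
row = build_fow_row(
    "0001", "2000-01-01", "Team A", "Team B", "Test", 1,
    wickets=[(12, "Batsman A"), (45, "Batsman B")],
)
print(row["FoW_2_runs"], row["FoW_3_batsman"])
```

A list of such dictionaries can be passed directly to `pandas.DataFrame(...)` to get the table. A long layout (one row per wicket) would be an equally valid choice and is often easier to aggregate.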


### Problem breakdown:
- Data Source: howstat cricket scorecards
- Extract Fall of Wickets (FoW) from a single game
- Extract FoW from multiple games
- Extract FoW from all (relevant) games
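
As an illustration of the single-game step, the sketch below parses a fall-of-wickets line into structured records. It assumes the text has the form `1-12 (Batsman A), 2-45 (Batsman B)`, which is a common scorecard convention but is not confirmed here as the howstat layout; `parse_fow` and `FOW_PATTERN` are hypothetical names, and the real `cricsheet.io_html` parser works on the scorecard HTML and may look quite different.

```python
import re
from typing import List, Tuple

# wicket number, team score, then the dismissed batsman in parentheses
FOW_PATTERN = re.compile(r"(\d+)-(\d+)\s*\(([^)]+)\)")


def parse_fow(text: str) -> List[Tuple[int, int, str]]:
    """Return (wicket, runs, batsman) for every 'W-R (Name)' entry in `text`."""
    return [
        (int(wicket), int(runs), name.strip())
        for wicket, runs, name in FOW_PATTERN.findall(text)
    ]


# Made-up sample line in the assumed format.
sample = "1-12 (Batsman A), 2-45 (Batsman B), 3-120 (Batsman C)"
print(parse_fow(sample))
```

Once a single game parses reliably, the multi-game steps reduce to looping over match IDs and concatenating the results.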

In [1]:
from cricsheet.io_html.BaseParser import BaseParser
from cricsheet.io_html.MatchParser import MatchParser

import logging
import time

In [10]:
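# Configure logging once; at INFO level the log.debug messages below stay hidden.
logging.basicConfig(level=logging.INFO, format='[%(levelname)s]: %(asctime)s: %(message)s')
log = logging.getLogger('cricsheet')

# Scrape matches 1 to 499, pausing every 50 matches to go easy on the server.
for i in range(1, 500):
    if i % 50 == 0:
        log.info('Sleeping for 10 seconds')
        time.sleep(10)

    s_id = str(i).zfill(4)  # zero-pad the match ID to 4 digits, e.g. '0001'

    try:
        match = MatchParser(s_id)
        log.debug(f'Loaded match {s_id}')
        match.execute()
        log.debug(f'Parsed data from match {s_id}')
        match.scorecards.to_csv(f'../data/processed/howstat/scorecards/scorecard_{i}.csv')
        # NOTE: this writes match.scorecards a second time. The fall-of-wickets
        # table should be saved here instead (its attribute name on MatchParser
        # is not shown in this notebook).
        match.scorecards.to_csv(f'../data/processed/howstat/fall_of_wickets/fow_{i}.csv')
        log.info(f'Saved data from Match: {s_id}')
    except Exception as err:
        # Log the failing match ID and carry on, so one bad page does not stop the run.
        log.error(f'Match {s_id} failed: {err}')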


[INFO]: 2020-12-31 16:08:24,077: Parsing Match: 0001
[INFO]: 2020-12-31 16:08:25,759: Saved data from Match: 0001
[INFO]: 2020-12-31 16:08:25,760: Parsing Match: 0002
[INFO]: 2020-12-31 16:08:27,247: Saved data from Match: 0002
[INFO]: 2020-12-31 16:08:27,247: Parsing Match: 0003
[INFO]: 2020-12-31 16:08:28,474: Saved data from Match: 0003
[INFO]: 2020-12-31 16:08:28,474: Parsing Match: 0004
[INFO]: 2020-12-31 16:08:29,918: Saved data from Match: 0004
[INFO]: 2020-12-31 16:08:29,920: Parsing Match: 0005
[INFO]: 2020-12-31 16:08:32,864: Saved data from Match: 0005
[INFO]: 2020-12-31 16:08:32,865: Parsing Match: 0006
[INFO]: 2020-12-31 16:08:34,322: Saved data from Match: 0006
[INFO]: 2020-12-31 16:08:34,323: Parsing Match: 0007
[INFO]: 2020-12-31 16:08:35,739: Saved data from Match: 0007
[INFO]: 2020-12-31 16:08:35,740: Parsing Match: 0008
[INFO]: 2020-12-31 16:08:37,118: Saved data from Match: 0008
[INFO]: 2020-12-31 16:08:37,118: Parsing Match: 0009
...

[INFO]: 2020-12-31 16:10:18,612: Saved data from Match: 0072
[INFO]: 2020-12-31 16:10:18,612: Parsing Match: 0073
[INFO]: 2020-12-31 16:10:20,076: Saved data from Match: 0073
[INFO]: 2020-12-31 16:10:20,077: Parsing Match: 0074
[INFO]: 2020-12-31 16:10:21,319: Saved data from Match: 0074
[INFO]: 2020-12-31 16:10:21,320: Parsing Match: 0075
[INFO]: 2020-12-31 16:10:22,300: Saved data from Match: 0075
[INFO]: 2020-12-31 16:10:22,301: Parsing Match: 0076
[INFO]: 2020-12-31 16:10:23,541: Saved data from Match: 0076
[INFO]: 2020-12-31 16:10:23,542: Parsing Match: 0077
[INFO]: 2020-12-31 16:10:24,798: Saved data from Match: 0077
[INFO]: 2020-12-31 16:10:24,799: Parsing Match: 0078
[INFO]: 2020-12-31 16:10:25,996: Saved data from Match: 0078
[INFO]: 2020-12-31 16:10:25,997: Parsing Match: 0079
[INFO]: 2020-12-31 16:10:27,565: Saved data from Match: 0079
[INFO]: 2020-12-31 16:10:27,565: Parsing Match: 0080
[INFO]: 2020-12-31 16:10:28,894: Saved data from Match: 0080
...

[INFO]: 2020-12-31 16:12:04,501: Parsing Match: 0144
[INFO]: 2020-12-31 16:12:05,592: Saved data from Match: 0144
[INFO]: 2020-12-31 16:12:05,593: Parsing Match: 0145
[INFO]: 2020-12-31 16:12:06,888: Saved data from Match: 0145
[INFO]: 2020-12-31 16:12:06,889: Parsing Match: 0146
[INFO]: 2020-12-31 16:12:09,692: Saved data from Match: 0146
[INFO]: 2020-12-31 16:12:09,693: Parsing Match: 0147
[INFO]: 2020-12-31 16:12:11,230: Saved data from Match: 0147
[INFO]: 2020-12-31 16:12:11,231: Parsing Match: 0148
[INFO]: 2020-12-31 16:12:12,564: Saved data from Match: 0148
[INFO]: 2020-12-31 16:12:12,565: Parsing Match: 0149
[INFO]: 2020-12-31 16:12:13,733: Saved data from Match: 0149
[INFO]: 2020-12-31 16:12:13,733: Sleeping for 10 seconds
[INFO]: 2020-12-31 16:12:23,737: Parsing Match: 0150
[INFO]: 2020-12-31 16:12:24,699: Saved data from Match: 0150
[INFO]: 2020-12-31 16:12:24,699: Parsing Match: 0151
[INFO]: 2020-12-31 16:12:25,861: Saved data from Match: 0151
...

[INFO]: 2020-12-31 16:13:53,638: Parsing Match: 0215
[INFO]: 2020-12-31 16:13:55,341: Saved data from Match: 0215
[INFO]: 2020-12-31 16:13:55,343: Parsing Match: 0216
[INFO]: 2020-12-31 16:13:56,842: Saved data from Match: 0216
[INFO]: 2020-12-31 16:13:56,842: Parsing Match: 0217
[INFO]: 2020-12-31 16:13:57,954: Saved data from Match: 0217
[INFO]: 2020-12-31 16:13:57,955: Parsing Match: 0218
[INFO]: 2020-12-31 16:13:59,036: Saved data from Match: 0218
[INFO]: 2020-12-31 16:13:59,037: Parsing Match: 0219
[INFO]: 2020-12-31 16:14:00,068: Saved data from Match: 0219
[INFO]: 2020-12-31 16:14:00,068: Parsing Match: 0220
[INFO]: 2020-12-31 16:14:01,212: Saved data from Match: 0220
[INFO]: 2020-12-31 16:14:01,213: Parsing Match: 0221
[INFO]: 2020-12-31 16:14:02,870: Saved data from Match: 0221
[INFO]: 2020-12-31 16:14:02,871: Parsing Match: 0222
[INFO]: 2020-12-31 16:14:04,359: Saved data from Match: 0222
[INFO]: 2020-12-31 16:14:04,359: Parsing Match: 0223
...

[INFO]: 2020-12-31 16:15:20,847: Saved data from Match: 0286
[INFO]: 2020-12-31 16:15:20,847: Parsing Match: 0287
[INFO]: 2020-12-31 16:15:21,835: Saved data from Match: 0287
[INFO]: 2020-12-31 16:15:21,836: Parsing Match: 0288
[INFO]: 2020-12-31 16:15:22,768: Saved data from Match: 0288
[INFO]: 2020-12-31 16:15:22,769: Parsing Match: 0289
[INFO]: 2020-12-31 16:15:23,713: Saved data from Match: 0289
[INFO]: 2020-12-31 16:15:23,714: Parsing Match: 0290
[INFO]: 2020-12-31 16:15:24,466: Saved data from Match: 0290
[INFO]: 2020-12-31 16:15:24,467: Parsing Match: 0291
[INFO]: 2020-12-31 16:15:25,309: Saved data from Match: 0291
[INFO]: 2020-12-31 16:15:25,310: Parsing Match: 0292
[INFO]: 2020-12-31 16:15:26,537: Saved data from Match: 0292
[INFO]: 2020-12-31 16:15:26,538: Parsing Match: 0293
[INFO]: 2020-12-31 16:15:27,524: Saved data from Match: 0293
[INFO]: 2020-12-31 16:15:27,524: Parsing Match: 0294
[INFO]: 2020-12-31 16:15:28,376: Saved data from Match: 0294
...

[INFO]: 2020-12-31 16:16:55,593: Saved data from Match: 0357
[INFO]: 2020-12-31 16:16:55,593: Parsing Match: 0358
[INFO]: 2020-12-31 16:16:56,467: Saved data from Match: 0358
[INFO]: 2020-12-31 16:16:56,467: Parsing Match: 0359
[INFO]: 2020-12-31 16:16:57,709: Saved data from Match: 0359
[INFO]: 2020-12-31 16:16:57,710: Parsing Match: 0360
[INFO]: 2020-12-31 16:16:58,840: Saved data from Match: 0360
[INFO]: 2020-12-31 16:16:58,841: Parsing Match: 0361
[INFO]: 2020-12-31 16:16:59,858: Saved data from Match: 0361
[INFO]: 2020-12-31 16:16:59,858: Parsing Match: 0362
[INFO]: 2020-12-31 16:17:00,687: Saved data from Match: 0362
[INFO]: 2020-12-31 16:17:00,688: Parsing Match: 0363
[INFO]: 2020-12-31 16:17:01,720: Saved data from Match: 0363
[INFO]: 2020-12-31 16:17:01,721: Parsing Match: 0364
[INFO]: 2020-12-31 16:17:02,743: Saved data from Match: 0364
[INFO]: 2020-12-31 16:17:02,743: Parsing Match: 0365
[INFO]: 2020-12-31 16:17:03,656: Saved data from Match: 0365
...

[INFO]: 2020-12-31 16:18:24,876: Parsing Match: 0429
[INFO]: 2020-12-31 16:18:25,835: Saved data from Match: 0429
[INFO]: 2020-12-31 16:18:25,836: Parsing Match: 0430
[INFO]: 2020-12-31 16:18:26,852: Saved data from Match: 0430
[INFO]: 2020-12-31 16:18:26,852: Parsing Match: 0431
[INFO]: 2020-12-31 16:18:27,635: Saved data from Match: 0431
[INFO]: 2020-12-31 16:18:27,636: Parsing Match: 0432
[INFO]: 2020-12-31 16:18:28,482: Saved data from Match: 0432
[INFO]: 2020-12-31 16:18:28,483: Parsing Match: 0433
[INFO]: 2020-12-31 16:18:29,488: Saved data from Match: 0433
[INFO]: 2020-12-31 16:18:29,488: Parsing Match: 0434
[INFO]: 2020-12-31 16:18:30,460: Saved data from Match: 0434
[INFO]: 2020-12-31 16:18:30,461: Parsing Match: 0435
[INFO]: 2020-12-31 16:18:31,533: Saved data from Match: 0435
[INFO]: 2020-12-31 16:18:31,534: Parsing Match: 0436
[INFO]: 2020-12-31 16:18:32,467: Saved data from Match: 0436
[INFO]: 2020-12-31 16:18:32,468: Parsing Match: 0437
...