# Zero-inflated Poisson

The Zero-Inflated Poisson (ZIP) model extends the standard Poisson model to handle an excess of zero-goal outcomes in football match data.

A standard Poisson model can underestimate how often matches finish goalless, for example when teams play defensively or attack poorly.

The ZIP model addresses this by adding a separate component that models the probability of these excess zeros. This can improve predictions for match results, goal distributions, and betting markets such as correct score and over/under goals.

This makes it particularly useful for leagues or teams where 0-0 results occur more often than a simple Poisson distribution would suggest.
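
To make the idea concrete, here is a minimal sketch of the textbook zero-inflated Poisson probability mass function next to the plain Poisson one. The rate `lam` and mixing weight `pi` are hypothetical values chosen for illustration, and penaltyblog may apply its zero-inflation term differently internally.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    """Probability of exactly k goals under a plain Poisson with rate lam."""
    return math.exp(-lam) * lam**k / math.factorial(k)


def zip_pmf(k: int, lam: float, pi: float) -> float:
    """Textbook ZIP: with probability pi the count is forced to zero,
    otherwise it is drawn from Poisson(lam)."""
    base = (1.0 - pi) * poisson_pmf(k, lam)
    return pi + base if k == 0 else base


# Hypothetical values, not taken from the fitted model
lam, pi = 1.4, 0.08

for k in range(4):
    print(k, round(poisson_pmf(k, lam), 4), round(zip_pmf(k, lam, pi), 4))
```

The ZIP column puts more mass on zero and slightly less on every other count, while still summing to one.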

In [1]:
import penaltyblog as pb

## Get data from football-data.co.uk

In [2]:
fb = pb.scrapers.FootballData("ENG Premier League", "2019-2020")
df = fb.get_fixtures()

df.head()

id,date,datetime,season,competition,div,time,team_home,team_away,fthg,ftag,...,b365_cahh,b365_caha,pcahh,pcaha,max_cahh,max_caha,avg_cahh,avg_caha,goals_home,goals_away
1565308800---liverpool---norwich,2019-08-09,2019-08-09 20:00:00,2019-2020,ENG Premier League,E0,20:00,Liverpool,Norwich,4,1,...,1.91,1.99,1.94,1.98,1.99,2.07,1.9,1.99,4,1
1565395200---bournemouth---sheffield_united,2019-08-10,2019-08-10 15:00:00,2019-2020,ENG Premier League,E0,15:00,Bournemouth,Sheffield United,1,1,...,1.95,1.95,1.98,1.95,2.0,1.96,1.96,1.92,1,1
1565395200---burnley---southampton,2019-08-10,2019-08-10 15:00:00,2019-2020,ENG Premier League,E0,15:00,Burnley,Southampton,3,0,...,1.87,2.03,1.89,2.03,1.9,2.07,1.86,2.02,3,0
1565395200---crystal_palace---everton,2019-08-10,2019-08-10 15:00:00,2019-2020,ENG Premier League,E0,15:00,Crystal Palace,Everton,0,0,...,1.82,2.08,1.97,1.96,2.03,2.08,1.96,1.93,0,0
1565395200---tottenham---aston_villa,2019-08-10,2019-08-10 17:30:00,2019-2020,ENG Premier League,E0,17:30,Tottenham,Aston Villa,3,1,...,2.1,1.7,2.18,1.77,2.21,1.87,2.08,1.8,3,1


## Train the model

In [3]:
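# Pass the goals scored and the team names for every fixture
clf = pb.models.ZeroInflatedPoissonGoalsModel(
    df["goals_home"], df["goals_away"], df["team_home"], df["team_away"]
)
# Estimate the model's parameters from the match data
clf.fit()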

## The model's parameters

In [4]:
clf

Module: Penaltyblog

Model: Zero-inflated Poisson

Number of parameters: 42
Log Likelihood: -1057.712
AIC: 2199.424

Team                 Attack               Defence             
------------------------------------------------------------
Arsenal              1.133                -0.937              
Aston Villa          0.84                 -0.618              
Bournemouth          0.813                -0.65               
Brighton             0.776                -0.837              
Burnley              0.87                 -0.91               
Chelsea              1.349                -0.806              
Crystal Palace       0.543                -0.922              
Everton              0.899                -0.795              
Leicester            1.306                -1.084              
Liverpool            1.536                -1.283              
Man City             1.721                -1.206              
Man United           1.286                -1.216              
New

In [5]:
clf.get_params()

{'attack_Arsenal': np.float64(1.1331337818477354),
 'attack_Aston Villa': np.float64(0.8398548129814647),
 'attack_Bournemouth': np.float64(0.8130631756048761),
 'attack_Brighton': np.float64(0.7764416639692485),
 'attack_Burnley': np.float64(0.8703144960564634),
 'attack_Chelsea': np.float64(1.3486579034510162),
 'attack_Crystal Palace': np.float64(0.5425297905093729),
 'attack_Everton': np.float64(0.8994649884432993),
 'attack_Leicester': np.float64(1.305784180089934),
 'attack_Liverpool': np.float64(1.5361253390439675),
 'attack_Man City': np.float64(1.7212534209503916),
 'attack_Man United': np.float64(1.285594954551847),
 'attack_Newcastle': np.float64(0.7543806551711583),
 'attack_Norwich': np.float64(0.39137261829644243),
 'attack_Sheffield United': np.float64(0.7614940069485537),
 'attack_Southampton': np.float64(1.0516551449535727),
 'attack_Tottenham': np.float64(1.2178031343423397),
 'attack_Watford': np.float64(0.7064071488346974),
 'attack_West Ham': np.float64(1.013491352
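
As a rough guide to reading these numbers, the sketch below assumes the common log-linear parameterisation in which a team's expected goals are `exp(home_advantage + attack + defence_of_opponent)`. That form is an assumption here, not something this page confirms, and the `home_advantage` value is hypothetical because the fitted one is not shown in the output above. The attack and defence values are copied from the printed table.

```python
import math


def expected_goals(attack: float, opp_defence: float, home_advantage: float = 0.0) -> float:
    """Expected goals under an assumed log-linear parameterisation."""
    return math.exp(home_advantage + attack + opp_defence)


# Values copied from the parameter table above
attack_liverpool, defence_liverpool = 1.536, -1.283
attack_arsenal, defence_arsenal = 1.133, -0.937

# Hypothetical value: the fitted home advantage is not shown in the output
home_advantage = 0.25

home_xg = expected_goals(attack_liverpool, defence_arsenal, home_advantage)
away_xg = expected_goals(attack_arsenal, defence_liverpool)

print(f"Liverpool (home): {home_xg:.3f}")
print(f"Arsenal (away):   {away_xg:.3f}")
```

Under this reading, a larger attack value raises a team's own expected goals, and a more negative defence value lowers the opponent's.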

## Predict Match Outcomes

In [6]:
probs = clf.predict("Liverpool", "Wolves")
probs

Module: Penaltyblog

Class: FootballProbabilityGrid

Home Goal Expectation: 1.8967029982715198
Away Goal Expectation: 0.7772079741215803

Home Win: 0.6384951812791484
Draw: 0.21488467148040608
Away Win: 0.14662014531684164
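
A quick sanity check on these figures is to confirm that they sum to roughly one and to convert them to fair decimal odds (the reciprocal of each probability). The snippet below is a small illustration using the numbers printed above. The dictionary names are just for this example and are not part of penaltyblog.

```python
# Probabilities copied from the output above
probabilities = {
    "home": 0.6384951812791484,
    "draw": 0.21488467148040608,
    "away": 0.14662014531684164,
}

# The three outcomes should account for (almost) all of the probability mass
total = sum(probabilities.values())

# Fair decimal odds carry no bookmaker margin: odds = 1 / probability
fair_odds = {outcome: 1.0 / p for outcome, p in probabilities.items()}

print(f"total probability: {total:.6f}")
for outcome, odds in fair_odds.items():
    print(f"{outcome}: {odds:.2f}")
```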

### 1x2 Probabilities

In [7]:
probs.home_draw_away

[np.float64(0.6384951812791484),
 np.float64(0.21488467148040608),
 np.float64(0.14662014531684164)]

In [8]:
probs.home_win

np.float64(0.6384951812791484)

In [9]:
probs.draw

np.float64(0.21488467148040608)

In [10]:
probs.away_win

np.float64(0.14662014531684164)

### Probability of Total Goals >1.5

In [11]:
probs.total_goals("over", 1.5)

np.float64(0.7465658546429572)
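
Markets like this are derived by summing cells of a scoreline probability grid. The sketch below rebuilds such a grid by hand from the two goal expectations printed earlier, assuming independent plain Poisson scoring with no zero-inflation term. It illustrates the summation rather than reproducing the library's code, so its result should land close to, but need not exactly match, the figure above.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    return math.exp(-lam) * lam**k / math.factorial(k)


# Goal expectations copied from the prediction output
home_xg = 1.8967029982715198
away_xg = 0.7772079741215803
max_goals = 15  # truncation point for the grid (an arbitrary choice)

# grid[h][a] = P(home scores h, away scores a), assuming independence
grid = [
    [poisson_pmf(h, home_xg) * poisson_pmf(a, away_xg) for a in range(max_goals + 1)]
    for h in range(max_goals + 1)
]

scores = [(h, a) for h in range(max_goals + 1) for a in range(max_goals + 1)]
over_1_5 = sum(grid[h][a] for h, a in scores if h + a > 1.5)
under_1_5 = sum(grid[h][a] for h, a in scores if h + a < 1.5)

print(f"over 1.5:  {over_1_5:.4f}")
print(f"under 1.5: {under_1_5:.4f}")
```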

### Probability of Asian Handicap 1.5

In [12]:
probs.asian_handicap("home", 1.5)

np.float64(0.38439137986550426)
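
One way to read this value is as the probability that the home side wins by more than 1.5 goals, that is by two or more. The sketch below computes that quantity under the same simplifying assumption of independent plain Poisson scoring. Both the interpretation of the handicap line and the Poisson grid are assumptions made for illustration, not a description of the library's internals.

```python
import math


def poisson_pmf(k: int, lam: float) -> float:
    return math.exp(-lam) * lam**k / math.factorial(k)


# Goal expectations copied from the prediction output
home_xg = 1.8967029982715198
away_xg = 0.7772079741215803
max_goals = 15  # truncation point for the grid (an arbitrary choice)
line = 1.5

# Sum every scoreline where the home margin beats the handicap line
home_covers = sum(
    poisson_pmf(h, home_xg) * poisson_pmf(a, away_xg)
    for h in range(max_goals + 1)
    for a in range(max_goals + 1)
    if h - a > line
)

print(f"P(home wins by more than {line}): {home_covers:.4f}")
```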

### Probability of Both Teams Scoring

In [13]:
probs.both_teams_to_score

np.float64(0.4592307490223026)
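
If the two teams' goals are treated as independent plain Poisson counts, both teams to score has a simple closed form: each side scores at least once with probability `1 - exp(-expected_goals)`. The snippet below applies that shortcut to the goal expectations printed earlier. It ignores any zero-inflation adjustment, so treat it as an approximation to compare against the model's figure rather than a replica of it.

```python
import math

# Goal expectations copied from the prediction output
home_xg = 1.8967029982715198
away_xg = 0.7772079741215803

# P(team scores at least once) = 1 - P(team scores zero)
p_home_scores = 1.0 - math.exp(-home_xg)
p_away_scores = 1.0 - math.exp(-away_xg)

# Independence assumption: multiply the two probabilities
btts = p_home_scores * p_away_scores

print(f"P(both teams score): {btts:.4f}")
```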