This post is the first in what will hopefully be a series on probability theory and how to apply it, along with the tools we have at our disposal in Python, to sports betting. We're starting a bit late in the NFL season (there's literally one game left), but we're also hoping to come out with betting content for the NBA, so stay tuned and join our mailing list for updates on that.

In previous content, we haven't focused much on either betting or probability theory. That's partly because I had never really bet before, and I generally stuck to what I know, which is traditional redraft Fantasy Football. That all changed when I hit two parlays two weekends in a row during this year's NFL playoffs (took Bengals over Bills money line and SF to cover the spread, then took KC money line and Eagles to cover the spread) and turned $25 into $575. I'm not bragging, since I probably just rewired my brain to never enjoy another NFL game without having a little action on it (gamble at your own risk, please). So essentially I traded away the ability to ever again just sit comfortably and watch the game with the boys for $550. I've got the itch now, so to speak, which is half the reason I'm writing a lengthy post on doing this stuff in Python (and also planning on releasing a whole course).

The focus of this post is finding potentially profitable bets for this year's Super Bowl. We'll do so by teaching you a little bit about probability, then applying two different probability distributions to two different types of props (just to keep things simple), calculating probabilities for outcomes, and then comparing those to the implied probabilities of the lines we get from our sportsbook.

# How to Think Like a Profitable Bettor

Our rule will be that if the probability we calculate for an outcome occurring is greater than the probability implied by the line, then we take that bet. That should, in theory, make the bet positive expected value (or EV, for short).

It's worth noting that a positive EV bet isn't necessarily <i>likely</i> to hit. We could take a positive EV bet where the edge is 1 percentage point: the probability we get from our analysis is 44%, but the book's line implies 43%. There's still a 56% chance we lose our money, so losing is more likely than not. It's positive EV because that small edge will be realized over time, over a series of many bets. The essence of having an edge in any probabilistic endeavour like sports betting is that the edge unfolds over time, but there is of course an element of randomness that prevents a bettor from predicting individual occurrences (bets) with any confidence.
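To make the "edge unfolds over time" idea concrete, here is a minimal simulation sketch of the 44% vs. 43% example. It assumes a 1-unit stake on every bet and a payout that would be exactly fair at the 43% implied probability (no vig), and the function name and seed are just illustrative choices.

```python
import random

def simulate_average_profit(true_prob, implied_prob, n_bets, seed=0):
    """Average profit per 1-unit bet over n_bets simulated bets.

    Assumption: the payout is exactly fair at implied_prob (no vig).
    """
    rng = random.Random(seed)
    # Profit on a winning 1-unit bet when the line is fair at implied_prob.
    win_profit = (1 - implied_prob) / implied_prob
    total = 0.0
    for _ in range(n_bets):
        if rng.random() < true_prob:
            total += win_profit   # bet hits
        else:
            total -= 1.0          # bet misses, stake is lost
    return total / n_bets

# Theoretical EV per unit staked for the 44% vs. 43% example.
expected = 0.44 * (1 - 0.43) / 0.43 - 0.56

few = simulate_average_profit(0.44, 0.43, n_bets=20)
many = simulate_average_profit(0.44, 0.43, n_bets=200_000)

print(f"theoretical EV per unit: {expected:.4f}")
print(f"average over 20 bets: {few:.4f}")
print(f"average over 200,000 bets: {many:.4f}")
```

Over a handful of bets the average can land almost anywhere, while over a very large number of bets it should settle close to the theoretical value of roughly 2 cents per dollar staked.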

I personally learned this way of thinking through trading and investing, and it's exactly the mindset that's taught to successful investors. It applies just as well here in sports betting. The difference, though, is that it's actually <i>easier</i> to think this way in sports betting, because your R-factor, or risk-to-reward ratio, is already explicitly set by the line, whereas in investing these things are more fluid and you must set the R-factor yourself.

Remember that we are inferring the probabilities from the lines, but what the line is <i>explicitly</i> telling us is our risk-to-reward ratio. If we have a bet that's +200, that's a 2-to-1 payout and a 33% implied probability (100/(200+100)). 33% is also our breakeven point on any 2-to-1 bet, ever. That means that if we consistently took +200 bets, we would need a 33% win rate to break even. If we are able to push our win rate above 33%, even to, say, 35%, we've developed an edge and positive EV.
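The arithmetic above can be sketched in a few lines of Python. The function names here are my own, not from any library, and the conversion uses the standard American odds convention without removing the book's vig.

```python
def implied_probability(american_odds):
    """Breakeven win rate implied by American odds (vig not removed)."""
    if american_odds > 0:
        # Underdog-style line, e.g. +200: risk 100 to win 200.
        return 100 / (american_odds + 100)
    # Favorite-style line, e.g. -150: risk 150 to win 100.
    return -american_odds / (-american_odds + 100)

def profit_per_unit(american_odds):
    """Profit on a winning bet per 1 unit staked."""
    if american_odds > 0:
        return american_odds / 100
    return 100 / -american_odds

def expected_value(win_prob, american_odds):
    """Expected profit per 1 unit staked, given our own win probability."""
    return win_prob * profit_per_unit(american_odds) - (1 - win_prob)

print(f"+200 implied probability: {implied_probability(200):.4f}")
print(f"EV at a 33.3% win rate: {expected_value(1 / 3, 200):+.4f}")
print(f"EV at a 35% win rate: {expected_value(0.35, 200):+.4f}")
```

At exactly the breakeven win rate the EV comes out to zero, and nudging the win rate to 35% on +200 bets gives an EV of about 5 cents per dollar staked.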

This also means that, by definition, if you just take sports bets at random, your EV is 0: you are expected to neither lose nor gain money, assuming Vegas is right over the long run (it's not a bad assumption to make).

This example assumes you use proper bet sizing and money management so you don't irresponsibly increase your risk of ruin, and it also assumes no transaction costs, which would push your EV below 0 (there are always transaction costs, so our perfect-world example does fail).

This is also on average. Depending on how conservatively you size your bets, you may lose all of your money purely through bad luck (variance). For this reason, only ever bet money you are willing to lose.

In investing, literally no one will set that fixed payout structure for you, which can be the most challenging part of the risk management component of investing, intellectually and psychologically. 

We have a head start when it comes to sports betting, and we should probably take advantage of that fact (along with other advantages, including the fact that sports betting markets are significantly less efficient than global financial markets).

# Super Bowl Lines

I'm pulling the lines from Bovada here, and won't combine lines from multiple books because I want to keep things simple.

We'll be dealing with two types of player props, since they are, in my opinion, the easiest ones to model conceptually with our chosen probability distributions.

We'll be looking at pass attempts for Mahomes and Hurts, and Anytime TD Scorer for Travis Kelce, Miles Sanders, and AJ Brown. The lines are below:

Patrick Mahomes Over 38.5 Pass Attempts (-115)

Jalen Hurts Over 21.5 Pass Attempts (-120)

Travis Kelce Anytime TD Scorer (-400)

Miles Sanders Anytime TD Scorer (-140)

AJ Brown Anytime TD Scorer (-200)
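As a quick sketch, these five lines can be converted to implied probabilities, which are the breakeven marks our own estimates need to beat. The dictionary layout and function name are illustrative, and the vig is left in, so these numbers slightly overstate the book's true opinion.

```python
def implied_probability(american_odds):
    """Breakeven win rate implied by American odds (vig not removed)."""
    if american_odds > 0:
        return 100 / (american_odds + 100)
    return -american_odds / (-american_odds + 100)

# The Bovada lines quoted above.
lines = {
    "Patrick Mahomes Over 38.5 Pass Attempts": -115,
    "Jalen Hurts Over 21.5 Pass Attempts": -120,
    "Travis Kelce Anytime TD Scorer": -400,
    "Miles Sanders Anytime TD Scorer": -140,
    "AJ Brown Anytime TD Scorer": -200,
}

breakevens = {prop: implied_probability(odds) for prop, odds in lines.items()}

for prop, prob in breakevens.items():
    print(f"{prop}: {prob:.1%}")
```

So, for example, Kelce at -400 needs to score in 80% of comparable games just for that bet to break even.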


# Probability Theory

For pass attempts, we'll model our players using a Poisson distribution.
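Here is a rough sketch of how a Poisson model turns an average into the probability of an over hitting. It uses only the standard library so the arithmetic is visible; `scipy.stats.poisson.sf(38, mu)` should give the same tail probability. The mean of 40 attempts is a made-up placeholder, not a figure from real data.

```python
import math

def poisson_pmf(k, mu):
    """P(X = k) for X ~ Poisson(mu)."""
    return math.exp(-mu) * mu**k / math.factorial(k)

def prob_over(line, mu):
    """P(X > line) for X ~ Poisson(mu), with a half-point line like 38.5."""
    # Every count up to floor(line) loses the over, so sum those and subtract.
    max_losing_count = math.floor(line)
    losing_prob = sum(poisson_pmf(k, mu) for k in range(max_losing_count + 1))
    return 1.0 - losing_prob

# HYPOTHETICAL average attempts per game, for illustration only.
hypothetical_mean_attempts = 40.0

p_over = prob_over(38.5, hypothetical_mean_attempts)
print(f"P(over 38.5 attempts | mean = {hypothetical_mean_attempts}): {p_over:.3f}")
```

Comparing a number like `p_over` against the implied probability of the line is the take-it-or-leave-it test described earlier.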

In [None]:
%pip install nfl-data-py --quiet


In [None]:
from scipy import stats  # probability distributions (Poisson, etc.)

In [None]:
players = ['P.Mahomes', 'M.Sanders', 'J.Hurts']