# Marcel Projection

This notebook develops an implementation of the Marcel projection system, which was devised, but _not_ endorsed, by Tom Tango.

## About Marcel Projections

Tom Tango casually offered Marcel projections as a projection system for baseball stats that could serve as a baseline for player projections. I say "casually offered" because he said of them: "I do not stand behind these forecasts. These forecasts are the minimum level of competence that you should expect from any forecaster. Do not attach my name to these forecasts in any kind of evaluation experiment. They should only be referred to as Marcel The Monkey Forecasting System, or simply The Marcels." The system is named for a monkey on a sitcom (I think it was "Friends"). Marcels have been shown to be reasonably accurate at projecting baseball players despite being very simple.

The point of these is not to forecast baseball for "serious" purposes. Rather, they act as a baseline that a more elaborate forecasting system should be able to beat.

## How They Work

Short version: to project a player's stats in year N, take a weighted sum of the three previous seasons: 5 times season N - 1, 4 times season N - 2, and 3 times season N - 3. Then add two full seasons of major league average performance. Prorate that total to an expected number of plate appearances or innings pitched. In essence, it weights a player's observed stats against the simplest possible expected value, the mean. (Pitchers don't follow exactly this scheme.)
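
The short version above can be sketched in a few lines of plain Python. This is a minimal illustration for hitters, not necessarily how `MarcelForecaster` implements it. The function names are hypothetical, "two full seasons" is taken to mean 1200 plate appearances, and the playing-time rule (half of last season's PA, a tenth of the season before, plus 200) is my reading of Tango's write-up. The age adjustment is left out.

```python
WEIGHTS = (5, 4, 3)    # seasons N-1, N-2, N-3, most recent first
REGRESSION_PA = 1200   # assumption: "two full seasons" of league-average PA


def marcel_rate(stat_by_season, pa_by_season, league_rate):
    """Regressed per-PA rate from up to three prior seasons, most recent first."""
    weighted_stat = sum(w * s for w, s in zip(WEIGHTS, stat_by_season))
    weighted_pa = sum(w * pa for w, pa in zip(WEIGHTS, pa_by_season))
    return (weighted_stat + REGRESSION_PA * league_rate) / (weighted_pa + REGRESSION_PA)


def marcel_playing_time(pa_last, pa_two_ago):
    """Expected PA: assumed rule of 0.5 * PA(N-1) + 0.1 * PA(N-2) + 200."""
    return 0.5 * pa_last + 0.1 * pa_two_ago + 200


# Hypothetical hitter: home runs and PA for seasons N-1, N-2, N-3.
hr_rate = marcel_rate([30, 25, 20], [600, 550, 500], league_rate=0.03)
expected_pa = marcel_playing_time(600, 550)
projected_hr = hr_rate * expected_pa
print(round(expected_pa), round(projected_hr, 1))  # 555 24.3
```

Under these assumptions a player with no major league history gets the league-average rate and 200 PA, which is consistent with the 200.0 PA rows in the hitter projections further down.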

Complete details are [here](http://www.tangotiger.net/archives/stud0346.shtml). See Tango's comment #28 for pitcher details.

In [2]:
import numpy as np
import pandas as pd
import os  # for looking at the names of files in this directory

In [4]:
from marcel import MarcelForecaster

In [5]:
hitters = pd.read_csv('../data/hitters_since_1947.csv')
pitchers = pd.read_csv('../data/pitchers_since_1947.csv')
hitter_ages = pd.read_csv('../data/all_hitter_ages.csv')
pitcher_ages = pd.read_csv('../data/all_pitcher_ages.csv')

# Attach each player's age for the season; the left merge keeps rows that have no age match.
hitters = pd.merge(hitters, hitter_ages[['Season', 'playerid', 'Age']], how='left', on=['playerid', 'Season']).reset_index(drop=True)
pitchers = pd.merge(pitchers, pitcher_ages[['Season', 'playerid', 'Age']], how='left', on=['playerid', 'Season']).reset_index(drop=True)

In [6]:
pitchers.head()

,Season,Name,Team,W,L,ERA,G,GS,CG,ShO,...,ER,HR,BB,IBB,HBP,WP,BK,SO,playerid,Age
0,1947,Hi Bithorn,White Sox,1,0,0.0,2,0,0,0,...,0,0,0,,0,0,0,0,1000998,31
1,1947,Dizzy Dean,Browns,0,0,0.0,1,1,0,0,...,0,0,1,,0,0,0,0,1003106,37
2,1947,Buzz Dozier,Senators,0,0,0.0,2,0,0,0,...,0,0,1,,0,0,0,2,1003470,18
3,1947,Ernest Groth,Indians,0,0,0.0,2,0,0,0,...,0,0,1,,0,0,0,1,1005095,25
4,1947,Ken Johnson,Cardinals,1,0,0.0,2,1,1,0,...,0,0,5,,1,0,0,8,1006476,24


In [7]:

marcel = MarcelForecaster(pitchers,hitters,as_pandas=True)
marcel.hitter_stat_cols
#marcel.pitcher_stat_cols


Index(['G', 'AB', 'PA', 'H', '1B', '2B', '3B', 'HR', 'R', 'RBI', 'BB', 'IBB',
       'SO', 'HBP', 'SF', 'SH', 'GDP', 'SB', 'CS', 'AVG'],
      dtype='object')

In [53]:
#scaled_hitters = marcel._hit_step1(2020)
# Note: this rebinds `hitters` from the raw input data to the 2020 projections.
hitters = marcel.project_hitters(2020)

In [55]:
#prorating = marcel._hit_step4_prorating(2020).sort_values(by='PA')
hitters[(hitters['Age'] <= 21)]

playerid,Season,Name,Team,G,AB,PA,H,1B,2B,3B,...,SH,GDP,SB,CS,AVG,playerid,Age,SLG,OBP,OPS
15166,2020,Luiz Gohara,Braves,86.832992,179.05275,200.2,45.702983,28.818242,9.205165,0.885038,...,1.416946,4.393142,2.636569,0.895396,0.255249,15166,21,0.430386,0.32517,0.755556
15722,2020,Allen Cordoba,Padres,84.839049,178.440116,200.0,44.090042,30.313951,6.721005,1.250495,...,0.597356,3.222795,2.402132,1.196062,0.247086,15722,21,0.396356,0.320987,0.717343
17303,2020,Francis Martes,Astros,97.35321,178.704883,200.0,47.103141,29.701118,9.487175,0.912152,...,0.936355,4.003731,2.717343,0.922828,0.263581,17303,21,0.444435,0.335467,0.779902
18383,2020,Mike Soroka,Braves,98.57077,204.854858,232.3,46.215363,30.53667,8.547657,0.821821,...,6.351555,4.394086,2.448243,0.83144,0.225601,18383,21,0.367745,0.290027,0.657772
18401,2020,Ronald Acuna Jr.,Braves,159.260566,535.07806,606.2,156.171371,94.878344,25.354818,2.950382,...,0.506576,7.457836,24.999646,6.078858,0.291867,18401,21,0.535231,0.373514,0.908745
18694,2020,Kolby Allard,Rangers,89.489094,178.934498,201.3,47.246336,30.04653,9.376931,0.901552,...,2.30657,3.957206,2.685766,0.912104,0.264043,18694,21,0.442566,0.332858,0.775424
19611,2020,Vladimir Guerrero Jr.,Blue Jays,137.522992,411.206575,457.0,114.947851,74.748832,23.549313,1.944886,...,0.68493,13.788783,1.987695,1.240333,0.279538,19611,20,0.453547,0.350622,0.804169
19612,2020,Bo Bichette,Blue Jays,98.102958,277.868214,306.0,81.544782,46.832448,20.478041,0.741022,...,0.760685,4.671562,5.045487,3.327678,0.293466,19612,21,0.518176,0.354336,0.872513
19709,2020,Fernando Tatis Jr.,Padres,117.182467,345.915199,386.0,106.321163,65.818926,15.86371,4.682993,...,0.71275,5.706734,12.704875,4.274013,0.307362,19709,20,0.553365,0.375142,0.928508
19716,2020,Dustin May,Dodgers,91.184896,184.578183,205.5,47.135452,30.038485,9.320866,0.896162,...,0.919941,3.933546,2.669708,0.90665,0.255368,19716,21,0.427399,0.324941,0.752339
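
The `SLG`, `OBP`, and `OPS` columns in the output above are not among `marcel.hitter_stat_cols`, so they are presumably derived from the projected counting stats. The sketch below uses the standard definitions of those rates. The function names are hypothetical, and the package may compute them differently.

```python
def slugging(singles, doubles, triples, home_runs, at_bats):
    """SLG: total bases per at-bat."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats


def on_base(hits, walks, hit_by_pitch, at_bats, sac_flies):
    """OBP: times on base per AB + BB + HBP + SF."""
    return (hits + walks + hit_by_pitch) / (at_bats + walks + hit_by_pitch + sac_flies)


# Hypothetical projected line: 150 H (100 1B, 30 2B, 5 3B, 15 HR) in 500 AB.
slg = slugging(100, 30, 5, 15, 500)
obp = on_base(150, 50, 5, 500, 5)
ops = obp + slg
print(round(slg, 3), round(obp, 3), round(ops, 3))  # 0.47 0.366 0.836
```

Because the projected counting stats are fractional, the rates should be computed from the unrounded projections rather than from rounded totals.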


In [67]:
marcel.pitchers.columns
marcel.pitchers.head()
# Note: this rebinds `pitchers` from the raw input data to the 2020 projections.
pitchers = marcel.project_pitchers(2020)

In [71]:
pitchers.sort_values(by='FIP', ascending=True).head()

playerid,Season,Name,Team,W,L,ERA,G,GS,CG,ShO,...,IBB,HBP,WP,BK,SO,playerid,Age,FIP,K/9,BB/9
10954,2020,Jacob deGrom,Mets,10.012988,8.196679,2.737249,39.165321,26.605633,0.395765,0.018495,...,2.346681,5.48544,3.435418,0.11815,216.594545,10954,31,2.831456,10.6116,2.284522
3137,2020,Max Scherzer,Nationals,11.548285,6.700379,3.075129,36.054063,24.014015,0.723432,0.247091,...,2.478951,8.071997,2.719879,0.342623,213.940684,3137,34,2.845307,11.515465,2.277987
10603,2020,Chris Sale,Red Sox,8.359265,7.283785,3.568887,33.279113,21.530452,0.482114,0.353016,...,0.628474,9.828831,3.21358,0.110521,190.746269,10603,30,2.900242,12.229447,2.339531
9073,2020,Kirby Yates,Padres,2.814628,4.023649,3.033176,48.179858,2.975693,0.029212,0.014651,...,1.006092,4.178694,2.058121,0.188368,79.562796,9073,32,2.929435,11.662299,2.745746
12076,2020,Felipe Vazquez,Pirates,4.065949,2.334462,2.959649,48.139364,2.921473,0.028679,0.014384,...,0.894716,2.752931,2.671923,0.184935,74.85657,12076,27,3.012785,10.866276,2.757612
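
The `FIP`, `K/9`, and `BB/9` columns above are rate stats layered on top of the projected counting stats. The sketch below shows the usual formulas. The function names are hypothetical, the FIP constant of 3.10 is an assumed placeholder (the real constant varies by season), and `MarcelForecaster` may compute these differently.

```python
FIP_CONSTANT = 3.10  # assumption: placeholder; the league constant changes each season


def per_nine(count, innings):
    """Scale a counting stat to a per-nine-innings rate."""
    return 9 * count / innings


def fip(home_runs, walks, hit_by_pitch, strikeouts, innings, constant=FIP_CONSTANT):
    """Fielding Independent Pitching from HR, BB, HBP, and SO."""
    raw = (13 * home_runs + 3 * (walks + hit_by_pitch) - 2 * strikeouts) / innings
    return raw + constant


# Hypothetical projected line: 180 IP, 200 SO, 50 BB, 5 HBP, 20 HR.
k9 = per_nine(200, 180)
bb9 = per_nine(50, 180)
print(round(k9, 2), round(bb9, 2), round(fip(20, 50, 5, 200, 180), 2))  # 10.0 2.5 3.24
```

Innings here are treated as true decimals, which fits projected values; box-score notation such as 54.1 for 54 1/3 innings would need converting first.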


In [137]:
marcel.expected_mean_pitcher(2020).to_dict()

{'Season': 2018.1666666666667,
 'W': 3.0367303071680873,
 'L': 3.0367303071680873,
 'ERA': 5.847070224491219,
 'G': 26.381013513566757,
 'GS': 6.073460614336175,
 'CG': 0.059621503114338927,
 'ShO': 0.02990347444244394,
 'SV': 1.5010360504908942,
 'HLD': 3.1259730273385693,
 'BS': 0.8077962867471701,
 'IP': 54.01530616395814,
 'TBF': 232.11361310712672,
 'H': 52.16922028963712,
 'R': 28.267709217301455,
 'ER': 26.200651000332446,
 'HR': 7.749032043428635,
 'BB': 19.755219561851067,
 'IBB': 1.086317006102944,
 'HBP': 2.380395861422784,
 'WP': 2.2663944934634954,
 'BK': 0.1910347156856312,
 'SO': 51.94216290973079,
 'playerid': 11982.895321932285}
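
The `Season` value of 2018.1667 in the output above is exactly what a 5/4/3 weighted average of 2019, 2018, and 2017 gives. That suggests the expected mean is a weighted average over the three prior seasons, though this is an inference from the numbers and not something confirmed from the package source. It also means `Season` and `playerid` are averaged along with the real stats, so those two entries carry no meaning. A quick check with a hypothetical helper:

```python
def weighted_mean(values, weights):
    """Weighted average of values, paired by position with weights."""
    return sum(w * v for v, w in zip(values, weights)) / sum(weights)


# Seasons N-1, N-2, N-3 for a 2020 projection, weighted 5/4/3.
season = weighted_mean([2019, 2018, 2017], [5, 4, 3])
print(round(season, 4))  # 2018.1667
```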