# Baseball - Predicting players' salary

*NOTE: Due to featuretools' memory requirement, this notebook will not run on MyBinder.*

In this notebook, we will benchmark several of getML's feature learning algorithms against featuretools using a dataset related to baseball players' salary.

Summary:

- Prediction type: __Regression model__
- Domain: __Sports__
- Prediction target: __Salaries__ 
- Population size: __23111__

_Author: Dr. Patrick Urbanke_

# Background

In the late 1990s, the Oakland Athletics began focusing on *sabermetrics*, the use of statistical methods to identify undervalued baseball players. The aim was to compensate for a budget significantly smaller than that of most other teams in the league. Under general manager Billy Beane, the Oakland Athletics became the first team in over 100 years to win 20 consecutive games, despite still being significantly disadvantaged in terms of budget. After this remarkable success, the use of sabermetrics quickly became the norm in baseball. These events have been documented in a bestselling book and a movie, both called *Moneyball*.

In this notebook, we will demonstrate that relational learning can be used for sabermetrics. Specifically, we will develop a model to predict players' salary using getML's statistical relational learning algorithms. Such predictions can then be used to identify undervalued players. 
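
To make the idea of "undervalued" concrete, here is a minimal sketch in plain Python. It assumes we already have a predicted salary for each player; the player IDs, the numbers, and the helper `undervalued` are hypothetical and not taken from the dataset or from getML.

```python
# Hypothetical actual and predicted salaries (illustrative numbers only).
players = [
    {"playerID": "player_a", "salary": 500_000, "predicted": 2_000_000},
    {"playerID": "player_b", "salary": 4_000_000, "predicted": 3_500_000},
    {"playerID": "player_c", "salary": 900_000, "predicted": 1_200_000},
]


def undervalued(rows, min_gap=0):
    """Return (playerID, gap) pairs where the prediction exceeds the salary.

    The gap is predicted minus actual salary; results are sorted so that
    the largest gap comes first.
    """
    gaps = [(row["playerID"], row["predicted"] - row["salary"]) for row in rows]
    return sorted(
        (gap for gap in gaps if gap[1] > min_gap),
        key=lambda gap: gap[1],
        reverse=True,
    )


ranking = undervalued(players)
print(ranking)
```

A player whose predicted salary lies well above the actual salary is a candidate for being undervalued, subject to the accuracy of the model.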

The dataset has been downloaded from the [CTU Prague relational learning repository](https://relational.fit.cvut.cz/dataset/Lahman) (Motl and Schulte, 2015).

We will benchmark [getML](https://www.getml.com)'s feature learning algorithms against [featuretools](https://www.featuretools.com), an open-source implementation of the propositionalization algorithm, similar to getML's FastProp.

### A web frontend for getML

The getML monitor is a frontend built to support your work with getML. It displays information such as the imported data frames and trained pipelines, and it allows easy data and feature exploration. You can launch the getML monitor [here](http://localhost:1709).

### Where is this running?

Your getML live session is running inside a docker container on [mybinder.org](https://mybinder.org/), a service built by the Jupyter community and funded by Google Cloud, OVH, GESIS Notebooks and the Turing Institute. As it is a free service, this session will shut down after 10 minutes of inactivity.

# Analysis

Let's get started with the analysis and set up your session:

In [1]:
import copy
import os
from urllib import request

import numpy as np
import pandas as pd
from IPython.display import Image
import matplotlib.pyplot as plt
plt.style.use('seaborn')
%matplotlib inline  

import featuretools
import getml

getml.engine.set_project('baseball')




Connected to project 'baseball'


## 1. Loading data

### 1.1 Download from source

We begin by downloading the data:

In [2]:
conn = getml.database.connect_mariadb(
    host="relational.fit.cvut.cz",
    dbname="lahman_2014",
    port=3306,
    user="guest",
    password="relational"
)

conn

Connection(conn_id='default',
           dbname='lahman_2014',
           dialect='mysql',
           host='relational.fit.cvut.cz',
           port=3306)

In [3]:
def load_if_needed(name):
    """
    Loads the data from the relational learning
    repository, if the data frame has not already
    been loaded.
    """
    if not getml.data.exists(name):
        data_frame = getml.data.DataFrame.from_db(
            name=name,
            table_name=name,
            conn=conn
        )
        data_frame.save()
    else:
        data_frame = getml.data.load_data_frame(name)
    return data_frame

In [4]:
allstarfull = load_if_needed("allstarfull")
awardsplayers = load_if_needed("awardsplayers")
awardsshareplayers = load_if_needed("awardsshareplayers")
batting = load_if_needed("batting")
battingpost = load_if_needed("battingpost")
fielding = load_if_needed("fielding")
fieldingpost = load_if_needed("fieldingpost")
pitching = load_if_needed("pitching")
pitchingpost = load_if_needed("pitchingpost")
salaries = load_if_needed("salaries")

In [5]:
allstarfull

name,yearID,gameNum,GP,startingPos,playerID,gameID,teamID,lgID
role,unused_float,unused_float,unused_float,unused_float,unused_string,unused_string,unused_string,unused_string
0.0,1955,0,1,,aaronha01,NLS195507120,ML1,NL
1.0,1956,0,1,,aaronha01,ALS195607100,ML1,NL
2.0,1957,0,1,9,aaronha01,NLS195707090,ML1,NL
3.0,1958,0,1,9,aaronha01,ALS195807080,ML1,NL
4.0,1959,1,1,9,aaronha01,NLS195907070,ML1,NL
,...,...,...,...,...,...,...,...
4826.0,1978,0,1,9,ziskri01,NLS197807110,TEX,AL
4827.0,2002,0,1,,zitoba01,NLS200207090,OAK,AL
4828.0,2003,0,0,,zitoba01,ALS200307150,OAK,AL
4829.0,2006,0,1,,zitoba01,NLS200607110,OAK,AL


In [6]:
awardsplayers

name,yearID,playerID,awardID,lgID,tie,notes
role,unused_float,unused_string,unused_string,unused_string,unused_string,unused_string
0.0,1877,bondto01,Pitching Triple Crown,NL,,
1.0,1878,hinespa01,Triple Crown,NL,,
2.0,1884,heckegu01,Pitching Triple Crown,AA,,
3.0,1884,radboch01,Pitching Triple Crown,NL,,
4.0,1887,oneilti01,Triple Crown,AA,,
,...,...,...,...,...,...
5790.0,2012,larocad01,Silver Slugger,NL,,1B
5791.0,2012,cabremi01,Triple Crown,NL,,
5792.0,2012,cabremi01,TSN Major League Player of the Y...,ML,,
5793.0,2012,verlaju01,TSN Pitcher of the Year,AL,Y,


In [7]:
awardsshareplayers

name,yearID,pointsWon,pointsMax,votesFirst,awardID,lgID,playerID
role,unused_float,unused_float,unused_float,unused_float,unused_string,unused_string,unused_string
0.0,1956,1,16,1,Cy Young,ML,fordwh01
1.0,1956,4,16,4,Cy Young,ML,maglisa01
2.0,1956,10,16,10,Cy Young,ML,newcodo01
3.0,1956,1,16,1,Cy Young,ML,spahnwa01
4.0,1957,1,16,1,Cy Young,ML,donovdi01
,...,...,...,...,...,...,...
6284.0,2006,1,160,0,Rookie of the Year,NL,willijo03
6285.0,2006,101,160,10,Rookie of the Year,NL,zimmery01
6286.0,2008,3,140,0,Rookie of the Year,AL,devinjo01
6287.0,2008,158,160,31,Rookie of the Year,NL,sotoge01


In [8]:
batting

name,yearID,stint,G,G_batting,AB,R,H,2B,3B,HR,RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP,G_old,playerID,teamID,lgID
role,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_string,unused_string,unused_string
0.0,2004,1,11,11,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,11,aardsda01,SFN,NL
1.0,2006,1,45,43,2,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,45,aardsda01,CHN,NL
2.0,2007,1,25,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,aardsda01,CHA,AL
3.0,2008,1,47,5,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,5,aardsda01,BOS,AL
4.0,2009,1,73,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,aardsda01,SEA,AL
,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
92348.0,1959,1,6,6,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,6,zuverge01,BAL,AL
92349.0,1910,1,27,27,87,7,16,5,0,0,5,1,,11,,,1,1,,,27,zwilldu01,CHA,AL
92350.0,1914,1,154,154,592,91,185,38,8,16,95,21,,46,68,,1,10,,,154,zwilldu01,CHF,FL
92351.0,1915,1,150,150,548,65,157,32,7,13,94,24,,67,65,,2,18,,,150,zwilldu01,CHF,FL


In [9]:
battingpost

name,yearID,G,AB,R,H,2B,3B,HR,RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP,round,playerID,teamID,lgID
role,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_string,unused_string,unused_string,unused_string
0.0,1884,1,2,0,1,0,0,0,0,0,,0,0,0,,,,,WS,becanbu01,NY4,AA
1.0,1884,3,10,1,0,0,0,0,0,0,,0,1,0,,,,,WS,bradyst01,NY4,AA
2.0,1884,3,10,2,1,0,0,0,1,0,,1,1,0,,,,,WS,carrocl01,PRO,NL
3.0,1884,3,9,3,4,0,1,1,2,0,,0,3,0,,,,,WS,dennyje01,PRO,NL
4.0,1884,3,10,0,3,1,0,0,0,1,,0,3,0,,,,,WS,esterdu01,NY4,AA
,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
9793.0,2012,2,5,1,1,0,0,0,0,0,0,0,2,,,,,,WS,theriry01,SFN,NL
9794.0,2012,1,0,0,0,0,0,0,0,0,0,0,0,,,,,,WS,valvejo01,DET,AL
9795.0,2012,1,1,0,0,0,0,0,0,0,0,0,0,,,,,,WS,verlaju01,DET,AL
9796.0,2012,1,0,0,0,0,0,0,0,0,0,0,0,,,,,,WS,vogelry01,SFN,NL


In [10]:
fielding

name,yearID,stint,G,GS,InnOuts,PO,A,E,DP,PB,WP,SB,CS,ZR,playerID,teamID,lgID,POS
role,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_string,unused_string,unused_string,unused_string
0.0,2004,1,11,0,32,0,0,0,0,,,,,,aardsda01,SFN,NL,P
1.0,2006,1,45,0,159,1,5,0,1,,,,,,aardsda01,CHN,NL,P
2.0,2007,1,25,0,97,2,4,1,0,,,,,,aardsda01,CHA,AL,P
3.0,2008,1,47,0,146,3,6,0,0,,,,,,aardsda01,BOS,AL,P
4.0,2009,1,73,0,214,2,5,0,1,0,,0,0,,aardsda01,SEA,AL,P
,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
137970.0,1910,1,27,,,45,2,3,1,,,,,,zwilldu01,CHA,AL,OF
137971.0,1914,1,154,,,340,15,14,3,,,,,,zwilldu01,CHF,FL,OF
137972.0,1915,1,3,,,3,0,0,0,,,,,,zwilldu01,CHF,FL,1B
137973.0,1915,1,148,,,356,20,8,6,,,,,,zwilldu01,CHF,FL,OF


In [11]:
fieldingpost

name,yearID,G,GS,InnOuts,PO,A,E,DP,TP,PB,SB,CS,playerID,teamID,lgID,round,POS
role,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_string,unused_string,unused_string,unused_string,unused_string
0.0,1957,7,7,186,11,0,0,0,0,,,,aaronha01,ML1,NL,WS,CF
1.0,1958,1,1,21,2,0,0,0,0,,,,aaronha01,ML1,NL,WS,CF
2.0,1958,7,6,168,13,0,0,0,0,,,,aaronha01,ML1,NL,WS,RF
3.0,1969,3,3,78,5,1,0,0,0,,,,aaronha01,ATL,NL,NLCS,RF
4.0,1979,2,0,15,0,1,0,0,0,,0,0,aasedo01,CAL,AL,ALCS,P
,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
10341.0,2006,1,1,24,0,1,0,0,0,,0,1,zitoba01,OAK,AL,ALDS2,P
10342.0,2012,1,1,23,0,1,0,0,0,0,0,0,zitoba01,SFN,NL,NLCS,P
10343.0,2012,1,1,8,0,0,0,0,0,0,0,0,zitoba01,SFN,NL,NLDS2,P
10344.0,2012,1,1,17,0,0,0,0,0,0,0,0,zitoba01,SFN,NL,WS,P


In [12]:
pitching

name,yearID,stint,W,L,G,GS,CG,SHO,SV,IPouts,H,ER,HR,BB,SO,BAOpp,ERA,IBB,WP,HBP,BK,BFP,GF,R,SH,SF,GIDP,playerID,teamID,lgID
role,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_string,unused_string,unused_string
0.0,2004,1,1,0,11,0,0,0,0,32,20,8,1,10,5,0,6,0,0,2,0,61,5,8,,,,aardsda01,SFN,NL
1.0,2006,1,3,0,45,0,0,0,0,159,41,24,9,28,49,,4,0,1,1,0,225,9,25,,,,aardsda01,CHN,NL
2.0,2007,1,2,1,25,0,0,0,0,97,39,23,4,17,36,,6,3,2,1,0,151,7,24,,,,aardsda01,CHA,AL
3.0,2008,1,4,2,47,0,0,0,0,146,49,30,4,35,49,,5,2,3,5,0,228,7,32,,,,aardsda01,BOS,AL
4.0,2009,1,3,6,73,0,0,0,38,214,49,20,4,34,80,,2,3,2,0,0,296,53,23,,,,aardsda01,SEA,AL
,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
39356.0,1955,2,4,3,28,5,0,0,4,259,80,21,5,17,31,0,2,1,2,4,0,333,16,28,,,,zuverge01,BAL,AL
39357.0,1956,1,7,6,62,0,0,0,16,292,112,45,6,34,33,0,4,9,1,3,1,432,40,52,,,,zuverge01,BAL,AL
39358.0,1957,1,10,6,56,0,0,0,9,338,105,31,9,39,36,0,2,13,1,4,0,475,37,37,,,,zuverge01,BAL,AL
39359.0,1958,1,2,2,45,0,0,0,7,207,74,26,4,17,22,0,3,3,2,6,0,294,23,29,,,,zuverge01,BAL,AL


In [13]:
pitchingpost

name,yearID,W,L,G,GS,CG,SHO,SV,IPouts,H,ER,HR,BB,SO,BAOpp,ERA,IBB,WP,HBP,BK,BFP,GF,R,SH,SF,GIDP,playerID,round,teamID,lgID
role,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float,unused_string,unused_string,unused_string,unused_string
0.0,1979,1,0,2,0,0,0,0,15,4,1,0,2,6,0,1,1,0,0,0,20,2,1,0,1,0,aasedo01,ALCS,CAL,AL
1.0,1975,0,0,1,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,3,1,0,0,0,0,abbotgl01,ALCS,OAK,AL
2.0,2000,0,1,1,1,0,0,0,15,3,3,1,3,3,0,5,0,0,0,0,21,0,3,0,0,0,abbotpa01,ALCS,SEA,AL
3.0,2000,1,0,1,1,0,0,0,17,5,1,0,3,1,0,1,0,0,1,0,25,0,2,0,1,1,abbotpa01,ALDS2,SEA,AL
4.0,2001,0,0,1,1,0,0,0,15,0,0,0,8,2,0,0,0,0,0,0,21,0,0,1,0,0,abbotpa01,ALCS,SEA,AL
,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4192.0,2006,1,0,1,1,0,0,0,24,4,1,1,3,1,0,1,0,0,0,0,30,0,1,1,0,0,zitoba01,ALDS2,OAK,AL
4193.0,2012,1,0,1,1,0,0,0,23,6,0,0,1,6,0,0,1,0,0,,29,0,0,,,1,zitoba01,NLCS,SFN,NL
4194.0,2012,0,0,1,1,0,0,0,7,4,2,1,4,4,0,6,0,0,0,,16,0,2,,,0,zitoba01,NLDS2,SFN,NL
4195.0,2012,1,0,1,1,0,0,0,17,6,1,0,1,3,0,1,0,0,0,,23,0,1,,,1,zitoba01,WS,SFN,NL


In [14]:
salaries

name,yearID,salary,teamID,lgID,playerID
role,unused_float,unused_float,unused_string,unused_string,unused_string
0.0,1985,870000,ATL,NL,barkele01
1.0,1985,550000,ATL,NL,bedrost01
2.0,1985,545000,ATL,NL,benedbr01
3.0,1985,633333,ATL,NL,campri01
4.0,1985,625000,ATL,NL,ceronri01
,...,...,...,...,...
23106.0,2012,750000,WAS,NL,tracych01
23107.0,2012,4000000,WAS,NL,wangch01
23108.0,2012,13571428,WAS,NL,werthja01
23109.0,2012,2300000,WAS,NL,zimmejo02


### 1.2 Prepare data for getML

getML requires that we define *roles* for each of the columns.
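
The cells in this section also derive a `year` time stamp from `yearID` by casting it to a string and parsing it with the format `%Y`. As a plain-Python analogue (a sketch using the standard library, not getML's implementation; the helper name `year_to_timestamp` is hypothetical), the conversion can be pictured like this:

```python
from datetime import datetime


def year_to_timestamp(year_id):
    """Parse a season such as 1955 into a datetime at the start of that year."""
    # Cast to an integer string first, then parse with the "%Y" format.
    return datetime.strptime(str(int(year_id)), "%Y")


parsed = year_to_timestamp(1955.0)
print(parsed.date())  # 1955-01-01
```

Because only the year is given, every record of a season receives the same time stamp, January 1 of that year.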

In [15]:
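# Derive a time stamp column from the season, parsed with the "%Y" format.
allstarfull["year"] = allstarfull["yearID"].as_str().as_ts(["%Y"])

# playerID is the key used to join this table to the population table.
allstarfull.set_role(["playerID"], getml.data.roles.join_key)
# All remaining string columns become categorical, all remaining float columns numerical.
allstarfull.set_role(allstarfull.roles.unused_string, getml.data.roles.categorical)
allstarfull.set_role(allstarfull.roles.unused_float, getml.data.roles.numerical)
allstarfull.set_role("year", getml.data.roles.time_stamp)

# yearID is now represented by "year", so mark it as unused again.
allstarfull.set_role("yearID", getml.data.roles.unused_float)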

allstarfull

name,year,playerID,gameID,teamID,lgID,gameNum,GP,startingPos,yearID
role,time_stamp,join_key,categorical,categorical,categorical,numerical,numerical,numerical,unused_float
unit,"time stamp, comparison only",,,,,,,,
0.0,1955-01-01,aaronha01,NLS195507120,ML1,NL,0,1,,1955
1.0,1956-01-01,aaronha01,ALS195607100,ML1,NL,0,1,,1956
2.0,1957-01-01,aaronha01,NLS195707090,ML1,NL,0,1,9,1957
3.0,1958-01-01,aaronha01,ALS195807080,ML1,NL,0,1,9,1958
4.0,1959-01-01,aaronha01,NLS195907070,ML1,NL,1,1,9,1959
,...,...,...,...,...,...,...,...,...
4826.0,1978-01-01,ziskri01,NLS197807110,TEX,AL,0,1,9,1978
4827.0,2002-01-01,zitoba01,NLS200207090,OAK,AL,0,1,,2002
4828.0,2003-01-01,zitoba01,ALS200307150,OAK,AL,0,0,,2003
4829.0,2006-01-01,zitoba01,NLS200607110,OAK,AL,0,1,,2006


In [16]:
awardsplayers["year"] = awardsplayers["yearID"].as_str().as_ts(["%Y"])

awardsplayers.set_role(["playerID"], getml.data.roles.join_key)
awardsplayers.set_role(["awardID", "lgID", "notes"], getml.data.roles.categorical)
awardsplayers.set_role("year", getml.data.roles.time_stamp)

awardsplayers

name,year,playerID,awardID,lgID,notes,yearID,tie
role,time_stamp,join_key,categorical,categorical,categorical,unused_float,unused_string
unit,"time stamp, comparison only",,,,,,
0.0,1877-01-01,bondto01,Pitching Triple Crown,NL,,1877,
1.0,1878-01-01,hinespa01,Triple Crown,NL,,1878,
2.0,1884-01-01,heckegu01,Pitching Triple Crown,AA,,1884,
3.0,1884-01-01,radboch01,Pitching Triple Crown,NL,,1884,
4.0,1887-01-01,oneilti01,Triple Crown,AA,,1887,
,...,...,...,...,...,...,...
5790.0,2012-01-01,larocad01,Silver Slugger,NL,1B,2012,
5791.0,2012-01-01,cabremi01,Triple Crown,NL,,2012,
5792.0,2012-01-01,cabremi01,TSN Major League Player of the Y...,ML,,2012,
5793.0,2012-01-01,verlaju01,TSN Pitcher of the Year,AL,,2012,Y


In [17]:
awardsshareplayers["year"] = awardsshareplayers["yearID"].as_str().as_ts(["%Y"])

awardsshareplayers.set_role(["playerID"], getml.data.roles.join_key)
awardsshareplayers.set_role(awardsshareplayers.roles.unused_float, getml.data.roles.numerical)
awardsshareplayers.set_role(awardsshareplayers.roles.unused_string, getml.data.roles.categorical)
awardsshareplayers.set_role("yearID", getml.data.roles.unused_float)
awardsshareplayers.set_role("year", getml.data.roles.time_stamp)

awardsshareplayers

name,year,playerID,awardID,lgID,pointsWon,pointsMax,votesFirst,yearID
role,time_stamp,join_key,categorical,categorical,numerical,numerical,numerical,unused_float
unit,"time stamp, comparison only",,,,,,,
0.0,1956-01-01,fordwh01,Cy Young,ML,1,16,1,1956
1.0,1956-01-01,maglisa01,Cy Young,ML,4,16,4,1956
2.0,1956-01-01,newcodo01,Cy Young,ML,10,16,10,1956
3.0,1956-01-01,spahnwa01,Cy Young,ML,1,16,1,1956
4.0,1957-01-01,donovdi01,Cy Young,ML,1,16,1,1957
,...,...,...,...,...,...,...,...
6284.0,2006-01-01,willijo03,Rookie of the Year,NL,1,160,0,2006
6285.0,2006-01-01,zimmery01,Rookie of the Year,NL,101,160,10,2006
6286.0,2008-01-01,devinjo01,Rookie of the Year,AL,3,140,0,2008
6287.0,2008-01-01,sotoge01,Rookie of the Year,NL,158,160,31,2008


In [18]:
batting["year"] = batting["yearID"].as_str().as_ts(["%Y"])

batting.set_role(["playerID", "teamID"], getml.data.roles.join_key)
batting.set_role(batting.roles.unused_float, getml.data.roles.numerical)
batting.set_role(batting.roles.unused_string, getml.data.roles.categorical)
batting.set_role("yearID", getml.data.roles.unused_float)
batting.set_role("year", getml.data.roles.time_stamp)

batting

name,year,playerID,teamID,lgID,stint,G,G_batting,AB,R,H,2B,3B,HR,RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP,G_old,yearID
role,time_stamp,join_key,join_key,categorical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,unused_float
unit,"time stamp, comparison only",,,,,,,,,,,,,,,,,,,,,,,,
0.0,2004-01-01,aardsda01,SFN,NL,1,11,11,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,11,2004
1.0,2006-01-01,aardsda01,CHN,NL,1,45,43,2,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,45,2006
2.0,2007-01-01,aardsda01,CHA,AL,1,25,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,2007
3.0,2008-01-01,aardsda01,BOS,AL,1,47,5,1,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,5,2008
4.0,2009-01-01,aardsda01,SEA,AL,1,73,3,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,2009
,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
92348.0,1959-01-01,zuverge01,BAL,AL,1,6,6,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,6,1959
92349.0,1910-01-01,zwilldu01,CHA,AL,1,27,27,87,7,16,5,0,0,5,1,,11,,,1,1,,,27,1910
92350.0,1914-01-01,zwilldu01,CHF,FL,1,154,154,592,91,185,38,8,16,95,21,,46,68,,1,10,,,154,1914
92351.0,1915-01-01,zwilldu01,CHF,FL,1,150,150,548,65,157,32,7,13,94,24,,67,65,,2,18,,,150,1915


In [19]:
battingpost["year"] = battingpost["yearID"].as_str().as_ts(["%Y"])

battingpost.set_role(["playerID", "teamID"], getml.data.roles.join_key)
battingpost.set_role(battingpost.roles.unused_float, getml.data.roles.numerical)
battingpost.set_role(battingpost.roles.unused_string, getml.data.roles.categorical)
battingpost.set_role("yearID", getml.data.roles.unused_float)
battingpost.set_role("year", getml.data.roles.time_stamp)

battingpost

name,year,playerID,teamID,round,lgID,G,AB,R,H,2B,3B,HR,RBI,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP,yearID
role,time_stamp,join_key,join_key,categorical,categorical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,unused_float
unit,"time stamp, comparison only",,,,,,,,,,,,,,,,,,,,,,
0.0,1884-01-01,becanbu01,NY4,WS,AA,1,2,0,1,0,0,0,0,0,,0,0,0,,,,,1884
1.0,1884-01-01,bradyst01,NY4,WS,AA,3,10,1,0,0,0,0,0,0,,0,1,0,,,,,1884
2.0,1884-01-01,carrocl01,PRO,WS,NL,3,10,2,1,0,0,0,1,0,,1,1,0,,,,,1884
3.0,1884-01-01,dennyje01,PRO,WS,NL,3,9,3,4,0,1,1,2,0,,0,3,0,,,,,1884
4.0,1884-01-01,esterdu01,NY4,WS,AA,3,10,0,3,1,0,0,0,1,,0,3,0,,,,,1884
,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
9793.0,2012-01-01,theriry01,SFN,WS,NL,2,5,1,1,0,0,0,0,0,0,0,2,,,,,,2012
9794.0,2012-01-01,valvejo01,DET,WS,AL,1,0,0,0,0,0,0,0,0,0,0,0,,,,,,2012
9795.0,2012-01-01,verlaju01,DET,WS,AL,1,1,0,0,0,0,0,0,0,0,0,0,,,,,,2012
9796.0,2012-01-01,vogelry01,SFN,WS,NL,1,0,0,0,0,0,0,0,0,0,0,0,,,,,,2012


In [20]:
fielding["year"] = fielding["yearID"].as_str().as_ts(["%Y"])

fielding.set_role(["playerID", "teamID"], getml.data.roles.join_key)
fielding.set_role(["stint", "G","GS","InnOuts", "PO", "A", "E", "DP"], getml.data.roles.numerical)
fielding.set_role(fielding.roles.unused_string, getml.data.roles.categorical)
fielding.set_role("year", getml.data.roles.time_stamp)

fielding

name,year,playerID,teamID,lgID,POS,stint,G,GS,InnOuts,PO,A,E,DP,yearID,PB,WP,SB,CS,ZR
role,time_stamp,join_key,join_key,categorical,categorical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,unused_float,unused_float,unused_float,unused_float,unused_float,unused_float
unit,"time stamp, comparison only",,,,,,,,,,,,,,,,,,
0.0,2004-01-01,aardsda01,SFN,NL,P,1,11,0,32,0,0,0,0,2004,,,,,
1.0,2006-01-01,aardsda01,CHN,NL,P,1,45,0,159,1,5,0,1,2006,,,,,
2.0,2007-01-01,aardsda01,CHA,AL,P,1,25,0,97,2,4,1,0,2007,,,,,
3.0,2008-01-01,aardsda01,BOS,AL,P,1,47,0,146,3,6,0,0,2008,,,,,
4.0,2009-01-01,aardsda01,SEA,AL,P,1,73,0,214,2,5,0,1,2009,0,,0,0,
,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
137970.0,1910-01-01,zwilldu01,CHA,AL,OF,1,27,,,45,2,3,1,1910,,,,,
137971.0,1914-01-01,zwilldu01,CHF,FL,OF,1,154,,,340,15,14,3,1914,,,,,
137972.0,1915-01-01,zwilldu01,CHF,FL,1B,1,3,,,3,0,0,0,1915,,,,,
137973.0,1915-01-01,zwilldu01,CHF,FL,OF,1,148,,,356,20,8,6,1915,,,,,


In [21]:
fieldingpost["year"] = fieldingpost["yearID"].as_str().as_ts(["%Y"])

fieldingpost.set_role(["playerID", "teamID"], getml.data.roles.join_key)
fieldingpost.set_role(["G", "GS", "InnOuts", "PO", "A", "E", "DP", "TP", "SB", "CS"], getml.data.roles.numerical)
fieldingpost.set_role(fieldingpost.roles.unused_string, getml.data.roles.categorical)
fieldingpost.set_role("year", getml.data.roles.time_stamp)

fieldingpost

name,year,playerID,teamID,lgID,round,POS,G,GS,InnOuts,PO,A,E,DP,TP,SB,CS,yearID,PB
role,time_stamp,join_key,join_key,categorical,categorical,categorical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,unused_float,unused_float
unit,"time stamp, comparison only",,,,,,,,,,,,,,,,,
0.0,1957-01-01,aaronha01,ML1,NL,WS,CF,7,7,186,11,0,0,0,0,,,1957,
1.0,1958-01-01,aaronha01,ML1,NL,WS,CF,1,1,21,2,0,0,0,0,,,1958,
2.0,1958-01-01,aaronha01,ML1,NL,WS,RF,7,6,168,13,0,0,0,0,,,1958,
3.0,1969-01-01,aaronha01,ATL,NL,NLCS,RF,3,3,78,5,1,0,0,0,,,1969,
4.0,1979-01-01,aasedo01,CAL,AL,ALCS,P,2,0,15,0,1,0,0,0,0,0,1979,
,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
10341.0,2006-01-01,zitoba01,OAK,AL,ALDS2,P,1,1,24,0,1,0,0,0,0,1,2006,
10342.0,2012-01-01,zitoba01,SFN,NL,NLCS,P,1,1,23,0,1,0,0,0,0,0,2012,0
10343.0,2012-01-01,zitoba01,SFN,NL,NLDS2,P,1,1,8,0,0,0,0,0,0,0,2012,0
10344.0,2012-01-01,zitoba01,SFN,NL,WS,P,1,1,17,0,0,0,0,0,0,0,2012,0


In [22]:
pitching["year"] = pitching["yearID"].as_str().as_ts(["%Y"])

pitching.set_role(["playerID", "teamID"], getml.data.roles.join_key)
pitching.set_role([
    "stint", 
    "W", 
    "L", 
    "G", 
    "GS", 
    "CG", 
    "SHO", 
    "SV", 
    "IPouts", 
    "H", 
    "ER", 
    "HR", 
    "BB", 
    "SO", 
    "BAOpp", 
    "ERA", 
    "IBB", 
    "WP", 
    "HBP", 
    "BK", 
    "BFP", 
    "GF", 
    "R"], getml.data.roles.numerical)
pitching.set_role(pitching.roles.unused_string, getml.data.roles.categorical)
pitching.set_role("yearID", getml.data.roles.unused_float)
pitching.set_role("year", getml.data.roles.time_stamp)


pitching

name,year,playerID,teamID,lgID,stint,W,L,G,GS,CG,SHO,SV,IPouts,H,ER,HR,BB,SO,BAOpp,ERA,IBB,WP,HBP,BK,BFP,GF,R,SH,SF,GIDP,yearID
role,time_stamp,join_key,join_key,categorical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,unused_float,unused_float,unused_float,unused_float
unit,"time stamp, comparison only",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
0.0,2004-01-01,aardsda01,SFN,NL,1,1,0,11,0,0,0,0,32,20,8,1,10,5,0,6,0,0,2,0,61,5,8,,,,2004
1.0,2006-01-01,aardsda01,CHN,NL,1,3,0,45,0,0,0,0,159,41,24,9,28,49,,4,0,1,1,0,225,9,25,,,,2006
2.0,2007-01-01,aardsda01,CHA,AL,1,2,1,25,0,0,0,0,97,39,23,4,17,36,,6,3,2,1,0,151,7,24,,,,2007
3.0,2008-01-01,aardsda01,BOS,AL,1,4,2,47,0,0,0,0,146,49,30,4,35,49,,5,2,3,5,0,228,7,32,,,,2008
4.0,2009-01-01,aardsda01,SEA,AL,1,3,6,73,0,0,0,38,214,49,20,4,34,80,,2,3,2,0,0,296,53,23,,,,2009
,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
39356.0,1955-01-01,zuverge01,BAL,AL,2,4,3,28,5,0,0,4,259,80,21,5,17,31,0,2,1,2,4,0,333,16,28,,,,1955
39357.0,1956-01-01,zuverge01,BAL,AL,1,7,6,62,0,0,0,16,292,112,45,6,34,33,0,4,9,1,3,1,432,40,52,,,,1956
39358.0,1957-01-01,zuverge01,BAL,AL,1,10,6,56,0,0,0,9,338,105,31,9,39,36,0,2,13,1,4,0,475,37,37,,,,1957
39359.0,1958-01-01,zuverge01,BAL,AL,1,2,2,45,0,0,0,7,207,74,26,4,17,22,0,3,3,2,6,0,294,23,29,,,,1958


In [23]:
pitchingpost["year"] = pitchingpost["yearID"].as_str().as_ts(["%Y"])

pitchingpost.set_role(["playerID", "teamID"], getml.data.roles.join_key)
pitchingpost.set_role(pitchingpost.roles.unused_float, getml.data.roles.numerical)
pitchingpost.set_role(pitchingpost.roles.unused_string, getml.data.roles.categorical)
pitchingpost.set_role("yearID", getml.data.roles.unused_float)
pitchingpost.set_role("year", getml.data.roles.time_stamp)

pitchingpost

name,year,playerID,teamID,round,lgID,W,L,G,GS,CG,SHO,SV,IPouts,H,ER,HR,BB,SO,BAOpp,ERA,IBB,WP,HBP,BK,BFP,GF,R,SH,SF,GIDP,yearID
role,time_stamp,join_key,join_key,categorical,categorical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,unused_float
unit,"time stamp, comparison only",,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
0.0,1979-01-01,aasedo01,CAL,ALCS,AL,1,0,2,0,0,0,0,15,4,1,0,2,6,0,1,1,0,0,0,20,2,1,0,1,0,1979
1.0,1975-01-01,abbotgl01,OAK,ALCS,AL,0,0,1,0,0,0,0,3,0,0,0,0,0,0,0,0,0,0,0,3,1,0,0,0,0,1975
2.0,2000-01-01,abbotpa01,SEA,ALCS,AL,0,1,1,1,0,0,0,15,3,3,1,3,3,0,5,0,0,0,0,21,0,3,0,0,0,2000
3.0,2000-01-01,abbotpa01,SEA,ALDS2,AL,1,0,1,1,0,0,0,17,5,1,0,3,1,0,1,0,0,1,0,25,0,2,0,1,1,2000
4.0,2001-01-01,abbotpa01,SEA,ALCS,AL,0,0,1,1,0,0,0,15,0,0,0,8,2,0,0,0,0,0,0,21,0,0,1,0,0,2001
,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4192.0,2006-01-01,zitoba01,OAK,ALDS2,AL,1,0,1,1,0,0,0,24,4,1,1,3,1,0,1,0,0,0,0,30,0,1,1,0,0,2006
4193.0,2012-01-01,zitoba01,SFN,NLCS,NL,1,0,1,1,0,0,0,23,6,0,0,1,6,0,0,1,0,0,,29,0,0,,,1,2012
4194.0,2012-01-01,zitoba01,SFN,NLDS2,NL,0,0,1,1,0,0,0,7,4,2,1,4,4,0,6,0,0,0,,16,0,2,,,0,2012
4195.0,2012-01-01,zitoba01,SFN,WS,NL,1,0,1,1,0,0,0,17,6,1,0,1,3,0,1,0,0,0,,23,0,1,,,1,2012


In [24]:
salaries["year"] = salaries["yearID"].as_str().as_ts(["%Y"])
# Copy teamID so it can serve as a categorical column as well as a join key.
salaries["teamIDCat"] = salaries["teamID"]

salaries.set_role(["playerID", "teamID"], getml.data.roles.join_key)
salaries.set_role(["lgID", "teamIDCat"], getml.data.roles.categorical)
salaries.set_role("yearID", getml.data.roles.numerical)
salaries.set_role("salary", getml.data.roles.target)
salaries.set_role("year", getml.data.roles.time_stamp)

salaries

name,year,playerID,teamID,salary,lgID,teamIDCat,yearID
role,time_stamp,join_key,join_key,target,categorical,categorical,numerical
unit,"time stamp, comparison only",,,,,,
0.0,1985-01-01,barkele01,ATL,870000,NL,ATL,1985
1.0,1985-01-01,bedrost01,ATL,550000,NL,ATL,1985
2.0,1985-01-01,benedbr01,ATL,545000,NL,ATL,1985
3.0,1985-01-01,campri01,ATL,633333,NL,ATL,1985
4.0,1985-01-01,ceronri01,ATL,625000,NL,ATL,1985
,...,...,...,...,...,...,...
23106.0,2012-01-01,tracych01,WAS,750000,NL,WAS,2012
23107.0,2012-01-01,wangch01,WAS,4000000,NL,WAS,2012
23108.0,2012-01-01,werthja01,WAS,13571428,NL,WAS,2012
23109.0,2012-01-01,zimmejo02,WAS,2300000,NL,WAS,2012


## 2. Predictive modeling

We loaded the data and defined the roles and units. Next, we create a getML pipeline for relational learning.

In [25]:
split = getml.data.split.random(train=0.8, test=0.2)
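
As a rough illustration of what a random 80/20 split does, the following sketch assigns each row to a subset independently. This is an assumption made for illustration, not getML's actual implementation; the helper `random_split` and the seed value are hypothetical.

```python
import random


def random_split(n_rows, train=0.8, seed=5849):
    """Label each row "train" with probability `train`, otherwise "test"."""
    rng = random.Random(seed)  # fixed seed so the split is reproducible
    return ["train" if rng.random() < train else "test" for _ in range(n_rows)]


# 23111 is the population size stated in the summary.
labels = random_split(23111)
train_share = labels.count("train") / len(labels)
print(f"train share: {train_share:.3f}")
```

Under per-row assignment the realized shares are only approximately 80% and 20%. This is consistent with the container output in section 2.1, where 18572 of 23111 rows (about 80.4%) end up in the training set.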

### 2.1 Define relational model

In [26]:
star_schema = getml.data.StarSchema(population=salaries, split=split)

star_schema.join(
    allstarfull,
    on="playerID",
    time_stamps="year",
    horizon=getml.data.time.days(1),
)

star_schema.join(
    awardsplayers,
    on="playerID",
    time_stamps="year",
    horizon=getml.data.time.days(1),
)

star_schema.join(
    awardsshareplayers,
    on="playerID",
    time_stamps="year",
    horizon=getml.data.time.days(1),
)

star_schema.join(
    batting,
    on="playerID",
    time_stamps="year",
    horizon=getml.data.time.days(1),
)

star_schema.join(
    battingpost,
    on="playerID",
    time_stamps="year",
    horizon=getml.data.time.days(1),
)

star_schema.join(
    fielding,
    on="playerID",
    time_stamps="year",
    horizon=getml.data.time.days(1),
)

star_schema.join(
    fieldingpost,
    on="playerID",
    time_stamps="year",
    horizon=getml.data.time.days(1),
)

star_schema.join(
    pitching,
    on="playerID",
    time_stamps="year",
    horizon=getml.data.time.days(1),
)

star_schema.join(
    pitchingpost,
    on="playerID",
    time_stamps="year",
    horizon=getml.data.time.days(1),
)

star_schema

Unnamed: 0,data frames,staging table
0,salaries,SALARIES__STAGING_TABLE_1
1,allstarfull,ALLSTARFULL__STAGING_TABLE_2
2,awardsplayers,AWARDSPLAYERS__STAGING_TABLE_3
3,awardsshareplayers,AWARDSSHAREPLAYERS__STAGING_TABLE_4
4,batting,BATTING__STAGING_TABLE_5
5,battingpost,BATTINGPOST__STAGING_TABLE_6
6,fielding,FIELDING__STAGING_TABLE_7
7,fieldingpost,FIELDINGPOST__STAGING_TABLE_8
8,pitching,PITCHING__STAGING_TABLE_9
9,pitchingpost,PITCHINGPOST__STAGING_TABLE_10

Unnamed: 0,subset,name,rows,type
0,test,salaries,4539,View
1,train,salaries,18572,View

Unnamed: 0,name,rows,type
0,allstarfull,4831,DataFrame
1,awardsplayers,5795,DataFrame
2,awardsshareplayers,6289,DataFrame
3,batting,92353,DataFrame
4,battingpost,9798,DataFrame
5,fielding,137975,DataFrame
6,fieldingpost,10346,DataFrame
7,pitching,39361,DataFrame
8,pitchingpost,4197,DataFrame
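Every join above uses a horizon of one day. The sketch below shows how such a horizon could act as a filter, under the assumption that a peripheral row is only visible to a population row when its time stamp plus the horizon does not exceed the population time stamp. That rule is an assumption about getML's join semantics, and `is_visible` is a hypothetical helper, not part of the getML API.

```python
from datetime import datetime, timedelta

# Same length as getml.data.time.days(1) in the joins above.
HORIZON = timedelta(days=1)


def is_visible(peripheral_ts, population_ts, horizon=HORIZON):
    """Assumed rule: the peripheral row must be at least `horizon` old."""
    return peripheral_ts + horizon <= population_ts


salary_year = datetime(2012, 1, 1)

# Statistics from the previous season are visible ...
print(is_visible(datetime(2011, 1, 1), salary_year))
# ... but statistics from the same season are not.
print(is_visible(datetime(2012, 1, 1), salary_year))
```

Since all time stamps fall on January 1, a one-day horizon amounts to "strictly earlier season". This keeps same-season statistics out of the features used to predict that season's salary, and it matches the strict `<` comparison used in the pandas helper in section 2.5.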


### 2.2 getML pipeline

<!-- #### 2.1.1  -->
__Set up the feature learners & predictor__

In [27]:
mapping = getml.preprocessors.Mapping()

fast_prop = getml.feature_learning.FastProp(
    loss_function=getml.feature_learning.loss_functions.SquareLoss,
    num_threads=1,
    num_features=700,
)

relboost = getml.feature_learning.Relboost(
    loss_function=getml.feature_learning.loss_functions.SquareLoss,
    num_threads=1,
    max_depth=8,
)

predictor = getml.predictors.XGBoostRegressor(n_jobs=1)

__Build the pipeline__

In [28]:
pipe1 = getml.pipeline.Pipeline(
    tags=['fast_prop'],
    data_model=star_schema.data_model,
    preprocessors=[mapping],
    feature_learners=[fast_prop],
    predictors=[predictor],
    include_categorical=True,
)

pipe1

In [29]:
pipe2 = getml.pipeline.Pipeline(
    tags=['relboost'],
    data_model=star_schema.data_model,
    preprocessors=[mapping],
    feature_learners=[relboost],
    predictors=[predictor],
    include_categorical=True,
)

pipe2

### 2.3 Model training

In [30]:
pipe1.check(star_schema.train)

Checking data model...


Staging...

Preprocessing...

Checking...


INFO [FOREIGN KEYS NOT FOUND]: When joining SALARIES__STAGING_TABLE_1 and ALLSTARFULL__STAGING_TABLE_2 over 'playerID' and 'playerID', there are no corresponding entries for 64.710317% of entries in 'playerID' in 'SALARIES__STAGING_TABLE_1'. You might want to double-check your join keys.
INFO [FOREIGN KEYS NOT FOUND]: When joining SALARIES__STAGING_TABLE_1 and AWARDSPLAYERS__STAGING_TABLE_3 over 'playerID' and 'playerID', there are no corresponding entries for 75.376911% of entries in 'playerID' in 'SALARIES__STAGING_TABLE_1'. You might want to double-check your join keys.
INFO [FOREIGN KEYS NOT FOUND]: When joining SALARIES__STAGING_TABLE_1 and AWARDSSHAREPLAYERS__STAGING_TABLE_4 over 'playerID' and 'playerID', there are no corresponding entries for 62.459617% of entries in 'playerID' in 'SALARIES__STAGING_TABLE_1'. You might want to double-check your join keys.
INFO [FOREIGN KEYS NOT FOUND]: When joining SALARIES__S
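The INFO messages above report, for each join, the share of population rows whose `playerID` has no counterpart in the peripheral table. For tables such as all-star appearances or awards this is expected, since most players never appear in them. A minimal sketch of how such a share could be computed follows; the function name and toy keys are hypothetical, and getML's exact counting may differ.

```python
def share_without_match(population_keys, peripheral_keys):
    """Fraction of population rows whose key never appears in the peripheral table."""
    known = set(peripheral_keys)
    missing = sum(1 for key in population_keys if key not in known)
    return missing / len(population_keys)


# Hypothetical toy keys, not taken from the dataset.
population_keys = ["a01", "a01", "b02", "c03"]
peripheral_keys = ["a01", "d04"]

share = share_without_match(population_keys, peripheral_keys)
print(f"{share:.1%} of population rows have no matching peripheral row")
```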

In [31]:
pipe1.fit(star_schema.train)

Checking data model...


Staging...


INFO [FOREIGN KEYS NOT FOUND]: When joining SALARIES__STAGING_TABLE_1 and ALLSTARFULL__STAGING_TABLE_2 over 'playerID' and 'playerID', there are no corresponding entries for 64.710317% of entries in 'playerID' in 'SALARIES__STAGING_TABLE_1'. You might want to double-check your join keys.
INFO [FOREIGN KEYS NOT FOUND]: When joining SALARIES__STAGING_TABLE_1 and AWARDSPLAYERS__STAGING_TABLE_3 over 'playerID' and 'playerID', there are no corresponding entries for 75.376911% of entries in 'playerID' in 'SALARIES__STAGING_TABLE_1'. You might want to double-check your join keys.
INFO [FOREIGN KEYS NOT FOUND]: When joining SALARIES__STAGING_TABLE_1 and AWARDSSHAREPLAYERS__STAGING_TABLE_4 over 'playerID' and 'playerID', there are no corresponding entries for 62.459617% of entries in 'playerID' in 'SALARIES__STAGING_TABLE_1'. You might want to double-check your join keys.
INFO [FOREIGN KEYS NOT FOUND]: When joining SALARIES__STAGING_TABLE_1 and BATTING__STA

In [32]:
pipe2.check(star_schema.train)

Checking data model...


Staging...

Preprocessing...

Checking...


INFO [FOREIGN KEYS NOT FOUND]: When joining SALARIES__STAGING_TABLE_1 and ALLSTARFULL__STAGING_TABLE_2 over 'playerID' and 'playerID', there are no corresponding entries for 64.710317% of entries in 'playerID' in 'SALARIES__STAGING_TABLE_1'. You might want to double-check your join keys.
INFO [FOREIGN KEYS NOT FOUND]: When joining SALARIES__STAGING_TABLE_1 and AWARDSPLAYERS__STAGING_TABLE_3 over 'playerID' and 'playerID', there are no corresponding entries for 75.376911% of entries in 'playerID' in 'SALARIES__STAGING_TABLE_1'. You might want to double-check your join keys.
INFO [FOREIGN KEYS NOT FOUND]: When joining SALARIES__STAGING_TABLE_1 and AWARDSSHAREPLAYERS__STAGING_TABLE_4 over 'playerID' and 'playerID', there are no corresponding entries for 62.459617% of entries in 'playerID' in 'SALARIES__STAGING_TABLE_1'. You might want to double-check your join keys.
INFO [FOREIGN KEYS NOT FOUND]: When joining SALARIES__S

In [33]:
pipe2.fit(star_schema.train)

Checking data model...


Staging...


INFO [FOREIGN KEYS NOT FOUND]: When joining SALARIES__STAGING_TABLE_1 and ALLSTARFULL__STAGING_TABLE_2 over 'playerID' and 'playerID', there are no corresponding entries for 64.710317% of entries in 'playerID' in 'SALARIES__STAGING_TABLE_1'. You might want to double-check your join keys.
INFO [FOREIGN KEYS NOT FOUND]: When joining SALARIES__STAGING_TABLE_1 and AWARDSPLAYERS__STAGING_TABLE_3 over 'playerID' and 'playerID', there are no corresponding entries for 75.376911% of entries in 'playerID' in 'SALARIES__STAGING_TABLE_1'. You might want to double-check your join keys.
INFO [FOREIGN KEYS NOT FOUND]: When joining SALARIES__STAGING_TABLE_1 and AWARDSSHAREPLAYERS__STAGING_TABLE_4 over 'playerID' and 'playerID', there are no corresponding entries for 62.459617% of entries in 'playerID' in 'SALARIES__STAGING_TABLE_1'. You might want to double-check your join keys.
INFO [FOREIGN KEYS NOT FOUND]: When joining SALARIES__STAGING_TABLE_1 and BATTING__STA

### 2.4 Model evaluation

In [34]:
pipe1.score(star_schema.test)



Staging...

Preprocessing...

FastProp: Building features...




Unnamed: 0,date time,set used,target,mae,rmse,rsquared
0,2021-08-23 17:31:34,train,salary,693195.9259,1251354.7389,0.8222
1,2021-08-23 17:35:07,test,salary,765292.5538,1402960.9382,0.788


In [35]:
pipe2.score(star_schema.test)



Staging...

Preprocessing...

Relboost: Building features...




Unnamed: 0,date time,set used,target,mae,rmse,rsquared
0,2021-08-23 17:34:57,train,salary,461650.4723,798183.2366,0.9276
1,2021-08-23 17:35:13,test,salary,668857.0513,1229058.3492,0.8371
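On the test set, the Relboost pipeline reaches a lower MAE and RMSE and a higher R² than the FastProp pipeline. The sketch below shows one way to compute the three reported metrics. It assumes that `rsquared` is the squared Pearson correlation between predictions and targets; that definition should be checked against the getML documentation. The salaries and predictions are made up for illustration.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return (MAE, RMSE, squared Pearson correlation)."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]

    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e ** 2 for e in errors) / n)

    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    rsquared = cov ** 2 / (var_t * var_p)

    return mae, rmse, rsquared


# Made-up salaries and predictions, for illustration only.
y_true = [500_000, 1_000_000, 4_000_000]
y_pred = [600_000, 900_000, 3_500_000]

mae, rmse, rsquared = regression_metrics(y_true, y_pred)
print(f"mae={mae:.2f} rmse={rmse:.2f} rsquared={rsquared:.4f}")
```

RMSE penalizes large errors more heavily than MAE, so it exceeds MAE whenever the absolute errors are not all equal. With a handful of very high salaries in the data, this explains why the RMSE values in the tables above are well above the MAE values.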


### 2.5 featuretools

In [36]:
population_train_pd = star_schema.train.population.to_pandas()
population_test_pd = star_schema.test.population.to_pandas()

In [37]:
allstarfull_pd = allstarfull.drop(allstarfull.roles.unused).to_pandas()
awardsplayers_pd = awardsplayers.drop(awardsplayers.roles.unused).to_pandas()
awardsshareplayers_pd = awardsshareplayers.drop(awardsshareplayers.roles.unused).to_pandas()
batting_pd = batting.drop(batting.roles.unused).to_pandas()
battingpost_pd = battingpost.drop(battingpost.roles.unused).to_pandas()
fielding_pd = fielding.drop(fielding.roles.unused).to_pandas()
fieldingpost_pd = fieldingpost.drop(fieldingpost.roles.unused).to_pandas()
pitching_pd = pitching.drop(pitching.roles.unused).to_pandas()
pitchingpost_pd = pitchingpost.drop(pitchingpost.roles.unused).to_pandas()

featuretools requires us to define a primary key manually and then join on that key, so some manual data preparation is needed.

In [38]:
population_train_pd["id"] = population_train_pd.index
population_train_pd

Unnamed: 0,year,playerID,teamID,salary,lgID,teamIDCat,yearID,id
0,1985-01-01,barkele01,ATL,870000.0,NL,ATL,1985.0,0
1,1985-01-01,bedrost01,ATL,550000.0,NL,ATL,1985.0,1
2,1985-01-01,benedbr01,ATL,545000.0,NL,ATL,1985.0,2
3,1985-01-01,ceronri01,ATL,625000.0,NL,ATL,1985.0,3
4,1985-01-01,chambch01,ATL,800000.0,NL,ATL,1985.0,4
...,...,...,...,...,...,...,...,...
18567,2012-01-01,strasst01,WAS,4875000.0,NL,WAS,2012.0,18567
18568,2012-01-01,tracych01,WAS,750000.0,NL,WAS,2012.0,18568
18569,2012-01-01,wangch01,WAS,4000000.0,NL,WAS,2012.0,18569
18570,2012-01-01,werthja01,WAS,13571428.0,NL,WAS,2012.0,18570


In [39]:
population_test_pd["id"] = population_test_pd.index
population_test_pd

Unnamed: 0,year,playerID,teamID,salary,lgID,teamIDCat,yearID,id
0,1985-01-01,campri01,ATL,633333.0,NL,ATL,1985.0,0
1,1985-01-01,dedmoje01,ATL,150000.0,NL,ATL,1985.0,1
2,1985-01-01,hornebo01,ATL,1500000.0,NL,ATL,1985.0,2
3,1985-01-01,dempsri01,BAL,512500.0,AL,BAL,1985.0,3
4,1985-01-01,martide01,BAL,560000.0,AL,BAL,1985.0,4
...,...,...,...,...,...,...,...,...
4534,2012-01-01,desmoia01,WAS,512500.0,NL,WAS,2012.0,4534
4535,2012-01-01,espinda01,WAS,506000.0,NL,WAS,2012.0,4535
4536,2012-01-01,gorzeto01,WAS,3000000.0,NL,WAS,2012.0,4536
4537,2012-01-01,matthry01,WAS,481000.0,NL,WAS,2012.0,4537


In [40]:
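def prepare_peripheral(peripheral_pd, train_or_test):
    """
    Helper function that imitates the behavior of
    the data model defined above.
    """
    # Attach the population row id to every peripheral row of the same player.
    # Both frames contain "year", so pandas adds suffixes:
    # year_x (peripheral) and year_y (population).
    peripheral_new = peripheral_pd.merge(
        train_or_test[["id", "playerID", "year"]],
        on="playerID"
    )

    # Keep only records from seasons strictly before the salary year.
    peripheral_new = peripheral_new[
        peripheral_new["year_x"] < peripheral_new["year_y"]
    ]

    # Drop the helper columns; "id" now serves as the join key for featuretools.
    del peripheral_new["year_x"]
    del peripheral_new["year_y"]
    del peripheral_new["playerID"]

    return peripheral_new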
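To see what this helper does to a peripheral table, the sketch below runs the same merge-and-filter logic on two tiny frames. The data is made up for illustration, and the helper is repeated in condensed form so that the example runs on its own.

```python
import pandas as pd


def prepare_peripheral(peripheral_pd, train_or_test):
    # Same logic as the notebook's helper, condensed for this sketch.
    merged = peripheral_pd.merge(
        train_or_test[["id", "playerID", "year"]], on="playerID"
    )
    merged = merged[merged["year_x"] < merged["year_y"]]
    return merged.drop(columns=["year_x", "year_y", "playerID"])


# Hypothetical population: one player with salary records in 2010 and 2012.
population = pd.DataFrame({
    "id": [0, 1],
    "playerID": ["p1", "p1"],
    "year": pd.to_datetime(["2010-01-01", "2012-01-01"]),
})

# Hypothetical peripheral table: batting records for 2009 and 2011.
batting = pd.DataFrame({
    "playerID": ["p1", "p1"],
    "year": pd.to_datetime(["2009-01-01", "2011-01-01"]),
    "G": [100, 150],
})

result = prepare_peripheral(batting, population)
print(result)
```

The 2009 record is kept for both salary rows, while the 2011 record is kept only for the 2012 salary row. Each peripheral record is therefore copied once per eligible population row, which is why the prepared frames in the following cells are considerably larger than the original peripheral tables.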

In [41]:
allstarfull_train_pd = prepare_peripheral(allstarfull_pd, population_train_pd)
allstarfull_test_pd = prepare_peripheral(allstarfull_pd, population_test_pd)
allstarfull_train_pd

Unnamed: 0,gameID,teamID,lgID,gameNum,GP,startingPos,id
1,NLS198607150,BAL,AL,0.0,1.0,,1051
2,NLS198607150,BAL,AL,0.0,1.0,,1569
3,NLS198607150,BAL,AL,0.0,1.0,,2426
11,NLS200407130,PHI,NL,0.0,1.0,,13684
12,NLS200407130,PHI,NL,0.0,1.0,,14357
...,...,...,...,...,...,...,...
20490,NLS200607110,OAK,AL,0.0,1.0,,17789
20491,NLS200607110,OAK,AL,0.0,1.0,,18456
20494,NLS200907140,TBA,AL,0.0,1.0,,17158
20495,NLS200907140,TBA,AL,0.0,1.0,,17836


In [42]:
awardsplayers_train_pd = prepare_peripheral(awardsplayers_pd, population_train_pd)
awardsplayers_test_pd = prepare_peripheral(awardsplayers_pd, population_test_pd)
awardsplayers_train_pd

Unnamed: 0,awardID,lgID,notes,id
0,TSN Player of the Year,AL,,4683
1,TSN Player of the Year,AL,,5856
2,TSN Player of the Year,AL,,8922
3,TSN Player of the Year,AL,,9727
4,Rookie of the Year,NL,,120
...,...,...,...,...
26203,Hank Aaron Award,AL,,18529
26209,Silver Slugger,AL,OF,18529
26213,NLCS MVP,NL,,17779
26228,Roberto Clemente Award,ML,,17310


In [43]:
awardsshareplayers_train_pd = prepare_peripheral(awardsshareplayers_pd, population_train_pd)
awardsshareplayers_test_pd = prepare_peripheral(awardsshareplayers_pd, population_test_pd)
awardsshareplayers_train_pd

Unnamed: 0,awardID,lgID,pointsWon,pointsMax,votesFirst,id
0,Cy Young,NL,1.0,24.0,1.0,254
1,Cy Young,NL,1.0,24.0,1.0,614
2,Cy Young,NL,15.0,120.0,1.0,254
3,Cy Young,NL,15.0,120.0,1.0,614
4,Cy Young,NL,10.0,120.0,0.0,254
...,...,...,...,...,...,...
24216,Rookie of the Year,AL,3.0,140.0,0.0,16986
24217,Rookie of the Year,AL,3.0,140.0,0.0,18315
24218,Rookie of the Year,NL,9.0,160.0,0.0,16703
24219,Rookie of the Year,NL,9.0,160.0,0.0,17370


In [44]:
batting_train_pd = prepare_peripheral(batting_pd, population_train_pd)
batting_test_pd = prepare_peripheral(batting_pd, population_test_pd)
batting_train_pd

Unnamed: 0,teamID,lgID,stint,G,G_batting,AB,R,H,2B,3B,...,CS,BB,SO,IBB,HBP,SH,SF,GIDP,G_old,id
1,SFN,NL,1.0,11.0,11.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,11.0,14656
2,SFN,NL,1.0,11.0,11.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,11.0,16389
3,SFN,NL,1.0,11.0,11.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,11.0,17069
4,SFN,NL,1.0,11.0,11.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,11.0,17743
5,SFN,NL,1.0,11.0,11.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,11.0,18263
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
208118,ATL,NL,1.0,11.0,11.0,25.0,2.0,5.0,1.0,0.0,...,0.0,2.0,3.0,0.0,0.0,0.0,0.0,0.0,11.0,2109
208119,ATL,NL,1.0,81.0,81.0,190.0,16.0,48.0,8.0,1.0,...,0.0,16.0,14.0,1.0,0.0,4.0,0.0,3.0,81.0,2109
208120,NYA,AL,1.0,21.0,21.0,48.0,2.0,4.0,1.0,0.0,...,0.0,5.0,4.0,0.0,0.0,4.0,0.0,1.0,21.0,2109
208121,NYA,AL,1.0,14.0,14.0,34.0,2.0,6.0,0.0,0.0,...,0.0,0.0,4.0,0.0,0.0,2.0,0.0,1.0,14.0,2109


In [45]:
battingpost_train_pd = prepare_peripheral(battingpost_pd, population_train_pd)
battingpost_test_pd = prepare_peripheral(battingpost_pd, population_test_pd)
battingpost_train_pd

Unnamed: 0,teamID,round,lgID,G,AB,R,H,2B,3B,HR,...,SB,CS,BB,SO,IBB,HBP,SH,SF,GIDP,id
0,SLN,WS,NL,1.0,1.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,301
1,SLN,WS,NL,1.0,1.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,533
2,SLN,WS,NL,2.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,301
3,SLN,WS,NL,2.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,533
4,PHI,NLCS,NL,1.0,2.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,301
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
49525,ARI,NLDS2,NL,3.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,,,,,,17924
49530,ARI,NLDS2,NL,1.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,1.0,0.0,,,,,,17926
49544,TEX,WS,AL,3.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,,,,,,18503
49549,TEX,WS,AL,3.0,0.0,0.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,,,,,,18508


In [46]:
fielding_train_pd = prepare_peripheral(fielding_pd, population_train_pd)
fielding_test_pd = prepare_peripheral(fielding_pd, population_test_pd)
fielding_train_pd

Unnamed: 0,teamID,lgID,POS,stint,G,GS,InnOuts,PO,A,E,DP,id
1,SFN,NL,P,1.0,11.0,0.0,32.0,0.0,0.0,0.0,0.0,14656
2,SFN,NL,P,1.0,11.0,0.0,32.0,0.0,0.0,0.0,0.0,16389
3,SFN,NL,P,1.0,11.0,0.0,32.0,0.0,0.0,0.0,0.0,17069
4,SFN,NL,P,1.0,11.0,0.0,32.0,0.0,0.0,0.0,0.0,17743
5,SFN,NL,P,1.0,11.0,0.0,32.0,0.0,0.0,0.0,0.0,18263
...,...,...,...,...,...,...,...,...,...,...,...,...
361010,NYA,AL,SS,1.0,21.0,18.0,453.0,30.0,54.0,3.0,12.0,2109
361011,NYA,AL,2B,1.0,7.0,6.0,168.0,17.0,18.0,0.0,5.0,2109
361012,NYA,AL,3B,1.0,1.0,0.0,3.0,0.0,0.0,0.0,0.0,2109
361013,NYA,AL,SS,1.0,6.0,1.0,70.0,3.0,7.0,0.0,0.0,2109


In [47]:
fieldingpost_train_pd = prepare_peripheral(fieldingpost_pd, population_train_pd)
fieldingpost_test_pd = prepare_peripheral(fieldingpost_pd, population_test_pd)
fieldingpost_train_pd

Unnamed: 0,teamID,lgID,round,POS,G,GS,InnOuts,PO,A,E,DP,TP,SB,CS,id
0,CAL,AL,ALCS,P,2.0,0.0,15.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,454
1,CAL,AL,ALCS,P,2.0,0.0,15.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1051
2,CAL,AL,ALCS,P,2.0,0.0,15.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,1569
3,CAL,AL,ALCS,P,2.0,0.0,15.0,0.0,1.0,0.0,0.0,0.0,0.0,0.0,2426
11,FLO,NL,NLCS,2B,2.0,2.0,51.0,4.0,1.0,0.0,0.0,0.0,,,8818
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
58087,OAK,AL,ALCS,P,1.0,1.0,11.0,0.0,2.0,0.0,0.0,0.0,0.0,0.0,17789
58088,OAK,AL,ALCS,P,1.0,1.0,11.0,0.0,2.0,0.0,0.0,0.0,0.0,0.0,18456
58095,OAK,AL,ALDS2,P,1.0,1.0,24.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,17114
58096,OAK,AL,ALDS2,P,1.0,1.0,24.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,17789


In [48]:
pitching_train_pd = prepare_peripheral(pitching_pd, population_train_pd)
pitching_test_pd = prepare_peripheral(pitching_pd, population_test_pd)
pitching_train_pd

Unnamed: 0,teamID,lgID,stint,W,L,G,GS,CG,SHO,SV,...,BAOpp,ERA,IBB,WP,HBP,BK,BFP,GF,R,id
1,SFN,NL,1.0,1.0,0.0,11.0,0.0,0.0,0.0,0.0,...,0.0,6.0,0.0,0.0,2.0,0.0,61.0,5.0,8.0,14656
2,SFN,NL,1.0,1.0,0.0,11.0,0.0,0.0,0.0,0.0,...,0.0,6.0,0.0,0.0,2.0,0.0,61.0,5.0,8.0,16389
3,SFN,NL,1.0,1.0,0.0,11.0,0.0,0.0,0.0,0.0,...,0.0,6.0,0.0,0.0,2.0,0.0,61.0,5.0,8.0,17069
4,SFN,NL,1.0,1.0,0.0,11.0,0.0,0.0,0.0,0.0,...,0.0,6.0,0.0,0.0,2.0,0.0,61.0,5.0,8.0,17743
5,SFN,NL,1.0,1.0,0.0,11.0,0.0,0.0,0.0,0.0,...,0.0,6.0,0.0,0.0,2.0,0.0,61.0,5.0,8.0,18263
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
91802,SFN,NL,1.0,10.0,13.0,33.0,33.0,1.0,0.0,0.0,...,,4.0,8.0,2.0,8.0,2.0,818.0,0.0,89.0,17789
91803,SFN,NL,1.0,10.0,13.0,33.0,33.0,1.0,0.0,0.0,...,,4.0,8.0,2.0,8.0,2.0,818.0,0.0,89.0,18456
91811,SFN,NL,1.0,9.0,14.0,34.0,33.0,1.0,0.0,0.0,...,,4.0,7.0,7.0,7.0,0.0,848.0,1.0,97.0,17789
91812,SFN,NL,1.0,9.0,14.0,34.0,33.0,1.0,0.0,0.0,...,,4.0,7.0,7.0,7.0,0.0,848.0,1.0,97.0,18456


In [49]:
pitchingpost_train_pd = prepare_peripheral(pitchingpost_pd, population_train_pd)
pitchingpost_test_pd = prepare_peripheral(pitchingpost_pd, population_test_pd)
pitchingpost_train_pd

Unnamed: 0,teamID,round,lgID,W,L,G,GS,CG,SHO,SV,...,WP,HBP,BK,BFP,GF,R,SH,SF,GIDP,id
0,CAL,ALCS,AL,1.0,0.0,2.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,20.0,2.0,1.0,0.0,1.0,0.0,454
1,CAL,ALCS,AL,1.0,0.0,2.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,20.0,2.0,1.0,0.0,1.0,0.0,1051
2,CAL,ALCS,AL,1.0,0.0,2.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,20.0,2.0,1.0,0.0,1.0,0.0,1569
3,CAL,ALCS,AL,1.0,0.0,2.0,0.0,0.0,0.0,0.0,...,0.0,0.0,0.0,20.0,2.0,1.0,0.0,1.0,0.0,2426
7,SEA,ALCS,AL,0.0,1.0,1.0,1.0,0.0,0.0,0.0,...,0.0,0.0,0.0,21.0,0.0,3.0,0.0,0.0,0.0,11801
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
24239,OAK,ALCS,AL,0.0,1.0,1.0,1.0,0.0,0.0,0.0,...,0.0,0.0,0.0,21.0,0.0,5.0,0.0,0.0,0.0,17789
24240,OAK,ALCS,AL,0.0,1.0,1.0,1.0,0.0,0.0,0.0,...,0.0,0.0,0.0,21.0,0.0,5.0,0.0,0.0,0.0,18456
24247,OAK,ALDS2,AL,1.0,0.0,1.0,1.0,0.0,0.0,0.0,...,0.0,0.0,0.0,30.0,0.0,1.0,1.0,0.0,0.0,17114
24248,OAK,ALDS2,AL,1.0,0.0,1.0,1.0,0.0,0.0,0.0,...,0.0,0.0,0.0,30.0,0.0,1.0,1.0,0.0,0.0,17789


In [50]:
population_train_pd

Unnamed: 0,year,playerID,teamID,salary,lgID,teamIDCat,yearID,id
0,1985-01-01,barkele01,ATL,870000.0,NL,ATL,1985.0,0
1,1985-01-01,bedrost01,ATL,550000.0,NL,ATL,1985.0,1
2,1985-01-01,benedbr01,ATL,545000.0,NL,ATL,1985.0,2
3,1985-01-01,ceronri01,ATL,625000.0,NL,ATL,1985.0,3
4,1985-01-01,chambch01,ATL,800000.0,NL,ATL,1985.0,4
...,...,...,...,...,...,...,...,...
18567,2012-01-01,strasst01,WAS,4875000.0,NL,WAS,2012.0,18567
18568,2012-01-01,tracych01,WAS,750000.0,NL,WAS,2012.0,18568
18569,2012-01-01,wangch01,WAS,4000000.0,NL,WAS,2012.0,18569
18570,2012-01-01,werthja01,WAS,13571428.0,NL,WAS,2012.0,18570


In [51]:
entities_train = {
    "population" : (population_train_pd, "id"),
    "allstarfull": (allstarfull_train_pd, "index"),
    "awardsplayers": (awardsplayers_train_pd, "index"),
    "awardsshareplayers": (awardsshareplayers_train_pd, "index"),
    "batting": (batting_train_pd, "index"),
    "battingpost": (battingpost_train_pd, "index"),
    "fielding": (fielding_train_pd, "index"),
    "fieldingpost": (fieldingpost_train_pd, "index"),
    "pitching": (pitching_train_pd, "index"),
    "pitchingpost": (pitchingpost_train_pd, "index"),
}

In [52]:
entities_test = {
    "population" : (population_test_pd, "id"),
    "allstarfull": (allstarfull_test_pd, "index"),
    "awardsplayers": (awardsplayers_test_pd, "index"),
    "awardsshareplayers": (awardsshareplayers_test_pd, "index"),
    "batting": (batting_test_pd, "index"),
    "battingpost": (battingpost_test_pd, "index"),
    "fielding": (fielding_test_pd, "index"),
    "fieldingpost": (fieldingpost_test_pd, "index"),
    "pitching": (pitching_test_pd, "index"),
    "pitchingpost": (pitchingpost_test_pd, "index"),
}

In [53]:
relationships = [
    ("population", "id", "allstarfull", "id"),
    ("population", "id", "awardsplayers", "id"),
    ("population", "id", "awardsshareplayers", "id"),
    ("population", "id", "batting", "id"),
    ("population", "id", "battingpost", "id"),
    ("population", "id", "fielding", "id"),
    ("population", "id", "fieldingpost", "id"),
    ("population", "id", "pitching", "id"),
    ("population", "id", "pitchingpost", "id"),
]

In [54]:
featuretools_train_pd = featuretools.dfs(
    entities=entities_train,
    relationships=relationships,
    target_entity="population")[0]



In [55]:
featuretools_test_pd = featuretools.dfs(
    entities=entities_test,
    relationships=relationships,
    target_entity="population")[0]
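The features produced by deep feature synthesis are aggregations of the peripheral tables per population `id`, with names such as `COUNT(batting)` or `MEAN(batting.G)`. The pandas sketch below reproduces a few of these aggregations by hand on made-up data. It is meant to illustrate the idea of propositionalization and does not reflect how featuretools computes them internally.

```python
import pandas as pd

# Toy peripheral table after preparation: two records for id 0, one for id 1.
batting = pd.DataFrame({
    "id": [0, 0, 1],
    "G": [100.0, 150.0, 81.0],
})

# Aggregate the peripheral rows per population id.
features = batting.groupby("id")["G"].agg(["count", "mean", "max", "sum"])

# Rename the columns to follow the naming scheme seen in the featuretools output.
features.columns = [
    "COUNT(batting)",
    "MEAN(batting.G)",
    "MAX(batting.G)",
    "SUM(batting.G)",
]

print(features)
```

Applying a fixed set of aggregations to every numerical column of every peripheral table is what leads to the very wide feature table shown further down.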

In [56]:
featuretools_train = getml.data.DataFrame.from_pandas(featuretools_train_pd, "featuretools_train")
featuretools_test = getml.data.DataFrame.from_pandas(featuretools_test_pd, "featuretools_test")

In [57]:
featuretools_train.set_role("salary", getml.data.roles.target)
featuretools_train.set_role(featuretools_train.roles.unused_float, getml.data.roles.numerical)
featuretools_train.set_role(featuretools_train.roles.unused_string, getml.data.roles.categorical)

featuretools_train

91.880250% of all entries of column 'MAX(allstarfull.startingPos)' are NULL values.
91.880250% of all entries of column 'MEAN(allstarfull.startingPos)' are NULL values.
91.880250% of all entries of column 'MIN(allstarfull.startingPos)' are NULL values.
91.923325% of all entries of column 'SKEW(allstarfull.GP)' are NULL values.
91.912557% of all entries of column 'SKEW(allstarfull.gameNum)' are NULL values.
97.571613% of all entries of column 'SKEW(allstarfull.startingPos)' are NULL values.
96.047814% of all entries of column 'STD(allstarfull.startingPos)' are NULL values.
91.158734% of all entries of column 'SKEW(pitchingpost.BAOpp)' are NULL values.
91.142580% of all entries of column 'SKEW(pitchingpost.BB)' are NULL values.
91.142580% of all entries of column 'SKEW(pitchingpost.BFP)' are NULL values.
91.142580% of all entries of column 'SKEW(pitchingpost.BK)' are NULL values.
91.142580% of all entries of column 'SKEW(pitchingpost.CG)' are NULL values.
91.142580% of all entries of col

name,salary,playerID,teamID,lgID,teamIDCat,MODE(allstarfull.gameID),MODE(allstarfull.lgID),MODE(allstarfull.teamID),MODE(awardsplayers.awardID),MODE(awardsplayers.lgID),MODE(awardsplayers.notes),MODE(awardsshareplayers.awardID),MODE(awardsshareplayers.lgID),MODE(batting.lgID),MODE(batting.teamID),MODE(battingpost.lgID),MODE(battingpost.round),MODE(battingpost.teamID),MODE(fielding.POS),MODE(fielding.lgID),MODE(fielding.teamID),MODE(fieldingpost.POS),MODE(fieldingpost.lgID),MODE(fieldingpost.round),MODE(fieldingpost.teamID),MODE(pitching.lgID),MODE(pitching.teamID),MODE(pitchingpost.lgID),MODE(pitchingpost.round),MODE(pitchingpost.teamID),yearID,COUNT(allstarfull),MAX(allstarfull.GP),MAX(allstarfull.gameNum),MAX(allstarfull.startingPos),MEAN(allstarfull.GP),MEAN(allstarfull.gameNum),MEAN(allstarfull.startingPos),MIN(allstarfull.GP),MIN(allstarfull.gameNum),MIN(allstarfull.startingPos),NUM_UNIQUE(allstarfull.gameID),NUM_UNIQUE(allstarfull.lgID),NUM_UNIQUE(allstarfull.teamID),SKEW(allstarfull.GP),SKEW(allstarfull.gameNum),SKEW(allstarfull.startingPos),STD(allstarfull.GP),STD(allstarfull.gameNum),STD(allstarfull.startingPos),SUM(allstarfull.GP),SUM(allstarfull.gameNum),SUM(allstarfull.startingPos),COUNT(awardsplayers),NUM_UNIQUE(awardsplayers.awardID),NUM_UNIQUE(awardsplayers.lgID),NUM_UNIQUE(awardsplayers.notes),COUNT(awardsshareplayers),MAX(awardsshareplayers.pointsMax),MAX(awardsshareplayers.pointsWon),MAX(awardsshareplayers.votesFirst),MEAN(awardsshareplayers.pointsMax),MEAN(awardsshareplayers.pointsWon),MEAN(awardsshareplayers.votesFirst),MIN(awardsshareplayers.pointsMax),MIN(awardsshareplayers.pointsWon),MIN(awardsshareplayers.votesFirst),NUM_UNIQUE(awardsshareplayers.awardID),NUM_UNIQUE(awardsshareplayers.lgID),SKEW(awardsshareplayers.pointsMax),SKEW(awardsshareplayers.pointsWon),SKEW(awardsshareplayers.votesFirst),STD(awardsshareplayers.pointsMax),STD(awardsshareplayers.pointsWon),STD(awardsshareplayers.votesFirst),SUM(awardsshareplayers.pointsMax),SUM(awardssha
replayers.pointsWon),SUM(awardsshareplayers.votesFirst),COUNT(batting),MAX(batting.2B),MAX(batting.3B),MAX(batting.AB),MAX(batting.BB),MAX(batting.CS),MAX(batting.G),MAX(batting.GIDP),MAX(batting.G_batting),MAX(batting.G_old),MAX(batting.H),MAX(batting.HBP),MAX(batting.HR),MAX(batting.IBB),MAX(batting.R),MAX(batting.RBI),MAX(batting.SB),MAX(batting.SF),MAX(batting.SH),MAX(batting.SO),MAX(batting.stint),MEAN(batting.2B),MEAN(batting.3B),MEAN(batting.AB),MEAN(batting.BB),MEAN(batting.CS),MEAN(batting.G),MEAN(batting.GIDP),MEAN(batting.G_batting),MEAN(batting.G_old),MEAN(batting.H),MEAN(batting.HBP),MEAN(batting.HR),MEAN(batting.IBB),MEAN(batting.R),MEAN(batting.RBI),MEAN(batting.SB),MEAN(batting.SF),MEAN(batting.SH),MEAN(batting.SO),MEAN(batting.stint),MIN(batting.2B),MIN(batting.3B),MIN(batting.AB),MIN(batting.BB),MIN(batting.CS),MIN(batting.G),MIN(batting.GIDP),MIN(batting.G_batting),MIN(batting.G_old),MIN(batting.H),MIN(batting.HBP),MIN(batting.HR),MIN(batting.IBB),MIN(batting.R),MIN(batting.RBI),MIN(batting.SB),MIN(batting.SF),MIN(batting.SH),MIN(batting.SO),MIN(batting.stint),NUM_UNIQUE(batting.lgID),NUM_UNIQUE(batting.teamID),SKEW(batting.2B),SKEW(batting.3B),SKEW(batting.AB),SKEW(batting.BB),SKEW(batting.CS),SKEW(batting.G),SKEW(batting.GIDP),SKEW(batting.G_batting),SKEW(batting.G_old),SKEW(batting.H),SKEW(batting.HBP),SKEW(batting.HR),SKEW(batting.IBB),SKEW(batting.R),SKEW(batting.RBI),SKEW(batting.SB),SKEW(batting.SF),SKEW(batting.SH),SKEW(batting.SO),SKEW(batting.stint),STD(batting.2B),STD(batting.3B),STD(batting.AB),STD(batting.BB),STD(batting.CS),STD(batting.G),STD(batting.GIDP),STD(batting.G_batting),STD(batting.G_old),STD(batting.H),STD(batting.HBP),STD(batting.HR),STD(batting.IBB),STD(batting.R),STD(batting.RBI),STD(batting.SB),STD(batting.SF),STD(batting.SH),STD(batting.SO),STD(batting.stint),SUM(batting.2B),SUM(batting.3B),SUM(batting.AB),SUM(batting.BB),SUM(batting.CS),SUM(batting.G),SUM(batting.GIDP),SUM(batting.G_batting),SUM(batting.G_old),SUM(bat
ting.H),SUM(batting.HBP),SUM(batting.HR),SUM(batting.IBB),SUM(batting.R),SUM(batting.RBI),SUM(batting.SB),SUM(batting.SF),SUM(batting.SH),SUM(batting.SO),SUM(batting.stint),COUNT(battingpost),MAX(battingpost.2B),MAX(battingpost.3B),MAX(battingpost.AB),MAX(battingpost.BB),MAX(battingpost.CS),MAX(battingpost.G),MAX(battingpost.GIDP),MAX(battingpost.H),MAX(battingpost.HBP),MAX(battingpost.HR),MAX(battingpost.IBB),MAX(battingpost.R),MAX(battingpost.RBI),MAX(battingpost.SB),MAX(battingpost.SF),MAX(battingpost.SH),MAX(battingpost.SO),MEAN(battingpost.2B),MEAN(battingpost.3B),MEAN(battingpost.AB),MEAN(battingpost.BB),MEAN(battingpost.CS),MEAN(battingpost.G),MEAN(battingpost.GIDP),MEAN(battingpost.H),MEAN(battingpost.HBP),MEAN(battingpost.HR),MEAN(battingpost.IBB),MEAN(battingpost.R),MEAN(battingpost.RBI),MEAN(battingpost.SB),MEAN(battingpost.SF),MEAN(battingpost.SH),MEAN(battingpost.SO),MIN(battingpost.2B),MIN(battingpost.3B),MIN(battingpost.AB),MIN(battingpost.BB),MIN(battingpost.CS),MIN(battingpost.G),MIN(battingpost.GIDP),MIN(battingpost.H),MIN(battingpost.HBP),MIN(battingpost.HR),MIN(battingpost.IBB),MIN(battingpost.R),MIN(battingpost.RBI),MIN(battingpost.SB),MIN(battingpost.SF),MIN(battingpost.SH),MIN(battingpost.SO),NUM_UNIQUE(battingpost.lgID),NUM_UNIQUE(battingpost.round),NUM_UNIQUE(battingpost.teamID),SKEW(battingpost.2B),SKEW(battingpost.3B),SKEW(battingpost.AB),SKEW(battingpost.BB),SKEW(battingpost.CS),SKEW(battingpost.G),SKEW(battingpost.GIDP),SKEW(battingpost.H),SKEW(battingpost.HBP),SKEW(battingpost.HR),SKEW(battingpost.IBB),SKEW(battingpost.R),SKEW(battingpost.RBI),SKEW(battingpost.SB),SKEW(battingpost.SF),SKEW(battingpost.SH),SKEW(battingpost.SO),STD(battingpost.2B),STD(battingpost.3B),STD(battingpost.AB),STD(battingpost.BB),STD(battingpost.CS),STD(battingpost.G),STD(battingpost.GIDP),STD(battingpost.H),STD(battingpost.HBP),STD(battingpost.HR),STD(battingpost.IBB),STD(battingpost.R),STD(battingpost.RBI),STD(battingpost.SB),STD(battingpost.SF),STD(battingpos
t.SH),STD(battingpost.SO),SUM(battingpost.2B),SUM(battingpost.3B),SUM(battingpost.AB),SUM(battingpost.BB),SUM(battingpost.CS),SUM(battingpost.G),SUM(battingpost.GIDP),SUM(battingpost.H),SUM(battingpost.HBP),SUM(battingpost.HR),SUM(battingpost.IBB),SUM(battingpost.R),SUM(battingpost.RBI),SUM(battingpost.SB),SUM(battingpost.SF),SUM(battingpost.SH),SUM(battingpost.SO),COUNT(fielding),MAX(fielding.A),MAX(fielding.DP),MAX(fielding.E),MAX(fielding.G),MAX(fielding.GS),MAX(fielding.InnOuts),MAX(fielding.PO),MAX(fielding.stint),MEAN(fielding.A),MEAN(fielding.DP),MEAN(fielding.E),MEAN(fielding.G),MEAN(fielding.GS),MEAN(fielding.InnOuts),MEAN(fielding.PO),MEAN(fielding.stint),MIN(fielding.A),MIN(fielding.DP),MIN(fielding.E),MIN(fielding.G),MIN(fielding.GS),MIN(fielding.InnOuts),MIN(fielding.PO),MIN(fielding.stint),NUM_UNIQUE(fielding.POS),NUM_UNIQUE(fielding.lgID),NUM_UNIQUE(fielding.teamID),SKEW(fielding.A),SKEW(fielding.DP),SKEW(fielding.E),SKEW(fielding.G),SKEW(fielding.GS),SKEW(fielding.InnOuts),SKEW(fielding.PO),SKEW(fielding.stint),STD(fielding.A),STD(fielding.DP),STD(fielding.E),STD(fielding.G),STD(fielding.GS),STD(fielding.InnOuts),STD(fielding.PO),STD(fielding.stint),SUM(fielding.A),SUM(fielding.DP),SUM(fielding.E),SUM(fielding.G),SUM(fielding.GS),SUM(fielding.InnOuts),SUM(fielding.PO),SUM(fielding.stint),COUNT(fieldingpost),MAX(fieldingpost.A),MAX(fieldingpost.CS),MAX(fieldingpost.DP),MAX(fieldingpost.E),MAX(fieldingpost.G),MAX(fieldingpost.GS),MAX(fieldingpost.InnOuts),MAX(fieldingpost.PO),MAX(fieldingpost.SB),MAX(fieldingpost.TP),MEAN(fieldingpost.A),MEAN(fieldingpost.CS),MEAN(fieldingpost.DP),MEAN(fieldingpost.E),MEAN(fieldingpost.G),MEAN(fieldingpost.GS),MEAN(fieldingpost.InnOuts),MEAN(fieldingpost.PO),MEAN(fieldingpost.SB),MEAN(fieldingpost.TP),MIN(fieldingpost.A),MIN(fieldingpost.CS),MIN(fieldingpost.DP),MIN(fieldingpost.E),MIN(fieldingpost.G),MIN(fieldingpost.GS),MIN(fieldingpost.InnOuts),MIN(fieldingpost.PO),MIN(fieldingpost.SB),MIN(fieldingpost.TP),NUM_UNIQU
E(fieldingpost.POS),NUM_UNIQUE(fieldingpost.lgID),NUM_UNIQUE(fieldingpost.round),NUM_UNIQUE(fieldingpost.teamID),SKEW(fieldingpost.A),SKEW(fieldingpost.CS),SKEW(fieldingpost.DP),SKEW(fieldingpost.E),SKEW(fieldingpost.G),SKEW(fieldingpost.GS),SKEW(fieldingpost.InnOuts),SKEW(fieldingpost.PO),SKEW(fieldingpost.SB),SKEW(fieldingpost.TP),STD(fieldingpost.A),STD(fieldingpost.CS),STD(fieldingpost.DP),STD(fieldingpost.E),STD(fieldingpost.G),STD(fieldingpost.GS),STD(fieldingpost.InnOuts),STD(fieldingpost.PO),STD(fieldingpost.SB),STD(fieldingpost.TP),SUM(fieldingpost.A),SUM(fieldingpost.CS),SUM(fieldingpost.DP),SUM(fieldingpost.E),SUM(fieldingpost.G),SUM(fieldingpost.GS),SUM(fieldingpost.InnOuts),SUM(fieldingpost.PO),SUM(fieldingpost.SB),SUM(fieldingpost.TP),COUNT(pitching),MAX(pitching.BAOpp),MAX(pitching.BB),MAX(pitching.BFP),MAX(pitching.BK),MAX(pitching.CG),MAX(pitching.ER),MAX(pitching.ERA),MAX(pitching.G),MAX(pitching.GF),MAX(pitching.GS),MAX(pitching.H),MAX(pitching.HBP),MAX(pitching.HR),MAX(pitching.IBB),MAX(pitching.IPouts),MAX(pitching.L),MAX(pitching.R),MAX(pitching.SHO),MAX(pitching.SO),MAX(pitching.SV),MAX(pitching.W),MAX(pitching.WP),MAX(pitching.stint),MEAN(pitching.BAOpp),MEAN(pitching.BB),MEAN(pitching.BFP),MEAN(pitching.BK),MEAN(pitching.CG),MEAN(pitching.ER),MEAN(pitching.ERA),MEAN(pitching.G),MEAN(pitching.GF),MEAN(pitching.GS),MEAN(pitching.H),MEAN(pitching.HBP),MEAN(pitching.HR),MEAN(pitching.IBB),MEAN(pitching.IPouts),MEAN(pitching.L),MEAN(pitching.R),MEAN(pitching.SHO),MEAN(pitching.SO),MEAN(pitching.SV),MEAN(pitching.W),MEAN(pitching.WP),MEAN(pitching.stint),MIN(pitching.BAOpp),MIN(pitching.BB),MIN(pitching.BFP),MIN(pitching.BK),MIN(pitching.CG),MIN(pitching.ER),MIN(pitching.ERA),MIN(pitching.G),MIN(pitching.GF),MIN(pitching.GS),MIN(pitching.H),MIN(pitching.HBP),MIN(pitching.HR),MIN(pitching.IBB),MIN(pitching.IPouts),MIN(pitching.L),MIN(pitching.R),MIN(pitching.SHO),MIN(pitching.SO),MIN(pitching.SV),MIN(pitching.W),MIN(pitching.WP),MIN(pitching.stint)
,NUM_UNIQUE(pitching.lgID),NUM_UNIQUE(pitching.teamID),SKEW(pitching.BAOpp),SKEW(pitching.BB),SKEW(pitching.BFP),SKEW(pitching.BK),SKEW(pitching.CG),SKEW(pitching.ER),SKEW(pitching.ERA),SKEW(pitching.G),SKEW(pitching.GF),SKEW(pitching.GS),SKEW(pitching.H),SKEW(pitching.HBP),SKEW(pitching.HR),SKEW(pitching.IBB),SKEW(pitching.IPouts),SKEW(pitching.L),SKEW(pitching.R),SKEW(pitching.SHO),SKEW(pitching.SO),SKEW(pitching.SV),SKEW(pitching.W),SKEW(pitching.WP),SKEW(pitching.stint),STD(pitching.BAOpp),STD(pitching.BB),STD(pitching.BFP),STD(pitching.BK),STD(pitching.CG),STD(pitching.ER),STD(pitching.ERA),STD(pitching.G),STD(pitching.GF),STD(pitching.GS),STD(pitching.H),STD(pitching.HBP),STD(pitching.HR),STD(pitching.IBB),STD(pitching.IPouts),STD(pitching.L),STD(pitching.R),STD(pitching.SHO),STD(pitching.SO),STD(pitching.SV),STD(pitching.W),STD(pitching.WP),STD(pitching.stint),SUM(pitching.BAOpp),SUM(pitching.BB),SUM(pitching.BFP),SUM(pitching.BK),SUM(pitching.CG),SUM(pitching.ER),SUM(pitching.ERA),SUM(pitching.G),SUM(pitching.GF),SUM(pitching.GS),SUM(pitching.H),SUM(pitching.HBP),SUM(pitching.HR),SUM(pitching.IBB),SUM(pitching.IPouts),SUM(pitching.L),SUM(pitching.R),SUM(pitching.SHO),SUM(pitching.SO),SUM(pitching.SV),SUM(pitching.W),SUM(pitching.WP),SUM(pitching.stint),COUNT(pitchingpost),MAX(pitchingpost.BAOpp),MAX(pitchingpost.BB),MAX(pitchingpost.BFP),MAX(pitchingpost.BK),MAX(pitchingpost.CG),MAX(pitchingpost.ER),MAX(pitchingpost.ERA),MAX(pitchingpost.G),MAX(pitchingpost.GF),MAX(pitchingpost.GIDP),MAX(pitchingpost.GS),MAX(pitchingpost.H),MAX(pitchingpost.HBP),MAX(pitchingpost.HR),MAX(pitchingpost.IBB),MAX(pitchingpost.IPouts),MAX(pitchingpost.L),MAX(pitchingpost.R),MAX(pitchingpost.SF),MAX(pitchingpost.SH),MAX(pitchingpost.SHO),MAX(pitchingpost.SO),MAX(pitchingpost.SV),MAX(pitchingpost.W),MAX(pitchingpost.WP),MEAN(pitchingpost.BAOpp),MEAN(pitchingpost.BB),MEAN(pitchingpost.BFP),MEAN(pitchingpost.BK),MEAN(pitchingpost.CG),MEAN(pitchingpost.ER),MEAN(pitchingpost.ERA),MEAN(p
itchingpost.G),MEAN(pitchingpost.GF),MEAN(pitchingpost.GIDP),MEAN(pitchingpost.GS),MEAN(pitchingpost.H),MEAN(pitchingpost.HBP),MEAN(pitchingpost.HR),MEAN(pitchingpost.IBB),MEAN(pitchingpost.IPouts),MEAN(pitchingpost.L),MEAN(pitchingpost.R),MEAN(pitchingpost.SF),MEAN(pitchingpost.SH),MEAN(pitchingpost.SHO),MEAN(pitchingpost.SO),MEAN(pitchingpost.SV),MEAN(pitchingpost.W),MEAN(pitchingpost.WP),MIN(pitchingpost.BAOpp),MIN(pitchingpost.BB),MIN(pitchingpost.BFP),MIN(pitchingpost.BK),MIN(pitchingpost.CG),MIN(pitchingpost.ER),MIN(pitchingpost.ERA),MIN(pitchingpost.G),MIN(pitchingpost.GF),MIN(pitchingpost.GIDP),MIN(pitchingpost.GS),MIN(pitchingpost.H),MIN(pitchingpost.HBP),MIN(pitchingpost.HR),MIN(pitchingpost.IBB),MIN(pitchingpost.IPouts),MIN(pitchingpost.L),MIN(pitchingpost.R),MIN(pitchingpost.SF),MIN(pitchingpost.SH),MIN(pitchingpost.SHO),MIN(pitchingpost.SO),MIN(pitchingpost.SV),MIN(pitchingpost.W),MIN(pitchingpost.WP),NUM_UNIQUE(pitchingpost.lgID),NUM_UNIQUE(pitchingpost.round),NUM_UNIQUE(pitchingpost.teamID),SKEW(pitchingpost.BAOpp),SKEW(pitchingpost.BB),SKEW(pitchingpost.BFP),SKEW(pitchingpost.BK),SKEW(pitchingpost.CG),SKEW(pitchingpost.ER),SKEW(pitchingpost.ERA),SKEW(pitchingpost.G),SKEW(pitchingpost.GF),SKEW(pitchingpost.GIDP),SKEW(pitchingpost.GS),SKEW(pitchingpost.H),SKEW(pitchingpost.HBP),SKEW(pitchingpost.HR),SKEW(pitchingpost.IBB),SKEW(pitchingpost.IPouts),SKEW(pitchingpost.L),SKEW(pitchingpost.R),SKEW(pitchingpost.SF),SKEW(pitchingpost.SH),SKEW(pitchingpost.SHO),SKEW(pitchingpost.SO),SKEW(pitchingpost.SV),SKEW(pitchingpost.W),SKEW(pitchingpost.WP),STD(pitchingpost.BAOpp),STD(pitchingpost.BB),STD(pitchingpost.BFP),STD(pitchingpost.BK),STD(pitchingpost.CG),STD(pitchingpost.ER),STD(pitchingpost.ERA),STD(pitchingpost.G),STD(pitchingpost.GF),STD(pitchingpost.GIDP),STD(pitchingpost.GS),STD(pitchingpost.H),STD(pitchingpost.HBP),STD(pitchingpost.HR),STD(pitchingpost.IBB),STD(pitchingpost.IPouts),STD(pitchingpost.L),STD(pitchingpost.R),STD(pitchingpost.SF),STD(pitching
post.SH),STD(pitchingpost.SHO),STD(pitchingpost.SO),STD(pitchingpost.SV),STD(pitchingpost.W),STD(pitchingpost.WP),SUM(pitchingpost.BAOpp),SUM(pitchingpost.BB),SUM(pitchingpost.BFP),SUM(pitchingpost.BK),SUM(pitchingpost.CG),SUM(pitchingpost.ER),SUM(pitchingpost.ERA),SUM(pitchingpost.G),SUM(pitchingpost.GF),SUM(pitchingpost.GIDP),SUM(pitchingpost.GS),SUM(pitchingpost.H),SUM(pitchingpost.HBP),SUM(pitchingpost.HR),SUM(pitchingpost.IBB),SUM(pitchingpost.IPouts),SUM(pitchingpost.L),SUM(pitchingpost.R),SUM(pitchingpost.SF),SUM(pitchingpost.SH),SUM(pitchingpost.SHO),SUM(pitchingpost.SO),SUM(pitchingpost.SV),SUM(pitchingpost.W),SUM(pitchingpost.WP),DAY(year),MONTH(year),WEEKDAY(year),YEAR(year)
role,target,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,
numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,
numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,
numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical
0.0,870000,barkele01,ATL,NL,ATL,ALS198108090,AL,CLE,,,,,,AL,CLE,,,,P,AL,CLE,,,,,AL,CLE,,,,1985,1,1,0,,1,0,,1,0,,1,1,1,,,,,,,1,0,0,0,,,,0,,,,,,,,,,,,,,,,,,0,0,0,10,1,0,38,6,0,36,1,21,36,2,0,0,0,2,1,0,0,4,19,2,0.5,0,23,3,0,21.7,0.5,2.7,21.7,1.5,0,0,0,1.5,0.5,0,0,3,12,1.1,0,0,8,0,0,2,0,0,2,1,0,0,0,1,0,0,0,2,5,1,2,3,,,,,,-0.6626,,2.773,-0.6626,,,,,,,,,,,3.1623,0.7071,0,21.2132,4.2426,0,11.1759,0.7071,6.7007,11.1759,0.7071,0,0,0,0.7071,0.7071,0,0,1.4142,9.8995,0.3162,1,0,46,6,0,217,1,27,217,3,0,0,0,3,1,0,0,6,24,11,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,10,34,3,2,36,36,739,23,2,16.2,1.2,1.2,21.7,16.5,361.9,7.1,1.1,2,0,0,2,0,45,0,1,1,2,3,0.4213,0.6014,-0.4725,-0.6626,0.0872,0.3367,1.8029,3.1623,10.0863,0.9189,0.9189,11.1759,13.0491,249.7009,6.4713,0.3162,162,12,12,217,165,3619,71,11,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,10,0,92,1052,1,10,114,5,36,15,36,237,3,17,3,739,13,127,3,187,4,19,14,2,0,45.9,514.1,0.2,3.5,56.1,3.3,21.7,2.5,16.5,115.1,1.8,8,1.8,361.9,6.6,61.3,0.7,90.6,0.5,7,6.2,1.1,0,6,56,0,0,4,2,2,0,0,7,0,0,0,45,0,4,0,7,0,1,1,1,2,3,0,0.3966,0.2943,1.7788,0.7966,0.08345,0.2342,-0.6626,2.2162,0.0872,0.09068,-0.6014,0.2625,-0.6606,0.3367,-0.009864,0.1508,1.7178,0.3541,2.8525,0.9886,0.6971,3.1623,0,29.7263,351.6881,0.4216,4.0069,39.7309,0.9487,11.1759,4.9497,13.0491,78.3162,0.9189,6.7987,1.1353,249.7009,4.5019,43.5917,0.9487,63.9986,1.2693,6.0369,4.0497,0.3162,0,459,5141,2,35,561,33,217,25,165,1151,18,80,18,3619,66,613,7,906,5,70,62,11,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1985
1.0,550000,bedrost01,ATL,NL,ATL,,,,,,,Rookie of the Year,NL,NL,ATL,NL,NLCS,ATL,P,NL,ATL,P,NL,NLCS,ATL,NL,ATL,NL,NLCS,ATL,1985,0,,,,,,,,,,,,,,,,,,,0,0,0,0,,,,1,120,4,0,120,4,0,120,4,0,1,1,,,,,,,120,4,0,4,0,0,26,1,0,70,0,70,70,2,0,0,0,0,0,0,0,4,9,1,0,0,16,0.25,0,47.25,0,47.25,47.25,1.25,0,0,0,0,0,0,0,1.5,5.75,1,0,0,2,0,0,15,0,15,15,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,-1.1105,2.,0,-0.7352,0,-0.7352,-0.7352,-0.8546,0,0,0,0,0,0,0,1.5396,-1.1985,0,0,0,10.0995,0.5,0,25.1048,0,25.1048,25.1048,0.9574,0,0,0,0,0,0,0,1.7321,3.4034,0,0,0,64,1,0,189,0,189,189,5,0,0,0,0,0,0,0,6,23,4,1,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,1,1,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,4,16,2,1,70,4,413,12,1,10,1.25,0.5,47.25,2.25,274.25,4.5,1,2,0,0,15,1,73,1,1,1,1,1,-0.6325,-0.8546,0,-0.7352,0.3704,-0.9572,1.5966,0,6.3246,0.9574,0.5774,25.1048,1.5,150.163,5.1962,0,40,5,2,189,9,1097,18,4,1,0,0,0,0,2,0,3,0,0,0,0,0,0,0,2,0,3,0,0,0,0,0,0,0,2,0,3,0,0,0,1,1,1,1,,,,,,,,,,,,,,,,,,,,,0,0,0,0,2,0,3,0,0,0,4,0,57,567,0,0,48,4,70,52,4,102,4,11,8,413,10,50,0,123,19,9,4,1,0,39,380.5,0,0,29.75,2.75,47.25,28.75,2.25,70.5,2.5,6.25,5,274.25,6,31.5,0,81.75,10.25,6.75,1.5,1,0,15,106,0,0,12,2,15,5,1,15,1,2,2,73,2,14,0,9,0,1,0,1,1,1,0,-0.6325,-0.9498,0,0,0.06631,0.8546,-0.7352,-0.0778,0.3704,-1.128,0,0.3579,0,-0.9572,0,0.1248,0,-1.353,-0.5695,-1.9137,0.8546,0,0,18.9737,205.4629,0,0,15.9243,0.9574,25.1048,19.2072,1.5,40.7145,1.7321,3.7749,2.4495,150.163,3.266,16.0935,0,51.7518,7.8049,3.8622,1.9149,0,0,156,1522,0,0,119,11,189,115,9,282,10,25,20,1097,24,126,0,327,41,27,6,4,1,0,1,7,0,0,2,18,2,0,0,0,3,0,0,0,3,0,2,1,0,0,2,0,0,1,0,1,7,0,0,2,18,2,0,0,0,3,0,0,0,3,0,2,1,0,0,2,0,0,1,0,1,7,0,0,2,18,2,0,0,0,3,0,0,0,3,0,2,1,0,0,2,0,0,1,1,1,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,1,7,0,0,2,18,2,0,0,0,3,0,0,0,3,0,2,1,0,0,2,0,0,1,1,1,1,1985
2.0,545000,benedbr01,ATL,NL,ATL,ALS198108090,NL,ATL,,,,,,NL,ATL,NL,NLCS,ATL,C,NL,ATL,C,NL,NLCS,ATL,,,,,,1985,2,1,0,,1,0,,1,0,,2,1,1,,,,0,0,,2,0,0,0,,,,0,,,,,,,,,,,,,,,,,,0,0,0,7,14,1,423,61,4,134,12,134,134,126,3,5,16,43,44,4,4,13,40,1,10.1429,0.7143,288.4286,33.1429,2.2857,93.5714,7.7143,93.5714,93.5714,73.7143,1.2857,2.2857,6.4286,23.4286,28.1429,1.5714,2.2857,4.5714,24.2857,1,2,0,52,6,0,22,0,22,22,13,0,0,2,3,1,0,0,0,6,1,1,1,-1.6097,-1.2296,-1.1677,0.08628,-0.7065,-1.2212,-1.1445,-1.2212,-1.2212,-0.4194,0.6817,0.0508,1.3403,-0.08253,-0.8802,1.0788,-0.7065,1.2214,-0.1462,0,4.0591,0.488,126.3683,16.0979,1.3801,37.3624,4.2706,37.3624,37.3624,36.5227,1.2536,1.8898,4.9952,13.1891,15.6677,1.3973,1.3801,4.4293,11.3242,0,71,5,2019,232,16,655,54,655,655,516,9,16,45,164,197,11,16,32,170,7,1,1,0,8,2,0,3,0,2,0,0,0,1,0,0,0,0,1,1,0,8,2,0,3,0,2,0,0,0,1,0,0,0,0,1,1,0,8,2,0,3,0,2,0,0,0,1,0,0,0,0,1,1,1,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,1,0,8,2,0,3,0,2,0,0,0,1,0,0,0,0,1,7,91,12,7,134,128,3320,738,1,57,5.7143,5.4286,93.5714,88,2309.8571,453.5714,1,14,1,1,22,16,423,81,1,1,1,1,-0.4571,0.4045,-1.7671,-1.2212,-1.3323,-1.3386,-0.6846,0,28.1721,3.9881,2.1492,37.3624,37.0945,976.072,208.6032,0,399,40,38,655,616,16169,3175,7,1,2,0,0,0,3,3,76,16,1,0,2,0,0,0,3,3,76,16,1,0,2,0,0,0,3,3,76,16,1,0,1,1,1,1,,,,,,,,,,,,,,,,,,,,,2,0,0,0,3,3,76,16,1,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1985
3.0,625000,ceronri01,ATL,NL,ATL,,,,TSN All-Star,AL,C,MVP,AL,AL,NYA,AL,ALCS,NYA,,,,C,AL,ALCS,NYA,,,,,,1985,0,,,,,,,,,,,,,,,,,,,0,0,0,1,1,1,1,1,392,77,1,392,77,1,392,77,1,1,1,,,,,,,392,77,1,10,30,4,519,37,4,147,14,147,147,144,6,14,2,70,85,1,10,8,56,1,10.3,1.2,229.8,15.4,1.4,69.4,6.1,69.4,69.4,54.8,1.1,3.6,0.5,22.9,26.1,0.3,2.3,3.1,23.7,1,0,0,12,0,0,7,0,7,7,2,0,0,0,1,0,0,0,0,0,1,1,3,1.1975,1.0006,0.3925,0.5283,0.4414,0.2422,0.3565,0.2422,0.2422,0.7775,2.7111,1.8908,1.1785,1.201,1.4726,1.0351,1.4583,0.828,0.3548,0,10.4142,1.6865,173.4735,12.4651,1.5776,48.9335,4.7246,48.9335,48.9335,46.0019,1.792,4.2479,0.7071,21.8553,26.9008,0.483,3.401,2.2828,17.1985,0,103,12,2298,154,14,694,61,694,694,548,11,36,5,229,261,3,23,31,237,10,4,2,0,21,4,0,6,1,6,1,1,1,2,5,0,0,1,2,0.75,0,15.25,1,0,4.25,0.25,3.75,0.25,0.75,0.25,1.25,2.5,0,0,0.25,1.25,0,0,10,0,0,3,0,1,0,0,0,1,0,0,0,0,0,1,3,1,0.8546,0,0.158,2.,0,0.3704,2.,-0.7133,2.,-2.,2.,2.,0,0,0,2.,-0.8546,0.9574,0,5.1235,2,0,1.5,0.5,2.0616,0.5,0.5,0.5,0.5,2.0817,0,0,0.5,0.9574,3,0,61,4,0,17,1,15,1,3,1,5,10,0,0,1,5,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,4,4,5,1,1,6,6,153,42,6,0,2.75,2,0.25,0.25,4.25,4.25,110.25,30.25,3,0,1,0,0,0,3,3,75,14,1,0,1,1,3,1,-0.3704,1.1903,2.,2.,0.3704,0.3704,0.1702,-0.3427,1.1903,0,1.5,2.1602,0.5,0.5,1.5,1.5,39.6768,14.0564,2.1602,0,11,8,1,1,17,17,441,121,12,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1985
4.0,800000,chambch01,ATL,NL,ATL,NLS197607130,AL,NYA,Gold Glove,AL,1B,MVP,AL,AL,NYA,AL,ALCS,NYA,,,,1B,AL,ALCS,NYA,,,,,,1985,1,1,0,,1,0,,1,0,,1,1,1,,,,,,,1,0,0,3,3,1,2,4,392,71,11,272,21.25,2.75,24,1,0,2,2,-1.8467,1.8957,2.,167.4276,33.4701,5.5,1088,85,11,15,38,6,641,63,8,162,22,162,162,188,5,20,15,90,96,7,6,4,83,2,25.1333,2.8,485.2,39.9333,2.2,131.7333,10.6,131.7333,131.7333,135.4,1.8,12,6.0667,58.8667,62.4667,2.6667,3.6,1.2667,58.6,1.0667,4,0,67,5,0,17,1,17,17,22,0,0,1,8,7,0,0,0,5,1,2,3,-0.8293,0.3528,-1.7276,-0.4725,1.414,-2.2755,0.7222,-2.2755,-2.2755,-1.3168,0.6688,-0.2249,0.5322,-0.7309,-0.5715,0.7373,-0.5448,0.7649,-1.3754,3.873,8.9192,1.7403,145.3912,16.0911,2.4842,37.2721,4.9828,37.2721,37.2721,42.6946,1.8205,6.0356,4.8176,20.9995,24.1805,2.2573,1.9928,1.2228,19.7042,0.2582,377,42,7278,599,33,1976,159,1976,1976,2031,27,180,91,883,937,40,54,19,879,16,7,2,1,24,3,0,6,1,11,1,2,2,5,8,2,1,0,4,0.5714,0.1429,16.2857,0.7143,0,4.2857,0.4286,4.5714,0.1429,0.4286,0.4286,1.7143,2.1429,0.2857,0.1429,0,2,0,0,10,0,0,3,0,0,0,0,0,0,0,0,0,0,0,2,3,2,1.1145,2.6458,0.3056,1.7836,0,0.2489,0.3742,0.504,2.6458,1.7598,1.7598,1.0961,1.5735,2.6458,2.6458,0,0.3928,0.7868,0.378,5.0238,1.1127,0,1.1127,0.5345,3.8668,0.378,0.7868,0.7868,1.976,2.9681,0.7559,0.378,0,1.5275,4,1,114,5,0,30,3,32,1,3,3,12,15,2,1,0,14,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,7,6,,6,1,6,6,168,55,,0,3.2857,,2.5714,0.2857,4.2857,4.2857,114,34.7143,,0,1,,0,0,3,3,76,17,,0,1,2,3,2,0.0508,,0.6545,1.2296,0.2489,0.2489,0.528,0.5058,,0,1.8898,,1.9881,0.488,1.1127,1.1127,32.3471,13.5119,,0,23,0,18,2,30,30,798,243,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1985
,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
18567.0,4875000,strasst01,WAS,NL,WAS,,,,,,,,,,,,,,,,,,,,,,,,,,2012,0,,,,,,,,,,,,,,,,,,,0,0,0,0,,,,0,,,,,,,,,,,,,,,,,,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,6,2012
18568.0,750000,tracych01,WAS,NL,WAS,,,,,,,,,NL,ARI,,,,1B,NL,ARI,,,,,,,,,,2012,0,,,,,,,,,,,,,,,,,,,0,0,0,0,,,,0,,,,,,,,,,,,,,,,,,0,0,0,8,41,4,597,54,3,154,11,154,154,168,8,27,7,91,80,5,6,1,129,2,20.125,1.125,310.5,27,0.625,96.625,6.25,96.625,121.2,86.25,2.25,9.875,3.125,38,41.625,1.375,3.25,0.375,54.125,1.125,2,0,44,5,0,28,0,28,76,11,0,0,0,5,5,0,0,0,15,1,1,3,0.2672,1.0284,0.168,0.1675,1.9604,-0.1779,-0.1983,-0.1779,-0.6095,0.2779,1.5549,1.0695,0.04936,0.8056,0.05028,1.2615,-0.3143,0.6441,1.3153,2.8284,13.5376,1.6421,197.8253,17.5987,1.0607,47.9403,4.3342,47.9403,36.2726,59.6436,2.8158,9.1875,2.4165,30.4959,26.5219,1.8468,2.1213,0.5175,36.3492,0.3536,161,9,2484,216,5,773,50,773,606,690,18,79,25,304,333,11,26,3,433,9,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,24,260,75,25,147,143,3834,706,2,39.1,12.4,3.7,30.9583,32.9,862.9,118.95,1.0833,0,0,0,1,0,3,2,1,6,1,3,2.5607,1.974,2.7175,1.7042,1.4604,1.5566,2.1377,3.22,77.8716,20.7171,7.4063,41.8345,41.7081,1103.0638,201.6588,0.2823,782,248,74,743,658,17258,2379,26,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,6,2012
18569.0,4000000,wangch01,WAS,NL,WAS,,,,,,,Cy Young,AL,AL,NYA,,,,P,AL,NYA,P,AL,ALDS2,NYA,AL,NYA,AL,ALDS2,NYA,2012,0,,,,,,,,,,,,,,,,,,,0,0,0,0,,,,2,392,51,0,266,26.5,0,140,2,0,2,1,,,,178.1909,34.6482,0,532,53,0,6,0,0,19,1,0,34,1,11,34,1,0,0,0,1,1,0,0,1,10,1,0,0,5.5,0.1667,0,20,0.1667,2.8333,13,0.1667,0,0,0,0.1667,0.1667,0,0,0.1667,3,1,0,0,1,0,0,11,0,1,1,0,0,0,0,0,0,0,0,0,0,1,2,2,0,0,2.3279,2.4495,0,0.7801,2.4495,2.3973,0.9459,2.4495,0,0,0,2.4495,2.4495,0,0,2.4495,1.7227,0,0,0,6.6858,0.4082,0,9.6954,0.4082,4.0208,13.7659,0.4082,0,0,0,0.4082,0.4082,0,0,0.4082,3.6878,0,0,0,33,1,0,120,1,17,65,1,0,0,0,1,1,0,0,1,18,6,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,6,42,4,1,34,33,654,15,1,23.3333,1.8333,0.5,20,19.1667,366.5,7.1667,1,6,0,0,11,9,126,3,1,1,2,2,0.0138,0.02609,0,0.7801,0.6705,0.4743,0.9105,0,16.4398,1.7224,0.5477,9.6954,10.0083,215.9896,4.916,0,140,11,3,120,115,2199,43,6,3,2,1,0,1,2,2,20,0,0,0,1.3333,1,0,0.3333,1.3333,1.3333,19,0,0,0,1,1,0,0,1,1,17,0,0,0,1,1,2,1,1.7321,0,0,1.7321,1.7321,1.7321,-1.7321,0,0,0,0.5774,0,0,0.5774,0.5774,0.5774,1.7321,0,0,0,4,3,0,1,4,4,57,0,0,0,6,,59,900,1,2,88,9,34,2,33,233,8,12,4,654,7,92,1,104,1,19,9,1,,35,513.5,0.3333,0.6667,56.3333,4.5,20,0.5,19.1667,128,3.6667,8.1667,1.6667,366.5,4.8333,59.8333,0.1667,55.8333,0.1667,9.8333,3.8333,1,,13,206,0,0,28,3,11,0,9,66,1,4,0,126,2,35,0,25,0,1,0,1,2,2,,0.2052,0.5187,0.9682,0.8573,0.5011,2.188,0.7801,1.5367,0.6705,0.8268,0.9639,-0.2683,0.84,0.4743,-0.6384,0.6152,2.4495,0.7902,2.4495,0.4422,0.7815,0,,17.9666,288.1553,0.5164,0.8165,23.6192,2.2583,9.6954,0.8367,10.0083,71.1337,2.7325,2.6394,1.5055,215.9896,1.9408,23.1553,0.4082,29.9294,0.4082,7.5741,3.1885,0,0,210,3081,2,4,338,27,120,3,115,768,22,49,10,2199,29,359,1,335,1,59,23,6,3,0,4,34,0,0,12,19,2,0,1,2,14,2,3,0,20,2,12,0,2,0,4,0,1,0,0,1.6667,30,0,0,5.3333,8,1.3333,0,0.6667,1.3333,9.3333,1,1.6667,0,19,1,6.3333,0,0.6667,0,2.3333,0,0.3333,0,0,0,27,0,0,1,1,1,0,0,1,6,0,1,
0,17,0,3,0,0,0,1,0,0,0,1,2,1,0,1.2933,1.1521,0,0,1.5078,1.5454,1.7321,0,-1.7321,1.7321,1.2933,0,1.7321,0,-1.7321,0,1.6523,0,1.7321,0,0.9352,0,1.7321,0,0,2.0817,3.6056,0,0,5.8595,9.6437,0.5774,0,0.5774,0.5774,4.1633,1,1.1547,0,1.7321,1,4.9329,0,1.1547,0,1.5275,0,0.5774,0,0,5,90,0,0,16,24,4,0,2,4,28,3,5,0,57,3,19,0,2,0,7,0,1,0,1,1,6,2012
18570.0,13571428,werthja01,WAS,NL,WAS,NLS200907140,NL,PHI,,,,MVP,NL,NL,PHI,NL,NLCS,PHI,CF,NL,PHI,RF,NL,NLDS2,PHI,,,,,,2012,1,1,0,,1,0,,1,0,,1,1,1,,,,,,,1,0,0,0,,,,2,448,52,0,448,31,0,448,10,0,1,1,,,,0,29.6985,0,896,62,0,9,46,3,571,91,3,159,11,159,150,164,10,36,8,106,99,20,9,2,160,1,18.2222,1.7778,342.2222,48.3333,1.5556,102.7778,5.4444,102.7778,87.1429,90.4444,4.5556,15.5556,2.5556,55.7778,51.5556,10.6667,3.1111,0.4444,98.5556,1,2,0,46,3,0,15,0,15,15,10,0,0,0,4,6,1,0,0,11,1,2,4,0.9258,-0.1885,-0.3723,-0.2173,0.09246,-0.6935,0.1026,-0.6935,-0.4489,-0.276,0.05138,0.3008,0.9426,-0.174,-0.1046,0.02164,1.1347,1.5007,-0.563,0,13.6086,1.0929,204.7778,31.4046,1.236,53.6954,4.9526,53.6954,50.5979,55.471,3.504,12.2282,3.0046,35.5239,30.725,7.8581,2.8916,0.7265,55.2746,0,164,16,3080,435,14,925,49,925,610,814,41,140,23,502,464,96,28,4,887,9,10,3,1,21,6,0,6,1,8,1,3,2,5,6,3,1,0,7,0.9,0.2,15.3,2.7,0,4.4,0.3,4.1,0.1,1.3,0.4,3,2.6,0.5,0.1,0,5.2,0,0,3,0,0,2,0,0,0,0,0,0,0,0,0,0,1,1,4,2,1.2043,1.7788,-1.6814,0.3071,0,-0.5435,1.0351,-0.2578,3.1623,-0.04206,1.6577,-0.5031,0.2005,2.2698,3.1623,0,-0.9788,1.1972,0.4216,5.1001,2.0028,0,1.2649,0.483,2.079,0.3162,1.0593,0.6992,1.4907,2.0656,0.9718,0.3162,0,1.9889,9,2,153,27,0,44,3,41,1,13,4,30,26,5,1,0,52,35,11,4,8,157,152,4138,353,1,3.4545,0.9697,1.5152,48.6,44.3636,1205.3939,98.2121,1,0,0,0,1,0,3,1,1,6,2,4,0.702,1.0714,1.3512,0.969,1.0709,1.0889,1.1254,0,3.7757,1.2115,2.1083,50.9678,50.33,1351.3185,106.7728,0,114,32,50,1701,1464,39778,3241,35,12,1,,1,1,6,6,158,14,,0,0.3333,,0.1667,0.08333,3.75,3.5833,94.5833,7.0833,,0,0,,0,0,1,0,2,0,,0,2,1,4,2,0.8124,,2.0552,3.4641,-0.5741,-0.794,-0.7422,-0.4648,,0,0.4924,,0.3892,0.2887,1.8647,2.1515,54.9669,4.3788,,0,4,0,2,1,45,43,1135,85,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,6,2012


In [58]:
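# Mark the salary column as the prediction target.
featuretools_test.set_role("salary", getml.data.roles.target)
# Treat the remaining float columns as numerical and the remaining string columns as categorical.
featuretools_test.set_role(featuretools_test.roles.unused_float, getml.data.roles.numerical)
featuretools_test.set_role(featuretools_test.roles.unused_string, getml.data.roles.categorical)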

featuretools_test

91.451862% of all entries of column 'MAX(allstarfull.startingPos)' are NULL values.
91.451862% of all entries of column 'MEAN(allstarfull.startingPos)' are NULL values.
91.451862% of all entries of column 'MIN(allstarfull.startingPos)' are NULL values.
91.914519% of all entries of column 'SKEW(allstarfull.GP)' are NULL values.
91.892487% of all entries of column 'SKEW(allstarfull.gameNum)' are NULL values.
97.884997% of all entries of column 'SKEW(allstarfull.startingPos)' are NULL values.
96.342807% of all entries of column 'STD(allstarfull.startingPos)' are NULL values.
90.746861% of all entries of column 'SKEW(pitchingpost.BAOpp)' are NULL values.
90.724829% of all entries of column 'SKEW(pitchingpost.BB)' are NULL values.
90.724829% of all entries of column 'SKEW(pitchingpost.BFP)' are NULL values.
90.724829% of all entries of column 'SKEW(pitchingpost.BK)' are NULL values.
90.724829% of all entries of column 'SKEW(pitchingpost.CG)' are NULL values.
90.724829% of all entries of col

name,salary,playerID,teamID,lgID,teamIDCat,MODE(allstarfull.gameID),MODE(allstarfull.lgID),MODE(allstarfull.teamID),MODE(awardsplayers.awardID),MODE(awardsplayers.lgID),MODE(awardsplayers.notes),MODE(awardsshareplayers.awardID),MODE(awardsshareplayers.lgID),MODE(batting.lgID),MODE(batting.teamID),MODE(battingpost.lgID),MODE(battingpost.round),MODE(battingpost.teamID),MODE(fielding.POS),MODE(fielding.lgID),MODE(fielding.teamID),MODE(fieldingpost.POS),MODE(fieldingpost.lgID),MODE(fieldingpost.round),MODE(fieldingpost.teamID),MODE(pitching.lgID),MODE(pitching.teamID),MODE(pitchingpost.lgID),MODE(pitchingpost.round),MODE(pitchingpost.teamID),yearID,COUNT(allstarfull),MAX(allstarfull.GP),MAX(allstarfull.gameNum),MAX(allstarfull.startingPos),MEAN(allstarfull.GP),MEAN(allstarfull.gameNum),MEAN(allstarfull.startingPos),MIN(allstarfull.GP),MIN(allstarfull.gameNum),MIN(allstarfull.startingPos),NUM_UNIQUE(allstarfull.gameID),NUM_UNIQUE(allstarfull.lgID),NUM_UNIQUE(allstarfull.teamID),SKEW(allstarfull.GP),SKEW(allstarfull.gameNum),SKEW(allstarfull.startingPos),STD(allstarfull.GP),STD(allstarfull.gameNum),STD(allstarfull.startingPos),SUM(allstarfull.GP),SUM(allstarfull.gameNum),SUM(allstarfull.startingPos),COUNT(awardsplayers),NUM_UNIQUE(awardsplayers.awardID),NUM_UNIQUE(awardsplayers.lgID),NUM_UNIQUE(awardsplayers.notes),COUNT(awardsshareplayers),MAX(awardsshareplayers.pointsMax),MAX(awardsshareplayers.pointsWon),MAX(awardsshareplayers.votesFirst),MEAN(awardsshareplayers.pointsMax),MEAN(awardsshareplayers.pointsWon),MEAN(awardsshareplayers.votesFirst),MIN(awardsshareplayers.pointsMax),MIN(awardsshareplayers.pointsWon),MIN(awardsshareplayers.votesFirst),NUM_UNIQUE(awardsshareplayers.awardID),NUM_UNIQUE(awardsshareplayers.lgID),SKEW(awardsshareplayers.pointsMax),SKEW(awardsshareplayers.pointsWon),SKEW(awardsshareplayers.votesFirst),STD(awardsshareplayers.pointsMax),STD(awardsshareplayers.pointsWon),STD(awardsshareplayers.votesFirst),SUM(awardsshareplayers.pointsMax),SUM(awardssha
replayers.pointsWon),SUM(awardsshareplayers.votesFirst),COUNT(batting),MAX(batting.2B),MAX(batting.3B),MAX(batting.AB),MAX(batting.BB),MAX(batting.CS),MAX(batting.G),MAX(batting.GIDP),MAX(batting.G_batting),MAX(batting.G_old),MAX(batting.H),MAX(batting.HBP),MAX(batting.HR),MAX(batting.IBB),MAX(batting.R),MAX(batting.RBI),MAX(batting.SB),MAX(batting.SF),MAX(batting.SH),MAX(batting.SO),MAX(batting.stint),MEAN(batting.2B),MEAN(batting.3B),MEAN(batting.AB),MEAN(batting.BB),MEAN(batting.CS),MEAN(batting.G),MEAN(batting.GIDP),MEAN(batting.G_batting),MEAN(batting.G_old),MEAN(batting.H),MEAN(batting.HBP),MEAN(batting.HR),MEAN(batting.IBB),MEAN(batting.R),MEAN(batting.RBI),MEAN(batting.SB),MEAN(batting.SF),MEAN(batting.SH),MEAN(batting.SO),MEAN(batting.stint),MIN(batting.2B),MIN(batting.3B),MIN(batting.AB),MIN(batting.BB),MIN(batting.CS),MIN(batting.G),MIN(batting.GIDP),MIN(batting.G_batting),MIN(batting.G_old),MIN(batting.H),MIN(batting.HBP),MIN(batting.HR),MIN(batting.IBB),MIN(batting.R),MIN(batting.RBI),MIN(batting.SB),MIN(batting.SF),MIN(batting.SH),MIN(batting.SO),MIN(batting.stint),NUM_UNIQUE(batting.lgID),NUM_UNIQUE(batting.teamID),SKEW(batting.2B),SKEW(batting.3B),SKEW(batting.AB),SKEW(batting.BB),SKEW(batting.CS),SKEW(batting.G),SKEW(batting.GIDP),SKEW(batting.G_batting),SKEW(batting.G_old),SKEW(batting.H),SKEW(batting.HBP),SKEW(batting.HR),SKEW(batting.IBB),SKEW(batting.R),SKEW(batting.RBI),SKEW(batting.SB),SKEW(batting.SF),SKEW(batting.SH),SKEW(batting.SO),SKEW(batting.stint),STD(batting.2B),STD(batting.3B),STD(batting.AB),STD(batting.BB),STD(batting.CS),STD(batting.G),STD(batting.GIDP),STD(batting.G_batting),STD(batting.G_old),STD(batting.H),STD(batting.HBP),STD(batting.HR),STD(batting.IBB),STD(batting.R),STD(batting.RBI),STD(batting.SB),STD(batting.SF),STD(batting.SH),STD(batting.SO),STD(batting.stint),SUM(batting.2B),SUM(batting.3B),SUM(batting.AB),SUM(batting.BB),SUM(batting.CS),SUM(batting.G),SUM(batting.GIDP),SUM(batting.G_batting),SUM(batting.G_old),SUM(bat
ting.H),SUM(batting.HBP),SUM(batting.HR),SUM(batting.IBB),SUM(batting.R),SUM(batting.RBI),SUM(batting.SB),SUM(batting.SF),SUM(batting.SH),SUM(batting.SO),SUM(batting.stint),COUNT(battingpost),MAX(battingpost.2B),MAX(battingpost.3B),MAX(battingpost.AB),MAX(battingpost.BB),MAX(battingpost.CS),MAX(battingpost.G),MAX(battingpost.GIDP),MAX(battingpost.H),MAX(battingpost.HBP),MAX(battingpost.HR),MAX(battingpost.IBB),MAX(battingpost.R),MAX(battingpost.RBI),MAX(battingpost.SB),MAX(battingpost.SF),MAX(battingpost.SH),MAX(battingpost.SO),MEAN(battingpost.2B),MEAN(battingpost.3B),MEAN(battingpost.AB),MEAN(battingpost.BB),MEAN(battingpost.CS),MEAN(battingpost.G),MEAN(battingpost.GIDP),MEAN(battingpost.H),MEAN(battingpost.HBP),MEAN(battingpost.HR),MEAN(battingpost.IBB),MEAN(battingpost.R),MEAN(battingpost.RBI),MEAN(battingpost.SB),MEAN(battingpost.SF),MEAN(battingpost.SH),MEAN(battingpost.SO),MIN(battingpost.2B),MIN(battingpost.3B),MIN(battingpost.AB),MIN(battingpost.BB),MIN(battingpost.CS),MIN(battingpost.G),MIN(battingpost.GIDP),MIN(battingpost.H),MIN(battingpost.HBP),MIN(battingpost.HR),MIN(battingpost.IBB),MIN(battingpost.R),MIN(battingpost.RBI),MIN(battingpost.SB),MIN(battingpost.SF),MIN(battingpost.SH),MIN(battingpost.SO),NUM_UNIQUE(battingpost.lgID),NUM_UNIQUE(battingpost.round),NUM_UNIQUE(battingpost.teamID),SKEW(battingpost.2B),SKEW(battingpost.3B),SKEW(battingpost.AB),SKEW(battingpost.BB),SKEW(battingpost.CS),SKEW(battingpost.G),SKEW(battingpost.GIDP),SKEW(battingpost.H),SKEW(battingpost.HBP),SKEW(battingpost.HR),SKEW(battingpost.IBB),SKEW(battingpost.R),SKEW(battingpost.RBI),SKEW(battingpost.SB),SKEW(battingpost.SF),SKEW(battingpost.SH),SKEW(battingpost.SO),STD(battingpost.2B),STD(battingpost.3B),STD(battingpost.AB),STD(battingpost.BB),STD(battingpost.CS),STD(battingpost.G),STD(battingpost.GIDP),STD(battingpost.H),STD(battingpost.HBP),STD(battingpost.HR),STD(battingpost.IBB),STD(battingpost.R),STD(battingpost.RBI),STD(battingpost.SB),STD(battingpost.SF),STD(battingpos
t.SH),STD(battingpost.SO),SUM(battingpost.2B),SUM(battingpost.3B),SUM(battingpost.AB),SUM(battingpost.BB),SUM(battingpost.CS),SUM(battingpost.G),SUM(battingpost.GIDP),SUM(battingpost.H),SUM(battingpost.HBP),SUM(battingpost.HR),SUM(battingpost.IBB),SUM(battingpost.R),SUM(battingpost.RBI),SUM(battingpost.SB),SUM(battingpost.SF),SUM(battingpost.SH),SUM(battingpost.SO),COUNT(fielding),MAX(fielding.A),MAX(fielding.DP),MAX(fielding.E),MAX(fielding.G),MAX(fielding.GS),MAX(fielding.InnOuts),MAX(fielding.PO),MAX(fielding.stint),MEAN(fielding.A),MEAN(fielding.DP),MEAN(fielding.E),MEAN(fielding.G),MEAN(fielding.GS),MEAN(fielding.InnOuts),MEAN(fielding.PO),MEAN(fielding.stint),MIN(fielding.A),MIN(fielding.DP),MIN(fielding.E),MIN(fielding.G),MIN(fielding.GS),MIN(fielding.InnOuts),MIN(fielding.PO),MIN(fielding.stint),NUM_UNIQUE(fielding.POS),NUM_UNIQUE(fielding.lgID),NUM_UNIQUE(fielding.teamID),SKEW(fielding.A),SKEW(fielding.DP),SKEW(fielding.E),SKEW(fielding.G),SKEW(fielding.GS),SKEW(fielding.InnOuts),SKEW(fielding.PO),SKEW(fielding.stint),STD(fielding.A),STD(fielding.DP),STD(fielding.E),STD(fielding.G),STD(fielding.GS),STD(fielding.InnOuts),STD(fielding.PO),STD(fielding.stint),SUM(fielding.A),SUM(fielding.DP),SUM(fielding.E),SUM(fielding.G),SUM(fielding.GS),SUM(fielding.InnOuts),SUM(fielding.PO),SUM(fielding.stint),COUNT(fieldingpost),MAX(fieldingpost.A),MAX(fieldingpost.CS),MAX(fieldingpost.DP),MAX(fieldingpost.E),MAX(fieldingpost.G),MAX(fieldingpost.GS),MAX(fieldingpost.InnOuts),MAX(fieldingpost.PO),MAX(fieldingpost.SB),MAX(fieldingpost.TP),MEAN(fieldingpost.A),MEAN(fieldingpost.CS),MEAN(fieldingpost.DP),MEAN(fieldingpost.E),MEAN(fieldingpost.G),MEAN(fieldingpost.GS),MEAN(fieldingpost.InnOuts),MEAN(fieldingpost.PO),MEAN(fieldingpost.SB),MEAN(fieldingpost.TP),MIN(fieldingpost.A),MIN(fieldingpost.CS),MIN(fieldingpost.DP),MIN(fieldingpost.E),MIN(fieldingpost.G),MIN(fieldingpost.GS),MIN(fieldingpost.InnOuts),MIN(fieldingpost.PO),MIN(fieldingpost.SB),MIN(fieldingpost.TP),NUM_UNIQU
E(fieldingpost.POS),NUM_UNIQUE(fieldingpost.lgID),NUM_UNIQUE(fieldingpost.round),NUM_UNIQUE(fieldingpost.teamID),SKEW(fieldingpost.A),SKEW(fieldingpost.CS),SKEW(fieldingpost.DP),SKEW(fieldingpost.E),SKEW(fieldingpost.G),SKEW(fieldingpost.GS),SKEW(fieldingpost.InnOuts),SKEW(fieldingpost.PO),SKEW(fieldingpost.SB),SKEW(fieldingpost.TP),STD(fieldingpost.A),STD(fieldingpost.CS),STD(fieldingpost.DP),STD(fieldingpost.E),STD(fieldingpost.G),STD(fieldingpost.GS),STD(fieldingpost.InnOuts),STD(fieldingpost.PO),STD(fieldingpost.SB),STD(fieldingpost.TP),SUM(fieldingpost.A),SUM(fieldingpost.CS),SUM(fieldingpost.DP),SUM(fieldingpost.E),SUM(fieldingpost.G),SUM(fieldingpost.GS),SUM(fieldingpost.InnOuts),SUM(fieldingpost.PO),SUM(fieldingpost.SB),SUM(fieldingpost.TP),COUNT(pitching),MAX(pitching.BAOpp),MAX(pitching.BB),MAX(pitching.BFP),MAX(pitching.BK),MAX(pitching.CG),MAX(pitching.ER),MAX(pitching.ERA),MAX(pitching.G),MAX(pitching.GF),MAX(pitching.GS),MAX(pitching.H),MAX(pitching.HBP),MAX(pitching.HR),MAX(pitching.IBB),MAX(pitching.IPouts),MAX(pitching.L),MAX(pitching.R),MAX(pitching.SHO),MAX(pitching.SO),MAX(pitching.SV),MAX(pitching.W),MAX(pitching.WP),MAX(pitching.stint),MEAN(pitching.BAOpp),MEAN(pitching.BB),MEAN(pitching.BFP),MEAN(pitching.BK),MEAN(pitching.CG),MEAN(pitching.ER),MEAN(pitching.ERA),MEAN(pitching.G),MEAN(pitching.GF),MEAN(pitching.GS),MEAN(pitching.H),MEAN(pitching.HBP),MEAN(pitching.HR),MEAN(pitching.IBB),MEAN(pitching.IPouts),MEAN(pitching.L),MEAN(pitching.R),MEAN(pitching.SHO),MEAN(pitching.SO),MEAN(pitching.SV),MEAN(pitching.W),MEAN(pitching.WP),MEAN(pitching.stint),MIN(pitching.BAOpp),MIN(pitching.BB),MIN(pitching.BFP),MIN(pitching.BK),MIN(pitching.CG),MIN(pitching.ER),MIN(pitching.ERA),MIN(pitching.G),MIN(pitching.GF),MIN(pitching.GS),MIN(pitching.H),MIN(pitching.HBP),MIN(pitching.HR),MIN(pitching.IBB),MIN(pitching.IPouts),MIN(pitching.L),MIN(pitching.R),MIN(pitching.SHO),MIN(pitching.SO),MIN(pitching.SV),MIN(pitching.W),MIN(pitching.WP),MIN(pitching.stint)
,NUM_UNIQUE(pitching.lgID),NUM_UNIQUE(pitching.teamID),SKEW(pitching.BAOpp),SKEW(pitching.BB),SKEW(pitching.BFP),SKEW(pitching.BK),SKEW(pitching.CG),SKEW(pitching.ER),SKEW(pitching.ERA),SKEW(pitching.G),SKEW(pitching.GF),SKEW(pitching.GS),SKEW(pitching.H),SKEW(pitching.HBP),SKEW(pitching.HR),SKEW(pitching.IBB),SKEW(pitching.IPouts),SKEW(pitching.L),SKEW(pitching.R),SKEW(pitching.SHO),SKEW(pitching.SO),SKEW(pitching.SV),SKEW(pitching.W),SKEW(pitching.WP),SKEW(pitching.stint),STD(pitching.BAOpp),STD(pitching.BB),STD(pitching.BFP),STD(pitching.BK),STD(pitching.CG),STD(pitching.ER),STD(pitching.ERA),STD(pitching.G),STD(pitching.GF),STD(pitching.GS),STD(pitching.H),STD(pitching.HBP),STD(pitching.HR),STD(pitching.IBB),STD(pitching.IPouts),STD(pitching.L),STD(pitching.R),STD(pitching.SHO),STD(pitching.SO),STD(pitching.SV),STD(pitching.W),STD(pitching.WP),STD(pitching.stint),SUM(pitching.BAOpp),SUM(pitching.BB),SUM(pitching.BFP),SUM(pitching.BK),SUM(pitching.CG),SUM(pitching.ER),SUM(pitching.ERA),SUM(pitching.G),SUM(pitching.GF),SUM(pitching.GS),SUM(pitching.H),SUM(pitching.HBP),SUM(pitching.HR),SUM(pitching.IBB),SUM(pitching.IPouts),SUM(pitching.L),SUM(pitching.R),SUM(pitching.SHO),SUM(pitching.SO),SUM(pitching.SV),SUM(pitching.W),SUM(pitching.WP),SUM(pitching.stint),COUNT(pitchingpost),MAX(pitchingpost.BAOpp),MAX(pitchingpost.BB),MAX(pitchingpost.BFP),MAX(pitchingpost.BK),MAX(pitchingpost.CG),MAX(pitchingpost.ER),MAX(pitchingpost.ERA),MAX(pitchingpost.G),MAX(pitchingpost.GF),MAX(pitchingpost.GIDP),MAX(pitchingpost.GS),MAX(pitchingpost.H),MAX(pitchingpost.HBP),MAX(pitchingpost.HR),MAX(pitchingpost.IBB),MAX(pitchingpost.IPouts),MAX(pitchingpost.L),MAX(pitchingpost.R),MAX(pitchingpost.SF),MAX(pitchingpost.SH),MAX(pitchingpost.SHO),MAX(pitchingpost.SO),MAX(pitchingpost.SV),MAX(pitchingpost.W),MAX(pitchingpost.WP),MEAN(pitchingpost.BAOpp),MEAN(pitchingpost.BB),MEAN(pitchingpost.BFP),MEAN(pitchingpost.BK),MEAN(pitchingpost.CG),MEAN(pitchingpost.ER),MEAN(pitchingpost.ERA),MEAN(p
itchingpost.G),MEAN(pitchingpost.GF),MEAN(pitchingpost.GIDP),MEAN(pitchingpost.GS),MEAN(pitchingpost.H),MEAN(pitchingpost.HBP),MEAN(pitchingpost.HR),MEAN(pitchingpost.IBB),MEAN(pitchingpost.IPouts),MEAN(pitchingpost.L),MEAN(pitchingpost.R),MEAN(pitchingpost.SF),MEAN(pitchingpost.SH),MEAN(pitchingpost.SHO),MEAN(pitchingpost.SO),MEAN(pitchingpost.SV),MEAN(pitchingpost.W),MEAN(pitchingpost.WP),MIN(pitchingpost.BAOpp),MIN(pitchingpost.BB),MIN(pitchingpost.BFP),MIN(pitchingpost.BK),MIN(pitchingpost.CG),MIN(pitchingpost.ER),MIN(pitchingpost.ERA),MIN(pitchingpost.G),MIN(pitchingpost.GF),MIN(pitchingpost.GIDP),MIN(pitchingpost.GS),MIN(pitchingpost.H),MIN(pitchingpost.HBP),MIN(pitchingpost.HR),MIN(pitchingpost.IBB),MIN(pitchingpost.IPouts),MIN(pitchingpost.L),MIN(pitchingpost.R),MIN(pitchingpost.SF),MIN(pitchingpost.SH),MIN(pitchingpost.SHO),MIN(pitchingpost.SO),MIN(pitchingpost.SV),MIN(pitchingpost.W),MIN(pitchingpost.WP),NUM_UNIQUE(pitchingpost.lgID),NUM_UNIQUE(pitchingpost.round),NUM_UNIQUE(pitchingpost.teamID),SKEW(pitchingpost.BAOpp),SKEW(pitchingpost.BB),SKEW(pitchingpost.BFP),SKEW(pitchingpost.BK),SKEW(pitchingpost.CG),SKEW(pitchingpost.ER),SKEW(pitchingpost.ERA),SKEW(pitchingpost.G),SKEW(pitchingpost.GF),SKEW(pitchingpost.GIDP),SKEW(pitchingpost.GS),SKEW(pitchingpost.H),SKEW(pitchingpost.HBP),SKEW(pitchingpost.HR),SKEW(pitchingpost.IBB),SKEW(pitchingpost.IPouts),SKEW(pitchingpost.L),SKEW(pitchingpost.R),SKEW(pitchingpost.SF),SKEW(pitchingpost.SH),SKEW(pitchingpost.SHO),SKEW(pitchingpost.SO),SKEW(pitchingpost.SV),SKEW(pitchingpost.W),SKEW(pitchingpost.WP),STD(pitchingpost.BAOpp),STD(pitchingpost.BB),STD(pitchingpost.BFP),STD(pitchingpost.BK),STD(pitchingpost.CG),STD(pitchingpost.ER),STD(pitchingpost.ERA),STD(pitchingpost.G),STD(pitchingpost.GF),STD(pitchingpost.GIDP),STD(pitchingpost.GS),STD(pitchingpost.H),STD(pitchingpost.HBP),STD(pitchingpost.HR),STD(pitchingpost.IBB),STD(pitchingpost.IPouts),STD(pitchingpost.L),STD(pitchingpost.R),STD(pitchingpost.SF),STD(pitching
post.SH),STD(pitchingpost.SHO),STD(pitchingpost.SO),STD(pitchingpost.SV),STD(pitchingpost.W),STD(pitchingpost.WP),SUM(pitchingpost.BAOpp),SUM(pitchingpost.BB),SUM(pitchingpost.BFP),SUM(pitchingpost.BK),SUM(pitchingpost.CG),SUM(pitchingpost.ER),SUM(pitchingpost.ERA),SUM(pitchingpost.G),SUM(pitchingpost.GF),SUM(pitchingpost.GIDP),SUM(pitchingpost.GS),SUM(pitchingpost.H),SUM(pitchingpost.HBP),SUM(pitchingpost.HR),SUM(pitchingpost.IBB),SUM(pitchingpost.IPouts),SUM(pitchingpost.L),SUM(pitchingpost.R),SUM(pitchingpost.SF),SUM(pitchingpost.SH),SUM(pitchingpost.SHO),SUM(pitchingpost.SO),SUM(pitchingpost.SV),SUM(pitchingpost.W),SUM(pitchingpost.WP),DAY(year),MONTH(year),WEEKDAY(year),YEAR(year)
role,target,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,categorical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,
numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,
numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,
numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical,numerical
0.0,633333,campri01,ATL,NL,ATL,,,,,,,MVP,NL,NL,ATL,NL,NLCS,ATL,,,,P,NL,NLCS,ATL,NL,ATL,NL,NLCS,ATL,1985,0,,,,,,,,,,,,,,,,,,,0,0,0,0,,,,1,336,9,0,336,9,0,336,9,0,1,1,,,,,,,336,9,0,8,1,0,45,2,0,77,1,77,77,5,2,0,0,3,3,0,1,5,23,1,0.5,0,20.25,0.375,0,43.5,0.125,43.5,43.5,1.25,0.375,0,0,0.625,0.625,0,0.125,1.625,10,1,0,0,2,0,0,5,0,5,5,0,0,0,0,0,0,0,0,0,1,1,1,1,0,0,0.5815,1.951,0,-0.4333,2.8284,-0.4333,-0.4333,1.556,1.951,0,0,1.9604,1.6519,0,2.8284,0.9157,0.6749,0,0.5345,0,18.0297,0.744,0,20.5704,0.3536,20.5704,20.5704,1.8323,0.744,0,0,1.0607,1.1877,0,0.3536,1.8468,8.3837,0,4,0,162,3,0,348,1,348,348,10,3,0,0,5,5,0,1,13,80,8,1,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,1,1,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,1,0,0,0,0,1,1,3,0,0,0,0,0,0,0,1,1,3,0,0,0,0,0,0,0,1,1,3,0,0,0,1,1,1,1,,,,,,,,,,,,,,,,,,,,,0,0,0,0,1,1,3,0,0,0,8,0,63,761,2,3,72,6,77,44,21,199,4,18,13,532,13,84,0,69,22,11,8,1,0,34.375,435,1,0.625,37.125,3,43.5,18.375,7.875,105,2,8,5.875,305.5,5.375,43.5,0,44.75,6.75,6.5,3.125,1,0,2,46,0,0,8,1,5,1,0,13,0,0,0,34,1,9,0,6,0,0,0,1,1,1,0,-0.3257,-0.3004,0,1.9604,0.3173,0.5543,-0.4333,0.4135,0.6696,0.09749,0.3307,0.6191,0.2945,-0.2618,1.2158,0.1602,0,-0.6459,0.9797,-0.7085,0.693,0,0,20.3044,222.1345,0.9258,1.0607,22.548,1.6036,20.5704,16.8263,9.7018,55.616,1.5119,6.3696,4.5178,158.5145,3.8891,25.394,0,22.5182,8.7301,3.8545,3.0443,0,0,275,3480,8,5,297,24,348,147,63,840,16,64,47,2444,43,348,0,358,54,52,25,8,1,0,1,8,0,0,4,36,1,0,0,1,4,0,0,0,3,1,4,0,0,0,0,0,0,0,0,1,8,0,0,4,36,1,0,0,1,4,0,0,0,3,1,4,0,0,0,0,0,0,0,0,1,8,0,0,4,36,1,0,0,1,4,0,0,0,3,1,4,0,0,0,0,0,0,0,1,1,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,1,8,0,0,4,36,1,0,0,1,4,0,0,0,3,1,4,0,0,0,0,0,0,0,1,1,1,1985
1.0,150000,dedmoje01,ATL,NL,ATL,,,,,,,,,NL,ATL,,,,,,,,,,,NL,ATL,,,,1985,0,,,,,,,,,,,,,,,,,,,0,0,0,0,,,,0,,,,,,,,,,,,,,,,,,0,0,0,2,0,0,6,0,0,54,0,54,54,0,0,0,0,0,0,0,0,1,1,1,0,0,3,0,0,29.5,0,29.5,29.5,0,0,0,0,0,0,0,0,0.5,0.5,1,0,0,0,0,0,5,0,5,5,0,0,0,0,0,0,0,0,0,0,1,1,1,,,,,,,,,,,,,,,,,,,,,0,0,4.2426,0,0,34.6482,0,34.6482,34.6482,0,0,0,0,0,0,0,0,0.7071,0.7071,0,0,0,6,0,0,59,0,59,59,0,0,0,0,0,0,0,0,1,1,2,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,2,0,35,354,0,0,34,13,54,19,0,86,2,5,9,243,3,39,0,51,4,4,3,1,0,17.5,188.5,0,0,20,8,29.5,9.5,0,48,1,3,4.5,127.5,1.5,22.5,0,27,2,2,1.5,1,0,0,23,0,0,6,3,5,0,0,10,0,1,0,12,0,6,0,3,0,0,0,1,1,1,,,,,,,,,,,,,,,,,,,,,,,,0,24.7487,234.0523,0,0,19.799,7.0711,34.6482,13.435,0,53.7401,1.4142,2.8284,6.364,163.3417,2.1213,23.3345,0,33.9411,2.8284,2.8284,2.1213,0,0,35,377,0,0,40,16,59,19,0,96,2,6,9,255,3,45,0,54,4,4,3,2,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1985
2.0,1500000,hornebo01,ATL,NL,ATL,NLS198207130,NL,ATL,Rookie of the Year,NL,,MVP,NL,NL,ATL,NL,NLCS,ATL,3B,NL,ATL,3B,NL,NLCS,ATL,,,,,,1985,1,1,0,,1,0,,1,0,,1,1,1,,,,,,,1,0,0,1,1,1,1,4,336,42,12,258,14,3,24,1,0,2,1,-2,1.609,2.,156,19.3735,6,1032,56,12,7,25,1,499,66,5,140,16,140,140,153,4,35,6,85,98,4,9,1,75,1,16.1429,0.5714,367.2857,33.5714,1.8571,98.4286,9.7143,98.4286,98.4286,103.4286,1.7143,23,3,59.1429,68,1.7143,3.7143,0.1429,51.4286,1,8,0,113,14,0,32,3,32,32,31,0,3,2,15,19,0,2,0,17,1,1,1,0.3431,-0.3742,-1.0883,1.1061,0.8,-1.0006,-0.02558,-1.0006,-1.0006,-0.8636,0.7065,-0.7731,1.9799,-0.8853,-0.6669,0.05194,2.2203,2.6458,-0.4431,0,6.466,0.5345,136.8341,18.1462,1.7728,36.0595,4.7509,36.0595,36.0595,40.2859,1.3801,11.5326,1.4142,25.0694,29.676,1.7043,2.43,0.378,20.9353,0,113,4,2571,235,13,689,68,689,689,724,12,161,21,414,476,12,26,1,360,7,1,0,0,11,0,0,3,0,1,0,0,0,0,0,0,0,0,2,0,0,11,0,0,3,0,1,0,0,0,0,0,0,0,0,2,0,0,11,0,0,3,0,1,0,0,0,0,0,0,0,0,2,1,1,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,11,0,0,3,0,1,0,0,0,0,0,0,0,0,2,10,251,32,23,137,137,3519,414,1,117.9,13,9.3,69.1,67.9,1724.2,88.3,1,0,0,0,1,0,3,2,1,2,1,1,-0.05114,0.3443,0.3635,-0.2447,-0.2157,-0.2406,2.6782,0,91.5829,10.2198,7.15,47.6269,47.7248,1202.7394,119.4609,0,1179,130,93,691,679,17242,883,10,1,7,,0,0,3,3,72,1,,0,7,,0,0,3,3,72,1,,0,7,,0,0,3,3,72,1,,0,1,1,1,1,,,,,,,,,,,,,,,,,,,,,7,0,0,0,3,3,72,1,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1985
3.0,512500,dempsri01,BAL,AL,BAL,,,,Babe Ruth Award,AL,C,,,AL,BAL,AL,ALCS,BAL,,,,C,AL,ALCS,BAL,,,,,,1985,0,,,,,,,,,,,,,,,,,,,0,0,0,2,2,2,2,0,,,,,,,,,,,,,,,,,,0,0,0,17,26,4,441,48,3,136,12,136,136,114,3,11,2,51,41,7,6,7,58,2,8.7647,0.6471,191.7647,21.8235,0.9412,68.5294,5.5882,68.5294,68.5294,45.8235,0.7059,3.1176,0.5294,20.1176,17.5882,0.9412,1.7647,2.6471,24.1765,1.0588,0,0,6,1,0,5,0,5,5,0,0,0,0,0,0,0,0,0,0,1,1,3,0.8167,1.9434,0.08724,0.07107,0.9792,-0.068,0.1341,-0.068,-0.068,0.2333,1.2955,0.8822,0.7496,0.3107,0.2151,2.7929,0.8217,0.3938,0.3044,4.1231,9.2028,1.2217,155.3752,17.9973,1.144,50.2433,4.5008,50.2433,50.2433,38.917,1.1048,3.5335,0.6243,18.1827,16.3404,1.7843,2.1946,2.0899,21.1076,0.2425,149,11,3260,371,16,1165,95,1165,1165,779,12,53,9,342,299,16,30,45,411,18,4,4,0,21,2,0,7,1,6,0,1,2,3,2,1,0,0,3,2,0,14,1.25,0,4.75,0.5,4.25,0,0.25,0.5,2.5,1,0.25,0,0,1.5,0,0,10,1,0,3,0,2,0,0,0,1,0,0,0,0,0,1,2,1,0,0,1.597,2.,0,0.7528,0,-0.7528,0,2.,2.,-2.,0,2.,0,0,0,1.633,0,4.8305,0.5,0,1.7078,0.5774,1.7078,0,0.5,1,1,1.1547,0.5,0,0,1.291,8,0,56,5,0,19,2,17,0,1,2,10,4,1,0,0,6,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,4,5,3,1,0,7,6,165,38,4,0,3,1.75,0.25,0,4.75,4.5,120,26,1.25,0,1,1,0,0,3,3,84,10,0,0,1,1,2,1,0,0.8546,2.,0,0.7528,0,0.7636,-0.9764,1.6585,0,1.8257,0.9574,0.5,0,1.7078,1.291,33.6749,11.6905,1.893,0,12,7,1,0,19,18,480,104,5,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1985
4.0,560000,martide01,BAL,AL,BAL,,,,,,,Cy Young,AL,AL,BAL,AL,WS,BAL,P,AL,BAL,P,AL,ALCS,BAL,AL,BAL,AL,ALCS,BAL,1985,0,,,,,,,,,,,,,,,,,,,0,0,0,0,,,,2,392,3,0,266,3,0,140,3,0,2,1,,,,178.1909,0,0,532,6,0,9,0,0,0,0,0,42,0,1,42,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,31.4444,0,0.1111,31.3333,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,4,0,0,4,0,0,0,0,0,0,0,0,0,0,1,1,1,,,,,,-1.6547,,3,-1.6041,,,,,,,,,,,0,,,,,,12.0531,,0.3333,12.114,,,,,,,,,,,0,0,0,0,0,0,283,0,1,282,0,0,0,0,0,0,0,0,0,0,9,1,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,1,1,1,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,9,59,8,5,42,39,877,27,1,33.2222,3.3333,1.4444,31.3333,23.5556,529.4444,15.1111,1,4,0,0,4,2,83,3,1,1,1,1,-0.2338,0.6857,1.8192,-1.6041,-0.1645,-0.209,0.01254,0,18.0331,2.5,1.5092,12.114,13.277,257.65,8.5065,0,299,30,13,282,212,4765,136,9,2,1,0,1,0,2,1,25,2,2,0,0.5,0,0.5,0,1.5,1,15.5,1,1,0,0,0,0,0,1,1,6,0,0,0,1,1,2,1,,,,,,,,,,,0.7071,0,0.7071,0,0.7071,0,13.435,1.4142,1.4142,0,1,0,1,0,3,2,31,2,2,0,9,0,93,1206,2,18,119,5,42,19,39,279,8,30,6,877,16,129,3,142,4,16,13,1,0,57.5556,746.8889,0.5556,7.3333,79.1111,3.5556,31.3333,3.8889,23.5556,178.6667,3.3333,17.5556,2.3333,529.4444,9.1111,87.1111,1,87.5556,0.5556,10.5556,5.6667,1,0,8,106,0,1,8,2,4,0,2,23,0,1,0,83,2,8,0,18,0,1,0,1,1,1,0,-0.4575,-0.3514,1.0143,0.7717,-0.8476,0.2704,-1.6041,2.1216,-0.1645,-0.5972,0.772,-0.3131,0.6624,-0.209,0.1397,-1.0248,0.5249,-0.4028,2.6953,-0.5712,0.2488,0,0,26.9774,357.8325,0.7265,6.0828,36.4535,1.0138,12.114,6.2738,13.277,83.4116,2.7386,9.8249,2.1794,257.65,5.0607,39.9889,1.2247,40.5682,1.3333,5.5703,4.1833,0,0,518,6722,5,66,712,32,282,35,212,1608,30,158,21,4765,82,784,9,788,5,95,51,9,2,0,0,32,0,0,4,18,2,1,2,1,8,1,1,0,25,0,4,0,0,0,4,0,0,0,0,0,21,0,0,3.5,10.5,1.5,0.5,1.5,1,7,0.5,1,0,15.5,0,3.5,0,0,0,2,0,0,0,0,0,10,0,0,3,3,1,0,1,1,6,0,1,0,6,0,3,0,0,0,0,0,0,0,1,2,1,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,15.5563,0,0,0.7071,10.6066,0.7071,0.7071,0.7071,0,1.4142,0.7071,0,0,13
.435,0,0.7071,0,0,0,2.8284,0,0,0,0,0,42,0,0,7,21,3,1,3,2,14,1,2,0,31,0,7,0,0,0,4,0,0,0,1,1,1,1985
,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
4534.0,512500,desmoia01,WAS,NL,WAS,,,,,,,,,,,,,,,,,,,,,,,,,,2012,0,,,,,,,,,,,,,,,,,,,0,0,0,0,,,,0,,,,,,,,,,,,,,,,,,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,6,2012
4535.0,506000,espinda01,WAS,NL,WAS,,,,,,,,,,,,,,,,,,,,,,,,,,2012,0,,,,,,,,,,,,,,,,,,,0,0,0,0,,,,0,,,,,,,,,,,,,,,,,,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,6,2012
4536.0,3000000,gorzeto01,WAS,NL,WAS,,,,,,,,,NL,PIT,,,,P,NL,PIT,,,,,NL,PIT,,,,2012,0,,,,,,,,,,,,,,,,,,,0,0,0,0,,,,0,,,,,,,,,,,,,,,,,,0,0,0,8,0,0,63,2,0,32,2,29,29,5,1,0,0,2,3,0,0,6,31,2,0,0,24.125,1,0,18.375,0.375,17.75,18.4,2.125,0.125,0,0,0.75,1.5,0,0,2.875,11.375,1.125,0,0,1,0,0,3,0,3,3,0,0,0,0,0,0,0,0,0,1,1,1,3,0,0,0.8142,0,0,-0.01019,1.951,-0.06458,-0.4709,0.07359,2.8284,0,0,0.6154,0,0,0,0.3404,0.9901,2.8284,0,0,20.6705,0.9258,0,10.862,0.744,10.1805,11.393,1.9594,0.3536,0,0,0.8864,1.1952,0,0,2.4749,10.0134,0.3536,0,0,193,8,0,147,3,142,92,17,1,0,0,6,12,0,0,23,91,9,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,8,22,4,3,32,32,605,12,2,11.125,0.75,1.25,18.5,13.75,248.625,4.125,1.125,1,0,0,3,0,18,0,1,1,1,3,0.01566,2.2936,0.3864,-0.002143,0.3057,0.5535,1.3775,2.8284,8.2191,1.3887,1.0351,11.0065,11.1963,202.2452,3.7583,0.3536,89,6,10,148,110,1989,33,9,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,8,,70,874,1,1,87,12,32,3,32,214,11,20,5,605,10,90,1,135,1,14,5,2,,36.25,364.75,0.375,0.125,42.125,5.25,18.5,0.75,13.75,84.625,3.125,9.25,1.75,248.625,5.375,44.5,0.125,63.25,0.125,5,2.375,1.125,,3,32,0,0,5,3,3,0,0,6,0,0,0,18,1,5,0,3,0,0,0,1,1,3,,0.1519,0.5036,0.6441,2.8284,0.2703,2.1617,-0.002143,1.3554,0.3057,0.6789,1.5012,0.1874,0.5779,0.5535,-0.04201,0.1678,2.8284,0.2401,2.8284,1.4367,0.1642,2.8284,,28.9815,294.2529,0.5175,0.3536,31.2841,2.9155,11.0065,1.165,11.1963,71.6757,3.7961,7.8513,2.0529,202.2452,3.7393,32.6321,0.3536,49.5025,0.3536,4.2426,2.3867,0.3536,0,290,2918,3,1,337,42,148,6,110,677,25,74,14,1989,43,356,1,506,1,40,19,9,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,6,2012
4537.0,481000,matthry01,WAS,NL,WAS,,,,,,,,,,,,,,,,,,,,,,,,,,2012,0,,,,,,,,,,,,,,,,,,,0,0,0,0,,,,0,,,,,,,,,,,,,,,,,,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,6,2012


We train an untuned XGBoostRegressor on top of featuretools' features, just like we have done for getML's features.

Since some of featuretools' features are categorical, we allow the pipeline to include these features as well. Other features contain NaN values, which is why we also apply getML's Imputation preprocessor.

In [59]:
imputation = getml.preprocessors.Imputation()

predictor = getml.predictors.XGBoostRegressor(n_jobs=1)

pipe3 = getml.pipeline.Pipeline(
    tags=['featuretools'],
    preprocessors=[imputation],
    predictors=[predictor],
    include_categorical=True,
)

pipe3
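To make the role of the imputation step more concrete, here is a minimal sketch of mean imputation in plain Python. It is an illustration of the general technique only, not getML's implementation: the function name `impute_mean`, the `add_dummy` flag and the toy column are hypothetical, and the actual preprocessor may behave differently in its details.

```python
import math


def impute_mean(values, add_dummy=True):
    """Replace NaN entries with the mean of the observed entries.

    Hypothetical helper for illustration; not part of getML.
    Optionally returns a 0/1 indicator marking which entries were missing.
    """
    observed = [v for v in values if not math.isnan(v)]
    # Fall back to 0.0 if the whole column is missing.
    mean = sum(observed) / len(observed) if observed else 0.0
    imputed = [mean if math.isnan(v) else v for v in values]
    dummy = [1.0 if math.isnan(v) else 0.0 for v in values]
    return (imputed, dummy) if add_dummy else (imputed, None)


# Toy column with two missing values (made-up numbers).
column = [8.0, math.nan, 32.0, math.nan]
imputed, was_missing = impute_mean(column)
print(imputed)
print(was_missing)
```

The indicator column lets a downstream predictor distinguish values that were actually observed from values that were filled in.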

In [60]:
pipe3.fit(featuretools_train)

Checking data model...


Staging...

Preprocessing...

Checking...


INFO [MIGHT TAKE LONG]: The number of unique entries in column 'playerid' in POPULATION__STAGING_TABLE_1 is 4272. This might take a long time to fit. You should consider setting its role to unused_string or using it for comparison only (you can do the latter by setting a unit that contains 'comparison only').


Staging...

Preprocessing...

XGBoost: Training as predictor...


Trained pipeline.
Time taken: 0h:1m:7.737328



In [61]:
pipe3.score(featuretools_test)



Staging...

Preprocessing...




date time           | set used           | target | mae         | rmse         | rsquared
------------------- | ------------------ | ------ | ----------- | ------------ | --------
2021-08-23 17:42:04 | featuretools_train | salary | 703972.7245 | 1283939.7829 | 0.8144
2021-08-23 17:42:07 | featuretools_test  | salary | 769939.416  | 1431516.0502 | 0.7797
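For readers unfamiliar with the three measures reported by `score`, the following sketch shows how they are commonly computed. It assumes that R-squared is the squared Pearson correlation between predictions and targets; the function name `regression_metrics` and the toy salaries are hypothetical and are not taken from the dataset.

```python
import math


def regression_metrics(y_true, y_pred):
    """Compute MAE, RMSE and R-squared for two equally long sequences.

    Assumption: R-squared is the squared Pearson correlation coefficient.
    """
    n = len(y_true)
    errors = [p - t for t, p in zip(y_true, y_pred)]
    mae = sum(abs(e) for e in errors) / n
    rmse = math.sqrt(sum(e * e for e in errors) / n)

    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    rsquared = (cov * cov) / (var_t * var_p)

    return {"mae": mae, "rmse": rmse, "rsquared": rsquared}


# Made-up salaries and predictions, for illustration only.
toy_true = [500_000.0, 1_000_000.0, 3_000_000.0]
toy_pred = [600_000.0, 900_000.0, 2_500_000.0]
metrics = regression_metrics(toy_true, toy_pred)
print(metrics)
```

MAE and RMSE are expressed in the unit of the target (here, salary), and RMSE penalizes large errors more heavily than MAE.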


### 2.6 Productionization

It is possible to productionize the pipeline by transpiling the features into production-ready SQL code. Please also refer to getML's `sqlite3` module.

In [62]:
# Creates a folder named baseball_pipeline containing
# the SQL code.
pipe2.features.to_sql().save("baseball_pipeline")

### 2.7 Discussion

For a more convenient overview, we summarize our results into a table.

Name                 | R-squared  | RMSE      | MAE
-------------------- | ---------- | ----------| ----
getML: FastProp      |     78.80% | 1,402,960 | 765,292
getML: Relboost      |     83.95% | 1,220,382 | 666,548
featuretools         |     77.97% | 1,431,516 | 769,939

As we can see, Relboost outperforms FastProp, and both getML algorithms outperform featuretools on all three measures: they achieve a higher R-squared and a lower RMSE and MAE.
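To put the differences into perspective, the gaps can be expressed relative to featuretools. The snippet below is a small sketch that uses only the numbers from the table above; the helper `relative_reduction` is a hypothetical name introduced for illustration.

```python
# Results copied from the summary table above.
results = {
    "getML: FastProp": {"rsquared": 0.7880, "rmse": 1_402_960, "mae": 765_292},
    "getML: Relboost": {"rsquared": 0.8395, "rmse": 1_220_382, "mae": 666_548},
    "featuretools": {"rsquared": 0.7797, "rmse": 1_431_516, "mae": 769_939},
}


def relative_reduction(value, reference):
    """Fraction by which `value` is lower than `reference`."""
    return (reference - value) / reference


baseline = results["featuretools"]
for name, scores in results.items():
    if name == "featuretools":
        continue
    rmse_gain = relative_reduction(scores["rmse"], baseline["rmse"])
    mae_gain = relative_reduction(scores["mae"], baseline["mae"])
    print(f"{name}: RMSE lower by {rmse_gain:.1%}, MAE lower by {mae_gain:.1%}")
```

Read this way, Relboost's advantage over featuretools is substantial on both error measures, while FastProp's advantage is comparatively small.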

## 3. Conclusion

We have demonstrated how statistical relational learning can be used for sabermetrics. We have also shown that getML outperforms featuretools on this dataset.

## References

Motl, Jan, and Oliver Schulte. "The CTU Prague relational learning repository." arXiv preprint arXiv:1511.03086 (2015).

# Next Steps

This tutorial benchmarked getML's feature learning algorithms against featuretools, an open-source implementation of propositionalization, on a relational dataset of baseball players' salaries.

If you are interested in further real-world applications of getML, head back to the [notebook overview](welcome.md) and choose one of the remaining examples.

Here is some additional material from our [documentation](https://docs.getml.com/latest/) if you want to learn more about getML:
* [Feature learning with Multirel](https://docs.getml.com/latest/user_guide/feature_engineering/feature_engineering.html#multirel)
* [Feature learning with Relboost](https://docs.getml.com/latest/user_guide/feature_engineering/feature_engineering.html#relboost)

# Get in contact

If you have any questions, schedule a [call with Alex](https://go.getml.com/meetings/alexander-uhlig/getml-demo), the co-founder of getML, or write us an [email](mailto:team@getml.com). Prefer a private demo of getML? Just contact us to make an appointment.