In [1]:
import json

import data_dictionary as dd

# initialize an empty dictionary to hold the docs for each function
functions_dict = {}

In [2]:
function = 'statcast'
docs = '''# Statcast
`statcast(start_dt=[yesterday's date], end_dt=None, team=None, verbose=True, parallel=True)`

The `statcast` function retrieves pitch-level statcast data for a given date or range of dates.

## Returned data
This function returns a pandas `DataFrame` with one entry for each pitch in the
query. The data returned for each pitch is explained on
[Baseball Savant](https://baseballsavant.mlb.com/csv-docs).

## Arguments
`start_dt:` first day for which you want to retrieve data. Defaults to yesterday's date if nothing is entered. If you only want data for one date, supply a `start_dt` value but not an `end_dt` value. Format: YYYY-MM-DD. 

`end_dt:` last day for which you want to retrieve data. Defaults to None. If you want to retrieve data for more than one day, both a `start_dt` and `end_dt` value must be given. Format: YYYY-MM-DD. 

`team:` optional. If you only want statcast data for one team, supply that team's abbreviation here (e.g. BOS, SEA, NYY).

`verbose:` Boolean, default=True. If set to True, this will provide updates on query progress; if set to False, it will not.

`parallel:` Boolean, default=True. Whether to parallelize HTTP requests in large queries.

### A note on data availability 
The earliest available statcast data comes from the 2008 season when the system was first introduced to Major League Baseball. Queries before this year will not work. Further, some features were introduced after the 2008 season. Launch speed angle, for example, is only available from the 2015 season forward. 

### A note on query time
Baseball Savant limits queries to 30,000 rows each. For this reason, if your request covers a period longer than 5 days, it will be broken into two or more smaller requests. The data will still be returned to you in a single dataframe, but the query will take slightly longer.

### A note on parallelization
Large queries complete substantially faster when their requests are made in parallel. Setting `parallel=False` accommodates compute environments where multiprocessing is disabled (e.g. some AWS Lambda environments).

## Examples of valid queries

```python
from pybaseball import statcast

# get all statcast data for July 4th, 2017
data = statcast('2017-07-04')

# get data for the first seven days of August in 2016
data = statcast('2016-08-01', '2016-08-07')

# get all data for the Texas Rangers in the 2016 season
data = statcast('2016-04-01', '2016-10-30', team='TEX')

# get data for yesterday
data = statcast()
```
'''

data_dictionary = dd.statcast

functions_dict[function] = {'docs':docs, 'data_dictionary':data_dictionary}
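
The `statcast` docs above note that a request spanning more than 5 days is broken into smaller requests. The following is a minimal sketch of how such a split could work. The helper name `split_date_range` and its `max_days` parameter are hypothetical, and this is not pybaseball's actual implementation.

```python
from datetime import date, timedelta


def split_date_range(start, end, max_days=5):
    """Split [start, end] (inclusive) into consecutive chunks of at most max_days days.

    Hypothetical helper for illustration only; not part of pybaseball.
    """
    chunks = []
    chunk_start = start
    while chunk_start <= end:
        # Each chunk covers max_days calendar days, clipped to the overall end date.
        chunk_end = min(chunk_start + timedelta(days=max_days - 1), end)
        chunks.append((chunk_start, chunk_end))
        chunk_start = chunk_end + timedelta(days=1)
    return chunks


# A 12-day window is split into three smaller windows.
chunks = split_date_range(date(2016, 8, 1), date(2016, 8, 12))
for chunk_start, chunk_end in chunks:
    print(chunk_start.isoformat(), "to", chunk_end.isoformat())
```

Each chunk would then be requested separately and the results concatenated, which is why longer date ranges take somewhat more time.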


In [3]:
function = 'statcast_pitcher'
docs = '''# Statcast Pitcher
`statcast_pitcher(start_dt=[yesterday's date], end_dt=None, player_id)`

The `statcast_pitcher` function retrieves pitch-level statcast data for a given pitcher over a given date or range of dates.

## Arguments
`start_dt:` first day for which you want to retrieve data. Defaults to yesterday's date if nothing is entered. If you only want data for one date, supply a `start_dt` value but not an `end_dt` value. Format: YYYY-MM-DD. 

`end_dt:` last day for which you want to retrieve data. Defaults to None. If you want to retrieve data for more than one day, both a `start_dt` and `end_dt` value must be given. Format: YYYY-MM-DD. 

`player_id:` MLBAM player ID for the pitcher you want to retrieve data for. To find a player's MLBAM ID, see the function [playerid_lookup](http://github.com/jldbc/pybaseball/docs/playerid_lookup.md) or the examples below. 

### A note on data availability 
The earliest available statcast data comes from the 2008 season when the system was first introduced to Major League Baseball. Queries before this year will not work. Further, some features were introduced after the 2008 season. Launch speed angle, for example, is only available from the 2015 season forward. 

### Known issue
In rare cases where a player has seen more than 30,000 pitches over the time period specified in your query, only the first 30,000 of these pitches will be returned. There is a fix in the works for this.

## Examples of valid queries

```python
from pybaseball import statcast_pitcher
from pybaseball import playerid_lookup

# find Chris Sale's player id (mlbam_key)
playerid_lookup('sale','chris')

# get all available data
data = statcast_pitcher('2008-04-01', '2017-07-15', player_id = 519242)

# get data for July 15th, 2017
data = statcast_pitcher('2017-07-15','2017-07-15', player_id = 519242)
```'''

data_dictionary = dd.statcast

functions_dict[function] = {'docs':docs, 'data_dictionary':data_dictionary}

In [4]:
function = 'statcast_batter'
docs = '''# Statcast Batter
`statcast_batter(start_dt=[yesterday's date], end_dt=None, player_id)`

The `statcast_batter` function retrieves pitch-level statcast data for a given batter over a given date or range of dates.

## Arguments
`start_dt:` first day for which you want to retrieve data. Defaults to yesterday's date if nothing is entered. If you only want data for one date, supply a `start_dt` value but not an `end_dt` value. Format: YYYY-MM-DD. 

`end_dt:` last day for which you want to retrieve data. Defaults to None. If you want to retrieve data for more than one day, both a `start_dt` and `end_dt` value must be given. Format: YYYY-MM-DD. 

`player_id:` MLBAM player ID for the player you want to retrieve data for. To find a player's MLBAM ID, see the function [playerid_lookup](http://github.com/jldbc/pybaseball/docs/playerid_lookup.md) or the examples below. 

### A note on data availability 
The earliest available statcast data comes from the 2008 season when the system was first introduced to Major League Baseball. Queries before this year will not work. Further, some features were introduced after the 2008 season. Launch speed angle, for example, is only available from the 2015 season forward. 

## Examples of valid queries

```python
from pybaseball import statcast_batter
from pybaseball import playerid_lookup

# find David Ortiz's player id (mlbam_key)
playerid_lookup('ortiz','david')

# get all available data
data = statcast_batter('2008-04-01', '2017-07-15', player_id = 120074)

# get data for August 16th, 2014
data = statcast_batter('2014-08-16', player_id = 120074)
```
'''

data_dictionary = dd.statcast

functions_dict[function] = {'docs':docs, 'data_dictionary':data_dictionary}

In [5]:
function = 'playerid_lookup'
docs = '''# Player ID Lookup

## Single Player Lookup

`playerid_lookup(last, first=None, fuzzy=False)`

Look up a player's MLBAM, Retrosheet, FanGraphs, and Baseball Reference ID by name.

## Arguments
`last:` String. The player's last name. Case insensitive.

`first:` String. Optional. The player's first name. Case insensitive.

`fuzzy:` Boolean. Optional. Search for inexact name matches; the 5 closest will be returned.

Providing last name only will return all available id data for players with that last name (this will return several rows for a common last name like Jones, for example.) If multiple players exist for a (last name, first name) pair, you can figure out who's who by seeing their first and last years of play in the fields `mlb_played_first` and `mlb_played_last`.

This data comes from Chadwick Bureau, meaning that there are several people in this data who are not MLB players. For this reason, supplying both last and first name is recommended to narrow your search. 

## Examples of valid queries

```python
from pybaseball import playerid_lookup

# find the ids of all players with last name Jones (returns 1,314 rows)
data = playerid_lookup('jones')

# only return the ids of Chipper Jones (returns one row)
data = playerid_lookup('jones','chipper')

# Will return all players named Pedro Martinez (returns *2* rows)
data = playerid_lookup("martinez", "pedro", fuzzy=True)

# Will return the 5 closest names to "yadi molina" (returns 5 rows)
# First row will be Yadier Molina
data = playerid_lookup("molina", "yadi", fuzzy=True)
```

## List Lookup

`player_search_list(player_list)`

Look up a list of player IDs by name and return a data frame of all matching players.

`player_list:` List. A list of tuples, of the form `(last, first)`. Case insensitive.

Sources are the same as those used in the above `playerid_lookup` function. Queries for this function must be exact name matches.

## Examples of valid queries

```python

from pybaseball import player_search_list

# Will return the ids for both Lou Brock and Chipper Jones (returns 2 rows)
data = player_search_list([("brock","lou"), ("jones","chipper")])

```'''

data_dictionary = dd.player_id_lookup

functions_dict[function] = {'docs':docs, 'data_dictionary':data_dictionary}

In [6]:
function = 'schedule_and_record'
docs = '''# Schedule and Record

`schedule_and_record(season, team)`

The schedule_and_record function returns a dataframe of a team's game-level results for a given season, including win/loss/tie result, score, attendance, and winning/losing/saving pitcher. If the season is incomplete, it will provide scheduling information for future games. 

## Arguments
`season:` Integer. The season for which you want a team's record data. 

`team:` String. The abbreviation of the team for which you are requesting data (e.g. "PHI", "BOS", "LAD"). 

Note that if a team did not exist during the year you are requesting data for, the query will be unsuccessful. Historical name and city changes for teams in older seasons can cause problems as well. The Los Angeles Dodgers ("LAD"), for example, are abbreviated "BRO" in older seasons, due to their origins as the Brooklyn Dodgers. Finding the right abbreviation may require some detective work.

## Examples of valid queries

```python
from pybaseball import schedule_and_record

# Game-by-game results from the Yankees' 1927 season
data = schedule_and_record(1927, "NYY")

# Results and upcoming schedule for the Phillies' current season (2017 at the time of writing)
data = schedule_and_record(2017, "PHI")
```
'''

data_dictionary = dd.schedule_and_record

functions_dict[function] = {'docs':docs, 'data_dictionary':data_dictionary}

In [7]:
function = 'standings'
docs = '''# Standings

`standings(season)`

The `standings(season)` function gives division standings for a given season. If the current season is chosen, 
it will give the most current set of standings. Otherwise, it will give the end-of-season standings for each 
division for the chosen season. This function returns a list of dataframes. Each dataframe is the standings for one of MLB's six divisions.

## Arguments
`season:` Integer. Defaults to the current calendar year if no value is provided. 

## Examples of valid queries

```python
from pybaseball import standings

# get the current season's up-to-date standings
data = standings()

# get the end-of-season division standings for the 1980 season
data = standings(1980)
```
'''

data_dictionary = dd.standings

functions_dict[function] = {'docs':docs, 'data_dictionary':data_dictionary}

In [8]:
function = 'playerid_reverse_lookup'
docs = '''# Player ID Reverse Lookup

`playerid_reverse_lookup(player_ids, key_type='mlbam')`

Find the names and ids of one or several players given a list of MLBAM, FanGraphs, Baseball Reference, or Retrosheet ids. 

## Arguments
`player_ids:` List. A list of player ids.

`key_type:` String. The type of id you're passing in the `player_ids` field. Valid inputs are 'mlbam', 'retro', 'bbref', and 'fangraphs'. Defaults to 'mlbam' if no value is passed. 
 
This function is useful for connecting data sets from various sources or for finding player names when only an id is provided. Data for this function comes from the Chadwick Bureau. 

## Examples of valid queries

```python
from pybaseball import playerid_reverse_lookup

# a list of mlbam ids
player_ids = [116539, 116541, 641728, 116540]

# find the names of the players in player_ids, along with their ids from other data sources
data = playerid_reverse_lookup(player_ids, key_type='mlbam')

# a list of fangraphs ids
fg_ids = [826, 5417, 210, 1101]

# find their names and ids from other data sources
data = playerid_reverse_lookup(fg_ids, key_type='fangraphs')
```
'''

data_dictionary = dd.playerid_reverse_lookup

functions_dict[function] = {'docs':docs, 'data_dictionary':data_dictionary}
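
Because each `docs` value is Markdown stored inside a Python string, an unclosed or misplaced code fence is easy to miss. The following is a minimal sketch of a sanity check that counts fence lines. The helper name `has_balanced_fences` is hypothetical, and the two sample strings are made up for illustration.

```python
FENCE = "`" * 3  # three backticks, built this way to keep the example easy to embed


def has_balanced_fences(markdown_text):
    """Return True if the lines that start with a code fence come in open/close pairs."""
    fence_lines = [
        line for line in markdown_text.splitlines()
        if line.strip().startswith(FENCE)
    ]
    return len(fence_lines) % 2 == 0


# Made-up sample documents: one well formed, one with the closing fence
# fused onto the last line of code, so it never starts a line of its own.
good = "# Title\n" + FENCE + "python\nx = 1\n" + FENCE + "\n"
bad = "# Title\n" + FENCE + "python\nx = 1" + FENCE + "\n"

print(has_balanced_fences(good))
print(has_balanced_fences(bad))
```

In this notebook, a check like `[name for name, entry in functions_dict.items() if not has_balanced_fences(entry['docs'])]` would list the entries whose docs need a second look.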

In [9]:
functions_dict[function]

{'docs': "# Player ID Reverse Lookup\n\n`playerid_reverse_lookup(player_ids, key_type='mlbam')`\n\nFind the names and ids of one or several players given a list of MLBAM, FanGraphs, Baseball Reference, or Retrosheet ids. \n\n## Arguments\n`player_ids:` List. A list of player ids.\n\n`key_type:` String. The type of id you're passing in the `player_ids` field. Valid inputs are 'mlbam', 'retro', 'bbref', and 'fangraphs'. Defaults to 'mlbam' if no value is passed. \n \nThis function is useful for connecting data sets from various sources or for finding player names when only an id is provided. Data for this function comes from the Chadwick Bureau. \n\n## Examples of valid queries\n\n```python\nfrom pybaseball import playerid_reverse_lookup\n\n# a list of mlbam ids\nplayer_ids = [116539, 116541, 641728, 116540]\n\n# find the names of the players in player_ids, along with their ids from other data sources\ndata = playerid_reverse_lookup(player_ids, key_type='mlbam')\n\n# a list of fangraphs 

In [10]:
# write to disk
file_path = 'functions.json'
with open(file_path, 'w') as file:
    json.dump(functions_dict, file)
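
If the JSON file is meant to be read or diffed by people, `json.dump` accepts formatting options. The sketch below writes a small placeholder dictionary to a temporary directory. The sample content and the file name `functions_pretty.json` are assumptions, as is the shape of the `data_dictionary` value, since the contents of the `data_dictionary` module are not shown here.

```python
import json
import os
import tempfile

# Placeholder data standing in for functions_dict; the real entries differ.
sample_dict = {
    "example_function": {
        "docs": "# Example\nPlaceholder docs text.",
        "data_dictionary": {"column_a": "placeholder description"},
    }
}

out_path = os.path.join(tempfile.mkdtemp(), "functions_pretty.json")
with open(out_path, "w", encoding="utf-8") as file:
    # indent gives one key per line, sort_keys keeps the ordering stable between
    # runs, and ensure_ascii=False writes non-ASCII characters as-is.
    json.dump(sample_dict, file, indent=2, sort_keys=True, ensure_ascii=False)

with open(out_path, "r", encoding="utf-8") as file:
    text = file.read()
print(text)
```

The trade-off is a somewhat larger file in exchange for readable, stable output.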

In [11]:
# read from disk
with open(file_path, 'r') as file:
    loaded_dict = json.load(file)

In [12]:
loaded_dict[function]

{'docs': "# Player ID Reverse Lookup\n\n`playerid_reverse_lookup(player_ids, key_type='mlbam')`\n\nFind the names and ids of one or several players given a list of MLBAM, FanGraphs, Baseball Reference, or Retrosheet ids. \n\n## Arguments\n`player_ids:` List. A list of player ids.\n\n`key_type:` String. The type of id you're passing in the `player_ids` field. Valid inputs are 'mlbam', 'retro', 'bbref', and 'fangraphs'. Defaults to 'mlbam' if no value is passed. \n \nThis function is useful for connecting data sets from various sources or for finding player names when only an id is provided. Data for this function comes from the Chadwick Bureau. \n\n## Examples of valid queries\n\n```python\nfrom pybaseball import playerid_reverse_lookup\n\n# a list of mlbam ids\nplayer_ids = [116539, 116541, 641728, 116540]\n\n# find the names of the players in player_ids, along with their ids from other data sources\ndata = playerid_reverse_lookup(player_ids, key_type='mlbam')\n\n# a list of fangraphs 