This package bundles the Lahman Baseball Database so that it can be consumed from Python code via the Pandas library. The package was inspired by the book Analyzing Baseball Data with R. Since R is not Python, this package lets you carry out the analyses covered in the book (as well as your own) using Python.
Install using pip:
pip install tq-lahman-datasets
Download and load the Pandas DataFrames into memory:
from teqniqly.lahman_datasets import LahmanDatasets
ld = LahmanDatasets()
ld.load()
Get the dataframe names. Each dataframe corresponds to a CSV file in the Lahman database:
df_names = ld.dataframe_names
Get a dataset by using its name as an index on the LahmanDatasets instance:
batting_df = ld["Batting"]
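As a rough sketch of combining the name list with the indexer, the helper below loops over a set of names and records each DataFrame's shape. The function name `summarize_shapes` is hypothetical and not part of the package. It assumes only that the container supports `container[name]` indexing and that the names are an iterable of strings. A plain dict of small made-up DataFrames stands in for a loaded LahmanDatasets instance so that the example runs without a download.

```python
import pandas as pd


def summarize_shapes(container, names):
    """Return a DataFrame with one row per dataset: name, rows, columns.

    Hypothetical helper: `container` is anything indexable by dataset name.
    """
    records = []
    for name in names:
        df = container[name]  # same indexer style as ld["Batting"]
        rows, cols = df.shape
        records.append({"name": name, "rows": rows, "columns": cols})
    return pd.DataFrame(records, columns=["name", "rows", "columns"])


# Stand-in data (not real Lahman records), used only to make the sketch runnable.
stand_in = {
    "Batting": pd.DataFrame({"playerID": ["a", "b"], "HR": [1, 2]}),
    "Teams": pd.DataFrame({"teamID": ["X"], "W": [90], "L": [72]}),
}
summary = summarize_shapes(stand_in, list(stand_in))
```

With the real package, the equivalent call would presumably be `summarize_shapes(ld, ld.dataframe_names)` after `ld.load()`, assuming `dataframe_names` yields plain strings.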
The datasets are Pandas DataFrames, so work with them as you would with any other DataFrame.
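For instance, ordinary Pandas grouping and arithmetic apply directly. The sketch below totals at-bats, hits, and home runs per player and derives a batting average. It uses a small made-up DataFrame in place of `ld["Batting"]`; the column names (`playerID`, `yearID`, `AB`, `H`, `HR`) follow the Lahman Batting table, but the rows are invented for illustration.

```python
import pandas as pd

# Made-up rows using column names from the Lahman Batting table.
batting_df = pd.DataFrame({
    "playerID": ["p1", "p1", "p2", "p2"],
    "yearID": [2000, 2001, 2000, 2001],
    "AB": [500, 400, 300, 0],
    "H": [150, 100, 90, 0],
    "HR": [30, 20, 5, 0],
})

# Sum the counting stats across all seasons for each player.
career = batting_df.groupby("playerID")[["AB", "H", "HR"]].sum()

# Batting average = hits / at-bats; players with zero at-bats get NaN
# instead of a division-by-zero result.
career["AVG"] = (career["H"] / career["AB"].where(career["AB"] > 0)).round(3)

# Rank players by career home runs, highest first.
top_hr = career.sort_values("HR", ascending=False)
```

Swapping the stand-in for the real `ld["Batting"]` DataFrame should need no other changes, provided the loaded table keeps these column names.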