## Session 1: Getting Started with Python and Baseball Data

**Objective:** Get comfortable with Python basics and loading baseball data.

### 1. Concepts Covered
- Python basics: `print`, variables, and pandas DataFrames
- How to install and use `pybaseball`
- Fetching and displaying real MLB data
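
The basics listed above can be tried before downloading anything. This is a minimal sketch, assuming pandas is installed; the player names and numbers in `toy_batting` are made up for illustration and are not real MLB data.

```python
import pandas as pd

# A variable stores a value under a name.
season = 2023
print("Season:", season)

# A DataFrame is a table: each key becomes a column, each list entry a row.
# These players and numbers are invented for illustration only.
toy_batting = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C"],
    "HR": [35, 12, 28],
    "AVG": [0.301, 0.254, 0.276],
})

print(toy_batting.head())   # first rows of the table
print(toy_batting.shape)    # (number of rows, number of columns)
```

The same `.head()` and `.shape` calls work on the much larger tables that `pybaseball` returns.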

### 2. Python Code Walkthrough

Install and import packages:

In [None]:
# Install this once
# !pip install pybaseball

In [None]:
from pybaseball import batting_stats, playerid_lookup, statcast_pitcher
import pandas as pd
pd.set_option('display.max_columns', None)

## Exploring the Batting Stats

In [None]:
# Load the 2023 batting stats and show the first 5 rows
batting = batting_stats(2023)
display(batting.head())

In [None]:
## Find Freddie Freeman in the player ID lookup table
playerid_lookup('Freeman', 'Freddie')

In [None]:
## How to use the lookup table

# Find Clayton Kershaw's player ID (display() is needed because this is not the last line of the cell)
display(playerid_lookup('kershaw', 'clayton'))
# His MLBAM ID is 477132.

# Get Kershaw's stats for a specific date using his ID
kershaw_stats = statcast_pitcher('2017-06-02', '2017-06-02', 477132)

# Get Kershaw's stats for about a year using his ID
# kershaw_stats = statcast_pitcher('2017-06-02', '2018-06-02', 477132)
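
Rather than copying the ID by hand, it can be read from the lookup result. This is a sketch that assumes the table returned by `playerid_lookup` has a `key_mlbam` column. The `lookup` DataFrame here is a hand-built stand-in with only three columns so the example runs offline.

```python
import pandas as pd

# Hand-built stand-in for the table that playerid_lookup returns.
# Only three columns are reproduced; the real table has more.
lookup = pd.DataFrame({
    "name_last": ["kershaw"],
    "name_first": ["clayton"],
    "key_mlbam": [477132],
})

# Take the MLBAM ID from the first (and here only) matching row.
kershaw_id = int(lookup["key_mlbam"].iloc[0])
print(kershaw_id)
```

With the real lookup result, `kershaw_id` could be passed to `statcast_pitcher` in place of the hard-coded `477132`.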

In [None]:
## Examine only strikeouts
kershaw_strike_out_df = kershaw_stats[kershaw_stats.events == 'strikeout']
display(kershaw_strike_out_df[:5])
## How did his strikeouts most often end in that game (share of each pitch description)?
print(kershaw_strike_out_df.description.value_counts(normalize=True))
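
The filter-then-count pattern in the cell above can be checked on a small table. This is a sketch on invented rows that mimic the `events` and `description` columns; it is not real pitch data.

```python
import pandas as pd

# Invented rows that mimic two Statcast columns; not real pitch data.
pitches = pd.DataFrame({
    "events": ["strikeout", "single", "strikeout", "strikeout", "field_out"],
    "description": ["swinging_strike", "hit_into_play", "called_strike",
                    "swinging_strike", "hit_into_play"],
})

# Boolean mask: keep only rows where the plate appearance ended in a strikeout.
strikeouts = pitches[pitches["events"] == "strikeout"]

# normalize=True returns shares that sum to 1 instead of raw counts.
shares = strikeouts["description"].value_counts(normalize=True)
print(shares)
```

Dropping `normalize=True` gives the raw counts, which can be easier to read for a single game.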

### 3. Activity & Exploration Questions
1. Find the batting stats for your favorite player and explore a year of their batting data.

**Hint:** Use `playerid_lookup()` to get their FanGraphs ID (`key_fangraphs`), then filter the DataFrame on `IDfg`.

2. How many players hit more than 30 home runs in that same year?

**Hint:** Use `batting[batting['HR'] > 30]` and check `.shape`.

3. Who had the highest batting average that year?

**Hint:** Use `.sort_values()` on the `AVG` column.

In [None]:
# Answer 1
display(playerid_lookup('Aaron', 'Hank', fuzzy=True))
batting = batting_stats(1955)
# 1000001 is the key_fangraphs value shown in the lookup result above
display(batting[batting.IDfg == 1000001])

In [None]:
# Answer 2
batting[batting['HR'] > 30].shape[0]

In [None]:
# Answer 3
# Sort by AVG (highest first), keep three columns, renumber rows, show the top 10
(
    batting.sort_values(by='AVG', ascending=False)[['Name', 'Team', 'AVG']]
    .reset_index(drop=True)
    .head(10)
)

### 4. Extension

Explore the top 10 players by a stat of your choice (HR, AVG, RBI) in a year of your choice:

In [None]:
batting.sort_values('HR', ascending=False).head(10)
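
To try several stats without retyping the sort, the line above can be wrapped in a small function. This is a sketch: `top_n` is a hypothetical helper, not part of `pybaseball`, and the toy table is invented so the example runs offline.

```python
import pandas as pd

def top_n(df, stat, n=10):
    """Return the n rows with the largest values of `stat` (hypothetical helper)."""
    return df.sort_values(stat, ascending=False).head(n)[["Name", stat]]

# Invented players and numbers, for illustration only.
toy = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C"],
    "HR": [35, 12, 28],
    "RBI": [90, 101, 77],
})

leaders = top_n(toy, "HR", n=2)
print(leaders)
```

With the real table, `top_n(batting, 'RBI')` would list the ten RBI leaders for whichever year was loaded into `batting`.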

### 5. Helpful Links
- [W3Schools Python Basics](https://www.w3schools.com/python/)