# Table of Contents

## Part I: Foundations

### Chapter 1: Introduction to Baseball Analytics and Data Science
- The evolution of baseball analytics: from box scores to big data  
- The Moneyball era and beyond: how data science changed MLB front offices  
- Core baseball metrics and terminology (AVG, OBP, SLG, WAR, wOBA, FIP, etc.)  
- The role of data science in scouting, in-game strategy, and player development  
- Tools of the trade: Python, pandas, scikit-learn, NumPy, matplotlib, and Jupyter notebooks
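
As a preview of the core metrics listed above, here is a minimal sketch of AVG, OBP, and SLG using their standard definitions. The function names and the stat line in the usage note are hypothetical, chosen only for illustration.

```python
def batting_average(hits, at_bats):
    """AVG = H / AB."""
    return hits / at_bats if at_bats else 0.0


def on_base_percentage(hits, walks, hbp, at_bats, sac_flies):
    """OBP = (H + BB + HBP) / (AB + BB + HBP + SF)."""
    denom = at_bats + walks + hbp + sac_flies
    return (hits + walks + hbp) / denom if denom else 0.0


def slugging_percentage(singles, doubles, triples, home_runs, at_bats):
    """SLG = total bases / AB."""
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs
    return total_bases / at_bats if at_bats else 0.0
```

For a made-up season of 500 at-bats with 150 hits (100 singles, 30 doubles, 5 triples, 15 home runs), 50 walks, 5 hit-by-pitches, and 5 sacrifice flies, `batting_average(150, 500)` gives .300 and `slugging_percentage(100, 30, 5, 15, 500)` gives .470.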

### Chapter 2: Setting Up Your Environment
- Installing Python and essential packages with the Anaconda distribution  
- Working with Jupyter notebooks for data exploration and presentations  
- Version control best practices (Git/GitHub) for data science projects  
- Choosing an integrated development environment (IDE) to use alongside notebooks

## Part II: Working with Baseball Data

### Chapter 3: Sourcing and Understanding Baseball Data
- Publicly available baseball datasets: Retrosheet, FanGraphs, Baseball-Reference, and Statcast  
- Differences in data structure, granularity, and reliability  
- Gathering data with APIs and web scraping  
- SQL and NoSQL databases for storing large volumes of baseball data

### Chapter 4: Data Cleaning and Preprocessing
- Understanding messy data: missing values, inconsistent formats, and outliers  
- Data cleaning techniques using pandas  
- Normalizing and scaling data for machine learning models  
- Combining multiple data sources: merging player stats, game logs, and pitch-level data

## Part III: Exploratory Data Analysis and Visualization

### Chapter 5: Exploratory Data Analysis (EDA)
- Using descriptive statistics to summarize player and team performance  
- Grouping and aggregations in pandas to find trends across seasons, teams, and leagues  
- Time-series analysis of player performance over multiple seasons

### Chapter 6: Data Visualization for Baseball
- Visualizing pitching performance with histograms, box plots, and scatter plots  
- Creating spray charts to understand a hitter’s batted-ball profile  
- Seaborn and matplotlib for polished, publication-quality visuals  
- Interactive visualizations using Plotly or Bokeh to create dashboards

## Part IV: Statistical and Machine Learning Methods for Baseball

### Chapter 7: Fundamentals of Statistical Analysis
- Introduction to hypothesis testing: t-tests, chi-square tests, and ANOVA for comparing groups of players  
- Correlation and regression analysis: modeling relationships between offensive stats (e.g., OBP vs. runs scored)  
- Understanding advanced metrics: calculating wOBA, wRC+, and WAR

### Chapter 8: Regression Models for Predicting Player Performance
- Linear regression for predicting a player's batting average or ERA  
- Ridge, Lasso, and Elastic Net regularization to reduce overfitting and, with Lasso and Elastic Net, select features  
- Interpreting regression coefficients to understand what drives performance

### Chapter 9: Classification Models in Baseball
- Logistic regression, decision trees, and random forests for classifying outcomes (e.g., predicting if a pitch will be a strike or a ball)  
- Evaluating classification models with accuracy, precision, recall, and ROC AUC  
- Using cross-validation and hyperparameter tuning to optimize models

### Chapter 10: Clustering and Dimensionality Reduction
- K-means and hierarchical clustering to group similar players  
- Principal Component Analysis (PCA) for simplifying complex datasets  
- Identifying hidden player types (e.g., power hitters, contact hitters, ground-ball pitchers) and market inefficiencies

### Chapter 11: Time-Series Forecasting
- Forecasting player performance and injury likelihood over time  
- ARIMA and Prophet models for predicting future trends in run scoring or attendance  
- Incorporating seasonality and external factors (weather, park factors) into models

## Part V: Advanced Topics

### Chapter 12: Deep Learning Applications in Baseball
- Using neural networks for pitch classification or predicting batted-ball outcomes  
- Convolutional Neural Networks (CNNs) for image-based scouting (e.g., analyzing player swings from video)  
- Recurrent Neural Networks (RNNs), including LSTMs, for sequential pitch prediction

### Chapter 13: Natural Language Processing (NLP) for Baseball Journalism and Scouting Reports
- Scraping news articles and scouting reports for sentiment analysis  
- Topic modeling to understand common themes in game recaps  
- Creating keyword-based player comparisons

### Chapter 14: Reinforcement Learning and Strategy Optimization
- Modeling decision-making processes, like when to pull a pitcher or attempt a stolen base  
- Simple Q-learning examples to simulate in-game strategy decisions  
- Potential future applications of advanced AI in front-office decision-making
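
To give a flavor of the Q-learning examples mentioned above, here is a minimal sketch of the standard tabular Q-learning update. The state labels, action names, rewards, and default `alpha`/`gamma` values are hypothetical placeholders, not a model of real in-game decisions.

```python
# Hypothetical actions for a "pull the pitcher?" decision.
ACTIONS = ("keep", "pull")


def q_update(q, state, action, reward, next_state, actions=ACTIONS,
             alpha=0.1, gamma=0.9):
    """Apply one tabular Q-learning update and return the new Q-value.

    Q(s, a) <- Q(s, a) + alpha * (reward + gamma * max_a' Q(s', a') - Q(s, a))
    Missing table entries are treated as 0.0.
    """
    best_next = max(q.get((next_state, a), 0.0) for a in actions)
    old = q.get((state, action), 0.0)
    q[(state, action)] = old + alpha * (reward + gamma * best_next - old)
    return q[(state, action)]


# Usage: start from an empty table and apply one update.
q_table = {}
first = q_update(q_table, "tired", "pull", reward=1.0, next_state="fresh")
```

Repeating this update over many simulated innings is what lets the table converge toward a decision policy; the simulation itself is left out of this sketch.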

## Part VI: Deploying and Communicating Results

### Chapter 15: Building Interactive Dashboards and Reports
- Creating Shiny or Dash dashboards to share insights with non-technical stakeholders  
- Embedding visualizations in web pages and internal analytics portals  
- Designing automated reports for coaches and scouts

### Chapter 16: Effective Communication of Data Insights
- Best practices for data storytelling: clarity, simplicity, relevance  
- How to present findings to coaches and executives  
- Ethical considerations in sports analytics: player privacy, gambling, and fairness

### Chapter 17: Project Ideas and Case Studies
- Predicting Cy Young Award winners using historical data  
- Finding undervalued free agents with clustering and regression  
- Building a game simulation model to forecast win probabilities  
- Discussion of real-world front-office analytics roles  
- Tips for continuing education and staying current with new technologies and metrics
