## Nonlinearity in Natural Processes
- Many natural processes exhibit nonlinear behaviors over large scales.
- For instance, plotting height and weight across the human lifespan (including children) shows a nonlinear relationship. While height and weight for adults may be approximately linear, this relationship changes across different age groups.

<img src="images/image32.jpeg" width="600" height="400" />

## Modeling Nonlinear Relationships with Linear Models

- Linear models can still fit nonlinear relationships when the predictor is expanded into several additive terms.
- They can be thought of as analogous to epicycles: multiple simple components are combined to trace a complex shape, without describing the mechanism that produces it.

## Strategies for Creating Nonlinear Shapes with Linear Models

### 1. Polynomial Functions
- **Approach**: Using polynomial functions to introduce curvature.
- **Critique**: Widely used but often problematic due to overfitting and lack of interpretability.

### 2. Additive Functions (Splines)
- **Approach**: Utilizing additive functions such as splines.
- **Advantages**: Generally more flexible and better behaved than polynomials. Splines and related methods such as Gaussian processes produce curves that track the local center of the data points.

<img src="images/image33.jpeg" width="600" height="400" />

## Conclusion

- While linear models are inherently linear, they can be adapted to model nonlinear phenomena effectively.
- Splines are usually a sounder choice than polynomial functions: they stay flexible while fitting the data locally, so the resulting curve is easier to reason about. Neither approach is a mechanistic model of the underlying process.
- Understanding these methods is crucial for modeling complex real-world relationships accurately.

----------------------------------------------------------------

# Critique of Polynomial Functions in Modeling

## Characteristics of Polynomials

- **Definition**: Polynomials are functions built from powers of a variable (typically denoted x), such as \( x^2 \) and \( x^3 \). Adding these powers together produces curves of various shapes.
- **Structure**: A polynomial model consists of an intercept, a linear term (\( \beta_1 x_i \)), and higher-order terms such as \( \beta_2 x_i^2 \).
- **Linearity**: Despite their curved shape, polynomial models are still linear models because the mean is an additive function of the parameters (the \( \beta \) coefficients).
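
The linearity point can be checked numerically. The following is a minimal sketch, assuming NumPy and made-up synthetic data (the true coefficients 1.0, 0.5, and -0.8 are arbitrary choices for illustration). It fits a quadratic by ordinary least squares, which works only because the model is linear in the \( \beta \) coefficients even though it is curved in x.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data (made up for illustration): a curved trend plus noise.
x = np.linspace(-2.0, 2.0, 50)
y = 1.0 + 0.5 * x - 0.8 * x**2 + rng.normal(0.0, 0.1, size=x.size)

# Design matrix with columns [1, x, x^2]. The columns are nonlinear in x,
# but the mean mu = X @ beta is linear in the coefficients beta.
X = np.column_stack([np.ones_like(x), x, x**2])

# Ordinary least squares can solve for beta because of that linearity.
beta, *_ = np.linalg.lstsq(X, y, rcond=None)
mu = X @ beta

print(beta)  # estimates of the intercept, linear, and quadratic coefficients
```

Adding a cubic term would only add one more column to `X`; the fitting step would stay the same.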

<img src="images/image34.jpeg" width="600" height="400" />

## Issues with Polynomials

- **Symmetry**: Polynomials exhibit unwanted symmetries (e.g., parabolas are perfectly symmetric), which may not align with realistic scientific assumptions.
- **Uncertainty at the Edges**: Polynomial fits become highly uncertain near and beyond the edges of the data, so predictions there can swing wildly.
- **Global Influence**: Every coefficient affects the curve over the entire range of the x-axis, so the shape is determined globally rather than locally. Without local smoothing, the fit in one region can be distorted by data from a distant region.

## Bayesian Perspective on Polynomials

- **Posterior Distribution**: Bayesian updating of a polynomial model shows how the assumptions built into the functional form (symmetry and global influence) shape the posterior curve, often unrealistically.
- **Data Influence**: Each data point can significantly alter the shape of the polynomial curve across the entire x-axis, which is undesirable for accurate modeling.
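
The global influence of a single observation can be illustrated with a small experiment. This is a sketch on made-up data, assuming NumPy and a plain least-squares cubic fit rather than the Bayesian fit discussed in the notes; `poly_fit_predict` is a hypothetical helper name. Only the leftmost observation is moved, yet the fitted curve shifts at the middle and at the far right as well.

```python
import numpy as np

def poly_fit_predict(x, y, degree, x_new):
    """Least-squares polynomial fit of the given degree, evaluated at x_new."""
    X = np.vander(x, degree + 1, increasing=True)
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    return np.vander(x_new, degree + 1, increasing=True) @ beta

x = np.linspace(0.0, 10.0, 21)
y = np.sin(x / 2.0)              # made-up smooth data

y_moved = y.copy()
y_moved[0] += 1.0                # perturb only the leftmost observation

x_new = np.array([0.0, 5.0, 10.0])
before = poly_fit_predict(x, y, 3, x_new)
after = poly_fit_predict(x, y_moved, 3, x_new)

# Change in the fitted curve at the left edge, the middle, and the right edge.
shift = after - before
print(shift)
```

All three entries of `shift` are nonzero, which is the behavior the bullets above describe: a polynomial has no way to confine the effect of one data point to its own neighborhood.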

## Conclusion

- **Limitations**: Despite their flexibility in fitting various data shapes, polynomials make assumptions that are often unrealistic and do not reflect local data behavior.
- **Alternative Approaches**: Techniques like splines provide local smoothing and are more suitable for modeling nonlinear relationships in a scientifically rigorous manner.
- **Recommendation**: Avoid relying solely on polynomials for complex data modeling tasks due to their inherent limitations and problematic assumptions.

<video controls width="600" height="400" src="files/recording2.mp4" title="Animation"></video>

<img src="images/image35.jpeg" width="600" height="400" /> <img src="images/image36.jpeg" width="600" height="400" />

# Understanding Splines

## Introduction to Splines

- **Utility**: Splines are useful for building locally inferred functions when little scientific background information is available. They effectively stratify the data by a continuously varying predictor variable.
- **Types**: The simplest to explain are basis splines (B-splines), but various types exist. All work by adding together locally fitted terms, resulting in a smooth continuous function.

<img src="images/image37.jpeg" width="600" height="400" />

## Mechanism of Splines

- **Drafting Origin**: The term "spline" comes from drafting, where a flexible bar is bent using weights to create smooth curves. Similarly, in statistical splines, anchor points or control points guide the curve's shape.
- **Local Training**: Parameters are trained on local data regions, and these local regions are smoothed together to form a continuous function.

## Example: Cherry Blossom Blooms

- **Historical Data**: Example data includes the day of the first cherry blossom bloom recorded over 1,000 years in a specific region of Japan.
- **Spline Fitting**: Splines can help visualize trends in such data, showing reliable local variations (hills and valleys) without prior scientific assumptions.

<img src="images/image38.jpeg" width="600" height="400" />

## Mechanism of B-Splines

- **Linear Model with Additive Terms**: B-splines are linear models with a series of additive terms, where synthetic variables (basis functions) represent local positions.
- **Weight Parameters**: Each term has a weight parameter (\( W \)) that adjusts the importance of the corresponding basis function (\( B \)).
- **Local Influence**: Each weight affects the curve only in the region where its basis function is non-zero.
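
Written out, the model described above has mean \( \mu_i = \alpha + \sum_k W_k B_{k,i} \). The sketch below builds the simplest case, assuming NumPy, degree-1 ("tent") basis functions, and four equally spaced knots; the function name `hat_basis`, the knot positions, and the weights are all hypothetical choices for illustration. Higher-degree B-splines use smoother basis functions but the same additive structure.

```python
import numpy as np

def hat_basis(x, knots):
    """Degree-1 B-spline ("tent") basis: one column per knot.

    Column k equals 1 at knots[k], falls linearly to 0 at the neighboring
    knots, and is 0 everywhere else.
    """
    eye = np.eye(len(knots))
    # Linearly interpolating the k-th unit vector over the knots gives tent k.
    return np.column_stack(
        [np.interp(x, knots, eye[k]) for k in range(len(knots))]
    )

knots = np.array([0.0, 1.0, 2.0, 3.0])   # assumed knot positions
x = np.linspace(0.0, 3.0, 31)
B = hat_basis(x, knots)                   # shape (31, 4)

alpha = 0.0                               # intercept
w = np.array([1.0, -1.0, 0.5, 2.0])       # made-up weights, one per basis function

# The spline is an ordinary linear model in the weights.
mu = alpha + B @ w
```

With this basis, the curve passes through the value `w[k]` at `knots[k]` and interpolates linearly in between.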

<img src="images/image39.jpeg" width="600" height="400" />

## Visualizing Splines

- **Basis Functions**: The colored curves are the basis functions. At any point on the x-axis, the spline is the weighted sum of the basis function values at that point.

<img src="images/image40.jpeg" width="600" height="400" />

- **Weight Adjustment**: Changing the weight of a basis function adjusts the spline locally. For example:
  - Changing \( W_1 \) scales only basis function 1, moving the curve on the far left.
  
<img src="images/image41.jpeg" width="600" height="400" />

  - Changing \( W_2 \) scales basis function 2, moving the curve in the middle left region.

<img src="images/image42.jpeg" width="600" height="400" />

  - Changing \( W_3 \) scales basis function 3, moving the curve in the middle right region.

<img src="images/image43.jpeg" width="600" height="400" />

  - Changing \( W_4 \) scales basis function 4, moving the curve in the far right region.

<img src="images/image44.jpeg" width="600" height="400" />
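
The local effect of a single weight can be verified directly. This is a minimal sketch, assuming NumPy and a degree-1 (tent) basis on four assumed knots, which may differ from the basis drawn in the figures. Only the first weight is changed, and the curve moves only where the first basis function is non-zero.

```python
import numpy as np

knots = np.array([0.0, 1.0, 2.0, 3.0])    # assumed knots, one tent per knot
x = np.linspace(0.0, 3.0, 31)

# Degree-1 basis: column k is the tent centered on knots[k].
B = np.column_stack([np.interp(x, knots, row) for row in np.eye(len(knots))])

w = np.array([1.0, 1.0, 1.0, 1.0])
w_changed = w.copy()
w_changed[0] = 3.0                         # change only the first weight

# Difference between the two spline curves; equals 2.0 * B[:, 0].
delta = B @ w_changed - B @ w

# x values at which the curve actually moved.
changed = x[np.abs(delta) > 1e-12]
print(changed.min(), changed.max())
```

Every x value in `changed` lies to the left of the second knot, so the rest of the curve is untouched. The same check with a polynomial coefficient would show a change across the whole x-axis.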

## Flexibility and Complexity

- **Infinite Shapes**: Splines can take an infinite number of shapes, providing flexibility in modeling various data trends.
- **Local vs. Global**: Unlike polynomials, splines adjust based on local data, offering more reliable and interpretable models.

## Conclusion

- **Advantages of Splines**: They provide a flexible and locally adaptive approach to modeling nonlinear relationships without strong prior assumptions.
- **Practical Use**: Splines are a major tool in applied statistics, useful for visualizing and analyzing complex data trends.

----------------------------------------------------------------

# Using Splines for Modeling Nonlinear Relationships

## Example of Splines in Practice

- **Height and Age Data**: We model the relationship between age and height using splines. Though age is considered a causal influence on height, it's important to note that:
  - **Age as a Cause**: Age itself cannot be experimentally manipulated. It's a proxy for time, during which various growth-related causes accumulate.
  - **Proxy Utility**: Despite not being a direct cause, age is useful in modeling growth patterns.

## Bayesian Spline Model

- **Initial Training**: Start with a Bayesian spline model trained on data from 10 individuals. 
  - **Wiggly Curves**: Initially, the spline curve is very wiggly, especially outside the data range, reflecting high uncertainty.
  - **Learning Path**: As more data points are added, the spline learns the expected growth path, smoothing locally without the wild fluctuations seen in polynomial models.
- **Biological Realism**: Unlike a biological model, the spline doesn't inherently prevent unrealistic decreases in height. It adjusts based on data without biological constraints.
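
The way uncertainty shrinks as data arrive can be sketched with a simplified conjugate model. This is not the model used in the animation: it assumes NumPy, a degree-1 basis, a Normal(0, tau^2) prior on each weight, a known noise standard deviation, and a made-up growth curve (`fake_height`). Under those assumptions the posterior over the weights is available in closed form.

```python
import numpy as np

rng = np.random.default_rng(1)

def tent_basis(x, knots):
    """Degree-1 B-spline basis: column k is the tent centered on knots[k]."""
    return np.column_stack([np.interp(x, knots, row) for row in np.eye(len(knots))])

def posterior(B, y, sigma, tau):
    """Posterior mean and covariance of the weights for
    w ~ Normal(0, tau^2 I) and y ~ Normal(B @ w, sigma^2), sigma known."""
    precision = B.T @ B / sigma**2 + np.eye(B.shape[1]) / tau**2
    cov = np.linalg.inv(precision)
    mean = cov @ B.T @ y / sigma**2
    return mean, cov

def fake_height(age):
    """Made-up growth curve in cm, for illustration only."""
    return 170.0 - 120.0 * np.exp(-0.15 * age)

knots = np.linspace(0.0, 20.0, 6)     # assumed knots over ages 0 to 20
sigma, tau = 5.0, 200.0               # assumed noise sd and prior sd
grid = np.array([5.0, 20.0])          # ages at which to inspect the curve
B_grid = tent_basis(grid, knots)

results = {}
# First 10 individuals aged 0 to 10, then 200 individuals aged 0 to 20.
for n, max_age in [(10, 10.0), (200, 20.0)]:
    age = rng.uniform(0.0, max_age, size=n)
    height = fake_height(age) + rng.normal(0.0, sigma, size=n)
    mean, cov = posterior(tent_basis(age, knots), height, sigma, tau)
    # Posterior sd of the curve at each grid age: sqrt(diag(B cov B^T)).
    sd = np.sqrt(np.einsum("ij,jk,ik->i", B_grid, cov, B_grid))
    results[n] = sd
    print(n, np.round(B_grid @ mean, 1), np.round(sd, 1))
```

With 10 individuals who are all younger than 10, the curve at age 20 is governed entirely by the prior, so its posterior standard deviation stays at `tau`. Once data cover the full age range, the uncertainty at age 20 drops sharply. Nothing in this sketch stops the fitted curve from decreasing with age, which matches the point about biological realism above.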

<video controls width="600" height="400" src="files/recording3.mp4" title="Animation"></video>

## Basis Functions and Weight Parameters

- **Basis Functions**: These are underlying hills that form the spline. The heights of these hills are adjusted by weight parameters.
  - **Visualization**: The black hills are the weighted basis functions, and their sum at any point along the x-axis gives the spline curve (blue).
  - **Local Adjustment**: The spline is the sum of these locally adjusted basis functions, ensuring a smooth, flexible curve.

## Splines vs. Biological Models

- **Advantages of Splines**: Splines are highly flexible and can fit a variety of data shapes without requiring strong prior knowledge. They are additive and linear underneath, making them useful in many applied statistics scenarios.
- **Limitations in Biological Contexts**: For modeling human height, splines may not be ideal. Biological models, which consider distinct growth phases, provide more accurate and meaningful results.

<video controls width="600" height="400" src="files/recording4.mp4" title="Animation"></video>

## Alternative Strategy: Modeling Growth Phases

- **Growth Phases**: Humans undergo distinct biologically controlled growth phases:
  - **Infancy**: Rapid overall growth.
  - **Childhood**: Slower body growth, rapid head growth.
  - **Puberty**: Body growth catches up with head growth; brain growth essentially stops.
  - **Adulthood**: Growth levels off, with a slight decline in stature over time due to gravity.
- **Phase-Specific Models**: Each phase can be modeled with parameters reflecting biological realities:
  - **Infancy and Childhood**: Parameters for rapid and slower growth rates.
  - **Puberty**: Parameters for body catch-up growth.
  - **Adulthood**: Parameters for the leveling off and slight decline in height.
- **Benefits**: Using phase-specific models with biologically meaningful constraints allows for more accurate modeling and meaningful comparisons between different growth phases.

<img src="images/image46.jpeg" width="600" height="400" />

## Conclusion

- **Use Splines Wisely**: While splines are powerful tools for many applications, in contexts with strong underlying biological processes, models that incorporate these processes are preferable.
- **Flexible Modeling**: Splines remain an essential tool in the absence of strong generative models, providing flexible and locally adaptive curve fitting.

<img src="images/image52.png" width="600" height="400" />