# Gaussian Distribution

- Also known as the Normal distribution, it is a type of continuous probability distribution for a real-valued random variable. The general form of its probability density function is:

f(x) = (1 / sqrt(2πσ²)) * exp( - (x - μ)² / (2σ²) )

where:
  - `μ` is the mean or expectation of the distribution (and also its median and mode),
  - `σ` is the standard deviation, and
  - `σ²` is the variance.
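
The density above can be evaluated directly. The following is a minimal sketch using only the Python standard library; the function name `gaussian_pdf` and its default arguments are illustrative choices, not part of the original notes.

```python
import math

def gaussian_pdf(x: float, mu: float = 0.0, sigma: float = 1.0) -> float:
    """Density of N(mu, sigma^2) evaluated at x."""
    if sigma <= 0:
        raise ValueError("sigma must be positive")
    variance = sigma ** 2
    coeff = 1.0 / math.sqrt(2.0 * math.pi * variance)   # normalizing constant
    exponent = -((x - mu) ** 2) / (2.0 * variance)      # scaled squared distance from the mean
    return coeff * math.exp(exponent)

# The density is highest at the mean and symmetric around it.
peak = gaussian_pdf(0.0)
left, right = gaussian_pdf(-1.0), gaussian_pdf(1.0)
```

For the standard normal (μ = 0, σ = 1), `peak` equals 1 / sqrt(2π), roughly 0.3989, and `left` equals `right` by symmetry.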

- **Intuition**: The Gaussian distribution is symmetric and its mean, median and mode are all equal. It is defined by two parameters: the mean (μ) which determines the center of the distribution, and the standard deviation (σ) which determines the spread or width of the distribution. The shape of the Gaussian distribution is a bell curve, with the majority of the observations falling close to the mean, and fewer observations in the tails.

- **Properties**: The Gaussian distribution has some important properties:
  - It is fully described by its mean and variance.
  - It has a skewness of 0 and a kurtosis of 3 (equivalently, an excess kurtosis of 0).
  - About 68% of values drawn from a Gaussian distribution are within one standard deviation σ away from the mean; about 95% are within two standard deviations and about 99.7% lie within three standard deviations. This is known as the 68-95-99.7 rule or the empirical rule.
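
The percentages in the empirical rule can be checked numerically. This sketch relies on the identity P(|X - μ| ≤ kσ) = erf(k / sqrt(2)), which holds for any Gaussian; the helper name `prob_within` is made up for illustration.

```python
import math

def prob_within(k: float) -> float:
    """Probability that a Gaussian value lies within k standard deviations of its mean."""
    return math.erf(k / math.sqrt(2.0))

for k in (1, 2, 3):
    print(f"within {k} sigma: {prob_within(k):.4f}")
# within 1 sigma: 0.6827
# within 2 sigma: 0.9545
# within 3 sigma: 0.9973
```

The result does not depend on μ or σ, which is why the rule applies to every Gaussian distribution.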

## Kalman Filter

- The Kalman filter is an algorithm that uses a series of measurements observed over time, containing statistical noise and other inaccuracies, and produces estimates of unknown variables that tend to be more accurate than those based on a single measurement alone.

- **Intuition**: The Kalman filter works in two steps: prediction and update. In the prediction step, it produces estimates of the current state variables, along with their uncertainties. Once the outcome of the next measurement (necessarily corrupted with some amount of error, including random noise) is observed, these estimates are updated using a weighted average, with more weight being given to estimates with higher certainty.

- **Mathematically**: The Kalman filter maintains a joint Gaussian probability distribution over the state variables at each time step. The filter updates the current state estimate and its covariance matrix with the information from the most recent measurement, resulting in a new Gaussian. In the scalar case, the mean of the resulting Gaussian is a weighted average of the predicted mean and the measurement, where each is weighted by its precision (the reciprocal of its variance). The resulting variance is smaller than either of the two original variances, reflecting the fact that combining information from multiple sources generally reduces uncertainty.
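
The two steps can be sketched for the simplest possible case: a single scalar state. This is an illustration under stated assumptions, not a general implementation: the state is assumed to change by a known amount `motion` per step, and the names `predict`, `update`, `process_var`, and `meas_var` are hypothetical.

```python
def predict(mean, var, motion=0.0, process_var=0.0):
    """Prediction step: shift the estimate and grow its uncertainty."""
    return mean + motion, var + process_var

def update(mean, var, meas, meas_var):
    """Update step: fuse the prior N(mean, var) with a measurement N(meas, meas_var)."""
    gain = var / (var + meas_var)            # scalar Kalman gain, between 0 and 1
    new_mean = mean + gain * (meas - mean)   # precision-weighted average of prior and measurement
    new_var = (1.0 - gain) * var             # never larger than var or meas_var
    return new_mean, new_var

# Start from a very uncertain prior and process three measurements.
mean, var = 0.0, 1000.0
for z in [5.0, 6.0, 7.0]:
    mean, var = predict(mean, var, motion=1.0, process_var=2.0)
    mean, var = update(mean, var, z, meas_var=4.0)
```

Writing the update with a gain is algebraically the same as the precision-weighted average: when the prior is uncertain (large `var`), the gain approaches 1 and the estimate follows the measurement; when the measurement is noisy (large `meas_var`), the gain approaches 0 and the estimate stays near the prediction.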

## Univariate and Multivariate Gaussian Distributions

- **Univariate Gaussian Distribution**: This is the Gaussian distribution for a single random variable. It is defined by two parameters: the mean (μ) and the variance (σ²). The probability density function of a univariate Gaussian distribution is given by:

f(x) = (1 / sqrt(2πσ²)) * exp( - (x - μ)² / (2σ²) )

- **Multivariate Gaussian Distribution**: This is the generalization of the Gaussian distribution to multiple dimensions. It is defined by a mean vector (μ) and a covariance matrix (Σ). The mean vector contains the mean of each variable, and the covariance matrix contains the variances on the diagonal and the covariances off the diagonal. For the density below to exist, Σ must be symmetric and positive definite, so that it is invertible. The probability density function of a multivariate Gaussian distribution is given by:

f(x) = (1 / (sqrt((2π)^k * det(Σ)))) * exp( -0.5 * (x - μ)' * Σ^-1 * (x - μ) )

where:
  - `x` is a k-dimensional real vector,
  - `μ` is the mean vector,
  - `Σ` is the covariance matrix,
  - `k` is the number of dimensions (variables),
  - `det(Σ)` is the determinant of the covariance matrix,
  - `'` denotes the transpose of a vector or a matrix.
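
To make the formula concrete, here is a minimal sketch for the special case k = 2, with the 2x2 inverse and determinant written out by hand so that no external library is needed. The function name `mvn_pdf_2d` and the tuple-based matrix layout are assumptions made for this example.

```python
import math

def mvn_pdf_2d(x, mu, cov):
    """Density of a 2-D Gaussian with mean `mu` and 2x2 covariance `cov` at point `x`."""
    (a, b), (c, d) = cov
    det = a * d - b * c
    if a <= 0 or det <= 0:
        raise ValueError("covariance must be positive definite")
    # Inverse of a 2x2 matrix, written out explicitly.
    inv = ((d / det, -b / det), (-c / det, a / det))
    dx = (x[0] - mu[0], x[1] - mu[1])
    # Quadratic form (x - mu)' * inv(cov) * (x - mu)
    quad = (dx[0] * (inv[0][0] * dx[0] + inv[0][1] * dx[1])
            + dx[1] * (inv[1][0] * dx[0] + inv[1][1] * dx[1]))
    k = 2  # number of dimensions
    norm = 1.0 / math.sqrt((2.0 * math.pi) ** k * det)
    return norm * math.exp(-0.5 * quad)

# With an identity covariance, the density at the mean is 1 / (2π).
at_mean = mvn_pdf_2d((0.0, 0.0), (0.0, 0.0), ((1.0, 0.0), (0.0, 1.0)))
```

When the covariance matrix is diagonal, the variables are uncorrelated and the result factors into a product of two univariate densities, which is a convenient sanity check.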

### Product of Gaussian Distributions

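- **Product of Gaussian Distributions**: When you multiply two Gaussian probability density functions together, the result is proportional to another Gaussian density; it must be rescaled to integrate to 1. Note that this concerns the product of the densities, not the product of two Gaussian random variables, which is not Gaussian in general. This closure property is one of the reasons why Gaussians are widely used in statistics and machine learning.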

- **Mathematically**: If we have two Gaussian distributions N(μ1, σ1²) and N(μ2, σ2²), their product is proportional to a third Gaussian distribution N(μ, σ²), where:

μ = (σ2² * μ1 + σ1² * μ2) / (σ1² + σ2²)
σ² = (σ1² * σ2²) / (σ1² + σ2²)

- **Intuition**: The mean of the resulting Gaussian distribution is a weighted average of the means of the two original distributions, where each mean is weighted by its own precision (the reciprocal of its variance). Weighting μ1 by 1/σ1² and μ2 by 1/σ2² and then normalizing gives the same expression as weighting μ1 by σ2² and μ2 by σ1². The variance of the resulting distribution is smaller than either of the two original variances, reflecting the fact that combining information from multiple sources generally reduces uncertainty.
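
A short sketch of this fusion rule follows. The function name `product_of_gaussians` is illustrative, and the second function shows the equivalent precision-based form so the two can be compared.

```python
def product_of_gaussians(mu1, var1, mu2, var2):
    """Mean and variance of the Gaussian proportional to N(mu1, var1) * N(mu2, var2)."""
    mu = (var2 * mu1 + var1 * mu2) / (var1 + var2)
    var = (var1 * var2) / (var1 + var2)
    return mu, var

def product_via_precisions(mu1, var1, mu2, var2):
    """Same result, computed with precisions (reciprocals of the variances)."""
    p1, p2 = 1.0 / var1, 1.0 / var2
    var = 1.0 / (p1 + p2)               # precisions add
    mu = var * (p1 * mu1 + p2 * mu2)    # precision-weighted mean
    return mu, var

mu, var = product_of_gaussians(10.0, 4.0, 12.0, 4.0)
# Equal variances: the mean lands halfway (11.0) and the variance halves (2.0).
```

With unequal variances, the fused mean moves toward whichever input has the smaller variance, since that input is the more certain of the two.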

### Sums and Linear Transformations of Gaussian Random Variables

- **Sum of Gaussian Random Variables**: If X and Y are independent Gaussian random variables, then their sum Z = X + Y is also a Gaussian random variable. The mean of Z is the sum of the means of X and Y, and the variance of Z is the sum of the variances of X and Y.

- **Mathematically**: If X ~ N(μ1, σ1²) and Y ~ N(μ2, σ2²) are independent, then Z = X + Y ~ N(μ1 + μ2, σ1² + σ2²).

- **Linear Transformation of a Gaussian Random Variable**: If X is a Gaussian random variable and Y = aX + b is a linear transformation of X, then Y is also a Gaussian random variable. The mean of Y is a times the mean of X plus b, and the variance of Y is a² times the variance of X.

- **Mathematically**: If X ~ N(μ, σ²), then Y = aX + b ~ N(aμ + b, a²σ²).
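
Both rules can be written as small helpers and cross-checked against `statistics.NormalDist` from the Python standard library (3.8+), which supports adding two independent normal distributions and scaling or shifting one by a constant. The helper names `sum_independent` and `linear_transform` are illustrative.

```python
from statistics import NormalDist

def sum_independent(mu1, var1, mu2, var2):
    """Mean and variance of X + Y for independent Gaussians X and Y."""
    return mu1 + mu2, var1 + var2

def linear_transform(mu, var, a, b):
    """Mean and variance of a*X + b for a Gaussian X."""
    return a * mu + b, a * a * var

X = NormalDist(mu=1.0, sigma=2.0)    # variance 4.0
Y = NormalDist(mu=3.0, sigma=1.5)    # variance 2.25

Z = X + Y            # NormalDist addition treats the operands as independent
W = X * -2.0 + 5.0   # linear transformation with a = -2, b = 5

z_mean, z_var = sum_independent(1.0, 4.0, 3.0, 2.25)
w_mean, w_var = linear_transform(1.0, 4.0, -2.0, 5.0)
```

Note that variances add, not standard deviations, and that a negative scale factor `a` still yields a positive variance because it enters as a².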

## Gaussian Noise

- **Gaussian Noise**: Gaussian noise is a type of statistical noise whose probability density function is that of the normal (Gaussian) distribution. In other words, the values that the noise can take on are Gaussian-distributed. It is often combined with the assumption of whiteness, giving Gaussian white noise, but the two terms are not synonyms: "Gaussian" describes the distribution of the amplitudes, while "white" describes how the samples are correlated in time.

- **Intuition**: Many systems are subject to random variations, or 'noise'. When this noise is assumed to be Gaussian-distributed, it is referred to as Gaussian noise. Noise is called white when its samples are uncorrelated in time, which gives it uniform power across the frequency band of the system. Noise can be Gaussian without being white, and white without being Gaussian. Gaussian white noise is a common assumption in communications, control systems, and signal processing.

- **Properties**: Gaussian noise is commonly modeled with zero mean and some variance σ². These two parameters (mean and variance) completely describe the distribution of each noise sample.

- **Applications**: Gaussian noise is used in signal processing and machine learning for its simplicity and properties. For example, in image processing, Gaussian noise can be used to model the effect of sensor noise. In machine learning, Gaussian noise is often added to the input data for regularization and to prevent overfitting.
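
As a small illustration of the additive noise model, the sketch below corrupts a signal with zero-mean Gaussian noise using the standard library. The function name `add_gaussian_noise` and the optional `seed` parameter are assumptions made for the example; each sample is drawn independently, so the noise here is also white.

```python
import random
import statistics

def add_gaussian_noise(signal, sigma, seed=None):
    """Return a copy of `signal` with independent N(0, sigma^2) noise added to each sample."""
    rng = random.Random(seed)   # private generator, so a seed gives reproducible noise
    return [s + rng.gauss(0.0, sigma) for s in signal]

# Adding noise to an all-zero signal leaves just the noise itself.
clean = [0.0] * 10_000
noisy = add_gaussian_noise(clean, sigma=0.5, seed=0)

sample_mean = statistics.fmean(noisy)    # should be close to 0
sample_std = statistics.pstdev(noisy)    # should be close to 0.5
```

With enough samples, the sample mean and standard deviation approach the chosen parameters, and about 68% of the noise values fall within ±σ, consistent with the empirical rule.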

