**Probability**

**Covariance**

Covariance is a measure of the extent to which two random variables change together. It quantifies the degree to which two variables tend to deviate from their means in a similar way.

Mathematically, for \(n\) paired observations of two variables \(X\) and \(Y\), the covariance is denoted as \(cov(X,Y)\) and is computed as:

\[ cov(X,Y) = \frac{1}{n} \sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y}) \]

Where:
- \(n\) is the number of observations.
- \(x_i\) and \(y_i\) are individual observations of the variables \(X\) and \(Y\) respectively.
- \(\bar{x}\) and \(\bar{y}\) are the means of the variables \(X\) and \(Y\) respectively.

This is the population form, which divides by \(n\). The sample covariance divides by \(n - 1\) instead.

The sign of the covariance indicates the direction of the linear relationship between the variables:
- Positive covariance indicates that as one variable increases, the other tends to increase as well.
- Negative covariance indicates that as one variable increases, the other tends to decrease.
- Covariance close to zero suggests little or no linear relationship between the variables, although a nonlinear relationship may still exist.

One drawback of covariance is that its value is not standardized, which means it is dependent on the scale of the variables. As a result, it's hard to interpret the magnitude of the covariance.

Covariance is often used in conjunction with correlation. While covariance measures the direction of the linear relationship between variables, correlation standardizes this measure to a range between -1 and 1, making it easier to interpret the strength and direction of the relationship.

**Formula of Covariance:**

<img src="covariance-Formula.png" width="450" height="200"/>

**Example**

Let's calculate the covariance between the heights of fathers and sons using the following dataset:

| Father's Height (in inches) | Son's Height (in inches) |
|------------------------------|---------------------------|
|            68                |             70            |
|            72                |             74            |
|            66                |             68            |
|            71                |             72            |
|            69                |             71            |

First, we calculate the mean of the fathers' heights \(\bar{x}\) and the mean of the sons' heights \(\bar{y}\):

\[ \bar{x} = \frac{68 + 72 + 66 + 71 + 69}{5} = \frac{346}{5} = 69.2 \text{ inches} \]

\[ \bar{y} = \frac{70 + 74 + 68 + 72 + 71}{5} = \frac{355}{5} = 71 \text{ inches} \]

Now, we substitute the values into the covariance formula:

\[ cov(X,Y) = \frac{(68 - 69.2)(70 - 71) + (72 - 69.2)(74 - 71) + (66 - 69.2)(68 - 71) + (71 - 69.2)(72 - 71) + (69 - 69.2)(71 - 71)}{5} \]

\[ cov(X,Y) = \frac{(-1.2)(-1) + (2.8)(3) + (-3.2)(-3) + (1.8)(1) + (-0.2)(0)}{5} \]

\[ cov(X,Y) = \frac{1.2 + 8.4 + 9.6 + 1.8 + 0}{5} \]

\[ cov(X,Y) = \frac{21}{5} = 4.2 \text{ square inches} \]

So, the covariance between the heights of fathers and sons in this dataset is 4.2 square inches.
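The calculation above can be checked with a short Python sketch. The function name `population_covariance` is illustrative rather than taken from any library, and it divides by \(n\), matching the formula used in this section.

```python
def population_covariance(xs, ys):
    """Average product of deviations from the means (divides by n)."""
    if len(xs) != len(ys) or not xs:
        raise ValueError("xs and ys must be non-empty and of equal length")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


# Heights in inches, copied from the table above.
father_heights = [68, 72, 66, 71, 69]
son_heights = [70, 74, 68, 72, 71]

cov_heights = population_covariance(father_heights, son_heights)
print(round(cov_heights, 4))  # 4.2
```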

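**Drawbacks**

- Covariance carries units (the product of the units of \(X\) and \(Y\)), so its value changes when the variables are rescaled.
- Its range is unbounded, from \(-\infty\) to \(+\infty\), so the magnitude alone does not show how strong the relationship is.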
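The scale dependence of covariance can be seen by converting the height data from inches to centimetres. The sketch below is illustrative only: `population_covariance` is a hypothetical helper that divides by \(n\), and the only outside fact it relies on is the conversion 1 inch = 2.54 cm.

```python
def population_covariance(xs, ys):
    """Average product of deviations from the means (divides by n)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


CM_PER_INCH = 2.54

fathers_in = [68, 72, 66, 71, 69]
sons_in = [70, 74, 68, 72, 71]

# Same people, same relationship, different unit of measurement.
fathers_cm = [h * CM_PER_INCH for h in fathers_in]
sons_cm = [h * CM_PER_INCH for h in sons_in]

cov_in = population_covariance(fathers_in, sons_in)
cov_cm = population_covariance(fathers_cm, sons_cm)

print(round(cov_in, 3))  # 4.2 (square inches)
print(round(cov_cm, 3))  # 27.097 (square centimetres)
```

The covariance grows by a factor of \(2.54^2\) even though the relationship between the heights has not changed.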

-----

**Correlation**

In probability theory and statistics, correlation refers to the measure of the relationship between two random variables. It indicates the extent to which the variables tend to move together. Correlation is often denoted by the symbol "r" and ranges between -1 and 1.

Here's what different correlation values indicate:

1. **Positive Correlation (r > 0)**: When one variable increases, the other variable tends to increase as well.
2. **Negative Correlation (r < 0)**: When one variable increases, the other variable tends to decrease.
3. **No Correlation (r = 0)**: There is no linear relationship between the variables, although a nonlinear relationship may still exist.

The correlation coefficient, denoted by "r," can be calculated using various methods. One common method is Pearson's correlation coefficient, which measures the linear relationship between two variables. It is calculated using the following formula:

\[ r = \frac{\sum_{i=1}^{n} (x_i - \bar{x})(y_i - \bar{y})}{\sqrt{\sum_{i=1}^{n} (x_i - \bar{x})^2 \, \sum_{i=1}^{n} (y_i - \bar{y})^2}} \]

Where:
- \( x_i \) and \( y_i \) are the individual data points.
- \( \bar{x} \) and \( \bar{y} \) are the means of the variables \( x \) and \( y \), respectively.

Correlation is widely used in various fields such as economics, finance, psychology, and epidemiology to analyze relationships between variables and make predictions based on those relationships. However, it's important to note that correlation does not imply causation. Two variables may be correlated, but that does not necessarily mean that one variable causes the other to change.

**Formula of Pearson's Correlation :**

<img src="formula-for-pearson-correlation-coefficient.webp" width="450" height="200"/>

**Example**

Let's consider an example involving the correlation between the number of hours spent studying and the exam scores of students.

Suppose we have data from a class of students, where we recorded both the number of hours each student spent studying for an exam and their corresponding exam scores. We want to investigate whether there is a correlation between these two variables.

Here's a hypothetical dataset:

| Student | Hours Studied (x) | Exam Score (y) |
|---------|--------------------|----------------|
| 1       | 5                  | 75             |
| 2       | 3                  | 60             |
| 3       | 7                  | 85             |
| 4       | 4                  | 65             |
| 5       | 6                  | 80             |

To calculate the correlation coefficient (Pearson's correlation coefficient, denoted as \( r \)) between the number of hours studied and exam scores, we use the formula mentioned earlier.

First, we calculate the mean hours studied \(\bar{x}\) and the mean exam score \(\bar{y}\):

\[ \bar{x} = \frac{5 + 3 + 7 + 4 + 6}{5} = \frac{25}{5} = 5 \text{ hours} \]

\[ \bar{y} = \frac{75 + 60 + 85 + 65 + 80}{5} = \frac{365}{5} = 73 \text{ points} \]

Now, we substitute the values into the formula for \(r\):

\[ r = \frac{(5-5)(75-73) + (3-5)(60-73) + (7-5)(85-73) + (4-5)(65-73) + (6-5)(80-73)}{\sqrt{\left[(5-5)^2 + (3-5)^2 + (7-5)^2 + (4-5)^2 + (6-5)^2\right]\left[(75-73)^2 + (60-73)^2 + (85-73)^2 + (65-73)^2 + (80-73)^2\right]}} \]

\[ r = \frac{(0)(2) + (-2)(-13) + (2)(12) + (-1)(-8) + (1)(7)}{\sqrt{\left[0^2 + (-2)^2 + 2^2 + (-1)^2 + 1^2\right]\left[2^2 + (-13)^2 + 12^2 + (-8)^2 + 7^2\right]}} \]

\[ r = \frac{0 + 26 + 24 + 8 + 7}{\sqrt{(0 + 4 + 4 + 1 + 1)(4 + 169 + 144 + 64 + 49)}} \]

\[ r = \frac{65}{\sqrt{10 \times 430}} = \frac{65}{\sqrt{4300}} \approx \frac{65}{65.574} \approx 0.991 \]

So, the correlation coefficient \( r \) is approximately 0.991.

Since \( r \) is positive and very close to 1, it indicates a very strong positive linear relationship between the number of hours studied and the exam scores: students who studied longer tended to score higher.
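As a check on the arithmetic, here is a minimal Python sketch of Pearson's formula applied to the same five students. The function name `pearson_r` is illustrative. Computed without intermediate rounding, the coefficient comes out at about 0.9912, so it is very close to 1 but not exactly 1.

```python
import math


def pearson_r(xs, ys):
    """Pearson's r: sum of cross-deviations over the root of the
    product of the two sums of squared deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


# Data copied from the table above.
hours = [5, 3, 7, 4, 6]
scores = [75, 60, 85, 65, 80]

r = pearson_r(hours, scores)
print(round(r, 4))  # 0.9912
```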

**Formula of Correlation in terms of Covariance:**

<img src="correlation-formula-covariance.png" width="550" height="200"/>

**Example**

Let's consider an example to illustrate the relationship between correlation and covariance using the given formula.

Suppose we have a dataset that consists of the following observations for two variables, X and Y:

\[ X = [2, 4, 6, 8, 10] \]

\[ Y = [1, 3, 5, 7, 9] \]

First, we calculate the means of \(X\) and \(Y\):

\[ \bar{X} = \frac{2 + 4 + 6 + 8 + 10}{5} = \frac{30}{5} = 6 \]

\[ \bar{Y} = \frac{1 + 3 + 5 + 7 + 9}{5} = \frac{25}{5} = 5 \]

Next, we calculate the covariance between \(X\) and \(Y\):

\[ cov(X,Y) = \frac{(2 - 6)(1 - 5) + (4 - 6)(3 - 5) + (6 - 6)(5 - 5) + (8 - 6)(7 - 5) + (10 - 6)(9 - 5)}{5} \]

\[ cov(X,Y) = \frac{(-4)(-4) + (-2)(-2) + (0)(0) + (2)(2) + (4)(4)}{5} \]

\[ cov(X,Y) = \frac{16 + 4 + 0 + 4 + 16}{5} = \frac{40}{5} = 8 \]

Now, we calculate the standard deviations of \(X\) and \(Y\). The standard deviation is the square root of the variance, and the variance is the average of the squared differences from the mean.

For \(X\):

\[ \text{Var}(X) = \frac{\sum_{i=1}^{n} (x_i - \bar{X})^2}{n} = \frac{(2-6)^2 + (4-6)^2 + (6-6)^2 + (8-6)^2 + (10-6)^2}{5} = \frac{16 + 4 + 0 + 4 + 16}{5} = \frac{40}{5} = 8 \]

So, the standard deviation of \(X\) is \(\sigma_X = \sqrt{8} \approx 2.828\).

For \(Y\), the same process gives a variance of 8, so \(\sigma_Y = \sqrt{8} \approx 2.828\) as well.

Now, we can calculate the correlation coefficient using the formula:

\[ r = \frac{cov(X,Y)}{\sigma_X \, \sigma_Y} = \frac{8}{\sqrt{8} \times \sqrt{8}} = \frac{8}{8} = 1 \]

The correlation coefficient \(r\) between \(X\) and \(Y\) is 1, indicating a perfect positive linear relationship between the variables. This means that as the values of \(X\) increase, the values of \(Y\) also increase in a perfectly linear fashion.
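The same steps can be sketched in Python. The helper names below are illustrative, and both the covariance and the standard deviation use the population form (dividing by \(n\)), as in the worked example.

```python
import math


def population_covariance(xs, ys):
    """Average product of deviations from the means (divides by n)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    return sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys)) / n


def population_std(values):
    """Square root of the variance, where variance is cov(V, V)."""
    return math.sqrt(population_covariance(values, values))


def correlation_from_covariance(xs, ys):
    """r = cov(X, Y) / (sigma_X * sigma_Y)."""
    return population_covariance(xs, ys) / (population_std(xs) * population_std(ys))


X = [2, 4, 6, 8, 10]
Y = [1, 3, 5, 7, 9]

print(population_covariance(X, Y))                   # 8.0
print(round(population_std(X), 3))                   # 2.828
print(round(correlation_from_covariance(X, Y), 6))   # 1.0
```

The final value is rounded because floating-point square roots can leave the result a hair below 1.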

**Spearman Rank coefficient**


Spearman's rank correlation coefficient, denoted by \( \rho \) (rho), is a non-parametric measure of the strength and direction of association between two ranked variables. It assesses how well the relationship between two variables can be described using a monotonic function. Unlike Pearson's correlation coefficient, Spearman's correlation does not assume that the data are normally distributed.

Spearman's rank correlation coefficient is calculated as follows:

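1. Rank the values for each variable separately, assigning ranks from 1 to \( n \), where \( n \) is the number of observations. If there are ties, assign the average rank to the tied values.

2. Calculate the difference \( d_i \) between the two ranks of each observation, and square these differences.

3. Sum the squared differences and apply the formula below, which is exact when there are no tied ranks:

\[ \rho = 1 - \frac{6 \sum d_i^2}{n(n^2 - 1)} \]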


Spearman's rank correlation coefficient ranges from -1 to 1:
- \( \rho = 1 \) indicates a perfect positive monotonic relationship.
- \( \rho = -1 \) indicates a perfect negative monotonic relationship.
- \( \rho = 0 \) indicates no monotonic relationship.

Spearman's correlation is commonly used when the relationship between variables is suspected to be monotonic but not necessarily linear, or when the data may contain outliers.

It's worth noting that Spearman's correlation does not indicate causation between variables but rather assesses the strength and direction of their association based on their ranks.
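To illustrate the difference between a monotonic and a linear relationship, the sketch below compares the two coefficients on made-up data where \(y = x^3\). It computes Spearman's coefficient as Pearson's coefficient applied to the ranks, which is one standard way to handle ties through average ranks. The function names and the data are illustrative assumptions, not part of the notes above.

```python
import math


def average_ranks(values):
    """Rank from 1 to n; tied values share the average of their positions."""
    positions = {}
    for position, value in enumerate(sorted(values), start=1):
        positions.setdefault(value, []).append(position)
    return [sum(positions[v]) / len(positions[v]) for v in values]


def pearson_r(xs, ys):
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def spearman_rho(xs, ys):
    """Spearman's rho as Pearson's r computed on the ranks."""
    return pearson_r(average_ranks(xs), average_ranks(ys))


x = [1, 2, 3, 4, 5]
y = [v ** 3 for v in x]  # monotonic in x, but not linear

print(round(pearson_r(x, y), 3))   # 0.943
print(spearman_rho(x, y))          # 1.0
print(average_ranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]
```

Because \(y\) always increases with \(x\), the ranks agree perfectly and Spearman's coefficient is 1, while Pearson's coefficient falls short of 1 because the points do not lie on a straight line.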

**Formula of Spearman's Rank Correlation Coefficient:**

<img src="speraman-coeeficient.png" width="250" height="150"/>




**Example**

Let's consider a simple example to illustrate Spearman's rank correlation coefficient.

Suppose we have data on the hours of study and the corresponding exam scores of five students:

| Hours of Study | Exam Score |
|----------------|------------|
| 3              | 65         |
| 5              | 75         |
| 2              | 60         |
| 6              | 80         |
| 4              | 70         |

To compute Spearman's rank correlation coefficient, we first need to rank the data for both variables:

For Hours of Study:
- 2 -> Rank 1
- 3 -> Rank 2
- 4 -> Rank 3
- 5 -> Rank 4
- 6 -> Rank 5

For Exam Score:
- 60 -> Rank 1
- 65 -> Rank 2
- 70 -> Rank 3
- 75 -> Rank 4
- 80 -> Rank 5

Now, we calculate the differences between the ranks of corresponding pairs of observations:

| Hours of Study | Exam Score | Rank (Hours) | Rank (Exam) | \( d \)   | \( d^2 \) |
|----------------|------------|--------------|-------------|---------|----------|
| 3              | 65         | 2            | 2           | 0       | 0        |
| 5              | 75         | 4            | 4           | 0       | 0        |
| 2              | 60         | 1            | 1           | 0       | 0        |
| 6              | 80         | 5            | 5           | 0       | 0        |
| 4              | 70         | 3            | 3           | 0       | 0        |

Now, we can compute Spearman's rank correlation coefficient (\( \rho \)):

\[ \rho = 1 - \frac{6 \sum d^2}{n(n^2 - 1)} \]

\[ \rho = 1 - \frac{6 \times (0 + 0 + 0 + 0 + 0)}{5(5^2 - 1)} = 1 - \frac{0}{5 \times 24} = 1 - \frac{0}{120} = 1 - 0 = 1 \]

The Spearman's rank correlation coefficient for this dataset is 1, indicating a perfect positive monotonic relationship between the hours of study and the exam scores. In other words, as the hours of study increase, the exam scores also increase in a perfect monotonic fashion.
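The worked example can be reproduced with a small Python sketch of the \(d^2\) formula. The helper names are illustrative, and the ranking helper deliberately rejects tied values because this form of the formula is exact only when all ranks are distinct.

```python
def ranks_without_ties(values):
    """Rank from 1 (smallest) to n; raises if any value is repeated."""
    ordered = sorted(values)
    if len(set(ordered)) != len(ordered):
        raise ValueError("this sketch assumes there are no tied values")
    return [ordered.index(v) + 1 for v in values]


def spearman_rho(xs, ys):
    """rho = 1 - 6 * sum(d^2) / (n * (n^2 - 1))."""
    n = len(xs)
    rank_x = ranks_without_ties(xs)
    rank_y = ranks_without_ties(ys)
    sum_d_squared = sum((a - b) ** 2 for a, b in zip(rank_x, rank_y))
    return 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))


# Data copied from the table above.
hours = [3, 5, 2, 6, 4]
scores = [65, 75, 60, 80, 70]

print(ranks_without_ties(hours))    # [2, 4, 1, 5, 3]
print(spearman_rho(hours, scores))  # 1.0
```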

-----

The following table compares covariance and correlation:

| Feature          | Covariance                             | Correlation                         |
|------------------|----------------------------------------|-------------------------------------|
| Definition       | Measures how two variables vary together | Standardized measure of linear relationship |
| Scale            | Not standardized, depends on units      | Standardized, ranges from -1 to 1   |
| Interpretation   | Sign gives direction; magnitude is hard to interpret | Direction and strength of linear relationship |
| Range            | Can take any real value                 | Always between -1 and 1              |
| Standardization  | Not standardized                        | Covariance divided by the product of the standard deviations |
| Applicability    | Useful for understanding direction      | Widely used for interpretation and comparison |


-----

**Probability**

Probability is the measure of the likelihood of an event occurring, quantified between 0 and 1. It provides a mathematical framework to model and understand uncertainty and randomness in various phenomena.

Experiment: A repeatable procedure with a well-defined set of possible results.

Sample space: The set of all possible outcomes of an experiment.

Event: A set of one or more outcomes of an experiment.

Basic counting principle: If there are \(n_1\) ways to do one thing and \(n_2\) ways to do another thing, then there are \(n_1 \times n_2\) ways to do both things together. It is fundamental in combinatorics and is used to calculate the total number of outcomes in a sequence of events. The principle applies when the choices are independent, that is, when the number of options for one choice does not depend on what was chosen for the other.

Independent events: Events that do not affect each other's probability of occurring.
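A small sketch of the basic counting principle, using a made-up wardrobe as the example (the item lists are purely illustrative). `itertools.product` enumerates every pairing, so the number of pairs should equal \(n_1 \times n_2\).

```python
from itertools import product

# Hypothetical choices: any shirt can be worn with any pair of trousers.
shirts = ["red", "blue", "green"]   # n_1 = 3
trousers = ["black", "grey"]        # n_2 = 2

outfits = list(product(shirts, trousers))

print(len(outfits))                  # 6
print(len(shirts) * len(trousers))   # 6
```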
 

**Probability Principles**

When all outcomes are equally likely, the probability of an event is:

\[ P(\text{event}) = \frac{\text{number of favorable outcomes}}{\text{total number of possible outcomes}} \]

Relative frequency: The ratio of the number of times an event occurs to the total number of observations. It serves as an empirical estimate of probability. As the number of trials increases, the relative frequency tends to converge towards the theoretical probability of the event.
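The convergence of relative frequency can be illustrated with a simulated fair coin. This is only a sketch: the function name is made up, the seed is an arbitrary choice to make runs repeatable, and the exact figures printed depend on the random number generator.

```python
import random


def relative_frequency_of_heads(num_flips, seed=0):
    """Simulate num_flips fair coin flips and return heads / num_flips."""
    rng = random.Random(seed)  # fixed seed so repeated runs agree
    heads = sum(rng.random() < 0.5 for _ in range(num_flips))
    return heads / num_flips


# The theoretical probability of heads is 0.5.
for n in (10, 1_000, 100_000):
    print(n, relative_frequency_of_heads(n))
```

With more flips, the printed relative frequency typically settles closer to 0.5.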

The principles of a probability experiment are:
1. The exact outcome cannot be predicted in advance.
2. All possible outcomes are known.
3. All outcomes are equally likely.
4. The experiment is repeatable under uniform conditions.

**Probability Rules**

1. A probability cannot be less than 0 or greater than 1: \(0 \le P(A) \le 1\).
2. The probabilities of all outcomes in the sample space always sum to 1.
3. Complement rule: \(P(\text{not } A) = 1 - P(A)\).
4. General addition rule: \(P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)\), where \(A\) and \(B\) are events.
    Special case for disjoint events (which cannot occur together): \(P(A \text{ or } B) = P(A) + P(B)\).
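These rules can be checked on a small example. The sketch below assumes a single fair six-sided die, with events represented as sets of faces. The die, the events, and the helper `prob` are illustrative choices, and `Fraction` keeps the arithmetic exact.

```python
from fractions import Fraction

SAMPLE_SPACE = frozenset(range(1, 7))  # faces of one fair die


def prob(event):
    """Favourable outcomes divided by all (equally likely) outcomes."""
    return Fraction(len(event & SAMPLE_SPACE), len(SAMPLE_SPACE))


even = {2, 4, 6}
above_three = {4, 5, 6}

# Complement rule: P(not A) = 1 - P(A)
p_not_even = prob(SAMPLE_SPACE - even)
print(p_not_even, 1 - prob(even))  # 1/2 1/2

# General addition rule: P(A or B) = P(A) + P(B) - P(A and B)
p_union_direct = prob(even | above_three)
p_union_rule = prob(even) + prob(above_three) - prob(even & above_three)
print(p_union_direct, p_union_rule)  # 2/3 2/3

# Disjoint events: the "and" term is zero, so probabilities simply add.
one, two = {1}, {2}
print(prob(one | two), prob(one) + prob(two))  # 1/3 1/3
```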

Conditional probability is the probability of an event occurring given that another event has already occurred. It is denoted as \(P(A \mid B)\), the probability of event \(A\) given event \(B\).

**Formula of Conditional probability:**

<img src="condtional-probablity.png" width="250" height="150"/>
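As an illustration, the sketch below applies the usual definition \(P(A \mid B) = P(A \text{ and } B) / P(B)\) to two fair dice. The dice, the events, and the function name are assumptions made for the example. Events are written as functions that return `True` for the outcomes they contain.

```python
from fractions import Fraction
from itertools import product

# All 36 equally likely (first die, second die) outcomes.
OUTCOMES = list(product(range(1, 7), repeat=2))


def prob(event):
    """Fraction of outcomes for which the event holds."""
    return Fraction(sum(1 for o in OUTCOMES if event(o)), len(OUTCOMES))


def conditional_probability(event_a, event_b):
    """P(A | B) = P(A and B) / P(B); undefined when P(B) is zero."""
    p_b = prob(event_b)
    if p_b == 0:
        raise ValueError("P(B) must be greater than zero")
    p_a_and_b = prob(lambda o: event_a(o) and event_b(o))
    return p_a_and_b / p_b


def sum_is_eight(outcome):
    return outcome[0] + outcome[1] == 8


def first_is_three(outcome):
    return outcome[0] == 3


print(prob(sum_is_eight))                                   # 5/36
print(conditional_probability(sum_is_eight, first_is_three))  # 1/6
```

Knowing that the first die shows 3 changes the probability of a total of 8 from 5/36 to 1/6, because only a 5 on the second die will do.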

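**Question**:

In a group of 30 people, 16 study French and 21 study Spanish. Assuming everyone studies at least one of the two languages, find the probability of each event below for a person chosen at random.

**Answer**:

Number of people who study both French and Spanish = 16 + 21 - 30 = 7

* P(French) = 16/30
* P(Spanish) = 21/30
* P(French only) = 9/30
* P(Spanish only) = 14/30
* P(French or Spanish) = 30/30
* P(French and Spanish) = 7/30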
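The counts behind this answer can be reproduced with a short sketch. It assumes that every one of the 30 people studies at least one of the two languages, which is what makes the overlap equal to 16 + 21 - 30. The variable names are illustrative, and `Fraction` prints each probability in lowest terms (for example 16/30 appears as 8/15).

```python
from fractions import Fraction

total = 30
french = 16
spanish = 21

# Inclusion-exclusion, assuming everyone studies at least one language.
both = french + spanish - total   # 7
french_only = french - both       # 9
spanish_only = spanish - both     # 14
either = french_only + spanish_only + both  # 30

probabilities = {
    "French": Fraction(french, total),
    "Spanish": Fraction(spanish, total),
    "French only": Fraction(french_only, total),
    "Spanish only": Fraction(spanish_only, total),
    "French or Spanish": Fraction(either, total),
    "French and Spanish": Fraction(both, total),
}

for name, p in probabilities.items():
    print(f"P({name}) = {p}")
```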

**Bayes' Theorem**

**Formula of Bayes' Theorem:**

<img src="Bayes Theorem.jpg" width="550" height="250"/>
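In its standard form, Bayes' theorem states that \(P(A \mid B) = \frac{P(B \mid A)\,P(A)}{P(B)}\). As a rough illustration, the sketch below applies it to the French and Spanish question from earlier in this section, again assuming that 7 people study both languages. The function name `bayes` is made up for the example.

```python
from fractions import Fraction


def bayes(p_b_given_a, p_a, p_b):
    """P(A | B) = P(B | A) * P(A) / P(B); requires P(B) > 0."""
    if p_b == 0:
        raise ValueError("P(B) must be greater than zero")
    return p_b_given_a * p_a / p_b


p_french = Fraction(16, 30)
p_spanish = Fraction(21, 30)
# Of the 16 French students, 7 also study Spanish.
p_spanish_given_french = Fraction(7, 16)

p_french_given_spanish = bayes(p_spanish_given_french, p_french, p_spanish)
print(p_french_given_spanish)  # 1/3
```

The result agrees with counting directly: 7 of the 21 Spanish students also study French, and 7/21 = 1/3.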