## Covariance
Covariance is a statistical measure of how two random variables vary together. A positive covariance means that when one variable is above its mean, the other tends to be above its mean as well; a negative covariance means that when one is above its mean, the other tends to be below. A covariance of zero indicates no linear relationship between the two variables, although a nonlinear relationship may still exist. Because its magnitude depends on the units of both variables, covariance shows the direction of a relationship but not, on its own, its strength.

In [5]:
import numpy as np

# Sample data for two variables X and Y
X = np.array([1, 2, 3, 4, 5])
Y = np.array([2, 3, 4, 5, 6])

# Calculate covariance using NumPy
# bias=True divides by N (population covariance); the default divides by N - 1
cov_matrix = np.cov(X, Y, bias=True)
cov_XY = cov_matrix[0, 1]

print(f"Covariance between X and Y: {cov_XY}")


Covariance between X and Y: 2.0
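
To make the role of `bias=True` concrete, here is a minimal sketch that computes the covariance directly from its definition (the average product of deviations from the mean). The variable names `dx`, `dy`, `cov_population`, and `cov_sample` are illustrative and not part of the original cell.

```python
import numpy as np

X = np.array([1, 2, 3, 4, 5])
Y = np.array([2, 3, 4, 5, 6])

# Deviation of each observation from its variable's mean
dx = X - X.mean()
dy = Y - Y.mean()

# Population covariance divides by N (what np.cov(..., bias=True) returns)
cov_population = np.sum(dx * dy) / len(X)

# Sample covariance divides by N - 1 (NumPy's default for np.cov)
cov_sample = np.sum(dx * dy) / (len(X) - 1)

print(f"Population covariance: {cov_population}")
print(f"Sample covariance: {cov_sample}")
```

For this data the population value is 2.0 and the sample value is 2.5. The difference shrinks as the number of observations grows, but for small samples it is worth stating which one is being reported.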


## Correlation
Correlation is a statistical measure of the strength and direction of the linear relationship between two continuous variables. It is the covariance rescaled by the standard deviations of both variables, which makes it unitless and comparable across data sets. The correlation coefficient takes a value between -1 and 1:

A correlation of 1 indicates a perfect positive linear relationship, meaning that as one variable increases, the other also increases linearly.

A correlation of -1 indicates a perfect negative linear relationship, meaning that as one variable increases, the other decreases linearly.

A correlation of 0 indicates no linear relationship between the two variables.

In [6]:
import numpy as np

# Sample data for two variables X and Y
X = np.array([1, 2, 3, 4, 5])
Y = np.array([2, 3, 4, 5, 6])

# Calculate Pearson correlation coefficient using NumPy
corr_coefficient = np.corrcoef(X, Y)[0, 1]

print(f"Pearson Correlation between X and Y: {corr_coefficient}")


Pearson Correlation between X and Y: 0.9999999999999999
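
The link between the two measures can be sketched as follows: dividing the covariance by the product of the two standard deviations gives the Pearson correlation, and rescaling a variable changes the covariance but not the correlation. The factor of 100 and the names `X_scaled`, `cov_scaled`, and `r_scaled` are arbitrary choices for illustration.

```python
import numpy as np

X = np.array([1, 2, 3, 4, 5])
Y = np.array([2, 3, 4, 5, 6])

# Population covariance and population standard deviations (np.std uses ddof=0)
cov_XY = np.cov(X, Y, bias=True)[0, 1]
r = cov_XY / (X.std() * Y.std())

# Rescale X, for example a change of units
X_scaled = 100 * X
cov_scaled = np.cov(X_scaled, Y, bias=True)[0, 1]
r_scaled = np.corrcoef(X_scaled, Y)[0, 1]

print(f"Covariance: {cov_XY:.2f}, after rescaling X: {cov_scaled:.2f}")
print(f"Correlation: {r:.4f}, after rescaling X: {r_scaled:.4f}")
```

The covariance grows by the same factor as `X`, while the correlation stays the same up to floating-point rounding. Rounding is also why a perfect correlation is sometimes printed as 0.9999999999999999 rather than 1.0.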


In [1]:
import numpy as np
from scipy.stats import pearsonr

# Sample data for two variables X and Y
X = np.array([1, 2, 3, 4, 5])
Y = np.array([2, 3, 4, 5, 6])

# Calculate population covariance using NumPy (bias=True divides by N)
cov_matrix = np.cov(X, Y, bias=True)
cov_XY = cov_matrix[0, 1]

# Calculate Pearson correlation coefficient using SciPy
corr_coefficient, _ = pearsonr(X, Y)

print(f"Covariance between X and Y: {cov_XY}")
print(f"Pearson Correlation between X and Y: {corr_coefficient}")


Covariance between X and Y: 2.0
Pearson Correlation between X and Y: 1.0


In [2]:
import numpy as np
from scipy.stats import pearsonr

# Generate random data for two variables X and Y
np.random.seed(0)  # for reproducibility
X = np.random.rand(100)  # Generate 100 random data points for X
Y = 2 * X + np.random.randn(100)  # Generate Y with a linear relationship to X and some noise

# Calculate population covariance using NumPy (bias=True divides by N)
cov_matrix = np.cov(X, Y, bias=True)
cov_XY = cov_matrix[0, 1]

# Calculate Pearson correlation coefficient using SciPy
corr_coefficient, _ = pearsonr(X, Y)

print("Randomly generated data:")
print(f"Covariance between X and Y: {cov_XY}")
print(f"Pearson Correlation between X and Y: {corr_coefficient}")


Randomly generated data:
Covariance between X and Y: 0.16099380765604412
Pearson Correlation between X and Y: 0.4889650609708346


### Pearson correlation coefficient
The Pearson correlation coefficient, often referred to as Pearson's r or simply the correlation coefficient, is a statistical measure that quantifies the strength and direction of the linear relationship between two continuous variables. It measures how well the relationship between the two variables can be described by a straight line. The Pearson correlation coefficient ranges from -1 to 1, with the following interpretations:

If r = 1, it indicates a perfect positive linear relationship. This means that all data points lie exactly on a straight line with positive slope.

If r = -1, it indicates a perfect negative linear relationship. This means that all data points lie exactly on a straight line with negative slope.

If r = 0, it indicates no linear relationship between the two variables.

In [3]:
import numpy as np

# Sample data for two variables X and Y
X = np.array([1, 2, 3, 4, 5])
Y = np.array([2, 3, 4, 5, 6])

# Calculate Pearson correlation coefficient using NumPy
corr_coefficient = np.corrcoef(X, Y)[0, 1]

print(f"Pearson Correlation between X and Y: {corr_coefficient}")


Pearson Correlation between X and Y: 0.9999999999999999
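
A value of r = 0 rules out only a linear relationship, not dependence in general. As a small illustration, with data chosen for this purpose rather than taken from the examples above, the sketch below uses Y = X² on values of X that are symmetric around zero. Y is fully determined by X, yet the Pearson coefficient is zero.

```python
import numpy as np

# X is symmetric around zero; Y depends on X exactly, but not linearly
X = np.array([-2, -1, 0, 1, 2])
Y = X ** 2

r = np.corrcoef(X, Y)[0, 1]

print(f"Pearson correlation between X and X**2: {r:.4f}")
```

The positive and negative deviations of X cancel when multiplied by the deviations of Y, so the covariance, and with it r, comes out as zero.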


### Spearman rank correlation coefficient
The Spearman rank correlation coefficient, often referred to as Spearman's rho (ρ), is a non-parametric measure of statistical dependence between two variables. It assesses the strength and direction of the monotonic relationship between two continuous or ordinal variables, regardless of whether the relationship is linear. Spearman's rank correlation is based on the ranks (orderings) of the data rather than their actual values.

The Spearman rank correlation coefficient ranges from -1 to 1, with the following interpretations:

If ρ = 1, it indicates a perfect positive monotonic relationship. This means that as one variable increases, the other variable also increases monotonically.

If ρ = -1, it indicates a perfect negative monotonic relationship. This means that as one variable increases, the other variable decreases monotonically.

If ρ = 0, it indicates no monotonic relationship between the two variables.

In [4]:
from scipy.stats import spearmanr

# Sample data for two variables X and Y
X = [10, 20, 30, 40, 50]
Y = [5, 15, 25, 35, 45]

# Calculate Spearman rank correlation coefficient using SciPy
rho, p_value = spearmanr(X, Y)

print(f"Spearman Rank Correlation (rho): {rho}")


Spearman Rank Correlation (rho): 0.9999999999999999
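
The sample data above is perfectly linear, so it does not show how Spearman's rho differs from Pearson's r. The following sketch, using an exponential relationship chosen purely for illustration, compares the two on data that is monotonic but not linear. It also checks that Spearman's rho equals the Pearson correlation of the ranks, which are computed here with `scipy.stats.rankdata`.

```python
import numpy as np
from scipy.stats import pearsonr, rankdata, spearmanr

# Monotonic but strongly nonlinear relationship
X = np.array([1, 2, 3, 4, 5])
Y = np.exp(X)

rho, _ = spearmanr(X, Y)  # based on ranks only
r, _ = pearsonr(X, Y)     # based on the actual values

# Spearman's rho is the Pearson correlation of the rank-transformed data
rho_from_ranks, _ = pearsonr(rankdata(X), rankdata(Y))

print(f"Spearman rho: {rho:.4f}")
print(f"Pearson r: {r:.4f}")
print(f"Pearson r on ranks: {rho_from_ranks:.4f}")
```

Because Y always increases with X, the ranks agree exactly and rho is 1 up to rounding. Pearson's r is noticeably lower because the points do not lie on a straight line.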
