
# **Topic 20: NumPy statistical functions**

NumPy is a widely used Python library for scientific computing that provides a variety of mathematical and statistical functions to perform numerical calculations efficiently. In particular, NumPy's statistical functions are powerful tools for analyzing data and generating descriptive statistics.

Here are some of the most commonly used statistical functions in NumPy:

1. mean(): Calculates the arithmetic mean of the given data.

2. median(): Computes the median value of the given data.

3. std(): Calculates the standard deviation of the given data.

4. var(): Computes the variance of the given data.

5. min(): Returns the smallest value in the data.

6. max(): Returns the largest value in the data.

7. percentile(): Computes the q-th percentile of the data.

8. histogram(): Computes the histogram of the data.

9. corrcoef(): Computes the correlation coefficient between two variables.

10. ptp(): Returns the range of values (maximum minus minimum) along an axis. The name is short for "peak to peak".

11. average(): Computes the weighted average along an axis of an array, with the weights supplied in a separate array.

Most of these functions accept an axis argument, so they can be applied to one-dimensional or multi-dimensional arrays, and they can also be applied to a slice of an array to work on a subset of the data. Because the computations run in compiled code rather than Python loops, they remain fast on large arrays, which makes them a common choice for data analysis and scientific computing tasks.

**Type (01): numpy.mean() function**

The NumPy mean() function is used to calculate the arithmetic mean or average of the given data along a specified axis. The mean value is calculated by summing up all the elements in the array and dividing the total by the number of elements.

The syntax for using the mean() function is:

numpy.mean(arr, axis=None)

where arr is the input array and axis is the axis along which the mean needs to be calculated. If the axis is not specified, the mean is calculated over the entire array.

**Example 01: Calculate the mean of a one-dimensional array:**

In [None]:



import numpy as np
arr = np.array([1, 2, 3, 4, 5])
mean = np.mean(arr)
print(mean)

3.0


**Explanation**

In this example, we used the np.mean() function to calculate the mean of an array of numbers. The function takes an array as input and returns the mean value of the array. Here, the mean value of the array [1, 2, 3, 4, 5] is 3.0.

**Example 02: Calculate the mean of a two-dimensional array along the columns (axis=0):**


In [None]:

import numpy as np
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("Original Array:",a)
print()
b = np.mean(a, axis=0)
print("mean values:",b)


**Explanation**

In this example, we calculated the mean values of an array along a specific axis using the np.mean() function in NumPy. The mean is calculated by summing up the values along the specified axis and dividing by the number of elements along that axis. In this case, we calculated the mean values along axis 0, which means we are calculating the mean values along the columns of the matrix. Therefore, the mean values of the columns are [4, 5, 6].



**Example 03: Calculate the mean of a two-dimensional array along the rows (axis=1):**




In [None]:

import numpy as np
a = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("Original Array:",a)
print()
b = np.mean(a, axis=1)
print("mean values of the rows:",b)

Original Array: [[1 2 3]
 [4 5 6]
 [7 8 9]]

mean values of the rows: [2. 5. 8.]


     
**Explanation**

In this example, we calculated the mean values of an array along a specific axis using the np.mean() function in NumPy. The mean is calculated by summing up the values along the specified axis and dividing by the number of elements along that axis. In this case, we calculated the mean values along axis 1, which means we are calculating the mean values along the rows of the matrix. Therefore, the mean values of the rows are [2, 5, 8].

**Example 04: Calculate the weighted mean of an array:**




In [None]:
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
print("original array:",arr)
print()
weights = np.array([1, 2, 3, 4, 5])
print("weight array:",weights)
print()
mean = np.average(arr, weights=weights)
print(mean)


original array: [1 2 3 4 5]

weight array: [1 2 3 4 5]

3.6666666666666665


In [None]:
a= np.array([1,2,3,4,5,6])
b=np.average(a)
print(b)

3.5



**Explanation**

In this example, we calculated the weighted mean value of an array using the np.average() function in NumPy. The weighted mean is calculated by multiplying each element of the original array by its corresponding weight, then taking the sum of these products and dividing it by the sum of the weights. Therefore, the weighted mean value of this array is (1 * 1 + 2 * 2 + 3 * 3 + 4 * 4 + 5 * 5) / (1 + 2 + 3 + 4 + 5), which is equal to 3.6666666666666665.
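The weighted-mean formula described above can be checked by hand. The following is a minimal sketch; the variable names `manual` and `builtin` are illustrative choices and not part of the original notebook.

```python
import numpy as np

arr = np.array([1, 2, 3, 4, 5])
weights = np.array([1, 2, 3, 4, 5])

# Weighted mean by hand: sum of (value * weight) divided by the sum of the weights
manual = np.sum(arr * weights) / np.sum(weights)

# np.average applies the same formula when weights are supplied
builtin = np.average(arr, weights=weights)

print(manual, builtin)
print(np.isclose(manual, builtin))
```

Comparing with np.isclose() rather than == avoids false mismatches caused by floating-point rounding.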

**Example 05: Calculate the mean of a complex array:**




In [None]:

import numpy as np
arr = np.array([1+2j, 3+4j, 5+6j])
mean = np.mean(arr)
print(mean)


(3+4j)


**Explanation**

In this example, we calculated the mean value of an array with complex numbers. The mean() function in NumPy calculates the arithmetic mean of the complex numbers by taking the mean of their real and imaginary parts separately. Therefore, the mean value of this array is (1 + 3 + 5) / 3 + (2 + 4 + 6) / 3 * 1j, which is equal to (3 + 4j).

**Example 06: Calculate the mean of an array with NaN (Not a Number) values:**





In [None]:
import numpy as np
arr = np.array([1, 2, np.nan, 4, 5])
print("original array:",arr)
print()
mean = np.nanmean(arr)
print("mean of the remaining values:",mean)

original array: [ 1.  2. nan  4.  5.]

mean of the remaining values: 3.0


     
**Explanation**

In this example, we calculated the mean value of an array which has one NaN value. The nanmean() function in NumPy ignores any NaN values and calculates the mean of the remaining values. Therefore, the mean value of this array is (1 + 2 + 4 + 5) / 4, which is equal to 3.
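To see why nanmean() is needed here, it can help to compare it with mean() on the same data. This is a small sketch; the names `plain_mean` and `nan_aware_mean` are illustrative and do not appear in the original notebook.

```python
import numpy as np

arr = np.array([1, 2, np.nan, 4, 5])

# np.mean propagates NaN: a single NaN makes the whole result NaN
plain_mean = np.mean(arr)

# np.nanmean skips the NaN entry and averages the remaining four values
nan_aware_mean = np.nanmean(arr)

print(np.isnan(plain_mean))   # True
print(nan_aware_mean)         # 3.0
```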

**Example 07: Calculate the mean of a boolean array:**





In [None]:

import numpy as np
arr = np.array([True, False, True, True, False])
print("original array:",arr)
print()
mean = np.mean(arr)
print("mean of boolean:",mean)


original array: [ True False  True  True False]

mean of boolean: 0.6


**Explanation**

In this example, we calculated the mean value of a boolean array which has 3 True values and 2 False values. NumPy treats True as 1 and False as 0, so the mean is the number of True values divided by the total number of elements in the array. Therefore, the mean value is 3/5, which is equal to 0.6.


**Example 08:Calculate the mean of a large array:**



In [None]:

import numpy as np
arr = np.random.rand(1000000)
print("original array:",arr)
print()
mean = np.mean(arr)
print("mean of large number:",mean)

original array: [0.07118869 0.43097946 0.44780351 ... 0.79756559 0.40501177 0.1606503 ]

mean of large number: 0.5002965597438617


**Explanation**

In this example, we generated an array of 1,000,000 random numbers between 0 and 1 using the np.random.rand() function.

We then calculated the mean value of this large array using the mean() function in NumPy, which returns the average value of all elements in the array. The mean value of this array is approximately 0.5, which is the expected value of a uniform distribution between 0 and 1.

# **Type (02): numpy.median() function**

The NumPy library in Python provides a function called "numpy.median()" that calculates the median value of a given array or list of numbers. The median is the middle value in a sorted list of numbers, or the average of the two middle values if the list has an even number of elements.

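The **numpy.median()** function takes the array or list of numbers as its first argument, plus an optional axis argument that selects the axis along which the median is computed. For the numeric inputs used here it returns the median as a float, or as an array of floats when an axis is given.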
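The difference between an odd and an even number of elements can be shown with two small arrays. This is an illustrative sketch; the array names `odd` and `even` are not from the original notebook.

```python
import numpy as np

odd = np.array([3, 1, 4, 2, 5])   # sorted: 1 2 3 4 5, the middle value is 3
even = np.array([3, 1, 4, 2])     # sorted: 1 2 3 4, the middle pair is 2 and 3

print(np.median(odd))    # 3.0
print(np.median(even))   # 2.5, the average of 2 and 3
```

The input does not need to be sorted beforehand; np.median() handles the ordering internally.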

**Example 01:Finding the median of an array of integers:**


In [None]:
import numpy as np
arr = np.array([3, 1, 4, 2, 5])
median = np.median(arr)
print(median)

3.0


**Example 03: Finding the median of a tuple of numbers:**


In [None]:
import numpy as np
tup = (3, 1, 4, 2, 5)
median = np.median(tup)
print(median)


3.0


**Example 04:Finding the median of a 2D array along axis 0 (columns)**
     
**Explanation**

The cell below prints the median of each column, which is [5.5, 6.5, 7.5]. The first median value is the median of [1, 4, 7, 10], the second is the median of [2, 5, 8, 11], and the third is the median of [3, 6, 9, 12].


In [None]:
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9],[10,11,12]])
print("original array:",arr)
print()
median= np.median(arr, axis=0)
print(median)

original array: [[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]

[5.5 6.5 7.5]


**Example 05:Finding the median of a 2D array along axis 1 (rows)**



     






In [None]:
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9],[10,11,12]])
print("original array:",arr)
print()
median= np.median(arr, axis=1)
print(median)

original array: [[ 1  2  3]
 [ 4  5  6]
 [ 7  8  9]
 [10 11 12]]

[ 2.  5.  8. 11.]


**Explanation**

The median values of each row are printed to the console, which are [2. 5. 8. 11.]. Therefore, the first median value is the median of [1, 2, 3], the second median value is the median of [4, 5, 6], the third median value is the median of [7, 8, 9], and the fourth median value is the median of [10, 11, 12].

**Example 06: Finding the median of a slice of an array:**



     


In [None]:
import numpy as np
a = np.array([1,2,3,4,5,6,7,8])
median= np.median(a[2:6])
print(median)

4.5


**Explanation**

The given code creates a numpy array named a that contains the integers 1 through 8.

Next, the np.median() function is used to calculate the median of the slice a[2:6], which starts at index 2 and stops before index 6 (the stop index is excluded). This subarray contains the elements [3, 4, 5, 6]. The median of this subarray is calculated to be 4.5, which is saved to the variable named median.

# **Type (03): numpy.std() function**

The numpy.std() function in Python's NumPy library is used to calculate the standard deviation of an array or a set of values. Standard deviation measures the amount of variation or dispersion in a dataset.

The formula for calculating the standard deviation is as follows:

std = sqrt(mean((x - x.mean())**2))

where:

x is the input array or set of values

mean() calculates the mean (average) of x

sqrt() is the square root function

The steps involved in calculating the standard deviation using numpy.std() are as follows:

1. Calculate the mean of the array or set of values.

2. Calculate the squared difference between each element in the array and the mean.

3. Calculate the mean of the squared differences.

4. Take the square root of the mean of squared differences to obtain the standard deviation.

The numpy.std() function allows you to calculate the standard deviation directly without explicitly performing these steps. It takes the array or set of values as its input and returns the standard deviation as the output.
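The four steps listed above can be written out explicitly and compared with numpy.std(). This is a minimal sketch; the intermediate variable names are illustrative and not part of the original notebook.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])

mean = x.mean()                            # step 1: mean of the values
squared_diff = (x - mean) ** 2             # step 2: squared difference from the mean
mean_squared_diff = squared_diff.mean()    # step 3: mean of the squared differences
manual_std = np.sqrt(mean_squared_diff)    # step 4: square root

print(manual_std)
print(np.isclose(manual_std, np.std(x)))   # True
```

Note that numpy.std() divides by N by default (ddof=0), which gives the population standard deviation. Passing ddof=1 divides by N - 1 instead, which gives the sample standard deviation.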

**Example 1: Calculate the standard deviation of an array of integers.**

In [None]:

import numpy as np

arr = np.array([1, 2, 3, 4, 5])
std_dev = np.std(arr)

print(std_dev)

1.4142135623730951


**Example 2: Calculate the standard deviation of a 2D array.**



In [None]:
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])
std_dev = np.std(arr)

print(std_dev)

1.707825127659933


**Example 3: Calculate the standard deviation of a 2D array along a specified axis.**



     



     

In [None]:
import numpy as np

arr = np.array([[1, 2, 3], [4, 5, 6]])
std_dev = np.std(arr, axis=0)

print(std_dev)

**Example 4: Calculate the standard deviation of a random array using a specified seed.**

In [None]:

import numpy as np
np.random.seed(42)
arr = np.random.rand(5)
print("original array:",arr)
print()
std_dev = np.std(arr)

print(std_dev)

**Example 5: Calculate the standard deviation of a masked array.**



In [None]:

import numpy as np

arr = np.ma.array([1, 2, 3, 4, 5], mask=[True, False, True, False, False])
print("original array:",arr)
print()
std_dev = np.std(arr)

print(std_dev)

original array: [-- 2 -- 4 5]

1.247219128924647


**Example 6: Calculate the standard deviation of a specified portion of an array.**




In [None]:
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
std_dev = np.std(arr[2:])

print(std_dev)

0.816496580927726


# **Type (04): numpy.var() function**

The numpy.var() function in the NumPy library is used to calculate the variance of a given array or a specific axis of the array. Variance is a statistical measure that describes the spread or dispersion of a set of values.

The formula to calculate the variance is as follows:

variance = sum((x - mean) ** 2) / N

where:

x represents each element of the array

mean is the mean (average) of the array

N is the number of elements in the array

In other words, the variance is the average of the squared differences between each element in the array and the mean.

The numpy.var() function also lets you calculate the variance along a specific axis of a multi-dimensional array. By default, if no axis is specified, the variance is calculated for the flattened array.
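The variance formula can be checked against numpy.var() directly. The sketch below uses illustrative variable names (`population_var`, `sample_var`) that are not part of the original notebook, and also shows the ddof parameter, which changes the divisor from N to N - ddof.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
N = x.size

# Sum of squared differences from the mean
sum_sq = np.sum((x - x.mean()) ** 2)

population_var = sum_sq / N        # the formula above (divide by N)
sample_var = sum_sq / (N - 1)      # sample variance (divide by N - 1)

print(population_var, np.var(x))          # 2.0 2.0
print(sample_var, np.var(x, ddof=1))      # 2.5 2.5
```

The default, ddof=0, matches the formula in this section. Use ddof=1 when the data is a sample and an unbiased estimate of the population variance is wanted.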

**Example 1: Calculate the variance of an array of integers.**




In [None]:
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
variance = np.var(arr)

print(variance)

2.0


**Example 2: Calculate the variance of a 2D array.**







     





In [None]:
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
variance = np.var(arr)

print(variance)

     
**Example 3: Calculate the variance of a 2D array along a specified axis.**

In [None]:
import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
variance = np.var(arr,axis=0)

print(variance)

**Example 4: Calculate the variance of a masked array.**

In [None]:
import numpy as np

arr = np.ma.array([1, 2, 3, 4, 5], mask=[True, False, True, False, False])
print("original array:",arr)
print()
variance = np.var(arr)

print(variance)

**Example 5: Calculate the variance of a specified portion of an array.**




In [None]:
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
variance = np.var(arr[2:])

print(variance)


# **Type (05): numpy.min() function**

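The numpy.min() function is used to find the minimum value in a NumPy array. It returns the smallest element present in the array, or the smallest element along each slice when an axis is given.

Here's a simplified form of the numpy.min() syntax, showing the most commonly used parameters: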

numpy.min(a, axis=None, keepdims=False, initial=None)

Parameters:

a: This is the input array from which you want to find the minimum value.

axis (optional): This parameter specifies the axis along which the minimum values are computed. By default, it is None, which means the minimum value is calculated over the flattened array.

keepdims (optional): This parameter determines whether the dimensions of the output should be the same as the input. If it is set to True, the dimensions are retained; otherwise, they are removed. By default, it is set to False.

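initial (optional): This parameter sets a starting value for the comparison, so the result can never be larger than it. It is needed when reducing an empty array, which otherwise raises an error. By default, no initial value is used.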
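The keepdims and initial parameters are easiest to understand by looking at their effect on the result. The following is a short sketch; the names `row_min` and `empty` are illustrative and not part of the original notebook.

```python
import numpy as np

arr = np.array([[5, 2, 9], [1, 7, 3]])

# keepdims=True keeps the reduced axis in the result with length 1
row_min = np.min(arr, axis=1, keepdims=True)
print(row_min.shape)                 # (2, 1)

# initial supplies a starting value, so an empty array can be reduced
empty = np.array([], dtype=int)
print(np.min(empty, initial=100))    # 100

# initial also takes part in the comparison, so it caps the result
print(np.min(arr, initial=0))        # 0
```

Without initial, calling np.min() on an empty array raises a ValueError.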



**Example 1:Finding the minimum value in an array:**


In [None]:

import numpy as np
arr = np.array([5, 2, 9, 1, 7])
min_value = np.min(arr)
print("Minimum value:", min_value)


Minimum value: 1


**Example 2: Finding the minimum value in a 2D array flattened to a 1D array:**



In [None]:
import numpy as np
arr = np.array([[5, 2, 9], [1, 7, 3]])
print("1D array:",arr.flatten())
print()
min_value = np.min(arr.flatten())
print("Minimum value (flattened array):", min_value)




     
**Example 3: Finding the minimum value along a specific axis in a multidimensional array:**




In [None]:

import numpy as np
arr = np.array([[5, 2, 9],[1, 7, 3]])
print("Original array:",arr)
min_values = np.min(arr, axis=1)
print("Minimum values along axis 1:", min_values)
print()
min_values = np.min(arr, axis=0)
print("Minimum values along axis 0:", min_values)

Original array: [[5 2 9]
 [1 7 3]]
Minimum values along axis 1: [2 1]

Minimum values along axis 0: [1 2 3]


     
**Example 4: Finding the minimum value element-wise between two arrays:**



In [None]:

import numpy as np
arr1 = np.array([5, 2, 9, 1, 7])
arr2 = np.array([3, 6, 8, 2, 4])
min_values = np.minimum(arr1, arr2)
print("Minimum values (element-wise):", min_values)

Minimum values (element-wise): [3 2 8 1 4]


**Example 5: Finding the minimum value along axis 1 in a 3D array:**

In [None]:

import numpy as np
arr = np.array([[[5, 2, 9],  [1, 7, 3]], [[4, 6, 8], [3, 2, 1]]])
print(arr)
print()
min_values = np.min(arr, axis=1)
print("Minimum values along axis 1:", min_values)

# **Type (06): numpy.max() function**

The numpy.max() function is a part of the NumPy library, which is used for numerical computations in Python. It is specifically designed to find the maximum value in an array or along a specified axis.

Here's a breakdown of how the numpy.max() function works:

Syntax:

numpy.max(a, axis=None, keepdims=False)

Parameters:

a: The input array or object to find the maximum value from.

axis (optional): It specifies the axis along which the maximum value should be determined. If not provided, the maximum value of the flattened array will be returned.

keepdims (optional): If set to True, the reduced axes are kept in the result with size one, so the output has the same number of dimensions as the input and broadcasts correctly against it. If set to False or not provided, the reduced axes are removed.
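The effect of keepdims is easiest to see by comparing result shapes. The sketch below is illustrative; the names `reduced`, `kept`, and `scaled` are assumptions made for this example and do not appear in the original notebook.

```python
import numpy as np

arr = np.array([[5, 2, 9], [1, 7, 3]])

reduced = np.max(arr, axis=0)                  # axis 0 removed: shape (3,)
kept = np.max(arr, axis=0, keepdims=True)      # axis 0 kept with length 1: shape (1, 3)

print(reduced, reduced.shape)
print(kept, kept.shape)

# A kept axis of length 1 broadcasts against the original array,
# for example to divide each row by its own maximum
scaled = arr / np.max(arr, axis=1, keepdims=True)
print(scaled)
```

After the division, the largest entry in every row of `scaled` is 1.0.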

**Example 1:Finding the maximum value in an array:**



     

     

In [None]:
import numpy as np
arr = np.array([5, 2, 9, 1, 7])
max_value = np.max(arr)
print("Maximum value:", max_value)

Maximum value: 9


**Example 3: Finding the maximum value in a 2D array flattened to a 1D array:**




In [None]:
import numpy as np
arr = np.array([[5, 2, 9], [1, 7, 3]])
print("1D array:",arr.flatten())
print()
max_value = np.max(arr.flatten())
print("Maximum value (flattened array):", max_value)

**Example 2: Finding the maximum value along a specific axis in a multidimensional array:**




In [None]:

import numpy as np
arr = np.array([[5, 2, 9],[1, 7, 3]])
print("Original array:",arr)
max_values = np.max(arr, axis=1)
print("Maximum values along axis 1:", max_values)
print()
max_values = np.max(arr, axis=0)
print("Maximum values along axis 0:", max_values)


# **Type 7: numpy.percentile() function**

The numpy.percentile() function in the NumPy library is used to calculate the q-th percentile of a given dataset along a specified axis. Percentiles divide a sorted dataset by percentage: the q-th percentile is the value below which q percent of the data falls. For example, the 50th percentile is the median.

**The general syntax of the numpy.percentile() function is as follows:**

numpy.percentile(a, q, axis=None, interpolation='linear')

Here's an explanation of the function's parameters:

**a:** This parameter represents the input array or object containing the dataset from which the percentile is calculated.

**q:** This parameter specifies the percentile value(s) to compute. It can be a single value or an array-like object containing multiple percentiles.

**axis:** This optional parameter indicates the axis along which the percentiles are computed. By default, the percentile is calculated over the flattened array.

**interpolation:** This optional parameter specifies the interpolation method to be used if the desired percentile lies between two data points. It can take values like 'linear', 'lower', 'higher', 'midpoint', or 'nearest'. The default value is 'linear'. In NumPy 1.22 and later this parameter is named method, and interpolation is deprecated.

To calculate the percentile, numpy.percentile() follows these steps:

1. Sort the dataset in ascending order.

2. Calculate the rank of the percentile using the formula:

   rank = (percentile / 100) * (N - 1) + 1,

   where N is the total number of data points and rank 1 is the smallest value.

3. If the rank is an integer, the percentile value is the corresponding data point in the sorted dataset.

4. If the rank is not an integer, interpolate the percentile value based on the chosen interpolation method and the surrounding data points.

The numpy.percentile() function efficiently performs these calculations for you, allowing you to easily compute percentiles in NumPy arrays.
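The rank formula can be worked through by hand for the default linear interpolation. This is a minimal sketch under that assumption; the helper variables (`rank`, `lower`, `frac`, `manual`) are illustrative and not part of NumPy or the original notebook.

```python
import numpy as np

data = np.array([3, 7, 11, 14, 18, 19, 21, 24, 31, 34, 36])
q = 35

sorted_data = np.sort(data)
N = sorted_data.size

# 1-based rank of the q-th percentile
rank = (q / 100) * (N - 1) + 1

# Split the rank into the position just below it and the fractional remainder
lower = int(np.floor(rank))
frac = rank - lower

# Convert 1-based positions to 0-based indices; clamp the upper neighbour
lo_val = sorted_data[lower - 1]
hi_val = sorted_data[min(lower, N - 1)]

# Linear interpolation between the two neighbouring data points
manual = lo_val + frac * (hi_val - lo_val)

print(manual)
print(np.isclose(manual, np.percentile(data, q)))
```

Here the rank is 4.5, so the result lies halfway between the 4th and 5th sorted values (14 and 18).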

**Example 1: Calculating the 50th percentile (median) of an array:**

In [None]:
import numpy as np
data = np.array([1, 2, 3, 4, 5])
median = np.percentile(data, 50)
print(median)


3.0


In [None]:
import numpy as np
data = np.array([3, 7, 11, 14, 18, 19, 21, 24, 31, 34, 36 ])
percentile_value = np.percentile(data, 35)
print("The 35th percentile of the dataset is:", percentile_value)


The 35th percentile of the dataset is: 16.0


In [None]:
##Example 03: Calculating the 90th percentile of a 2D array along an axis:


import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6]])
percentile_90 = np.percentile(arr, 90, axis=1)
print(percentile_90)

[2.8 5.8]


In [None]:
#Example 04


import numpy as np

data = np.array([3, 7, 11, 14, 18, 19, 21, 24, 31, 34, 36])
percentile_value = np.percentile(data, 35)

print("The 35th percentile of the data is:", percentile_value)

The 35th percentile of the data is: 16.0


In [None]:
#Example 05: Calculating multiple percentiles simultaneously:


arr = np.array([1, 2, 3, 4, 5])
percentiles = np.percentile(arr, [25, 50, 75])
print(percentiles)




[2. 3. 4.]


# **Type 8: numpy.histogram() function**

The NumPy library provides a function called numpy.histogram() that computes the histogram of a dataset. A histogram summarizes the distribution of data by counting how many values fall into each of a set of value ranges (bins). numpy.histogram() returns these counts and the bin edges as arrays; it does not draw a plot.

Here's an explanation of the numpy.histogram() function:

**Syntax:**

numpy.histogram(a, bins=10, range=None, density=False, weights=None)

**Parameters:**

**a:** This parameter represents the input array or sequence of data values.

**bins:** It specifies the bins to use in the histogram. This can be an integer giving the number of equal-width bins (10 by default), or a sequence defining the bin edges.

**range:** It defines the lower and upper range of the bins. If not provided, the range is determined by the minimum and maximum values of the data.

**density:** If set to True, the histogram is normalized to form a probability density. If set to False (default), the count of data points in each bin is returned.

**weights:** This parameter is an optional array of weights for each data point.

**Return values:**

**hist:** This represents the histogram values, i.e., the count of data points in each bin.

**bin_edges:** The bin edges (length of hist + 1) that define the bin intervals.

The numpy.histogram() function calculates the histogram of the input data and returns the histogram values and bin edges. You can use these values to visualize and analyze the distribution of your dataset.
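The density and weights parameters described above can be illustrated with a small dataset. This is a sketch only; the variable names (`counts`, `edges`, `density`, `weighted`) are illustrative choices and not part of the original notebook.

```python
import numpy as np

data = np.array([1, 2, 3, 1, 2, 4, 5, 3, 2, 4])

# Four equal-width bins over [1, 5]; the last bin includes its right edge
counts, edges = np.histogram(data, bins=4, range=(1, 5))
print(counts)   # number of values in each bin
print(edges)    # one more edge than there are bins

# density=True rescales the counts so that the total area equals 1
density, _ = np.histogram(data, bins=4, range=(1, 5), density=True)
print(np.sum(density * np.diff(edges)))

# weights: each data point contributes its weight instead of 1
weights = np.full(data.shape, 0.5)
weighted, _ = np.histogram(data, bins=4, range=(1, 5), weights=weights)
print(weighted)
```

With a weight of 0.5 on every point, each weighted bin value is half of the corresponding count.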

**Example 01: Creating a histogram with default settings:**

In [None]:
import numpy as np
data = np.array([1, 2, 3, 1, 2, 4, 5, 3, 2, 4])
hist, bins = np.histogram(data)
print(hist)
print(bins)







[2 0 3 0 0 2 0 2 0 1]
[1.  1.4 1.8 2.2 2.6 3.  3.4 3.8 4.2 4.6 5. ]


In [None]:
#Example 02: Specifying the number of bins for the histogram:


data = np.array([1, 2, 3, 1, 2, 4, 5, 3, 2, 4])
hist, bins = np.histogram(data,bins=3)
print(hist)
print(bins)

[5 2 3]
[1.         2.33333333 3.66666667 5.        ]


In [None]:
#Example 03: Setting custom bin boundaries for the histogram:


data = np.array([1, 2, 3, 1, 2, 4, 5, 3, 2, 4])
hist, bins = np.histogram(data, bins=[0, 2, 4, 6])
print(hist)
print(bins)


[2 5 3]
[0 2 4 6]


In [None]:
# Example 04:

import numpy as np
data = np.array([18, 20, 18, 25, 20, 30, 32, 35, 38, 40])
hist, bin_edges = np.histogram(data)
print("Histogram values:",hist)
print("Bin Edges:", bin_edges)
print("Counts:", hist)

Histogram values: [4 0 0 1 0 1 1 1 0 2]
Bin Edges: [18.  20.2 22.4 24.6 26.8 29.  31.2 33.4 35.6 37.8 40. ]
Counts: [4 0 0 1 0 1 1 1 0 2]


In [None]:
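import numpy as np
data = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5])
# np.histogram returns a tuple: (counts per bin, bin edges)
hist, bin_edges = np.histogram(data)
print("Histogram values:", hist)
print("Bin Edges:", bin_edges)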


In [None]:
#Example 05: Specifying the histogram range:

data = np.array([1, 2, 3, 1, 2, 4, 5, 3, 2, 4])
hist, bins = np.histogram(data, range=(1, 5))
print(hist)
print(bins)

[2 0 3 0 0 2 0 2 0 1]
[1.  1.4 1.8 2.2 2.6 3.  3.4 3.8 4.2 4.6 5. ]


# **Type 9: numpy.corrcoef() function**

The numpy.corrcoef() function is a NumPy method used to compute the correlation coefficient between two or more variables in a dataset. It takes a dataset as input and returns a correlation matrix, which shows the pairwise correlations between the variables.

By default (rowvar=True), the function treats each row of the input as a variable and each column as an observation. For the opposite layout, where each column is a variable and each row is an observation, pass rowvar=False. Calling numpy.corrcoef(data), where data is the dataset, then returns the correlation matrix.

The resulting correlation matrix is a square matrix where the diagonal elements are always 1, as a variable is perfectly correlated with itself. The off-diagonal elements represent the pairwise correlations between different variables. The correlation coefficients range from -1 to +1, indicating the strength and direction of the relationship between variables.

The numpy.corrcoef() function is commonly used in data analysis and statistical computations to understand the dependencies and relationships between variables. It provides valuable insights into the patterns and associations within a dataset, aiding in decision-making and further analysis tasks such as feature selection, regression modeling, and data visualization.

| Strength | Positive correlation | Negative correlation |
|---|---|---|
| Perfect | r = 0.9 to 1 | r = -0.9 to -1 |
| Strong | r = 0.5 to 0.9 | r = -0.5 to -0.9 |
| Weak | r = 0.1 to 0.5 | r = -0.1 to -0.5 |
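To make the correlation coefficient concrete, the sketch below computes Pearson's r by hand and compares it with numpy.corrcoef(). It also shows how the rowvar parameter relates to the layout of the input. The data and variable names are illustrative assumptions, not taken from the original notebook.

```python
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([2, 1, 4, 3, 5])

# Pearson's r by hand: covariance term divided by the product of the spreads
dx = x - x.mean()
dy = y - y.mean()
manual_r = np.sum(dx * dy) / np.sqrt(np.sum(dx ** 2) * np.sum(dy ** 2))

print(manual_r)
print(np.isclose(manual_r, np.corrcoef(x, y)[0, 1]))

# Default layout (rowvar=True): each row is a variable
by_rows = np.corrcoef(np.vstack([x, y]))

# Alternative layout: each column is a variable, so pass rowvar=False
by_cols = np.corrcoef(np.column_stack([x, y]), rowvar=False)

print(np.allclose(by_rows, by_cols))
```

Both layouts describe the same two variables, so the two correlation matrices agree.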

**Example 01:**

In [None]:
import numpy as np

x = np.array([1, 2, 3, 4, 5])
y = np.array([5, 4, 3, 2, 1])
print(x)

# Compute correlation coefficient
corr_matrix = np.corrcoef(x, y)
print(corr_matrix)
print()
# Accessing the correlation coefficient between x and y
correlation_coefficient = corr_matrix[0, 1]
print("Correlation coefficient between x and y:", correlation_coefficient)


[1 2 3 4 5]
[[ 1. -1.]
 [-1.  1.]]

Correlation coefficient between x and y: -0.9999999999999999


In [None]:

# Example 02


import numpy as np
x = np.array([1, 2, 3, 4, 5])
y = np.array([6, 7, 8, 9, 10])
correlation_matrix = np.corrcoef(x, y)
print(correlation_matrix)
print()
# Accessing the correlation coefficient between x and y
correlation_coefficient = correlation_matrix[0, 1]
print("Correlation coefficient between x and y:", correlation_coefficient)

[[1. 1.]
 [1. 1.]]

Correlation coefficient between x and y: 0.9999999999999999


In [None]:
# Example 03


x = np.array([1, 2, 3, 4, 5])
y = np.array([5, 4, 3, 2, 1])
z = np.array([7,8,9,10,11])
# Compute correlation coefficient
corr_matrix = np.corrcoef([x, y, z])
print(corr_matrix)
print()
# Accessing the correlation coefficient between x and y
# (read from corr_matrix, and rounded to hide floating-point noise)
correlation_coefficient = round(corr_matrix[0, 1], 6)
print("Correlation coefficient between x and y:", correlation_coefficient)

[[ 1. -1.  1.]
 [-1.  1. -1.]
 [ 1. -1.  1.]]

Correlation coefficient between x and y: -1.0


In [None]:
# Example 04


import numpy as np
data = np.random.randn(5, 5)
print(data)
print()
correlation_matrix = np.corrcoef(data, rowvar=False)
print(correlation_matrix)

[[-0.31595353  1.55211852 -0.78556303  1.82244495  0.7400091 ]
 [ 0.13528935  0.43088985 -1.52284616 -0.15687243  0.28518398]
 [ 1.06862787 -1.43192787 -0.89894167 -0.36739627 -0.63360596]
 [ 0.0442705   0.42811512  0.50127624 -0.30098295  1.08001908]
 [-0.28975199 -0.43692366  0.04555743 -0.17139821  0.17448505]]

[[ 1.         -0.7535229  -0.31425928 -0.50636922 -0.7446183 ]
 [-0.7535229   1.         -0.02934016  0.75830911  0.82532303]
 [-0.31425928 -0.02934016  1.         -0.19451441  0.46940243]
 [-0.50636922  0.75830911 -0.19451441  1.          0.37652679]
 [-0.7446183   0.82532303  0.46940243  0.37652679  1.        ]]


# **Type 10: numpy.ptp() function**

In [None]:
# Example 1: Calculating the range of values in a 1D array:


import numpy as np

# Define a 1D array
arr = np.array([3, 7, 1, 9, 5])
# Calculate the range using numpy.ptp()
range_of_values = np.ptp(arr)
# Print the range
print("Range of values:", range_of_values)

**Explanation:**

In this example, the array arr contains the values [3, 7, 1, 9, 5]. The numpy.ptp() function calculates the range of these values, which is the difference between the maximum value (9) and the minimum value (1), resulting in a range of 8.
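The relationship between numpy.ptp() and the maximum and minimum can be confirmed directly. This is a short sketch; the names `peak_to_peak` and `manual` are illustrative and not part of the original notebook.

```python
import numpy as np

arr = np.array([3, 7, 1, 9, 5])

peak_to_peak = np.ptp(arr)
manual = arr.max() - arr.min()   # 9 - 1

print(peak_to_peak, manual)   # 8 8
```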

In [None]:
# Example 2: Calculating the range of values in a 2D array:


import numpy as np

# Define a 2D array
arr = np.array([[1, 5, 3], [2, 9, 7],[4, 6, 8]])
range_of_values = np.ptp(arr)
# Print the range
print("Range of values:", range_of_values)

**Example 3: Calculating the range of values in a 2D array along a specified axis:**





In [None]:
import numpy as np

# Define a 2D array
arr = np.array([[1, 5, 3],[2, 9, 7],[4, 6, 8]])
print(arr)
print()
# Calculate the range along axis 0 using numpy.ptp()
range_of_values_axis_0 = np.ptp(arr, axis=0)
# Calculate the range along axis 1 using numpy.ptp()
range_of_values_axis_1 = np.ptp(arr, axis=1)
# Print the ranges
print("Range of values along axis 0:", range_of_values_axis_0)
print("Range of values along axis 1:", range_of_values_axis_1)



[[1 5 3]
 [2 9 7]
 [4 6 8]]

Range of values along axis 0: [3 4 5]
Range of values along axis 1: [4 7 4]


**Explanation:**

In this example, the 2D array arr has dimensions 3x3. By specifying axis=0 in numpy.ptp(), it calculates the range of values along each column, resulting in an array [3, 4, 5]. Similarly, by specifying axis=1, it calculates the range of values along each row, resulting in an array [4, 7, 4].

# **Type 11: The numpy.average() function:**

**Example 1: Calculate the average of an array**








In [None]:
import numpy as np
arr = np.array([1, 2, 3, 4, 5])
average = np.average(arr)
print(average)

3.0


In [None]:

# Example 2: Calculate the average of a 2D array along a specific axis:


import numpy as np
arr = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(arr)
print()
average = np.average(arr, axis=1)
print(average)

[[1 2 3]
 [4 5 6]
 [7 8 9]]

[2. 5. 8.]


In [None]:

# Example 3: Calculate the weighted average of an array


import numpy as np
arr = np.array([1, 2, 3, 4, 5])
weights = np.array([0.1, 0.2, 0.3, 0.2, 0.2])
weighted_average = np.average(arr, weights=weights)
print(weighted_average)

3.2
