# Standard Deviation

The standard deviation is a measure of the amount of variation or dispersion of a set of values. A low standard deviation indicates that the values tend to be close to the mean (also called the expected value) of the set, while a high standard deviation indicates that the values are spread out over a wider range.

It's commonly used in statistics and data science to understand the variability in data. In this notebook, we will go through the concept of standard deviation step by step and implement it using Python.
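
To make the idea of "low" versus "high" standard deviation concrete, here is a minimal sketch comparing two small datasets. The lists `tight` and `wide` are hypothetical values chosen for illustration; they are not part of the notebook's dataset. Both have the same mean, but their values sit at very different distances from it.

```python
import numpy as np

# Two hypothetical datasets with the same mean (50) but different spread
tight = [48, 49, 50, 51, 52]
wide = [10, 30, 50, 70, 90]

# The means are identical, so the mean alone cannot tell the datasets apart
print('Means:', np.mean(tight), np.mean(wide))

# The standard deviation is much larger for the widely spread values
print('Standard deviations:', np.std(tight), np.std(wide))
```

The dataset whose values cluster around the mean should report the smaller standard deviation.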

## Step 1: Understanding the Data

Before we dive into the calculation of standard deviation, let's first look at the data we are working with. For the purpose of this notebook, we will create a simple dataset, the integers 1 through 10, using Python's built-in `range` function.

In [None]:
# Importing necessary libraries
import numpy as np

# Creating a simple dataset
data = list(range(1, 11))
print('Data:', data)

Data: [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]


## Step 2: Calculating the Mean

The first step in calculating the standard deviation is to find the mean (average) of the data. The mean is the sum of all the numbers in the dataset divided by the count of numbers in the dataset.

In [None]:
# Calculating the mean
mean = np.mean(data)
print('Mean:', mean)

Mean: 5.5
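
As a cross-check on the definition above, the mean can also be computed directly as the sum divided by the count, without NumPy. This is only a sketch; the name `manual_mean` is introduced here for illustration.

```python
# Recreate the dataset so this cell runs on its own
data = list(range(1, 11))

# Mean = sum of the values divided by the number of values
manual_mean = sum(data) / len(data)
print('Manual mean:', manual_mean)
```

This should agree with the value returned by `np.mean(data)`.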


## Step 3: Calculating the Variance

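The next step is to calculate the variance. Variance is the average of the squared differences from the mean. Dividing by the number of values N gives the population variance, which is what `np.var` computes by default (`ddof=0`); the sample variance divides by N - 1 instead. Here's how to do it: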

1. Subtract the mean from each number in the data and square the result (the squared difference).
2. Average these squared differences.

In [None]:
# Calculating the variance
variance = np.var(data)
print('Variance:', variance)

Variance: 8.25
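
The two steps listed above can also be written out by hand. The following is a minimal sketch, assuming the population form of the variance (dividing by the number of values); the names `squared_diffs` and `manual_variance` are introduced here for illustration.

```python
import numpy as np

# Recreate the dataset and its mean so this cell runs on its own
data = list(range(1, 11))
mean = sum(data) / len(data)

# Step 1: squared difference of each value from the mean
squared_diffs = [(x - mean) ** 2 for x in data]

# Step 2: average the squared differences (population variance)
manual_variance = sum(squared_diffs) / len(data)
print('Manual variance:', manual_variance)

# np.var divides by N by default (ddof=0), so the two should agree
print('Matches np.var:', np.isclose(manual_variance, np.var(data)))
```

Writing the steps out makes it easier to see what `np.var` does internally before relying on the one-line call.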


## Step 4: Calculating the Standard Deviation

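Finally, we can calculate the standard deviation. The standard deviation is simply the square root of the variance. Like `np.var`, `np.std` uses `ddof=0` by default, so the value below is the population standard deviation.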

In [None]:
# Calculating the standard deviation
std_dev = np.std(data)
print('Standard Deviation:', std_dev)

Standard Deviation: 2.8722813232690143
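
To confirm the relationship between variance and standard deviation, the sketch below takes the square root of the variance explicitly. It also shows the sample standard deviation, which divides by N - 1 and is requested in NumPy with `ddof=1`. The names `pop_std` and `sample_std` are introduced here for illustration.

```python
import math
import numpy as np

# Recreate the dataset so this cell runs on its own
data = list(range(1, 11))

# Population standard deviation: square root of the population variance
pop_std = math.sqrt(np.var(data))
print('sqrt(variance):', pop_std)

# Sample standard deviation: ddof=1 divides by N - 1 instead of N
sample_std = np.std(data, ddof=1)
print('Sample standard deviation:', sample_std)
```

Which form is appropriate depends on whether the data is the full population or a sample drawn from a larger one; for a sample, the `ddof=1` version is the usual choice.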


## Conclusion

In this notebook, we have gone through the concept of standard deviation and implemented it using Python. Starting from the integers 1 through 10, we computed the mean (5.5), the variance (8.25), and finally the standard deviation (about 2.87), which is the square root of the variance. The smaller this value, the more tightly the data clusters around the mean; the larger it is, the more widely the values are spread.

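Understanding the standard deviation of a dataset is a key aspect of a data scientist's work because it quantifies how much the data varies. It is used throughout exploratory data analysis, data cleaning, and data preprocessing, for example when flagging values that lie unusually far from the mean or when standardizing features to a common scale.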