# Statistics & Math Functions

###### In this notebook, I implement basic statistical and mathematical functions in Python: mean, median, mode, variance, standard deviation, Euclidean distance, and the sigmoid function.

## Import Libraries

In [2]:
import math
import statistics as stats

## Mean

In [8]:
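def mean(numbers):
  # Arithmetic mean: total of the values divided by their count
  return sum(numbers) / len(numbers)

numbers = [3,87,22,19,39,8,95]

# Use a separate name so the mean() function is not overwritten
mean_value = mean(numbers)
print(mean_value)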

39.0


##### This function returns the arithmetic mean of the list: the sum of the values divided by how many values there are. The mean summarizes the center of the data in a single number, but it is sensitive to outliers.
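The sum-divided-by-count definition fails with a `ZeroDivisionError` on an empty list. The sketch below adds an explicit check and compares the result with `statistics.mean`. The name `safe_mean` is a hypothetical helper introduced only for this illustration.

```python
import statistics as stats

def safe_mean(numbers):
    """Arithmetic mean with an explicit check for empty input (hypothetical helper)."""
    if not numbers:
        raise ValueError("mean requires at least one value")
    return sum(numbers) / len(numbers)

data = [3, 87, 22, 19, 39, 8, 95]
result = safe_mean(data)
print(result)                      # 39.0
print(result == stats.mean(data))  # True
```

Raising a `ValueError` is one possible design choice; returning `None` or `float("nan")` would also be reasonable, depending on how the caller handles missing data.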


## Median

In [13]:
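def median(numbers):
  numbers = sorted(numbers)  # sorted copy; the caller's list is left unchanged
  print(numbers)             # show the sorted values
  n = len(numbers)
  mid = n // 2

  if n % 2 == 0:
    # Even count: average the two middle values
    return (numbers[mid - 1] + numbers[mid]) / 2
  else:
    # Odd count: take the single middle value
    return numbers[mid]

numbers = [3,87,22,19,39,8,95,56]

# Use a separate name so the median() function is not overwritten
median_value = median(numbers)
print(median_value)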

[3, 8, 19, 22, 39, 56, 87, 95]
30.5


##### This function finds the middle value of the data after sorting it. If the list has an odd count, it picks the middle number; otherwise it averages the two middle numbers. The median describes the center of the data and is especially useful when the data is skewed or contains outliers, because extreme values barely affect it.
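To illustrate the point about outliers, the sketch below appends one extreme value to the same eight numbers and compares how far the mean and the median move. It uses `statistics.mean` and `statistics.median` from the standard library, and the value `10_000` is an arbitrary outlier chosen for this example.

```python
import statistics as stats

data = [3, 8, 19, 22, 39, 56, 87, 95]
with_outlier = data + [10_000]  # one arbitrary extreme value

mean_before = stats.mean(data)              # 41.125
mean_after = stats.mean(with_outlier)       # jumps above 1000
median_before = stats.median(data)          # 30.5
median_after = stats.median(with_outlier)   # 39

print("mean:  ", mean_before, "->", round(mean_after, 2))
print("median:", median_before, "->", median_after)
```

The single outlier moves the mean by more than a thousand, while the median only shifts to the next value in the sorted list.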


## Mode

In [30]:
def mode(numbers):
    counts = {}
    for num in numbers:
        counts[num] = counts.get(num, 0) + 1

    max_count = max(counts.values())
    modes = [num for num, count in counts.items() if count == max_count]

    return modes

numbers = [3,87,22,19,3,8,95,19]

# Use a separate name so the mode() function is not overwritten
modes = mode(numbers)
print(modes)

[3, 19]


##### This function finds the value(s) that appear most often in the list. It counts how many times each number occurs and returns every value that reaches the highest count, so the result is a list even when there is a single mode. The mode is useful when we want to know the most common value, especially for categorical data or data with many repeated values.
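The same counting idea can be sketched with `collections.Counter`, which also works for categorical values such as strings. The name `find_modes` is a hypothetical helper for this illustration. `statistics.multimode` (available from Python 3.8) is shown for comparison.

```python
from collections import Counter
import statistics as stats

def find_modes(values):
    """Return every value that reaches the highest count (hypothetical helper)."""
    counts = Counter(values)
    if not counts:
        return []
    top = max(counts.values())
    # Counter keeps first-seen order, so ties come back in input order
    return [value for value, count in counts.items() if count == top]

numbers = [3, 87, 22, 19, 3, 8, 95, 19]
colors = ["red", "blue", "red", "green"]

print(find_modes(numbers))       # [3, 19]
print(find_modes(colors))        # ['red']
print(stats.multimode(numbers))  # [3, 19]
```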


## Variance

#### Population Variance

In [21]:
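def population_variance(numbers):
  mean = stats.mean(numbers)
  # Average squared deviation from the mean, dividing by n
  return sum((x - mean) ** 2 for x in numbers) / len(numbers)

numbers = [3,87,22,19,39,8,95]

# Use a separate name so the function is not overwritten
pop_var = population_variance(numbers)
print(pop_var)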

1198.0


##### This function measures how spread out the numbers are from the mean. It calculates the average of the squared differences from the mean. A higher variance means the data is more spread out, while a lower variance means the data is closer to the mean. This version is for population variance since we divide by the total number of values.
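The intermediate steps of the calculation can be written out for the same seven numbers. This is only a step-by-step sketch of the arithmetic described above; the variable names are chosen for this illustration.

```python
data = [3, 87, 22, 19, 39, 8, 95]

mu = sum(data) / len(data)              # 39.0
deviations = [x - mu for x in data]     # distance of each value from the mean
squared = [d ** 2 for d in deviations]  # squaring removes the sign
total = sum(squared)
pop_var = total / len(data)             # divide by n for population variance

print(deviations)  # [-36.0, 48.0, -17.0, -20.0, 0.0, -31.0, 56.0]
print(total)       # 8386.0
print(pop_var)     # 1198.0
```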


#### Sample Variance

In [23]:
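def sample_variance(numbers):
  mean = stats.mean(numbers)
  # Sum of squared deviations divided by n - 1 instead of n
  return sum((x - mean) ** 2 for x in numbers) / (len(numbers) - 1)

numbers = [3,87,22,19,39,8,95]

# Use a separate name so the function is not overwritten
sample_var = sample_variance(numbers)
print(sample_var)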

1397.6666666666667


##### This function also measures how far the data values spread out from the mean, but it divides by (n-1) instead of n, which makes it the sample variance. This adjustment, known as Bessel's correction, makes the result an unbiased estimate of the population variance when only a sample is available. The function needs at least two values, because n-1 is zero for a single value.
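The two versions differ only in the divisor, so the sample variance equals the population variance multiplied by n / (n - 1). The sketch below checks this relation with `statistics.pvariance` and `statistics.variance` from the standard library.

```python
import math
import statistics as stats

data = [3, 87, 22, 19, 39, 8, 95]
n = len(data)

pop_var = stats.pvariance(data)    # divides by n
sample_var = stats.variance(data)  # divides by n - 1

# The sample variance is the population variance scaled by n / (n - 1)
print(math.isclose(sample_var, pop_var * n / (n - 1)))  # True
```

The scaling factor approaches 1 as n grows, so the two values are close for large datasets and differ noticeably only for small ones.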


## Standard Deviation

In [27]:
def std_dev(numbers):
    return math.sqrt(stats.variance(numbers))

numbers = [3,87,22,19,39,8,95]

# Use a separate name so the std_dev() function is not overwritten
sd = std_dev(numbers)
print(sd)

37.38538038681253


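##### This function shows the typical distance of the data values from the mean. It is the square root of the variance, so it is in the same units as the data. Because it uses `stats.variance`, it is the sample standard deviation. Note that although the sample variance is an unbiased estimate of the population variance, its square root slightly underestimates the population standard deviation on average. Standard deviation is widely used to describe consistency and variation in data.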
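The standard library offers both versions directly: `statistics.stdev` for the sample standard deviation and `statistics.pstdev` for the population standard deviation. The sketch below compares them on the same data and confirms that each is the square root of the matching variance.

```python
import math
import statistics as stats

data = [3, 87, 22, 19, 39, 8, 95]

sample_sd = stats.stdev(data)  # square root of the sample variance (n - 1)
pop_sd = stats.pstdev(data)    # square root of the population variance (n)

print(math.isclose(sample_sd, math.sqrt(stats.variance(data))))  # True
print(math.isclose(pop_sd, math.sqrt(stats.pvariance(data))))    # True
print(sample_sd > pop_sd)                                        # True
```

The sample version is always the larger of the two for the same data, because it divides by the smaller number n - 1.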

## Euclidean Distance

In [28]:
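def euclidean_distance(x, y):
    # If both are single numbers, the distance is the absolute difference
    if isinstance(x, (int, float)) and isinstance(y, (int, float)):
        return abs(x - y)

    # Otherwise treat both as equal-length sequences (vectors);
    # zip() silently ignores extra elements if the lengths differ
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))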

print("Euclidean Distance (values):", euclidean_distance(3, 7))
print("Euclidean Distance (lists):", euclidean_distance([1,2,3], [4,5,6]))

Euclidean Distance (values): 4
Euclidean Distance (lists): 5.196152422706632


##### This function finds how far apart two numbers or two lists are. For numbers it returns the absolute difference, and for lists of equal length it returns the straight-line distance: the square root of the sum of squared differences between matching elements. It is useful in maths and machine learning for comparing data points.
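Because `zip` stops at the shorter input, lists of different lengths would produce a distance without any warning. The sketch below adds a length check and compares the result with `math.dist`, which is available from Python 3.8. The name `checked_distance` is a hypothetical helper for this illustration.

```python
import math

def checked_distance(p, q):
    """Euclidean distance between two equal-length sequences (hypothetical helper)."""
    if len(p) != len(q):
        raise ValueError("both points must have the same number of dimensions")
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

p, q = [1, 2, 3], [4, 5, 6]
print(checked_distance(p, q))                                # 5.196152422706632
print(math.isclose(checked_distance(p, q), math.dist(p, q)))  # True
```

`math.dist` also raises a `ValueError` when the two points have different numbers of dimensions, so the check above mirrors the standard library's behavior.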


## Sigmoid Function

In [29]:
def sigmoid(x):
    return 1 / (1 + math.exp(-x))

print("Sigmoid(0):", sigmoid(0))
print("Sigmoid(2):", sigmoid(2))

Sigmoid(0): 0.5
Sigmoid(2): 0.8807970779778823


##### This function maps any real number to a value between 0 and 1. Positive numbers move closer to 1, negative numbers move closer to 0, and 0 becomes exactly 0.5. It is used in machine learning, for example in logistic regression, to turn a real-valued score into a value that can be read as a probability.
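One practical caveat: for large negative inputs such as -1000, `math.exp(-x)` exceeds the floating-point range and raises an `OverflowError`. A common workaround, sketched below under the hypothetical name `stable_sigmoid`, uses the algebraically equivalent form exp(x) / (1 + exp(x)) for negative inputs so that the exponent is never positive.

```python
import math

def stable_sigmoid(x):
    """Sigmoid that avoids overflow for large negative x (hypothetical helper)."""
    if x >= 0:
        # exp(-x) is at most 1 here, so it cannot overflow
        return 1 / (1 + math.exp(-x))
    # For negative x, exp(x) is below 1; it may underflow to 0.0 but never overflows
    z = math.exp(x)
    return z / (1 + z)

print(stable_sigmoid(0))      # 0.5
print(stable_sigmoid(-1000))  # 0.0
print(stable_sigmoid(1000))   # 1.0
```

For moderate inputs both branches give the same values as the direct formula; the difference only matters at the extremes.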