# Statistics

Statistics is the mathematical science of **collecting, organizing, analyzing, interpreting, and presenting data**. It's a crucial foundation for fields like data science and machine learning because it provides the methods to make informed decisions based on data.

---

## Data

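Data is raw information (text, numbers, or images) that carries little meaning until it is given context or analyzed. It's the starting point for any data-driven task and can be either **structured** or **unstructured**. Structured data follows a fixed schema, such as the rows and columns of a table; unstructured data, such as free text or images, does not.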

---

## Python's `statistics` Module

Python includes a built-in `statistics` module for performing basic mathematical statistics. This module is intended for straightforward calculations, much like a scientific calculator, and is **not** meant to compete with powerful, professional-grade libraries such as **NumPy** and **SciPy**.
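
As a rough illustration of the calculator-style usage described above, the sketch below computes a few summary statistics with the module. The `scores` list is hypothetical sample data, chosen only for this example.

```python
import statistics

# Hypothetical sample data, used only for illustration
scores = [2, 4, 4, 4, 5, 5, 7, 9]

mean_score = statistics.mean(scores)      # arithmetic mean
median_score = statistics.median(scores)  # middle value of the sorted data
mode_score = statistics.mode(scores)      # most frequent value
spread = statistics.pstdev(scores)        # population standard deviation

print("mean:", mean_score)
print("median:", median_score)
print("mode:", mode_score)
print("population stdev:", spread)
```

For small lists like this one the standard library is usually enough; larger numerical workloads are where NumPy becomes the better fit.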


## NumPy

**NumPy: The Core of Scientific Python**

- NumPy (Numerical Python) is the foundational library for scientific computing in Python. Its most important feature is the high-performance multidimensional array object, `ndarray`, which stores and manipulates numerical data efficiently. Think of it as a supercharged Python list designed specifically for mathematical operations.
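
To make the comparison with lists concrete, here is a minimal sketch that doubles a set of numbers both ways. The `prices` values are hypothetical and serve only as sample input.

```python
import numpy as np

# Hypothetical sample values, used only for illustration
prices = [10.0, 20.0, 30.0]

# With a plain list, elementwise math needs an explicit loop or comprehension
doubled_list = [p * 2 for p in prices]

# With an ndarray, the same operation is applied to every element at once
price_array = np.array(prices)
doubled_array = price_array * 2

print(doubled_list)
print(doubled_array)
print(price_array.sum(), price_array.mean())
```

Note that `prices * 2` on the plain list would repeat the list rather than double its elements, which is one reason arrays are more convenient for numerical work.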

**Recommended Setup: Anaconda & Jupyter Notebook**

- For this kind of work, Jupyter Notebook is recommended over a standard code editor, because each cell's output appears directly below the code that produced it.

- The easiest way to get set up is by installing Anaconda. Anaconda is a free distribution that bundles Python, Jupyter Notebook, and many essential scientific libraries (like NumPy, pandas, etc.) into a single installation. This saves you the trouble of installing everything separately.

**Installation (if you are not using Anaconda):**

```bash
pip install numpy
```


In [None]:
import numpy as np

# check the version
print("numpy: ", np.__version__)

# list the names (functions, classes, constants) that the numpy namespace exposes
print(dir(np))

numpy:  2.3.3
['False_', 'ScalarType', 'True_', '_CopyMode', '_NoValue', '__NUMPY_SETUP__', '__all__', '__array_api_version__', '__array_namespace_info__', '__builtins__', '__cached__', '__config__', '__dir__', '__doc__', '__expired_attributes__', '__file__', '__former_attrs__', '__future_scalars__', '__getattr__', '__loader__', '__name__', '__numpy_submodules__', '__package__', '__path__', '__spec__', '__version__', '_array_api_info', '_core', '_distributor_init', '_expired_attrs_2_0', '_globals', '_int_extended_msg', '_mat', '_msg', '_pyinstaller_hooks_dir', '_pytesttester', '_specific_msg', '_type_info', '_typing', '_utils', 'abs', 'absolute', 'acos', 'acosh', 'add', 'all', 'allclose', 'amax', 'amin', 'angle', 'any', 'append', 'apply_along_axis', 'apply_over_axes', 'arange', 'arccos', 'arccosh', 'arcsin', 'arcsinh', 'arctan', 'arctan2', 'arctanh', 'argmax', 'argmin', 'argpartition', 'argsort', 'argwhere', 'around', 'array', 'array2string', 'array_equal', 'array_equiv', 'array_repr',

### Creating integer NumPy arrays


In [None]:
import numpy as np

# Create a Python list
python_list: list[int] = [1, 2, 3, 4, 5]

# Create a NumPy (Numerical Python) array from the list
numpy_array_from_list = np.array(python_list)
print(type(numpy_array_from_list))
print(numpy_array_from_list)

<class 'numpy.ndarray'>
[1 2 3 4 5]


### Creating float NumPy arrays


In [None]:
import numpy as np

# Create a Python list
python_list: list[float] = [1.23, 3.42, 5.3, 7.3]

# Create a NumPy array from the list; dtype=float sets the element type explicitly
numpy_array_from_list = np.array(python_list, dtype=float)
print(type(numpy_array_from_list))
print(numpy_array_from_list)

<class 'numpy.ndarray'>
[1.23 3.42 5.3  7.3 ]
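
The element type of an existing array can also be inspected and changed after creation. The sketch below assumes a small integer array (the names `int_array` and `float_array` are illustrative) and converts it with `astype`, which returns a new array and leaves the original unchanged.

```python
import numpy as np

# Without an explicit dtype, NumPy infers an integer type from the values
int_array = np.array([1, 2, 3])
print(int_array.dtype)  # an integer dtype; the exact width depends on the platform

# astype() returns a converted copy; int_array itself is not modified
float_array = int_array.astype(float)
print(float_array.dtype)
print(float_array)
```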


### Creating boolean NumPy arrays


In [None]:
import numpy as np

# Create a Python list
python_list: list[int] = [0, 1, -1, 3]

# Create a NumPy array from the list; dtype=bool maps 0 to False
# and every nonzero value (including negatives) to True
numpy_array_from_list = np.array(python_list, dtype=bool)
print(type(numpy_array_from_list))
print(numpy_array_from_list)

<class 'numpy.ndarray'>
[False  True  True  True]
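
In practice, boolean arrays often come from comparisons rather than from an explicit `dtype=bool` conversion. The following sketch reuses the same four values and shows both routes, plus one common use of the result as a filter. The variable names are illustrative.

```python
import numpy as np

values = np.array([0, 1, -1, 3])

# Conversion: zero becomes False, any nonzero value becomes True
as_bool = values.astype(bool)

# Comparison: produces a boolean array with one result per element
is_positive = values > 0

# A boolean array can select the elements where it is True
positives = values[is_positive]

print(as_bool)
print(is_positive)
print(positives)
```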


### Creating multidimensional arrays using NumPy

A NumPy array can have more than one dimension. A two-dimensional array, for example, is organized into rows and columns.


In [13]:
import numpy as np

two_dimensional_list = [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
numpy_two_dimensional_list = np.array(two_dimensional_list)
print(type(numpy_two_dimensional_list))
print(numpy_two_dimensional_list)

<class 'numpy.ndarray'>
[[0 1 2]
 [3 4 5]
 [6 7 8]]
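
Once a two-dimensional array exists, its layout can be checked with the `ndim` and `shape` attributes, and individual elements are addressed by row and column index. This is a small sketch using the same 3x3 values; the name `grid` is illustrative.

```python
import numpy as np

grid = np.array([[0, 1, 2], [3, 4, 5], [6, 7, 8]])

print(grid.ndim)   # number of dimensions
print(grid.shape)  # (rows, columns)

# Indexing uses [row, column], both counted from zero
print(grid[1, 2])  # element in the second row, third column
print(grid[:, 0])  # every row of the first column
```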


### Converting a NumPy array to a list


In [14]:
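# An array can always be converted back to a Python list with tolist().
# Rebuild the integer array first: numpy_array_from_list was last assigned
# the boolean array in an earlier cell.
numpy_array_from_list = np.array([1, 2, 3, 4, 5])
np_to_list = numpy_array_from_list.tolist()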
print(type(np_to_list))
print("one dimensional array:", np_to_list)
print("two dimensional array: ", numpy_two_dimensional_list.tolist())

<class 'list'>
one dimensional array: [1, 2, 3, 4, 5]
two dimensional array:  [[0, 1, 2], [3, 4, 5], [6, 7, 8]]
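
One detail worth knowing is that `tolist()` converts the elements to plain Python types, whereas the built-in `list()` only unpacks the array and keeps NumPy scalar objects. A minimal sketch of the difference, with illustrative variable names:

```python
import numpy as np

arr = np.array([1, 2, 3])

via_tolist = arr.tolist()  # elements become built-in Python ints
via_list = list(arr)       # elements stay NumPy integer scalars

print(type(via_tolist[0]))
print(type(via_list[0]))
```

This matters mostly when the result is passed to code that expects built-in types, such as JSON serialization.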


### Creating a NumPy array from a tuple


In [11]:
# Create a Python tuple
python_tuple = (1, 2, 3, 4, 5)
print(type(python_tuple))
print("python_tuple: ", python_tuple)

# Create a NumPy array from the tuple
numpy_array_from_tuple = np.array(python_tuple)
print(type(numpy_array_from_tuple))
print("numpy_array_from_tuple: ", numpy_array_from_tuple)

<class 'tuple'>
python_tuple:  (1, 2, 3, 4, 5)
<class 'numpy.ndarray'>
numpy_array_from_tuple:  [1 2 3 4 5]
