# Scientific Python: NumPy and SciPy

## Benefits

### NumPy

***NumPy*** is the fundamental package for scientific computing in Python. It is a library that provides a multidimensional array object, additional derived objects, and a variety of routines for fast operations on arrays. These include mathematical, logical, shape manipulation, sorting, searching, I/O, statistical, and random simulation operations, among others. Some of NumPy's key benefits include:

- ***Compact storage***: the data structures use less memory per item than a typical Python list
- ***Fast array loops***: array operations utilize vectorization that is implemented in C for increased speed
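- ***Slicing without copying***: slicing a NumPy array returns a view onto the same underlying data rather than a copy, so changes made through the slice also affect the original array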
- ***Useful and efficient array operations***: NumPy supports array level operations such as joining, splitting, sorting, filtering, and reducing arrays
- ***Compatibility***: NumPy is compatible with many other scientific Python libraries such as SciPy, Pandas, Scikit-Learn, and Matplotlib
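
The first three benefits can be observed directly. The following is a minimal sketch, assuming NumPy is installed and imported under the conventional alias `np`; the variable names are illustrative only.

```python
import numpy as np

# Compact storage: one million 64-bit integers take exactly 8 bytes each.
values = np.arange(1_000_000, dtype=np.int64)
array_bytes = values.nbytes

# Fast array loops: the multiplication runs in compiled code,
# with no explicit Python-level loop.
doubled = values * 2

# Slicing without copying: a slice is a view onto the same memory.
window = values[10:20]
shares_memory = np.shares_memory(values, window)
window[0] = -1            # writes through to values[10]

# An explicit copy is independent of the original array.
independent = values[10:20].copy()
independent[1] = -5       # values[11] is left unchanged

print(array_bytes, shares_memory, values[10], values[11])
```

Because a slice shares memory with its parent, call `.copy()` whenever the slice needs to be modified without touching the original data.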


### SciPy

***SciPy*** is a collection of mathematical algorithms and convenience functions built on the NumPy library. It extends Python's data-processing capabilities to the point that it can rival systems and languages such as IDL, MATLAB, and R-Lab. Some of SciPy's key benefits include:

- High-level commands and classes for visualizing and manipulating data
- Powerful and interactive sessions with Python
- Access to the wider Python ecosystem, including classes and routines for parallel programming, web, and database work
- Easy to use
- Fast and efficient
- Open-source with an active community for good support
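
As a small illustration of the kind of algorithm SciPy layers on top of NumPy, the sketch below uses two well-known routines, `scipy.integrate.quad` and `scipy.optimize.minimize_scalar`. It assumes SciPy is installed; the integrand and objective function were chosen only because their exact answers are easy to check by hand.

```python
import numpy as np
from scipy import integrate, optimize

# Numerically integrate sin(x) over [0, pi]; the exact value is 2.
area, error_estimate = integrate.quad(np.sin, 0.0, np.pi)

# Minimize (x - 3)^2; the exact minimum is at x = 3.
result = optimize.minimize_scalar(lambda x: (x - 3.0) ** 2)

print("Integral:", area)
print("Minimizer:", result.x)
```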

## NumPy Arrays (Multidimensional `ndarray`s)

***NumPy arrays*** are the central data structure of the NumPy library.  They are grids of values and contain information about raw data, how to locate an element, and how to interpret an element.  

In mathematical terms, these arrays are often described as *tensors*, and the number of dimensions of a tensor is its *rank* (NumPy exposes this as the `ndim` attribute). For example, a vector or 1-D array is a *rank 1 tensor*, and a matrix or 2-D array is a *rank 2 tensor*. Each dimension of an array/tensor is called an *axis*. When performing operations on NumPy arrays, many functions accept an `axis` argument that selects the dimension along which the operation is applied.
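
To make the terminology concrete, here is a short sketch showing how the number of dimensions, the shape, and the axes of an array appear in code. The arrays are made up for illustration; `ndim`, `shape`, and the `axis` argument of `sum` are standard NumPy features.

```python
import numpy as np

vector = np.array([1, 2, 3])                # rank 1: one axis
matrix = np.array([[1, 2, 3], [4, 5, 6]])   # rank 2: two axes

print(vector.ndim, vector.shape)   # 1 (3,)
print(matrix.ndim, matrix.shape)   # 2 (2, 3)

# The axis argument names the dimension that is collapsed.
column_sums = matrix.sum(axis=0)   # sums down the rows:    [5 7 9]
row_sums = matrix.sum(axis=1)      # sums across each row:  [ 6 15]
total = matrix.sum()               # no axis: sums everything, 21
```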

## Basic Statistics

NumPy has basic statistical functions built in, such as mean and median. As a recap:

- ***Mean***: the sum of the values divided by the number of values
- ***Median***: the middle value of the dataset once it is sorted
- ***Standard Deviation***: the square root of the variance; measures the spread of the data around the mean in the same units as the data
- ***Variance***: the average squared difference between each point and the mean

You can compute these basic statistics in one of two ways: as functions of the NumPy library or as methods of NumPy arrays. Two things are worth noting: NumPy has no **mode** function, and **median** is implemented only as a function of the library, not as a method of NumPy arrays.

```python
import numpy

data = numpy.array([1, 3, 1, 1, 6, 2, 8, 2, 9, 22, 13, 7, 23])

# Mean: available both as an array method and as a library function
data_mean = data.mean()
print("The mean is:", data_mean)

data_mean = numpy.mean(data)
print("The mean is:", data_mean)

# Median: available only as a library function
data_median = numpy.median(data)
print("The median is:", data_median)

# Standard deviation (population form, the NumPy default)
data_std = data.std()
print("The standard deviation is:", data_std)

data_std = numpy.std(data)
print("The standard deviation is:", data_std)

# Variance (population form, the NumPy default)
data_var = data.var()
print("The variance is:", data_var)

data_var = numpy.var(data)
print("The variance is:", data_var)
```

```
The mean is: 7.538461538461538
The mean is: 7.538461538461538
The median is: 6.0
The standard deviation is: 7.302427253111272
The standard deviation is: 7.302427253111272
The variance is: 53.32544378698224
The variance is: 53.32544378698224
```
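
Since NumPy does not provide a mode function, one possible workaround is to count the occurrences of each distinct value with `numpy.unique`. This is only a sketch of one approach on the same dataset; note that it reports a single value even when several values tie for the highest count.

```python
import numpy

data = numpy.array([1, 3, 1, 1, 6, 2, 8, 2, 9, 22, 13, 7, 23])

# unique() returns the sorted distinct values and, with
# return_counts=True, how many times each one occurs.
values, counts = numpy.unique(data, return_counts=True)

# argmax() picks the first maximum, so ties resolve to the smallest value.
data_mode = values[counts.argmax()]
print("The mode is:", data_mode)
```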


## Useful Functions and Features

The NumPy and SciPy libraries offer an extensive list of functions and features that are useful across a wide range of data science tasks. Some of the more notable ones include:

- ***min and max*** - used to find the minimum and maximum values of a NumPy array
- ***mean*** - used to find the mean value of a NumPy array
- ***std*** - used to find the standard deviation of a NumPy array
- ***median*** - used to find the median of a NumPy array
- ***percentile*** - used to find the value below which a given percentage of the data in a NumPy array falls
- ***linspace*** - used to get evenly spaced numbers over a specified interval
- ***shape*** - an array attribute that gives the size of the array along each axis
- ***reshape*** - used to give an array a new shape without changing its data
- ***copyto*** - copies the values of one array into another array
- ***transpose*** - used to reverse or permute the axes of an array
- ***stack*** - used to join a sequence of arrays along a new axis
- ***vstack*** - used to stack a sequence of arrays vertically (row-wise)
- ***hstack*** - used to stack a sequence of arrays horizontally (column-wise)
- ***sort*** - used to get a sorted copy of an array
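
Several of the functions listed above are easiest to understand by looking at the shapes they produce. The sketch below uses small, made-up arrays; the comments state the results for these specific inputs.

```python
import numpy as np

grid = np.linspace(0.0, 1.0, 5)       # [0.   0.25 0.5  0.75 1.  ]
table = np.arange(6).reshape(2, 3)    # [[0 1 2], [3 4 5]]
flipped = table.transpose()           # axes reversed: shape (3, 2)

a = np.array([1, 2, 3])
b = np.array([4, 5, 6])
stacked = np.stack([a, b])            # new leading axis: shape (2, 3)
vertical = np.vstack([a, b])          # rows on top of each other: shape (2, 3)
horizontal = np.hstack([a, b])        # side by side: shape (6,)

scores = np.array([7, 1, 9, 3, 5])
ordered = np.sort(scores)             # sorted copy; scores is unchanged
p50 = np.percentile(scores, 50)       # 50th percentile, same as the median
lowest, highest = scores.min(), scores.max()

target = np.zeros(3, dtype=int)
np.copyto(target, a)                  # target now holds [1, 2, 3]
```

For 1-D inputs such as `a` and `b`, `stack` and `vstack` give the same result; they differ once the inputs have two or more dimensions, because `stack` always adds a new axis while `vstack` joins along the existing first axis.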