# Math Matters with Python, Scipy, & Numpy


## Setup

This guide was written in Python 3.6.


### Python and Pip

Download [Python](https://www.python.org/downloads/) and [Pip](https://pip.pypa.io/en/stable/installing/).


### Libraries

We'll be working with numpy and scipy, so make sure to install them. Open your terminal and run the following:

```
pip3 install scipy==0.19.1
pip3 install numpy==1.13.1
```

<h1><center>Why Does This Matter?</center></h1>

<h1><center>Math is Data</center></h1>

## Data Structures 

### Vectors

Lists are data structures found in pretty much any programming language. Vectors are very similar to lists, in that a vector is just an ordered collection of numbers. Because of this similarity, we can represent a vector with a list. For example:

In [5]:
A = [2.0, 3.0, 5.0]
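
One caveat is easier to see in code: a plain list stores the components of a vector, but it doesn't do vector arithmetic. The sketch below is illustrative only (the names `ones`, `concatenated`, and `summed` are made up for this example). It shows that `+` on lists concatenates, so element-wise addition has to be spelled out.

```python
A = [2.0, 3.0, 5.0]
ones = [1.0, 1.0, 1.0]  # hypothetical second vector, for illustration only

# `+` on lists concatenates; it does not add component by component.
concatenated = A + ones
print(concatenated)  # [2.0, 3.0, 5.0, 1.0, 1.0, 1.0]

# Vector addition with plain lists needs an explicit element-wise pass.
summed = [a + b for a, b in zip(A, ones)]
print(summed)  # [3.0, 4.0, 6.0]
```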

### Matrices

A matrix is similar to a vector, with one fundamental difference: it's a 2D array of numbers, arranged in rows and columns. Another way of thinking about a matrix is as multiple vectors in a list. Visually, matrices typically look like:

```
1 2 3
8 2 6
5 6 3
```

To access any given element, you use its row and column index, both of which start at 0. For example, in the following matrix, we access the 8 in the second row and first column with `B[1][0]`:

In [30]:
B = [[1,2,3],[8,2,6],[5,6,3]]

print(B[1][0])

8
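
A nested list makes rows easy to grab, but columns take a little more work. The following is a small sketch using the same `B`; the variable names `second_row` and `first_column` are just for illustration.

```python
B = [[1, 2, 3], [8, 2, 6], [5, 6, 3]]

# Indices start at 0, so B[1] is the second row.
second_row = B[1]
print(second_row)  # [8, 2, 6]

# A column has to be collected one row at a time.
first_column = [row[0] for row in B]
print(first_column)  # [1, 8, 5]

# Bottom-right element: third row, third column.
print(B[2][2])  # 3
```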


## Numpy

Using the built-in data structures of the Python programming language, we implemented examples of vectors and matrices, but `numpy` gives us a better way! 

In [31]:
import numpy as np

vector1 = np.array([1,2,3])

# A 2D np.array works as a matrix (NumPy no longer recommends np.matrix)
matrix1 = np.array(
    [[0, 4],
     [2, 0]]
)
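
To make "a better way" concrete, here is a rough sketch of what arrays offer that lists don't: element-wise arithmetic, a `shape`, two-index access, and matrix-vector products. The names `vector2`, `matrix2`, and `product` are hypothetical and only serve the example.

```python
import numpy as np

vector1 = np.array([1, 2, 3])
vector2 = np.array([10, 20, 30])  # hypothetical second vector

# Arithmetic on arrays is element-wise.
print(vector1 + vector2)  # [11 22 33]
print(2 * vector1)        # [2 4 6]

# A 2D array behaves like a matrix; .shape is (rows, columns).
matrix2 = np.array([[1, 2, 3],
                    [8, 2, 6],
                    [5, 6, 3]])
print(matrix2.shape)  # (3, 3)
print(matrix2[1, 0])  # 8

# Matrix-vector product.
product = matrix2.dot(vector1)
print(product)  # [14 30 26]
```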

<h1><center>Math Operations = Data Operations</center></h1>

## Matrix Operations

Within the `numpy` module, there are many matrix operations you can use. As with any module, this reduces the amount of code you need to write. More importantly, because the core of `numpy` is implemented in C, its operations are much faster than equivalent pure-Python loops.

Here are some notable examples!

### Identity Matrix

Recall that the identity matrix is an n x n matrix with 1s on the diagonal from the top left to the bottom right and 0s everywhere else, such as

```
[[ 1., 0., 0.],
 [ 0., 1., 0.],
 [ 0., 0., 1.]]
```
We can generate identity matrices with numpy's `eye()` function:

In [32]:
np.eye(4)

array([[ 1.,  0.,  0.,  0.],
       [ 0.,  1.,  0.,  0.],
       [ 0.,  0.,  1.,  0.],
       [ 0.,  0.,  0.,  1.]])
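
The identity matrix earns its name because multiplying by it changes nothing. A quick check, sketched with a small hypothetical matrix `M`:

```python
import numpy as np

M = np.array([[0, 4],
              [2, 0]])
identity = np.eye(2)  # 2 x 2 identity matrix

# Multiplying by the identity on either side returns the same matrix.
left = identity.dot(M)
right = M.dot(identity)
print(np.array_equal(left, M))   # True
print(np.array_equal(right, M))  # True
```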

### Inverse Matrices

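Recall that the inverse of a square matrix A is the matrix that, multiplied by A, gives the identity matrix. It plays the same role for matrices that the reciprocal plays for numbers. In `numpy`, you can compute it with `np.linalg.inv()`: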

In [34]:
inverse = np.linalg.inv(matrix1)
print(inverse)

[[ 0.    0.5 ]
 [ 0.25  0.  ]]
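
One way to sanity-check an inverse is to multiply it back and compare against the identity. The sketch below also shows what happens with a matrix that has no inverse; the `singular` matrix is a made-up example, not something from the material above.

```python
import numpy as np

A = np.array([[0.0, 4.0],
              [2.0, 0.0]])
A_inv = np.linalg.inv(A)

# A times its inverse should be the identity, up to floating-point rounding.
print(np.allclose(A.dot(A_inv), np.eye(2)))  # True

# The second row is twice the first, so this matrix has no inverse.
singular = np.array([[1.0, 2.0],
                     [2.0, 4.0]])
try:
    np.linalg.inv(singular)
    invertible = True
except np.linalg.LinAlgError:
    invertible = False
print(invertible)  # False
```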


### Determinant

Recall that the determinant of a square matrix tells you whether the matrix can be inverted: a matrix has an inverse only if its determinant is nonzero. For reference, the formula is as follows:

![Determinant formula](https://github.com/lesley2958/linear-algebra-with-python/blob/master/det.png?raw=true)

Instead of implementing this recursive algorithm yourself, you can simply call numpy's `np.linalg.det()` function.

In [35]:
det = np.linalg.det(matrix1)
print(det)

-8.0
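
For the curious, here is roughly what a hand-written recursive version could look like. It assumes the formula pictured is the usual cofactor expansion along the first row. `det_recursive` is a hypothetical helper written for this example, not part of numpy, and it is far slower than `np.linalg.det()` on large matrices.

```python
import numpy as np

def det_recursive(m):
    """Determinant by cofactor expansion along the first row (illustrative only)."""
    n = len(m)
    if n == 1:
        return m[0][0]
    total = 0
    for col in range(n):
        # Minor: drop the first row and the current column.
        minor = [row[:col] + row[col + 1:] for row in m[1:]]
        sign = -1 if col % 2 else 1
        total += sign * m[0][col] * det_recursive(minor)
    return total

small = [[0, 4], [2, 0]]
large = [[1, 2, 3], [8, 2, 6], [5, 6, 3]]
print(det_recursive(small))  # -8
print(det_recursive(large))  # 96

# The recursive result agrees with numpy up to floating-point rounding.
print(np.isclose(det_recursive(large), np.linalg.det(np.array(large))))  # True
```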


<h1><center>Images are Data</center></h1>

In [20]:
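# scipy.misc.imread needs Pillow installed; it exists in the SciPy version
# pinned above but was removed in SciPy 1.2.
import scipy.misc

img = scipy.misc.imread("./lennon.png")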

In [22]:
print(type(img))

<class 'numpy.ndarray'>


In [23]:
# Scale each channel separately; this assumes a 4-channel (RGBA) image
img_tinted = img * [1, 0.95, 0.9, .4]

In [25]:
scipy.misc.imsave('lennon_tinted.png', img_tinted)
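
The tinting step works because an image is just a 3D array (height, width, channels), and numpy stretches the four-element list across every pixel. The sketch below shows the same idea without needing an image file. It uses a made-up 2 x 2 RGBA array called `fake_img`, which is an assumption for illustration rather than real image data.

```python
import numpy as np

# Hypothetical 2 x 2 RGBA image: every pixel is (200, 100, 50, 255).
fake_img = np.zeros((2, 2, 4), dtype=np.uint8)
fake_img[:, :] = [200, 100, 50, 255]

# One multiplier per channel: red, green, blue, alpha.
tint = [1, 0.95, 0.9, 0.4]
tinted = fake_img * tint

print(tinted.shape)  # (2, 2, 4)
print(tinted.dtype)  # float64

# Every pixel was scaled channel by channel.
print(np.allclose(tinted[0, 0], [200, 95, 45, 102]))  # True
```

Note that the result is a float array, not the original `uint8`, so it may need converting back before other tools will treat it as an image.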

<h1><center>Statistics ♥ Data</center></h1>

## Statistics

While not all data science relies on statistics, a lot of the exciting topics like machine learning or data analysis rely on statistical concepts. In this tutorial, we'll review introductory statistics. This isn't meant to be a comprehensive guide, but rather a starting point for your data science career.

#### In this first section, we'll begin by asking ourselves, "What is statistics?" 

It's very likely that you've come across statistics before, whether in an article, in the results for a test in school, or in pretty much any other context. But to put it formally, statistics is a discipline that uses data to support claims about populations. You'll come to learn that these "populations" are described by what we refer to as "distributions."


## ... And?

These distributions _are_ your data. Those test scores you and the rest of your classmates bombed? Data. And as we saw above, data isn't very useful without the operations we can perform on it. One example is the mean.

### Mean

You know what the mean is: you've heard it every time your computer science professor handed your midterms back and announced that the average, or mean, was a disappointing 59. Whoops.

With that said, the “average” is just one of many summary statistics you might choose to describe the typical value or the central tendency of a sample. As with the linear algebra above, `numpy` accommodates even the "simplest" of operations:

In [40]:
import numpy as np

scores = np.array([17,42,86,21,55,66])
# np.mean replaces scipy.mean, a deprecated alias for the same NumPy function
np.mean(scores)

47.833333333333336
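
Since the mean is only one summary statistic, it can help to see it next to another. The sketch below recomputes the mean by hand and compares it with the median for the same scores; `manual_mean` and `middle` are names made up for this example.

```python
import numpy as np

scores = np.array([17, 42, 86, 21, 55, 66])

# The mean is the sum divided by the count.
manual_mean = scores.sum() / len(scores)
print(np.isclose(manual_mean, np.mean(scores)))  # True

# The median is the middle value of the sorted data
# (the average of the two middle values when the count is even).
middle = np.median(scores)
print(middle)  # 48.5
```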

### Descriptive vs Inferential Statistics

Generally speaking, statistics is split into two subfields: descriptive and inferential. The difference is subtle, but important. Descriptive statistics summarize the data you actually have, whether that's a sample or an entire population. Inferential statistics, on the other hand, let us draw conclusions about a population from a sample of it. Unlike descriptive statistics, inferential statistics always carry some uncertainty, because they are calculated without access to the total population.
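
A small simulation can make the distinction tangible. Everything in this sketch is invented for illustration: the "population" is 10,000 random scores, and the sample size of 100 is arbitrary. The point is only that a sample mean estimates the population mean without matching it exactly.

```python
import numpy as np

rng = np.random.RandomState(0)  # fixed seed so the run is repeatable

# Hypothetical population: 10,000 test scores between 0 and 100.
population = rng.randint(0, 101, size=10000)

# Descriptive: summarize the data we actually have.
population_mean = population.mean()

# Inferential: estimate the population mean from a sample of 100 scores.
sample = rng.choice(population, size=100, replace=False)
sample_mean = sample.mean()

print(population_mean)
print(sample_mean)

# The gap is typically small, but rarely exactly zero.
print(abs(sample_mean - population_mean))
```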
