# Numpy Basics

Probably the biggest shift when getting started with data science is the syntax of numpy and pandas because it differs so much from other programming paradigms.  In this section we will walk through some numpy basics:

* why numpy?
* introduction to tensors
* numpy shapes
* numpy slicing
* numpy querying
* linear algebra in numpy

## Why Numpy?

Technically, anything you can do in numpy you can do in plain old python.  And arguably the syntax will be easier to understand, that is, unless you use numpy as it is intended.  Numpy _can_ be used for all the basic stuff that you'll find most of the examples for on the internet.  But it's really _intended_ to be used for a new paradigm of programming.  One that's caught on in the statistical and deep learning communities (which have at least some overlap).  

Numpy's api and computation are optimized for the manipulation of algebraic structures.  You can use it to do most of the computation that you can do with vanilla Python, or any other programming language.  But you probably shouldn't.  We can think of the numpy api as sort of a directed language.  It's not quite that, because numpy is mostly "about" a change in syntax.  You are thinking about the world through a different lens.  But I digress.

Here are just a few of the benefits of numpy:

* incredible speed - for array-heavy computation, numpy is _much_ faster than looping in vanilla python (it can even outperform Java sometimes)
* a beautiful and well organized api
* tons of utility functions
* amazing documentation
* it's completely free (whaaaaat)

To really understand numpy and the power it brings, we need to understand tensors.  Because without them, numpy honestly doesn't make much sense, at least at first.  And even once you start to get used to the syntax, without the mental model of a tensor, you'll completely miss the point of using it.

## Tensors

Tensors are some of the most powerful objects around.  In fact, this book is basically just a "how do I use tensors" most of the time.  The chapter on linear regression?  That's just about tensors.  The chapter on classification?  More applications of tensors.  Much of machine learning is built on tensors.  Specifically, on matrices.  Because I define the matrix in another chapter, I won't go into a ton of detail here about what matrices are, or how to use them.

Basically, tensors are the generalization of matrices.  Examples of tensors include:

* scalars
* vectors
* matrices
* order-3 tensors and higher

A scalar is an order zero tensor - because it's just a single number, like say the number `5`.  A vector is an order one tensor - a one dimensional collection of numbers representing data or an equation, like:

$$ \begin{pmatrix}
1  \\
4  \\
7 
\end{pmatrix}
$$

A matrix is an order two tensor - a two dimensional collection of numbers representing a system of equations, like:

$$
\begin{pmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{pmatrix}
$$

An order three tensor looks like a data cube.  There is no easy way to show such a cube in latex, so you'll have to imagine this to some extent:

$$ A_{1} = 
\begin{pmatrix}
1 & 2 & 3 \\
4 & 5 & 6 \\
7 & 8 & 9
\end{pmatrix}
$$

$$ A_{2} = 
\begin{pmatrix}
3 & 2 & 3 \\
7 & 6 & 6 \\
7 & 2 & 9
\end{pmatrix}
$$

$$ A_{3} = 
\begin{pmatrix}
1 & 2 & 3 \\
7 & 6 & 4 \\
6 & 2 & 19
\end{pmatrix}
$$

Now imagine $A_{1}, A_{2}, A_{3}$ as one object.  This is an order three tensor.  It has three axes - $(i,j,k)$ and you can specify elements across these three axes.  Counting from zero, $(0,0,0) = 1$, $(1,0,0) = 4$, and $(2,0,2) = 6$.  Here $i$ is the row index, $j$ is the column index and $k$ is the matrix index.  You can also do this for order 4 and up to n, where n is any finite natural number you like.  Why might you want to ever do this in practice?  It turns out there are actually a ton of good reasons.
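To make the indexing concrete, here is a minimal sketch that builds the data cube from the three matrices above.  It assumes we stack them along a new last axis (my choice, so that the index order comes out as row, column, matrix) and that indices count from zero, as they do in numpy:

```python
import numpy as np

A1 = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
A2 = np.array([[3, 2, 3], [7, 6, 6], [7, 2, 9]])
A3 = np.array([[1, 2, 3], [7, 6, 4], [6, 2, 19]])

# Stack along a new last axis so that T[i, j, k] == A_(k+1)[i, j],
# i.e. the index order is (row, column, matrix).
T = np.stack([A1, A2, A3], axis=-1)

print(T.shape)     # (3, 3, 3)
print(T[0, 0, 0])  # 1 -> row 0, column 0 of the first matrix
print(T[1, 0, 0])  # 4 -> row 1, column 0 of the first matrix
print(T[2, 0, 2])  # 6 -> row 2, column 0 of the third matrix
```

Stacking along the first axis instead (the `np.stack` default) would put the matrix index first, so the same element would live at `T[2, 2, 0]`.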

Here are just two of them:

1. Let's say you want to model multivariate timeseries geospatial data.  This is naturally an order 4 tensor.  The first two dimensions will be each snapshot of multivariate data.  Your third dimension will be that snapshot over time.  And your fourth will be over time and different geographies.  Thinking about it this way is useful for capturing shared weights between time and geographies.  How you model your data matters.  And by ignoring the time or geospatial components of your data, you might lose some important information.

2. You can get a performance boost, statistically speaking.  As this paper shows: https://arxiv.org/pdf/1811.06569.pdf you can get a decent accuracy boost by treating your neural network as a higher order tensor.
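The layout from the first example might be sketched like this.  The sizes and variable names below are made up purely for illustration, and the array is filled with zeros rather than real data:

```python
import numpy as np

# Hypothetical sizes, chosen only for illustration.
n_samples, n_features, n_steps, n_regions = 10, 4, 24, 5

# Axes: (sample, feature, time step, geography).
data = np.zeros((n_samples, n_features, n_steps, n_regions))

print(data.ndim)   # 4
print(data.shape)  # (10, 4, 24, 5)

# One snapshot: every sample and feature at time step 0 in region 2.
snapshot = data[:, :, 0, 2]
print(snapshot.shape)  # (10, 4)
```

Fixing the last two indices gets you back an ordinary matrix, which is exactly the "snapshot of multivariate data" described above.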

## Numpy Shapes

Now that you know what a tensor is, the syntax of numpy will seem obvious and straightforward.  Let's start by showing how to represent each of the tensors we've discussed thus far:

In [1]:
# order 0 tensor
import numpy as np

# no brackets: wrapping the value in a list would create an order 1 tensor
scalar = np.array(1)
print(scalar)

1


You may think we've done nothing new here.  But actually we have!  For starters, numpy attaches types to anything passed into it.  And it does this _implicitly_.  You never have to name the types.  That by itself would be a feat of engineering prowess.  Let's see what I'm talking about:

In [3]:
scalar.dtype

dtype('int64')

The `dtype` property tells us what kind of data is in our tensor.  Since there are mathematical consequences to what's in our tensor, it's best to define one type per tensor.  Usually floats are the most flexible.  Of course, you can pass in values of multiple types.  When you do, numpy picks a single type that can hold all of them and converts every element to it, so in the example below the integer `1` becomes the string `'1'`.  For any serious mathematical computation, this is discouraged.  However, there are lots of programming instances when mixing types in a data structure is useful and important, which is why this paradigm is supported:

In [4]:
np.array(["hello", 1])

array(['hello', '1'], dtype='<U5')
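The same thing happens with purely numeric input.  Here is a small sketch (the variable names are mine) showing numpy settling on one type for mixed integers and floats, and how to ask for floats up front with the `dtype` argument:

```python
import numpy as np

# An integer and a float together: numpy promotes everything to float.
mixed_numbers = np.array([1, 2.5])
print(mixed_numbers.dtype)  # float64

# Asking for floats explicitly, even though every input is an integer.
as_float = np.array([1, 4, 7], dtype=np.float64)
print(as_float.dtype)     # float64
print(as_float.tolist())  # [1.0, 4.0, 7.0]
```

Naming the type yourself is optional, but it saves you from surprises when an array that happens to contain only integers later meets division or other float arithmetic.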

Now that we have seen an order 0 tensor and an order 1 tensor, by accident, let's define another order 1 tensor, called a vector:

In [5]:
vector = np.array([1, 4, 7])
print(vector)

[1 4 7]


There are a few things to note here:

1. a vector is a collection of scalars.
2. a vector represents a mathematical object, not just an array.

Because this is a mathematical object, we can do things like this:

In [10]:
vector_one = np.array([1, 4, 7])
vector_two = np.array([2, 4, 6])

np.matmul(vector_one, vector_two.T)

60

If you've ever taken a linear algebra course the answer that's produced may seem surprising.  That's because numpy treats a one dimensional array passed to `np.array` as a plain array of scalars with shape `(3,)`, rather than as a row or column vector.  Transposing it with `.T` changes nothing, so `matmul` has no orientation to work with and returns the inner product.  The difference here is important.
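You can check this for yourself.  The sketch below reuses the two arrays from the cell above under shorter names (`v` and `w` are my own) and shows that the transpose leaves the shape alone, and that `matmul`, `dot` and the `@` operator all agree on the inner product:

```python
import numpy as np

v = np.array([1, 4, 7])
w = np.array([2, 4, 6])

# A one dimensional array has a single axis, so .T has nothing to swap.
print(v.shape, v.T.shape)  # (3,) (3,)

# All three spellings compute 1*2 + 4*4 + 7*6.
print(np.matmul(v, w))  # 60
print(np.dot(v, w))     # 60
print(v @ w)            # 60
```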

Because algebraic objects are defined in part by the algebraic operators attached to them, this detail matters.  Specifically, the "multiplication" attached to our one dimensional arrays here is the inner product.  If we want the outer product, which is what most folks who have taken linear algebra would expect from a column vector times a row vector, then we need to tell numpy that we are working with tensors of order 1, aka vectors, and not a collection of scalars.

We do that by using the `reshape` method, a powerful tool that will allow us to represent tensors of any order we like.  But first let's start with the basics of turning a collection of scalars into an order 1 tensor, aka a vector:

In [17]:
vector_one = np.array([1, 4, 7])
vector_two = np.array([2, 4, 6])

vector_one = vector_one.reshape(3, 1)
vector_two = vector_two.reshape(1, 3)
print("vector one:")
print(vector_one)
print("vector two:", vector_two)
print("result:")
np.matmul(vector_one, vector_two)

vector one:
[[1]
 [4]
 [7]]
vector two: [[2 4 6]]
result:


array([[ 2,  4,  6],
       [ 8, 16, 24],
       [14, 28, 42]])
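As a cross-check on the cell above, here is a short sketch comparing the reshaped `matmul` against numpy's built-in `np.outer`, which takes the plain one dimensional arrays directly.  The names `column` and `row` are mine:

```python
import numpy as np

v = np.array([1, 4, 7])
w = np.array([2, 4, 6])

column = v.reshape(3, 1)  # shape (3, 1): three rows, one column
row = w.reshape(1, 3)     # shape (1, 3): one row, three columns

product = np.matmul(column, row)
print(product.shape)  # (3, 3)

# np.outer gives the same matrix without any reshaping.
print(np.array_equal(product, np.outer(v, w)))  # True
```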

By reshaping our vectors to the appropriate shapes, we were able to produce a matrix!  This is a general fact of linear algebra - multiplying a column vector by a row vector with matrix multiplication (`matmul`) gives you a matrix, and the result is known as the outer product.  This "trick" of combining two lower order tensors to create a higher order one will actually work for _any_ tensors we like.  If we want to recover an order 3 tensor we simply need to take the tensor product of a matrix and a vector.  That's because orders add under the tensor product: $2 + 1 = 3$.  Let's see an example:

In [55]:
a = np.array([[ 5, 1 ,3], 
              [ 1, 1 ,1], 
              [ 1, 2 ,1]])
b = np.array([1, 2, 3])
print(np.tensordot(a, b, axes=0))

[[[ 5 10 15]
  [ 1  2  3]
  [ 3  6  9]]

 [[ 1  2  3]
  [ 1  2  3]
  [ 1  2  3]]

 [[ 1  2  3]
  [ 2  4  6]
  [ 1  2  3]]]


Here `a` is a matrix, `b` is a vector.  And by taking the tensor product of the two of them, we recover an order 3 tensor!  Notice we have to provide an `axes` argument to the `tensordot` function.  This is because `tensordot` covers a whole family of products, and `axes` says how many axes get summed over.  With `axes=0` nothing is summed, which gives the tensor product; for two vectors that is the outer product we've already seen.  With `axes=1` one axis is summed; for two vectors that is the inner product we've already seen.  In higher spaces, we generally refer to the product as simply the tensor product where the order comes from context.  However, please take care to be clear about the shapes of your tensors, otherwise you'll end up doing the _wrong multiplication_.
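To see how much the `axes` argument matters, here is a sketch that runs `tensordot` on the same `a` and `b` with two different settings.  The names `outer` and `contracted` are mine:

```python
import numpy as np

a = np.array([[5, 1, 3],
              [1, 1, 1],
              [1, 2, 1]])
b = np.array([1, 2, 3])

# axes=0: no axes are summed, so the orders add (2 + 1 = 3).
outer = np.tensordot(a, b, axes=0)
print(outer.shape)  # (3, 3, 3)

# axes=1: the last axis of a is summed against the first axis of b,
# which is the ordinary matrix-vector product.
contracted = np.tensordot(a, b, axes=1)
print(contracted.tolist())                # [16, 6, 8]
print(np.array_equal(contracted, a @ b))  # True
```

Same inputs, same function, and one call returns an order 3 tensor while the other returns a vector.  That is the "wrong multiplication" trap in practice.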

I'll leave as an exercise creating tensors of order 4, 5, and 6.  Happy multiplying!
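One way to check your answers, sketched here on tensors we've already met, is the `ndim` attribute (the order) together with `shape` (the size along each axis):

```python
import numpy as np

scalar = np.array(5)
vector = np.array([1, 4, 7])
matrix = np.array([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

for name, tensor in [("scalar", scalar), ("vector", vector), ("matrix", matrix)]:
    print(name, tensor.ndim, tensor.shape)
# scalar 0 ()
# vector 1 (3,)
# matrix 2 (3, 3)
```

If your order 4 tensor reports `ndim` of 4, you're on the right track.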

## Numpy Slicing

Now that we've seen how to create our tensors