# Numpy Basics

Numpy is a library for creating and operating on numpy arrays, which behave like vectors and matrices (and can have more than two dimensions).

Why use numpy arrays instead of lists? Because lists are not vectors. For instance, the `+` operator does not add two lists element-wise, as we might expect (remember, lists don't even have to contain numbers). Rather, it concatenates the two lists.

In [1]:
MyList1 = [1,2,3]
MyList2 = [1,0,0]
print(MyList1 + MyList2)

[1, 2, 3, 1, 0, 0]


To actually add the two lists elementwise, we need to import the numpy module.

In [2]:
import numpy as np

To use any function inside the numpy module, we have to add `np.` in front of it. For example, to use the array function, which constructs a numpy array, we use `np.array()`. 

If we had instead only typed `import numpy`, then we would have to spell out `numpy.array()` instead. We can choose any abbreviation for the package we want, but `np` is traditional.
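
As a small sketch of the two import styles (the variable names `a` and `b` are only illustrative), both spellings reach the same module, so the arrays they build are identical:

```python
import numpy            # full module name
import numpy as np      # the conventional abbreviation

a = numpy.array([1, 2, 3])   # spelled out
b = np.array([1, 2, 3])      # abbreviated

# Both names refer to the same module object.
print(np is numpy)
# The two arrays hold the same values.
print((a == b).all())
```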

Now we can define two numpy arrays using `MyList1` and `MyList2` and add them properly.

In [3]:
x = np.array(MyList1) # to use any numpy functions, need to put 'np.' in front
y = np.array(MyList2)
print(x+y)

[2 2 3]


To define matrices, we specify a list of lists.

In [4]:
z = np.array([[1,2], [3,4], [4,5]]) # matrices
print(z)
print(z.shape) # get dimensions
print(z.shape[0]) # get number of rows

[[1 2]
 [3 4]
 [4 5]]
(3, 2)
3


In [5]:
# convenient functions for initializing matrices
a = np.zeros((2,2)) # the shape is passed as a tuple (parentheses), hence the double parentheses
print(a)
b = np.ones((1,2))
print(b)
c = np.eye(2)
print(c)
d = np.random.random((2,2))
print(d)

[[0. 0.]
 [0. 0.]]
[[1. 1.]]
[[1. 0.]
 [0. 1.]]
[[0.96087023 0.32015994]
 [0.60677924 0.9642009 ]]


In [6]:
# side note: how to set seeds
np.random.seed(seed=0)
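
To see what setting a seed buys us, here is a minimal sketch (the names `first` and `second` are illustrative): resetting the same seed before drawing reproduces exactly the same random numbers.

```python
import numpy as np

np.random.seed(0)
first = np.random.random(3)    # three draws after seeding

np.random.seed(0)              # reset to the same seed
second = np.random.random(3)   # the same three draws again

print(first)
print(second)
print(np.array_equal(first, second))  # True: the draws are identical
```

Newer numpy code often creates a generator with `np.random.default_rng(seed)` instead of seeding the global state, but the idea is the same.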

## Matrix Math

Component-wise operations are the same as for scalars:

In [7]:
x = np.random.random((2,2))
y = np.random.random((2,2))
print(x+y)
print(x-y)
print(x*y)
print(x/y) # note these are all elementwise

[[0.9724683  1.36108348]
 [1.04035059 1.43665618]]
[[ 0.1251587   0.06929525]
 [ 0.16517616 -0.34688982]]
[[0.23250747 0.4619366 ]
 [0.26376154 0.48591211]]
[[1.29542615 1.10728578]
 [1.37747027 0.61101108]]


Some special functions for taking square-roots or raising to a certain power:

In [8]:
print(np.sqrt(x))
print(np.power(x,2))

[[0.74081948 0.84568869]
 [0.77637837 0.73816203]]
[[0.30119626 0.51149583]
 [0.36332369 0.29689768]]
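
As a side note, Python's `**` operator also works element-wise on arrays. A short sketch, using a hand-picked matrix `m` (an illustrative name) so the results are easy to check by eye:

```python
import numpy as np

m = np.array([[1.0, 4.0],
              [9.0, 16.0]])

print(np.sqrt(m))       # element-wise square roots: 1, 2, 3, 4
print(m ** 2)           # element-wise squares, same as np.power(m, 2)

# The operator and the function forms agree.
print(np.array_equal(m ** 2, np.power(m, 2)))
print(np.allclose(m ** 0.5, np.sqrt(m)))
```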


Important matrix algebra functions:

In [9]:
print(x.dot(y)) # matrix product (for 2-D arrays, the same as x @ y)
print(np.linalg.inv(x)) # inverse
print(x.T) # transpose

[[0.5454652  0.99226198]
 [0.49379751 0.87523343]]
[[-4.12631777  5.41602068]
 [ 4.5646357  -4.15608149]]
[[0.5488135  0.60276338]
 [0.71518937 0.54488318]]
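
It is easy to confuse the matrix product with the element-wise `*` from earlier. The sketch below uses two small hand-picked matrices (`A` and `B` are illustrative names) to show the difference and to check that a matrix times its inverse gives the identity, up to floating-point error.

```python
import numpy as np

A = np.array([[2.0, 1.0],
              [1.0, 3.0]])
B = np.array([[1.0, 0.0],
              [4.0, 1.0]])

print(A @ B)        # matrix product, same as A.dot(B)
print(A * B)        # element-wise product, a different result

# A times its inverse should be the identity matrix.
A_inv = np.linalg.inv(A)
print(np.allclose(A @ A_inv, np.eye(2)))
```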


**Exercise:** Let x be a random 100x3 matrix and y a random 100x1 vector. Print the OLS estimator of a regression of y on x.

## Summary Statistics

In [10]:
x = np.random.random(100)
print(x.mean())
print(x.std())
print(x.max())
print(x.min())

0.4710323678966207
0.29141369298399705
0.9883738380592262
0.004695476192547066
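
These methods also work on matrices, where an optional `axis` argument picks the direction to summarize over. A minimal sketch with a hand-picked matrix `m` (an illustrative name):

```python
import numpy as np

m = np.array([[1.0, 2.0, 3.0],
              [4.0, 5.0, 6.0]])

print(m.mean())         # 3.5, mean over all entries
print(m.mean(axis=0))   # [2.5 3.5 4.5], one mean per column
print(m.mean(axis=1))   # [2. 5.], one mean per row
print(m.max(axis=0))    # [4. 5. 6.], column maxima

# std() divides by n by default; ddof=1 gives the sample version (n - 1).
print(m[0].std(ddof=1)) # 1.0 for the row [1, 2, 3]
```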


## Slicing

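Slicing uses the same `start:stop` syntax as for lists, except that we give one index or slice per dimension, separated by commas.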

In [11]:
print(z)
print("{},{}".format(z[0,0],z[0,1]))
print(z[0,0:2])
print(z[0,:])
print(z[0:2,0:2])
print(z[:,0].shape)

[[1 2]
 [3 4]
 [4 5]]
1,2
[1 2]
[1 2]
[[1 2]
 [3 4]]
(3,)
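
To make the comparison with lists concrete, here is a small sketch (the names `nested` and `first_col` are illustrative). A list of lists needs chained brackets and cannot slice out a column directly, while the array version takes both indices at once.

```python
import numpy as np

nested = [[1, 2], [3, 4], [4, 5]]
z = np.array(nested)

print(nested[0][1])   # 2, lists need one bracket pair per level
print(z[0, 1])        # 2, arrays take both indices together

print(z[:, 0])        # [1 3 4], the first column in one step
first_col = [row[0] for row in nested]   # the list equivalent needs a loop
print(first_col)      # [1, 3, 4]

# Comma-separated indices are not valid for plain lists.
try:
    nested[0, 1]
except TypeError as err:
    print("lists reject this:", type(err).__name__)
```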


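Notice that slicing out a single column gives a one-dimensional array of shape `(3,)`, which numpy prints horizontally and treats like a row when combining it with a matrix. Sometimes we need vectors to be vertical. Suppose I want to add 1 to the first row of z and 7 to the second.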

In [12]:
AddMe = np.array([1,7])
print(AddMe)
print(z[0:2,0:2] + AddMe)
print(AddMe[:,np.newaxis]) # convert to vertical vector
print(z[0:2,0:2] + AddMe[:,np.newaxis])

[1 7]
[[ 2  9]
 [ 4 11]]
[[1]
 [7]]
[[ 2  3]
 [10 11]]
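
The shapes explain what happened here. The sketch below repeats the example with the shapes printed, and uses `reshape(-1, 1)` as an alternative spelling of `[:, np.newaxis]` (the names `add_me` and `col` are illustrative).

```python
import numpy as np

z = np.array([[1, 2], [3, 4], [4, 5]])
add_me = np.array([1, 7])

print(add_me.shape)            # (2,), one-dimensional
col = add_me.reshape(-1, 1)    # same data as add_me[:, np.newaxis]
print(col.shape)               # (2, 1), a column

# (2, 2) + (2,): add_me is added to every row.
print(z[0:2] + add_me)
# (2, 2) + (2, 1): entry i of add_me is added to row i.
print(z[0:2] + col)
```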


Numpy has functions for generating random numbers. The one used here, `np.random.random`, draws from the uniform distribution on [0, 1). For other distributions, look up `np.random`.

In [13]:
x = np.random.random(5)
print(x)

[0.22308163 0.95274901 0.44712538 0.84640867 0.69947928]
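
As a sketch of what else is available, here are draws from three distributions in `np.random`. The parameter values are arbitrary choices for illustration, and no outputs are shown because they depend on the seed and numpy version.

```python
import numpy as np

np.random.seed(0)

u = np.random.random(5)                            # uniform on [0, 1)
n = np.random.normal(loc=0.0, scale=1.0, size=5)   # normal, mean 0 and std 1
k = np.random.randint(low=0, high=10, size=5)      # integers from 0 to 9

print(u)
print(n)
print(k)
```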


Here's another slicing trick. Suppose we wanted to print elements of x that are above 0.5.

In [14]:
x = np.random.random(5)
print(x)

indices = x>0.5 # creates a vector of booleans
print(indices)

print(x[indices]) # keeps only the entries of x where the matching entry of indices is True

[0.29743695 0.81379782 0.39650574 0.8811032  0.58127287]
[False  True False  True  True]
[0.81379782 0.8811032  0.58127287]
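
The boolean vector does not need its own variable, and it can be counted or combined with other conditions. A minimal sketch on a hand-picked vector `v` (an illustrative name):

```python
import numpy as np

v = np.array([0.1, 0.6, 0.4, 0.9, 0.55])

above = v[v > 0.5]                  # condition written inline: 0.6, 0.9, 0.55
count = (v > 0.5).sum()             # True counts as 1, so this counts matches
band = v[(v > 0.5) & (v < 0.8)]     # combine conditions with &: 0.6, 0.55

print(above)
print(count)
print(band)
```

The parentheses around each condition are needed because `&` binds more tightly than the comparison operators.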


**Exercise:** Print the components of x such that their squares are less than 0.25.