# Numpy & Pandas

## Numpy and Arrays

Numpy is a powerful numerical library that is essential for anyone analyzing data with Python. It is a large package that supports a multitude of tasks, and it is the foundation of SciPy, a library for scientific computing with capabilities for curve fitting, linear algebra, optimization, statistics, etc. Here we are just going to cover some of the basics of Numpy, but I encourage you to check out the Numpy documentation pages (https://numpy.org/doc/stable/) to get an idea of the variety of things you can do.

The array is the data type that is fundamental to Numpy. In some ways Numpy arrays are like Python lists:
    - both are used for storing data/objects
    - both are mutable
    - items can be extracted from both using indexing and slicing
    - both can be iterated over
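
As a minimal sketch of these shared behaviors (the names `items` and `arr` are illustrative and not used elsewhere in this notebook), the same syntax works on a list and on an array:

```python
import numpy as np

items = [10, 20, 30, 40]
arr = np.array([10, 20, 30, 40])

# indexing and slicing use the same syntax for both
print(items[0], arr[0])
print(items[1:3], arr[1:3])

# both are mutable: an element can be reassigned in place
items[0] = -1
arr[0] = -1

# both can be iterated over
list_total = sum(v for v in items)
arr_total = sum(v for v in arr)
```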

However, there are key aspects of arrays that make them very different:
    - most operators act on the elements of an array instead of the array as a whole
    - arrays can only hold data of a single type
    - arrays can efficiently store large amounts of data in memory
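
The single-type rule and the compact storage can be seen in a short sketch. The names `mixed`, `ints`, and `big` are made up for illustration, and the explicit `dtype=np.int64` is an assumption chosen so the byte count is the same on every platform:

```python
import numpy as np

# mixing ints and a float: the ints are upcast so every element shares one dtype
mixed = np.array([1, 2, 3.5])
print(mixed.dtype)  # float64

# an integer array keeps its integer dtype, so an assigned float is truncated
ints = np.array([1, 2, 3])
ints[0] = 7.9
print(ints)  # [7 2 3]

# elements sit in one contiguous buffer: 8 bytes each for int64
big = np.arange(1_000_000, dtype=np.int64)
print(big.itemsize, big.nbytes)  # 8 8000000
```

The silent truncation in the second step is worth keeping in mind: if fractional values matter, create the array with a float dtype from the start.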


In [1]:
import numpy as np

# create some sample lists
xlist = [1, 2, 3, 4]
ylist = [1, 4, 9, 16]

# create some sample arrays
x = np.array([1, 2, 3, 4])
y = np.array([1, 4, 9, 16])

First, let's check out how lists and arrays behave differently.

In [3]:
print(xlist * 4)

print(x * 4)

print(x / 4)

print(xlist / 4)

[1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4, 1, 2, 3, 4]
[ 4  8 12 16]
[0.25 0.5  0.75 1.  ]


TypeError: unsupported operand type(s) for /: 'list' and 'int'

Notice how multiplying the list by 4 repeated it 4 times, whereas multiplying the array by 4 multiplied each element by 4 and left its length unchanged.

Division works element-wise for arrays, but it is not defined for lists and raises a TypeError.
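
To make the contrast concrete, here is a small sketch that reuses the sample values from above. The result names (`quarter_list`, `quarter_arr`, `total`, `product`, `joined`) are illustrative only:

```python
import numpy as np

xlist = [1, 2, 3, 4]
x = np.array([1, 2, 3, 4])
y = np.array([1, 4, 9, 16])

# a list needs an explicit loop (here a comprehension) to divide each element
quarter_list = [v / 4 for v in xlist]

# arrays apply the operator to every element, or to matching pairs of elements
quarter_arr = x / 4
total = x + y      # [ 2  6 12 20]
product = x * y    # [ 1  8 27 64]

# + on lists concatenates instead of adding
joined = xlist + xlist
```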

## Iterating, indexing, and slicing

Iterating over a 1D array looks just like iterating over a list:

In [4]:
for val in xlist:
    print(val)

for val in x:
    print(val)

1
2
3
4
1
2
3
4


Iterating over an N-dimensional array yields slices along the first dimension. For a 2D array, each slice is one row.

In [6]:
y = np.zeros((5, 5))

for val in y:
    print(val)
    
# you could accomplish the same thing like this
for i in range(y.shape[0]):
    val = y[i]
    print(val)

[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
[0. 0. 0. 0. 0.]
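
The same rule holds for higher dimensions. The sketch below uses hypothetical arrays named `cube` and `grid`, filled with `np.arange` so the slices are easy to tell apart. It shows that iterating a 3D array yields 2D slices, and that transposing a 2D array is one way to walk over columns instead of rows:

```python
import numpy as np

# a 3D array: 2 blocks, each with 3 rows and 4 columns
cube = np.arange(24).reshape(2, 3, 4)

# iterating yields one 2D slice per index along the first dimension
block_shapes = [block.shape for block in cube]
print(block_shapes)  # [(3, 4), (3, 4)]

# for a 2D array the slices are rows; iterate over the transpose to get columns
grid = np.arange(6).reshape(2, 3)
rows = [row.tolist() for row in grid]
cols = [col.tolist() for col in grid.T]
print(rows)  # [[0, 1, 2], [3, 4, 5]]
print(cols)  # [[0, 3], [1, 4], [2, 5]]
```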
