# Arrays with `numpy`

- `numpy` (or NumPy) stands for Numerical Python
    + An open-source, linear algebra/numeric computing library considered the "universal standard" for numerical data in Python
    + At the core of the entirety of the scientific Python and PyData ecosystems
    + Provides the building blocks upon which a number of other libraries are built
        - Pandas
        - SciPy
        - Matplotlib
        - scikit-learn
        - statsmodels
- What `numpy` gives us are _arrays_ and _matrices_, which we likely took for granted in R
    + The **ndarray** structure gives us access to array and matrix data structures and tons of methods for performing standard linear algebra calculations
    + Additionally, operations are _vectorized_, as in R: arithmetic and functions apply to every element of an array without an explicit loop
- Worth noting - `numpy` is _fast_
    + Built on C; faster and more efficient than Python lists
    + Also less memory-hungry, and more convenient for our use-cases as econometricians
- Install `numpy` by running `conda install numpy` in the terminal or Anaconda prompt
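
To make "vectorized" concrete, here is a minimal sketch that applies the same arithmetic to a Python list and to an array. The data and variable names are made up for illustration.

```python
import numpy as np

prices = [1.0, 2.5, 4.0]   # hypothetical data, for illustration only

# With a plain list, elementwise math needs a loop or comprehension
doubled_list = [p * 2 for p in prices]

# With an ndarray, the operation applies to every element at once
prices_arr = np.array(prices)
doubled_arr = prices_arr * 2

print(doubled_list)   # [2.0, 5.0, 8.0]
print(doubled_arr)    # [2. 5. 8.]
```

Note that `prices * 2` on the list itself would repeat the list rather than double its values, which is one reason arrays are more convenient for numeric work.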

## `numpy` Practice

#### Import `numpy`

In [1]:
import numpy as np

#### Create a 1-D array of 20 zeros

In [2]:
np.zeros(20)

array([0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0., 0.,
       0., 0., 0.])

#### Create a 1-D array of the integers 1 to 20

In [4]:
np.arange(1,21)

array([ 1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11, 12, 13, 14, 15, 16, 17,
       18, 19, 20])

#### Create a 1-D array of the integers 0 to 20, counting by 5

In [5]:
np.arange(0,21,5)

array([ 0,  5, 10, 15, 20])
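
`np.arange` follows the same half-open convention as Python's `range()`: the start value is included and the stop value is not. A minimal sketch of how to include the upper endpoint (the variable names are illustrative):

```python
import numpy as np

# The stop value is excluded, so use 21 to get values up to and including 20
one_to_twenty = np.arange(1, 21)   # 1, 2, ..., 20
by_fives = np.arange(0, 21, 5)     # 0, 5, 10, 15, 20

print(one_to_twenty[-1])   # 20
print(by_fives)            # [ 0  5 10 15 20]
```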

#### Create a 4x4 2-D array (matrix) of the numbers 0 to 15

In [7]:
np.array(
    [
        [0,1,2,3],
        [4,5,6,7],
        [8,9,10,11],
        [12,13,14,15]
    ]
)

array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11],
       [12, 13, 14, 15]])
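
Typing out nested lists works, but the same matrix can also be built by generating the values and reshaping them. A short sketch of that alternative (`grid` is an illustrative name):

```python
import numpy as np

# Build 0..15 as a flat array, then reshape into 4 rows by 4 columns
grid = np.arange(16).reshape(4, 4)

print(grid.shape)   # (4, 4)
print(grid[3, 3])   # 15
```

`reshape` fills row by row by default, so the result matches the hand-typed version.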

#### Create a 3x3 2-D array (matrix) of random draws from the standard normal distribution

In [8]:
np.random.randn(9).reshape((3,3))

array([[ 0.62253459, -0.80534983, -0.33074739],
       [-1.88033255, -1.6237748 ,  1.0898316 ],
       [ 0.11509213, -0.28271394, -1.22148106]])
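
The values above will differ on every run because no seed is set. As a hedged alternative, NumPy's newer `Generator` interface takes a seed and a shape directly; the seed `42` below is an arbitrary choice for illustration.

```python
import numpy as np

rng = np.random.default_rng(42)       # seeded generator; 42 is arbitrary
draws = rng.standard_normal((3, 3))   # standard normal draws, already 3x3

# Re-creating the generator with the same seed reproduces the same draws
same_draws = np.random.default_rng(42).standard_normal((3, 3))

print(draws.shape)                        # (3, 3)
print(np.array_equal(draws, same_draws))  # True
```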

#### Create the following 3x3 array of evenly spaced values from 0 to 1

In [9]:
np.linspace(0,1,9).reshape((3,3))

array([[0.   , 0.125, 0.25 ],
       [0.375, 0.5  , 0.625],
       [0.75 , 0.875, 1.   ]])
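
Unlike `np.arange`, `np.linspace` takes the number of points rather than a step size, and it includes both endpoints by default. A small sketch using the `retstep` option to show the implied spacing:

```python
import numpy as np

# 9 evenly spaced points from 0 to 1, plus the spacing between them
values, step = np.linspace(0, 1, 9, retstep=True)

print(step)         # 0.125
print(values[-1])   # 1.0
```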

#### Given the following array, reproduce the output of the following cells

In [10]:
np.random.seed(999)
matrix = np.random.randint(0, 100, size=20).reshape((5,4))
matrix

array([[64, 92, 97, 72],
       [97, 91, 31, 89],
       [51, 16,  8,  8],
       [48, 69, 66, 11],
       [11, 21, 87, 50]])

In [11]:
matrix[4,3]

50

In [12]:
matrix[3,:]

array([48, 69, 66, 11])

In [13]:
matrix[:,2]

array([97, 31,  8, 66, 87])

In [14]:
matrix[0:3,1:4]

array([[92, 97, 72],
       [91, 31, 89],
       [16,  8,  8]])

In [15]:
matrix[:,0].reshape(5,1)

array([[64],
       [97],
       [51],
       [48],
       [11]])
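
Selecting a column with an integer index drops that axis, which is why the `reshape` is needed to get a 5x1 result. A sketch of two ways to keep the column two-dimensional from the start, using a stand-in array `m` rather than the seeded `matrix`:

```python
import numpy as np

m = np.arange(20).reshape(5, 4)   # stand-in 5x4 array, for illustration

col_1d = m[:, 0]        # integer index drops the axis -> shape (5,)
col_2d = m[:, 0:1]      # length-one slice keeps the axis -> shape (5, 1)
col_2d_alt = m[:, [0]]  # list index also keeps it -> shape (5, 1)

print(col_1d.shape, col_2d.shape, col_2d_alt.shape)   # (5,) (5, 1) (5, 1)
```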

#### How many unique values are in this matrix?

In [16]:
np.unique(matrix).size

17
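
`np.unique` returns the sorted distinct values themselves; how many there are is the size of that array. The sketch below uses a small made-up array to show both, along with the `return_counts` option for seeing how often each value occurs.

```python
import numpy as np

sample = np.array([[8, 8, 11], [11, 97, 50]])   # small made-up array

values, counts = np.unique(sample, return_counts=True)

print(values)        # [ 8 11 50 97]
print(counts)        # [2 2 1 1]
print(values.size)   # 4
```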

#### Take the square root of the 4th column

In [17]:
np.sqrt(matrix[:,3])

array([8.48528137, 9.43398113, 2.82842712, 3.31662479, 7.07106781])

#### Create two separate arrays from this matrix
- One should be 5x2, containing the values of the first two columns of `matrix`
- The second should also be 5x2, containing the values of the _last_ two columns of `matrix`
- Add these together
- Find the natural log of the 1st and 2nd column of the resulting array, separately

In [18]:
mat1 = matrix[:, 0:2]
mat1

array([[64, 92],
       [97, 91],
       [51, 16],
       [48, 69],
       [11, 21]])

In [19]:
mat2 = matrix[:, 2:]
mat2

array([[97, 72],
       [31, 89],
       [ 8,  8],
       [66, 11],
       [87, 50]])
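
One thing to keep in mind with slices like these: basic slicing returns a _view_ onto the original array, not a copy, so writing into the slice also changes the source. A minimal sketch with a stand-in array (`base` and the other names are illustrative):

```python
import numpy as np

base = np.arange(20).reshape(5, 4)   # stand-in for `matrix`
left_view = base[:, 0:2]             # basic slice: a view onto base
left_copy = base[:, 0:2].copy()      # independent copy

left_view[0, 0] = -1                 # also changes base[0, 0]
left_copy[0, 1] = -99                # base is unaffected

print(base[0, 0], base[0, 1])              # -1 1
print(np.shares_memory(base, left_view))   # True
```

Call `.copy()` when the slice needs to be modified independently.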

In [20]:
mat3 = mat1 + mat2
mat3

array([[161, 164],
       [128, 180],
       [ 59,  24],
       [114,  80],
       [ 98,  71]])

In [21]:
np.log(mat3[:,0])

array([5.08140436, 4.85203026, 4.07753744, 4.73619845, 4.58496748])

In [22]:
np.log(mat3[:,1])

array([5.09986643, 5.19295685, 3.17805383, 4.38202663, 4.26267988])
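
Because `np.log` (the natural log) is applied elementwise, both columns can also be handled in a single call and then sliced. A sketch, with the values copied from the `mat3` output above:

```python
import numpy as np

# Values copied from the mat3 output shown earlier
mat3 = np.array([[161, 164], [128, 180], [59, 24], [114, 80], [98, 71]])

logs = np.log(mat3)   # natural log of every element, shape preserved

print(logs.shape)                                     # (5, 2)
print(np.allclose(logs[:, 0], np.log(mat3[:, 0])))    # True
print(np.allclose(logs[:, 1], np.log(mat3[:, 1])))    # True
```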