# Numpy tutorial

Oliver W. Layton

CS251/2: Data Analysis and Visualization

Spring 2021

In [1]:
import numpy as np
import time

## Numpy ndarray basics

### Creation from Python lists

In [3]:
# Make a numpy array from a 2D python list
arr = [[1,2,3], [4,5,6]]
print(arr)
arr = np.array(arr)
arr

[[1, 2, 3], [4, 5, 6]]


array([[1, 2, 3],
       [4, 5, 6]])

In [4]:
# print it
print(arr)

[[1 2 3]
 [4 5 6]]


### Data type of ndarray

In [6]:
# determine data type
arr.dtype
print(f'The data type of arr is {arr.dtype}')

The data type of arr is int64


Type can be changed in a few ways. 

1. when creating array — (a) implicitly or (b) explicitly
2. by casting types.

In [8]:
# 1a implicitly
arr = np.array([[1.,2.,3.], [4.,5.,6.]])
print(arr)
print(f'The data type of arr is {arr.dtype}')

[[1. 2. 3.]
 [4. 5. 6.]]
The data type of arr is float64


In [9]:
# 1b explicitly
arr = np.array([[1,2,3], [4,5,6]], dtype=float)
print(arr)
print(f'The data type of arr is {arr.dtype}')

[[1. 2. 3.]
 [4. 5. 6.]]
The data type of arr is float64


In [11]:
# 2. NOTE: This is a METHOD of the array, not a FUNCTION
arr =  np.array([[1,2,3], [4,5,6]])
arrFloat = arr.astype(float)
print(arrFloat)
print(f'The data type of arr is {arrFloat.dtype}')

[[1. 2. 3.]
 [4. 5. 6.]]
The data type of arr is float64


In [12]:
# Can also be a string type. Be careful in your CSV parser that your "numbers"
# aren't actually strings!
arr =  np.array([[1,2,3], [4,5,6]], dtype=str)
arr

array([['1', '2', '3'],
       ['4', '5', '6']], dtype='<U1')
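
If values do arrive as strings (say, from a hand-written CSV parser), casting with `astype` should recover numeric data. A minimal sketch; the names `str_arr` and `num_arr` are made up for illustration:

```python
import numpy as np

# Hypothetical array of numeric strings, e.g. as read from a CSV file
str_arr = np.array([['1', '2', '3'], ['4', '5', '6']])
print(str_arr.dtype)   # a unicode string dtype, not a number type

# Cast every element to float; raises ValueError if a string is not numeric
num_arr = str_arr.astype(float)
print(num_arr.dtype)
print(num_arr.sum())   # arithmetic now works
```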

### Convert back to Python list

In [13]:
# Convert back from ndarray to Python list
arrAsList = arr.tolist()
print('Back as a Python list:\n', arrAsList)

Back as a Python list:
 [['1', '2', '3'], ['4', '5', '6']]


In [15]:
arrAsList.dtype

AttributeError: 'list' object has no attribute 'dtype'

### Other ways to create ndarrays quickly

#### 1. zeros

- Pass a list of sizes to get a multi-dimensional array
- Pass a single int to get a 1D vector of that length

In [19]:
z = np.zeros([2, 10])
z

array([[0., 0., 0., 0., 0., 0., 0., 0., 0., 0.],
       [0., 0., 0., 0., 0., 0., 0., 0., 0., 0.]])

In [21]:
np.zeros(2)

array([0., 0.])

#### 2. ones

In [22]:
one = np.ones([5, 5])
one

array([[1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.],
       [1., 1., 1., 1., 1.]])

In [24]:
# can easily make any constant array
nines = 9*one
nines

array([[9., 9., 9., 9., 9.],
       [9., 9., 9., 9., 9.],
       [9., 9., 9., 9., 9.],
       [9., 9., 9., 9., 9.],
       [9., 9., 9., 9., 9.]])

#### 3. Random values

In [31]:
# Uniform random values
rands = np.random.random([3, 5])
rands

array([[0.8560886 , 0.34850525, 0.38516635, 0.10719155, 0.53227522],
       [0.98850691, 0.09966659, 0.7657851 , 0.67378813, 0.1459946 ],
       [0.62317115, 0.01133387, 0.49331771, 0.02039061, 0.78240208]])

#### 4. Equally spaced floats in an interval

In [33]:
np.linspace(0, 10)

array([ 0.        ,  0.20408163,  0.40816327,  0.6122449 ,  0.81632653,
        1.02040816,  1.2244898 ,  1.42857143,  1.63265306,  1.83673469,
        2.04081633,  2.24489796,  2.44897959,  2.65306122,  2.85714286,
        3.06122449,  3.26530612,  3.46938776,  3.67346939,  3.87755102,
        4.08163265,  4.28571429,  4.48979592,  4.69387755,  4.89795918,
        5.10204082,  5.30612245,  5.51020408,  5.71428571,  5.91836735,
        6.12244898,  6.32653061,  6.53061224,  6.73469388,  6.93877551,
        7.14285714,  7.34693878,  7.55102041,  7.75510204,  7.95918367,
        8.16326531,  8.36734694,  8.57142857,  8.7755102 ,  8.97959184,
        9.18367347,  9.3877551 ,  9.59183673,  9.79591837, 10.        ])

In [34]:
np.linspace(0, 10, 5)

array([ 0. ,  2.5,  5. ,  7.5, 10. ])

#### 5. Equally spaced ints in an interval

In [36]:
np.arange(10)

array([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])

In [37]:
np.arange(5, 10)

array([5, 6, 7, 8, 9])
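
`np.arange` also accepts a step size as a third argument. The short sketch below contrasts it with `np.linspace`, which takes a number of points instead; the variable names `evens` and `points` are only illustrative:

```python
import numpy as np

# Third argument to np.arange is the step size; the stop value is excluded
evens = np.arange(0, 10, 2)
print(evens)            # [0 2 4 6 8]

# np.linspace instead takes the NUMBER of points and includes the endpoint
points = np.linspace(0, 10, 6)
print(points)           # [ 0.  2.  4.  6.  8. 10.]
```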

#### 6. Identity matrix

In [40]:
np.eye(3)

array([[1., 0., 0.],
       [0., 1., 0.],
       [0., 0., 1.]])

### Check dimensions — `shape`

In [42]:
# check shape of 3D array
one = np.ones([3,4,5])
print(one)
print(one.shape)

[[[1. 1. 1. 1. 1.]
  [1. 1. 1. 1. 1.]
  [1. 1. 1. 1. 1.]
  [1. 1. 1. 1. 1.]]

 [[1. 1. 1. 1. 1.]
  [1. 1. 1. 1. 1.]
  [1. 1. 1. 1. 1.]
  [1. 1. 1. 1. 1.]]

 [[1. 1. 1. 1. 1.]
  [1. 1. 1. 1. 1.]
  [1. 1. 1. 1. 1.]
  [1. 1. 1. 1. 1.]]]
(3, 4, 5)


In [43]:
# check number of dimensions (M)
one.ndim

3

In [44]:
# Access 1st dim (#rows), 2nd dim (#cols) (Use f-string)
print(f'Num of rows: {one.shape[0]} Num of cols: {one.shape[1]}')

Num of rows: 3 Num of cols: 4


In [46]:
# Check number of elements total
print('Num elements in one:', one.size)

Num elements in one: 60


In [47]:
3*4*5

60

## Brief detour: Rapidly build python lists (list comprehension)

In [1]:
# Brief detour: In python you can replace the workflow of 
# list-building by creating an empty list and looping to append...
myList = []
for i in range(5):
    myList.append(i)
print('myList build the usual way', myList)

myList build the usual way [0, 1, 2, 3, 4]


In [5]:
# ...with Python list comprehensions
myListComp = [i for i in range(5)]
print('myListComp', myListComp)

myListComp [0, 1, 2, 3, 4]


In [6]:
# you can build lists using any function of i. How about i squared?
myListSqr = [i**2 for i in range(5)]
print('myListSqr', myListSqr)

myListSqr [0, 1, 4, 9, 16]


## ndarray indexing

Basic Accessing and modifying of ndarrays.

### Access and modify single elements

In [57]:
# To access elements in a multidimensional ndarray use ONE set of square brackets []
# Make a new random array
np.random.seed(0)  # ensures random numbers come up the same each time. Useful for debugging.
arr = np.random.random([3, 5])
print(arr)

[[0.5488135  0.71518937 0.60276338 0.54488318 0.4236548 ]
 [0.64589411 0.43758721 0.891773   0.96366276 0.38344152]
 [0.79172504 0.52889492 0.56804456 0.92559664 0.07103606]]


In [59]:
# Get the 1st element
arr[0,0]

0.5488135039273248

In [61]:
arr[1,0]

0.6458941130666561

In [62]:
arr[1,1]

0.4375872112626925

In [63]:
# Modifying single values is similar
arr[0,0] = 9
print('arr is now:\n', arr)

arr is now:
 [[9.         0.71518937 0.60276338 0.54488318 0.4236548 ]
 [0.64589411 0.43758721 0.891773   0.96366276 0.38344152]
 [0.79172504 0.52889492 0.56804456 0.92559664 0.07103606]]


### Slicing: real power of numpy

Use **colon** notation for all values in a dimension

Access and modify different ranges of data along different dimensions 

Make a 3x5 random array. Access 2nd column

In [8]:
np.random.seed(0)
rands = np.random.random([3, 5])
rands

array([[0.5488135 , 0.71518937, 0.60276338, 0.54488318, 0.4236548 ],
       [0.64589411, 0.43758721, 0.891773  , 0.96366276, 0.38344152],
       [0.79172504, 0.52889492, 0.56804456, 0.92559664, 0.07103606]])

In [9]:
rands[:, 1]

array([0.71518937, 0.43758721, 0.52889492])

Access 1st row

In [11]:
rands[0]

array([0.5488135 , 0.71518937, 0.60276338, 0.54488318, 0.4236548 ])

In [12]:
rands[0, :]

array([0.5488135 , 0.71518937, 0.60276338, 0.54488318, 0.4236548 ])

Access last 2 columns

In [13]:
rands[:, -2:]

array([[0.54488318, 0.4236548 ],
       [0.96366276, 0.38344152],
       [0.92559664, 0.07103606]])

In [14]:
rands

array([[0.5488135 , 0.71518937, 0.60276338, 0.54488318, 0.4236548 ],
       [0.64589411, 0.43758721, 0.891773  , 0.96366276, 0.38344152],
       [0.79172504, 0.52889492, 0.56804456, 0.92559664, 0.07103606]])

Access columns at indices 1-2 and in 1st row. Careful about off-by-one.

- Low range (before :) CONTAINS that index
- High range (after :) DOES NOT contain that index: a stop value of i ends at index i-1

In [16]:
rands[0, 1:2] # NOT what we want

array([0.71518937])

In [17]:
rands[0, 1:3]

array([0.71518937, 0.60276338])

Use slicing to assign values efficiently in batch without loops

In [18]:
# Assign 1st row to -1s
rands[0] = -1
rands

array([[-1.        , -1.        , -1.        , -1.        , -1.        ],
       [ 0.64589411,  0.43758721,  0.891773  ,  0.96366276,  0.38344152],
       [ 0.79172504,  0.52889492,  0.56804456,  0.92559664,  0.07103606]])

In [20]:
np.arange(5)

array([0, 1, 2, 3, 4])

In [21]:
# Assign 2nd row to increasing ints
rands[1] = np.arange(5)
rands

array([[-1.        , -1.        , -1.        , -1.        , -1.        ],
       [ 0.        ,  1.        ,  2.        ,  3.        ,  4.        ],
       [ 0.79172504,  0.52889492,  0.56804456,  0.92559664,  0.07103606]])

In [22]:
rands.shape

(3, 5)

In [23]:
rands[1] = np.arange(rands.shape[1])
rands

array([[-1.        , -1.        , -1.        , -1.        , -1.        ],
       [ 0.        ,  1.        ,  2.        ,  3.        ,  4.        ],
       [ 0.79172504,  0.52889492,  0.56804456,  0.92559664,  0.07103606]])

In [24]:
# Multiply the 3rd row by 5 and update the row
rands[2] = 5*rands[2]
rands

array([[-1.        , -1.        , -1.        , -1.        , -1.        ],
       [ 0.        ,  1.        ,  2.        ,  3.        ,  4.        ],
       [ 3.95862519,  2.6444746 ,  2.84022281,  4.62798319,  0.35518029]])

### What if we want to access a set of rows or columns that are not adjacent?

Can't use colon notation. Instead use `np.ix_`

In [25]:
rands

array([[-1.        , -1.        , -1.        , -1.        , -1.        ],
       [ 0.        ,  1.        ,  2.        ,  3.        ,  4.        ],
       [ 3.95862519,  2.6444746 ,  2.84022281,  4.62798319,  0.35518029]])

Example: Say we want column indices 0, 2, 4 and all rows.

**Syntax for `np.ix_`:**
- `np.ix_` goes inside the square brackets: `arr[np.ix_(blah)]`
- Give it `M` arguments (e.g. 2 for a 2D matrix).
- Each argument is a Python list (or ndarray) of indices to take along that dimension.

In [26]:
rands.shape[0]

3

In [28]:
np.arange(rands.shape[0])

array([0, 1, 2])

In [29]:
# pick out a collection of cols
rands[np.ix_(np.arange(rands.shape[0]), [0, 2, 4])]

array([[-1.        , -1.        , -1.        ],
       [ 0.        ,  2.        ,  4.        ],
       [ 3.95862519,  2.84022281,  0.35518029]])

In [30]:
# pick out a collection of rows
rands[np.ix_([0, 2], np.arange(rands.shape[1]))]

array([[-1.        , -1.        , -1.        , -1.        , -1.        ],
       [ 3.95862519,  2.6444746 ,  2.84022281,  4.62798319,  0.35518029]])
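
`np.ix_` is most useful when BOTH axes need non-adjacent indices. The sketch below, on a made-up array called `grid`, compares it with two other indexing forms that look similar but behave differently:

```python
import numpy as np

grid = np.arange(15).reshape(3, 5)   # hypothetical 3x5 array holding 0..14

# np.ix_ takes every combination of the listed rows and columns
block = grid[np.ix_([0, 2], [0, 2, 4])]
print(block.shape)    # (2, 3)

# A plain list on one axis plus a colon on the other also works
cols = grid[:, [0, 2, 4]]
print(cols.shape)     # (3, 3)

# Two plain lists are PAIRED element by element instead: picks (0, 0) and (2, 4)
pairs = grid[[0, 2], [0, 4]]
print(pairs)          # [ 0 14]
```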

## Memory

- Numpy tries to be efficient with arrays, so plain assignment (`b = a`) does not copy any data: both names refer to the same array. To get an independent copy, use the `.copy()` method.

In [31]:
a = np.linspace(-1, 1, 5)
a

array([-1. , -0.5,  0. ,  0.5,  1. ])

In [32]:
b = a
b[0] = 99

print(b)

[99.  -0.5  0.   0.5  1. ]


In [33]:
# changed a!
a

array([99. , -0.5,  0. ,  0.5,  1. ])

In [34]:
# fixed with .copy()
a = np.linspace(-1, 1, 5)
b = a.copy()
b[0] = 99
print(a)
print(b)

[-1.  -0.5  0.   0.5  1. ]
[99.  -0.5  0.   0.5  1. ]
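
The same sharing applies to slices, which return views rather than copies. A small sketch using `np.shares_memory` to check; the names `view` and `independent` are illustrative only:

```python
import numpy as np

a = np.arange(6)
view = a[1:4]          # slicing returns a view onto the same memory
view[0] = 99           # ...so this also changes a[1]
print(a)               # [ 0 99  2  3  4  5]
print(np.shares_memory(a, view))          # True

independent = a[1:4].copy()   # an explicit copy owns its own memory
independent[0] = -1
print(a[1])            # still 99
print(np.shares_memory(a, independent))   # False
```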


## Apply functions over dimensions (`axes`)

- Axes are the numpy term for different ndarray dimensions. 
- *Idea*: Do we want to apply an operation (e.g. sum) on the rows OR columns of a ndarray?
- *Example*: axis 0 is the row dimension, axis 1 is the column dimension, etc.
- We can apply functions over one or more axes super efficiently in one line of code! This is called **Vectorization**, and it is MUCH MUCH faster than loops (stay tuned).

In [35]:
one = np.array([[1, 1, 1], [2, 2, 2], [3, 3, 3], [4, 4, 4]])
one

array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3],
       [4, 4, 4]])

In [36]:
one.shape

(4, 3)

Sum along rows -> "collapse" across rows to get sum within each column — 3 numbers

In [37]:
np.sum(one, axis=0)

array([10, 10, 10])

Sum along columns -> "collapse" across columns to get sum within each row — 4 numbers

In [38]:
np.sum(one, axis=1)

array([ 3,  6,  9, 12])

**Careful:** Applying a function without specifying the axis may compute across the ENTIRE ndarray.

In [39]:
np.sum(one)

30

**Mnemonic trick:** Applying a function along an axis eliminates that dimension from the shape. You are left with the remaining dimensions.

In [40]:
print(one.shape)
the_mean = np.mean(one, axis=0)
print(f'Mean across axis 0: {the_mean.shape}')

print(one.shape)
the_mean = np.mean(one, axis=1)
print(f'Mean across axis 1: {the_mean.shape}')

(4, 3)
Mean across axis 0: (3,)
(4, 3)
Mean across axis 1: (4,)


In [2]:
higher = np.ones([3, 4, 5])
print(higher.shape)

(3, 4, 5)


In [3]:
result = np.mean(higher, axis=1)
result.shape

(3, 5)
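
The mnemonic carries over when collapsing more than one axis at once: passing a tuple to `axis` removes every listed dimension. A brief sketch (the name `total` is illustrative):

```python
import numpy as np

higher = np.ones([3, 4, 5])

# Collapse axes 0 and 2 at once; only axis 1 (length 4) survives
total = np.sum(higher, axis=(0, 2))
print(total.shape)   # (4,)
print(total)         # each entry sums 3*5 = 15 ones
```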

## Broadcasting

**This is the most useful numpy feature thus far! This will become your bread-and-butter!**

### Simple example: Scalars

As we saw, we can create an array of any size with any constant value WITHOUT ANY LOOPS. This is the simplest example of numpy **broadcasting** the scalar across the ndarray.

In [43]:
# Example with basic arithmetic
one

array([[1, 1, 1],
       [2, 2, 2],
       [3, 3, 3],
       [4, 4, 4]])

In [46]:
# result = 9*one
result = 9*one/2 + 10
result

array([[14.5, 14.5, 14.5],
       [19. , 19. , 19. ],
       [23.5, 23.5, 23.5],
       [28. , 28. , 28. ]])

### Example: Subtract the minimum value of each column from the original 2D array

In [4]:
np.random.seed(0)
rand_inds = np.random.randint(low=0, high=5, size=(5, 6))
rand_inds

array([[4, 0, 3, 3, 3, 1],
       [3, 2, 4, 0, 0, 4],
       [2, 1, 0, 1, 1, 0],
       [1, 4, 3, 0, 3, 0],
       [2, 3, 0, 1, 3, 3]])

In [49]:
rand_inds.shape

(5, 6)

In [5]:
# Take the min across each column and subtract it from rand_inds. Print shape of result
theMins = np.min(rand_inds, axis=0)
theMins

array([1, 0, 0, 0, 0, 0])

In [6]:
theMins.shape

(6,)

In [7]:
subtracted = rand_inds - theMins
subtracted

array([[3, 0, 3, 3, 3, 1],
       [2, 2, 4, 0, 0, 4],
       [1, 1, 0, 1, 1, 0],
       [0, 4, 3, 0, 3, 0],
       [1, 3, 0, 1, 3, 3]])

#### What's going on??

Let's look at the shapes:

In [8]:
print(f'Orig dataset: {rand_inds.shape}')
print(f'col mins: {theMins.shape}')
print(f'Subtracted result: {subtracted.shape}')

Orig dataset: (5, 6)
col mins: (6,)
Subtracted result: (5, 6)


Numpy is **broadcasting** the vector to operate on the 2d ndarray (**draw this out on board**):

- Numpy looks for axis shape compatibility among the different arrays.
- Numpy sees the column dimension (5, **6**) matches the min vector (**6,**)
- Numpy adds a **singleton dimension** (a "fake" leading dimension for rows). Now the min vector is treated with shape: **(1, 6)**.
- Numpy dynamically "grows" the singleton dimension to the needed shape (5). So now the min vector is treated like a (5, 6) array.
- Numpy element-wise subtracts the two arrays: (5, 6) array can be subtracted by a (5, 6) array.
- Numpy returns the result, which is a (5, 6) array!

The process is **very memory efficient**: the "grown" copy of the smaller array is never actually built in memory. Only the result array is allocated.
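
The singleton and "grow" steps can be spelled out by hand with `np.newaxis` and `np.broadcast_to`. This is only a sketch to make the implicit steps visible (the names `as_row`, `stretched`, and `manual` are made up); in practice numpy does this for you:

```python
import numpy as np

np.random.seed(0)
rand_inds = np.random.randint(low=0, high=5, size=(5, 6))
theMins = np.min(rand_inds, axis=0)           # shape (6,)

# Add a leading singleton axis, then stretch it to 5 rows
as_row = theMins[np.newaxis, :]               # shape (1, 6)
stretched = np.broadcast_to(as_row, (5, 6))   # read-only view, no data copied

# Same answer as letting numpy broadcast implicitly
manual = rand_inds - stretched
print(np.array_equal(manual, rand_inds - theMins))   # True
```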

#### Broadcasting only adds singleton dimensions LEFTWARD to the ndarray with smaller number of dimensions.

What if we did the same thing as above, but now wanted to **subtract the minimum value in each row**?

In [9]:
print(f'rand_inds shape: {rand_inds.shape}')
print(f'min shape: {np.min(rand_inds, axis=1).shape}')

rand_inds shape: (5, 6)
min shape: (5,)


In [10]:
rand_inds - np.min(rand_inds, axis=1)

ValueError: operands could not be broadcast together with shapes (5,6) (5,) 

**Problem:** adding singleton dimensions to the LEFT could never make the shapes compatible!! For example, numpy tries: 

    rand_inds shape: (5, 6)
    min shape: (1, 5)

but that won't work! Crash...

#### Adding singleton dimensions by ourselves

We can help numpy out and add a "new axis" ourselves to make the shapes compatible with `np.newaxis`!

In [18]:
rand_inds

array([[4, 0, 3, 3, 3, 1],
       [3, 2, 4, 0, 0, 4],
       [2, 1, 0, 1, 1, 0],
       [1, 4, 3, 0, 3, 0],
       [2, 3, 0, 1, 3, 3]])

In [14]:
theMins = np.min(rand_inds, axis=1)
theMins.shape

(5,)

In [15]:
theMins = theMins[:, np.newaxis]
theMins.shape

(5, 1)

In [19]:
rand_inds.shape

(5, 6)

In [17]:
theMins

array([[0],
       [0],
       [0],
       [0],
       [0]])

Storing the reshaped minimums in a temporary variable (`theMins`) keeps the subtraction readable:

In [16]:
rand_inds - theMins

array([[4, 0, 3, 3, 3, 1],
       [3, 2, 4, 0, 0, 4],
       [2, 1, 0, 1, 1, 0],
       [1, 4, 3, 0, 3, 0],
       [2, 3, 0, 1, 3, 3]])

#### Squeeze if you need to get rid of all singleton dimensions

"Undo" a new axis / singleton dimension

In [20]:
z = np.zeros([1, 1, 3, 1, 2, 1, 1, 1, 1])
z.shape

(1, 1, 3, 1, 2, 1, 1, 1, 1)

In [23]:
z

array([[[[[[[[[0.]]]],



           [[[[0.]]]]]],





         [[[[[[0.]]]],



           [[[[0.]]]]]],





         [[[[[[0.]]]],



           [[[[0.]]]]]]]]])

In [22]:
print(np.squeeze(z))
np.squeeze(z).shape


[[0. 0.]
 [0. 0.]
 [0. 0.]]


(3, 2)

In [26]:
theMins.shape
np.squeeze(theMins).shape

(5,)

#### Not automatically squeezing computations

As we saw above, functions like `min` over an axis eliminate that axis from the result: 

In [29]:
rand_inds.shape

(5, 6)

In [27]:
print(f'Min over axis 0 shape: {np.min(rand_inds, axis=0).shape}')
print(f'Min over axis 1 shape: {np.min(rand_inds, axis=1).shape}')

Min over axis 0 shape: (6,)
Min over axis 1 shape: (5,)


If we want `min` (and some other functions) to keep the singleton dimension, we can use the optional argument `keepdims=True`:

In [30]:
print(f'Min over axis 0 shape: {np.min(rand_inds, axis=0, keepdims=True).shape}')
print(f'Min over axis 1 shape: {np.min(rand_inds, axis=1, keepdims=True).shape}')

Min over axis 0 shape: (1, 6)
Min over axis 1 shape: (5, 1)


This can **help with broadcasting compatibility when performing an operation on an axis then modifying the original ndarray.** Then we don't need to manually add a new axis.

Example with subtracting the mean:

In [31]:
rand_inds

array([[4, 0, 3, 3, 3, 1],
       [3, 2, 4, 0, 0, 4],
       [2, 1, 0, 1, 1, 0],
       [1, 4, 3, 0, 3, 0],
       [2, 3, 0, 1, 3, 3]])

In [35]:
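rand_inds_centered = rand_inds.copy()
# Subtract column means: shape (6,) broadcasts against (5, 6) as is (result not stored)
rand_inds_centered - np.mean(rand_inds, axis=0)
# Subtract row means: keepdims=True gives shape (5, 1), which broadcasts against (5, 6)
rand_inds_centered - np.mean(rand_inds, axis=1, keepdims=True)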

array([[ 1.66666667, -2.33333333,  0.66666667,  0.66666667,  0.66666667,
        -1.33333333],
       [ 0.83333333, -0.16666667,  1.83333333, -2.16666667, -2.16666667,
         1.83333333],
       [ 1.16666667,  0.16666667, -0.83333333,  0.16666667,  0.16666667,
        -0.83333333],
       [-0.83333333,  2.16666667,  1.16666667, -1.83333333,  1.16666667,
        -1.83333333],
       [ 0.        ,  1.        , -2.        , -1.        ,  1.        ,
         1.        ]])
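
One way to sanity-check a centering step is to confirm that the means along the centered axis come out (numerically) zero. A minimal sketch, with `row_centered` and `col_centered` as illustrative names:

```python
import numpy as np

np.random.seed(0)
rand_inds = np.random.randint(low=0, high=5, size=(5, 6))

# Center each row: keepdims=True keeps the means as shape (5, 1)
row_centered = rand_inds - np.mean(rand_inds, axis=1, keepdims=True)
# Center each column: shape (6,) already lines up with the last axis
col_centered = rand_inds - np.mean(rand_inds, axis=0)

# After centering, the means along the matching axis should be (numerically) zero
print(np.allclose(row_centered.mean(axis=1), 0))   # True
print(np.allclose(col_centered.mean(axis=0), 0))   # True
```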

## Vectorization speed vs loops

Time how long it takes to sum an ndarray with a Python loop vs. a vectorized call.

In [36]:
def timeit(fun):
    '''Just a function to time the runtime of another function'''
    def timer():
        start = time.time()
        fun()
        end = time.time()
        print(f'Took {end - start:.3} secs to run.')
    return timer


@timeit
def sumLoop():
    '''Use for loop to sum a row vector'''
    longRow = np.array([i for i in range(1, 1000000)])
    theSum = 0
    for i in range(len(longRow)):
        theSum += longRow[i]


@timeit
def sumVectorized():
    '''Vectorized version of summing a row vector'''
    longRow = np.array([i for i in range(1, 1000000)])
    theSum = np.sum(longRow)

In [37]:
# Dynamic typing in python makes for loops with lots of small
# operations slow
print('sumLoop:')
sumLoop()

# Vectorization lets Numpy skip per-element type checks at runtime
# and use efficient pre-compiled functions to batch-process
# the computation over the whole array.
# NOTE: both timings also include building longRow, so the gap
# for the summing step alone is likely larger than shown.
print('sumVectorized:')
sumVectorized()

sumLoop:
Took 0.364 secs to run.
sumVectorized:
Took 0.162 secs to run.
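
A variant that isolates the summing step might look like the sketch below. It assumes the array is built once outside the timed regions and uses `time.perf_counter`; the names `loopSum`, `vecSum`, `loopTime`, and `vecTime` are illustrative, and actual timings will vary by machine:

```python
import time
import numpy as np

# Build the data once, outside the timed regions (int64 avoids overflow)
longRow = np.arange(1, 1000000, dtype=np.int64)

start = time.perf_counter()
loopSum = 0
for value in longRow:             # one Python-level addition per element
    loopSum += value
loopTime = time.perf_counter() - start

start = time.perf_counter()
vecSum = np.sum(longRow)          # a single call into compiled code
vecTime = time.perf_counter() - start

print(f'Loop:       {loopTime:.3} secs')
print(f'Vectorized: {vecTime:.3} secs')
print(loopSum == vecSum)          # both approaches agree on the answer
```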


## Reshaping

**Problem:**
- You want to preserve the number of elements in an ndarray but "regroup" the elements

Example: Have a `64 x 64` image and want to make one big `64*64` 1D vector by "gluing the rows together":

e.g: (3, 3) -> (9,)

Turn:

    [[1, 2, 3],
     [4, 5, 6],
     [7, 8, 9]]

Into:

     [1, 2, 3, 4, 5, 6, 7, 8, 9]

How can we do this without hard coding?

**Key:** Total number of elements in ndarray doesn't change.
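
One possible answer uses `reshape`, either with the array's own `size` or with `-1` so that numpy infers the length. A minimal sketch on the 3x3 example above (the names `img`, `flat`, and `restored` are illustrative):

```python
import numpy as np

img = np.array([[1, 2, 3],
                [4, 5, 6],
                [7, 8, 9]])

# img.size is the total element count, so nothing is hard coded
flat = img.reshape(img.size)
print(flat)                 # [1 2 3 4 5 6 7 8 9]

# -1 asks numpy to infer that dimension from the element count
flat_again = img.reshape(-1)
print(flat_again.shape)     # (9,)

# Going back: regroup the 9 elements into rows of 3
restored = flat.reshape(3, -1)
print(np.array_equal(restored, img))   # True
```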

## Combining multiple ndarrays

**Problem:**
- You have two ndarrays and want to concatenate them
- You have an ndarray and **want to append a column or row vector**

### Add/append a new column — "stack horizontally"

**Mnemonic**: Columns go horizontally.

Have `a`:

    [[1, 2]
     [3, 4]]
and `b`

    [[9]
     [9]]
    
want to make:

    [[1, 2, 9]
     [3, 4, 9]]
    
i.e. stack horizontally. Could be two matrices (not just a matrix and a vector).

**Caveat:** We need to make sure shapes are compatible for stacking (the number of rows must match):

- Result shape = `(2, 3)`
- We are starting with `a` shape: `(2, 2)`

The shape of `b` needs to be `(2, 1)` (why wouldn't `(1, 2)` work?)

In [39]:
a = np.array([[1, 2], [3, 4]])
a

array([[1, 2],
       [3, 4]])

In [52]:
b = np.array([[9], [9]])
b

array([[9],
       [9]])

In [53]:
b.shape

(2, 1)

In [54]:
glued = np.hstack([a, b])
glued

array([[1, 2, 9],
       [3, 4, 9]])
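
Appending a row works the same way with `np.vstack` ("stack vertically"), where the column counts must match instead. A short sketch with a hypothetical `row` array:

```python
import numpy as np

a = np.array([[1, 2], [3, 4]])
row = np.array([[9, 9]])          # shape (1, 2): one row, same column count as a

stacked = np.vstack([a, row])
print(stacked.shape)              # (3, 2)

# np.concatenate does the same job with an explicit axis
print(np.array_equal(stacked, np.concatenate([a, row], axis=0)))   # True
```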

## Switching around the axes of an ndarray and matrix multiplication in numpy

We can't matrix multiply the following ndarrays due to shape issues:

In [None]:
a = 3*np.ones([3, 4])
b = 2*np.ones([3, 4])

Need to pair up as

    (3, 4) x (4, 3)

OR

    (4, 3) x (3, 4)
    
Use the transpose to help out!
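
As a sketch of both pairings, using the `.T` attribute for the transpose and the `@` operator for matrix multiplication (the names `left` and `right` are illustrative):

```python
import numpy as np

a = 3*np.ones([3, 4])
b = 2*np.ones([3, 4])

# (3, 4) @ (4, 3) -> (3, 3)
left = a @ b.T
print(left.shape)    # (3, 3)

# (4, 3) @ (3, 4) -> (4, 4)
right = a.T @ b
print(right.shape)   # (4, 4)
```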

Note: Transposing a 1D ndarray has no effect unless you first add a singleton dimension

In [None]:
a = np.ones(10)
a.shape
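
A quick check of this behavior, assuming we add the singleton dimension with `np.newaxis` (the names `row` and `col` are illustrative):

```python
import numpy as np

a = np.ones(10)
print(a.T.shape)             # (10,)  transpose of a 1D array changes nothing

row = a[np.newaxis, :]       # add a singleton dimension: shape (1, 10)
col = row.T                  # now the transpose is meaningful
print(row.shape, col.shape)  # (1, 10) (10, 1)
```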

### Matrix vs. element-wise multiplication

- Star (*) operator means element-wise multiplication
- Like other basic math operators, can use broadcasting (e.g. can multiply a (3,) and a (5, 3) array)
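
To make the contrast concrete, the sketch below multiplies a made-up `(5, 3)` array by a `(3,)` vector both ways, using `*` for element-wise multiplication and `@` for matrix multiplication:

```python
import numpy as np

mat = np.arange(15).reshape(5, 3)   # hypothetical (5, 3) array
vec = np.array([1, 10, 100])        # shape (3,)

# Star: element-wise; vec is broadcast across all 5 rows
elementwise = mat * vec
print(elementwise.shape)    # (5, 3)

# @ is matrix multiplication: each row of mat is dotted with vec
product = mat @ vec
print(product.shape)        # (5,)
```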