# ![](https://ga-dash.s3.amazonaws.com/production/assets/logo-9f88ae6c9c3871690e33280fcf557f33.png) Arrays and Array Operations

Week 1, Lecture 4.2

### LEARNING OBJECTIVES
*After this lesson, you will be able to:*
- Explain why using arrays can be preferable to Python container types
- Explain the difference between a vector, a matrix and an array
- Find the dimensionality of an array, reshape it, and transform it
- Find an array's data type and describe how numpy handles mixed data types
- Add and subtract arrays
- Explain how the dot product is obtained
- Use numpy to obtain the dot product

## Recap

This week we've discussed some of the basic Python data types including "container types".


- **What are some of those types we discussed?**


- **What were some of the benefits of using container types?**

In [8]:
my_list = ['a', 'b', 'c', 1, 2, 3, ('monkey', 'cat')]

for item in my_list:
    print(item)

a
b
c
1
2
3
('monkey', 'cat')


## The Downside of data type flexibility

While being able to hold any data type is useful, it also means Python **must evaluate the type of each item as it is processed**.


## Enter the numpy array


Numpy arrays are of one type and one type only. There is no need to evaluate each element at runtime.

[From the numpy documentation](http://docs.scipy.org/doc/numpy/user/whatisnumpy.html):
    
>At the core of the NumPy package, is the ndarray object. This encapsulates n-dimensional arrays of homogeneous data types, with many operations being performed in compiled code for performance. There are several important differences between NumPy arrays and the standard Python sequences:
- NumPy arrays have a **fixed size at creation**, unlike Python lists (which can grow dynamically). Changing the size of an ndarray will create a new array and delete the original.
- The elements in a NumPy array are **all required to be of the same data type**, and thus will be the same size in memory. The exception: one can have arrays of (Python, including NumPy) objects, thereby allowing for arrays of different sized elements.
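
The two points above can be illustrated with a minimal sketch. The names `mixed_list`, `as_array`, and `bigger` are made up for this example; `np.append` is used here only to show that "growing" an array produces a new one.

```python
import numpy as np

mixed_list = [1, 2.5, 3]          # a Python list can hold ints and floats side by side
as_array = np.array(mixed_list)   # numpy converts every element to one common dtype

print(as_array.dtype)             # float64

# "Growing" an array returns a new array; the original keeps its size.
bigger = np.append(as_array, 4.0)
print(as_array.shape)             # (3,)
print(bigger.shape)               # (4,)
```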

## Exercise:

1. Create a Python list containing 10 or more elements composed of both ints and floats
2. Save that list using the variable name 'mixed_type'
3. Now create a numpy array using ```np.array()``` with that same list and save it as 'same_type'
4. Run the following cells to compare the time differences

In [1]:
import numpy as np

## enter work below:

In [142]:
# run this

%%timeit

mixed_type * 10000000

1 loop, best of 3: 732 ms per loop


In [143]:
# then this

%%timeit

same_type * 10000000

The slowest run took 16.60 times longer than the fastest. This could mean that an intermediate result is being cached.
1000000 loops, best of 3: 1.13 µs per loop


## Time difference


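732 milliseconds is equal to 732,000 microseconds.

That means the list operation took roughly **650,000 times longer** than the array operation. Note, however, that the two cells do not do the same work: `mixed_type * 10000000` repeats the list 10,000,000 times (building a very large new list), while `same_type * 10000000` multiplies each element by 10,000,000. The comparison therefore overstates the speedup that comes from using a single data type.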
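
For a like-for-like timing, both sides should multiply element by element (multiplying a list by an integer repeats the list instead). The sketch below uses the standard-library `timeit` module rather than the `%%timeit` magic so it runs outside a notebook. The sample data and repetition count are assumptions chosen for illustration, and the timings will vary by machine.

```python
import timeit
import numpy as np

# Hypothetical sample data: 100,000 values alternating between ints and floats.
mixed_type = [float(i) if i % 2 else i for i in range(100000)]
same_type = np.array(mixed_type)

# Multiply every element by the same factor in both cases.
list_result = [x * 10000000 for x in mixed_type]
array_result = same_type * 10000000

# Time 20 repetitions of each operation.
list_time = timeit.timeit(lambda: [x * 10000000 for x in mixed_type], number=20)
array_time = timeit.timeit(lambda: same_type * 10000000, number=20)
print("list: %.4f s, array: %.4f s" % (list_time, array_time))
```

On inputs of this size the array version is typically much faster, though the gap is far smaller than the figure above suggests, and for very short inputs it can shrink or even reverse.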

## Terminology & Dimensionality

![](http://i.imgur.com/3gdNc12.jpg)

## Create an array

In [164]:
my_vector = np.array([0, 1, 2, 3, 4, 5])
my_matrix = np.array([[0, 1, 2, 3, 4, 5],[5, 4, 3, 2, 1, 0]])

In [165]:
my_vector

array([0, 1, 2, 3, 4, 5])

In [166]:
my_matrix

array([[0, 1, 2, 3, 4, 5],
       [5, 4, 3, 2, 1, 0]])

## Dimensions and Shapes

### Vector

In [167]:
my_vector

array([0, 1, 2, 3, 4, 5])

In [168]:
my_vector.ndim

1

In [169]:
my_vector.shape

(6,)

### Matrix

In [170]:
my_matrix

array([[0, 1, 2, 3, 4, 5],
       [5, 4, 3, 2, 1, 0]])

In [171]:
my_matrix.ndim

2

In [172]:
my_matrix.shape

(2, 6)
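
The same attributes work beyond two dimensions, which is where the general term "array" applies. A minimal sketch, using a hypothetical array `my_array` filled with zeros by `np.zeros`:

```python
import numpy as np

# A hypothetical 3-D array: 2 blocks, each with 3 rows and 4 columns.
my_array = np.zeros((2, 3, 4))

print(my_array.ndim)    # 3
print(my_array.shape)   # (2, 3, 4)
print(my_array.size)    # 24 (total number of elements)
```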

## Reshaping

In [173]:
my_vector

array([0, 1, 2, 3, 4, 5])

In [174]:
my_vector.shape

(6,)

In [179]:
reshaped_array = my_vector.reshape(2,3)

## Check: 

- What dimensionality will our reshaped array have?
- What do we call an array with this dimensionality?
- What will the shape be?

2

## Reshaping the matrix

In [187]:
my_matrix

array([[0, 1, 2, 3, 4, 5],
       [5, 4, 3, 2, 1, 0]])

In [190]:
my_matrix.reshape(1, 12)

array([[0, 1, 2, 3, 4, 5, 5, 4, 3, 2, 1, 0]])

## And another way...

In [199]:
my_matrix

array([[0, 1, 2, 3, 4, 5],
       [5, 4, 3, 2, 1, 0]])

In [197]:
my_matrix.reshape(12, 1)

array([[0],
       [1],
       [2],
       [3],
       [4],
       [5],
       [5],
       [4],
       [3],
       [2],
       [1],
       [0]])

## And another...

In [200]:
my_matrix

array([[0, 1, 2, 3, 4, 5],
       [5, 4, 3, 2, 1, 0]])

In [198]:
my_matrix.reshape(3, 4)

array([[0, 1, 2, 3],
       [4, 5, 5, 4],
       [3, 2, 1, 0]])
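
All of the reshapes above work because the new shape holds exactly 12 elements. As a short sketch of what happens otherwise, the example below also shows `-1`, which asks numpy to infer one dimension from the total size. The names `inferred` and `reshape_failed` are introduced only for this example.

```python
import numpy as np

my_matrix = np.array([[0, 1, 2, 3, 4, 5], [5, 4, 3, 2, 1, 0]])

# -1 asks numpy to work out that dimension from the total size (12 / 4 = 3).
inferred = my_matrix.reshape(4, -1)
print(inferred.shape)   # (4, 3)

# 12 elements cannot fill a 5 x 2 grid, so numpy raises a ValueError.
try:
    my_matrix.reshape(5, 2)
    reshape_failed = False
except ValueError as err:
    reshape_failed = True
    print(err)
```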

## Transposition of arrays

In [207]:
my_matrix

array([[0, 1, 2, 3, 4, 5],
       [5, 4, 3, 2, 1, 0]])

In [210]:
my_matrix.T

array([[0, 5],
       [1, 4],
       [2, 3],
       [3, 2],
       [4, 1],
       [5, 0]])

## Another transposition

In [214]:
my_matrix.reshape(3,4)

array([[0, 1, 2, 3],
       [4, 5, 5, 4],
       [3, 2, 1, 0]])

In [213]:
my_matrix.reshape(3,4).T

array([[0, 4, 3],
       [1, 5, 2],
       [2, 5, 1],
       [3, 4, 0]])
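
Transposing is not the same as reshaping to the swapped shape. The sketch below, which assumes the same `my_matrix` as above, shows that `.T` moves the element at row i, column j to row j, column i, whereas `.reshape(6, 2)` keeps the elements in their original reading order.

```python
import numpy as np

my_matrix = np.array([[0, 1, 2, 3, 4, 5], [5, 4, 3, 2, 1, 0]])
transposed = my_matrix.T

print(my_matrix.shape)     # (2, 6)
print(transposed.shape)    # (6, 2)

# The element at row 1, column 4 ends up at row 4, column 1.
print(my_matrix[1, 4])     # 1
print(transposed[4, 1])    # 1

# Same shape as a reshape to (6, 2), but the elements are arranged differently.
same_as_reshape = np.array_equal(transposed, my_matrix.reshape(6, 2))
print(same_as_reshape)     # False
```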

## Exercise

- Create a 16 element int vector in numpy
- Use .ndim to check the dimensionality
- Use .shape to check the shape
- Use .reshape to change the number of rows and columns
- Use .T to transpose the data 
- Notice whether .reshape and .T modify the original array in place or return a new array object (i.e., must you save the result to a new variable to retain what you see in the notebook output)

## Data types of an array

In [218]:
my_vector

array([0, 1, 2, 3, 4, 5])

In [219]:
my_vector.dtype

dtype('int64')

### What if we add a float?

In [229]:
with_float = np.array([1, 2, 3, 4, 5.0])

In [230]:
with_float.dtype

dtype('float64')

### What if we add a string?

In [231]:
with_str = np.array([1, 2, 3, 4, '5'])

In [232]:
with_str.dtype

dtype('S21')

## What if we go nuts?

In [242]:
nuts = np.array(['1', 2, 3.0, 'four', 10/2])

In [243]:
nuts.dtype

dtype('S4')
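
When numpy picks a string type like this, the data type can be set explicitly instead. A minimal sketch using `astype` and the `dtype` argument of `np.array`; the names `as_int` and `explicit` are made up for this example. The exact string dtype reported depends on the Python and numpy versions (byte strings such as `S21` on Python 2, unicode strings on Python 3).

```python
import numpy as np

with_str = np.array([1, 2, 3, 4, '5'])
print(with_str.dtype.kind)    # 'U' on Python 3, 'S' on Python 2: a string type

# Convert the strings back to integers so arithmetic behaves as expected.
as_int = with_str.astype(int)
print(as_int.sum())           # 15

# Or request a dtype up front when creating the array.
explicit = np.array([1, 2, 3], dtype=np.float64)
print(explicit.dtype)         # float64
```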

## Check: 

- Why is it important to check the data type?
- Under what conditions might this be a problem?

## Let's do some math


We're now going to look at how to add and subtract arrays, as well as how to find the dot product of arrays.

### Adding arrays

Arrays are added ith element to ith element

In [246]:
a = np.array([0, 1, 2, 3, 4, 5])
b = np.array([6, 5, 4, 3, 2, 1])

In [247]:
np.add(a, b)

array([6, 6, 6, 6, 6, 6])

### Subtract arrays

Arrays are subtracted ith element from ith element

In [250]:
a = np.array([0, 1, 2, 3, 4, 5])
b = np.array([6, 5, 4, 3, 2, 1])

In [253]:
np.subtract(a, b)

array([-6, -4, -2,  0,  2,  4])

## Check:

- Will np.subtract(a, b) be the same as np.subtract(b, a)?

In [262]:
a = np.array([0, 1, 2, 3, 4, 5])
b = np.array([6, 5, 4, 3, 2, 1])

In [263]:
np.subtract(a, b)

array([-6, -4, -2,  0,  2,  4])

In [264]:
np.subtract(b, a)

array([ 6,  4,  2,  0, -2, -4])
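
The `+` and `-` operators on arrays perform the same element-wise operations as `np.add` and `np.subtract`. A short sketch with the same `a` and `b`:

```python
import numpy as np

a = np.array([0, 1, 2, 3, 4, 5])
b = np.array([6, 5, 4, 3, 2, 1])

# Operators and functions give identical results.
print(np.array_equal(a + b, np.add(a, b)))        # True
print(np.array_equal(a - b, np.subtract(a, b)))   # True

# Swapping the order of subtraction flips the sign of every element.
print(np.array_equal(a - b, -(b - a)))            # True
```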

## Multiply arrays (dot product)

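The dot product is also known as the scalar product. This is an appropriate name since the result is a scalar, a single value rather than a vector. It is obtained by multiplying the two arrays ith element by ith element and then summing the products.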

In [270]:
d = np.array([1, 2, 3])
e = np.array([3, 2, 1])

In [271]:
np.dot(d, e)

10

In [272]:
np.dot(e, d)

10
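
To see where the 10 comes from, the dot product can be broken into its two steps: multiply element by element, then sum. A minimal sketch with the same `d` and `e` (the names `products` and `by_hand` are introduced only for this example):

```python
import numpy as np

d = np.array([1, 2, 3])
e = np.array([3, 2, 1])

# Step 1: element-wise products -> 1*3, 2*2, 3*1
products = d * e
print(products.tolist())   # [3, 4, 3]

# Step 2: sum the products -> 3 + 4 + 3
by_hand = products.sum()
print(by_hand)             # 10

print(np.dot(d, e))        # 10
```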

### Check:

- What about different sized vectors - will those work?

In [265]:
c = np.array([1, 2, 3, 4, 5, 9])
d = np.array([5, 4, 3, 2, 1])

### Add

In [266]:
np.add(c, d)

ValueError: operands could not be broadcast together with shapes (6,) (5,) 

### Subtract

In [267]:
np.subtract(c, d)

ValueError: operands could not be broadcast together with shapes (6,) (5,) 

### Dot product

In [268]:
np.dot(c, d)

ValueError: shapes (6,) and (5,) not aligned: 6 (dim 0) != 5 (dim 0)
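
The word "broadcast" in the first two errors refers to numpy's rules for combining arrays of different shapes. As a small sketch of those rules, a single value can be combined with an array of any length, while two arrays of lengths 6 and 5 cannot. The names `shifted` and `mismatch_raised` are made up for this example.

```python
import numpy as np

c = np.array([1, 2, 3, 4, 5, 9])
d = np.array([5, 4, 3, 2, 1])

# A scalar is broadcast across every element of the array.
shifted = np.add(c, 10)
print(shifted.tolist())    # [11, 12, 13, 14, 15, 19]

# Lengths 6 and 5 are incompatible, so numpy raises a ValueError.
try:
    np.add(c, d)
    mismatch_raised = False
except ValueError:
    mismatch_raised = True
print(mismatch_raised)     # True
```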

## Exercise

**Using the following lists, on your desk, calculate the dot**<br>
**product by hand. Then do the same thing without numpy using iteration.**<br>
**Then finally, use numpy. Check that all are equal.**<br><br>
**Bonus: Do a single line list comprehension to calculate it**<br><br>
list_one = [1,3,5,7,9]<br>
list_two = [2,4,6,8,10]

## Solutions
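
One possible solution sketch for the dot product exercise is shown here. It is not the only valid approach, and the variable names `total`, `comp_total`, and `numpy_total` are made up for this example.

```python
import numpy as np

list_one = [1, 3, 5, 7, 9]
list_two = [2, 4, 6, 8, 10]

# By hand: 1*2 + 3*4 + 5*6 + 7*8 + 9*10 = 2 + 12 + 30 + 56 + 90 = 190

# Iteration without numpy
total = 0
for x, y in zip(list_one, list_two):
    total += x * y

# Bonus: single-line list comprehension
comp_total = sum([x * y for x, y in zip(list_one, list_two)])

# numpy
numpy_total = np.dot(list_one, list_two)

print(total)         # 190
print(comp_total)    # 190
print(numpy_total)   # 190
```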

## Conclusion

In this lecture, we've learned:

- What arrays are and why they are beneficial
- How to create them, check their type, dimensionality, and shape
- How to reshape them
- How to add, subtract, and multiply them (dot product)

There is a lab that follows in the repo, to let you explore more on this and how to work with numpy.