In [1]:
%pip install numpy



In [None]:
import numpy as np

# NumPy Arrays vs Python Lists

Like Python lists, NumPy arrays are ordered sequences of data.  However, their properties differ slightly from those of lists:

| Property | Lists | Arrays |
| :--:     | :--:  | :--:   |
| Ordered  | ✔️    | ✔️ |
| Mutable | ✔️    | ✔️ |
| Can Mix Data Types | ✔️ |   |
| Append Data without Copying Whole Structure | ✔️  |  |
| Broadcastable |  | ✔️ |
| Fast Calculations |   | ✔️ |
| Fast Append | ✔️ |  |



### Exercises
Let's explore each property and compare lists and arrays.

#### Ordered Index

**Examples**

Index the third element of these data collections:

List:

In [None]:
x = [10, 20, 30, 40, 50]
x[2]

Array:

In [None]:
x = np.array([10, 20, 30, 40, 50])
x[2]

30

#### Ordered Slicing

Slice out the second through fourth elements of these data collections:

**Examples**

List:

In [None]:
x = [10, 20, 30, 40, 50]
x[1:4]

[20, 30, 40]

Array:

In [None]:
x = np.array([10, 20, 30, 40, 50])
x[1:4]

array([20, 30, 40])

#### "Mutate" a Value inside a Collection

Just as values inside a collection can be retrieved, they can also be re-assigned.  For example, to change the third element of these data collections to the value "A":

```python
data[index] = value
```

List:

In [None]:
x = ["A", "C", "G", "G", "C", "T"]
x[2] = "A"
x

['A', 'C', 'A', 'G', 'C', 'T']

Array:

In [None]:
x = np.array(["A", "C", "G", "G", "C", "T"])
x[2]="A"
x

array(['A', 'C', 'A', 'G', 'C', 'T'], dtype='<U1')

#### Mixing Data Types

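Change the third element of these data collections to a new value. A list accepts a value of any type, while an array keeps its existing dtype (here, fixed-width strings):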
```python
data[index] = value
```

List:

In [None]:
x = ["A", "C", "G", "G", "C", "T"]
x[2]="AT"
x

['A', 'C', 'AT', 'G', 'C', 'T']

Array:

In [None]:
x = np.array(["ATG", "CGA", "GCT", "GTT", "CTT", "TGA"])
x[2]="GT"
x

array(['ATG', 'CGA', 'GT', 'GTT', 'CTT', 'TGA'], dtype='<U3')
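
The table at the top says that lists can mix data types while arrays cannot. Here is a minimal sketch of what that looks like in practice. The variable names (`mixed`, `bases`, `counts`) are made up for illustration and are not part of the exercises:

```python
import numpy as np

# A list stores references to arbitrary objects, so types can be mixed freely.
mixed = ["A", "C", "G"]
mixed[2] = 40
print(mixed)  # ['A', 'C', 40]

# An array has a single dtype. '<U1' means "unicode string of length 1",
# so a longer string is silently truncated to fit.
bases = np.array(["A", "C", "G"])
bases[2] = "AT"
print(bases)  # ['A' 'C' 'A']

# An integer array cannot hold a non-numeric string at all.
counts = np.array([1, 2, 3])
try:
    counts[2] = "A"
except ValueError as err:
    print("ValueError:", err)
```

The silent truncation in the second case is easy to miss, so it can be worth checking `dtype` before assigning strings into an existing array.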

#### Append Values

Append a new value to the end of these data:

List:
```python
data.append(value)
```

In [None]:
x = ["A", "C", "G", "G", "C", "T"]
x.append("A")
x

['A', 'C', 'G', 'G', 'C', 'T', 'A']

In [None]:
x = ["A", "C", "G", "G", "C", "T"]
x.insert(3, "A")
x

['A', 'C', 'G', 'A', 'G', 'C', 'T']

Array:
```python
np.append(data, value)
```

In [None]:
x = np.array(["A", "C", "G", "G", "C", "T"])
x2 = np.append(x, "A")  # returns a new array; x itself is not modified
x  # still the original six elements

array(['A', 'C', 'G', 'G', 'C', 'T'], dtype='<U1')
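
Unlike `list.append`, `np.append` does not change the array it is given. It builds a new, longer array and returns it. The short sketch below (with illustrative values) shows both arrays side by side and uses `np.shares_memory` to confirm they are separate copies:

```python
import numpy as np

x = np.array(["A", "C", "G"])
x2 = np.append(x, "T")  # builds and returns a new, longer array

print(x)   # ['A' 'C' 'G']  (the original is unchanged)
print(x2)  # ['A' 'C' 'G' 'T']
print(np.shares_memory(x, x2))  # False: x2 is a separate copy
```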

In [None]:
import numpy as np

### Broadcasting

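Run the following code, which tries to multiply every value in the collection by 10.  Note that `data * 10` on a list repeats the list ten times instead of multiplying each element, so the list version needs a list comprehension.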

List:

In [None]:
data = [1, 2, 3, 4, 5]
data

[1, 2, 3, 4, 5]

In [None]:
data * 10;  # repeats the list 10 times; the trailing semicolon suppresses the output

In [None]:
data10 = [x * 10 for x in data]
data10

[10, 20, 30, 40, 50]

Array:

In [None]:
data = np.array([1, 2, 3, 4, 5])
data

array([1, 2, 3, 4, 5])

In [None]:
data10 = data * 10
data10

array([10, 20, 30, 40, 50])
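
To make the contrast concrete, here is a small sketch with illustrative values showing how the same operators behave on a list and on an array:

```python
import numpy as np

values = [1, 2, 3]
print(values * 2)  # [1, 2, 3, 1, 2, 3]  (repetition, not multiplication)

arr = np.array(values)
print(arr * 2)     # [2 4 6]
print(arr + 100)   # [101 102 103]
print(arr * arr)   # [1 4 9]  (element-wise on same-length arrays)
```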

### Fast Calculations

Run the following code, which multiplies every value in the collection by 10.  This time, take a look at how long it takes to run.  Which is faster?

Note: the `%%timeit` magic command runs the cell many times and reports the average amount of time each run of the cell's code took.

List:

In [None]:
1e6 

1000000.0

In [None]:
data = list(range(0, 1_000_000))
# data

In [None]:
%%timeit
data10 = [x * 10 for x in data]

97 ms ± 522 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)


Array:

In [None]:
data = np.arange(0, 1_000_000)
data

array([     0,      1,      2, ..., 999997, 999998, 999999])

In [None]:
%%timeit
data10 = data * 10

1.99 ms ± 62.3 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
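
`%%timeit` only works inside IPython or Jupyter. As a rough sketch, the same comparison can be made in a plain Python script with the standard-library `timeit` module. The size `n` and the repeat count below are arbitrary choices to keep the run short, and the timings will vary by machine:

```python
import timeit

import numpy as np

n = 100_000  # smaller than the notebook's 1_000_000 so this runs quickly
data_list = list(range(n))
data_array = np.arange(n)

runs = 10
# timeit.timeit returns the total time for `number` calls, in seconds.
list_time = timeit.timeit(lambda: [x * 10 for x in data_list], number=runs)
array_time = timeit.timeit(lambda: data_array * 10, number=runs)

print(f"list comprehension: {list_time / runs * 1e3:.3f} ms per run")
print(f"array multiply:     {array_time / runs * 1e3:.3f} ms per run")
```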


### Fast Append

Run the following code, which appends a new value to a collection ten thousand times.  Which is faster?


List:

In [None]:
%%timeit
data = []  # an empty list
for _ in range(10000):  # repeat N times
    data.append("A")


779 µs ± 27.4 µs per loop (mean ± std. dev. of 7 runs, 1000 loops each)


Array:

In [None]:
%%timeit
data = np.array([], dtype=str)  # an empty array
for _ in range(10000):  # repeat N times
    data = np.append(data, "A")

81.3 ms ± 734 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)
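
Because every `np.append` call copies the whole array, a common workaround is to avoid appending to arrays in a loop. Below is a sketch of two possible approaches, not the only ones: collect values in a list and convert once, or create the array at its final size when that size is known in advance. The names `items`, `from_list`, and `prefilled` are illustrative:

```python
import numpy as np

n = 10_000

# Option 1: grow a list (cheap appends), then convert to an array once.
items = []
for _ in range(n):
    items.append("A")
from_list = np.array(items)

# Option 2: when the final size is known, create the array up front.
prefilled = np.full(n, "A")

print(from_list.shape)                       # (10000,)
print(np.array_equal(from_list, prefilled))  # True
```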


### Summary
If you want maximum flexibility, such as mixing data types or appending values one at a time, lists are perfect!  But if your data is complete and well-organized, arrays are quite handy!  They are simple to work with and can crunch a lot of numbers in a short time!