In [None]:
!pip install numpy

In [1]:
import numpy as np

## Numpy Arrays

#### Some Array Creation Functions (Review)
| Function | Purpose |  Example |
| :-----------: | :-------------: | :-------------: |
| **np.array()**  | Turns a list into an array |   `np.array([2, 5, 3])` |
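| **np.arange()**                  | All the integers from a start value up to (but not including) a stop value | `np.arange(2, 7)` |
| **np.arange()**                  | All the numbers from a start value up to (but not including) a stop value, with a step size | `np.arange(2, 7, 0.5)` |
| **np.linspace()**               | Makes an array of a given length, evenly spaced between two values (both included) |  `np.linspace(2, 3, 10)` |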
| **np.zeros()**                    | Makes an array of all zeros | `np.zeros(5)` |
| **np.ones()**                     | Makes an array of all ones | `np.ones(3)` |
| **np.random.random()** | Makes an array of random numbers | `np.random.random(100)` |
| **np.random.randn()**     | Makes an array of normally-distributed random numbers | `np.random.randn(100)` |
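
As a quick check on the table above, the sketch below builds one array with each function. The variable names are illustrative only, and the random arrays are not printed because their values change on every run.

```python
import numpy as np

from_list = np.array([2, 5, 3])         # array built from a Python list
integers = np.arange(2, 7)              # 2, 3, 4, 5, 6 (the stop value 7 is excluded)
stepped = np.arange(2, 7, 0.5)          # 2.0, 2.5, ..., 6.5
evenly_spaced = np.linspace(2, 3, 10)   # 10 values, both endpoints included
zeros = np.zeros(5)
ones = np.ones(3)
uniform = np.random.random(100)         # uniform random values in [0, 1)
normal = np.random.randn(100)           # standard normal random values

print(integers)                             # [2 3 4 5 6]
print(len(stepped))                         # 10
print(evenly_spaced[0], evenly_spaced[-1])  # 2.0 3.0
```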


#### Important Array Attributes

| Attribute     | Description                                          | Example Output   |
| :-------:     | :---------:                                          | :-----:          |
| **`x.dtype`**  | numpy data type (64-bit int, 32-bit float, etc)      | `dtype('int64')` | 
| **`x.shape`**  | number of elements along each dimension of the array | `(17,)`          |
| **`x.size`**   | total number of values in the array                  | `17`             | 
| **`x.nbytes`** | total amount of memory the array takes up, in bytes  | `136`            |
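
The example outputs in the table can be reproduced with a 17-element integer array. This is a minimal sketch; the dtype is set explicitly to `np.int64` here (an assumption made for the example) because the default integer type can differ between platforms.

```python
import numpy as np

x = np.arange(17, dtype=np.int64)   # 17 integers, 8 bytes each

print(x.dtype)    # int64
print(x.shape)    # (17,)
print(x.size)     # 17
print(x.nbytes)   # 136, which is 17 values * 8 bytes per value
```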


#### Important Array Syntax

| Syntax              |  Name 
| :---                | :--  
| `x[0]`              | **Index**
| `x[1:3]`            | **Slice**
| `x[0] = 5`          | **Mutate**
| `x[0:4] = 5`        | **Mutate**
| `y = x * 5`         | **Transform**
| `x[3:] = x[3:] * 5` | **Transform In-Place**
| `x[3:] *= 5`        | **Transform In-Place** (shorter version)
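
Each row of the table can be tried in sequence on one small array. The sketch below is only an illustration; the names `first`, `middle`, and `y` are made up for the example.

```python
import numpy as np

x = np.arange(10)          # [0 1 2 3 4 5 6 7 8 9]

first = x[0]               # index: 0
middle = x[1:3].copy()     # slice: [1 2] (copied so later changes to x do not affect it)

x[0] = 5                   # mutate one value
x[0:4] = 5                 # mutate a slice: x is now [5 5 5 5 4 5 6 7 8 9]

y = x * 5                  # transform: makes a new array, x is unchanged
x[3:] *= 5                 # transform in-place: only x[3:] is multiplied

print(y)   # [25 25 25 25 20 25 30 35 40 45]
print(x)   # [ 5  5  5 25 20 25 30 35 40 45]
```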



**Exercises**

Make the arrays requested in any way you'd like, then print the property requested from that created array.

**All the integers from 3 to 10**

In [2]:
x = np.arange(3, 11)
x

array([ 3,  4,  5,  6,  7,  8,  9, 10])

How many total values are in this array?

In [3]:
x.size

8

**All the numbers from 2 to 6, spaced 0.3 apart.**

In [8]:
x = np.arange(2, 6, 0.3)
x

array([2. , 2.3, 2.6, 2.9, 3.2, 3.5, 3.8, 4.1, 4.4, 4.7, 5. , 5.3, 5.6,
       5.9])

What is the shape of this array?

In [6]:
x.shape

(14,)

**9 numbers from 3 to 6, spaced equally apart.**

In [7]:
x = np.linspace(3, 6, 9)
x

array([3.   , 3.375, 3.75 , 4.125, 4.5  , 4.875, 5.25 , 5.625, 6.   ])

How many bytes does this array take up?

In [9]:
x.nbytes

112

**The square of the values 40, 20, 3, 5, and 10.**

In [10]:
x = np.array([40, 20, 3, 5, 10])
x ** 2

array([1600,  400,    9,   25,  100])

What data type does this array hold?

In [11]:
x.dtype

dtype('int64')

**The square root of the values 32, 4, 3, and 6**

In [13]:
x = np.array([32, 4, 3, 6])
np.sqrt(x)

array([5.65685425, 2.        , 1.73205081, 2.44948974])

What data type does this array hold?

In [14]:
x.dtype

dtype('int64')

**Ten random integers between -3 and 3**

In [15]:
x = np.random.randint(-3, 4, 10)  # the upper bound is exclusive, so 4 allows values up to 3
x

array([ 3,  0, -1, -3,  1, -3, -2, -2,  1, -2])

What is this array's shape?

In [16]:
x.shape

(10,)

**Without hard-coding it, create an array with these values:
`[  0, 100, 200, 300,   4,   5,   6,   7,   8,   9]`**

In [20]:
x = np.arange(10)
x[1:4] = x[1:4] * 100
x

array([  0, 100, 200, 300,   4,   5,   6,   7,   8,   9])

How many total values are in this array?

**Without hard-coding it, create an array with these values: `[ 0, 99,  2, 99,  4, 99, 99,  7,  8,  9]`**

What data type does this array have?

**Create an array of the following five three-letter animal names: dog, cat, pig, rat, ant**

What data type does this array hold?

How many bytes does this array take up?  Why?

**Create an array of the following five animal names: dog, cat, pig, rat, anteater**

What data type does this array hold?

How many bytes does this array take up?  Why?

### Filtering Data With Logical Indexing

Sometimes you want to remove certain values from your dataset.  In NumPy, this can be done with **Logical Indexing**; in plain Python, the equivalent is a loop with an **If Statement**.
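
To make the comparison concrete, here is a small sketch that keeps only the positive numbers both ways. The data and variable names are hypothetical, chosen just for this illustration.

```python
import numpy as np

values = [4, -1, 7, -3, 2]   # hypothetical example data

# Plain Python: test each element with an if statement
kept_list = []
for v in values:
    if v > 0:
        kept_list.append(v)

# NumPy: index the array with an array of True/False values
arr = np.array(values)
kept_array = arr[arr > 0]

print(kept_list)    # [4, 7, 2]
print(kept_array)   # [4 7 2]
```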

#### Step 1: Create a Logical Numpy Array

A single logical expression compares every value in an array at once and returns an array of True/False values.  This is broadcasting, the same mechanism used by the math operations we saw earlier:

```python
>>> data = np.array([1, 2, 3, 4, 5])
>>> data < 3
array([ True,  True, False, False, False])
```
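
The other comparison operators work the same way. The sketch below uses a made-up array named `temps` that includes a missing value, to show that any comparison with `nan` is False except `!=`, and that `np.isnan()` is the way to detect it.

```python
import numpy as np

temps = np.array([1.5, np.nan, 3.0, 7.0])   # hypothetical data with one missing value

greater = temps > 2         # [False False  True  True]
equal = temps == 7          # [False False False  True]
not_equal = temps != 7      # [ True  True  True False]  (nan != 7 is True)
missing = np.isnan(temps)   # [False  True False False]

print(greater)
print(missing)
```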

**Exercises**: Make arrays of True/False values that answer the following questions about the dataset below for each element.

In [None]:
import numpy as np

data = np.array([3, 7, 10, 2, 1, 7, np.nan, 20, 5])
data

array([ 3.,  7., 10.,  2.,  1.,  7., nan, 20.,  5.])

1. Which values are greater than zero?

2. Which values are equal to 7?

array([False,  True, False, False, False,  True, False, False, False])

3. Which values are greater than or equal to 7?

array([False,  True,  True, False, False,  True, False,  True, False])

4. Which values are not equal to 7?

array([ True, False,  True,  True,  True, False,  True,  True,  True])

#### Step 2: Filter with Logical Indexing

If an array of True/False values is used to *index* another array, and both arrays are the same size, it will return all of the values that correspond to the True values of the indexing array:

```python
>>> data = np.array([1, 2, 3, 4, 5])
>>> is_big = data > 3
>>> is_big
array([False, False, False,  True,  True])

>>> data[is_big]
array([4, 5])
```
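
Because the True/False array only has to match in size, a mask made from one array can also filter a second, parallel array. The sketch below uses hypothetical `ages` and `names` arrays, and shows that a mask of the wrong length raises an `IndexError`.

```python
import numpy as np

ages = np.array([12, 35, 8, 41])                 # hypothetical data
names = np.array(["Ann", "Bob", "Cy", "Dee"])    # same length as ages

is_adult = ages >= 18          # [False  True False  True]
print(ages[is_adult])          # [35 41]
print(names[is_adult])         # ['Bob' 'Dee']

# A mask with a different length than the array cannot be used
mismatch_raised = False
try:
    ages[np.array([True, False])]
except IndexError:
    mismatch_raised = True
print(mismatch_raised)         # True
```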


**Exercises**:  Using the data below, extract only the values that correspond to each question.

In [None]:
data = np.array([3, 1, -6, 8, 20, 2, np.nan, 7, 1, np.nan, 9, 7, 7, -7])
data

array([ 3.,  1., -6.,  8., 20.,  2., nan,  7.,  1., nan,  9.,  7.,  7.,
       -7.])

1. The values that are less than 0

In [None]:
data[data < 0]

2. The values that are greater than 3

4. The values not equal to 7

array([ 3.,  1., -6.,  8., 20.,  2., nan,  1., nan,  9., -7.])

5. The values equal to 20

6. The values that are not missing  (Tip: `np.isnan(x) == False`)

#### Alternatively, Combine Step 1 and Step 2 into a single line

Both steps can be done in a single expression.  Sometimes this can make things clearer!


```python
>>> data = np.array([1, 2, 3, 4, 5])
>>> data[data > 3]
array([4, 5])
```
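
The single-line form also works with function calls such as `np.isnan()`. In the sketch below, `readings` is a made-up array; the `~` operator flips every True/False value, so `~np.isnan(x)` gives the same result as `np.isnan(x) == False`.

```python
import numpy as np

readings = np.array([2.0, np.nan, 5.0, np.nan, 9.0])   # hypothetical data

large = readings[readings > 3]             # nan > 3 is False, so nan is dropped
present = readings[~np.isnan(readings)]    # keep only the non-missing values

print(large)     # [5. 9.]
print(present)   # [2. 5. 9.]
```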



**Exercises**: Do the same as in the previous section, this time in a single line.

In [None]:
data = np.array([3, 1, -6, np.nan, 8, 20, 2, 7, np.nan, 1, 9, 7, 7, -7])
data

array([ 3.,  1., -6., nan,  8., 20.,  2.,  7., nan,  1.,  9.,  7.,  7.,
       -7.])

The values that are less than 0

The values that are greater than 3

The values equal to 7  (will be an array of sevens)

The values not equal to 7

The values equal to 20

The values not missing  (Tip: `np.isnan(x)`)

### Statistics on Filtered Data

Using the following dataset, have Python calculate the answers to the questions below:

In [None]:
data = np.array([3, 1, -6, 8, 20, 2, 7, 1, 9, 7, 7, -7])
data

array([ 3,  1, -6,  8, 20,  2,  7,  1,  9,  7,  7, -7])

How many values are greater than 4 in this dataset?  
Useful function: `len([2, 3, 4])`

How many values are equal to 7 in this dataset?

What is the mean value of the positive numbers in this dataset?

What is the mean value of the negative numbers in this dataset?

What is the median value of the values in this dataset that are greater than 5?

How many missing values are in this dataset?

What proportion of the values in this dataset are positive?

What proportion of the values in this dataset are less than or equal to 8?
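
As a hint for questions like these, here is a minimal sketch on a different, made-up array called `scores`. It relies on the fact that True counts as 1 and False as 0, so `np.sum()` of a True/False array gives a count and `np.mean()` gives a proportion.

```python
import numpy as np

scores = np.array([4, -2, 9, 0, 7, -5, 7, 3])   # hypothetical data

count_big = len(scores[scores > 3])             # filter, then count: 4
count_big_alt = np.sum(scores > 3)              # same count, summing the True values

mean_positive = np.mean(scores[scores > 0])     # mean of 4, 9, 7, 7, 3
median_big = np.median(scores[scores > 3])      # median of 4, 9, 7, 7
prop_positive = np.mean(scores > 0)             # 5 of the 8 values are positive

print(count_big, count_big_alt)   # 4 4
print(mean_positive)              # 6.0
print(median_big)                 # 7.0
print(prop_positive)              # 0.625
```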
