## Grouping Together Your Data into a Collection
Python also has operators for collecting related data together.  Most of this course will revolve around the pros and cons of different ways of collecting data, but let's take a look at them:

| "tuple" (fixed sequence) | "list" (changeable sequence) | "str" (sequence of text characters) | "set" (mathematical set) |
| :---------:| :----:    | :--------:    | :--------:    |
|  (1, 2, 3) | [1, 2, 3] | "123" or '123' | {1, 2, 3} |
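As a quick sketch of how the four collection types in the table behave, the snippet below builds one of each. The variable names and values are hypothetical and chosen only for illustration.

```python
# Hypothetical example values, used only to illustrate the four collection types.
point = (4, 5, 6)        # tuple: fixed sequence
scores = [4, 5, 6]       # list: changeable sequence
word = "456"             # str: sequence of text characters
unique = {4, 5, 6, 6}    # set: the duplicate 6 is dropped

scores.append(7)         # a list can grow after it is created

print(type(point).__name__, type(scores).__name__, type(word).__name__, type(unique).__name__)
print(scores)
print(len(unique))
```

A tuple has no `append()` method, which is one practical meaning of "fixed" in the table above.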

#### Group Exercises: Making Collections

Make a list of your three favorite numbers

Set x equal to your first name

In [None]:
x = ... 

Make a tuple containing 3 names of people in your group.

Make a list of four animals, ordering them by size.

Collect the set of all letters in your first name.

## Statistics Functions from Numpy

**Numpy** is a Python package that, among other things, has many useful statistics **functions**.  These take any array-like object as an input and can be found inside the **np** library.  Sometimes, the same functionality can be found both as a Numpy function  and an array method, giving you the choice of how you'd like to use it.  


```python
>>> import numpy as np
>>> np.mean([1, 2, 3, 4])
2.5

>>> np.ptp([1, 2, 3, 4])
3
```
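To illustrate the choice between a Numpy function and an array method, here is a minimal sketch that computes the same mean both ways. The variable names and data are hypothetical.

```python
import numpy as np

values = np.array([1, 2, 3, 4])   # hypothetical example data

mean_from_function = np.mean(values)   # Numpy function form
mean_from_method = values.mean()       # array method form

print(mean_from_function, mean_from_method)
```

The method form needs an array: a plain Python list has no `.mean()` method, while `np.mean()` accepts the list directly.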

A couple of lists of Numpy functions can be found here:
  - Math:  https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.math.html
  - Statistics: https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.statistics.html


Some useful function names: mean, min, sum, max, var, std, ptp, median, nanmedian, nanmax, nanmean, nanmin

We'll see more later.
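The `nan`-prefixed functions in the list above differ from their plain counterparts in how they treat missing values. A small sketch, using made-up data:

```python
import numpy as np

readings = [1.0, np.nan, 3.0, 5.0]   # hypothetical data with one missing value

plain_mean = np.mean(readings)         # the nan propagates into the result
nan_aware_mean = np.nanmean(readings)  # the nan entry is ignored

print(plain_mean)
print(nan_aware_mean)
```

Here the plain mean comes out as `nan`, while `np.nanmean()` averages only the three real numbers.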

**Exercise**: Using only Numpy functions, calculate the statistics on the following numbers:

In [None]:
data = [2, 8, 5, 9, 2, 4, 6]
data

[2, 8, 5, 9, 2, 4, 6]

What is the mean of the data?

What is the sum of the data?

What is the minimum of the data?

The variance?

The standard deviation?

The difference between the data's maximum and minimum? ("peak-to-peak")

The data's median?

(*Time Limit Exercise: 5 Minutes*) This data's median?  

In [None]:
data2 = [2, 6, 7, np.nan, 9, 4, np.nan]

## Arrays in Numpy

**Numpy** has a very useful data collection: the **array**.  Arrays are very similar to lists, with one exception:
  - **all elements in the array must be of the same data type (e.g. int, float, bool)**
  
Despite that limitation, arrays are extremely useful for data analysis, and we'll be taking advantage of their many features throughout the course.  So let's start by learning how to easily generate different patterns of data with arrays!
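As a small sketch of what "same data type" means in practice: when a list mixes integers and floats, `np.array()` converts every element to a single common type. The variable names here are hypothetical.

```python
import numpy as np

ints = np.array([1, 2, 3])       # all integers
mixed = np.array([1, 2.5, 3])    # the integers are converted to floats

print(ints.dtype)    # an integer type (the exact name depends on the platform)
print(mixed.dtype)   # a float type
print(mixed)
```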

### Building Arrays

Let's generate some arrays using Numpy functions!  Some commonly-used examples are **arange()**, **linspace()**, **zeros()**, and the random number generation functions in **random**.

| function | Purpose |  Example |
| :-----------: | :-------------: | :-------------: |
| **np.array()**  | Turns a list into an array |   np.array([2, 5, 3]) |
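| **np.arange()**                  | Makes an array of the integers from a start value up to, but not including, a stop value | np.arange(2, 7) |
| **np.linspace()**               | Makes an array of a specific length, with values evenly spaced between two endpoints (both included) |  np.linspace(2, 3, 10) |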
| **np.zeros()**                    | Makes an array of all zeros | np.zeros(5) |
| **np.ones()**                     | Makes an array of all ones | np.ones(3) |
| **np.random.random()** | Makes an array of random numbers, uniformly distributed between 0 and 1 | np.random.random(100) |
| **np.random.randn()**     | Makes an array of normally-distributed random numbers | np.random.randn(100) |
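The functions in the table can be tried out as follows. This is only a sketch with small, hypothetical sizes so the results are easy to read.

```python
import numpy as np

from_list = np.array([2, 5, 3])        # array built from a list
integers = np.arange(2, 7)             # 2, 3, 4, 5, 6 (the stop value 7 is excluded)
evenly_spaced = np.linspace(2, 3, 5)   # 5 values from 2 to 3, both ends included
zeros = np.zeros(5)                    # five 0.0 values
ones = np.ones(3)                      # three 1.0 values
uniform = np.random.random(4)          # 4 random floats, each at least 0 and below 1
normal = np.random.randn(4)            # 4 draws from a standard normal distribution

for name, arr in [("arange", integers), ("linspace", evenly_spaced), ("zeros", zeros)]:
    print(name, arr)
```

Note the difference in the third argument: `np.linspace()` takes the number of values to produce, whereas an optional third argument to `np.arange()` is the step size.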


#### Exercises

Import the numpy package as `np`:

Turn this list into an array:

In [None]:
[4, 7, 6, 1]

[4, 7, 6, 1]

Make an array containing the integers from 1 to 15.

Make an array of exactly 6 evenly spaced numbers between 1 and 10.

Make an array of the values from 2 to 6, spaced 0.5 apart.

Turn this list into an array:

In [None]:
a = [True, False, False, True]

Make an array containing 20 zeros.

Make an array containing 20 ones!

How about an array of 10 evenly spaced values between 100 and 1000?

Generate an array of 10 random numbers

### Combining array generation with statistics functions

These exercises all involve two steps:
  1. Make the data
  2. Calculate something on the data

For example:
```python
np.mean(np.arange(1, 10))  # the mean of the integers from 1 to 9
```

#### Exercises

What is the standard deviation of the integers between 2 and 20?

What is the standard deviation of the numbers generated from the np.random.randn() function?  

What is the sum of an array of 100 ones?

What is the sum of an array of 100 zeros?

### Array Broadcasting: Combining Arrays with Operators

Remember our math operators?  We can use them on arrays, too!

|Assign to a Variable, | | Add,  | Subtract, | Multiply, | Divide, | Power, | Integer Divide, | Remainder after Division | 
|  :---------------:  | :-: |:---:| :-----: | :------: | :----: | :---: | :------------: | :----------------------: |
|         =           | | +   |    -    |    *     |   /    |   **  |       //       |           %              |

Numpy also has functions that can transform each value in an array using a math operation.  For example: `np.log()`, `np.abs()`, `np.sin()`, `np.cos()`, `np.tan()`, `np.sqrt()`
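A brief sketch of both ideas, using hypothetical data: a math operator between an array and a single number is applied to every element, and functions such as `np.sqrt()` likewise transform each value.

```python
import numpy as np

values = np.array([1, 4, 9, 16])   # hypothetical example data

plus_ten = values + 10     # 11, 14, 19, 26
doubled = values * 2       # 2, 8, 18, 32
squared = values ** 2      # 1, 16, 81, 256
remainders = values % 3    # 1, 1, 0, 1
roots = np.sqrt(values)    # 1.0, 2.0, 3.0, 4.0

print(plus_ten)
print(roots)
```

Doing the same with a plain list behaves differently: `[1, 4, 9, 16] * 2` repeats the list rather than doubling each number, which is one reason arrays are convenient for numeric work.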

**Exercises**

Add 10 to all of the numbers below

Multiply everything in the array below by 10

Multiply all the numbers from 1 to 100 by 1000

Calculate the absolute value of the following data

In [None]:
np.array([-5, 7, -2, 4, -10])

Calculate the cosine of all integers between 0 and 6

Calculate the square root of 100 normally-distributed random numbers
