## Grouping Together Your Data into a Collection
Python also has built-in syntax for collecting related data together. Most of this course will revolve around the pros and cons of different ways of collecting data, so let's first take a look at them:

| "tuple" (fixed sequence) | "list" (changeable sequence) | "str" (sequence of text characters) | "set" (mathematical set) |
| :---------:| :----:    | :--------:    | :--------:    |
|  (1, 2, 3) | [1, 2, 3] | "123" or '123' | {1, 2, 3} |
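
As a minimal sketch of how the four types in the table differ in practice, the snippet below builds one of each and tries a few operations. The variable names are chosen only for illustration.

```python
# One example of each collection type from the table above.
numbers_tuple = (1, 2, 3)      # tuple: fixed sequence
numbers_list = [1, 2, 3]       # list: changeable sequence
digits_str = "123"             # str: sequence of text characters
numbers_set = {1, 2, 3, 3}     # set: the duplicate 3 is dropped

# Lists can be changed in place.
numbers_list.append(4)

# All four types report their size with len().
sizes = [len(numbers_tuple), len(numbers_list), len(digits_str), len(numbers_set)]
print(sizes)  # [3, 4, 3, 3]

# Trying to change a tuple raises a TypeError.
try:
    numbers_tuple[0] = 10
    tuple_changed = True
except TypeError:
    tuple_changed = False
print(tuple_changed)  # False
```

Note that the set ends up with three elements even though four were written, because a set keeps each value only once.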

#### Exercises: Making Collections

Make four different types of collections, each containing only the numbers from 1 to 5.

In [2]:
x = "Nick"
x

'Nick'

In [4]:
[1, 1, 2, 2, 1]

[1, 1, 2, 2, 1]

In [3]:
{1, 1, 2, 2, 1}

{1, 2}

Make a sequence containing 3 names of people in this class.

In [6]:
("Vera", "Kilian", "Paula")

('Vera', 'Kilian', 'Paula')

Make a list of four animals, ordering them by size.

In [4]:
["Elephant", "Lion", "Cat", "Fly"]

['Elephant', 'Lion', 'Cat', 'Fly']

Collect the set of all letters in your first name.

In [5]:
("K", "I", "L", "I", "A", "N")

('K', 'I', 'L', 'I', 'A', 'N')

List examples of each type of collection in Python.

In [12]:
"123"

'123'

In [8]:
{1,2,3}

{1, 2, 3}

In [10]:
(1,2,3)

(1, 2, 3)

In [11]:
[1,2,3]

[1, 2, 3]

## Statistics Functions from Numpy

**Numpy** is a Python package that, among other things, has many useful statistics **functions**.  These take any array-like object as input and are reached through the **np** name, the conventional alias created by `import numpy as np`.  Sometimes the same functionality is available both as a Numpy function and as an array method, giving you the choice of how you'd like to use it.


```python
>>> import numpy as np
>>> np.mean([1, 2, 3, 4])
2.5

>>> np.ptp([1, 2, 3, 4])
3
```
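
The choice between a Numpy function and an array method can be sketched as follows. This is only an illustration with made-up variable names; it assumes Numpy is installed and imported as `np`.

```python
import numpy as np

values = np.array([1, 2, 3, 4])

mean_from_function = np.mean(values)   # Numpy function: accepts any array-like
mean_from_method = values.mean()       # array method: only available on arrays

print(mean_from_function == mean_from_method)  # True

# A plain list has no .mean() method, so the function form is the one to use.
list_has_mean = hasattr([1, 2, 3, 4], "mean")
print(list_has_mean)  # False
```

The function form is the more general of the two, since it also works on lists and tuples without converting them first.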

A couple of lists of Numpy functions can be found here:
  - Math:  https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.math.html
  - Statistics: https://docs.scipy.org/doc/numpy-1.13.0/reference/routines.statistics.html


Some useful function names: mean, min, sum, max, var, std, ptp, median, nanmedian, nanmax, nanmean, nanmin

We'll see more later.

**Exercise**: Using only Numpy functions, calculate the following statistics for these numbers:

In [14]:
data = [2, 8, 5, 9, 2, 4, 6]
data

[2, 8, 5, 9, 2, 4, 6]

What is the mean of the data?

In [15]:
import numpy as np
np.mean(data)

5.142857142857143

What is the sum of the data?

In [16]:
np.sum(data)

36

What is the minimum of the data?

In [17]:
np.min(data)

2

The variance?

In [18]:
np.var(data)

6.408163265306122

The standard deviation?

In [19]:
np.std(data)

2.531435020952764
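
One detail worth knowing about the two results above: by default, `np.var` and `np.std` divide by the number of values N (the "population" form). Passing `ddof=1` divides by N - 1 instead (the "sample" form). The sketch below compares the two on the same data; the variable names are illustrative only.

```python
import numpy as np

data = [2, 8, 5, 9, 2, 4, 6]

population_var = np.var(data)        # default ddof=0: divide by N
sample_var = np.var(data, ddof=1)    # ddof=1: divide by N - 1
sample_std = np.std(data, ddof=1)

n = len(data)
# The two variances differ only by the factor N / (N - 1).
print(np.isclose(sample_var, population_var * n / (n - 1)))  # True
# The standard deviation is the square root of the matching variance.
print(np.isclose(sample_std, np.sqrt(sample_var)))  # True
```

Which form is appropriate depends on whether the data is treated as the whole population or as a sample drawn from it.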

The difference between the data's maximum and minimum? ("peak-to-peak")

In [20]:
np.ptp(data)

7

The data's median?

In [21]:
np.median(data)

5.0

The median of the following data, which contains missing values (`np.nan`)?

In [23]:
data2 = [2, 6, 7, np.nan, 9, 4, np.nan]

In [25]:
np.nan?

Type:        float
String form: nan
Docstring:   Convert a string or number to a floating point number, if possible.


In [26]:
np.nanmedian(data2)

6.0
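
To see why the `nan`-prefixed functions exist, the short sketch below runs both `np.median` and `np.nanmedian` on the same data. It assumes Numpy is imported as `np`; the result names are illustrative.

```python
import numpy as np

data2 = [2, 6, 7, np.nan, 9, 4, np.nan]

plain_median = np.median(data2)     # the NaN values propagate into the result
nan_median = np.nanmedian(data2)    # NaNs are ignored: median of [2, 4, 6, 7, 9]

print(np.isnan(plain_median))  # True
print(nan_median)              # 6.0
```

The same pattern applies to the other pairs listed earlier, such as `mean` and `nanmean` or `max` and `nanmax`.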