# Introduction to Python 
### Notes 1.2, Introduction to Datasets
---

## Objectives

---

# Part 1: Representing Data

## How do I represent single pieces of data?

In [1]:
name = "Michael"
age = 31
height = 1.81
is_awake = False
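Each of these values has a *type*. As a minimal sketch, the built-in `type()` function reports the type of each of the variables above:

```python
name = "Michael"
age = 31
height = 1.81
is_awake = False

# type() reports the kind of data each variable holds
print(type(name))      # <class 'str'>
print(type(age))       # <class 'int'>
print(type(height))    # <class 'float'>
print(type(is_awake))  # <class 'bool'>
```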

## How do I represent a dataset?

Lists are groups of data (collections) which are *ordered* and editable, so you can build them up over time:

In [2]:
hr_dataset = []

hr_dataset.append(60)
hr_dataset.append(60)
hr_dataset.append(61)
hr_dataset.append(59)

In [3]:
hr_dataset

[60, 60, 61, 59]

If you already know the values you want to use, you can define the list with them in place:

In [4]:
hr_dataset = [60, 60, 61, 59]
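Because lists are editable, you can keep changing one after it has been defined. The sketch below replaces one reading and appends another; the replacement values are made up for illustration, and the `[0]` index notation is explained later in these notes:

```python
hr_dataset = [60, 60, 61, 59]

# replace the first reading by assigning to its position
hr_dataset[0] = 62

# add a new reading to the end
hr_dataset.append(58)

print(hr_dataset)  # [62, 60, 61, 59, 58]
```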

## How do I represent a fixed set of data?

Sometimes data should be created *fixed* and then never allowed to change during a program. For this we have `tuple`s:

In [5]:
film = (1, "Annie Hall", 120, 1979)

Tuples cannot be modified after creation...

In [7]:
film.append(1)

AttributeError: 'tuple' object has no attribute 'append'

It *appears* that the syntax for creating a tuple is the `()` (parentheses), but it is actually the comma:

In [8]:
film = 1, "Annie Hall", 120, 1979

In [9]:
film

(1, 'Annie Hall', 120, 1979)

We add parentheses for clarity, and so that tuples look more like lists.
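One place where the comma rule matters is a tuple with a single element. A small sketch (the variable names `single` and `not_a_tuple` are chosen here for illustration):

```python
single = 1,        # the trailing comma makes this a one-element tuple
not_a_tuple = (1)  # parentheses alone only group: this is the integer 1

print(single)             # (1,)
print(type(single))       # <class 'tuple'>
print(type(not_a_tuple))  # <class 'int'>
```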

## How do I represent data relationships?

Lists and tuples are collections of elements, but there is no relationship between elements.


To relate elements we use a dictionary:

In [10]:
dictionary = {
    "happy" : "a positive emotion",
    "sad" : "a negative emotion"
}

In [11]:
dictionary

{'happy': 'a positive emotion', 'sad': 'a negative emotion'}

A dictionary is a set of key-value pairs, where each *key* is associated with a *value*. We can find the value using the key.

In [12]:
dictionary["happy"]

'a positive emotion'
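A key has to match exactly for the lookup to work. As a sketch, the `in` operator checks whether a key is present before you look it up (the key `"Happy"` is an illustrative example of a near miss):

```python
dictionary = {
    "happy" : "a positive emotion",
    "sad" : "a negative emotion"
}

print("happy" in dictionary)  # True
print("Happy" in dictionary)  # False, keys are case-sensitive
print(dictionary["sad"])      # a negative emotion
```

Looking up a key that is not present, such as `dictionary["Happy"]`, raises a `KeyError`.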

## How do I access elements in a collection?

When accessing the entries of a dataset we use index notation, which is `variable[ index ]`

In [13]:
hr_dataset = [60, 60, 61, 59]

hr_dataset[0]

60

In [14]:
dictionary["happy"]

'a positive emotion'

Lists and tuples are indexed by number, starting at `0` and increasing by `1` for every entry.

Dictionaries use whatever index you give them (e.g., above, `"happy"`).

## How do I access elements by position?

Lists (and tuples, etc.) are indexed "twice": every element has an index going forwards (from `0`) and an index going backwards (from `-1`).

In [15]:
hr_dataset[0]

60

In [16]:
hr_dataset[-1]

59

In [17]:
hr_dataset[-3]

60

## How do I find the number of elements in a collection?

In [64]:
len(hr_dataset)

4
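The length also ties the forwards and backwards indexes together. As a sketch (the name `n` is introduced here for illustration), the backwards index `-k` refers to the same element as the forwards index `n - k`:

```python
hr_dataset = [60, 60, 61, 59]
n = len(hr_dataset)  # 4

# -k counts from the end, n - k counts from the start
print(hr_dataset[-1], hr_dataset[n - 1])  # 59 59
print(hr_dataset[-3], hr_dataset[n - 3])  # 60 60
print(hr_dataset[-4], hr_dataset[0])      # 60 60
```

The last valid forwards index is therefore `n - 1`, not `n`.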

## How do I distinguish indexing from creating a list?

Notice that there are two uses of square brackets (`[]`):
* to create a list
* to access elements

To distinguish these uses, note that square brackets written directly after a variable, as in `variable[index]`, are always *indexing*. (And anything with a comma in it is a collection.)
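The two uses can be seen side by side in this sketch (the names `readings`, `first` and `second` are illustrative):

```python
readings = [60, 60, 61, 59]   # brackets on their own: creating a list
first = readings[0]           # brackets after a variable name: indexing

# both uses on one line: create a list, then immediately index into it
second = [60, 70, 80][1]

print(first, second)  # 60 70
```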

## Can data structures store duplicates?

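Every collection must have unique indexes, and that includes the keys of a dictionary. Values can be repeated (the list `[60, 60, 61, 59]` stores `60` twice), but indexes cannot.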

In [18]:
{"A": "Alice", "A": "Aaron"}

{'A': 'Aaron'}

N.B. a dictionary silently keeps only the last value given for a duplicated key.
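A short sketch contrasting the two cases: repeated *values* in a list are all kept, whereas a repeated *key* in a dictionary leaves a single entry (the name `initials` is illustrative):

```python
hr_dataset = [60, 60, 61, 59]            # the value 60 appears twice
initials = {"A": "Alice", "A": "Aaron"}  # the key "A" appears twice

print(len(hr_dataset))  # 4
print(len(initials))    # 1
print(initials["A"])    # Aaron
```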

---

## Exercise (15 min)

* review this notebook first

Your goal in this exercise is to simulate tracking health data for a user, and provide them with a custom health warning message if there are any issues with their health data. 


Define two lists for recording: `heart_rate`, `blood_pressure` 

Use `input()` to `.append()` several heart rate readings to the `heart_rate` list. 

* define `heart_rate` as an empty list `[]`
* ask a user for an `input()`
* `heart_rate.append()` with that input

* Define the other list, `blood_pressure`, by writing its values in when you create it. 

* Define a dictionary called `warning_sign` which relates
    * `"180"` to "WARNING!"
    * `"120"` to "OK"
    * `"60"` to "WARNING!"
    
* `print()` the first and last reading in your lists 
* `print()` your warning dictionary

* EXTRA:
    * ask a user for their blood pressure
    * use `warning_sign` to `print()` a custom warning message 


## Solution 

In [30]:
heart_rate = []
blood_pressure = [120, 180, 150]

In [19]:
user_hr = input("What's your HR?")
heart_rate.append(user_hr)

user_hr = input("What's your HR?")
heart_rate.append(user_hr)

user_hr = input("What's your HR?")
heart_rate.append(user_hr)

What's your HR?60
What's your HR?63
What's your HR?59


In [20]:
heart_rate

['60', '63', '59']

In [22]:
blood_pressure

[120, 180, 150]

In [24]:
print(heart_rate[0], heart_rate[-1])
print(blood_pressure[0], blood_pressure[-1])

60 59
120 150


In [27]:
warning_sign = {
    "180" : "WARNING!",
    "120" : "OK",
    "60" : "WARNING!"
}

In [28]:
print(warning_sign)

{'180': 'WARNING!', '120': 'OK', '60': 'WARNING!'}

In [29]:
user_bp = input("What's your BP?")

warning_sign[user_bp]

What's your BP?180

'WARNING!'




# Part 2: Working with Data

## How do I convert between data types?

Input comes in as text:

In [36]:
answer = input("What's your HR?")

What's your HR?60


In [37]:
answer + 1

TypeError: can only concatenate str (not "int") to str

In [38]:
measurement = float(answer)

In [39]:
measurement + 1

61.0

To convert to a different data type, you just use the name of that type *as a function*:

In [50]:
name = "Michael"
address = ""
age = 31
height = 1.81
location = "3"

In [45]:
str(age) + " years old"

'31 years old'

In [48]:
int(location) + 100

103

Converting to a boolean value (`True` or `False`) is a test of whether a variable is empty:

In [52]:
bool(name)

True

In [51]:
bool(address)

False

`0`, `""`, `[]`, and other zero or empty values all convert to `False`; everything else converts to `True`.
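A quick sketch of this rule, including a case that often surprises people: text is only `False` when it is completely empty, so `"0"` and `"False"` both convert to `True`.

```python
print(bool(0), bool(""), bool([]))            # False False False
print(bool(31), bool("Michael"), bool([0]))   # True True True

# any non-empty text is True, whatever it says
print(bool(" "), bool("0"), bool("False"))    # True True True
```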

## How do I convert user input?

In [55]:
user_hr = float(input("What's your HR?"))
user_hr * 2

What's your HR?61


122.0
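The same conversion works on readings that were already stored as text, as happened in the Part 1 solution. A minimal sketch, assuming a hypothetical list `raw_readings` standing in for values collected with `input()`:

```python
# hypothetical: text readings collected earlier with input()
raw_readings = ['60', '63', '59']

# convert each piece of text before doing arithmetic with it
first = float(raw_readings[0])
last = int(raw_readings[-1])

print(first + 1)   # 61.0
print(last + 1)    # 60
```

Note that `int()` only accepts text that looks like a whole number: `int("60.5")` raises a `ValueError`, whereas `float("60.5")` works.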

## How do I access multiple elements in a sequence?

In [56]:
heart_rates = [60, 70, 50, 40]


heart_rates[0]

60

Obtain the elements from index `0` up to, but not including, index `2`:

In [57]:
heart_rates[0:2]

[60, 70]

`variable[ from_index : to_index ]`
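As a sketch of how the two indexes behave: the element at `to_index` is left out, so a slice contains `to_index - from_index` elements (the name `part` is illustrative):

```python
heart_rates = [60, 70, 50, 40]

part = heart_rates[1:3]   # elements at index 1 and 2; index 3 is excluded
print(part)               # [70, 50]
print(len(part))          # 2, i.e. 3 - 1

# slicing makes a new list; the original is unchanged
print(heart_rates)        # [60, 70, 50, 40]
```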

## How do I select elements from the beginning of a sequence?

Whenever you leave off `from_index`, it assumes `0`: `variable[  : to_index ]`

i.e., "from the beginning"

In [59]:
heart_rates[:3]

[60, 70, 50]

## How do I select elements until the end of a sequence?

Whenever you leave off `to_index`, it assumes `len(variable)`: `variable[ from_index : ]`

i.e., "until the end"

In [63]:
heart_rates

[60, 70, 50, 40]

In [60]:
heart_rates[2:]

[50, 40]

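Aside: the `len()` of a list is the number of elements, which you can also use as the `to_index` of a slice. It is not a valid index for a single element, since the last element sits at `len() - 1`.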

In [61]:
len(heart_rates)

4

In [62]:
heart_rates[2:4]

[50, 40]
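Putting the two shortcuts together: slicing at the same position from both sides splits a list into two parts with nothing lost or repeated. A small sketch (the names `start` and `rest` are illustrative, and `+` joins two lists):

```python
heart_rates = [60, 70, 50, 40]

start = heart_rates[:3]   # same as heart_rates[0:3]
rest = heart_rates[3:]    # same as heart_rates[3:len(heart_rates)]

print(start)          # [60, 70, 50]
print(rest)           # [40]
print(start + rest)   # [60, 70, 50, 40]
```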

## How do I obtain a part of a string?

Strings may also be sliced:

In [78]:
quote = "be the change you wish to see in the world!"

In [79]:
quote[0:2]

'be'

In [80]:
quote[-6:]

'world!'
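Strings also support the other operations shown for lists, such as `len()` and single-element indexing. A brief sketch using the same quote:

```python
quote = "be the change you wish to see in the world!"

print(len(quote))    # 43
print(quote[0])      # b
print(quote[-1])     # !
print(quote[:6])     # be the
```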

## Exercise (15 min)

Using `print()`, produce a report on a user whose health measurements are the following:

```python
heart_rate = [60, 70, 50, 40]
blood_pressure = [120, 180, 150, 175, 110]
```

The report should contain:

* the first two measurements of each
* last two measurements of each
* the number of measurements of HR and BP
    * HINT: `len()`
* the mean of each
    * HINT: use `sum()` and `len()`
    
* EXTRA:
    * use `input()` and `append` to add a new entry to `heart_rate`
        * HINT: use `int()` to convert it to an integer
    * show the report again
        * i.e., recompute the totals/means/etc. now that you have changed the list

## Solution

In [70]:
heart_rate = [60, 70, 50, 40]
blood_pressure = [120, 180, 150, 175, 110]

heart_rate[0:2]

[60, 70]

In [71]:
blood_pressure[:2]

[120, 180]

In [72]:
blood_pressure[-2:]

[175, 110]

In [73]:
heart_rate[-2:]

[50, 40]

In [75]:
sum(heart_rate) / len(heart_rate)

55.0
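The remaining report items follow the same pattern. One possible sketch for the counts and the blood pressure mean, using the lists given in the exercise:

```python
heart_rate = [60, 70, 50, 40]
blood_pressure = [120, 180, 150, 175, 110]

# number of measurements of each
print(len(heart_rate), len(blood_pressure))       # 4 5

# mean = total divided by number of measurements
print(sum(blood_pressure) / len(blood_pressure))  # 147.0
```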

In [76]:
user_hr = float(input("What's your Heart Rate?"))

heart_rate.append(user_hr)
sum(heart_rate) / len(heart_rate)

What's your Heart Rate?120


68.0

In [77]:
heart_rate[-2:]

[40, 120.0]