# Other Composite Data Types - Tuples, Sets, and Dictionaries

## Tuple - an unmodifiable "list-like" data type 

- Tuples are organizations of primitive (or even composite) data types and cannot be modified once defined

- You can access values in a tuple based on their position (0, 1, 2, etc.) by placing square brackets after the variable name (just like strings!)

- You can perform "sequence unpacking" to set multiple variables to all values in a tuple.

- Useful for:
    - Creating unrelated groups of data
    - Storing information generated by a method or function
    
keyword: ```()```
keyword: ```tuple```

We define a tuple in one of two ways:

1. Commas between variables or values set equal to a variable (**avoid doing it this way**)

```my_tuple = 1, 2, 3, 4```

2. Commas between variables or values within open and closed parentheses (do it this way)

```my_tuple = (1, 2, 3, 4)```

Case 1 is a common error when trying to create a ```list```: you remembered the commas but forgot the square brackets. Note that you do **not** have the ```list``` methods available for tuple variables:

In [1]:
my_tuple = 1, 2, 3, 4

my_tuple.append(5)

AttributeError: 'tuple' object has no attribute 'append'

and you cannot change values within the tuple:

In [2]:
my_tuple = (1, 2, 3, 4)

my_tuple[0] = 0

TypeError: 'tuple' object does not support item assignment

However, indexing works just like it does in lists:

In [3]:
my_tuple = (1, 2, 3, 4)

print("my_tuple[0] prints", my_tuple[0])
print("my_tuple[:2] prints", my_tuple[:2])

my_tuple[0] prints 1
my_tuple[:2] prints (1, 2)


You can also "cast" a list into a tuple, or the other way around:

In [4]:
# set x to a tuple
x = (1, 2, 3, 4)

# cast the tuple x into a list
# set the result to x
x = list(x)

# cast the list x into a tuple
# set the result to y
y = tuple(x)

print(x, y)

[1, 2, 3, 4] (1, 2, 3, 4)
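
Casting also offers a workaround when a tuple needs a new value: cast it to a list, change the list, and cast the result back. The following is a minimal sketch with made-up variable names (```readings```, ```temp```, ```updated```); note that it builds a new tuple rather than changing the original.

```python
# a tuple of values (hypothetical example data)
readings = (1, 2, 3, 4)

# cast to a list so that list methods such as append are available
temp = list(readings)
temp.append(5)

# cast back to a tuple; this creates a new tuple
updated = tuple(temp)

print(readings)  # the original tuple is unchanged
print(updated)
```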


### Sequence unpacking

Tuples are often used for storing the results of a method or function. This is because their contents can be "unpacked" into multiple variables in a single statement.

Consider the following tuple definition:

In [5]:
my_tuple = (85, 70, "cloudy", "West Wind")

print(my_tuple)

(85, 70, 'cloudy', 'West Wind')


If instead you would like to set each one of these to a variable, you could do so like this:

In [6]:
high, low, weather, wind = (85, 70, "cloudy", "west wind")

print(f"The high was {high}F, the low was {low}F")
print(f"the weather was {weather}, and there was a {wind}")

The high was 85F, the low was 70F
the weather was cloudy, and there was a west wind


which is equivalent to:

In [7]:
high = 85
low = 70
weather = "cloudy"
wind = "west wind"

print(f"The high was {high}F, the low was {low}F")
print(f"the weather was {weather}, and there was a {wind}")

The high was 85F, the low was 70F
the weather was cloudy, and there was a west wind
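
Unpacking only works when the number of variables matches the number of items. The sketch below, which reuses the example values from above under the made-up name ```weather_report```, shows the error raised on a mismatch, a starred name that collects leftover items into a list, and a one-line swap of two variables.

```python
weather_report = (85, 70, "cloudy", "west wind")

# two variables cannot receive four items
try:
    high, low = weather_report
except ValueError as error:
    message = str(error)
    print("Unpacking failed:", message)

# a starred name collects any leftover items into a list
high, low, *rest = weather_report
print(high, low, rest)

# unpacking also lets two variables trade values in one line
high, low = low, high
print(high, low)
```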


### Advanced Python Alert - Tuples and methods

Remember the exponential method used in HW1?

In [8]:
import math

print(math.exp(5))

148.4131591025766


```math.exp``` is a method that takes a parameter (in this case 5), runs the calculation $e^{5}$, and "returns" the result which is a ```float``` value (in this case, ```148.4131591025766```).

There are some methods that can return 2 or more results.

Consider the following method definition that takes a list as a parameter and returns the unique values and the count of unique values:

In [9]:
def unique(my_list):
    
    unique_values = list(set(my_list))
    unique_count = len(unique_values)
    
    return unique_values, unique_count

x = [1, 2, 1, 1, 4, 5]

result = unique(x)

print(result)

([1, 2, 4, 5], 4)


Hopefully you noticed that ```result``` is a ```tuple```!

It may be more convenient to set the result to two variables:

In [10]:
def unique(my_list):
    
    unique_values = list(set(my_list))
    unique_count = len(unique_values)
    
    return unique_values, unique_count

x = [1, 2, 1, 1, 4, 5]

items, count = unique(x)

print(f"The unique values were {items}")
print(f"There were {count} unique values")

The unique values were [1, 2, 4, 5]
There were 4 unique values
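
The same pattern applies to any function that returns values separated by commas. As a further sketch, the hypothetical function ```min_max``` below (not part of the course material) returns two results, and the built-in ```divmod``` is shown as an existing function that also returns a tuple.

```python
def min_max(values):
    # two results separated by a comma are packed into one tuple
    return min(values), max(values)

temperatures = [85, 95, 78, 70]

# without unpacking, the result is a single tuple
result = min_max(temperatures)
print(type(result), result)

# with unpacking, each result gets its own variable
lowest, highest = min_max(temperatures)
print(f"The lowest value was {lowest} and the highest was {highest}")

# the built-in divmod returns a tuple: (quotient, remainder)
quotient, remainder = divmod(17, 5)
print(quotient, remainder)
```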


## Set - a "list-like" data type that only keeps unique values

- Sets are unordered collections of unique values and can be modified once defined. Items must be immutable (for example numbers, strings, or tuples)

- Unlike a ```list``` or ```tuple```, you cannot access values in a set by position

- You can use the ```set``` method ```add()``` to add an item to a set. Adding an item that is already in the set has no effect

- Useful for:
    - Identifying unique values
    - Removing duplicates
    
keyword: ```{}```
keyword: ```set```

We define a set in one of two ways:

1. Open and closed curly brackets with items separated by commas:

```my_set = {1, 2, 3, 4}```

2. "Casting" an existing ```list``` or ```tuple``` to a ```set```

```my_set = set(my_list)```

Even when you define the set initially, duplicates are removed:

In [11]:
my_set = {1, 2, 3, 4, 5, 5, 5}

print(my_set)

{1, 2, 3, 4, 5}


Duplicates in existing lists will be removed when casting. Order is not preserved in a set.

In [12]:
my_list = ["a", "b", "c", 1, 1, "c", "1", 8]

my_set = set(my_list)

print(my_set)

{1, 8, '1', 'c', 'b', 'a'}
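
If the original order matters, a set is not the right tool. One possible alternative, sketched below, uses ```dict.fromkeys``` (dictionaries are covered later in this notebook): dictionary keys are unique and keep the order in which they were first added in Python 3.7 and later. The variable name ```unique_in_order``` is made up for this example.

```python
my_list = ["a", "b", "c", 1, 1, "c", "1", 8]

# dict keys are unique and keep insertion order (Python 3.7+)
unique_in_order = list(dict.fromkeys(my_list))

print(unique_in_order)
```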


If you try to add an item that already exists in the set, it will not modify the set:

In [13]:
my_set = {1, 2, 3, 4, 5, 5, 5}

print("my_set after definition", my_set)

my_set.add(6)

print("my_set after adding value not in the set", my_set)

my_set.add(6)

print("my_set after adding value already in the set", my_set)

my_set after definition {1, 2, 3, 4, 5}
my_set after adding value not in the set {1, 2, 3, 4, 5, 6}
my_set after adding value already in the set {1, 2, 3, 4, 5, 6}
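
Two other set behaviors are worth seeing once. The sketch below, with made-up values, shows that a set cannot be indexed by position and that a ```list``` cannot be stored as an item in a set, while a ```tuple``` can.

```python
my_set = {3, 1, 2}
indexing_failed = False

# sets have no positions, so indexing raises a TypeError
try:
    my_set[0]
except TypeError as error:
    indexing_failed = True
    print("TypeError:", error)

# to work with positions, cast to a list first (sorted gives a predictable order)
my_list = sorted(my_set)
print(my_list[0])

# a tuple can be stored in a set, but a list cannot
my_set.add((4, 5))
try:
    my_set.add([6, 7])
except TypeError as error:
    print("TypeError:", error)

print(len(my_set))
```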


### "List-like" operations to test if items are in a ```set```, ```list``` or ```tuple```

It may be useful to check if an item is in a composite data type.

The syntax is:

```True``` if item is in the list, ```False``` otherwise:

```item in list``` 

```True``` if item is not in the list, ```False``` otherwise:

```item not in list``` 

In [14]:
my_list = ["a", 1, 1.5, True]

print("Is True in my_list?", True in my_list)

print('Is "a" not in my_list?', "a" not in my_list)

print("Is 1.5 in my_list?", 1.5 in my_list)

print("Is 1 in my_list?", 1 in my_list)

Is True in my_list? True
Is "a" not in my_list? False
Is 1.5 in my_list? True
Is 1 in my_list? True
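
The same syntax works for a ```tuple``` and a ```set```. The sketch below uses made-up example values and also illustrates one subtlety: Python treats ```True``` as equal to ```1```, so a membership test cannot tell them apart, while the string ```"1"``` and the integer ```1``` remain different.

```python
my_tuple = (85, 70, "cloudy")
my_set = {"IL", "WI", "IA"}

print("Is 70 in my_tuple?", 70 in my_tuple)
print('Is "MN" not in my_set?', "MN" not in my_set)

# True == 1 in Python, so this membership test succeeds
print("Is True in [1, 2, 3]?", True in [1, 2, 3])

# the string "1" is not the integer 1
print('Is "1" in [1, 2, 3]?', "1" in [1, 2, 3])
```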


# Dictionaries

- Dictionaries are organizations of primitive (or even composite) data types that can be modified once defined. Each item is accessed with a key (index), which can be a number, a string, or another immutable value
    - Items in a dictionary can be numbers, strings, etc., just like in ```list```, ```tuple```, and ```set```
    
- Biggest difference between dictionaries and ```list``` is how you access items in the dictionary.

- Useful for:
    - Organizing data with human-readable indexes
    - Defining parameters to be used in a method
    - Generating data that can be easily converted to csv/excel files and used in ```pandas```
    
keyword: ```dict```

keyword: ```{'index': value} # you need the : to differentiate it from a set```

We define a ```dict``` in one of two ways:

1. Open and closed curly brackets with key, value pairs separated by commas:

```my_dict = {key1: value1, key2: value2}```

2. The keyword ```dict``` with key, value pairs identified using ```=``` and separated by commas 

```my_dict = dict(key1=value1, key2=value2)```

In [15]:
my_dict = {'temperature': 85, 'dewpoint': 70}

print(my_dict)

{'temperature': 85, 'dewpoint': 70}


In [16]:
my_dict = dict(temperature=85, dewpoint=70)

print(my_dict)

{'temperature': 85, 'dewpoint': 70}
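
The ```dict(key=value)``` form only works when every key is a valid Python name, so a key such as ```'wind speed'``` (with a space) cannot be written that way. As a sketch of one alternative, ```dict``` also accepts (key, value) pairs, which ```zip``` can build from two lists. The names ```keys```, ```values```, and ```same_dict``` are made up for this example.

```python
keys = ["temperature", "dewpoint", "wind speed"]
values = [85, 70, 10]

# zip pairs each key with its value; dict turns the pairs into a dictionary
my_dict = dict(zip(keys, values))
print(my_dict)

# the same result from an explicit list of (key, value) tuples
same_dict = dict([("temperature", 85), ("dewpoint", 70), ("wind speed", 10)])
print(my_dict == same_dict)
```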


### Advanced Python alert

You will notice that composite data types show up everywhere!

For methods, the parameters named in the method definition behave much like the keys of a ```dict```: when you call the method, each parameter name is paired with a value.

Notice that after the method name, there are keys (parameters) that need to be set to specific values:

In [17]:
def isin(my_list, value):
    
    return value in my_list

x = [1, 2, 1, 1, 4, 5]

result = isin(my_list=x, value=1)

print(result)

True


We can even pass a dictionary directly into the function by unpacking the dictionary with the ```**``` operator:

In [18]:
def isin(my_list, value):
    
    return value in my_list

x = [1, 2, 1, 1, 4, 5]

params = dict(my_list=x, value=1)

result = isin(**params)

print(result)

True
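
The connection also runs in the other direction. In the sketch below, the hypothetical function ```describe``` uses ```**``` in its definition, which collects whatever keyword arguments it receives into an ordinary ```dict```.

```python
def describe(**kwargs):
    # kwargs is a regular dict built from the keyword arguments
    return kwargs

params = describe(temperature=85, dewpoint=70)

print(type(params))
print(params)
```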


### Basic ```dict``` usage

You need to choose useful indexes.

For example, we might have data on an observation at NIU

In [19]:
observation = {'temperature': 85,
               'dewpoint': 75,
               'wind_speed': 10,
               'wind_direction': 'SSW',
               'weather_conditions': 'Partly Cloudy'}

observation

{'temperature': 85,
 'dewpoint': 75,
 'wind_speed': 10,
 'wind_direction': 'SSW',
 'weather_conditions': 'Partly Cloudy'}

If you want to access the temperature, you would do the following:

1. add an opening square bracket to the dictionary name: ```observation[```
2. type in the key exactly as it appears in the dictionary definition: ```observation['temperature'```
3. add a closing square bracket: ```observation['temperature']```

In [20]:
observation['temperature']

85

you would **not** access it like a list:

In [21]:
observation[0]

KeyError: 0

You get a ```KeyError``` when you try to access a key that does not exist in the dictionary.
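
When a key might be missing, there are ways to check before the error happens. The sketch below uses a small made-up dictionary named ```obs``` and shows two options: testing for the key with ```in```, and the ```dict``` method ```get```, which returns ```None``` (or a default you choose) instead of raising an error.

```python
obs = {'temperature': 85, 'dewpoint': 75}

# option 1: check for the key before using it
if 'wind_speed' in obs:
    print(obs['wind_speed'])
else:
    print("no wind_speed recorded")

# option 2: get returns None, or a default you choose, for a missing key
missing = obs.get('wind_speed')
fallback = obs.get('wind_speed', 0)
temperature = obs.get('temperature')

print(missing, fallback, temperature)
```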

Be aware that case and spelling matter!

In [22]:
observation['Temperature']

KeyError: 'Temperature'

In [23]:
observation['tempratrue']

KeyError: 'tempratrue'

What happened? Neither key matches ```'temperature'``` exactly, so Python cannot find it in the dictionary and raises a ```KeyError```.

### Dictionaries with composite data types

A very common pattern used with dictionaries is to use the indexes as placeholders for item containers like a ```list```.

For example, you may want to have a list of observations on different days.

You would define the dictionary like above, except you would remove the values and add an empty list: ```[]```:

In [24]:
observation = {'temperature': [],
               'dewpoint': [],
               'wind_speed': [],
               'wind_direction': [],
               'weather_conditions': []}

observation

{'temperature': [],
 'dewpoint': [],
 'wind_speed': [],
 'wind_direction': [],
 'weather_conditions': []}

Now, you can access each ```list``` in the dictionary and add values using ```append```:

In [25]:
observation = {'date': [],
               'temperature': [],
               'dewpoint': [],
               'wind_speed': [],
               'wind_direction': [],
               'weather_conditions': []}


observation['date'].append('1999-05-03')
observation['temperature'].append(85)
observation['dewpoint'].append(75)
observation['wind_speed'].append(10)
observation['wind_direction'].append("SSW")
observation['weather_conditions'].append("Partly Cloudy")

observation['date'].append('1999-05-04')
observation['temperature'].append(95)
observation['dewpoint'].append(78)
observation['wind_speed'].append(15)
observation['wind_direction'].append("S")
observation['weather_conditions'].append("Thunderstorms")

observation

{'date': ['1999-05-03', '1999-05-04'],
 'temperature': [85, 95],
 'dewpoint': [75, 78],
 'wind_speed': [10, 15],
 'wind_direction': ['SSW', 'S'],
 'weather_conditions': ['Partly Cloudy', 'Thunderstorms']}
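
Typing one ```append``` per key gets repetitive as the number of observations grows. As a sketch of one way to shorten it, the hypothetical helper ```add_observation``` below loops over a single day's record with the ```dict``` method ```items``` and appends each value to the matching list. The names ```obs_table``` and ```record``` are made up, and only three keys are used to keep the example short.

```python
def add_observation(table, record):
    # append each value in record to the list stored under the same key
    for key, value in record.items():
        table[key].append(value)

obs_table = {'date': [], 'temperature': [], 'dewpoint': []}

add_observation(obs_table, {'date': '1999-05-03', 'temperature': 85, 'dewpoint': 75})
add_observation(obs_table, {'date': '1999-05-04', 'temperature': 95, 'dewpoint': 78})

print(obs_table)
```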

### Advanced Python alert

Dictionaries and the Python ```pandas``` package make it easy to visualize, analyze, and save your data on the hard drive.

If this does not work, please click the Python [conda env...] button in the upper right and choose 'pyEAE' as the environment. Rerun the notebook.

We will discuss ```pandas``` in great detail later in the course.

In [26]:
import pandas as pd

df = pd.DataFrame.from_dict(observation)

df = df.set_index('date')

df

|date|temperature|dewpoint|wind_speed|wind_direction|weather_conditions|
|----|-----------|--------|----------|--------------|------------------|
|1999-05-03|85|75|10|SSW|Partly Cloudy|
|1999-05-04|95|78|15|S|Thunderstorms|
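
To see why a dictionary of lists maps so naturally onto rows and columns, the sketch below writes the same kind of data as csv text using only the standard-library ```csv``` module. This is not the ```pandas``` approach shown above, just an illustration of the idea: the keys become the header row and ```zip``` regroups the lists into one row per date. The name ```obs_columns``` is made up, and an in-memory buffer stands in for a file on the hard drive.

```python
import csv
import io

obs_columns = {'date': ['1999-05-03', '1999-05-04'],
               'temperature': [85, 95],
               'dewpoint': [75, 78]}

# an in-memory text buffer stands in for a file on disk
buffer = io.StringIO()
writer = csv.writer(buffer, lineterminator="\n")

# header row: the dictionary keys
writer.writerow(obs_columns.keys())

# zip(*lists) regroups the column lists into one tuple per row
for row in zip(*obs_columns.values()):
    writer.writerow(row)

csv_text = buffer.getvalue()
print(csv_text)
```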


# Summary of composite data types and comparisons

### Examples using composite data types

#### Accessing values

Note: You cannot index ```set``` values.

1a. Access one value in a ```list```

In [27]:
a = [1, 2, 3]

print(a[0])

1


1b. Access one value in a ```tuple```

In [28]:
a = (1, 2, 3)

print(a[0])

1


1c. Access one value in a ```dict```

In [29]:
a = {'ijk': 1, 'xyz': 2, 'abc': 3}

print(a['ijk'])

1


#### Adding values

Note: You cannot add values to a ```tuple```.

2a. Add a value to a ```list```:

In [30]:
a = [1, 2, 3]

a.append(5)

print(a)

[1, 2, 3, 5]


2b. Add a value to a ```set```:

In [31]:
a = {1, 2, 3}

a.add(4)

print(a)

{1, 2, 3, 4}


2c. Add a key, value pair to a dictionary:

In [32]:
a = {'ijk': 1, 'xyz': 2, 'abc': 3}

a['def'] = 4

print(a)

{'ijk': 1, 'xyz': 2, 'abc': 3, 'def': 4}


#### Combining composite data types

3a. Add a value to a list within a dictionary:

In [33]:
a = {'ijk': [1, 2, 3], 'xyz': [4, 5, 6], 'abc': [7, 8, 9]}

a['abc'].append(10)

print(a)

{'ijk': [1, 2, 3], 'xyz': [4, 5, 6], 'abc': [7, 8, 9, 10]}


3b. Access a value in a list within a dictionary:

In [34]:
a = {'ijk': [1, 2, 3], 'xyz': [4, 5, 6], 'abc': [7, 8, 9]}

print(a['abc'][0])

7


3c. Access multiple values using slicing in a list within a dictionary:

In [35]:
a = {'ijk': [1, 2, 3], 'xyz': [4, 5, 6], 'abc': [7, 8, 9]}

print(a['ijk'][:2])

[1, 2]


### Practice

Create a dictionary that contains the following tornado report data. Make sure you use the correct data types:

|Time|F-Scale|Location|County|State|
|----|-------|--------|------|-----|
|0149|   1   |1 E Big Rock|Kane|IL|
|0223|   0   |1 ESE Glen Ellyn|DuPage|IL|
|0229|   0   |Villa Park|DuPage|IL|
|0234|   0   |Bensenville|DuPage|IL|