<a href="https://colab.research.google.com/github/3048476752ksvl-lang/IB2AD0_Data_Science_GenerativeAI/blob/main/1_06_other_data_structures.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

![](https://drive.google.com/uc?export=view&id=1xqQczl0FG-qtNA2_WQYuWePW9oU8irqJ)

# 1.06 Other Data Structures

## Tuples

Tuples are a lot like lists, except they have some slightly different properties which we will touch on in due course. A tuple is (typically) made using round brackets (parentheses).

In [17]:
first_tuple = ("a", 2, 5.5, False)

_Note: We do not need the round brackets for a standalone tuple, only for nested tuples. So, first_tuple = "a", 2, 5.5, False would work in exactly the same way. Any time we have a nested tuple (a tuple inside a tuple, or a tuple inside a list or dictionary), the brackets become necessary. Always using brackets is advisable, as it makes the code more readable and makes it easier for you and others to understand the structure being used._
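The note above can be checked with a short sketch. The names `packed`, `bracketed`, `nested`, `flat`, `single` and `not_a_tuple` are made up for illustration and are not part of the original notebook; the one-item case is an extra detail added here because it is where the brackets alone are not enough.

```python
# Parentheses are optional when a tuple stands on its own.
packed = "a", 2, 5.5, False
bracketed = ("a", 2, 5.5, False)
print(packed == bracketed)  # True

# Inside another structure, the parentheses are what mark each inner tuple.
nested = [("x", 1), ("y", 2)]   # a list of two tuples
flat = ["x", 1, "y", 2]         # a list of four separate items
print(len(nested), len(flat))   # 2 4

# A one-item tuple needs a trailing comma; brackets alone give a plain string.
single = ("a",)
not_a_tuple = ("a")
print(type(single).__name__, type(not_a_tuple).__name__)  # tuple str
```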

_As with lists, we are creating an ordered structure of single items (rather than the key/value system of dictionaries) which can contain repeated values. Again, we index (or slice) a tuple in the same way as we would a list._

In [18]:
first_tuple[1]

2
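As a small illustration of indexing and slicing, the sketch below uses a hypothetical `example_tuple` (not from the original notebook) in which the value "a" appears twice, to show that repetition is allowed and that a slice of a tuple is itself a tuple.

```python
# Hypothetical tuple for illustration; "a" is deliberately repeated.
example_tuple = ("a", 2, 5.5, False, "a")

print(example_tuple[0])    # a
print(example_tuple[-1])   # a
print(example_tuple[1:3])  # (2, 5.5)

# Slicing a tuple gives back another tuple, not a list.
middle = example_tuple[1:3]
print(type(middle).__name__)  # tuple
```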

One key difference between tuples and lists is that a tuple is __immutable__. This means that we cannot change items in a tuple in the way we would change an item in a list. Attempting to change an item in a tuple will result in a TypeError.

In [19]:
first_tuple = ("a", 2, 5.5, True)
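Rebinding the name, as in the cell above, creates a brand new tuple and raises no error. It is item assignment on an existing tuple that fails. A minimal sketch of that failure, wrapped in try/except so the cell keeps running:

```python
first_tuple = ("a", 2, 5.5, False)

try:
    # Tuples do not support item assignment, so this line raises TypeError.
    first_tuple[3] = True
except TypeError as error:
    print("TypeError:", error)

# The tuple is unchanged after the failed assignment.
print(first_tuple)  # ('a', 2, 5.5, False)
```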

It is possible to include mutable items inside a tuple. For instance, if an item in a tuple is a list, it is possible to change items inside the list, by indexing in the usual fashion.

In [20]:
second_tuple = (["Michael", "Mark", "Wenjuan", "Katy"], "Cheese", 40000, True,)
second_tuple[0][-1]

'Katy'

In [21]:
second_tuple[0].pop(-1)
second_tuple

(['Michael', 'Mark', 'Wenjuan'], 'Cheese', 40000, True)
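One way to read the behaviour above: the tuple stores a reference to the list, and that reference never changes even though the list's contents do. The sketch below assumes hypothetical names `names` and `record`, plus an invented extra name "Sam", purely for illustration.

```python
names = ["Michael", "Mark", "Wenjuan", "Katy"]
record = (names, "Cheese", 40000, True)

# Changing the list in place is allowed: the tuple still holds the same list object.
record[0].append("Sam")  # "Sam" is an invented value for this example
print(record[0] is names)  # True
print(len(record[0]))      # 5

# Replacing the list with a different object is item assignment, so it fails.
try:
    record[0] = ["someone else"]
except TypeError as error:
    print("TypeError:", error)
```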

To change an element of the tuple itself, we need to recreate the whole tuple object.

In [22]:
second_tuple = (["Michael", "Mark", "Wenjuan", "Katy"], "Cheese", 40000, False, True,)
second_tuple

(['Michael', 'Mark', 'Wenjuan', 'Katy'], 'Cheese', 40000, False, True)
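Retyping every item is one way to recreate a tuple. As an alternative sketch, a new tuple can be assembled from a slice of the old one plus the new values; `updated_tuple` is a name introduced here for illustration.

```python
second_tuple = (["Michael", "Mark", "Wenjuan", "Katy"], "Cheese", 40000, True)

# Keep the first three items and attach the new boolean values at the end.
updated_tuple = second_tuple[:3] + (False, True)
print(updated_tuple)
# (['Michael', 'Mark', 'Wenjuan', 'Katy'], 'Cheese', 40000, False, True)

# The original tuple is left untouched; a new object was built.
print(second_tuple[-1])  # True
print(len(second_tuple), len(updated_tuple))  # 4 5
```

Note that the slice copies references, so both tuples share the same inner list of names.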

Because tuples are immutable, most list methods (such as append() or pop()) are not available. The exceptions are the __index()__ and __count()__ methods.
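A brief sketch of the two methods that tuples do provide, using a hypothetical `example_tuple` in which "a" appears twice:

```python
# Hypothetical tuple for illustration; "a" is deliberately repeated.
example_tuple = ("a", 2, 5.5, False, "a")

# count() returns how many times a value occurs.
print(example_tuple.count("a"))  # 2

# index() returns the position of the first matching value.
print(example_tuple.index(5.5))  # 2

# Methods that would modify the object, such as append(), do not exist on tuples.
print(hasattr(example_tuple, "append"))  # False
```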

## Sets

Like tuples, sets are a lot like lists, with the main difference being that you are limited to a single instance of each item (no duplicate values). Sets use curly brackets, { }, like dictionaries, but have no key/value pairs (i.e. no colon separating keys from values).

In [23]:
first_set = {"u10000", "u10001", "u10002"}
first_set

{'u10000', 'u10001', 'u10002'}
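To see the "no duplicates" rule in action, the sketch below builds a set from a list with repeated IDs. The list `user_ids` is invented for this example. It also shows a related detail that is easy to trip over: empty curly brackets create a dictionary, so an empty set has to be made with set().

```python
# Hypothetical list with repeated IDs.
user_ids = ["u10000", "u10001", "u10000", "u10002", "u10001"]

# Converting to a set keeps one copy of each value.
unique_ids = set(user_ids)
print(len(user_ids), len(unique_ids))  # 5 3

# Sets have no guaranteed display order, so we sort before printing.
print(sorted(unique_ids))  # ['u10000', 'u10001', 'u10002']

# Empty curly brackets give a dict; use set() for an empty set.
empty_dict = {}
empty_set = set()
print(type(empty_dict).__name__, type(empty_set).__name__)  # dict set
```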

Unlike with a list, indexing is not really relevant to sets. We do not care about the position of items, just whether or not they are in the set. If we try to index a set, we get a TypeError.

In [24]:
'u10001' in first_set

True
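A minimal sketch of the TypeError mentioned above, alongside the membership checks that sets are designed for. The ID "u99999" is an invented value that is not in the set.

```python
first_set = {"u10000", "u10001", "u10002"}

try:
    # Sets have no positions, so indexing is not supported.
    first_set[1]
except TypeError as error:
    print("TypeError:", error)

# Membership tests are the natural way to query a set.
print("u10001" in first_set)      # True
print("u99999" not in first_set)  # True
```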

We can add data to our set by using either set.add( ) for a single item or set.update( ) for multiple items.

In [25]:
first_set.add("u10003")
first_set

{'u10000', 'u10001', 'u10002', 'u10003'}

In [26]:
first_set.update(['u10004', 'u10005'])
first_set

{'u10000', 'u10001', 'u10002', 'u10003', 'u10004', 'u10005'}

When we pass a list to set.update( ), the list itself does not become an item in the set; rather, its individual values are added. In other words, the list is __unpacked__. Given the main use case of sets, checking whether or not an item is in a set, this makes some sense.
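The unpacking behaviour has two consequences worth sketching, using the made-up names `id_set` and `letters`: a string passed to update( ) is unpacked into its characters, and a list passed to add( ) is rejected because lists are not hashable and so cannot be set items.

```python
id_set = {"u10000"}

# update() unpacks the list and adds each value separately.
id_set.update(["u10001", "u10002"])
print(sorted(id_set))  # ['u10000', 'u10001', 'u10002']

# A string is also unpacked, one character at a time.
letters = set()
letters.update("ab")
print(sorted(letters))  # ['a', 'b']

# add() treats its argument as a single item; a list cannot be a set item.
try:
    id_set.add(["u10003"])
except TypeError as error:
    print("TypeError:", error)
```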

We can also remove items by using the __set.discard( )__ method.

In [27]:
first_set.discard('u10004')
first_set

{'u10000', 'u10001', 'u10002', 'u10003', 'u10005'}

_Note: you can also use set.remove( ) to achieve a similar effect. The key difference is that if an item is not in the set,_
__set.remove( )__ _will raise a KeyError, while_ __set.discard( )__ _will do nothing._
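The difference between the two methods can be sketched as follows, using an invented ID "u99999" that is not in the set:

```python
first_set = {"u10000", "u10001"}

# discard() silently does nothing when the item is missing.
first_set.discard("u99999")

# remove() raises KeyError when the item is missing.
try:
    first_set.remove("u99999")
except KeyError as error:
    print("KeyError:", error)

# Both methods delete an item that is present.
first_set.remove("u10001")
print(sorted(first_set))  # ['u10000']
```

A reasonable rule of thumb is to use remove( ) when a missing item signals a mistake you want to hear about, and discard( ) when it does not matter.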

_A full list of set methods can be found [here](https://docs.python.org/3/library/stdtypes.html#set)_

# AI-generated Practice (Done by myself)

In [28]:
# =========================
# EXERCISE: Other Data Structures (Tuples & Sets)
# =========================

# ---------- Part A: Tuples ----------
# 1) Create a tuple called first_tuple with 4 items:
# "a", 2, 5.5, False
# TODO
first_tuple = (  )

print("A1) first_tuple:", first_tuple)
print("A1) type:", type(first_tuple))

# 2) Indexing: print the SECOND item (index 1)
# TODO
print("A2) first_tuple[1]:", )

# 3) Try to change an item in the tuple (this should cause a TypeError)
#    Keep it commented out, and write a comment explaining why it fails.
# TODO (leave commented)
# first_tuple[3] = True  # TODO: explain why

# 4) Create a tuple called second_tuple where the FIRST item is a LIST of names:
# ["Michael", "Mark", "Wenjuan", "Katy"], then include "Cheese", 40000, True
# TODO
second_tuple = (  )

print("\nA4) second_tuple:", second_tuple)

# 5) Access the LAST name inside the list that is inside second_tuple
# TODO
print("A5) last name:", )

# 6) Remove the last name from the inner list using pop(-1), then print second_tuple again
# TODO
print("A6) after pop:", second_tuple)

# 7) "Change" an element in a tuple by recreating the whole tuple:
# Make a new second_tuple where the boolean values become: False, True (two booleans at the end)
# TODO
second_tuple = (  )
print("A7) recreated second_tuple:", second_tuple)

# ---------- Part B: Sets ----------
# 8) Create a set called first_set with these IDs: "u10000", "u10001", "u10002"
# TODO
first_set = {  }  # Note: left empty, { } creates a dict, not a set
print("\nB8) first_set:", first_set, type(first_set))

# 9) Explain (as a comment) why indexing like first_set[1] is not appropriate
# TODO comment here

# 10) Add one new item "u10003" using add(), then print
# TODO
print("B10) after add:", first_set)

# 11) Add multiple items ['u10004', 'u10005'] using update(), then print
# TODO
print("B11) after update:", first_set)

# 12) Discard 'u10004' using discard(), then print
# TODO
print("B12) after discard:", first_set)

# 13) Membership check: print whether 'u10005' is in the set
# TODO
print("B13) 'u10005' in first_set?", )


A1) first_tuple: ()
A1) type: <class 'tuple'>
A2) first_tuple[1]:

A4) second_tuple: ()
A5) last name:
A6) after pop: ()
A7) recreated second_tuple: ()

B8) first_set: {} <class 'dict'>
B10) after add: {}
B11) after update: {}
B12) after discard: {}
B13) 'u10005' in first_set?


In [29]:
# =========================
# REFERENCE SOLUTION
# =========================

# Part A: Tuples
first_tuple = ("a", 2, 5.5, False)
print("A1) first_tuple:", first_tuple)
print("A1) type:", type(first_tuple))

print("A2) first_tuple[1]:", first_tuple[1])

# first_tuple[3] = True  # Tuples are immutable, so assignment by index raises TypeError.

second_tuple = (["Michael", "Mark", "Wenjuan", "Katy"], "Cheese", 40000, True)
print("\nA4) second_tuple:", second_tuple)

print("A5) last name:", second_tuple[0][-1])

second_tuple[0].pop(-1)
print("A6) after pop:", second_tuple)

second_tuple = (["Michael", "Mark", "Wenjuan", "Katy"], "Cheese", 40000, False, True)
print("A7) recreated second_tuple:", second_tuple)

# Part B: Sets
first_set = {"u10000", "u10001", "u10002"}
print("\nB8) first_set:", first_set, type(first_set))

# Sets are unordered collections, so positions/indexes are not meaningful.

first_set.add("u10003")
print("B10) after add:", first_set)

first_set.update(['u10004', 'u10005'])
print("B11) after update:", first_set)

first_set.discard("u10004")
print("B12) after discard:", first_set)

print("B13) 'u10005' in first_set?", "u10005" in first_set)


A1) first_tuple: ('a', 2, 5.5, False)
A1) type: <class 'tuple'>
A2) first_tuple[1]: 2

A4) second_tuple: (['Michael', 'Mark', 'Wenjuan', 'Katy'], 'Cheese', 40000, True)
A5) last name: Katy
A6) after pop: (['Michael', 'Mark', 'Wenjuan'], 'Cheese', 40000, True)
A7) recreated second_tuple: (['Michael', 'Mark', 'Wenjuan', 'Katy'], 'Cheese', 40000, False, True)

B8) first_set: {'u10000', 'u10002', 'u10001'} <class 'set'>
B10) after add: {'u10003', 'u10000', 'u10002', 'u10001'}
B11) after update: {'u10003', 'u10000', 'u10004', 'u10005', 'u10002', 'u10001'}
B12) after discard: {'u10003', 'u10000', 'u10005', 'u10002', 'u10001'}
B13) 'u10005' in first_set? True
