<img src="assets/logo-public-bg-color-1024px.png" width=20% align=left>
<br>



# Python 101 - Week 3

### Instructor: <a href='marifdemirtas.github.io'>Mehmet Arif Demirtaş</a>

---

# Today's Plan

- More on Data Types
    - How they are implemented under the hood
- Representing **collections** of variables
    - Strings
    - Tuples
    - Lists
- A different approach to loops: **iteration**
- Collections with more complex structures
    - Sets
    - Dictionaries

---

# Structure of A Data Type

Last week, we looked at the fundamental data types in Python.

In [None]:
a = 5 # int
b = 5.1 # float
c = True # bool
d = None # NoneType
e = "abc" # str

We have also seen some **operators** (like '+') that work differently for different data types.

In [None]:
print(5 + 3) # addition
print('5' + '3') # concatenation

Therefore, we can say that a data type contains a **data** part, which includes the value we want to store, and some **functionality** that uses the data to perform a calculation. 

This combination of data and functionality is called an **object**. You can think of an object as a unit of code that bundles the instructions related to a real-world concept. (*We will have a deeper look at objects in Week 5.*)

Remember that our Python code is hiding the low-level computer actions for us. For example, when we use an integer, we actually store the value in a space in the physical memory of our computer. The integer datatype should keep a *reference* to the memory location where our value is stored. For example, the *pseudo*-code that implements the integer datatype can look like this:

In [None]:
Object int:
    name = a_number # variable name
    data = 0x010 # address
    
    functions:
        __add__: # add (+)
            ... # statements that will sum up two numbers
        __mul__:  # multiply (*)
            ... # statements that will multiply two numbers
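As a rough sketch of how this pseudo-code maps onto real Python: the `+` and `*` operators are backed by methods named `__add__` and `__mul__`, which can also be called directly. The variable name is illustrative, and the link between `id()` and a memory address is a detail of the CPython implementation rather than a guarantee of the language.

```python
a_number = 5

# The operator and the method call produce the same result.
print(a_number + 3)          # 8
print(a_number.__add__(3))   # 8
print(a_number.__mul__(3))   # 15

# Strings define the same methods with different behaviour.
print("5".__add__("3"))      # 53
print("ab".__mul__(2))       # abab

# id() identifies the object; in CPython it is based on the memory address.
print(id(a_number))
```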

Integer is a simple example because its data is *scalar*; that is, it stores a single value in memory. In fact, all the data types we have seen so far **except str** are scalar data types.

Strings are different, because they actually store a **collection** of values. Remember the first week on text representation. We have mentioned that each character can be represented with a number, and we can define a lookup chart that translates the numbers to the corresponding characters.
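The lookup between characters and numbers can be tried out with the built-in functions `ord` and `chr`. This is only a small sketch; the variable names are chosen for illustration.

```python
# ord() gives the number that represents a character.
code_of_a = ord("a")
code_of_b = ord("b")
print(code_of_a)   # 97
print(code_of_b)   # 98

# chr() goes the other way: from a number back to the character.
letter = chr(99)
print(letter)      # c
```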

This means that we now have to store more than one numerical value in the physical memory. A simple way to realize such a system would be to store a **start** address and an **end** address for our data. Since each number takes up one unit of space, we can start at the start address and read up to the end address to obtain all the characters.

In [None]:
Object str:
    name = a_string # variable name
    data_start = 0x010 # start address
    data_end = 0x030 # end address
    
    functions:
        __add__: # add (+)
            ... # statements that will concatenate strings
        __mul__:  # multiply (*)
            ... # statements that will concatenate a string with itself N times

The collection idea turns out to be a very strong concept, as we often need to represent concepts that are *part of the same whole*. Just like `int` or `str`, a collection will be a variable that we declare in our code. Now we will have a look at the collection data types in Python.

 

## Tuples

Tuples can be seen as a generalization of strings.

Think of a string as a collection of letters.

In [None]:
x = "abc"

In [None]:
y = 'a' + 'b' + 'c'

In [None]:
print(x)
print(y)
print(x == y)

Tuples are similar collections, but their elements can be of any type. Instead of quotes, we write them using parentheses.

In [None]:
tuple_1 = (1, 2, 3, 4)

In [None]:
tuple_1

Similar to strings, we can concatenate tuples to obtain longer tuples, like this:

In [None]:
tuple_2 = (5, 6, 7, 8)

In [None]:
tuple_1 + tuple_2

##### 💡 Quick Question

In [None]:
tuple_1 * 3

Unlike strings, tuples can contain elements of different types:

In [None]:
t3 = (1, 2, 'abc')

In [None]:
print(t3)
print(type(t3))

Another difference between tuples and strings is that tuples can also contain other tuples:

In [None]:
t4 = (4, 5, t3)
t5 = (4, 5, 1, 2, 'abc')

In [None]:
print(t4)
print(type(t4))

In [None]:
print(t5)
print(type(t5))

In [None]:
print(len(t4))
print(len(t5))

Similar to the empty string, we can have an empty tuple.

In [None]:
t6 = ()

In [None]:
print(type(t6))
print(len(t6))

Tuples can be created by calling the *constructor function*.

In [None]:
t7 = tuple()
print(t7)

In [None]:
t8 = tuple((1,2,3))
print(t8)

Tuples are ordered collections.

In [None]:
tuple((1,2)) == tuple((1,2))

In [None]:
tuple((1,2)) == tuple((2,1))

##### Singleton Tuples

In [None]:
t9 = tuple((1)) # TypeError: (1) is just the int 1, which is not iterable
print(t9)

In [None]:
t9 = (1) # parentheses alone do not make a tuple
print(t9)

In [None]:
type(t9)

In [None]:
t10 = (1,) # the trailing comma is what makes this a tuple
print(t10)

In [None]:
type(t10)

### Indexing

When we view strings as collections, one might ask how to access a single element of the collection directly. Python's answer to this is known as **indexing**.

In [None]:
collection_of_letters = "abcdef"
collection_of_nums = (1,2,3,4,5,6)

Each item in the collection is assigned an *index*: its position in the sequence. The first item is given index zero, and the following items are assigned increasing indices.

**To get an element at a given index, we use the subscript ([]) operator.**

In [None]:
collection_of_letters[0] # get first element

In [None]:
collection_of_letters[2] # get third element -> at index 2

In [None]:
collection_of_nums[1] #?

If we try to access an index outside the bounds of the collection, we get an error:

In [None]:
collection_of_nums[10] # IndexError: the tuple only has indices 0 to 5

**Negative values** can be used in indexing. The last element in a sequence is given the *negative index* of -1.

In [None]:
collection_of_nums[-1]

In [None]:
collection_of_nums[-len(collection_of_nums)]

In [None]:
collection_of_nums[-6]
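One way to read negative indices is that index `-k` refers to the same item as index `len(sequence) - k`. A small sketch of that relationship, using an illustrative tuple:

```python
nums = (10, 20, 30, 40, 50, 60)
length = len(nums)

# Each negative index has an equivalent non-negative index.
print(nums[-1] == nums[length - 1])   # True
print(nums[-2] == nums[length - 2])   # True
print(nums[-length] == nums[0])       # True
```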

<img src="assets/list_slicing.png" width=60%>

We can also get multiple elements from a collection by **specifying a slice of indices.** We use the following notation to denote a slice:

    collection[start:stop]
    
This slice takes the items at indices in the range [start, stop); that is, it includes the item at the `start` index but excludes the item at the `stop` index.

This behaviour is intended for the following property to work:

    collection[a:b] + collection[b:c] == collection[a:c]
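The property can be checked on a concrete string. The values of `a`, `b` and `c` below are arbitrary choices for illustration.

```python
letters = "abcdef"
a, b, c = 1, 3, 5

left = letters[a:b]    # items at indices 1 and 2
right = letters[b:c]   # items at indices 3 and 4

print(left)                            # bc
print(right)                           # de
print(left + right == letters[a:c])    # True
print(len(left) == b - a)              # True
```

Because the stop index is excluded, the two slices meet at `b` without overlapping or leaving a gap.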

In [None]:
collection_of_nums[0:3] # will get indices 0, 1, 2

In [None]:
collection_of_letters[1:5] # will get items starting from the second and up to (but not including) sixth 

Unlike indexing directly, going beyond the boundaries of the input does not produce an error.

In [None]:
collection_of_nums[0:100] # tries to return the slice, stops at the boundary

In [None]:
collection_of_nums[100:200] # if there are no items in the slice, returns empty sequence

Slicing also supports an additional **step** value. If a step is provided, the slice takes every *n*th element in the given range.

    collection[start:stop:step]
    
This will take the items at indices `start`, `start + step`, `start + 2*step`, `start + 3*step`, and so on, stopping before it reaches index `stop`.

In [None]:
collection_of_nums

In [None]:
collection_of_nums[0:6:2]

In [None]:
collection_of_nums[1:6:2]

In [None]:
collection_of_nums[1:6:3]

In [None]:
collection_of_nums[1:6:10]
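The step rule can be reproduced by hand with a while loop. This is only a sketch for a positive step with `start` and `stop` inside the bounds of the tuple; the names are illustrative.

```python
nums = (1, 2, 3, 4, 5, 6)
start, stop, step = 1, 6, 2

picked = ()
index = start
while index < stop:              # stop itself is never visited
    picked += (nums[index],)     # add the item at the current index
    index += step                # jump ahead by step

print(picked)                            # (2, 4, 6)
print(picked == nums[start:stop:step])   # True
```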

You can omit the slicing values in this notation.

In [None]:
collection_of_nums[:3] # omit start -> starts from zero

In [None]:
collection_of_nums[3:] # omit stop -> continues until end

In [None]:
collection_of_nums[:] # omit start AND stop -> copy all elements

In [None]:
collection_of_nums[:3:2] # omit start, provide stop AND step

In [None]:
collection_of_nums[2::2] # omit stop, provide start AND step

Negative values can also be used in slicing:

In [None]:
collection_of_nums[-1:-3] # empty result: with a positive step, start must come before stop

In [None]:
collection_of_nums[-3:-1]

In [None]:
collection_of_nums[-3:] # last three elements

In [None]:
collection_of_nums[0:-2] # up to last two elements

In [None]:
collection_of_nums[::-1] # giving a negative step reverses the sequence

In [None]:
collection_of_nums[::-2]

In [None]:
collection_of_nums[-3::-1]

## Range

A range is a sequence type that behaves much like a tuple of numbers, but it generates its numbers automatically. You can provide start, stop and step values to obtain a sequence of numbers.
    
    range(start, stop, step)
    
Unlike tuples, *ranges are not immediately evaluated*. We can convert a range to a tuple to evaluate it directly.

In [None]:
r1 = range(0, 10, 1)

In [None]:
r1_tuple = tuple(range(0, 10, 1))

In [None]:
print(r1)
print(r1_tuple)

In [None]:
r1[0] == r1_tuple[0]

In [None]:
r1[4] == r1_tuple[4]

In [None]:
r1[4]

In [None]:
r1[4:8]

In [None]:
r1[4:8:2]

In [None]:
tuple(r1[4:8:2])
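Because a range is not evaluated immediately, even a very large range is cheap to create. The sketch below uses an arbitrarily large stop value to illustrate this; only the start, stop and step are stored until individual items are requested.

```python
big_range = range(0, 10**12)   # created instantly; the numbers are not stored

print(len(big_range))          # 1000000000000
print(big_range[-1])           # 999999999999
print(500 in big_range)        # True

# Slicing a range gives another range, which is still not evaluated.
first_five = big_range[:5]
print(first_five)              # range(0, 5)
print(tuple(first_five))       # (0, 1, 2, 3, 4)
```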

## Lists

Lists are another data type for representing collections. Their usage is similar to tuples, but instead of parentheses, we use square brackets to create them.

In [None]:
l1 = [1, 2, 3, 'four']
print(l1)
print(type(l1))

In [None]:
l2 = list((1, 2))
print(l2)

In [None]:
test_tuple = (1,2,3)
from_tuple = list(test_tuple)
back_from_list = tuple(from_tuple)
print(test_tuple)
print(from_tuple)
print(back_from_list)

In [None]:
test_tuple == back_from_list

In [None]:
from_tuple == test_tuple

In [None]:
from_tuple_2 = list(test_tuple)
print(from_tuple_2)

In [None]:
from_tuple == from_tuple_2

Slicing can be used on lists.

In [None]:
test_list = ['x', 3, 'abc', 11]

In [None]:
test_list[0]

In [None]:
test_list[1:3]

In [None]:
print(type(test_list[1:3]))

In [None]:
test_list[1:3][0]

In [None]:
['x', 3, 'abc', 11][1:3][0]

### Element Check

In [None]:
'x' in test_list

In [None]:
4 in test_list

# Mutability vs Assignment

Lists are different from the other data types we have seen so far because they are **mutable**, whereas the others are immutable.

Mutable objects can be modified (*mutated*) directly. Immutable objects cannot be modified; we can only produce new objects and assign them to the same names.

In [None]:
x = 5   # x is assigned 5
print(x, id(x))
x += 1  # x is assigned 6
print(x, id(x)) # different object

In [None]:
x = [5, 10] # x is assigned [5, 10]
print(x, id(x))
x[0] += 1 # the first element of object x is incremented to 6 
print(x, id(x)) # same object
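One consequence of mutability is worth sketching: assigning a list to a second name does not copy it, so a change made through one name is visible through the other. The names below are illustrative; slicing with `[:]` is used to make a separate copy.

```python
original = [1, 2, 3]
alias = original          # a second name for the same list object
separate = original[:]    # a new list with the same elements

alias.append(4)           # mutates the shared object

print(original)           # [1, 2, 3, 4]
print(separate)           # [1, 2, 3]
print(alias is original)      # True
print(separate is original)   # False
```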

The other collections we have looked at so far do not support mutating operations. For example:

In [None]:
x = (5, 10)
x[0] += 1 # TypeError: tuples do not support item assignment

In [None]:
x = "abcdef"
x[0] = "z" # TypeError: strings do not support item assignment

Since lists are mutable, they support item assignment, as well as appending new elements or removing existing elements with built-in methods.

In [None]:
list_of_nums = [4, 8, 15, 16, 23, 42]
print(list_of_nums)

In [None]:
list_of_nums[3] *= 3
print(list_of_nums)

In [None]:
list_of_nums[0:2] = [2, 7]
print(list_of_nums)

In [None]:
list_of_nums.append(100) # add given value to the end of the list
print(list_of_nums)

In [None]:
list_of_nums.remove(23) # find and remove the given value
print(list_of_nums)

In [None]:
list_of_nums.remove(101) # ValueError: 101 is not in the list

In [None]:
item = list_of_nums.pop(0) # remove and return the item at the given index
print(item)
print(list_of_nums)

This table shows some of the most common list methods for future reference.

<img src="assets/list_methods.png" width=60%>
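A few more list methods, sketched on an illustrative list. The selection here is an assumption about what is useful to see next, not a reproduction of the table above.

```python
numbers = [4, 8, 15]

numbers.insert(1, 100)      # insert 100 before index 1 -> [4, 100, 8, 15]
numbers.extend([16, 23])    # append every item of another list
print(numbers)              # [4, 100, 8, 15, 16, 23]

print(numbers.index(15))    # 3  (position of the first 15)
print(numbers.count(8))     # 1  (how many times 8 appears)

numbers.sort()              # sort in place, ascending
print(numbers)              # [4, 8, 15, 16, 23, 100]

numbers.reverse()           # reverse in place
print(numbers)              # [100, 23, 16, 15, 8, 4]
```

Note that `sort` and `reverse` change the list itself and return `None`, which is another effect of lists being mutable.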

# Iteration

Last week, we saw how a while loop works. The general structure of the loop looked like this:

In [None]:
while condition is True:
    do_operation()

A common variant of this was used to **iterate** over a series of numbers or elements:

In [None]:
target_value = 10
index_variable = 0
while index_variable < target_value: # runs for index_variable in [0, 10)
    do_operation()
    index_variable += 1

With the collection data types we have seen so far, we can adapt this iteration to go over elements of a collection.

In [None]:
list_of_values = ['a', 'b', 'c']
target_value = len(list_of_values)
index_variable = 0
while index_variable < target_value: # runs for index_variable in [0, 3)
    print(list_of_values[index_variable])
    index_variable += 1

Python offers us a much cleaner loop structure for iterating over a sequence of elements, known as the **for loop**. A for loop represents this:

    for item in sequence:
        take the item and do operations
        continue with next item

Think of `item` here as a variable; you can use any name you want.

`sequence` should be an *iterable*: an object that can hand out its elements one at a time, e.g. str, list, tuple, range...

In each iteration of the loop, the value of the next item is assigned to `item`.

In [None]:
for item in list_of_values:
    #print("Iteration start")
    print(item)
    #print("Iteration end")

In [None]:
for char in "Python3":
    print(char)

In [None]:
sum_result = 0
for i in range(0, 10, 1):
    print(i)
    sum_result += i
print(sum_result)

### Two useful functions: enumerate and zip

By default, a for loop only assigns the value of the item to the loop variable. `enumerate` allows us to get both the index and the item.

In [None]:
iterable = "abcdef"
for index_and_item in enumerate(iterable):
    print(index_and_item)
print(type(index_and_item))

In [None]:
for (index, item) in enumerate(iterable):
    print(index)
    print(item)
    print("---")

`zip` can be seen as a generalization of `enumerate`: it allows us to pair up two (or more) iterables and take the next item from each of them in every iteration.

In [None]:
iterable1 = "abcdef"
iterable2 = "ABCDEF"
for lower, upper in zip(iterable1, iterable2):
    print(lower + upper + lower)
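To make the relationship between the two functions concrete, here is a small sketch showing that `enumerate` gives the same pairs as zipping a range of indices with the iterable. It also shows that `zip` stops as soon as the shorter iterable runs out. The variable names are illustrative.

```python
letters = "abc"

pairs_from_enumerate = list(enumerate(letters))
pairs_from_zip = list(zip(range(len(letters)), letters))

print(pairs_from_enumerate)                     # [(0, 'a'), (1, 'b'), (2, 'c')]
print(pairs_from_enumerate == pairs_from_zip)   # True

# zip stops at the end of the shorter iterable.
shortest = list(zip("abcdef", (1, 2)))
print(shortest)                                 # [('a', 1), ('b', 2)]
```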

## Comprehension

Just as conditionals have both a statement form and an expression form, for loops can be written as expressions, in what we call **list comprehensions**.

In [None]:
[i for i in range(0, 10, 2)]

In [None]:
[i**2 for i in range(5)]

In [None]:
[i * char for i, char in enumerate("abcdef")]

In [None]:
print([i for i in range(3, 6) for j in range(10, 13)])
print([j for i in range(3, 6) for j in range(10, 13)])
print([i*j for i in range(3, 6) for j in range(10, 13)])
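A comprehension with two `for` parts behaves like two nested loops, with the first `for` as the outer loop. The sketch below builds the same list both ways. It also shows an optional `if` filter at the end of a comprehension, which does not appear in the cells above.

```python
# Nested loops that append to a list...
products = []
for i in range(3, 6):
    for j in range(10, 13):
        products.append(i * j)

# ...give the same result as the comprehension.
print(products)   # [30, 33, 36, 40, 44, 48, 50, 55, 60]
print(products == [i * j for i in range(3, 6) for j in range(10, 13)])   # True

# An optional condition keeps only the items that pass the test.
evens = [i for i in range(10) if i % 2 == 0]
print(evens)      # [0, 2, 4, 6, 8]
```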

A similar syntax can be passed to the `tuple` constructor to build a tuple (strictly speaking, this is a *generator expression*):

In [None]:
tuple(True if i % 2 == 0 else False for i in range(10))

### Summary of Collections

<img src="assets/seq_sum.png" width=60%>

<img src="assets/seq_methods.png" width=60%>

# Hash Functions, Sets and Dictionaries

Some collections have a more complex structure to speed up certain operations (inserts, lookups or removals). One of the fundamental operations used for this is **hashing**.

In hashing, a hash function (a mathematical function that returns a number for each input data) is used to obtain hashes for elements, and these hashes can be used for quick comparison. The nature of the hash function may change based on the requirements.
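Python exposes its hash function as the built-in `hash`. The sketch below shows two properties that matter here: equal values have equal hashes, and mutable objects such as lists cannot be hashed. The exact hash of a string may differ between runs, so it is not printed. The `try`/`except` statement is used only to catch the error and has not been covered in this course yet.

```python
# Equal values always have equal hashes.
print(hash(5) == hash(5.0))            # True, because 5 == 5.0
print(hash((1, 2)) == hash((1, 2)))    # True

# Strings are hashable, but the number may change from run to run.
text_hash = hash("abc")
print(type(text_hash))                 # <class 'int'>

# Mutable objects such as lists are not hashable.
try:
    hash([1, 2])
    list_is_hashable = True
except TypeError:
    list_is_hashable = False
print(list_is_hashable)                # False
```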

Python has two main data structures that use hashes as their core mechanism. The first one is the **set**.

In [None]:
set_x = {1, 2, 3, 4, 5}

In [None]:
set_x

In [None]:
2 in set_x

In [None]:
10 in set_x

In [None]:
set_x.add(6)

In [None]:
set_x

In [None]:
set_x.add(3)

In [None]:
set_x

In [None]:
set_x.remove(3)

In [None]:
set_x

In [None]:
set_x[0] # TypeError: sets do not support indexing

In [None]:
for item in set_x:
    print(item)

In [None]:
list_with_duplicates = [1, 1, 1, 2, 2, 3, 5, 8, 8]

In [None]:
set(list_with_duplicates)

Sets are useful, but their elements cannot be accessed by index. An alternative data structure uses the same hashing mechanism to offer us a useful lookup object.

#### Dictionaries
The dictionary type (`dict`) is a mapping between two collections of items. The first collection is a **set** of **keys**, and the second collection is a **list** of **values**. We can use a key to get the associated value.

In [None]:
dict_x = {'a': 1, 'b': 2}

In [None]:
dict_x

In [None]:
dict_x['a']

In [None]:
dict_x['a'] = [1, 2, 3]

In [None]:
dict_x

In [None]:
dict_x['c'] = "hello"

In [None]:
dict_x

In [None]:
dict_x['d'] # KeyError: 'd' is not a key in the dictionary
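Looking up a key that is not in a dictionary raises a `KeyError`. Two common ways to avoid this are sketched below: checking with `in` first, or using the `get` method, which returns `None` (or a default of your choice) for a missing key. The dictionary here is hypothetical example data.

```python
prices = {"apple": 3, "pear": 5}   # hypothetical example data

# `in` checks whether a key exists.
print("apple" in prices)           # True
print("kiwi" in prices)            # False

# get() returns the value, or a fallback when the key is missing.
print(prices.get("apple"))         # 3
print(prices.get("kiwi"))          # None
print(prices.get("kiwi", 0))       # 0
```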

In [None]:
dict_x.keys(), dict_x.values()

In [None]:
for item in dict_x:
    print(item)

In [None]:
for item in dict_x:
    print(item, dict_x[item])

In [None]:
for value in dict_x.values():
    print(value)

In [None]:
for key, value in dict_x.items():
    print(f"{key}:{value}")

---

# Next Week

- Functions
- File input and output

# References

https://docs.python.org/3/library/stdtypes.html