![Alt text](https://swps.z36.web.core.windows.net/SWPS-baner-eng-slim.jpg)

# Lecture 4: Simple and Complex Data Types


During the lecture we will discuss data types:
- simple:
  - logical
  - character
  - binary
  - numeric
  - date and time
- complex:
  - lists
  - dictionaries
  - tuples
  - sets
  - frozensets

We will also discuss operations on data types (if they have not been discussed earlier).

## Simple Data Types

### Logical Data

Logical data types were discussed in the second lecture in the context of Boolean algebra, hence the name Boolean types. As a reminder, they take the values True and False, which can be assigned to a variable directly or produced by evaluating an expression such as a comparison.

In [None]:
a = True
b = 2 > 10 

print(b)

### Strings

Strings were largely covered in the third lecture.

In [None]:
a = "a"
b = "ssnscc"

print(b[1:3])
print(b[1:-1])

### Binary data

There are two binary data types: bytes, which is immutable, and bytearray, which is mutable. Both are sequences of integers from 0 to 255.

Binary data contains information encoded according to a given encoding to avoid problems when passing it between different applications and operating systems. See the code below:

In [None]:
text = "Różnorodność, Hüseyin" # "Różnorodność" means "diversity" in Polish
str_bytes = text.encode() # UTF-8 by default; the name text avoids shadowing the built-in str

print(str_bytes)

print(type(str_bytes))

print(str_bytes.decode())


As you can see, the encoded bytes are hard for humans to read (non-ASCII characters appear as escape sequences such as \xc3\xb3), but a program can decode them back into the original text.
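
The difference between bytes and bytearray can be sketched as follows. This is a minimal illustration; the variable names data and buffer are chosen here and do not come from the lecture.

```python
# bytes is immutable; bytearray is its mutable counterpart.
data = "Hüseyin".encode("utf-8")   # bytes object
print(len("Hüseyin"), len(data))   # 7 characters, 8 bytes: "ü" takes two bytes in UTF-8
print(data[0])                     # indexing yields an integer from 0 to 255

try:
    data[0] = 104                  # bytes does not support item assignment
except TypeError as err:
    print("bytes is immutable:", err)

buffer = bytearray(data)           # mutable copy of the same bytes
buffer[0] = 104                    # 104 is the code of lowercase "h"
print(buffer.decode("utf-8"))      # hüseyin
```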

### Numeric Data

Numeric types were largely covered in earlier lectures. There are three basic types, of which complex numbers were not covered:
- integer
- floating point
- complex

In [None]:
a = 5.9

b = 1.4

print(type(a))
print(type(b))
print(int(a) + int(b))
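
Since complex numbers were not covered earlier, here is a short sketch of the built-in complex type. The values are arbitrary and chosen only for illustration.

```python
z = 3 + 4j              # complex literal: real part 3, imaginary part 4
print(type(z))          # <class 'complex'>
print(z.real, z.imag)   # 3.0 4.0
print(abs(z))           # 5.0, the modulus
print(z.conjugate())    # (3-4j)
print(z * (1 - 2j))     # (11-2j)
```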

### Dates and time

Dates and times are not basic, i.e. built-in data types, but they are so commonly used that it is worth discussing them.

The first type is a date in the format year-month-day. We can retrieve, for example, the current date. This requires the datetime module from the standard library.

In [None]:
from datetime import date

today = date.today()
print("Today is", today)
print(type(today))

The date can also be formatted, for example:

In [None]:
print(today.strftime("%m/%d/%Y"))
print(today.strftime("%A, %b %d, %Y"))
print(today.strftime("%A, %B %d, %Y"))

A link to an article about formatting: https://www.w3schools.com/python/python_datetime.asp
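
Formatting also works in the other direction: strptime parses text into a datetime using the same format codes. A minimal sketch, with a made-up date string:

```python
from datetime import datetime

# strptime parses text into a datetime using the same format codes as strftime.
parsed = datetime.strptime("03/15/2024", "%m/%d/%Y")
print(parsed.date())                    # 2024-03-15
print(parsed.strftime("%d.%m.%Y"))      # 15.03.2024
```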

Probably one of the most commonly used data types in computing is the timestamp, which combines a date and a time. Below is an example of how to obtain the current one:

In [None]:
import datetime

dt = datetime.datetime.now()

print(dt)
print(type(dt))

# compare with this date type:
print(type(today))

Timestamps are useful for all kinds of calculations, e.g. measuring elapsed time. To do this, subtract one timestamp from another; the result is a timedelta object:

In [None]:
import datetime, time

dt1 = datetime.datetime.now()
# print(dt1)

time.sleep(2)

dt2 = datetime.datetime.now()
print(dt2)
print(dt1)

print(dt2 - dt1)

print(type(dt1))
print(type(dt2 - dt1))
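
The result of the subtraction can be used in further calculations. The sketch below uses fixed, made-up timestamps so that the output is reproducible:

```python
from datetime import datetime, timedelta

start = datetime(2024, 1, 1, 9, 0, 0)      # fixed, made-up timestamps
end = datetime(2024, 1, 2, 10, 30, 0)

elapsed = end - start                      # a timedelta object
print(elapsed)                             # 1 day, 1:30:00
print(elapsed.total_seconds())             # 91800.0
print(start + timedelta(days=7))           # 2024-01-08 09:00:00
```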

## Complex Data Types

### List

A list is a basic data type in many programming languages and consists of other data separated by commas. Basic information about a list:
- A sequence of values of any type
- The order is important
- The size is dynamic

Initializing a list:
- An empty list is created by assigning [] or list() to a variable
- You can create a list with data - by passing the data into []
- As a result of other operations, e.g. splitting a string using split()

In [None]:
lst = []
print(lst)

lst = list()
print(lst)

lst = [1, 2, "three", True]
print(lst)

lst = "Hello World   !".split()

print(lst)

Basic list operations:
- append: adds to the end of the list
- insert: places an element before the given index
- pop and del: remove an element with the given index
- remove: removes an element with the given value
- count: counts the number of elements with the given value
- reverse: reverses the elements of the list
- index: returns the index of the given value
- clear: removes everything

In [None]:
lst = [1, 2, "three", True]

lst.append(5)
print(lst)

print(lst.index(True, 1))  # search from index 1; prints 3
print(lst)
print(lst.count(True))     # prints 2, because 1 == True in Python

lst.clear()
print(lst)
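
The remaining operations from the list above (insert, pop, del, remove, and reverse) can be sketched in the same way:

```python
lst = [1, 2, "three", True]

lst.insert(1, "new")     # place "new" before index 1
print(lst)               # [1, 'new', 2, 'three', True]

last = lst.pop()         # remove and return the last element
print(last, lst)         # True [1, 'new', 2, 'three']

del lst[0]               # remove the element at index 0
print(lst)               # ['new', 2, 'three']

lst.remove("three")      # remove the first element equal to "three"
print(lst)               # ['new', 2]

lst.reverse()            # reverse the list in place
print(lst)               # [2, 'new']
```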

Lists can be joined:

In [None]:
lst_a = [12, 13, 14]
lst_b = [14, 15, 16]
print(lst_a + lst_b)

It is very helpful to check whether a value is in a list using the "in" operator:

In [None]:
lst = [1, 2, "three", True, 1]

if 1 in lst:
    print("value is on the list")

In [None]:
lst = [1, 2, "three", True, 1]

key = 1

if key in lst:
    print("value is on the list")

Using a list in a for loop visits its elements one by one, in order (we'll learn more about the for control statement in the next lecture):

In [None]:
lst = [1, 2, "three", True]

for elem in lst:
    print(elem)

# the same
for i in lst:
    print(i)


### Dictionary

A dictionary is a data type for storing a collection of key-value pairs. Keys are unique and are used to look up values. A value can be of any type, while a key must be hashable (for example a string or a number).

Similar to a list, a dictionary can be initialized by assigning the value {} or dict() to a variable or by providing data:

In [None]:
dct = {}
print(dct)

dct = dict()
print(dct)

dct = {'a': 5, 'b': 7}
print(dct)

dct = {'a': 5, 'b': 7, 'a': 6}
print(dct)

Access to the dictionary is done by specifying the key or via the get() function:

In [None]:
print(dct['a'])
print(dct.get('a'))

The get() function can return a default value if no key is available. Test both methods for key 'c'. In the case of get, set the default return value to None:

In [None]:
print(dct['c'])

print("Element C found because no error") # never reached: the line above raises KeyError

In [None]:
print(dct.get('c', None))

print("Element C found because no error") # not really: get() returned the default value None

We will learn about exception handling later, but the absence of an element with a given key can be handled without using the get() function:

In [None]:
try:
    print(dct['c'])
except KeyError:
    print("Element C not found. Showing None")
    print(None)

Adding an element can be done by assigning a value to a new key:

In [None]:
dct = {'a': 5, 'b': 7}
dct['c'] = 9
print(dct)

Or using the update() function:

In [None]:
dct = {'a': 5, 'b': 7}
dct.update({'c': 9})
print(dct)

It is worth adding that in the case of an existing key, its value will be overwritten:

In [None]:
print(dct)
dct.update({'c': 11})
print(dct)

A dictionary is a data type with a structure similar to JSON (JavaScript Object Notation). You can often see it when viewing sources of websites and commands in the browser console. JSON is convertible to a dictionary or list, and conversion in the other direction is also possible.
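
The conversion in both directions can be tried offline with the json module. This is a minimal sketch; the JSON text and the field names are made up for illustration and do not come from any real service.

```python
import json

# Made-up JSON text, used only for illustration.
text = '[{"code": "USD", "bid": 3.95, "ask": 4.03}, {"code": "EUR", "bid": 4.28, "ask": 4.36}]'

rates = json.loads(text)            # JSON array -> list of dictionaries
print(type(rates), type(rates[0]))
print(rates[1]["code"])             # EUR

back = json.dumps(rates[0])         # dictionary -> JSON text
print(back)
print(type(back))                   # <class 'str'>
```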

Open the page in your browser: http://api.nbp.pl/api/exchangerates/tables/C/?format=json and then follow the code below:

In [None]:
import requests, json

req = requests.get('http://api.nbp.pl/api/exchangerates/tables/C/?format=json')

req_json = req.text
print(req_json)
print(type(req_json))

req_lst = json.loads(req_json)
print(req_lst)
print(type(req_lst))

In [None]:
for item in req_lst:
    print(type(item))
    print(item)

There is only one element in the list, and it is a dictionary. Iterating over a dictionary yields its keys:

In [None]:
for item in req_lst[0]:
    print(type(item))
    print(item)

Take a look at the following code snippet and try to diagnose the problem:

In [None]:
dct = {'a': 5, 'b': 7}
new_dct = dct

dct.update({'c': 9})

print(dct)
print(new_dct)

In this case, the new_dct variable is assigned a reference to the same dictionary as dct, not a copy of it, so a change made through one name is visible through the other. To make a copy, use the copy() method:

In [None]:
dct = {'a': 5, 'b': 7}
new_dct = dct.copy()

dct.update({'c': 9})

print(dct)
print(new_dct)
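
Note that copy() makes a shallow copy: nested mutable values are still shared between the two dictionaries. A minimal sketch of the difference, using copy.deepcopy from the standard library (the variable names shallow and deep are chosen here for illustration):

```python
import copy

dct = {'a': [1, 2], 'b': 7}
shallow = dct.copy()           # new dictionary, but the inner list is shared
deep = copy.deepcopy(dct)      # new dictionary and a new inner list

dct['a'].append(3)             # modify the nested list in place

print(dct)       # {'a': [1, 2, 3], 'b': 7}
print(shallow)   # {'a': [1, 2, 3], 'b': 7}
print(deep)      # {'a': [1, 2], 'b': 7}
```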

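Compare this with how a variable holding an immutable value, such as a number, behaves:

In [None]:
a = 5
b = a # b refers to the same value; integers are immutable
print(b)
a = 7 # rebinds a to a new value; b is unaffected
print(b)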

### Tuple

A tuple is a data type similar to a list, but it is immutable, meaning you can't modify, add, or remove elements. As with a list or dictionary, it can contain elements of any type. We create it by providing data or by converting a list:

In [None]:
my_tup = (1, 2, 3)
print(my_tup)

my_tup = tuple([1, 2, 3, "abc", 3, 3])
print(my_tup)

Basic operations on a tuple:
- count: counts the occurrence of a given value
- index: returns the index of the first occurrence of a given value (as a reminder - in Python we count from zero)

In [None]:
print(my_tup.count(3))
print(my_tup.index(3))

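Tuples are used when authors want the data not to be changed, e.g. tuples are used to pass data from a database. In practice, the restriction is easy to work around: convert the tuple to a list, modify the list, and convert it back. Note that this creates a new tuple and rebinds the variable; the original tuple object is never modified: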

In [None]:
my_tup = tuple([1, 2, 3, "abc"])
print(my_tup)

my_tup = list(my_tup)
print(my_tup)

my_tup.insert(1, "nx")
print(my_tup)

my_tup = tuple(my_tup)
print(my_tup)
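
The sketch below shows both sides of this: a direct modification fails, while the conversion builds a new tuple and leaves the old one untouched. The names original and alias are chosen here for illustration.

```python
original = (1, 2, 3, "abc")
alias = original                 # second name for the same tuple

try:
    original[1] = "nx"           # item assignment is not supported
except TypeError as err:
    print("Cannot modify a tuple:", err)

# Converting to a list and back builds a new tuple and rebinds the name.
as_list = list(original)
as_list.insert(1, "nx")
original = tuple(as_list)

print(original)   # (1, 'nx', 2, 3, 'abc')
print(alias)      # (1, 2, 3, 'abc')
```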

### Set

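A set is an unordered, non-indexable collection of unique values. Its elements can be of any hashable type. Because there are no indexes, elements are typically accessed by iterating with a for loop or tested with the in operator. It is possible to add and remove elements. In addition, sets can be compared and combined.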

In [None]:
st = set()
print(st)

st.add(1)
st.add(1)
st.add(1)
st.add(4)
print(st)
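
Comparing and combining sets can be sketched with the standard set operators. The values are arbitrary; the display order of set elements is not guaranteed.

```python
a = {1, 2, 3, 4}
b = {3, 4, 5}

union = a | b            # elements found in either set
common = a & b           # elements found in both sets
only_a = a - b           # elements of a that are not in b
either = a ^ b           # elements found in exactly one of the sets

print(union, common, only_a, either)
print({3, 4} <= a)       # subset test: True
print(a == {4, 3, 2, 1}) # True: order does not matter when comparing sets
```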

A set is used in situations where we care about the uniqueness of values in a collection, e.g. we process text and want to find all unique words, or we analyze the system configuration and look for IP addresses of servers. An IP address consists of four numbers with values from 0 to 255, separated by dots. An IP address can be detected using regular expressions, discussed in the context of the standard library. Each address found can be added to the set.
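
A minimal sketch of the IP address scenario, assuming a made-up configuration text and a deliberately simple regular expression (real configuration files may need a stricter pattern):

```python
import re

# Made-up configuration text, used only for illustration.
config = """
server_a = 192.168.0.10
server_b = 10.0.0.7
backup   = 192.168.0.10
broken   = 300.1.2.3
"""

# Four groups of one to three digits separated by dots.
candidates = re.findall(r"\b(?:\d{1,3}\.){3}\d{1,3}\b", config)

ip_set = set()
for candidate in candidates:
    # Keep only addresses whose four numbers are all in the range 0-255.
    if all(0 <= int(part) <= 255 for part in candidate.split(".")):
        ip_set.add(candidate)

print(sorted(ip_set))   # ['10.0.0.7', '192.168.0.10']
```

The duplicate address is stored once, and the candidate with the number 300 is rejected by the range check.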

Another example is creating a list of emails. Please look at the example code below:

In [None]:
email_set = set()

email1 = "ab@domain.com"
email_set.add(email1)

email2 = "AB@domain.com"
email_set.add(email2)

print(email_set)

The two addresses differ only in letter case, yet the set stores both. To avoid this, emails can be normalized before they are added:

In [None]:
email_set = set()

email1 = "ab@domain.com"
email_set.add(email1.lower())

email2 = "AB@domain.com"
email_set.add(email2.lower())

print(email_set)

### Frozenset

Frozen sets have been a built-in type since Python 2.4. They combine the features of a set (uniqueness) and a tuple (immutability).
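
A minimal sketch of both features. The city names and the distance value are made up for illustration.

```python
fs = frozenset([1, 2, 2, 3])
print(len(fs))                   # 3, duplicates are dropped
print(2 in fs)                   # True

try:
    fs.add(4)                    # frozenset has no add() method
except AttributeError as err:
    print("frozenset is immutable:", err)

# Being hashable, a frozenset can serve as a dictionary key or as a set element.
distances = {frozenset(["Warsaw", "Krakow"]): 295}   # made-up value
print(distances[frozenset(["Krakow", "Warsaw"])])    # 295, element order does not matter
```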