# 🐍 Python Basics 🐍


Notebook adapted from [this notebook](https://github.com/mromanello/ADA-DHOxSS/blob/master/notebooks/1.1%20Skills%20Python.ipynb) by Matteo Romanello.

## Data types

### Numerical

The two numerical data types you will use most often in Python are:
- `int`, e.g. `10`
- `float`, e.g. `10.12`

In [1]:
i = 10

In [2]:
type(i)

int

In [3]:
f = 10.12

In [4]:
type(f)

float

Python has two division operators, `/` (true division) and `//` (floor division), which produce different results:

In [5]:
i // 3 == i / 3

False

In [6]:
i // 3

3

In [7]:
i / 3

3.3333333333333335

In [8]:
type(i // 3)

int

In [9]:
type(i / 3)

float
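The difference between the two operators is easiest to see with negative numbers and floats. The following is a minimal sketch; the variable names are illustrative only.

```python
# Floor division rounds down (towards negative infinity); it does not truncate
a = 10 // 3        # 3
b = -10 // 3       # -4, not -3

# `divmod` returns the floor quotient and the remainder in one call
quotient, remainder = divmod(10, 3)   # same as (10 // 3, 10 % 3)

# `//` returns a float as soon as one operand is a float
c = 10.0 // 3      # 3.0

print(a, b, quotient, remainder, c)
```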

### Strings

In [10]:
mystring = "A string of text"

In [11]:
type(mystring)

str

Strings in Python are immutable **sequences** of characters, thus they can be iterated over like any other *iterable*, and indexed or sliced like a list.

In [12]:
# we can iterate through the characters
# of a string

for char in mystring:
    print(char)

A
 
s
t
r
i
n
g
 
o
f
 
t
e
x
t


In [13]:
# slicing by means of indices works as expected

mystring[2:]

'string of text'

In [14]:
mystring[-1]

't'
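Indexing and slicing read from a string, but a string cannot be changed in place. The sketch below illustrates this with the same example text; the names `first_char`, `last_word`, `chars` and `modified` are made up for the illustration.

```python
mystring = "A string of text"

# Indexing and slicing return new strings
first_char = mystring[:1]      # 'A'
last_word = mystring[-4:]      # 'text'

# Unlike a list, a string does not support item assignment
try:
    mystring[0] = "a"
except TypeError as error:
    message = str(error)

# Converting to a list gives a mutable copy of the characters
chars = list(mystring)
chars[0] = "a"
modified = "".join(chars)      # 'a string of text'
```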

#### Concatenation

In [15]:
newstring = "This is " + mystring.lower()

In [16]:
newstring

'This is a string of text'

A very handy feature, introduced in Python 3.6, is f-strings:
- they are declared by prepending the character `f` to the quote signs containing the text
- they use curly brackets `{variable_name}` to specify the position in a string where the content of an existing variable should be inserted.

In [17]:
f'## {mystring} ##'

'## A string of text ##'

The curly brackets can contain almost *any* Python expression (but not statements such as variable assignment); the expression is evaluated and its result is interpolated into the string template.

In [18]:
f'The length of `mystring` is {len(mystring)} characters.'

'The length of `mystring` is 16 characters.'
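An expression inside the curly brackets can also be followed by a colon and a format specification, which controls how the value is rendered. A few common cases, as a sketch with made-up values:

```python
value = 3.14159
count = 42

rounded = f'{value:.2f}'        # two decimal places
padded = f'{count:05d}'         # zero-padded to a width of 5
aligned = f'{"left":<10}|'      # left-aligned in 10 characters
debug = f'{count=}'             # Python 3.8+: prints the name and the value

print(rounded, padded, aligned, debug)
```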

**Q**: Can you explain what's going on in the cell below?

In [19]:
s = "repetita iuvant"
print(f'{", ".join([s for i in range(0, 10)])}')

repetita iuvant, repetita iuvant, repetita iuvant, repetita iuvant, repetita iuvant, repetita iuvant, repetita iuvant, repetita iuvant, repetita iuvant, repetita iuvant


Can you rewrite the cell above in an alternative way?

#### Transformation

In [20]:
mystring.lower()

'a string of text'

In [21]:
mystring.upper()

'A STRING OF TEXT'

In [22]:
mystring.replace("string", "list").replace("text", "characters")

'A list of characters'

### Date and time

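**NB**: these data types have limits when working with historical data. `datetime` itself only covers the years 1 to 9999, and tools built on timestamps can be narrower still: pandas' default nanosecond timestamps, for example, cannot represent dates before 1677.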
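The bounds of the standard library types can be checked directly. The sketch below uses only the `datetime` module; the example date and the variable names are arbitrary choices for the illustration.

```python
from datetime import date, MINYEAR, MAXYEAR

# Smallest and largest dates the standard library can represent
earliest = date.min    # date(1, 1, 1)
latest = date.max      # date(9999, 12, 31)

# Early modern dates are accepted (interpreted in the proleptic
# Gregorian calendar, which matters for pre-reform sources)
old = date(1492, 10, 12)

# Year 0 and BCE years are rejected
try:
    date(0, 1, 1)
    year_zero_ok = True
except ValueError:
    year_zero_ok = False

print(MINYEAR, MAXYEAR, earliest, latest, year_zero_ok)
```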

#### `datetime.date`

In [23]:
from datetime import date, datetime

In [24]:
# `date` takes three arguments:
# 1. year, 2. month, 3. day

d = date(1982, 7, 17)

In [25]:
type(d)

datetime.date

**NB**: When creating a date, order matters! Try this:

In [26]:
d = date(19, 7, 1782)

ValueError: day is out of range for month

In [27]:
date.today()  # class method: call it on the class rather than on an instance

datetime.date(2023, 10, 17)

In [28]:
f'{d.day}.{d.month}.{d.year}'

'17.7.1982'

In [29]:
f'{d.year}/{str(d.month)}/{d.day}'

'1982/7/17'

In [30]:
f'{d.year}/{str(d.month).zfill(2)}/{d.day}'

'1982/07/17'
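Zero-padding with `zfill` works, but there are shorter ways to get the same result. The following sketch shows three alternatives, plus the ISO format; the variable names are illustrative.

```python
from datetime import date

d = date(1982, 7, 17)

# Format specification: pad month and day to two digits
with_spec = f'{d.year}/{d.month:02d}/{d.day:02d}'

# `strftime` takes a format string made of % directives
with_strftime = d.strftime('%Y/%m/%d')

# f-strings accept the same directives after the colon
with_fstring = f'{d:%Y/%m/%d}'

# ISO 8601 format (hyphens instead of slashes)
iso = d.isoformat()

print(with_spec, with_strftime, with_fstring, iso)
```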

#### `datetime.datetime`

`datetime` adds the time of day (hour, minute, second, microsecond) to a date.

In [31]:
from datetime import datetime

In [32]:
dt = datetime.utcnow()

In [33]:
dt

datetime.datetime(2023, 10, 17, 17, 35, 51, 877887)

In [34]:
dt.isoformat()

'2023-10-17T17:35:51.877887'
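Note that `datetime.utcnow()` returns a *naive* object (no time zone attached) and is deprecated as of Python 3.12. One possible alternative, sketched here, is to ask for a time-zone-aware value instead; the names `aware` and `naive` are illustrative.

```python
from datetime import datetime, timezone

# Aware datetime: carries its UTC offset with it
aware = datetime.now(timezone.utc)

# Naive equivalent, comparable to what `utcnow()` returns
naive = aware.replace(tzinfo=None)

print(aware.isoformat())   # ends with '+00:00'
print(naive.isoformat())   # no offset
```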

In [35]:
dt.date()

datetime.date(2023, 10, 17)

In [36]:
datetime.now().strftime("%m/%d/%Y, %H:%M:%S")

'10/17/2023, 18:35:51'
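The reverse operation, turning a formatted string back into a `datetime`, is done with `strptime` and the same format directives. A minimal sketch, reusing the output string above as input:

```python
from datetime import datetime

text = "10/17/2023, 18:35:51"
fmt = "%m/%d/%Y, %H:%M:%S"

# Parse the string into a datetime object
parsed = datetime.strptime(text, fmt)

# Formatting the parsed value again gives back the original string
round_trip = parsed.strftime(fmt)

print(parsed.isoformat(), round_trip == text)
```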

## Python data structures

### Lists

In [37]:
l = list(range(0, 5))

In [38]:
l

[0, 1, 2, 3, 4]

The `extend()` method appends every element of an iterable to an existing list.

**NB**: `extend` operates directly on the list, modifying it in place.

In [39]:
l.extend(range(1, 10))

In [40]:
l

[0, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [41]:
l + list(range(5, 10))

[0, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 7, 8, 9, 5, 6, 7, 8, 9]

In [42]:
l

[0, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 7, 8, 9]
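`extend()`, `append()` and `+` are easy to confuse. The sketch below contrasts them on a small, made-up list called `numbers`.

```python
numbers = [0, 1, 2]

# `extend` adds each element of the iterable
numbers.extend([3, 4])        # [0, 1, 2, 3, 4]

# `append` adds its argument as one single element
numbers.append([5, 6])        # [0, 1, 2, 3, 4, [5, 6]]

# `+` builds a new list and leaves `numbers` untouched
combined = numbers + [7]

# In-place methods return None, a common source of bugs
result = numbers.extend([8])

print(numbers, combined, result)
```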

`count()` can be used to count the number of times a given value is found within a list:

In [43]:
for n in range(0, 10):
    print(f'{n} occurs {l.count(n)} times in list `l`')

0 occurs 1 times in list `l`
1 occurs 2 times in list `l`
2 occurs 2 times in list `l`
3 occurs 2 times in list `l`
4 occurs 2 times in list `l`
5 occurs 1 times in list `l`
6 occurs 1 times in list `l`
7 occurs 1 times in list `l`
8 occurs 1 times in list `l`
9 occurs 1 times in list `l`
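Calling `count()` once per value scans the list every time. When all the counts are needed, `collections.Counter` from the standard library computes them in one pass; this is an alternative sketch, not part of the original notebook.

```python
from collections import Counter

l = [0, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# Counter maps each value to its number of occurrences
counts = Counter(l)

print(counts[1])     # 2
print(counts[0])     # 1
print(counts[42])    # 0: missing values count as zero, no error raised
```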


In [44]:
l

[0, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 7, 8, 9]

In [45]:
l.index(4)

4
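`index()` returns the position of the *first* occurrence only, and raises a `ValueError` when the value is absent. A short sketch of both behaviours (the names `first`, `second` and `position` are illustrative):

```python
l = [0, 1, 2, 3, 4, 1, 2, 3, 4, 5, 6, 7, 8, 9]

# First occurrence of the value 4
first = l.index(4)                 # 4

# An optional second argument sets where the search starts
second = l.index(4, first + 1)     # 8

# Check membership first to avoid a ValueError
position = l.index(42) if 42 in l else None

print(first, second, position)
```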

`pop()` removes and returns the last item of a list and, like `extend()`, operates directly on the list, modifying it in place.

In [46]:
while len(l) > 0:
    print(f'Removing {l.pop()} from my list')
    print(f'Size of `l` is {len(l)}')

Removing 9 from my list
Size of `l` is 13
Removing 8 from my list
Size of `l` is 12
Removing 7 from my list
Size of `l` is 11
Removing 6 from my list
Size of `l` is 10
Removing 5 from my list
Size of `l` is 9
Removing 4 from my list
Size of `l` is 8
Removing 3 from my list
Size of `l` is 7
Removing 2 from my list
Size of `l` is 6
Removing 1 from my list
Size of `l` is 5
Removing 4 from my list
Size of `l` is 4
Removing 3 from my list
Size of `l` is 3
Removing 2 from my list
Size of `l` is 2
Removing 1 from my list
Size of `l` is 1
Removing 0 from my list
Size of `l` is 0


In [47]:
# you cannot remove an element from an empty list

l.pop()

IndexError: pop from empty list
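To avoid this error, test the list before popping: an empty list is falsy. The sketch below also shows that `pop()` accepts an optional index; the list `stack` is a made-up example.

```python
stack = ["a", "b", "c"]

last = stack.pop()      # 'c': removes the last item by default
first = stack.pop(0)    # 'a': an index selects another position

# An empty list evaluates to False, so this loop stops on its own
while stack:
    stack.pop()

# Guarded pop: returns None instead of raising IndexError
leftover = stack.pop() if stack else None

print(last, first, stack, leftover)
```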

### Dictionaries

In [48]:
d = {
    "count": 0,
    "type": "child",
    "average": 1.2
}

In [49]:
d.keys()

dict_keys(['count', 'type', 'average'])

In [50]:
d.values()

dict_values([0, 'child', 1.2])

In [51]:
d['count']

0
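Beyond `keys()`, `values()` and square-bracket access, a few other operations come up constantly when working with dictionaries. A minimal sketch on the same example data (the key `"total"` is deliberately absent):

```python
d = {"count": 0, "type": "child", "average": 1.2}

# `items()` yields (key, value) pairs
pairs = [f'{key}={value}' for key, value in d.items()]

# Square brackets raise KeyError for a missing key ...
try:
    d["total"]
    found = True
except KeyError:
    found = False

# ... while `get` returns None, or a default of your choice
missing = d.get("total")
default = d.get("total", 0)

# Values can be updated in place; `in` tests for a key
d["count"] += 1
has_type = "type" in d

print(pairs, found, missing, default, d["count"], has_type)
```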

### Tuples

Tuples are similar to lists, as they are both iterables. 

In [52]:
t = tuple((0, "child", 1.2))

In [53]:
t

(0, 'child', 1.2)

As with any iterable, you can iterate over it (as one would expect):

In [54]:
for value in t:
    print(value)

0
child
1.2


The main difference between the two is that tuples are immutable: they do not support item assignment.

In [55]:
t[1] = 'adult'

TypeError: 'tuple' object does not support item assignment
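Reading from a tuple works exactly as with a list: indexing, slicing and unpacking are all supported. Only in-place modification fails, so a "changed" tuple has to be built as a new object. A sketch, with illustrative variable names:

```python
t = (0, "child", 1.2)

# Indexing and slicing work as on lists; a slice is itself a tuple
first_two = t[:2]                        # (0, 'child')

# Unpacking assigns each element to its own variable
count, kind, average = t

# Build a new tuple instead of modifying the old one
updated = t[:1] + ("adult",) + t[2:]     # (0, 'adult', 1.2)

# Or go through a mutable list and convert back
as_list = list(t)
as_list[1] = "adult"
from_list = tuple(as_list)

print(first_two, kind, updated, from_list, t)
```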

## Exercise

Let's consider two variables named `value_1` and `value_2`, instantiated as follows:

In [58]:
value_1 = "10"
value_2 = 20

❓What value (or outcome) do you expect when summing (`+`) the values of these two variables?

In [None]:
value_1 + value_2

❓How can the cell above be changed so that it can be executed without raising exceptions? (**hint**: there may be more than one way to do so)

### Lists

Let's imagine a spreadsheet containing catalogue data about an archeological collection of a museum. 
We take only the first 10 records, and select the column containing an indication of the type of object:

In [None]:
artefact_type = [
    "kantharos",
    "kylix",
    "krater",
    "dinos",
    "kylix",
    "kantharos",
    "amphora",
    "amphora",
    "amphora",
    "amphora",
]

❓ Write an expression to select the 1st item in this list.

In [None]:
# 📃 Write your solution here

❓ Now write an expression to select the last item.

In [None]:
# 📃 Write your solution here

❓ And what about retrieving the last 4 items?

In [None]:
# 📃 Write your solution here

### Dictionaries

Building upon the example above, let's imagine that we have a dictionary associating the artefact identifier with its type as in the following:

In [None]:
artefact_type_by_id = {
    'P135':'kantharos',
    'A781':'kylix',
    'Q444':'krater',
    'D912':'dinos',
    'B111':'kylix',
    'C789':'kantharos',
    'Z908':'amphora',
    'W222':'amphora',
    'S456':'amphora',
    'F289':'amphora'
}

❓ Write an expression that returns the value `dinos` from the dictionary `artefact_type_by_id`.

In [None]:
# 📃 Write your solution here