# Python
`Python` is an interpreted programming language: instructions are executed as soon as they are provided, so we can give them to the machine **interactively**, as in a console.

Why `Python`?

- the *de facto* standard for data science
- extremely flexible
- low barrier of entry
- ridiculously large collection of available programs and libraries

# Everything starts from data!
Our focus is on data. Each datum has a `type`, which determines the `operations` we can perform on it. To ease things, data can be stored in `variables`, that is, little named containers holding a value.

Syntactically, `Python` lets us declare a variable by simply providing a name and a value, as follows:
```python
variable_name = some_data
```
Optionally, we can also provide a type to said variable:
```python
variable_name: variable_type = some_data
```

And just like in maths, the type of a variable defines the operations it can take part in, e.g., within the real numbers we can't take the square root of a negative number.
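
As a minimal sketch of how the type decides which operations are allowed, consider the block below. The variable names are made up for illustration. It also shows that type annotations are hints only: Python does not check them when the code runs.

```python
import math

a_number = 3
a_text = "3"

# the same operator behaves differently depending on the type
print(a_number + a_number)  # integer addition
print(a_text + a_text)      # string concatenation

# mixing incompatible types is rejected with a TypeError
try:
    a_number + a_text
    mixing_failed = False
except TypeError:
    mixing_failed = True
print(mixing_failed)

# as in maths, the real square root of a negative number is undefined
try:
    math.sqrt(-1)
    sqrt_failed = False
except ValueError:
    sqrt_failed = True
print(sqrt_failed)

# annotations are not checked at runtime
annotated_as_int: int = "not an integer"
print(type(annotated_as_int))
```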

In [None]:
# numeric data
an_integer = 3
a_float = 3.1415

# text (string) data
a_string = "Hello, Macerata!"


# now with type indications!
# numeric data
an_integer: int = 3
a_float: float = 3.1415

# text (string) data
a_string: str = "Hello, Macerata!"

# condition data, aka booleans
a_boolean = True
a_boolean = False

# "absence" of data
not_a_meaningful_value = None

## And data can be overwritten
Variables are simply names: the value a name refers to can be changed at any time.

In [None]:
an_integer = 3
an_integer = 4
an_integer = 5
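
Overwriting is not restricted to values of the same type. A small sketch, with a made-up variable name, of a name being rebound to a different kind of data:

```python
a_value = 3
print(a_value, type(a_value))

# the same name can later refer to a value of a different type
a_value = "three"
print(a_value, type(a_value))
```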

## From datum... to data
Single values are rarely useful on their own. We usually think in terms of (possibly heterogeneous) collections of them:
- lists of values: type `list`, a dynamically-sized ordered sequence of data
- tuples of values: type `tuple`, a fixed-size, immutable ordered sequence of data
- sets of values: type `set`, a set in the mathematical sense, with unique elements and no ordering
- dictionaries of pairs: type `dict`, which, just like a dictionary, maps a key to a value
- objects: type... `object` (for now), arbitrary, user-defined types that can adapt to your needs!
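
The differences between the first three collection types can be sketched as follows. This is an illustrative example with made-up values, not an exhaustive comparison.

```python
a_list = [1, 2, 2, 3]
a_tuple = (1, 2, 2, 3)
a_set = {1, 2, 2, 3}

# lists can grow after creation
a_list.append(4)
print(a_list)

# tuples cannot be modified after creation
try:
    a_tuple[0] = 10
    tuple_modified = True
except TypeError:
    tuple_modified = False
print(tuple_modified)

# sets silently drop duplicates
print(len(a_set))
```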

In [None]:
# lists
a_list = [3, a_float, an_integer, a_string]
# tuples
a_tuple = (3, a_float, an_integer, a_string)
# sets
a_set = {1, 2, 3}
# dictionaries (maps)
a_dictionary = {"english": "Italy", "italian": "Italia", "turkish": "Italya"}

# objects
# (for now this would give us an error: `Person` is not defined yet)
# mattia = Person("Mattia", "University of Pisa", 30)
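
The commented line fails because no `Person` type exists yet. As a sketch of what such a user-defined type could look like, the block below defines one with `dataclass`. The field names `name`, `affiliation`, and `age` are assumptions, guessed from the three values in the example.

```python
from dataclasses import dataclass


@dataclass
class Person:
    # hypothetical fields, chosen to match the three values used above
    name: str
    affiliation: str
    age: int


mattia = Person("Mattia", "University of Pisa", 30)
print(mattia)
print(mattia.name)
```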

# Working with data
What can we do with data? It depends on its type!

With some rare exceptions, we can at least:
- compare for equality: is datum `a` equal to datum `b`?
- compare for inequality: is datum `a` different from datum `b`?

In [None]:
print(3 == 3)
print(an_integer == 3)
print(an_integer != 3)
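
The text does not say which exceptions it has in mind. Two well-known cases with floating point numbers, shown here as an illustration, are the special value NaN and rounding errors:

```python
import math

not_a_number = float("nan")

# NaN is the classic exception: it is not even equal to itself
print(not_a_number == not_a_number)
print(not_a_number != not_a_number)

# floating point rounding can make mathematically equal results compare as different
print(0.1 + 0.2 == 0.3)
# math.isclose compares up to a small tolerance instead
print(math.isclose(0.1 + 0.2, 0.3))
```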

## Numerical data
Standard mathematical operations are supported, and more can be defined. The results of such operations are themselves data: they have a type and can be stored in variables to be reused further down the line!

In [None]:
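print(an_integer + an_integer)   # addition
print(an_integer - an_integer)   # subtraction
print(an_integer * an_integer)   # multiplication
print(an_integer / an_integer)   # division, always yields a float
print(an_integer // an_integer)  # floor (integer) division
print(an_integer % an_integer)   # remainder of the division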
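
To see that results are data with a type of their own, here is a small sketch that stores them in variables and reuses them. The variable names are illustrative.

```python
an_integer = 5

division = an_integer / 2          # true division always yields a float
floor_division = an_integer // 2   # floor division of integers yields an integer
remainder = an_integer % 2         # remainder of the division

print(division, type(division))
print(floor_division, type(floor_division))
print(remainder, type(remainder))

# results can be reused in further operations
print(floor_division * 2 + remainder == an_integer)
```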

## Boolean data
Booleans (conditions) can be joined to get more complex conditions:
- `and`: are both conditions true?
- `or`: is at least one condition true?
- `not`: flip this condition

In [None]:
print(True)
print(True and True)
print(True and False)
print(False and False)

print(True or True)
print(True or False)
print(False or False)

print(not True)
print(not False)
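
In practice, the conditions being joined usually come from comparisons rather than literal `True` and `False`. A minimal sketch, with made-up names and thresholds:

```python
day_minimum = 8

# comparisons produce booleans...
is_above_freezing = day_minimum > 0
is_below_ten = day_minimum < 10

# ...which can then be combined
print(is_above_freezing and is_below_ten)
print(is_above_freezing and not is_below_ten)
print(day_minimum < 0 or day_minimum > 30)
```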

## String (text) data
Strings come with lots of utilities: accessing single characters, extracting substrings, concatenating, etc.

In [None]:
a_string = "Hello, Macerata!"

# access
print(a_string[0])
print(a_string[0:5])

# prefixes and suffixes
print(a_string.startswith("Hello"))
print(a_string.endswith("Macerata!"))

# splitting: from string to list
print(a_string.split(" "))

# formatting: variables within text
name = "Mattia"
surname = "Setzu"
print(f"Hi, I'm {name} {surname}, nice to meet you!")
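
A few more of the utilities mentioned above, sketched on the same example string. The replacement text is made up for illustration.

```python
a_string = "Hello, Macerata!"

# concatenation
print(a_string + " " + "Ciao!")

# search: is the text in the string?
print("Macerata" in a_string)

# transformations return a new string, the original is untouched
print(a_string.upper())
print(a_string.replace("Macerata", "Pisa"))
print(a_string)

# measure
print(len(a_string))
```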

## Collections
Collections can generally be
- accessed
- overwritten
- measured
- searched

with other operations available depending on the `type` of the collection. For instance, some collections can be sorted.

In [None]:
#### lists
# access
a_list = [3, a_float, an_integer, a_string]

print(a_list[0])
print(a_list[-1])
print(a_list[1:3])

# overwrite
print(a_list[0])
a_list[0] = "new value"
print(a_list[0])

# extension
print(a_list + a_list)
print(a_list + ["tail"])

# measure
print(len(a_list))

# search: is the element in the list?
print("not in list" in a_list)
# search: where in the list is the element?
print(a_list.index(3.1415))
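
Sorting is one of the type-dependent operations mentioned above. A minimal sketch on a list of numbers, using made-up values, which also shows that sorting needs elements that can be compared with each other:

```python
temperatures = [8, 9, 13, 12, 5]

# sorted() returns a new sorted list and leaves the original untouched
sorted_temperatures = sorted(temperatures)
print(sorted_temperatures)
print(temperatures)

# the sort() method instead sorts the list in place
temperatures.sort(reverse=True)
print(temperatures)

# sorting requires comparable elements: mixing numbers and text fails
try:
    sorted([3, "three"])
    mixed_sorted = True
except TypeError:
    mixed_sorted = False
print(mixed_sorted)
```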

In [None]:
#### dictionaries
a_dictionary = {"english": "Italy", "italian": "Italia", "turkish": "Italya"}

# access
print(a_dictionary["english"])
print(list(a_dictionary.items()))

# overwrite
a_dictionary["english"] = "Eataly"
print(a_dictionary["english"])

# extension
another_dictionary = {"french": "Italie"}
print(another_dictionary)
a_dictionary.update(another_dictionary)
print(a_dictionary)

# measure
print(len(a_dictionary))

# search: is the element in the dictionary?
print("turkish" in a_dictionary)
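
Accessing a key that is not in the dictionary is an error, which is why searching first is useful. As a sketch, the block below shows the failure and the `get()` alternative; the `"german"` key and the `"unknown"` fallback are made up for illustration.

```python
a_dictionary = {"english": "Italy", "italian": "Italia", "turkish": "Italya"}

# accessing a missing key with [] raises a KeyError...
try:
    a_dictionary["german"]
    found = True
except KeyError:
    found = False
print(found)

# ...while get() returns a fallback value instead
print(a_dictionary.get("german", "unknown"))

# keys and values can be retrieved separately
print(list(a_dictionary.keys()))
print(list(a_dictionary.values()))
```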

# Operations on collections
Operations can be generalized to collections.

### Iteration
Accessing elements of a collection.

```python
for element in collection:
    ...
```

In [None]:
pisa_min_temperatures_in_february = [
    8, 9, 13, 12, 5, 6, 5, 5, 8, 8, 5, 7, 8, 10, 10, 9, 10, 10, 8, 10, 8, 8, 5, 9, 13, 8, 10, 12, 10
]


# for each element in collection
for day_minima in pisa_min_temperatures_in_february:
    print(day_minima)
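
The same `for` pattern works on other collections too. A short sketch, with made-up data, of iterating together with the position of each element and of iterating over the pairs of a dictionary:

```python
first_days = [8, 9, 13]

# enumerate pairs each element with its position, here counted from 1
for day, day_minimum in enumerate(first_days, start=1):
    print(f"Day {day}: {day_minimum}")

# dictionaries can be iterated pair by pair
translations = {"english": "Italy", "italian": "Italia"}
for language, translation in translations.items():
    print(f"{language}: {translation}")
```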

### Iterations 101: `filter`ing
From a collection, retrieve the subcollection of elements satisfying some condition.

In [None]:
pisa_min_temperatures_in_february = [
    8, 9, 13, 12, 5, 6, 5, 5, 8, 8, 5, 7, 8, 10, 10, 9, 10, 10, 8, 10, 8, 8, 5, 9, 13, 8, 10, 12, 10
]

average_minima_pisa_february_2023 = 8.4
days_above_2023_average = list(filter(lambda day_minima: day_minima > average_minima_pisa_february_2023,
                                      pisa_min_temperatures_in_february))
days_below_2023_average = list(filter(lambda day_minima: day_minima < average_minima_pisa_february_2023,
                                      pisa_min_temperatures_in_february))

print(f"There were {len(days_above_2023_average)} days above 2023 average and {len(days_below_2023_average)} days below 2023 average.")
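
The same filtering can also be written as a list comprehension, which many Python programmers find easier to read. A sketch on a shortened, made-up list, with an illustrative `threshold` name:

```python
temperatures = [8, 9, 13, 12, 5, 6]
threshold = 8.4

# filter with a lambda...
with_filter = list(filter(lambda day_minimum: day_minimum > threshold, temperatures))
# ...and the equivalent list comprehension
with_comprehension = [day_minimum for day_minimum in temperatures if day_minimum > threshold]

print(with_filter)
print(with_comprehension)
```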

We can also directly verify properties of a (filtered or unfiltered) collection by using existential and universal quantifiers:
- `any()` is `True` if at least one element in the collection is true
- `all()` is `True` if all elements in the collection are true
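
A minimal sketch of both quantifiers, on a shortened list with made-up thresholds:

```python
temperatures = [8, 9, 13, 12, 5, 6]

# is there at least one day below 6 degrees?
any_below_six = any(day_minimum < 6 for day_minimum in temperatures)
# are all days above freezing?
all_above_freezing = all(day_minimum > 0 for day_minimum in temperatures)
# are all days above 6 degrees?
all_above_six = all(day_minimum > 6 for day_minimum in temperatures)

print(any_below_six, all_above_freezing, all_above_six)
```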

### Iterations 101: `map`ping
Map all values of a collection to some other value, according to some conversion.

In [None]:
# Celsius to Fahrenheit: multiply by 9/5, then add 32
fahrenheit_scale = 9 / 5
fahrenheit_offset = 32

fahrenheit_conversion = list(map(lambda celsius_temperature: celsius_temperature * fahrenheit_scale + fahrenheit_offset,
                                 pisa_min_temperatures_in_february))
print(f"Original: {pisa_min_temperatures_in_february[:5]}")
print(f"Converted: {fahrenheit_conversion[:5]}")
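
When the conversion deserves a name, `map` can take a regular function instead of a `lambda`. A sketch with a hypothetical `celsius_to_fahrenheit` helper and made-up input values:

```python
def celsius_to_fahrenheit(celsius_temperature: float) -> float:
    # standard conversion: multiply by 9/5, then add 32
    return celsius_temperature * 9 / 5 + 32


celsius_temperatures = [0, 5, 10, 100]
fahrenheit_temperatures = list(map(celsius_to_fahrenheit, celsius_temperatures))
print(fahrenheit_temperatures)
```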

### Iterations 101: `reduce`ing
Combine all values in a collection into a single value, according to some aggregation.

In [None]:
from functools import reduce


# the first argument of the lambda is the running aggregate, the second is the next value
reduced_sum = reduce(lambda aggregate_value, value: aggregate_value + value, pisa_min_temperatures_in_february, 0)
average = reduced_sum / len(pisa_min_temperatures_in_february)
print(f"The sum is {reduced_sum}, the average is {average}.")
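
The aggregation does not have to be a sum. As a sketch on a shortened, made-up list, the block below also reduces to a maximum and checks both results against the built-ins `sum()` and `max()`, which cover the most common cases.

```python
from functools import reduce

temperatures = [8, 9, 13, 12, 5, 6]

# reduce keeps a running result and combines it with each element in turn
reduced_sum = reduce(lambda running_total, value: running_total + value, temperatures, 0)
reduced_max = reduce(lambda running_max, value: value if value > running_max else running_max, temperatures)

# common aggregations already exist as built-ins
print(reduced_sum == sum(temperatures))
print(reduced_max == max(temperatures))
```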

## Scaling with size
`filter` and `map` do not directly compute their results: each returns a lazy iterator, which can be queried to produce one more element of the resulting collection at a time. This is why we wrapped them in `list()` above. `reduce`, instead, computes its single result right away.
This laziness exists because computing everything in one shot can be exceedingly slow, and memory-hungry, for very large collections.
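
A minimal sketch of this behaviour, on a shortened list with a made-up doubling operation: elements are produced only when requested with `next()`, and the result can be consumed only once.

```python
temperatures = [8, 9, 13, 12, 5, 6]

lazy_doubles = map(lambda value: value * 2, temperatures)

# nothing has been computed yet: elements are produced on request
first = next(lazy_doubles)
second = next(lazy_doubles)
print(first, second)

# list() consumes whatever is left
remaining = list(lazy_doubles)
print(remaining)

# once consumed, the iterator is exhausted
leftover = list(lazy_doubles)
print(leftover)
```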