# Getting started

## Installing Python

It is recommended that you use Binder or install the full Anaconda distribution with Python 3 (any version from 3.8 to 3.11 will do), as it sets up your Python environment together with a number of frequently used packages that you will need during this course. A guide on installing Anaconda can be found here: https://docs.anaconda.com/anaconda/install. NB: You don't have to install optional extras, such as the PyCharm editor.

For more instructions, take a look at the setup instructions in the repository. Let us now make sure that everybody is up and running.


If you completed all the steps and you have Python and Jupyter notebooks installed, open this file again as a notebook and continue with the content below. Good luck and have fun! 🎉

## Hello World

This notebook contains some code to allow you to check if everything runs as intended.

[Jupyter notebooks](https://jupyter.org) contain cells of Python code, or text written in [markdown](https://www.markdownguide.org/getting-started/). This cell, for instance, contains text written in markdown syntax. You can edit it by double-clicking on it. You can create new cells using the "+" button (top right bar), and you can run a markdown cell to render the syntax it contains and see what happens.

The other type of cell contains Python code and needs to be executed. You can do this either by clicking on the cell and then on the play button at the top of the window, or by pressing `shift + ENTER`. Try this with the next cell, and you'll see the result of this first line of Python.

**For a more extended revision of these materials, see http://www.karsdorp.io/python-course (Chapter 1).**

In [3]:
# It is customary for your first program to print Hello World! This is how you do it in Python.

print("Hello World!")

Hello World!


In [4]:
# You can comment your code using '#'. What you write afterwards won't be interpreted as code.
# This comes in handy if you want to comment on smaller bits of your code. Or if you want to
# add a TODO for yourself to remind you that some code needs to be added or revised.

The code you write is executed from a certain *working directory* (we will see more when doing input/output). 

You can access your working directory by using a *package* (a bundle of Python code which does something for you) that is part of the so-called Python standard library: `os` (a package to interact with the operating system).

In [5]:
import os # we first import the package

In [6]:
os.getcwd() # we can then use some of its functionality. In this case, we get the current working directory (cwd)

'/Users/giovannicolavizza/Dropbox/db_projects/Teaching/UNIBO_Programmazione_LM/notebooks'
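As a minimal sketch of what else the `os` package can do, the cell below stores the working directory in a variable and lists what that directory contains. The variable names `cwd` and `entries` are chosen only for this illustration, and the output will differ on your machine.

```python
import os

cwd = os.getcwd()          # the current working directory, as a string
entries = os.listdir(cwd)  # names of the files and folders inside it

print("Working directory:", cwd)
print("Number of entries:", len(entries))
```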

## Python versions

![You can also do images in markdown!](https://www.python.org/static/img/python-logo@2x.png)

It is important that you run a version of Python that is still supported with security updates. At the time of writing, this means Python 3.8 or higher. You can see all current versions and their support dates on the [Python website](https://www.python.org/downloads/).

For this course it is recommended to have Python 3.11 installed, since every new Python version adds functionality and sometimes also changes existing behaviour. If you recently installed Python through [Anaconda](https://www.anaconda.com/products/individual#), you're most likely running an appropriate version.

Let's check the Python version you are using by importing the `sys` package. Try running the next cell and see its output.

In [7]:
import sys

print(sys.executable)  # the path where the Python executable is located
print(sys.version)  # its version
print(sys.version_info)

/Users/giovannicolavizza/anaconda3/envs/programmazione/bin/python
3.11.7 (main, Dec 15 2023, 12:09:56) [Clang 14.0.6 ]
sys.version_info(major=3, minor=11, micro=7, releaselevel='final', serial=0)


You have now printed the version of Python you have installed.
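If you would rather let Python do the comparison for you, a small sketch like the following checks the running version against a minimum. The threshold `(3, 8)` is taken from the text above; the names `minimum` and `is_supported` are made up for this example.

```python
import sys

minimum = (3, 8)  # assumed minimum version (major, minor), taken from the text above

# sys.version_info can be compared like a tuple: (major, minor, micro, ...)
is_supported = sys.version_info >= minimum

print("Running Python", sys.version_info.major, sys.version_info.minor)
print("Meets the minimum version:", is_supported)
```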

You can also check the version of a package via its attribute `__version__`. A common package for working with tabular data is `pandas` (more on this package later). You can import the package and make it referenceable by another name (a shorthand) by doing:

In [8]:
import pandas as pd  # now 'pd' is the shorthand for the 'pandas' package

NB: Is this raising an error? Look further down for a (possible) explanation.

Now the `pandas` package can be called by typing `pd`. The version number of packages is usually stored in a _magic attribute_ or a _dunder_ (=double underscore) called `__version__`. 

In [11]:
pd.__version__

'2.1.4'

The code above displayed something without using the `print()` function. Let's do the same, but this time by using `print()`.

In [12]:
print(pd.__version__)

2.1.4


Can you spot the difference? Why do you think this is? What kind of datatype do you think the version number is? And what kind of datatype can be printed on your screen? We'll go over these differences and the involved datatypes during the first lecture and seminar. 

If you want to know more about a (built-in) function of Python, you can check its manual online. The information on the `print()` function can be found in the manual for [built-in functions](https://docs.python.org/3.8/library/functions.html#print). More on datatypes later on. 
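You can also read this documentation without leaving the notebook. As a short sketch, the built-in `help()` function shows the documentation of whatever you pass to it, and the same text is stored on the function in its `__doc__` attribute (another dunder).

```python
# help() displays the documentation of a function inside the notebook
help(print)

# The same text is stored on the function itself, in its __doc__ attribute
print(print.__doc__)
```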

### Exercise
Try printing your own name using the `print()` function. 

In [13]:
# TODO: print your own name


In [14]:
# TODO: print your own name and your age on one line


If all of the above cells were executed without any errors, you're good to go.

However, if you did get an error, you should start debugging. Most of the time, the errors returned by Python are quite meaningful. Perhaps you got this message when trying to import the `pandas` package:

```python
---------------------------------------------------------------------------
ModuleNotFoundError                       Traceback (most recent call last)
<ipython-input-26-981caee58ba7> in <module>
----> 1 import pandas as pd

ModuleNotFoundError: No module named 'pandas'
``` 

If you go over this error message, you can see:

1. The type of error, in this example `ModuleNotFoundError` with some extra explanation
2. The location in your code where the error occurred or was _raised_, indicated with the ----> arrow

In this case, you do not have this (external) package installed in your Python installation. Have you installed the full Anaconda package? You can resolve this error by installing the package from Python's package index ([PyPI](https://pypi.org/)), which is like a store for Python packages you can use in your code. 
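If you want to check whether a package is available without triggering the error, one possible approach is the standard library function `importlib.util.find_spec()`. This is only a sketch: the variable names `package_name` and `spec` are illustrative, and simply trying the `import` works just as well.

```python
import importlib.util

package_name = "pandas"  # the package to look for

# find_spec() returns None when no top-level package with that name can be found
spec = importlib.util.find_spec(package_name)

if spec is None:
    print(package_name, "is not installed")
else:
    print(package_name, "is installed")
```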

To install the `pandas` package (if missing), run in a cell:

```python
pip install pandas
```

Or to update the `pandas` package you already have installed:

```python
pip install pandas -U
```

Try this in the cell below.

In [15]:
# Try either installing or updating (if there is an update) your pandas package
# your code here


If you face other errors, then Google (or another search engine) is your friend. You'll see tons of questions on Python-related problems on websites such as Stack Overflow. It is tempting to simply copy and paste a coding pattern from there into your own code, and you can try the same with ChatGPT or similar chatbots. But if you do, make sure you fully understand what is going on.

## Basic stuff
The code below does some basic things using Python. Please check whether you know what it does; if not, you can still figure it out. If this is all new to you, just work through the rest of this notebook by executing each cell and try to understand what happens.

The first notebook that we will be discussing in class is paced more slowly. You can already take a look at it if you want to work ahead. We'll be repeating the concepts below, and more. If you think you have already mastered these 'Python basics' and the material from the first notebook, then get in touch for some more challenging exercises.

## Variables and operations

In [16]:
a = 2
b = a

In [17]:
# Or, assign two variables at the same time
c, d = 10, 20

In [18]:
c

10

In [19]:
b += c

In [20]:
# Just typing a variable name in the Python interpreter (= terminal/shell/cell) also returns/prints its value
a

2

In [21]:
# Now, what's the value of b?
b

12

In [22]:
# Why the double equals sign? How is this different from the above b = a ?
a == b

False

In [23]:
# Because the ≠ sign is hard to find on your keyboard
a != b

True

In [24]:
s = "Hello World!"

print(s)

Hello World!


In [25]:
s[-1]

'!'

In [26]:
s[:5]

'Hello'

In [27]:
s[6:]

'World!'

In [28]:
s[6:-1]

'World'

In [29]:
s

'Hello World!'

In [30]:
words = ["A", "list", "of", "strings"]
words

['A', 'list', 'of', 'strings']

In [31]:
letters = list(s) # Names shown in green (such as 'list') are built into Python: avoid using them as variable names
letters

['H', 'e', 'l', 'l', 'o', ' ', 'W', 'o', 'r', 'l', 'd', '!']

If you have accidentally bound a value to the name of a built-in function of Python, you can undo this by restarting your 'kernel' in Jupyter Notebook. Click `Kernel` and then `Restart` in the bar at the top of the screen. This makes Python lose its memory of previously declared variables. It also means that you must re-run all cells if you need their results.
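As a small illustration of how this can go wrong, the sketch below deliberately binds the name `list` to a value and then removes that binding again with `del`, which is a lighter alternative to restarting the kernel. The value `"oops"` is just a placeholder, and the `try`/`except` construct is used here only to show the error without stopping the cell.

```python
list = "oops"  # this binds the built-in name 'list' to a string (do not do this!)

try:
    list("abc")  # 'list' is now a string, so it can no longer be called
except TypeError as error:
    print("TypeError:", error)

del list  # remove our own binding; the built-in function becomes visible again

print(list("abc"))
```

Note that `del` only removes the name you created yourself. The built-in function itself is never deleted, which is why it can be used again afterwards.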

In [32]:
# Sets are unordered collections of unique elements
unique_letters = set(letters)
unique_letters

{' ', '!', 'H', 'W', 'd', 'e', 'l', 'o', 'r'}

In [33]:
# Variables have a certain data type.
# Python is very flexible: it lets you assign any type of data to a variable.
# If you need a certain data type, you need to check it explicitly.

type(s)

str

In [34]:
print("If you forgot the value of variable 'a':", a)
type(a)

If you forgot the value of variable 'a': 2


int

In [35]:
type(2.3)

float

In [36]:
type("Hello")

str

In [37]:
type(letters)

list

In [38]:
type(unique_letters)

set
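Besides looking at the output of `type()`, you can ask Python directly whether a value has a certain data type with the built-in `isinstance()` function. A minimal sketch, with example values chosen only for illustration:

```python
number = 2
text = "Hello"

# isinstance(value, data_type) answers with True or False
print(isinstance(number, int))  # True
print(isinstance(text, int))    # False
print(isinstance(text, str))    # True

# The result of type() can also be compared against a data type directly
print(type(number) == int)      # True
```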

### Exercise

1. Create variables of each type: integer, float, string (text), list, and set.
2. Try using mathematical operators such as `+ - * / **` on the numerical datatypes (integer and float).
3. Print their values.

In [39]:
# Your code here

Hint: You can insert more cells by going to `Insert` and then `Insert Cell Above/Below` in this Jupyter Notebook.