                                                           Notebook created by Dragos Gruia and Valentina Giunchiglia


# Introduction to Jupyter Notebook

**Jupyter Notebook** is an interactive computing platform that can be used to combine code and text into a document-style format. Each separate chunk of code, or of text, is called a `cell`. The most useful shortcuts are presented in the following table.

| Shortcut | Action |
| :---: | :---: |
| `shift + enter` | Run the current cell |
| `Command` or `Control` + `S` | Save the notebook |
| `C` $\dagger$ | Copy cell |
| `V` $\dagger$ | Paste cell |
| `X` $\dagger$ | Cut cell |
| `M` $\dagger$ | Turn cell into markdown |
| `Y` $\dagger$ | Turn cell into code |



$\dagger$ These shortcuts only work in "Command mode", i.e. when you're not inserting text into a cell. If you're inserting text into a cell ("Edit mode"), press `Esc` to exit "Edit mode" and enter "Command mode".

Instead of the shortcuts you can use the command bar at the top of the notebook. Each element of the bar is described in the following figure.

![Figures/introjupyter.png](attachment:2d6085cc-ee50-4229-946c-c253298196e3.png)

Jupyter notebooks can be used to code in Python or many other languages. Here, however, we will focus exclusively on how to use Python and Bash. Bash is the default language of the command line, or terminal, which you have already used to install your conda environment and set up the Jupyter notebook. The notebook consists of a combination of theory, coding tutorials and `Code here` sections, in which you are required to solve small exercises using what you have learnt so far in the lecture. At the end, you will find a few bigger exercises that you can use to revise what you have learnt during each day of lectures.

If you are on a Mac, the first thing you need to do to be able to run the lectures without any issue is to change the `kernel` of the Jupyter notebook. The `kernel` specifies which `conda environment` is used when running the code in the notebook. This needs to be done every time you open a new Jupyter notebook. To change the kernel, click on `Python 3` in the top right corner of the Jupyter Notebook and then change the kernel to `Python3.9 (PythonTutorial)`.

    IMPORTANT: To complete the lecture properly, you need to run all cells in the notebook that have code. To run it you can either press the play button on top (as shown in the image), or press shift + enter.

# Introduction to Python Syntax

The first things we are going to learn in Python are:
- How to store data 
- What different types of data exist
- How we can manipulate that data to perform various (basic) types of operations
- How to work with files

`Comments` will be used throughout the tutorial to indicate that we do not want Python to execute a specific line of code.  
Other times, a comment may be used to explain what a certain line of code, or even an entire block of code, is doing.
You can indicate that a line of code is a comment by adding the `#` sign at the beginning of that line, or you can add `#` in the middle of a line to ignore any text starting from that point.

In [None]:
# This is a comment
2 + 2 # This is another comment

## Variables

Python, like all other programming languages, uses variables to store data. In Python you use the syntax `variable_name = variable_value`: the name of the variable is followed by an equal sign, which is in turn followed by the value we want to assign to it. There are many different types of data:

In [1]:
age = 60  # Integer number
height = 1.75 # Float number
gender = "Male" # String
pregnant = False # This is not a string, but a special type of data called boolean
# Booleans can only take the value True or False.

It is also possible to assign the value of one variable to another variable, so that both end up holding the same value.

In [2]:
age = 60
updated_age = age

Once a value is assigned to a variable, it is possible to simply use the variable in the following steps of the analysis. If you need to check the value stored in a variable, Python has a very useful function called `print()`. `print()` can be used to display one variable at a time, or multiple variables together. In the latter case, the variables need to be separated by a comma.

In [3]:
print(gender)
print(age, updated_age)

Male
60 60


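As you can see, `age` and `updated_age` are exactly the same, as expected. If you end a code cell with a variable name, the notebook displays its value automatically, without the need for `print()`. Note that strings displayed this way are shown with their quotation marks.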

In [4]:
gender

'Male'

Once you have defined your variables you can perform a wide range of `arithmetic operations` with them. For instance:

In [5]:
var_A = 5
var_B = 10
var_A + var_B
var_A - var_B
var_A / var_B # used for division
var_A * var_B # used for multiplication
var_A ** var_B # used for exponentiation
var_A % var_B # used for modulus (which outputs the remainder of the operation) 


5

Let's look at the output of one of the previous operations using `print`

In [6]:
print(var_A)
print(var_A ** var_B)

5
9765625
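
Because a cell displays only the result of its last line, the arithmetic cell earlier showed a single value. As a minimal sketch, reusing the same two variables, each operation can be wrapped in `print()` so that every result is displayed:

```python
var_A = 5
var_B = 10

print(var_A + var_B)   # addition
print(var_A - var_B)   # subtraction
print(var_A / var_B)   # division, which always gives back a float
print(var_A * var_B)   # multiplication
print(var_A ** var_B)  # exponentiation
print(var_A % var_B)   # modulus: the remainder of 5 divided by 10
```

The division line is worth a look: even though both inputs are integers, the result is the float `0.5`.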


Now let's try to use what you have learnt above to complete the following exercise.

-----------------
### Code here
The current temperature is 27 degrees Celsius. What is this temperature in Fahrenheit? Store the temperature in Celsius in a variable called `celsius` and then use it to obtain the temperature in Fahrenheit, which should be stored in a variable called `fahrenheit`. The equation to convert from Celsius (C) to Fahrenheit (F) is: 
$$ 
    F^\circ = C^\circ \times 1.8 + 32
$$

In [7]:
## CODE HERE: 
celsius = 27  # temperature in degrees Celsius
fahrenheit = celsius * 1.8 + 32  # apply the conversion equation
print(fahrenheit)

80.6


-------------------

You can also use similar logic, with the `+` operator, to `concatenate strings`:

In [8]:
var_A = "Hello"
var_B = "World"
print(var_A + var_B)

HelloWorld


Note that the result of the print statement came as `HelloWorld`. That is because Python does not add a space between words, unless it is specifically told to do so. So what if we wanted the output to be `Hello World`? 
We could simply add a space at the end of the first string.
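
A minimal sketch of two ways to get the space, reusing the variable names from the cell above:

```python
var_A = "Hello"
var_B = "World"

# Option 1: write the first string with a space at its end
print("Hello " + var_B)

# Option 2: keep both variables unchanged and concatenate a one-space string between them
print(var_A + " " + var_B)
```

Both lines display `Hello World`. The second option is handy when you do not want to change the original strings.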

### Booleans: a special type of variable

As mentioned above, booleans are a special type of variable that can only take the value `True` or `False`. They can be used in multiple contexts, e.g. with conditional and logical operators (mentioned later on in the lecture) or to establish the **truthiness** of a variable. To check the *truthiness* of a variable, the function `bool` can be used.

The truthiness of an empty string is always `False`

In [9]:
bool(''), bool("hello")

(False, True)

The truthiness of an integer is `True` only if different from 0

In [10]:
bool(1), bool(0)

(True, False)

If you want to negate the truthiness of an object, you can use the operator `not`

In [11]:
print(bool(0))

False


In [12]:
print(not bool(0))

True


### Changing variable types

If you need to change the type of the variable, it is possible to do it with specific commands. However, be careful!! Changing variable types can create problems in the analysis, if you change them incorrectly.

In [13]:
var = "5"
var_int = int(var)
var_str = str(var_int)
var_float = float(var_str)
print(type(var), type(var_int), type(var_str), type(var_float))

<class 'str'> <class 'int'> <class 'str'> <class 'float'>
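
One way a conversion can quietly change your data: `int()` discards the decimal part of a float instead of rounding it. A small sketch, in which the variable names are only illustrative:

```python
height = 1.75

height_int = int(height)        # the decimal part is dropped, not rounded
height_rounded = round(height)  # round() goes to the nearest integer instead

print(height_int, height_rounded)
```

Here `int(height)` gives `1` while `round(height)` gives `2`, so it is worth checking which behaviour your analysis needs.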


One common error is trying to convert to an integer or a float a variable that cannot be converted to a number

In [14]:
var_name = "hello"
float(var_name)

ValueError: could not convert string to float: 'hello'

As you can see, Python raises an error, because it is not possible to convert "hello" into a number.

# Relational Operators



There is a set of 'relational' operators that can be used to compare values, most commonly numeric variables. If the comparison holds, they output the value `True`, otherwise they output the value `False`. Note that a single `=` is listed only for contrast: it assigns a value and does not compare anything.

- `=`: Assign value (not a comparison)
- `==`: Is equal to?
- `!=`: Is not equal to?
- `>`: Is greater than?
- `<`: Is less than?
- `>=`: Greater than or equal to?
- `<=`: Less than or equal to?

Try applying some of the relational operators. Below are some examples.

In [15]:
print(1 != 2)
print(1 <= 2)
print(10 > 10)
print(4 == 2 * 2)

True
True
False
True
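
Mixing up `=` and `==` is a common source of confusion. A short sketch of the difference, with illustrative variable names:

```python
age = 60              # a single = stores the value 60 in the variable age
is_sixty = age == 60  # a double == asks a question and gives back a boolean

print(is_sixty)
print(age != 60)
```

The result of a comparison is itself a boolean, so it can be stored in a variable like any other value.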


# Logical Operators

Logical operators can be used to combine the results of several comparisons. These are the `AND` operator and the `OR` operator, which in Python are written in lowercase as `and` and `or`.

The `AND` operator will only return `True` if all of the comparisons in the command return `True`. For example:

In [16]:
print(1 == 2, 1 == 1)
print(1 == 2 and 1 == 1)  # Will output False as the left side operation returned False, even though the right side returned True

print(5 == 5 , 10/2 == 5)
print(5 == 5 and 10/2 == 5) # Will output True as both operations returned True

False True
False
True True
True


The `OR` operator will return `True` if at least one of the comparisons in the command returns `True`. 

In [17]:
print(1 == 1, 2 == 1)
print(1 == 1 or 2 == 1) # Will output True as the left side operation returned True

print(1 == 1, 2 == 1)
print(1 == 1 and 2 == 1) # For contrast: and will output False here, as the right side operation returned False


True False
True
True False
False


---------------

### Code here

Let's try to combine what you have learnt about booleans and logical operators. What is the output when checking the truthiness of:
- An empty string and the integer 5
- The string `'a'` or the integer 0
- The string `'False'` and the string `'a'`
- The boolean `False` and the string `'a'`
- Not the boolean `False`, and the string `'a'`

What's the difference between options 3 and 4?
Why does option 5 give the same result as option 3?

In [20]:
## CODE HERE:
print(bool('') and bool(5))
print(bool('a') or bool(0))
print(bool('False') and bool('a'))
print(bool(False) and bool('a'))
print(not bool(False) and bool('a'))

#In option 3, the word 'False' is a string, not a boolean value. Thus python will evaluate it as True.
#In option 4, the boolean False is a logical value and is thus evaluated as False. Thus the expression is False.

#Option 5 is equal to option 3 because we are negating the first boolean, which comes up as True. Thus the overall expression is True.

False
True
True
False
True


------------

# Working with files 

Now that we know how to interact with data, let's see how we can import data from a file. Python has a specific syntax for opening and creating files.

In [None]:
file = open("hello_world.txt", "w")
file.write("Hello World!")
file.close()

To create a new file we have to use the command `open()` with the name of the file as the first input in the parentheses and `w` as the second. The `w` means that you are *writing* the file, which implies that if a file with the same name already exists, it will simply be overwritten. If you want to add another line to the file, then you have to replace `w` with `a`, which means that you are *appending* new information to the already created file.

When writing the file name you can provide either just the name of the file, or a path. If you just specify a name, then the file will be saved in the current working directory. If you specify a path, then the file will be created in that specific location. After creating the file, you can write things in it - in this case "Hello World!" - using the `write` function. Once you are done working with the file, you need to close the connection. If `.close()` is not called, then some data might be lost. 

**IMPORTANT**: when writing inside a file, the `write` command expects a variable of type string (`str`)!!
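
As a minimal sketch of what this means in practice, a number has to be converted with `str()` before it can be written. The file name `age.txt` is only an illustrative choice:

```python
age = 60

file = open("age.txt", "w")  # "age.txt" is a hypothetical file name
file.write(str(age))         # file.write(age) would raise a TypeError
file.close()
```

After running this, `age.txt` contains the two characters `60`.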

---------
### Code here
Now try to create a new file, call it `lecture_one.txt` and write a sentence of what you have learnt so far in this lecture. Remember to close the connection to the file when you are done!

In [None]:
# CODE HERE: create a file called lecture_one.txt and write a sentence in the file that summarises what you have learnt so far in this lecture
# 1. Open the file
# 2. Write sentence in the file
# 3. Close the file

file = open("lecture_one.txt", "w")
file.write("I have learnt how to open, read and write files in python.")
file.close()


Now, try to add a new line to the file you already created, which specifies the date of today.

In [None]:
# CODE HERE: 
# 1. Open the file (with append)
# 2. Write the date of today
# 3. Close the file

file = open("lecture_one.txt", "a")
file.write("\nToday is Friday.")  # \n starts a new line, otherwise the text is added to the end of the existing line
file.close()


--------------

If you want to import the file you created, you can follow a pattern similar to the one used to create the file, but this time you have to write `r` as the second input and not `w`, because you are *reading* the file. If you didn't create the file before, then you will get an error, so be sure to complete the section above! After *opening* the file, you can read its contents with the `read()` function and save them within the Python environment in a variable of your choice (in this case, the variable is called `file_object`). Once you are done, you have to close the connection with the file as mentioned above.

In [None]:
file = open("lecture_one.txt", "r")
file_object = file.read()
file.close()

The issue with `read()` is that it imports the entire file at once, which can require a lot of memory and overwhelm your computer if the file is big. One alternative is to use `readline()`, which loads only one line each time it is called. This works because the file object keeps track of its position in the file and behaves as what is called an *iterator*. `readline()` is used in the same way as `read()` above. 

In [None]:
file = open("lecture_one.txt", "r")
file_object = file.readline()
file.close()
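
To see the line-by-line behaviour, here is a small sketch that first writes a two-line file and then calls `readline()` twice. The file name `two_lines.txt` is a made-up example, and `\n` is the character that marks the end of a line:

```python
# Create a small two-line file to read back ("two_lines.txt" is a hypothetical name)
file = open("two_lines.txt", "w")
file.write("first line\nsecond line\n")
file.close()

file = open("two_lines.txt", "r")
first = file.readline()   # reads up to and including the first end-of-line character
second = file.readline()  # continues from where the previous call stopped
file.close()

print(first)
print(second)
```

Each call picks up where the last one ended, so the second call gives back the second line rather than repeating the first.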

As you may have noticed already, you are always required to open and close the file. A better, and preferred, alternative is to use `context managers`. `Context managers` are temporary environments where some variables can be used, files can be accessed or, more generally, things behave in a specific way. Here, they can be used to open the file within the created environment and then close it automatically once you are outside of it.

In [None]:
with open('hello_world.txt', "w") as file:
    file.write("Hello World!")

As you can see, to set up the context manager in this case you have to combine `with` and `as`. Also notice that we used `line indentation` (i.e. adding space via TAB at the beginning of the second line). Indentation is very important in Python, and we will discuss this in more detail in the next lesson. For now, all you need to know is that every line of code that works with the file you just opened needs to be indented. Once you stop indenting your code, Python will assume that you have finished working with the file and it will close it, i.e. it will exit the context manager. Another thing to mention is that the opened file is stored in the `file` variable. This variable can take on any name and it is up to you how you want to name it.
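
One way to check that the file really is closed automatically is the `closed` attribute of the file variable, which is `False` while the file is open and `True` afterwards. A minimal sketch, reusing the file name from the cell above:

```python
with open("hello_world.txt", "w") as file:
    file.write("Hello World!")
    print(file.closed)  # still inside the indented block, so the file is open

print(file.closed)      # the indented block has ended, so the file is closed
```

The first `print` shows `False` and the second shows `True`, without `.close()` ever being called.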

---------
### Code here
Now try to create the `lecture_one.txt` file as above, but using a context manager. Once you have created the file, try to read it with `read()` or `readline()` and store it in a variable called `lecture_file`

In [None]:
# CODE HERE
# 1. Create the file with context manager
# 2. read the file with context manager

with open('lecture_one.txt', "w") as file:
    file.write("Hello World!")
    
with open('lecture_one.txt', "r") as file:
    lecture_file = file.read()  # store the contents in lecture_file, as the exercise asks


---------

# Introduction to Bash Scripting

The `Command Line` is a tool that allows you to navigate and edit your computer’s filesystem. Through the command line, you can create new files, edit the contents of those files, delete files, and much more. To perform these commands, we commonly use a programming language called `Bash`. You won't need to know this language in great detail; however, knowing the basics will be very helpful when you write your own Python scripts (and even more useful later on if you decide to pick the Computational Stream).

Jupyter Notebook also allows us to use `Bash` within it, so to make things simpler, we won't use the `Command Line` for the purpose of this tutorial. To use `Bash` in Jupyter Notebook you need to add an exclamation mark `!` at the beginning of each line. That tells Jupyter to evaluate the code using `Bash`. An alternative is to write `%%bash` at the beginning of the cell. 

The first command you will learn tells you the directory in which you find yourself; you can call it using `pwd`, which stands for print working directory.

In [None]:
! pwd 

Knowing which directory you are in is very useful if you want to navigate to a certain folder, or if you want to save your data in a specific place. Navigating through your directories using `Bash` is very similar to navigating through your folders via the Finder on Mac or the File Explorer on Windows, but in `Bash` you don't have the graphical interface to help you get from one place to another. 

As you can see, the output of `pwd` is a `path`. This `path` is called an `absolute path`, because it starts from the first (root) directory of your computer and lists all the subdirectories down to your current one. A `relative path`, instead, is *relative* to where you are located, so it only lists the directories from your current directory onwards.

Now that you know which folder you are in, you might also want to know what files are found in this folder. To do this you need the `ls` command, which is short for list.

In [None]:
! ls

To create a new folder and make that your working space, we need the `mkdir` command, which is short for make directory. This needs to be followed by the name of the folder that you want to create. If a folder with that name already exists, you will get an error message and no new folder will be created.

In [None]:
! mkdir my_workspace

*my_workspace* is a relative path. You could achieve the same by writing the absolute path, which corresponds to the output of `pwd` followed by `/my_workspace`.
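
The same relationship can be sketched from Python. This example assumes the `os` module from Python's standard library, which has not been covered in this lecture: `os.getcwd()` gives the same information as `pwd`, and `os.path.abspath()` turns a relative path into an absolute one.

```python
import os

current_dir = os.getcwd()  # same information as pwd
absolute = os.path.join(current_dir, "my_workspace")  # pwd output + my_workspace

print(absolute)
print(os.path.abspath("my_workspace") == absolute)  # compare with Python's own conversion
```

Nothing is created on disk here; the code only builds and compares the two paths.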

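Now if we want to change directory and navigate inside the folder we have created, we can run the `cd` command, which is short for change directory, followed by the name of the folder that we want to enter. Be aware that in a notebook each `!` line runs in its own temporary shell, so a directory change made with `! cd` is forgotten as soon as the line finishes. To change directory for the rest of the notebook, use the IPython magic `%cd my_workspace` instead.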

In [None]:
! cd my_workspace

If we want to exit a folder and move up to the one that contains it, we simply add two dots after the `cd` command:

In [None]:
! cd ..

Play around with the commands above and see if you can navigate through folders with ease.

If we want to `create a file`, `Bash` has a simple command for that called `touch`. The command expects the name of the file that you want to create. `Bash` will create the file in the current directory.
 

In [None]:
! touch mynewfile.txt

`Bash` also allows you to copy or move files from your filesystem to wherever you need them. You can do this via the `cp` and `mv` commands which are short for copy and move, respectively. Both of these commands expect the first argument to be the file you want to copy or move. Then, for `mv` the second argument tells `Bash` in which directory the file should be moved, while for `cp` the second argument dictates the name of the new file (though you can also mention the directory for `cp` if you want the file to be copied in a specific place).

In [None]:
! cp introjupyter.png my_copy_of_the_pic.png
! cp introjupyter.png /Users/dg519/Documents/my_copy_of_the_pic.png
! mv my_copy_of_the_pic.png /Users/dg519/Documents/

One important aspect of the `cp` command is that if we want to copy a directory, rather than a file, we need to specify the `-r` argument. Otherwise `Bash` will throw an error. That argument stands for recursive and it lets `Bash` know that there may be more than one file in the folder you inputted and that it should copy all of these files. The `mv` command does not need this argument: it moves a directory together with everything inside it.

In [None]:
! cp -r folder1 copy_of_folder1
! mv folder1 folder2

You can also delete a file of your choice by using the `rm` command, which is short for remove. Using `-r` will allow you to remove folders. However, use this command with caution, as you don't want to get rid of important analyses or data by mistake.

In [None]:
! rm my_file.txt
! rm -r my_folder1

Now that you know how to navigate your directories, how to modify files and how to create your own files, we will have a short look at how you can add information to a file. The `>` operator sends information defined by the user to a chosen file. If the file already contains information, this will be overwritten.

In [None]:
! echo "This is my information" > this_is_a_file.txt

Notice that we also used the `echo` command at the beginning of the line. This is used to tell `Bash` what information we want to store.

If, instead of overwriting the existing information, we just want to add more information to the file, then we can use the `>>` operator instead. 

In [None]:
! echo "This information will be appended to the file" >> this_is_a_file.txt

The final step is to see what the new file contains, which can be done via the `cat` command, which is short for concatenate. This is similar to the `print` function in Python. 

In [None]:
! cat this_is_a_file.txt
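
These three steps mirror what you did with Python files earlier in the lecture: `>` behaves like the `w` mode, `>>` like the `a` mode, and `cat` like reading and printing. A rough Python sketch of the same sequence, where the `\n` at the end of each string stands in for the line ending that `echo` adds automatically:

```python
# Same file name as in the Bash cells above
with open("this_is_a_file.txt", "w") as file:  # like >  (overwrite)
    file.write("This is my information\n")

with open("this_is_a_file.txt", "a") as file:  # like >> (append)
    file.write("This information will be appended to the file\n")

with open("this_is_a_file.txt", "r") as file:  # like cat
    contents = file.read()

print(contents)
```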

Below you can see an example of how you can create a file, add information to it, then print its contents.

In [None]:
! touch my_file.txt
! echo "I'm adding some information" >> my_file.txt
! cat my_file.txt

## Stuck and unsure what to do?

`Bash` has many other commands that we did not cover today, and even the commands that we did cover have lots of options which can be turned on or off. The `help` command is extremely useful, as it not only shows you the different options that a command has, but also reminds you how the command should be used. Note that `help` describes the commands built into `Bash` itself, such as `pwd` and `cd`; for other programs, such as `ls`, the `man` command (e.g. `man ls`) shows the manual instead.

In [None]:
%%bash
help pwd

Notice that I used `%%bash` instead of `!`. The two are largely equivalent, though if you want to use the `help` command, only the former will work. It is up to you which one you want to use, but it is best practice to be consistent.