# Insights Through Data course

This course introduces you to some fundamental aspects of Data Science, including Statistics and Machine Learning, and guides you through your first steps in data analysis. To support these learning outcomes we need a programming language, and we are going to use _Python_ for this purpose. We will use _Jupyter notebooks_ hosted on _Noteable_ to make working with Python much smoother.

# Python

Python is a high-level programming language. It was created by Guido van Rossum and released in 1991. Since then it has become a very popular language for building web applications, creating workflows alongside other software, connecting to database systems, developing production-ready software, handling big data, and performing complex mathematics.

Python is a strong data analysis tool and offers a lot of flexibility in this area through its modules. A module can be thought of as a code library containing a set of functions you want to include in your application.


# Starting with programming

Starting to code can be difficult if you are not used to such tasks. In the early stages, you may spend a lot of time figuring out what is wrong with your code and what the error messages mean. This is normal, so don't let it discourage you! We will do pair-programming on even weeks of the course so you can help each other. There will be plenty of time for you to ask us questions during fusion or online teaching and discussion sessions. Please make sure that you ask us for help if you need it.

# Noteable

Noteable is simply the university's service that hosts your computational notebooks in one simple online hub. We will be using Python (notebooks) on Noteable, instead of installing Python on your computers, because it makes the logistics much easier.

# Jupyter

A Jupyter notebook is the web-based interface you are looking at now: it looks like a notebook and combines text and code. The code we write here is in the Python language. Both the code and the text are live, meaning that you can run and edit them. To edit any block of text or code, double-click on it.

## Text

You can edit these text boxes or write your own notes and descriptions throughout the course by adding new blocks (also called boxes or cells). To do this, click on the plus button above, then choose `markdown` from the drop-down menu.

To learn more about how to format your text see [Markdown Cheatsheet](https://github.com/adam-p/markdown-here/wiki/Markdown-Cheatsheet).


## Code

To write Python code, you need to add a new cell and set the drop-down menu to `code`. After writing your code in the block, click the `Run` button in the toolbar near the top of the window. Alternatively, click on the block and then press `Shift+Enter`.

Let's try it. The block below contains Python code: a command to print two words. Run this code and see the output.

In [None]:
print("Hello World")

Congratulations! You just successfully executed your first Python code. "Hello World" is traditionally the first program one writes when learning a new programming language. The `" "` around the text is required to mark it as a `string` of characters. Try editing the code above, removing the `" "` and running it again to see what happens.

Now run the code below. It prints "The Zen of Python".

In [None]:
import this

## Menu bar

The menu bar on top of the page includes some useful operational tools. The most useful ones are:

- **+**: This is used to add a new block or cell (of text or code)
- **Run**: To run code or text in a cell. You can also press `Shift+Enter` on your keyboard.
- **Save**: To save the notebook click on the small disk icon. Always do that before leaving your notebook.
- **Clearing all code output**: Go to `Kernel -> Restart & Clear Output`
- **Deleting a cell**: Go to `Edit -> Delete Cells`, or select the cell (without editing it) and press `D` twice on your keyboard.
- **Saving a notebook locally to your own computer**: Go to `File -> Download as -> Notebook (.ipynb)`.
- **Closing**: It is not recommended to close a notebook just by closing the window. You should close it from the toolbar `File -> Close and Halt`.
- **Cut, copy, paste**: To use on text, code, or cells.
- **Stop**: Or "interrupt the kernel", to stop the notebook from doing calculations if it is taking too long because of a coding mistake.

# First steps in Python

Let's do some basic programming in Python. Run the code below, which simply adds the two numbers 345 and 55 together.

In [None]:
345+55

Double-click on the code above and add another number to the code, for example 267. Run the code again.

Notice that each time you run the cell, the number next to it changes. Don't worry about that: it just counts the number of inputs made to Python so far in the session.

A better way to show the output of a command is to use the `print` function, which makes the output on the screen nicer. With `print`, the code above looks like this. Run it and notice the difference:

In [None]:
print(345+55)
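
One practical reason to prefer `print`: in a notebook cell, only the value of the last bare expression is displayed, whereas every `print` call produces its own line of output. A minimal sketch, using the same numbers as above:

```python
345 + 55               # not the last line of the cell, so nothing is displayed for it
print(345 + 55)        # prints 400
print(345 + 55 + 267)  # prints 667
```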

1. **Python as a calculator**: Python can be used as a calculator for simple arithmetical operations. See some of them in the table below:

| Symbol | Task | Example | Result |
|----|---|---|---|
| `+` | Addition | `4 + 3` | 7 |
| `-` | Subtraction | `4 - 3` | 1 |
| `/` | Division | `7 / 2` | 3.5 |
| `*` | Multiplication | `4 * 3` | 12 |
| `**` | Power of | `7 ** 2` | 49 |

Let's try them:


In [None]:
print(51/7)
print(round(51/7, 2))
print(21*21)
print(2**5)

The function `round` rounds a number to the number of decimal places we ask for (for example, 2 or 3), given as its second argument.
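
To see how the second argument of `round` changes the result, here is a short sketch reusing `51/7` from the cell above. Note that `round` goes to the nearest value, which is not always upwards:

```python
print(round(51/7, 1))  # 7.3
print(round(51/7, 3))  # 7.286
print(round(51/7))     # 7 (no second argument: nearest whole number)
print(round(7.8))      # 8
print(round(7.2))      # 7, so the result can be smaller than the input
```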



2. **Assigning values to variables:** The assignment operator, “=” symbol, is used to assign values (numbers or characters) to variables in Python. 

The code below takes the value 20 and assigns it to the variable named “x”. After this line is executed, the number is stored in that variable.

In [None]:
x = 20

Now call the variable "x" to see the value that was assigned to it (note that Python differentiates between lower case and upper case letters, so x and X are not the same thing). 

In [None]:
x
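
As a small illustration of case sensitivity and reassignment, the sketch below creates a second variable `X` (a name chosen here only for the example) and then updates `x`:

```python
x = 20
X = 5        # a different variable from x

print(x)     # 20
print(X)     # 5

x = x + 1    # the right-hand side is evaluated first, then stored back in x
print(x)     # 21
```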

What do you think the output of the code below is? Run it and see if you got it right.

In [None]:
x = "Python "
y = "is "
z = "awesome"
print(x + y + z)

3. **Indentation**: Indentation, the spaces at the beginning of a code line, is very important in Python. Python uses it to indicate a block of code. 

Run the code below. You should get an error. Try removing the space at the beginning of the second line and running it again.

In [None]:
sentence = "This is a text!"
 print(sentence)
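
For contrast, here is a small sketch of a place where indentation is required. The `if` statement is not covered in this section and is used only as an illustration: the indented lines form the block that belongs to it.

```python
sentence = "This is a text!"

if len(sentence) > 5:
    # These two indented lines form one block that belongs to the `if`
    print(sentence)
    print("The sentence has more than 5 characters")

# Back at the left margin: this line is outside the block and always runs
print("Done")
```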

4. **Python modules**

We can consider a module to be the same as a code library or a package that contains a set of functions that you want to include in your data analysis work. Python has many modules, and we will be using only a few of them.


Some of the modules we will use in this course are:

**pandas**: for data structures and data analysis tools  [documentation](https://pandas.pydata.org/docs/getting_started/index.html#getting-started)

**NumPy**: to perform a wide variety of mathematical operations on numerical data  [documentation](https://numpy.org/doc/stable/user/absolute_beginners.html)

**matplotlib**: for data visualisation and plotting [tutorials](https://matplotlib.org/stable/tutorials/index.html)

**seaborn**: it provides a high-level interface for drawing attractive and informative statistical graphics [gallery](http://seaborn.pydata.org/examples/index.html) 

**statsmodels**: for fitting different statistical models [documentation](https://www.statsmodels.org/stable/index.html)

When we want to use a module (and the functions in it), we need to `import` that module. Once you have imported a module, it remains loaded in the background until you restart the Jupyter kernel.

To import the module NumPy, we type the code below. Note that we give it a shorter name (`np`) so that we can call its functions more easily when we need them.


In [None]:
import numpy as np 

After running the code, nothing seems to happen, right? That is fine: the module is now loaded and we can use its functions.

Let's suppose we want to do some maths with the NumPy module and use the sine function, one of this module's many functions, to work out the sine of an angle (no worries if you don't remember much school trigonometry). We do that by calling the module's short name `np` followed by the function's name `sin`.


In [None]:
np.sin(0.5)
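
One detail worth knowing: `np.sin` expects the angle in radians, not degrees, so the `0.5` above means 0.5 radians. A minimal sketch of working with degrees instead, using NumPy's `np.radians` to convert (the variable names are only for illustration):

```python
import numpy as np

angle_in_degrees = 30
angle_in_radians = np.radians(angle_in_degrees)  # convert degrees to radians

print(np.sin(angle_in_radians))  # approximately 0.5
```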

Let's say we have a list of integers and we want to calculate their average (mean). The numbers are (34,23,12,16,10,11,16,43,51,31). NumPy has a function to do that, called `mean`.

In [None]:
np.mean([34,23,12,16,10,11,16,43,51,31])
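
To check what `np.mean` is doing, the sketch below computes the same average by hand with Python's built-in `sum` and `len` functions. The names `numbers` and `manual_mean` are chosen here only for illustration:

```python
import numpy as np

numbers = [34, 23, 12, 16, 10, 11, 16, 43, 51, 31]

manual_mean = sum(numbers) / len(numbers)  # total divided by how many values there are
print(manual_mean)        # 24.7
print(np.mean(numbers))   # 24.7
```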

5. **Commenting your code**: You can add comments to your code to explain what you have done (to yourself or to others who may read your code). Commenting your code is good practice and highly encouraged, because it makes reviewing the code later much easier. Comments start with the `#` sign: anything written after it on the same line is treated as a text comment, not code. For example:

In [None]:
# This is the list of observations after an experiment
a = [14,12,19,21,34]
# I calculate the summation of them in a variable called sum_a
sum_a = np.sum(a)
sum_a
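
Comments do not have to sit on their own line. As a short sketch, the same example can be written with comments placed after the code, and a `#` can also be used to temporarily switch off a line:

```python
import numpy as np

a = [14, 12, 19, 21, 34]   # a comment can also follow code on the same line
# print(a)                 # a commented-out line is not executed
sum_a = np.sum(a)          # total of the five observations
print(sum_a)               # 100
```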

6. **Types of variables**: The main variable types we will use in Python are "integer", "string", "float", "list", and "set". If you are not sure about the type of a variable, call the function `type` to find out. For example:

In [None]:
a = "It's a nice day!"
type(a)

In [None]:
b = 22
type(b)

In [None]:
c = 1.45
type(c)

In [None]:
d = ["Monday", "Wednesday", "Friday"]
type(d)

In [None]:
e = {11, 22, 33, 44, 55}
type(e)
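
The type of a variable affects how it behaves. As a brief sketch (the names `days_list` and `days_set` are made up for this example), a list keeps repeated values while a set keeps only one copy of each, and dividing with `/` gives a float even when the result is a whole number:

```python
days_list = ["Monday", "Monday", "Friday"]
days_set = {"Monday", "Monday", "Friday"}

print(type(days_list), len(days_list))  # a list keeps both copies of "Monday": length 3
print(type(days_set), len(days_set))    # a set keeps one copy of each value: length 2

print(type(22 / 2))  # dividing with / gives a float, even when the result is whole
```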

# Exercises

1. Assign the value of `2023` to variable `year`. Assign the value of your birth year to variable `birth`. Define a new variable that is `year-birth` and name it `age`. Print `age`. 

2. Make a "list" that includes the first six months of the year (they are strings made of characters). Name this list `first_half`. Now type and run `first_half[4]`. What is the output?

**NOTE**: Python starts counting elements of a list from zero.

What do you need to type so the output will be "april"?
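
If counting from zero feels unfamiliar, here is a short sketch with a different list (the days used in the `type` example earlier), so it does not give away the answer:

```python
days = ["Monday", "Wednesday", "Friday"]

print(days[0])    # Monday  (the first element has index 0)
print(days[2])    # Friday  (the third element has index 2)
print(len(days))  # 3, so the valid indexes are 0, 1 and 2
```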