## Part 1b: Variables and Data Types

This notebook covers some essential Python terminology and concepts: variables and the most common data types (numbers, strings, and Booleans).

#### Variables 
In the previous notebook, you encountered a so-called `variable`. A variable can be seen as a labeled box that holds a value. That value can be a number, a string, a list of several items, or any other data type.

You can give a variable (almost) any name you want. Best practice is to give your variables a name that describes what they are used for. If you're interested in the naming conventions for Python variables, you can read more about it [in the style guide](https://peps.python.org/pep-0008/). 

The `=` operator is used to `assign` a value to a variable. Let's look at the example below.

In [21]:
colors = ["red", "green", "blue"]
print(colors)

['red', 'green', 'blue']


In [22]:
pi = 3.14159
print(pi)

3.14159


Python runs your code from top to bottom, one line at a time, so the order in which you define your variables matters.

If you use a variable that is not yet defined, you will get an error.

In [23]:
print(soup)
soup = ["broth", "onion", "carrots", "celery", "noodles"]

NameError: name 'soup' is not defined
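
A minimal sketch of the fix, assuming the same `soup` list as above: define the variable first and only use it afterwards.

```python
# Define the variable first...
soup = ["broth", "onion", "carrots", "celery", "noodles"]

# ...and only then use it.
print(soup)
```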

If you re-use a variable name, the old value will be overwritten by the new value. 

In [25]:
pasta = "bad"
pasta = "good"
print(pasta)

good
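
If you still need the old value, one option is to store it under a second name before overwriting it. The name `old_pasta` below is made up for this illustration.

```python
pasta = "bad"
old_pasta = pasta   # keep a copy of the old value under a new name
pasta = "good"      # overwrite the original variable

print(old_pasta)    # bad
print(pasta)        # good
```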


#### Data Types
Python is an object-oriented programming language. This means that everything in Python is treated as a kind of object that can be manipulated. 

These objects can take the form of different `data types`. The data type of an object determines what can be done with it. The most common data types are:
- `int` for integer numbers (e.g. 1, 2, -3, 486)
- `float` for floating-point numbers (e.g. 0.5, 3.14, -0.0001)
- `str` for strings/text (e.g. "hello", "pasta", "123")
- `bool` for boolean values (True, False)



#### Differences between types
Use the function `type()` to check the type of a value or variable.

In [None]:
type(10)

int

In [None]:
type(10.10)

float

In [None]:
type("10")

str

In [None]:
type(pi)

float

Knowing the data type you're working with is important because it determines what you can do with the data. For example, you can add two integers or floats together, but you can't add an integer and a string.

In [None]:
10 + 10.10 

20.1

In [None]:
10 + "10.10"

TypeError: unsupported operand type(s) for +: 'int' and 'str'
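
One way around this error is to convert one of the values so that both have a compatible type. The sketch below uses the built-in functions `float()` and `str()`; the variable names are chosen for this example only.

```python
number = 10
text = "10.10"

# Convert the string to a float to do arithmetic
total = number + float(text)
print(total)

# Or convert the number to a string to glue the two pieces of text together
combined = str(number) + text
print(combined)
```

Note that the two results differ: the first is a number, the second is the text `1010.10`.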

The data type may also determine which functions or methods you can use on the data.

In [None]:
# You can use the `len()` function to get the length of a string: 
len("beautiful soup")

14

In [None]:
# But not the length of a number:
len(86)

TypeError: object of type 'int' has no len()
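
If you do want to count the digits of a number, a possible workaround is to convert it to a string first. This is only a sketch; `number` and `digits` are example names.

```python
number = 86
digits = str(number)   # the string "86"
print(len(digits))     # 2
```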

#### Strings

`Strings` are sequences of characters. They should always be enclosed in quotes (either single or double). If you forget the quotes, Python will think you're referring to a variable. 

Strings have a lot of built-in `methods` that you can use to manipulate the data. You can recognize a method by the dot `.` after the variable name. Let's take a look at some of them. 

In [24]:
sentence = "Dua Lipa is an English and Albanian singer, songwriter, and model."

In [None]:
sentence.lower()

'dua lipa is an english and albanian singer, songwriter, and model.'

In [None]:
sentence.upper()

'DUA LIPA IS AN ENGLISH AND ALBANIAN SINGER, SONGWRITER, AND MODEL.'

> Exercise 2.1: \
> Predict the output of the following cell. Pay attention to the case of the letters. \
> Did it match your expectations?

In [None]:
print(sentence)

> Exercise 2.2: \
> Can you add one line before the print statement to make the output uppercase? \
> (so without changing the print statement) \
> Hint: use a variable. 

In [None]:
# Your code here... 


print(sentence)

During this Summer School, we will often encounter strings since we're working with text analysis. To avoid some later errors, it's good to learn a bit more about the syntax. There is no difference between single and double quotes: 

In [2]:
print("quote")
print('quote')

quote
quote


However, if you want to use a quote inside a string, you need to use the other type of quote to enclose the string: 

In [6]:
# This will lead to an error
print('Let's go!')

SyntaxError: unterminated string literal (detected at line 2) (1354155248.py, line 2)

In [7]:
# You can fix it by using double quotes
print("Let's go!")

Let's go!


Another option is to use the escape character `\` before the quote. The escape character tells Python to ignore the special meaning of the character that follows it:

In [9]:
print('Let\'s go!')

Let's go!
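
The same trick works for double quotes. The backslash is also used for a few special combinations, such as `\n` for a new line. A short sketch with example strings made up for this illustration:

```python
# Escape a double quote inside a double-quoted string
quote = "She said \"hello\""
print(quote)

# \n starts a new line inside the string
two_lines = "Drink\nyour tea"
print(two_lines)
```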


Since strings are sequences of characters, you can also access individual characters by their index. The index starts at 0. The following table shows all characters of the sentence "Drink your tea" in the first row. The second and third rows show the positive and negative indices, respectively, for each character:

| Character      | D   | r   | i   | n   | k   | &nbsp;   | y  | o  | u  | r  |  &nbsp;  | t  | e  | a  |
|----------------|-----|-----|-----|-----|-----|----|----|----|----|----|----|----|----|----|
| Positive index | 0   | 1   | 2   | 3   | 4   | 5  | 6  | 7  | 8  | 9  | 10 | 11 | 12 | 13 |
| Negative index | -14 | -13 | -12 | -11 | -10 | -9 | -8 | -7 | -6 | -5 | -4 | -3 | -2 | -1 |


You can access a character by its index using square brackets `[]`:

In [10]:
task = "Drink your tea"
print(task[2])
print(task[-1])

i
a
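
To see how positive and negative indices relate, the sketch below combines indexing with `len()`. Because counting starts at 0, the last positive index is always one less than the length of the string.

```python
task = "Drink your tea"

print(len(task))             # 14 characters in total
print(task[13])              # a (last positive index is 14 - 1)
print(task[len(task) - 1])   # a (same character, computed from the length)
print(task[-1])              # a (same character, negative index)
```

Asking for an index that does not exist, such as `task[14]`, raises an `IndexError`.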


If you want to grab a range of characters, you can use the colon `:`. This is called `slicing`. 

Here's a quick overview of the syntax: 

| Syntax                 | Description                                                                 |
|------------------------|-----------------------------------------------------------------------------|
| `my_string[i]`         | Get the character at index `i`.                                             |
| `my_string[start:end]` | Get the substring starting at `start` and ending *before* `end`.            |
| `my_string[start:end:stepsize]` | Get all characters starting from `start`, ending before `end`, with a specific step size. |
| `my_string[:i]`        | Get the substring starting at index 0 and ending just before `i`.           |
| `my_string[i:]`        | Get the substring starting at `i` and running all the way to the end.       |

In [13]:
print(task[0:5])
print(task[-3:])

Drink
tea
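
The step size from the table is not used in the cell above, so here is a small sketch of it. It assumes the same `task` string; the other variable names are just examples.

```python
task = "Drink your tea"

middle = task[6:10]        # characters at index 6, 7, 8 and 9
every_second = task[::2]   # whole string, but only every second character
backwards = task[::-1]     # a step size of -1 walks through the string in reverse

print(middle)         # your
print(every_second)   # Dikyu e
print(backwards)      # aet ruoy knirD
```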


> Exercise 2.3: \
> Can you predict what the following cells will print? 

In [None]:


print(task[4])

In [None]:
print(task[])

#### Booleans 
Let's take a look at Boolean values. These are values that can be either `True` or `False`, and can be useful for making decisions in your code, or for checking if a condition is met. 

In [26]:
5 == 5 

True

In [32]:
1 == 2 

False

Note that `==` is used to test whether two values are equal. A single `=` is used to assign a value to a variable.

If you want to check if two values are not equal, you can use `!=`.

In [31]:
6 != 0.5

True
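
Python has more comparison operators, such as `<`, `>`, `<=` and `>=`. Each comparison results in a Boolean that you can store in a variable. The sketch below uses example names and values that are not part of the notebook.

```python
smaller = 3 < 5               # True
at_least = 3 >= 5             # False
same_text = "tea" == "Tea"    # False: string comparison is case-sensitive
same_number = 10 == 10.0      # True: an int and a float can be equal

print(smaller, at_least, same_text, same_number)
```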

Boolean conditions are often used in `if` statements. These are used to check if a condition is met, and if so, execute a block of code. 

In [37]:
temperature = 25

if temperature > 20:
    print("Eat ice cream")

Eat ice cream
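
If the condition is not met, the indented block is skipped. With `else` you can say what should happen in that case. A minimal sketch, where the variable `advice` and the lower temperature are chosen for this example:

```python
temperature = 15

if temperature > 20:
    advice = "Eat ice cream"
else:
    # This block runs because 15 > 20 is False
    advice = "Drink your tea"

print(advice)   # Drink your tea
```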


TO DO: 
- exercise with booleans: predict if true