# 1. Variables and basic types

**GRA 4142 Data Management and Python Programming, Fall 2022**  
Jan Kudlicka (jan.kudlicka@bi.no)

## Hello, world!

Let us start with a "hello, world" example, but with a more suitable message:

In [None]:
print("Welcome to the GRA 4142 Data Management and Python Programming course!")

In this very first example we have called the built-in `print` function and passed the text to be printed (i.e. *Welcome to the GRA 4142 Data Management and Python Programming course!*) as its argument specified between `(` and `)`. We can also specify several arguments (separated by commas) that will be printed on a single line with spaces between them:

In [None]:
print("Current year:", 2022)

## Using Python as a calculator

We can use the Python interpreter and cells in Jupyter notebooks as a calculator, e.g.:

In [None]:
(1 + 3 / 2) * 7.5

Note that we have not used `print` in this example: in Jupyter notebooks, the value of the last expression in a cell is shown as the cell output.

In Python we can add strings and multiply them by integers:

In [None]:
3 * "Testing, " + "can you hear me?"

**In-class exercise.** Calculate the square root of 1524157875019052100. (You can search how to do that online!)

1, 3, 2, 7.5, "Testing, " and "can you hear me?" are examples of *literals*, i.e. fixed values in source code.

## Variables

Variables are essential in (almost) any programming language: they are used to label and store values in memory (in order to use these values later).

In Python, a new variable is created by assignment:
```python
variable_name = some_value
```
For example (note that each statement is on a separate line):

In [None]:
width = 3.5   # Width of a rectangle (anything after a # on the same line is a comment, unless the # is part of a string)
height = 2.5  # Height of a rectangle
area = width * height
print("Area of a rectangle with width", width, "and height", height, "is", area)

Using an undefined variable will lead to an error:

In [None]:
width * heigth  # Note the typo in height

Variable names (also known as identifiers) may contain only letters, digits and underscores. They cannot start with a digit and they cannot be a keyword (a reserved word that has a special meaning). Variable names are case-sensitive, i.e. `area` and `Area` denote two different variables.

**A common practice is to use *snake_case* for variable names, i.e. lower case, words separated by an underscore, for example `total_area`.**

Tip: Always use nouns that describe the purpose of variables so that it is easy for other people (and you) to read your code.

In [None]:
# None of these is a legal variable name:
@name = 1                          # variable name contains an illegal character
1st_participant = 'Homer Simpson'  # variable name cannot start with a digit
from = '2021-06-21'                # "from" is a keyword
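
The rules above can also be checked from code. The following is a minimal sketch that uses the string method `isidentifier` and the standard `keyword` module; the tested names are simply the ones from the examples above.

```python
import keyword

# str.isidentifier() checks only which characters are used.
print("total_area".isidentifier())       # True
print("@name".isidentifier())            # False
print("1st_participant".isidentifier())  # False

# Keywords pass the character check, so they have to be tested separately.
print("from".isidentifier())             # True
print(keyword.iskeyword("from"))         # True
```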

Avoid using the names of existing built-in or imported functions and types, since this will make it impossible to use them:
```python
print(max(4, 2, 1, 5, 3))  # max is a built-in function returning the maximum of the passed arguments
```
will print 5, the maximum of the numbers passed to the `max` function.

However, if we now define our own variable and name it `max`:
```python
max = 10
```
we will no longer be able to use the built-in `max` function:
```python
print(max(4, 2, 1, 5, 3))
```
```
TypeError: 'int' object is not callable
```

Tip: If you are not sure if some name is already taken, try to use it or print it to see if you get a `NameError`.
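
Another way to check a name is to ask the standard `builtins` module, which holds all built-in functions and types. This is only a sketch: it covers built-in names, not names that you have imported or defined yourself, and `total_area` is just an example name.

```python
import builtins

# hasattr() tells us whether the builtins module defines the given name.
name_taken = hasattr(builtins, "max")         # True: max is a built-in function
name_free = hasattr(builtins, "total_area")   # False: this name is not a built-in

print(name_taken, name_free)

# The original function stays reachable through the module,
# even if a variable called max has been defined in the notebook.
print(builtins.max(4, 2, 1, 5, 3))
```

If you have already overwritten a built-in name in a notebook, running `del max` in a cell removes your variable, so that the built-in function becomes visible again.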

We can change the value associated with a variable, for example (note especially the last assignment):

In [None]:
x = 1
print('Value of x:', x)
x = 7.5
print('Value of x:', x)
x = x + 2
print('Value of x:', x)

The values we have seen so far belong to different types. We can use `type()` to get the type of a given value:

In [None]:
print("Type of 1:", type(1))
print("Type of 'Hello, world!':", type('Hello, world!'))
print("Type of 1.2:", type(1.2))
print("Type of the value assigned to the variable 'area':", type(area))

## Numeric types (int and float)

The *int* type (class) is used for integer numbers ($\{\dots, -2, -1, 0, 1, 2, \dots\}$), while the *float* type is used for real numbers, which are stored as floating-point approximations. (There is also a type for complex numbers, but that's beyond the scope of this course.)

The most important mathematical operations between numbers (sorted by their precedence, from highest to lowest) are:
<table>
    <tr><td>Parenthesized expression: ()</td></tr>
    <tr><td>Exponentiation: **</td></tr>
    <tr><td>Unary plus: +, unary minus: -</td></tr>
    <tr><td>Multiplication: *, division: /, floor division: //, remainder: %</td></tr>
    <tr><td>Addition: +, subtraction: -</td></tr>  
</table>

In [None]:
print('3 * -2 =', 3 * -2)
print('3 / 2 =', 3 / 2)    # Note that the result will be of float type, even if both operands were int! 
print('3 // 2 =', 3 // 2)  # Floor division, the result of the division is rounded down to the nearest integer
print('3 % 2 =', 3 % 2)    # Remainder (also known as the modulo operator)
print('3 ** 2 =', 3 ** 2)  # The square of 3
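
Two consequences of the rules above are easy to miss, so here is a short sketch with negative numbers. The variable names are chosen only for this illustration.

```python
# Exponentiation binds more tightly than the unary minus on its left.
power_first = -3 ** 2       # Evaluated as -(3 ** 2)
negated_first = (-3) ** 2   # Parentheses force the negation to happen first
print(power_first, negated_first)

# Multiplication is evaluated before addition.
mult_first = 1 + 2 * 3
print(mult_first)

# Floor division rounds down (towards minus infinity), not towards zero.
floor_negative = -3 // 2
remainder_negative = -3 % 2
print(floor_negative, remainder_negative)
```

The last two results fit together: `2 * (-3 // 2) + (-3 % 2)` gives back `-3`.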

An example of a slightly more complex *expression* (a combination of literals, variables, operators and functions that describes how to calculate a value of interest):

In [None]:
x = 10
(x + 1) ** 2

Now, consider the following example:

In [None]:
x = 0.1
y = 0.2
print("x + y =", x + y)
print("x + y is equal to 0.3:", x + y == 0.3)

**Never compare floats with `==`!** Test if the numbers are close enough (for example `abs(x + y - 0.3) < 1e-5`).

If you wonder why 0.1 + 0.2 is 0.30000000000000004, check out [this website](https://0.30000000000000004.com/).
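
The standard library also offers `math.isclose` for this kind of test. The sketch below compares it with the manual check; the tolerance `1e-5` is the one used above, while `math.isclose` uses its own default relative tolerance.

```python
import math

x = 0.1
y = 0.2

exactly_equal = x + y == 0.3              # False because of rounding errors
close_manual = abs(x + y - 0.3) < 1e-5    # True: the difference is tiny
close_isclose = math.isclose(x + y, 0.3)  # True with the default tolerances

print(exactly_equal, close_manual, close_isclose)
```

Note that `math.isclose` uses a relative tolerance by default, so when comparing with zero you need to pass an absolute tolerance, e.g. `math.isclose(value, 0, abs_tol=1e-9)`.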

**In-class exercise.** Finish the code in the following cell so that the values of the variables `x` and `y` are swapped.

In [None]:
x = 0.1
y = 0.2

# Your code here

print(x, y)

## Strings

Single-line string literals are enclosed in either single or double quotes:

In [None]:
print("Example of a string")
print('Another example of a string')

A string enclosed in single quotes may contain double quotes and vice versa. Another way to include quotes is to "escape" them with a backslash: `\"` or `\'`.

In [None]:
#print('It's ok!') will not work, the apostrophe in the text "ends" the string.
print("It's ok!")
print('"Really?", she asked.')
print('Yes, it\'s really ok...')
print('"No, sorry, I can\'t do that", she said.')

Tabs and newlines can be inserted with `\t` and `\n`, respectively:

In [None]:
print("A\tB\tC\nD\tE\tF")

There are also other escape sequences starting with `\`, so if you need to include a backslash (e.g. in file paths in Windows), either use `\\` (the first `\` escapes the second one) or use a *raw* string literal as shown in the following example:

In [None]:
path = "C:\new\text.txt"  # Not what we want!
print(path)
path = "C:\\new\\text.txt"
print(path)
path = r"C:\new\text.txt"  # A "raw" string, \ does not escape
print(path)
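
One way to see the difference between these three literals is to compare their lengths. This is a small sketch; the variable names are chosen only for this illustration.

```python
plain = "C:\new\text.txt"       # \n and \t become a newline and a tab
escaped = "C:\\new\\text.txt"   # Each \\ is a single backslash
raw = r"C:\new\text.txt"        # Raw string, backslashes are kept as they are

print(len(plain))      # 13: "\n" and "\t" count as one character each
print(len(raw))        # 15: each backslash and the following letter are separate characters
print(escaped == raw)  # True: both literals describe the same string
```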

Python also supports multi-line strings, which are enclosed in "triple quotes":

In [None]:
print('''A
B
C''')

print()  # Prints an empty line

print("""D
E
F""")

In [None]:
# If the opening triple quote is followed by \, the string starts on the next line.
# The following string is exactly the same as the first string in the previous code cell.
print('''\
A
B
C''')
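
The position of the closing triple quote matters as well. The following sketch compares three ways of writing the literal (the variable names are only illustrative):

```python
first = '''A
B
C'''

# The backslash after the opening quotes removes the first line break.
second = '''\
A
B
C'''

# Closing quotes on a line of their own add a newline at the end of the string.
third = '''\
A
B
C
'''

print(first == second)  # True
print(first == third)   # False
print(repr(third))      # 'A\nB\nC\n'
```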

We can use the built-in function `len` to determine the length of a string:

In [None]:
string = "Hello, world!"
len(string)

### Indexing and slicing

A string is a sequence of characters (letters, digits, punctuation marks, white spaces and control characters such as `\n`):

<table border="1" >
    <tr>
        <td width="35">0</td>
        <td width="35">1</td>
        <td width="35">2</td>
        <td width="35">3</td>
        <td width="35">4</td>
        <td width="35">5</td>
        <td width="35">6</td>
        <td width="35">7</td>
        <td width="35">8</td>
        <td width="35">9</td>
        <td width="35">10</td>
        <td width="35">11</td>
        <td width="35">12</td>
    </tr>
    <tr>
        <td>H</td>
        <td>e</td>
        <td>l</td>
        <td>l</td>
        <td>o</td>
        <td>,</td>
        <td>&nbsp;</td>
        <td>w</td>
        <td>o</td>
        <td>r</td>
        <td>l</td>
        <td>d</td>
        <td>!</td>
    </tr>
    <tr>
        <td>-13</td>
        <td>-12</td>
        <td>-11</td>
        <td>-10</td>
        <td>-9</td>
        <td>-8</td>
        <td>-7</td>
        <td>-6</td>
        <td>-5</td>
        <td>-4</td>
        <td>-3</td>
        <td>-2</td>
        <td>-1</td>
    </tr>
</table>

Each character can be accessed with the index operator (`[]`):

In [None]:
print("string[0] =", string[0])    # Note that the first character has index 0!
print("string[1] =", string[1])
print("string[12] =", string[12])
print("string[13] =", string[13])  # This leads to an IndexError!

We can also use negative indexes: -1 will refer to the last character, -2 to the second-last one, etc.:

In [None]:
print("string[-1] =", string[-1])
print("string[-5] =", string[-5])

We can also get a slice (part, substring) of a string using the slice operator (`[:]`) and specifying the start index (inclusive) and the end index (exclusive):

In [None]:
print("string[1:3] =", string[1:3])
print("string[:5] =", string[:5])    # Not specifying the start index means "from the beginning"
print("string[7:] =", string[7:])    # Not specifying the end index means "until the end"
print("string[-6:] =", string[-6:])  # We can use negative indexes too
print("string[-6:-1] =", string[-6:-1])
# With slices, out-of-range indexes do not lead to an error (here the result is an empty string ""):
print("string[20:] =", string[20:])
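
A few more cases, as a sketch of how slices treat out-of-range indexes: they are cut off at the beginning or the end of the string instead of raising an error.

```python
string = "Hello, world!"

clipped_end = string[7:100]     # The end index is cut off at the end of the string
clipped_start = string[-100:5]  # The start index is cut off at the beginning
empty = string[5:2]             # The start lies after the end, so the slice is empty

print(clipped_end)
print(clipped_start)
print(empty == "")

# Splitting a string at any index and gluing the parts together gives the original.
i = 5
print(string[:i] + string[i:] == string)
```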

Strings are **immutable**: we are not able to alter an existing string (but we can create a new one and assign it to the same variable). The following code will lead to an error:

In [None]:
string[5] = ';'

### Concatenation and repetition

We have already seen that we can concatenate strings by using `+` and we can repeat a string by multiplying it by an integer:

In [None]:
start = "Hello"
end = "world!"
string = start + ", " + end
print(string)
print("=" * 30)
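
Note that `+` only concatenates a string with another string. The sketch below assumes we want to append a number to a text, and converts it with the built-in `str` function first.

```python
year = 2022

# "Current year: " + year would raise a TypeError,
# because a string and an int cannot be concatenated.
message = "Current year: " + str(year)
print(message)

# Multiplying a string by 0 gives an empty string.
nothing = "=" * 0
print(len(nothing))
```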

## Introduction to lists

**A list is an ordered collection of values**. Lists are often called arrays in other programming languages. Note that in Python, the individual elements might be of different types.

A list is created by specifying its elements, separated by commas, between square brackets (`[` and `]`):

In [None]:
lst1 = []  # Empty list
lst2 = [1, 2, 3]
lst3 = [1.2]
lst4 = [["A", "B"], True]  # Elements can be of any type, even lists (or dictionaries that we will cover later)
lst5 = [2, False, 3.14, "Hello"]

We can use the `len` function to return the number of elements:

In [None]:
lst = [2, False, 3.14, "Hello"]
len(lst)

Similar to strings (which are sequences of characters), we can use the index `[]` and slice `[:]` operators to access individual elements and slices:

In [None]:
lst = ['A', 'B', 'C', 'D']
# Exercise: can we name the variable `list` instead of `lst`?

print(lst[0])    # First element
print(lst[-1])   # Last element
print(lst[2:4])  # A slice between index 2 (inclusive) and 4 (exclusive)

Unlike strings, which are immutable, **lists are mutable** so we can change the individual elements:

In [None]:
lst = ['A', 'B', 'C', 'D']
lst[0] = 'E'  # Change the first element
lst[-2] = 'F' # Change the next-to-last element
print(lst)

We can also update a whole slice at once:

In [None]:
lst = ['A', 'B', 'C', 'D']
lst[1:3] = ['E', 'F']  # Elements B and C will be replaced by E and F
print(lst)

When updating a slice, the new value can have a different length than the slice:

In [None]:
lst = ['A', 'B', 'C', 'D']
lst[1:3] = ['E', 'F', 'G']  # Elements B and C will be replaced by E, F and G
print(lst)

In [None]:
lst = ['A', 'B', 'C', 'D']
lst[1:3] = []  # Elements B and C will be removed.
print(lst)

Updating slices can be used to append new elements at the end of a list as well:

In [None]:
lst = ['A', 'B', 'C', 'D']
lst[len(lst):] = ['E', 'F']
print(lst)
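
The same idea works at any position: assigning to an empty slice inserts new elements without replacing anything. A short sketch (the inserted values are arbitrary):

```python
lst = ['A', 'B', 'C', 'D']

lst[0:0] = ['X']       # Insert at the beginning
print(lst)             # ['X', 'A', 'B', 'C', 'D']

lst[2:2] = ['Y', 'Z']  # Insert before the element at index 2
print(lst)             # ['X', 'A', 'Y', 'Z', 'B', 'C', 'D']
```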

We will return to lists later and see other (and more ergonomic) ways to insert and remove elements, but let us already mention `append`, one of the most useful list methods, which adds a (single) element to the end of a list:

In [None]:
lst = ['A', 'B', 'C', 'D']
lst.append('E')
print(lst)
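
The word "single" is important here. The following sketch contrasts `append` with the slice assignment shown earlier when the new value is itself a list; the variable names are only illustrative.

```python
appended = ['A', 'B']
appended.append(['C', 'D'])           # The whole list becomes one new element
print(appended)                       # ['A', 'B', ['C', 'D']]
print(len(appended))                  # 3

extended = ['A', 'B']
extended[len(extended):] = ['C', 'D']  # Each element is added separately
print(extended)                        # ['A', 'B', 'C', 'D']
print(len(extended))                   # 4
```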

## Splitting strings and joining lists of strings

If a string contains elements separated by a separator (e.g. a comma), we can obtain the list of all elements with the `split` method:

In [None]:
data = "Homer, Marge, Bart, Lisa, Maggie"
elements = data.split(", ")  # Separated by a comma followed by a space
elements

**In-class exercise**: Change the separator in the previous code to `,`. What happens if you re-run the cell? Try also to change the separator to a character (or a string) that does not occur in the given string at all and answer the same question.

If you don't specify the separator, the string will be split on runs of whitespace (spaces, tabs, newlines), and no empty strings will appear in the result:

In [None]:
elements = "   Homer  Marge Bart       Lisa Maggie  ".split()
elements
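
To see what "no separator" changes, the sketch below splits the same (shortened) text with and without an explicit single-space separator:

```python
text = "  Homer  Marge "

without_separator = text.split()
with_separator = text.split(" ")

print(without_separator)  # ['Homer', 'Marge']
print(with_separator)     # ['', '', 'Homer', '', 'Marge', '']
```

With an explicit separator, every single occurrence splits the string, so neighbouring spaces produce empty strings.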

The opposite is joining elements of a list with the `join` method:

In [None]:
', and '.join(elements)
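
As a small sketch of how the two methods fit together: splitting a string and joining the result with the same separator gives back the original string, while joining with another separator changes the formatting.

```python
data = "Homer, Marge, Bart, Lisa, Maggie"
elements = data.split(", ")

rejoined = ", ".join(elements)
print(rejoined == data)  # True

dashed = " - ".join(elements)
print(dashed)            # Homer - Marge - Bart - Lisa - Maggie
```

Note that `join` is called on the separator string, and the list is passed as its argument.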

**In-class exercise:** Strings are immutable, so we cannot change them. But we can always create a new string by using the above-mentioned operators, and assign it to the same variable. Write a program that "replaces" the first character in a given string (say "Python") with another string (say "Br").

In [None]:
string = "Python"
replacement = "Br"

# Your code here (must work with any `string` and `replacement`)

print(string)