# Introduction to Python
Welcome to your first Python notebook! In this notebook, we will explore:
- Variables
- Basic Python data types
- Fundamental data structures: lists, tuples, sets, and dictionaries

Python is widely used in data science, machine learning, and automation. Let's start with the very basics.

---

## Section 0: Variables
A variable is like a container that stores a value. You can name a variable and assign it a value using the `=` operator.

Variable names should be descriptive and cannot start with a number.

Variables in computer science differ from those in mathematics.

- **Definition**: A variable is a named memory location used to store data.
- **Mutability**: Its value can change during program execution.
- **Python specifics**: Python is dynamically typed; the type is inferred at assignment and can change later.

In Python, a variable lets you refer to a value by name. To create a variable, use `=`, as in this example:

`x = 5`

You can now use the name of this variable, `x`, instead of the actual value, `5`.

Remember: `=` in Python means assignment; it does not test equality!
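To make the difference between assignment and an equality test concrete, here is a minimal sketch. The names `is_five` and `is_six` are chosen only for this illustration.

```python
x = 5              # assignment: the name x now refers to the value 5
is_five = x == 5   # comparison: == tests equality and yields True or False
is_six = x == 6

print(is_five)  # True
print(is_six)   # False
```

The single `=` stores a value under a name, while the double `==` asks a question about two values and produces a boolean.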

**Variables** are created with the **[assignment statement <img height="12" style="display: inline-block" src="../static/link/to_py.png">](https://docs.python.org/3/reference/simple_stmts.html#assignment-statements)** `=`, which is *not* an operator because of its *side effect* of making a **[name <img height="12" style="display: inline-block" src="../static/link/to_py.png">](https://docs.python.org/3/reference/lexical_analysis.html#identifiers)** reference an object in memory.

We read the terms **variable**, **name**, and **identifier** used interchangeably in many Python-related texts. In this notebook, we adopt the following convention: First, we treat *name* and *identifier* as perfect synonyms but only use the term *name* in the text for clarity. Second, whereas *name* only refers to a string of letters, numbers, and some other symbols, a *variable* means the combination of a *name* and a *reference* to an object in memory.


In [2]:
message = "Hello, Python!"
age = 25
pi = 3.14159
is_student = True

In [None]:
print("Message:", message, type(message))
print("Age:", age, type(age))
print("Pi:", pi, type(pi))
print("Is student:", is_student, type(is_student))

Message: Hello, Python! <class 'str'>
Age: 25 <class 'int'>
Pi: 3.14159 <class 'float'>
Is student: True <class 'bool'>


### Variables (Mathematics)

A **variable** is a placeholder in an equation (e.g., `x + 2 = 5`).

<img src="figures/fig1.png" alt="Variables in math vs programming" width="800"/>

<br>
<sub><i>Source: Introductory to Python course, Offenburg University of Applied Sciences</i></sub>


### Variable naming conventions

Variable names may contain upper and lower case letters, numbers, and underscores (i.e., `_`) and be as long as we want them to be. However, they must not begin with a number. Also, they must not be any of Python's built-in **[keywords <img height="12" style="display: inline-block" src="../static/link/to_py.png">](https://docs.python.org/3/reference/lexical_analysis.html#keywords)** like `for` or `if`.

Variable names should be chosen such that they do not need any more documentation and are self-explanatory. A widespread convention is to use so-called **[snake\_case <img height="12" style="display: inline-block" src="../static/link/to_wiki.png">](https://en.wikipedia.org/wiki/Snake_case)**: Keep everything lowercase and use underscores to separate words.

See this [link <img height="12" style="display: inline-block" src="../static/link/to_wiki.png">](https://en.wikipedia.org/wiki/Naming_convention_%28programming%29#Python_and_Ruby) for a comparison of different naming conventions.
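The naming rules above can be checked programmatically. The sketch below uses the standard library's `keyword` module together with `str.isidentifier()`; the candidate names and the `validity` dictionary are made up for this illustration.

```python
import keyword

candidates = ["student_age", "2nd_place", "for", "total_price"]
validity = {}

for candidate in candidates:
    # isidentifier() checks the character rules; iskeyword() checks reserved words
    validity[candidate] = candidate.isidentifier() and not keyword.iskeyword(candidate)

print(validity)
# {'student_age': True, '2nd_place': False, 'for': False, 'total_price': True}
```

`2nd_place` fails because it starts with a number, and `for` fails because it is a keyword.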

Use meaningful names:

- `area = 10` (good)
- `a = 10` (bad)

Naming styles:

- `snake_case`: `variable_one`
- `camelCase`: `variableOne`
- `PascalCase`: `VariableOne`

Stick to one style throughout your code.  
Avoid special characters: ä → ae, ö → oe, ü → ue.

#### Good examples

In [None]:
variable_one = 1
variableOne = 1
VariableOne = 1

#### Bad examples

In [6]:
PI = 3.14159 # all caps is conventionally reserved for global constants; otherwise not recommended
answerToEverything = 42 # camelCase is common in languages like JavaScript, but not recommended in Python
name = "Alexandar" # name of what?
type_ = "student" # a trailing underscore only avoids shadowing the built-in type(); a descriptive name is clearer

### Input from user
The `input()` function always returns the user's input as a string.
For numeric input, convert the result explicitly with `int()` or `float()`.

In [None]:
# Uncomment below lines to test interactively
#name = input("Enter your name: ")
#age = int(input("Enter your age: "))
#print(f"Hello {name}, you are {age} years old.")

### Reassigning Variables
You can overwrite a variable by assigning it a new value.

A variable may be re-assigned as often as we wish. In doing so, we may even assign an object of a different type. Because this is allowed, Python is said to be a dynamically typed language. In contrast, a statically typed language like C also allows re-assignment but only with objects of the same type. This subtle distinction is one reason why Python is slower at execution than C: As it runs a program, it needs to figure out an object's type each time it is referenced.

In [1]:
x = 10
print("Initial x:", x)
x = 20
print("Reassigned x:", x)

Initial x: 10
Reassigned x: 20
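The cell above reassigns `x` with another integer. As a small sketch of dynamic typing, the same name can also be re-bound to an object of a different type. The variable names here are chosen only for illustration.

```python
value = 10
first_type = type(value)
print(value, first_type)    # 10 <class 'int'>

value = "ten"               # same name, now bound to a string
second_type = type(value)
print(value, second_type)   # ten <class 'str'>
```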


---

## Section 1: Basic Data Types

What are Data Types?

In Python (and programming in general), every value has a type. Data types define:
- What kind of value something is (a number, text, etc.)
- What operations you can perform with it

The data to be processed can be of very different types, e.g., numbers that can be used for calculations, or character strings that can be concatenated.  
A data type describes a set of data objects that all have the same structure and can be operated on using the same operations.

Python is **dynamically typed**, meaning you don’t need to declare a type explicitly — Python figures it out based on the assigned value.  
Python has several built-in data types. The most commonly used are:
- Integers (`int`): whole numbers like 5, -1, 0
- Floating point numbers (`float`): numbers with decimals like 3.14
- Strings (`str`): sequences of characters like "Hello"
- Boolean (`bool`): `True` or `False` values used for logic

<p align="center">
  <img src="figures/datatypes.png" alt="Python Data Types" width="500"><br>
  <em>Overview of common Python data types</em>
</p>


In [11]:
x = 10 
print("x:", x, type(x))

y = 3.14
print("y:", y, type(y))

name = "Alice"
print("Name:", name, type(name))

is_raining = False
print("Is it raining?", is_raining, type(is_raining))

x: 10 <class 'int'>
y: 3.14 <class 'float'>
Name: Alice <class 'str'>
Is it raining? False <class 'bool'>


### Integers (`int`)

**Integer variables in Python:**
- Whole numbers without decimal points.
- Can be positive or negative.
- Often used for counters, indexes, or calculations that don’t require fractions.
- Python integers can be arbitrarily large. Python 3 has a single `int` type whose internal representation grows as needed, limited only by available memory.


In [22]:
a = 42          # positive integer
b = -17         # negative integer
c = 1_000_000   # underscores for readability
d = 9999999999999999999999999999999999  # very large integer

print("a:", a, "type:", type(a))
print("b:", b, "type:", type(b))
print("c:", c, "type:", type(c))
print("d:", d, "type:", type(d))


a: 42 type: <class 'int'>
b: -17 type: <class 'int'>
c: 1000000 type: <class 'int'>
d: 9999999999999999999999999999999999 type: <class 'int'>
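As a further sketch of arbitrarily large integers, the result of a big power is still an ordinary `int`. The name `big` is only illustrative.

```python
big = 2 ** 100            # far beyond the range of a 64-bit integer
print(big)
print(type(big))          # <class 'int'>
print(big.bit_length())   # 101 bits are needed to represent this value
```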


### Floating-Point Numbers (`float`)

**Float variables in Python:**
- Numbers that can contain decimal places.
- Can be positive or negative, and also written in scientific notation.
- Used for calculations requiring fractional values, such as physics computations or financial models.
- In Python, there is no distinction between single precision and double precision floats like in C — all Python floats are double precision.


In [23]:
x = 3.14159        # decimal number
y = -0.01          # negative float
z = 2.5e3          # scientific notation (2.5 × 10³)

print("x:", x, "type:", type(x))
print("y:", y, "type:", type(y))
print("z:", z, "type:", type(z))


x: 3.14159 type: <class 'float'>
y: -0.01 type: <class 'float'>
z: 2500.0 type: <class 'float'>
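Because floats are stored in binary with limited precision, some decimal values cannot be represented exactly. The following sketch shows the classic example and one common way to compare floats, using `math.isclose` from the standard library.

```python
import math

total = 0.1 + 0.2
print(total)                     # 0.30000000000000004
print(total == 0.3)              # False
print(math.isclose(total, 0.3))  # True
```

For this reason, comparing floats with a tolerance is usually safer than testing them with `==`.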


### Booleans (`bool`)

**Boolean variables in Python:**
- Can have only two values: `True` or `False`.
- Often used for logical evaluations and conditions.
- Fundamental for controlling program flow in loops or conditional statements.
- Boolean values behave like integers in many cases (`False` = 0, `True` = 1).

In [24]:
is_active = True
is_admin = False
from_int = bool(1)    # True because 1 is non-zero
from_zero = bool(0)   # False because 0 is zero

print("is_active:", is_active, "type:", type(is_active))
print("is_admin:", is_admin, "type:", type(is_admin))
print("from_int:", from_int, "type:", type(from_int))
print("from_zero:", from_zero, "type:", type(from_zero))

if is_active:
    print("The account is active.")


is_active: True type: <class 'bool'>
is_admin: False type: <class 'bool'>
from_int: True type: <class 'bool'>
from_zero: False type: <class 'bool'>
The account is active.
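The integer-like behaviour of booleans mentioned above can be sketched as follows. The `attendance` list is a made-up example.

```python
print(isinstance(True, int))  # True: bool is a subclass of int
print(True + True)            # 2

attendance = [True, False, True, True]  # hypothetical attendance record
days_present = sum(attendance)          # each True counts as 1
print(days_present)                     # 3
```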


### Strings (`str`)

**Strings in Python:**
- A sequence of characters enclosed in single (`' '`) or double (`" "`) quotes.
- Can include letters, numbers, special characters, and spaces.
- Commonly used for storing text, processing strings, file names, user inputs, and more.

In [25]:
name = "Klaus"
greeting = 'Hello, World!'
multi_line = """This is
a multi-line
string."""

print("name:", name, "type:", type(name))
print("greeting:", greeting)
print("multi_line:\n", multi_line)

full_message = greeting + " My name is " + name + "."
print(full_message)

repeat = "Hi! " * 3
print(repeat)


name: Klaus type: <class 'str'>
greeting: Hello, World!
multi_line:
 This is
a multi-line
string.
Hello, World! My name is Klaus.
Hi! Hi! Hi! 


### Type Conversion (Casting) 
In Python, type conversion (also called casting) is the process of converting a value from one data type to another.  
This is useful when you need to perform operations that require values to be in a specific type — for example, converting user input (which is always a string) into a number for calculations.  

Conversion functions:
- `int(x)`: convert to integer
- `float(x)`: convert to float
- `str(x)`: convert to string
- `bool(x)`: convert to boolean


Why is type conversion important?
Python is dynamically typed, but sometimes you need to change the type manually. For example:

In [None]:
age = input("Enter your age: ")     # input() returns a string
print(age + 1)                      #  This will cause an error

In [14]:
# to fix it 
age = int(input("Enter your age: "))  # convert input to integer
print(age + 1)                      # now this works    

13
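The conversion above fails with a `ValueError` if the user types something that is not a whole number. One possible way to handle this is sketched below. The helper `parse_age` is hypothetical and not part of the notebook, and it takes the text as an argument so the example runs without interactive input.

```python
def parse_age(text):
    """Return the age as an int, or None if the text is not a whole number."""
    try:
        return int(text)
    except ValueError:
        # int() raises ValueError for text such as "abc" or "12.5"
        return None

print(parse_age("12"))    # 12
print(parse_age("abc"))   # None
print(parse_age("12.5"))  # None
```

In an interactive session, the argument could come from `input("Enter your age: ")`.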


In [15]:
now_an_int = int(3.14)
also_an_int = int(3.99)
float_example = float(10)
string_example = str(1000)
bool_false = bool(0)
bool_true = bool(1)

print("Type conversions:")
print("int(3.14):", now_an_int)
print("int(3.99):", also_an_int)
print("float(10):", float_example)
print("str(1000):", string_example)
print("bool(0):", bool_false)
print("bool(1):", bool_true)

Type conversions:
int(3.14): 3
int(3.99): 3
float(10): 10.0
str(1000): 1000
bool(0): False
bool(1): True
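A few conversion results are easy to get wrong. The sketch below collects some of them; all variable names are illustrative.

```python
# int() truncates toward zero; round() rounds to the nearest integer
truncated = int(3.99)
truncated_negative = int(-3.99)
rounded = round(3.99)

# Any non-empty string is truthy, even the text "False"
from_text = bool("False")
from_empty = bool("")

# A decimal string must go through float() before int()
from_decimal_text = int(float("3.99"))

print(truncated, truncated_negative, rounded)  # 3 -3 4
print(from_text, from_empty)                   # True False
print(from_decimal_text)                       # 3
```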


### Example 2
Using the + operator to paste together two strings can be very useful in building custom messages.

Suppose, for example, that you've calculated the return on your investment and want to summarize the results in a string. Assuming the numbers `savings` and `result` are defined, you can try something like this:

```python
print("I started with $" + savings + " and now have $" + result + ". Awesome!")
```  
This will not work, though: Python raises a `TypeError` because `+` cannot join a string and a number.

To fix the error, explicitly convert the types of your variables. More specifically, use `str()` to convert a value into a string. `str(savings)`, for example, converts the number `savings` to a string.

Similar functions such as `int()`, `float()`, and `bool()` convert Python values into other types.

In [16]:
# Definition of savings and result
savings = 100
result = 100 * 1.10 ** 7

# Fix the printout
print("I started with $" + str(savings) + " and now have $" + str(result) + ". Awesome!")

# Definition of pi_string
pi_string = "3.1415926"

# Convert pi_string into float: pi_float
pi_float = float(pi_string)

I started with $100 and now have $194.87171000000012. Awesome!
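The long tail of digits in the output above comes from floating-point arithmetic. As an alternative sketch, an f-string (the same syntax used in the `input()` example earlier) can embed the values directly and round the display to two decimal places. The name `message` is only illustrative.

```python
savings = 100
result = 100 * 1.10 ** 7

# {result:.2f} formats the float with two digits after the decimal point
message = f"I started with ${savings} and now have ${result:.2f}. Awesome!"
print(message)  # I started with $100 and now have $194.87. Awesome!
```

With an f-string, no explicit `str()` call is needed because the values are converted during formatting.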


---

## Section 2: Data Structures
Python provides powerful structures to store and manipulate collections of data.

We will cover:
- List: an ordered, changeable collection. Allows duplicate members.
- Tuple: an ordered, unchangeable collection. Allows duplicate members.
- Set: an unordered, unindexed collection. No duplicate members.
- Dictionary: a changeable collection of key-value pairs that keeps insertion order. No duplicate keys.

### Lists
- Ordered: items have a defined index
- Mutable: you can change, add, or remove items
- Allows duplicates

Lists are used to store multiple items in a single variable.

#### Create a list
As opposed to int, bool etc., a list is a compound data type; you can group values together:

a = "is"
b = "nice"
my_list = ["my", "list", a, b]


In [19]:
# Student grade variables
alice = 85
bob = 90
carol = 78
dave = 92

# Create the list grades
grades = [alice, bob, carol, dave]

# Print grades
print("Grades:", grades)

Grades: [85, 90, 78, 92]


#### Create list with different types 
A list can contain any Python type. Although it's not really common, a list can also contain a mix of Python types including strings, floats, booleans, etc.

The printout of the previous exercise wasn't really satisfying. It's just a list of numbers representing the grades, but you can't tell which grade belongs to which student.

In [20]:
# grade variables (out of 100)
alice = 85.5
bob = 90.0
carol = 78.0
dave = 92.5
eve = 88.0

# Adapt list grades with names
grades = ["Alice", alice, "Bob", bob, "Carol", carol, "Dave", dave, "Eve", eve]

# Print grades
print(grades)


['Alice', 85.5, 'Bob', 90.0, 'Carol', 78.0, 'Dave', 92.5, 'Eve', 88.0]


#### List of Lists
As a data scientist, you'll often be dealing with a lot of data, and it will make sense to group related data together.

Instead of creating a flat list containing strings and floats, representing the names and grades of students, you can create a list of lists, where each inner list contains a student's name and their corresponding grade.

Don't get confused here: "Alice" is a string, while alice is a variable that holds the float 85.5 you specified earlier.

In [21]:
# grade variables (out of 100)
alice = 85.5
bob = 90.0
carol = 78.0
dave = 92.5
eve = 88.0

# student information as list of lists
students = [["Alice", alice],
            ["Bob", bob],
            ["Carol", carol],
            ["Dave", dave],
            ["Eve", eve]]

# Print out students
print(students)

# Print out the type of students
print(type(students))


[['Alice', 85.5], ['Bob', 90.0], ['Carol', 78.0], ['Dave', 92.5], ['Eve', 88.0]]
<class 'list'>


#### Subset and conquer
Subsetting Python lists is a piece of cake. Take the code sample below, which creates a list x and then selects "b" from it. Remember that this is the second element, so it has index 1. You can also use negative indexing.

```python
x = ["a", "b", "c", "d"]
x[1]
x[-3] # same result!
```
Remember the `grades` list from before, containing both strings and floats? Can you add the correct code to do some Python subsetting?

In [22]:
# Create the grades list
grades = ["Alice", 85.5, "Bob", 90.0, "Carol", 78.0, "Dave", 92.5, "Eve", 88.0]

# Print out second element from grades (Alice's grade)
print(grades[1])

# Print out last element from grades (Eve's grade)
print(grades[-1])

# Print out Carol's grade
print(grades[5])


85.5
88.0
78.0


#### Subset and calculate

After you've extracted values from a list, you can use them to perform additional calculations. Take this example, where the second and fourth elements of a list `x` are extracted. The strings that result are pasted together using the `+` operator:

```python
x = ["Alice", "scored ", "Bob", "90"]
print(x[1] + x[3])
```

In [23]:
# Create the grades list
grades = ["Alice", 85.5, "Bob", 90.0, "Carol", 78.0, "Dave", 92.5, "Eve", 88.0]

# Sum of Bob's and Dave's grades: group_total
group_total = grades[3] + grades[7]

# Print the variable group_total
print(group_total)


182.5


#### Slicing and dicing

Selecting single values from a list is just one part of the story. It's also possible to slice your list, which means selecting multiple elements from your list. Use the following syntax:

`my_list[start:end]`  
The start index will be included, while the end index is not.

The code sample below shows an example. A list containing `"b"` and `"c"`, the elements at indexes 1 and 2, is selected from a list `x`:

```python
x = ["a", "b", "c", "d"]
x[1:3]
```


In [26]:
# Create the grades list
grades = ["Alice", 85.5, "Bob", 90.0, "Carol", 78.0, "Dave", 92.5, "Eve", 88.0]

# Use slicing to create group_1 (first 3 students)
group_1 = grades[:6]

# Use slicing to create group_2 (last 2 students)
group_2 = grades[6:10]

# Print out both groups
print(group_1, group_2)


['Alice', 85.5, 'Bob', 90.0, 'Carol', 78.0] ['Dave', 92.5, 'Eve', 88.0]
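Slices also accept an optional third value, the step, written as `my_list[start:end:step]`. As a sketch, this can separate the names from the grades in the flat list. The names `names` and `scores` are chosen for this example.

```python
grades = ["Alice", 85.5, "Bob", 90.0, "Carol", 78.0, "Dave", 92.5, "Eve", 88.0]

names = grades[::2]    # every second element, starting at index 0
scores = grades[1::2]  # every second element, starting at index 1

print(names)   # ['Alice', 'Bob', 'Carol', 'Dave', 'Eve']
print(scores)  # [85.5, 90.0, 78.0, 92.5, 88.0]
```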


#### Subsetting lists of lists

You saw before that a Python list can contain practically anything — even other lists! To subset lists of lists, you can use the same technique as before: square brackets.


In [27]:
x = [["a", "b", "c"],
     ["d", "e", "f"],
     ["g", "h", "i"]]
print(x[2][0])
print(x[2][:2])

g
['g', 'h']


`x[2]` results in a list, which you can subset again by adding additional square brackets.

In [28]:
students = [["Alice", 85.5],
            ["Bob", 90.0],
            ["Carol", 78.0],
            ["Dave", 92.5],
            ["Eve", 88.0]]

# What will `students[-1][1]` return?

In [30]:
print(students[-1][1])

88.0


#### Replace list elements

Replacing list elements is pretty easy. Simply subset the list and assign new values to the subset. You can select single elements or change entire list slices at once.


In [31]:
x = ["a", "b", "c", "d"]
x[1] = "r"
x[2:] = ["s", "t"]
print(x)

['a', 'r', 's', 't']


In [32]:
grades = ["Alice", 85.5, "Bob", 90.0, "Carol", 78.0, "Dave", 92.5, "Eve", 88.0]

# Correct Eve's grade
grades[-1] = 91.0

# Change "Carol" to "Caroline"
grades[4] = "Caroline"

print(grades)


['Alice', 85.5, 'Bob', 90.0, 'Caroline', 78.0, 'Dave', 92.5, 'Eve', 91.0]


#### Extend a list

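If you can change elements in a list, you can also add elements to it using the `+` operator. Note that `+` builds a new list and leaves the original unchanged, which is why the result is assigned to a new variable below.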


In [33]:
grades = ["Alice", 85.5, "Bob", 90.0, "Caroline", 78.0,
          "Dave", 92.5, "Eve", 91.0]

# Add Frank's data
grades_1 = grades + ["Frank", 84.0]

# Add Grace's data
grades_2 = grades_1 + ["Grace", 89.5]

print(grades_2)


['Alice', 85.5, 'Bob', 90.0, 'Caroline', 78.0, 'Dave', 92.5, 'Eve', 91.0, 'Frank', 84.0, 'Grace', 89.5]
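Besides `+`, lists have the methods `append()` and `extend()`, which modify the list in place. The sketch below contrasts the two approaches on a shortened, illustrative list.

```python
grades = ["Alice", 85.5, "Bob", 90.0]

# + builds a new list; grades itself is unchanged
longer = grades + ["Carol", 78.0]
print(len(grades), len(longer))  # 4 6

# extend() adds several elements in place; append() adds exactly one
grades.extend(["Carol", 78.0])
grades.append("Dave")
print(grades)  # ['Alice', 85.5, 'Bob', 90.0, 'Carol', 78.0, 'Dave']
```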


#### Delete list elements

You can remove elements from your list using the `del` statement:


In [35]:
grades = ["Alice", 85.5, "Bob", 90.0, "Caroline", 78.0,
          "Dave", 92.5, "Eve", 91.0, "Frank", 84.0, "Grace", 89.5]

del grades[10:12]
print(grades)


['Alice', 85.5, 'Bob', 90.0, 'Caroline', 78.0, 'Dave', 92.5, 'Eve', 91.0, 'Grace', 89.5]
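In the cell above, `del grades[10:12]` removes the two elements at indexes 10 and 11, that is, Frank's name and grade. Lists also offer the methods `pop()` and `remove()`; a minimal sketch on an illustrative list of scores:

```python
scores = [85.5, 90.0, 78.0, 92.5]

last = scores.pop()    # removes and returns the last element
scores.remove(90.0)    # removes the first occurrence of the value 90.0
del scores[0]          # removes the element at index 0

print(last)    # 92.5
print(scores)  # [78.0]
```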


#### Inner workings of lists

The code below creates a list `grades` and assigns it to `grades_copy`. Modifying `grades_copy` also changes `grades`, because both refer to the same list.


In [36]:
grades = [85.5, 90.0, 78.0, 92.5, 91.0]
grades_copy = grades
grades_copy[0] = 50.0

print(grades)
print(grades_copy)


[50.0, 90.0, 78.0, 92.5, 91.0]
[50.0, 90.0, 78.0, 92.5, 91.0]


In [37]:
# To avoid this, make an explicit copy using list() or slicing (grades[:]).

grades = [85.5, 90.0, 78.0, 92.5, 91.0]
grades_copy = list(grades)
grades_copy[0] = 50.0

print(grades)
print(grades_copy)

[85.5, 90.0, 78.0, 92.5, 91.0]
[50.0, 90.0, 78.0, 92.5, 91.0]
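One caveat: `list()` and slicing make a shallow copy, so inner lists are still shared between the original and the copy. For a list of lists, `copy.deepcopy` from the standard library copies the inner lists as well. A sketch with illustrative data:

```python
import copy

students = [["Alice", 85.5], ["Bob", 90.0]]

shallow = list(students)         # new outer list, same inner lists
deep = copy.deepcopy(students)   # new outer list and new inner lists

shallow[0][1] = 50.0             # changes the inner list shared with students

print(students[0])  # ['Alice', 50.0]
print(deep[0])      # ['Alice', 85.5]
```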


### Tuples
- Ordered
- Immutable: once created, it cannot be changed
- Allows duplicates

Tuples are useful for fixed data collections (e.g., GPS coordinates).

In [38]:
tuple_example = (1, 2, 3)
print("Tuple:", tuple_example)
print("First element:", tuple_example[0])

Tuple: (1, 2, 3)
First element: 1
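To illustrate immutability, the sketch below tries to change a tuple element and catches the resulting `TypeError`. It also shows unpacking a tuple into separate names. The coordinates and the flag `was_rejected` are made up for this example.

```python
coordinates = (48.47, 7.94)  # hypothetical latitude and longitude
was_rejected = False

try:
    coordinates[0] = 0.0     # tuples do not support item assignment
except TypeError as error:
    was_rejected = True
    print("Cannot modify a tuple:", error)

# Unpacking assigns each element to its own name
latitude, longitude = coordinates
print(latitude, longitude)   # 48.47 7.94
```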


### Sets
- Unordered
- Mutable
- No duplicate elements

Sets are often used to store unique values.

In [39]:
colors = {"red", "green", "blue", "red"}
print("Set:", colors)
colors.add("yellow")
print("After add:", colors)

Set: {'blue', 'red', 'green'}
After add: {'blue', 'red', 'green', 'yellow'}
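Sets also support the usual mathematical set operations, and they are a common way to remove duplicates from a list. A minimal sketch with made-up data (the printed order of a set may vary between runs):

```python
course_a = {"Alice", "Bob", "Carol"}
course_b = {"Carol", "Dave"}

in_either = course_a | course_b   # union
in_both = course_a & course_b     # intersection
only_a = course_a - course_b      # difference

print(in_either)
print(in_both)    # {'Carol'}
print(only_a)

# Removing duplicates; sorted() returns a list in a predictable order
votes = ["red", "blue", "red", "green", "blue"]
unique_votes = sorted(set(votes))
print(unique_votes)  # ['blue', 'green', 'red']
```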


### Dictionaries
- Collection of key-value pairs
- Keys must be unique
- Values can be any type
- Mutable

Dictionaries are ideal for storing data that’s logically connected.

In [40]:
student = {
    "name": "Alice",
    "age": 23,
    "major": "Data Science"
}
print("Student dictionary:", student)
print("Name:", student["name"])
student["grade"] = "A"
print("After adding grade:", student)
student["age"] = 24
print("Updated age:", student)

Student dictionary: {'name': 'Alice', 'age': 23, 'major': 'Data Science'}
Name: Alice
After adding grade: {'name': 'Alice', 'age': 23, 'major': 'Data Science', 'grade': 'A'}
Updated age: {'name': 'Alice', 'age': 24, 'major': 'Data Science', 'grade': 'A'}
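A dictionary is also a natural fit for the name and grade pairs that were stored in a flat list earlier. The sketch below builds one with `zip()` and shows two safe ways to look up keys. The name `grade_book` and the missing key `"Zoe"` are chosen for illustration.

```python
grades = ["Alice", 85.5, "Bob", 90.0, "Carol", 78.0]

# grades[::2] takes every second element starting at index 0 (the names),
# grades[1::2] takes every second element starting at index 1 (the grades)
grade_book = dict(zip(grades[::2], grades[1::2]))
print(grade_book)  # {'Alice': 85.5, 'Bob': 90.0, 'Carol': 78.0}

print("Bob" in grade_book)       # True
print(grade_book.get("Zoe"))     # None
print(grade_book.get("Zoe", 0))  # 0
```

Unlike `grade_book["Zoe"]`, which raises a `KeyError` for a missing key, `get()` returns `None` or the given default.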


### Summary Table

| Structure   | Constructor | Mutable | Ordered | Allows Duplicates |
|-------------|-------------|---------|---------|-------------------|
| List        | `[]` | ✅ | ✅ | ✅ |
| Tuple       | `()` | ❌ | ✅ | ✅ |
| Set         | `{1, 2}` or `set()` | ✅ | ❌ | ❌ |
| Dictionary  | `{"key": value}` | ✅ | ✅ (keys) | ❌ (keys) |



<img src="figures/fig3.png" alt="Summary of Data Structures" width="600"/>

<small>Source: <a href="https://medium.com/@aitarurachel/data-structures-with-lists-tuples-dictionaries-and-sets-in-python-612245a712af" target="_blank">Data Structures in Python – Medium</a></small>


---


### References

1. [DataCamp – Introduction to Python for Data Science](https://www.datacamp.com/courses/intro-to-python-for-data-science)

2. HS Offenburg – Introductory to Python course materials *(internal university resource)*
