# Python

Python is an interpreted language: the interpreter executes the code line by line. (Here in Colab, a code cell runs when we press the play button next to it.)

A program essentially consists of control structures, data declarations, and functions. (Later we'll see that in Python functions are actually data, too).

If we write something after a `#` symbol it is a comment and the interpreter ignores it. (In Colab these are usually colored green.)

Let's look at some examples of data (click the play button in the black boxes):

In [None]:
"hello" # this is text data

'hello'

In [None]:
0b1110111011110110 # this is a number given in binary format

61174

In [None]:
0xFE3E # this is a number given in hexadecimal format

65086

In [None]:
3.14e-2 # this is a number given in floating point (scientific notation)

0.0314

In [None]:
# we can put underscores into numbers (for example as thousand separators)
# sometimes it's useful if it helps readability.
1_000_000_000 # one billion

1000000000

We can use functions and operators (+, -, *, /, etc.) on data. Later we'll see that operations are actually functions as well. A function has a name, zero, one or more parameters that it uses, and a return value that we get back.

In [None]:
4 + (3 - 2*6)*7  # result of the operations -59

-59

In [None]:
# we also have an exponentiation operator: the double asterisk
3**3

27

In [None]:
max(4,8,12,6,1)  # the return value of this function is 12

12

In [None]:
abs(-32.21) # returns the absolute value of the negative number

32.21

There are functions that are specific to the Python language. For example we can query the type of a piece of data or ask for the documentation of a function.

In [None]:
type(3.14) # floating point number, also called float

In [None]:
type("Hello") # this data is a text, in Python called str

In [None]:
# how to use the max function?
help(max)

Sometimes we call a function whose return value we don't care about. We do this when we're interested in the function's "side effect". For example the `print` function writes all given parameters to the screen. We usually don't care about what it returns because we only called it to display data. Here in Colab we've seen results of previous computations, but in a "normal" program if you don't print something to the screen it won't be visible. Colab only automatically displays the result of the last line.

In [None]:
45        # we do define these
15.72     # but since we don't use them
"kakadu"  # in the end they vanish
89 # only this will be displayed if we run the code, but only in Colab


89

In [None]:
print(1, "hi", 3.14) # this function prints three pieces of data.

1 hi 3.14


In [None]:
print("What is the meaning of life?") # this text will be displayed because we explicitly ask for it to be printed.
print(42)

What is the meaning of life?
42


In [None]:
# actually operations are functions too, they just look special.
print(45 + 67)
print(int.__add__(45, 67)) # this does exactly the same as the line above


112
112
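
The standard library's `operator` module offers the same operations as ordinary named functions, which makes the "operators are functions" idea easy to try out. A minimal sketch (the label names are only for illustration):

```python
import operator

# each operator has a plain-function counterpart in the operator module
total = operator.add(45, 67)    # same as 45 + 67
product = operator.mul(6, 7)    # same as 6 * 7
power = operator.pow(3, 3)      # same as 3 ** 3

print(total, product, power)
```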


Because Python has many functions, they are often organized into modules or attached to types. If you see `something.function` (a dot), that means either the function comes from a module named something, or it belongs to a type named something. We'll see more about this later.

Not every function is available immediately; some require us to import packages before using them:

In [None]:
import math # let's have math functions too!
math.sin(3.141) # sin function from the math module

0.0005926535550994539

In [None]:
str.capitalize("nemecsek") # capitalize function from the str type

'Nemecsek'

## Variables

Working with data can be difficult if we always have to define it in place. Sometimes it's even impossible. For example if a number comes from the user, we can't know its exact value in advance.

In many programming languages this is handled with "variables", i.e. you declare a variable of a certain type which can hold any value of that type.

In Python there aren't really variables in the traditional sense. Although many call them variables, they are more like labels. In Python the data has a type, the label does not. You can stick a label on anything and later stick it onto data of another type. We perform this "labeling" with the equals sign (=). A variable (label) name can be almost anything that Python doesn't already use: it can contain letters, digits, and underscores, but it cannot start with a digit. Python distinguishes uppercase and lowercase letters.


In [None]:
cockatoo = 67.23 # we assigned the floating point number to the cockatoo label
cockatoo = "chirp?" # we reattached the cockatoo label to this text data
Cockatoo = 100 # this is a different label because it is capitalized; the lowercase cockatoo is still the string

print(cockatoo) # we pass the labeled data to the print function
print(Cockatoo) # this is the capital Cockatoo, another label, another data.


chirp?
100


Modern Python has full Unicode support, so you can even use variable names like these if you want:

In [None]:
α = 30
β = 12
γ = α + β
γ

Of course, only if you don't mind that no one except the Greeks will be able to type them... 😀

However, we cannot use characters that look like operators (minus, comma, asterisk, etc.) because that would be confusing. Is `csiki-csuki` a variable name or are we subtracting `csuki` from `csiki`? The same applies to the space character; we can't use it, so use underscores instead!

This style is common in Python code. It's recommended for readability:

In [None]:
max_value = 100
estimated_result = 0.
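
If you are unsure whether a name is allowed, you can ask Python itself. The sketch below uses `str.isidentifier()` and the standard `keyword` module; the candidate names are just examples chosen for illustration:

```python
import keyword

# str.isidentifier() checks whether a text is syntactically usable as a name
candidates = ["max_value", "csiki-csuki", "2fast", "two words"]
valid = {name: name.isidentifier() for name in candidates}
print(valid)

# reserved words look like valid names but still cannot be used as labels
reserved = keyword.iskeyword("for")
print(reserved)
```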

Sometimes you'll see a single underscore `_` used as a label. By convention the programmer uses it to indicate that they don't need that value; they had to put something there for syntactic reasons, but they won't use it (don't look for it).
Example:

In [None]:
_, y = 300, 400

y # only the y coordinate is used,
# but something had to be provided for the first value

Although not mandatory, by tradition label names are written in lowercase. If you use a capitalized name you're usually trying to express something special (for example defining a new type). We'll talk more about that later.


        
# Python Data Types


A data type essentially means how we interpret the bit pattern stored in memory. For example, if the memory contains the bit sequence `01010011011110100110100101100001`, then interpreted as an integer it is `1400531297` (reading the most significant byte first, or `1634302547` if the bytes are read in little-endian order), as a floating point number<sup>1</sup> it is `2.6918160987937623e+20`, and interpreted as text it is `"Szia"`.

(1): more precisely as a "little-endian IEEE754 floating point" number, since multiple formats exist.

In [None]:
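# store the bit pattern as 4 bytes, most significant byte first
memory_view = memoryview(0b01010011011110100110100101100001.to_bytes(4, 'big'))

# cast() reads the bytes in the machine's native byte order (little-endian on most PCs)
print('As integer: ', memory_view.cast('i')[0])
print('As floating point: ', memory_view.cast('f')[0])
print('As text: ', memory_view.tobytes().decode('utf-8'))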

As integer:  1634302547
As floating point:  2.6918160987937623e+20
As text:  Szia


If the code above is not completely crystal-clear, that's fine; we only wanted to show that the same memory region can be interpreted very differently depending on the type you treat it as.
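
If you'd like to experiment with the byte order yourself, `int.from_bytes` and the standard `struct` module let you state it explicitly. This is only a sketch of the same idea, and the label names are made up for the example:

```python
import struct

raw = b"Szia"  # the four bytes 0x53 0x7A 0x69 0x61 from the example above

as_int_big = int.from_bytes(raw, "big")        # most significant byte first
as_int_little = int.from_bytes(raw, "little")  # least significant byte first
as_float_little = struct.unpack("<f", raw)[0]  # "<f": little-endian 32-bit float
as_text = raw.decode("utf-8")

print(as_int_big)       # 1400531297
print(as_int_little)    # 1634302547
print(as_float_little)
print(as_text)          # Szia
```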

### Useful links

- <a href="https://docs.python.org/3/library/stdtypes.html">Python Built-in Types</a>
- <a href="https://beginnersbook.com/2019/03/python-data-types/">Python Data Types</a>

## Python Numeric Data Types


- Integer (int)
- Floating point (float)
- Complex (complex)



### Integers (int)


In many languages you must specify the size of an integer (1 byte? 2 bytes? 8?). In Python there is no such limit: an integer can be as large as memory allows and Python will happily compute with it.

In [None]:
a = 1234567890123456789012345678901234567890123456789083277771287961276536743275928391200002312
b = 2 * a + 45
print(b)

2469135780246913578024691357802469135780246913578166555542575922553073486551856782400004669


Numbers can be specified in the familiar decimal form, or in hexadecimal, octal, or binary. The underlying type is always integer (`int`). If you're curious about the type of a value, the `type()` function returns the type for any Python object.

In [None]:
type(10)

int

In [None]:
type(b)

int

In [None]:
hex_number = 0x1a   # integer in hexadecimal form
bin_number = 0b11010 # integer in binary form
dec_number = 26

# in fact, all three labels point to the same number, just entered in different forms!
print(hex_number)
print(bin_number)
print(dec_number)

26
26
26


In [None]:
type(hex_number) # the type of each is int (obviously, as they are the same data)

int
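
The conversion works in the other direction too. As a small sketch, the built-in `bin()`, `oct()`, and `hex()` functions produce the text form of a number in another base, and `int()` with a second argument parses such text back:

```python
n = 26

# converting a number to text in another base
print(bin(n))   # 0b11010
print(oct(n))   # 0o32
print(hex(n))   # 0x1a

# parsing text written in a given base back into an int
from_hex = int("1a", 16)
from_bin = int("11010", 2)
print(from_hex, from_bin)
```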

If you perform operations on integers, the result is usually an integer, but not necessarily; it depends on how the operation (i.e., the function) is defined.

In [None]:
a = 1
b = 5
c = 7

In [None]:
c + b - a

In [None]:
b * c

In [None]:
print(c / b)
print(type(c / b)) # the result is not int but float

In Python the `/` operator performs floating point division. If you want integer division that yields an integer result, use the `//` operator.

In [None]:
17 / 9 # floating point result

In [None]:
17 // 9 # integer division (int result)

In [None]:
17 % 9 # this gives the remainder

In [None]:
# of course we can use labels too:
c // b, c % b

(1, 2)
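
If you need both the quotient and the remainder, the built-in `divmod()` returns them in one step. The sketch below also shows a detail that is easy to miss: `//` rounds toward negative infinity, so results with negative numbers may differ from what you expect.

```python
quotient, remainder = divmod(17, 9)   # same as (17 // 9, 17 % 9)
print(quotient, remainder)            # 1 8

# // rounds toward negative infinity, so negative operands may surprise
neg_quotient = -17 // 9
neg_remainder = -17 % 9
print(neg_quotient, neg_remainder)    # -2 1

# the results always satisfy: dividend == divisor * quotient + remainder
check = 9 * neg_quotient + neg_remainder
print(check)                          # -17
```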

### Floating Point Numbers

* In Python the `float` type denotes floating point numbers
* recognized by a decimal point
* floating point numbers are accurate to about 15 significant digits
* we can use scientific notation with `e` or `E` followed by an integer to indicate magnitude.


In [None]:
7.94

7.94

In [None]:
type(7.94)

float

In [None]:
# you don't have to write the leading zero:
.56

In [None]:
# and if the number is whole you can omit the fractional part too
56.

In [None]:
2.6e-5

2.6e-05

* Floating point numbers are internally represented as binary fractions.
* Most decimal fractions cannot be represented exactly as binary fractions, so stored values are approximate.

In [None]:
2.3 + 1.4  # not exactly 3.7 as you might expect...


3.6999999999999997
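
Because of this, comparing floats with `==` can give surprising results. One common way around it (a sketch, not the only option) is `math.isclose()`, which checks whether two values are equal within a small tolerance:

```python
import math

total = 2.3 + 1.4
exact_match = (total == 3.7)             # False: the stored values differ slightly
close_enough = math.isclose(total, 3.7)  # True: equal within a small relative tolerance

print(exact_match, close_enough)
```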

#### type conversion

If a value is not of the correct type, you can try to convert it by calling the type name like a function.

In [None]:
print(int(1.3))   #  1
print(int(1.7))   #  1
print(int(-1.3))  #  -1
print(int(-1.7))  #  -1

1
1
-1
-1
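
The same trick works between other types as well. Here is a short sketch with a few typical conversions; the `try`/`except` part uses error handling that we have not covered yet, so treat it as a preview:

```python
count = int("42")       # text to integer
ratio = float("3.14")   # text to floating point number
label = str(42)         # number to text
as_float = float(7)     # integer to floating point number

print(count, ratio, label, as_float)

# a conversion that makes no sense raises a ValueError
try:
    int("forty-two")
except ValueError as error:
    message = str(error)
    print("conversion failed:", message)
```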


If you'd rather round to the nearest value, use the `round()` function, where you can also specify how many decimals to keep. (Exact ties such as 2.5 are rounded to the nearest even number.)
```python
   round(number [, decimals])
```


In [None]:
print(round(12.8))
print(round(-2.99))
print(round(3.14158, 2))


13
-3
3.14
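
One detail worth trying out: when a value lies exactly halfway between two integers, Python's `round()` picks the even neighbor rather than always rounding up. A quick sketch:

```python
# values exactly halfway between two integers go to the even neighbor
ties = [round(0.5), round(1.5), round(2.5), round(3.5)]
print(ties)   # [0, 2, 2, 4]

# with a second argument, round() keeps that many decimals
three_decimals = round(3.14158, 3)
print(three_decimals)

# without a second argument the result is an int, not a float
print(type(round(12.8)))
```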



### Complex numbers



Complex numbers in Python use the notation from electronics: real + j * imaginary (so the imaginary unit is `j`, not `i`).

In [None]:
# Python displays complex numbers in parentheses to indicate it's a value:
42.0 + 5j

(42+5j)

In [None]:
# and its type is complex
type(42.0 + 5j)

complex

We can convert other numbers to complex numbers and access their properties separately:

In [None]:
a = complex(-1, 1)
b = complex(3, 5)

In [None]:
print("a:", a)
print('real:', a.real)
print('imaginary:', a.imag)
print('conjugate:', a.conjugate())

a: (-1+1j)
real: -1.0
imaginary: 1.0
conjugate: (-1-1j)


In [None]:
# and of course we can compute with them as usual:
print ("a*a= ",a*a)
print ("a/b= ",a/b)
print ("a-b= ",a-b)

a*a=  -2j
a/b=  (0.05882352941176471+0.23529411764705885j)
a-b=  (-4-4j)
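
A few more things you can try with complex numbers. This sketch uses the built-in `abs()` and the standard `cmath` module (the complex counterpart of `math`); the label names are only for illustration:

```python
import cmath

z = complex(3, 4)

magnitude = abs(z)       # distance from the origin: 5.0
root = cmath.sqrt(-1)    # math.sqrt(-1) would raise an error; cmath returns 1j
squared = 1j * 1j        # (-1+0j), the defining property of the imaginary unit

print(magnitude, root, squared)
```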


## Boolean values

The boolean type can take only two values: `True` and `False`. Between boolean values we can use logical operators.




In [None]:
true_val = True
false_val = False
true_val and false_val

In [None]:
true_val or false_val

In [None]:
# you can negate them:
not true_val

In [None]:
# or build arbitrarily complex formulas:
( true_val or false_val ) and (not true_val or not false_val)

In [None]:
# when converting other types to bool, zero, the empty string, None, and empty containers are False; almost everything else is True.

bool(42)

In [None]:
bool(0.0)

False
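
To get a feel for the rule, here is a sketch that converts a handful of values. It uses lists and other containers that we'll only meet later, so focus on the results:

```python
# values that convert to False
falsy = [0, 0.0, 0j, "", None, [], (), {}, set()]
print([bool(value) for value in falsy])

# everything else converts to True, including some values that look "empty"
truthy = ["0", " ", "False", [0], -1]
print([bool(value) for value in truthy])
```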

There are many functions and operators that return boolean values. A classic example is a relation:

In [None]:
type(5 > 8)

In [None]:
# so we can also do logical games with relational operators:

(5<2) or (8>1) and not (7==6)

In [None]:
# the equality operator in Python is a double equals `==` to distinguish it from assignment
# type(5=6) <-- this is not allowed, that's not an equality test

type(5==6)

In [None]:
# if you want to say that two values are not equal, the symbol is `!=`
42 !=  42.1

In [None]:
# equality means value equality, so if two things are "equivalent" they are equal.
# the following is equal even though one is an int and the other a float
a = 42
b = 42.0
a == b

In [None]:
# if you want to ask whether two things are actually the same object, use the `is` operator:
a = 42
b = 42.0
a is b # these two are not the same object, so the result is False.
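
The difference between `==` and `is` is easiest to see with two separately created values that have the same content. A small sketch with lists (covered later):

```python
first = [1, 2, 3]
second = [1, 2, 3]   # a separate list with the same content
alias = first        # a second label on the very same list

print(first == second)   # True: the values are equal
print(first is second)   # False: two distinct objects
print(first is alias)    # True: both labels point to one object
```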

## Strings

In Python strings must be put in quotation marks, just like in spreadsheets or databases. Single or double quotes both work, which is handy if you want to use one type inside the other. Strings can be arbitrarily large (well, as long as they fit in memory).

The type name for strings is `str`

In [None]:
english_text = "I'm English" # one kind of quote inside the other
quote = 'Then he said: "behold"'
type(quote)

In many languages there is a separate character type (a single letter). Python doesn't have that; a single character is also a string. But strings can be indexed: you can ask which character (or slice) you want. Square brackets indicate which characters to get. Indexing starts at 0, so the first character is index zero.

In [None]:
name = "Kunigunda"
name[0] # first letter

'K'

In [None]:
name[-1] # last letter

In [None]:
name[0:3] # from index 0 up to index 3 (3 not included)

In [None]:
name[4:] # if you don't provide the end, it means "until the end"
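
Slices have a few more tricks: the start can be omitted too, negative indexes count from the end, and an optional third number sets the step. A short sketch:

```python
name = "Kunigunda"

print(name[:3])     # Kun        (a missing start means "from the beginning")
print(name[-3:])    # nda        (the last three characters)
print(name[::2])    # Kngna      (every second character)
print(name[::-1])   # adnuginuK  (a negative step walks backwards)
```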

If you only want to check whether one string occurs in another, use the `in` keyword!

In [None]:
"ember" in "november"

In [None]:
"Ember" in "november" # case matters!

String modifiers

### Special characters

Strings may contain special characters; prefixing them with a backslash (`\`) gives them special meaning. This is called "escaping". A common example is the newline, which we write as `\n`.

In [None]:
print("Let's have \n these\n on\n separate\n lines")

In [None]:
# many special characters can be entered with a backslash:
print("tab:\t1\t2\t3") # tab
print("for alignment:\t22\t34\t0") # table-like
print("quote inside quotes: \" ")
print("unicode char: \U0001F40D")
print("unicode by name: \N{SNAKE} \N{White Smiling Face}")

In [None]:
# Python strings are full Unicode, so if you can type or paste them you can put them directly in quotes:
"🐍😇🐍😈🐍😎"

In [None]:
# Since backslash itself is special, it must be escaped too.
print("\\") # this is a single backslash
print("C:\\Program Files\\") # windows path

### Multiline strings

If you want multi-line text you can also escape the line ending, but that's often not very readable. If you want to store the newline itself, use triple quotes!

In [None]:
text = "if it doesn't fit on one line, we can escape the line ending\
 and then it still counts as the same line."
text

"if it doesn't fit on one line, we can escape the line ending and then it still counts as the same line."

This approach is somewhat "fragile"; write anything after the backslash in the above code block (even a simple space) and see what happens!

That's why we usually prefer triple quotes:

In [None]:
text = """This is:
A multi-line string,
up to the following triple
quote!"""
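
To convince yourself that the line breaks really are stored in a triple-quoted string, you can count them. A minimal sketch:

```python
text = """This is:
A multi-line string,
up to the following triple
quote!"""

newline_count = text.count("\n")   # the line breaks are part of the string
lines = text.splitlines()          # split into a list of lines

print(newline_count)   # 3
print(len(lines))      # 4
print(repr(text))      # repr() makes the \n characters visible
```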

### "Modifying" strings

In Python strings (like all the simple types above) are immutable. We cannot change them in place, but we can move the label that refers to them to another value.

Operators may mean different things for different types. For strings the `+` operator is not numerical addition (which would not make much sense) but concatenation.

If we want to "modify" the value associated with a label, we do the same as with numbers: we reattach the label to a different string.

In [None]:
"many" + "small" + "text"

In [None]:
magic_word = "abraka" # the label points to "abraka"
magic_word = magic_word + "dabra" # we move the label to the concatenation of the two
magic_word

In [None]:
# Not only addition, but multiplication is also defined for strings.
# meaning: repeat!
"most"*10 + "best"

Of course, we can do more than concatenate or repeat strings. We can measure their length:

In [None]:
# if I'm only curious about the string length (measured in characters)
len(text)

Moreover, the str type comes with a whole bunch of useful methods:


In [None]:
text = "this is a TEXT"
print(text.capitalize()) # capitalize the first letter
print(text.upper()) # or make it all uppercase
print(text.lower()) # or all lowercase
print(text.split(" ")) # split at spaces
print(text.replace(" ","-")) # replace spaces with hyphens
print(text.count("e")) # how many times does lowercase 'e' appear?
print(text.lower().count("e")) # how many times does any 'e' appear?
print(text.find("T")) # which character index is the capital "T"?
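
A few more `str` methods that often come in handy, shown as a sketch on a slightly different sample text (the surrounding spaces are there on purpose):

```python
text = "  this is a TEXT  "

trimmed = text.strip()                # remove leading and trailing whitespace
words = trimmed.split(" ")            # ['this', 'is', 'a', 'TEXT']
rejoined = "-".join(words)            # join() is the inverse of split()
starts = trimmed.startswith("this")   # True
title = trimmed.title()               # capitalize every word

print(trimmed, words, rejoined, starts, title, sep="\n")
```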

As you can see, methods bound specifically to the string type are called in a slightly unusual way: we write them after the label with a dot. Later we'll see that in this case the function receives the labeled value as its first parameter, so the dot notation is just syntactic sugar: Python can look up the method from the type of the value.

In [None]:
# you can write it like this:
text.upper()

In [None]:
# but you can also write it like this:
str.upper(text)
# it's just not commonly used...

This is generally true for all types. Methods that belong to a type can be called by appending them to the data (or label) with a dot. For instance you can call the bit-counting method bound to the int type this way:

In [None]:
(478).bit_count() # how many bits of 478 are set to 1? (requires Python 3.10+)
# the parentheses are needed so the dot isn't interpreted as a decimal point
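
Note that `bit_count()` counts the bits that are set to 1, while `bit_length()` tells how many bits the number needs. The sketch below compares the two and counts the ones through `bin()`, so it also runs on Python versions older than 3.10:

```python
n = 478

print(bin(n))               # 0b111011110
ones = bin(n).count("1")    # number of bits set to 1
width = n.bit_length()      # number of bits needed to represent n

print(ones, width)          # 7 9

# on Python 3.10 and later, n.bit_count() gives the same result as `ones`
```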

Or, if you want to confuse everyone, you could add two numbers in this sneaky way:

In [None]:
(5).__add__(8)

### String prefixes

If escaping backslashes bothers you (for example if you have many backslashes in a string), you can write raw strings. Python signals special formats with a prefix letter before the opening quote. For raw strings use `r` or `R`. In raw strings the backslash has no special meaning.

In [None]:
print('one\ntwo\n') # normal string \n is special
print(r'one\ntwo\n') # raw string

one
two

one\ntwo\n


There is a super-useful string feature called f-strings (formatted strings) that allows embedding variables directly into the string. An `f` before the quote indicates that expressions inside curly braces should be evaluated and inserted. Although there are several ways to do this in Python, f-strings are modern and very readable, so they are worth learning!

In [None]:
last_name = "Kovács"
first_name = "József"

f"Dear {last_name} {first_name}!"

'Dear Kovács József!'

In [None]:
# of course we could build it by concatenation:
# but that is less readable...
"Dear " + last_name + " " + first_name + "!"

'Dear Kovács József!'

In [None]:
# f-string substitution can be formatted too:

pi = 3.141592653589793
print(f"pi: {pi}") # prints a float
print(f"Rounded: {pi:.3f}") # 3 decimals as a float
print(f"Percent: {pi:.2%}") # percentage with two decimals
print(f"Large number: {pi*10**10:,}") # include thousand separators


pi: 3.141592653589793
Rounded: 3.142
Percent: 314.16%
Large number: 31,415,926,535.89793
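
The part after the colon can also control width, alignment, and number base. A few more format specifications as a sketch; the labels `item`, `price`, and `count` are made up for the example:

```python
item = "tea"
price = 4.5
count = 42

print(f"[{item:>8}]")       # right-aligned in 8 characters
print(f"[{item:<8}]")       # left-aligned in 8 characters
print(f"[{item:^8}]")       # centered in 8 characters
print(f"{count:05d}")       # zero-padded to 5 digits: 00042
print(f"{255:x} {255:b}")   # hexadecimal and binary: ff 11111111
print(f"{price:8.2f}|")     # width 8, two decimals
```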


## Mutable and immutable data types
All the simple types listed above are "immutable", meaning they do not change once defined. They may be garbage collected if unused, but their value never changes. Composite types, however, such as lists, are mutable and can be changed (elements can be added or removed) even if we didn't directly modify them ourselves. (For example because another part of the program modified them.)

You can think of it (even if not implemented exactly like this) that if three labels point to an immutable value like 42, they all point to the same 42. If you add 2 to one of them, you move that label to 44: the label moves, but the 42 itself never changes.

This is not true for mutable types like lists, which may be extended or have elements removed while a label still points to them.
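
The difference can be sketched with two labels pointing to the same value. The label names and list contents below are invented for the example:

```python
# immutable value: "changing" one label just moves it to a new value
x = 42
y = x
y = y + 2
print(x, y)            # 42 44

# mutable value: both labels see the change, because there is only one list
shopping = ["milk", "bread"]
same_list = shopping
same_list.append("eggs")
print(shopping)        # ['milk', 'bread', 'eggs']

# copying first keeps the original untouched
independent = shopping.copy()
independent.append("tea")
print(len(shopping), len(independent))   # 3 4
```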


## Composite container types

There are types that can store an arbitrary number of values according to some structure. They can be extended or reduced as needed. Examples include:
- set
- list
- dict

We'll see these later; here are a few examples:

In [None]:
(1,2,3,4) # tuple
[8,3,3,4,9] # list
{1,2,9} # set
{"Péter":42, "Zsuzsa":18} # dictionary
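
As a small preview of how these containers are extended and reduced, here is a sketch. The names and values are made up, and the `try`/`except` part only demonstrates that a tuple, unlike the others, cannot be changed after creation:

```python
numbers = [8, 3, 3, 4, 9]          # list: ordered, duplicates allowed
numbers.append(1)

unique = {1, 2, 9}                 # set: no duplicates
unique.add(2)                      # already present, so nothing changes
unique.add(5)

ages = {"Anna": 42, "Zsuzsa": 18}  # dict: maps keys to values
ages["Bence"] = 30
del ages["Anna"]

point = (1, 2, 3, 4)               # tuple: fixed once created
try:
    point[0] = 99
except TypeError:
    tuple_is_immutable = True

print(numbers, unique, ages, tuple_is_immutable)
```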


## NoneType

Python has a special type `NoneType` which has exactly one value: `None`. We use it to indicate the absence of a value or that the value is unknown. A label cannot float in the air; it must be attached to some value. If we don't yet know the value (or want to express explicitly that it is unknown) we assign the label to `None`.

In Colab the `None` value is not displayed in the UI (if something is None, nothing is shown):

In [None]:
unknown = None

unknown # nothing will appear!

But if we explicitly ask a function to print it, we can see its value:

In [None]:
print(unknown)



If you've learned SQL you may recognize the similarity to NULL. Python's `None` is similar to SQL NULL, but unlike SQL NULL it doesn't have any magical behavior. It's simply a separate type with a single value. Operations that don't define behavior for `None` will raise an error if you try to use them with `None`.

```python
42 + None
```

Running the above line would simply raise an error because addition is not defined for unknown values. Similarly, the following would be erroneous:
```python
data = [1,2,None,11] # four values where one is unknown
# let's sum them:
sum(data) # also an error, sum doesn't know what to do with "None"
```
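
If you want to see the error without stopping the program, or sum only the known values, one possible approach is sketched below. It uses `try`/`except` and a list comprehension, both of which are covered later:

```python
data = [1, 2, None, 11]

# adding None to a number raises a TypeError
try:
    sum(data)
except TypeError as error:
    print("cannot sum:", error)

# one option: skip the unknown entries explicitly
known = [value for value in data if value is not None]
total = sum(known)
print(total)   # 14
```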

So what is it good for? You can at least check whether something is `None`! The equality test (`==`) is defined for `NoneType` (as for most types).

In [None]:
something = None # let something be unknown
something = 7 # ... or rather let's set it to 7

something == None # does something point to an unknown/empty value?

If you comment out (or remove) the `something = 7` line above you'll see the execution result change. If we only wanted to know whether `something` is known or not, this is all we need. If it's known do this, otherwise do that.
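
Since there is only one `None` object, the check is usually written with the `is` operator rather than `==`. The sketch below uses a hypothetical helper function `describe` and an `if` statement, both of which belong to the next chapter, so treat it as a preview:

```python
def describe(value):
    """Hypothetical helper: report whether a value is known."""
    if value is None:
        return "unknown"
    return f"known: {value}"

print(describe(None))   # unknown
print(describe(7))      # known: 7
print(describe(0))      # known: 0  (0 converts to False, but it is not None)
```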

This brings us to the next chapter: how to tell the program to do different things under different conditions, or repeat things until some condition holds. In other words, how to give structure to our program.