# <center><font color='magenta'>**Introduction To Python at Magyar Telekom**</font></center>
### <center>Autumn, 2021</center>
### <center>Asztalos Áron, Duronelly Péter, Forgách Márton, Neszmélyi Zsolt</center>
# <center>Class 1</center>

## Class Topics
* interacting with Jupyter notebooks
* running commands from Jupyter notebooks and from script files
* using built-in modules
* installing new modules
* simple structures: strings and numbers
* complex data structures: lists, tuples
* list comprehension
* logical statements and control flow

## Jupyter notebooks

This file is a Jupyter notebook. Unlike a regular Python program, which is plain Python code in a text file, a notebook is stored in the [JSON](http://en.wikipedia.org/wiki/JSON) format. The advantage is that we can mix formatted text, Python code and code output. A notebook needs the Jupyter notebook server to run, so it is not a stand-alone Python program. Other than that, there is no difference between the Python code that goes into a program file and the code in a Jupyter notebook.
We will return to JSON files later, when we work with dictionaries and more advanced data structures.

## Modules

Most of the functionality in Python is provided by *modules*. The Python Standard Library is a large collection of modules that provides *cross-platform* implementations of common facilities such as access to the operating system, file I/O, string management, network communication, and much more. We will use some of these modules throughout the course.

<b>To use a module</b> in a Python program it first has <b>to be imported</b>. A module can be imported using the `import` statement. For example, to import the module `math`, which contains many standard mathematical functions, we can do:

In [None]:
import math

This includes the whole module and makes it available for use later in the program. For example, we can do:

In [None]:
import math

x = math.cos(2 * math.pi)

print(x)

Alternatively, we can choose to import all symbols (functions and variables) in a module to the current namespace, so that we don't need to use the prefix "`math.`" every time we use something from the `math` module:

In [None]:
from math import *

x = cos(2 * pi)

print(x)

This pattern can be very convenient, but *in large programs that include many modules it is often a good idea to keep the symbols from each module in their own namespaces, by using the `import math` pattern*. This avoids potentially confusing namespace collisions.
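
To make the collision risk concrete, here is a minimal sketch. It assumes the standard-library `cmath` module (not used elsewhere in this class), which also defines a function called `sqrt`; whichever import runs last decides what the bare name means.

```python
from math import sqrt
root_math = sqrt(4)     # math.sqrt returns a float

from cmath import sqrt  # silently replaces the name sqrt
root_cmath = sqrt(4)    # cmath.sqrt returns a complex number

print(type(root_math), type(root_cmath))

import math
import cmath

# With the module prefix there is no ambiguity.
print(math.sqrt(4), cmath.sqrt(4))
```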

As a third alternative, we can choose to import only a few selected symbols from a module by explicitly listing which ones we want to import instead of using the wildcard character `*`:

In [None]:
from math import cos, pi

x = cos(2 * pi)

print(x)

## Variables and types

### Symbol names 

Variable names in Python can contain alphanumerical characters `a-z`, `A-Z`, `0-9` and the underscore `_`. A variable name must start with a letter or an underscore, never with a digit. 

By convention, variable names start with a lower-case letter, and class names start with a capital letter. 

In addition, there are a number of Python keywords **that cannot be used as variable names**. In Python 3, these keywords are:

    False, None, True, and, as, assert, async, await, break, class, continue,
    def, del, elif, else, except, finally, for, from, global, if, import, in,
    is, lambda, nonlocal, not, or, pass, raise, return, try, while, with, yield

Note: Be aware of the keyword `lambda`, which could easily be a natural variable name in a scientific program. But being a keyword, it cannot be used as a variable name.
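
The keyword list does not have to be memorized. As a small sketch, the standard-library `keyword` module can report the keywords of the interpreter you are running; the name `lam` below is only an illustrative alternative to `lambda`.

```python
import keyword

print(keyword.kwlist)                # every keyword of this Python version

is_reserved = keyword.iskeyword('lambda')  # True: cannot be a variable name
is_free = keyword.iskeyword('lam')         # False: fine as a variable name
print(is_reserved, is_free)
```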

### Assignment



The assignment operator in Python is `=`. Python is a dynamically typed language, so we do not need to specify the type of a variable when we create one.

Assigning a value to a new variable creates the variable:

In [None]:
a = 3

In [None]:
type(a)

In [None]:
b = 1.2

In [None]:
type(b)

In [None]:
c = 'a'

In [None]:
type(c)

In [None]:
d = 'abc'

In [None]:
type(d)

Variables can easily be cast to other types.

In [None]:
float(a)

In [None]:
str(a)

In [None]:
int(b)
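
A few details of casting are easy to miss, so here is a short sketch with illustrative variable names: `int()` cuts off the fractional part instead of rounding, and text can only be converted when it actually looks like a number.

```python
whole = int(1.9)         # 1: int() truncates toward zero
negative = int(-1.9)     # -1, not -2
rounded = round(1.9)     # 2: use round() when rounding is wanted
from_text = int('42')    # strings holding digits can be converted
as_float = float('3.5')

try:
    int('abc')           # not a number, so this raises a ValueError
except ValueError as error:
    print('Cannot convert:', error)
```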

When a variable is reassigned a new value, its type can change. 

In [None]:
a = 1.3
type(a)

Every character has a corresponding number (its Unicode code point) by which it is referred to. 

In [None]:
chr(60)
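
The mapping also works in the other direction. A minimal sketch using the built-in `ord()` function, the inverse of `chr()`:

```python
code = ord('A')      # from character to number
letter = chr(code)   # and back again
symbol = chr(60)

print(code, letter, symbol)
```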

### Fundamental types

In [None]:
# integers
x = 1
type(x)

In [None]:
# float
x = 1.0
type(x)

In [None]:
# boolean
b1 = True
b2 = False

type(b1)

In [None]:
# complex numbers: note the use of `j` to specify the imaginary part
x = 1.0 - 1.0j
type(x)

In [None]:
print(x)

In [None]:
print(x.real, x.imag)

### Operators

arithmetic

In [None]:
1 + 2, 1 - 2, 1 * 2, 1 / 2

In [None]:
# Floor division also works on floats; the result is still a float
3.0 // 2.0

In [None]:
# Power is ** not ^
2**3

In [None]:
# / always results in floats
2 / 1

In [None]:
# if you need integers use integer division instead
2 // 1

In [None]:
# Modulo
7%3
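
Integer division and modulo behave in a way that may surprise with negative numbers: `//` rounds down (toward minus infinity), and `%` takes the sign of the divisor. A brief sketch, which also uses the built-in `divmod()` to get both results at once:

```python
quotient = -7 // 2             # rounds down, so the result is -4, not -3
remainder = -7 % 2             # 1, because -4 * 2 + 1 == -7
both = divmod(7, 3)            # (quotient, remainder) in one call

print(quotient, remainder, both)
```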

boolean

In [None]:
True and False

In [None]:
not False

In [None]:
True or False

comparison

In [None]:
a = 2
b = 3
c = 3

In [None]:
a > b

In [None]:
b == c

In [None]:
b is c

In [None]:
True == 1
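
The difference between `==` and `is` is easier to see with lists than with small integers. In this sketch (variable names are illustrative) `==` compares values, while `is` checks whether two names refer to the very same object.

```python
first = [1, 2, 3]
second = [1, 2, 3]   # equal content, but a separate object
alias = first        # a second name for the same object

same_value = first == second
same_object = first is second
same_alias = first is alias

print(same_value, same_object, same_alias)
```

For comparing values, `==` is the operator to use; `is` is mostly reserved for checks such as `x is None`.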

### Strings

**Intro**

In [None]:
s = 'Hello Monty!'

In [None]:
s

In [None]:
print(s)

Double quotes work as well. Python treats the two kinds of quotes the same; the choice only matters when the string itself contains quotation marks. (See later.)

In [None]:
s2 = "Hello Monty!"
s2

In [None]:
len(s) # get length

In [None]:
print(s.replace('Monty', 'Python'))

*Indexing* starts at 0 in Python. 

In [None]:
s[0:5]

In [None]:
s[-6:]

In [None]:
s[1:9:2] # start, stop, step

In [None]:
s[::2] # start and stop are missing so we are stepping along the whole string
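
The slices above, written out with their results stored in illustrative variables. The last line is an extra: a negative step walks through the string backwards.

```python
s = 'Hello Monty!'

first_word = s[0:5]      # characters 0..4; index 5 is not included
last_six = s[-6:]        # the last six characters
every_second = s[1:9:2]  # indices 1, 3, 5, 7
evens = s[::2]           # indices 0, 2, 4, ...
backwards = s[::-1]      # the whole string, reversed

print(first_word, last_six, every_second, evens, backwards, sep=' | ')
```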

**Manipulate and print**

In [None]:
s = 'Hello Monty Python!'

`split()` splits the text into a *list*, a compound data type covered later in this class. 

In [None]:
s.split(' ')

In [None]:
s.split(' ')[1]

Methods can also be called directly on the text itself. 

In [None]:
'Hello Monty Python!'.split(' ')[1]

Combine " and ' to print quotation marks. 

In [None]:
s = "Hello Monty 'Holy Grail' Python!"
print(s)

Special characters: \n (new line), \t (tab) and the like. 

In [None]:
s = 'Hello \n Monty!'
print(s)

Use another backslash (\\) to escape the escape character.

In [None]:
s = 'Hello \\n Monty!'
print(s)

Start strings with the letter <font color= 'magenta'>**r**</font> to define *raw strings*. Raw strings are printed and interpreted as they are. 

In [None]:
print('Hello \t Monty!') # plain
print(r'Hello \t Monty!') # raw

Raw strings are especially useful when defining Windows paths. (Not an issue on Linux and Mac, where paths use forward slashes.)

In [None]:
# This raises a SyntaxError: \U starts a Unicode escape sequence.
print('C:\Users')

In [None]:
print(r'C:\Users')

"\\" is the *escape character*, which is why it causes trouble in strings. You can also escape the escape character with another escape character.

In [None]:
print('C:\\Users') # Not an optimal solution in case of long paths. 

Concatenating

In [None]:
'Monty' + 'Python'

In [None]:
''.join(['Monty', 'Python'])

In [None]:
' '.join(['Monty', 'Python']) # Note the space between the quotes. 

**Some more tweaking and formatting**

Use special characters to include variables in text.
- %s: strings
- %f: floats
- %d: integers

In [None]:
print('Hello, I am %s, I have been working here for %f years and this is Python Class %d.'% ('Jenny', 2.5, 1))

In [None]:
# What if we replace %f with %s or %d?
print('Hello, I am %s, I have been working here for %s years and this is Python Class %d.'% ('Jenny', 2.5, 1))
print('Hello, I am %s, I have been working here for %d years and this is Python Class %d.'% ('Jenny', 2.5, 1))

We can even format the numbers.

In [None]:
print('Hello, I am %s, I have been working here for %.2f years and this is Python Class %d.'% ('Jenny', 2.5, 1))

Another way is to use *f-strings*.

In [None]:
name = 'Jenny'
classnumber = 1
print(f'Hello, I am {name} and this is Python class {classnumber}.')

Use *f-strings* in your scripts to write queries.

In [None]:
database = 'SALES'
table = 'WEBSHOP_SALES'
month = '8'
day = 20 # both strings and integers will work

query = f"""
SELECT *
FROM {database}.{table}
WHERE month = {month}
AND day = {day}
"""

print(query)

In [None]:
database = 'SALES'
table = 'WEBSHOP_SALES'
month = 'April' # without quotes around it in the query, this text makes the query invalid! 
day = 20 # both strings and integers will work

query = f"""
SELECT *
FROM {database}.{table}
WHERE month = {month}
AND day = {day}
"""

print(query)

In [None]:
database = 'SALES'
table = 'WEBSHOP_SALES'
month = 'April'
day = 20 # both strings and integers will work

query = f"""
SELECT *
FROM {database}.{table}
WHERE month = '{month}'
AND day = {day}
"""

print(query)

And of course there are even more ways to format text.

In [None]:
print('{} divided by {} is {}.'.format(2000,1500,2000/1500))

Some fancier formatting. 

In [None]:
print('{:,d} divided by {:,.0f} is {:.2f}.'.format(2000,1500,2000/1500))

In [None]:
print('{:8.2f}'.format(10/3))
print('{:8.2f}'.format(100/3))
print('{:8.2f}'.format(1000/3))
print('{:8.2f}'.format(10000/3))
print('{:8.2f}'.format(100000/3))
print('{:8.2f}'.format(1000000/3))

In [None]:
print('{:8,.2f}'.format(100000/3)) # Add a thousands separator.
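
The same format specifications also work inside *f-strings*: everything after the colon is interpreted exactly as in `.format()`. A short sketch with illustrative variable names:

```python
ratio = 2000 / 1500

# Thousands separators and two decimals, as in the .format() example.
line = f'{2000:,d} divided by {1500:,.0f} is {ratio:.2f}.'

# Width 8 with two decimals: the number is right-aligned and padded with spaces.
padded = f'{10/3:8.2f}'

print(line)
print(padded)
```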

**Logical operations with strings**

In [None]:
s = 'Java, C++, COBOL '
'Python' in s

In [None]:
s = 'Monty Python'
'Python' in s

### Lists
One of the most used datatypes! 

Lists can contain elements of various types! (But that makes large lists memory inefficient!)

In [None]:
l = [1,2,'a',4]
print(type(l))
print(l)

In [None]:
l.sort() # This raises a TypeError: integers and strings cannot be compared.
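
Sorting a list with mixed types fails because Python cannot decide whether a number is smaller than a piece of text. The sketch below catches the error and then shows one possible workaround, an assumption rather than the only option: comparing every element by its text form via `key=str`.

```python
mixed = [1, 2, 'a', 4]

try:
    mixed.sort()
except TypeError as error:
    print('Cannot sort:', error)

# Compare the elements as text; the elements themselves are not converted.
by_text = sorted(mixed, key=str)
print(by_text)
```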

Lists can be nested.

In [None]:
l = [1, ['a', 1.2], [math.pi]]

In [None]:
print(l)

The list method `reverse()` reverses the list **in place**, that is, it modifies the list itself instead of returning a new one. 

In [None]:
l.reverse()
print(l)

When the types of the elements match, a list can also be sorted, again in place. 

In [None]:
l = ['f', 'a', 'z', 't']

In [None]:
l.sort()
l

Lists play a very important role in Python. For example, they are used in loops and other flow control structures (discussed later). There are a number of convenient functions for generating sequences that can be turned into lists, for example the `range` function:

In [None]:
start = 10
stop = 20
step = 2
list(range(start, stop, step)) # 'stop' is not included in the list! 

In [None]:
list(range(10))

In [None]:
s = 'Hello \t Monty!'
print(list(s))

In [None]:
# Just for the record...
print(s)

Lists can be modified after they are created, that is, they are *mutable*.

In [None]:
l = [] # instantiate an empty list
print(l)

In [None]:
l.append('A')
print(l)

In [None]:
l.append('b')
l.append(4)
l.append(math.cos(math.pi)) # appends the results
print(l)

In [None]:
type(l[-1]) # -1 stands for the last element

In [None]:
len(l) # get length

Indexing starts at 0. The last element can be indexed as -1. (The one before that as -2, etc.)

In [None]:
l[2:] # starting at position 2, up till the end

In [None]:
l.remove('A') # remove a particular item
print(l)

In [None]:
del l[2] # delete an item at a given position
print(l)

Use lists in queries for the *'IN'* clause but watch out for the subtleties! 

In [None]:
# This will raise an error.
database = 'SALES'
table = 'WEBSHOP_SALES'
month = '8'
day = [20,21,22] # join() only accepts strings, so a list of integers raises a TypeError! 

query = f"""
SELECT *
FROM {database}.{table}
WHERE month = {month}
AND day IN {', '.join(day)}
"""

print(query)

In [None]:
database = 'SALES'
table = 'WEBSHOP_SALES'
month = '8'
day = ['20','21','22'] # join() only accepts strings, so this time the days are given as strings 

query = f"""
SELECT *
FROM {database}.{table}
WHERE month = {month}
AND day IN ({', '.join(day)})
"""

print(query)

Joining a list of strings for a query. The Python syntax is correct, but the resulting query will fail, because the month names are not quoted.

In [None]:
database = 'SALES'
table = 'WEBSHOP_SALES'
month = ['April', 'May', 'June']
day = ['20','21','22'] 

query = f"""
SELECT *
FROM {database}.{table}
WHERE month IN ({', '.join(month)})
AND day IN ({', '.join(day)})
"""

print(query)

But there is a fix! String manipulation 2.0 

In [None]:
print('April')
print("'April'")

In [None]:
database = 'SALES'
table = 'WEBSHOP_SALES'
month = ["'April'", "'May'", "'June'"] # check out the quotes in quotes! 
day = ['20','21','22'] 

query = f"""
SELECT *
FROM {database}.{table}
WHERE month IN ({', '.join(month)})
AND day IN ({', '.join(day)})
"""

print(query)

Of course, there is another way! No *f-strings* this time.

In [None]:
database = 'SALES'
table = 'WEBSHOP_SALES'
month = ['April', 'May', 'June'] 
day = ['20','21','22'] 

query = """
SELECT *
FROM {}.{}
WHERE month IN ('{}')
AND day IN ({})
""".format(database, table, "', '".join(month), ', '.join(day))

print(query)

In [None]:
# This is why we need the starting and trailing single quotes in the query string.
print("', '".join(month))
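
The quoting logic can also be wrapped into a small function. The helper `sql_in_list` below is hypothetical (it is not part of any library): it puts single quotes around text and leaves numbers as they are. It does not escape quotes inside the values, so this sketch assumes trusted input only.

```python
def sql_in_list(values):
    """Format values for an SQL IN clause (hypothetical helper)."""
    parts = []
    for value in values:
        if isinstance(value, str):
            parts.append(f"'{value}'")  # text needs single quotes
        else:
            parts.append(str(value))    # numbers are used as they are
    return '(' + ', '.join(parts) + ')'

month = ['April', 'May', 'June']
day = [20, 21, 22]

query = f"""
SELECT *
FROM SALES.WEBSHOP_SALES
WHERE month IN {sql_in_list(month)}
AND day IN {sql_in_list(day)}
"""

print(query)
```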

<br>
 
## Conditional Statements And Control Flows

### Conditional Statements

In [None]:
import random

In [None]:
r = random.randint(20,34)
print(r)
if r < 25:
    print('A small number!')
elif r < 30:
    print('A moderately high number')
else:
    print('A large number!')

Conditional statements are controlled by ***indentation***.

In [None]:
a = 8
b = 6

if a > 6:
    if b > 8:
        print('Both numbers are large.')
    print('Result: b is larger than 8.') # This line should be indented one level further to the right. 

Instead of semicolons or other separators, Python uses indentation to group statements into logical blocks. The standard indent is 4 spaces, but any number works as long as it is used consistently.
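
A sketch of how the nested example could look with the indentation fixed. The messages are collected in a list (an illustrative choice) so the outcome is easy to inspect, and an `else` branch is added for the case when `b` is not large.

```python
a = 8
b = 6
messages = []

if a > 6:
    if b > 8:
        messages.append('Both numbers are large.')
        messages.append('Result: b is larger than 8.')  # now part of the inner if
    else:
        messages.append('Result: b is not larger than 8.')

print(messages)
```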

### Control flows

The *'for'* loop

In [None]:
for i in range(20): # Remember: 20 is not included in the range! 
    if i%2 == 0:
        print('Number %d is even.'% i)
    else:
        print('Number %d is odd.'% i)

The *'while'* loop

In [None]:
i = 0 # the counter
while i < 20:
    if i%2 == 0:
        print('Number %d is even.'% i)
    else:
        print('Number %d is odd.'% i)
    i += 1 # increment in Python (same as i++ in Java)
print('\nDone.') # Not indented, so it prints only once, after the loop has finished.

<font color = 'red'>**Caution!!!**</font> If you don't increment the counter, the loop will never stop!

How to repeat a piece of code indefinitely. 

In [None]:
# You can stop the loop with the black rectangle in this notebook's menu bar. 
from IPython.display import clear_output
import time

i = 1
while True: # This part makes it run forever. 
    print(i)
    i += 1
    time.sleep(1)
    clear_output()
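
When a loop is not meant to run forever, a safety limit is one way to guard against a forgotten increment. In this sketch the names `iterations` and `max_iterations` are illustrative, and the increment of `i` is left out on purpose; the `break` statement leaves the loop immediately.

```python
i = 0
iterations = 0
max_iterations = 100  # assumed safety limit

while i < 20:
    iterations += 1
    if iterations > max_iterations:
        print('Safety limit reached, stopping the loop.')
        break
    # Bug on purpose: i is never incremented, so i < 20 stays true.

print(iterations)
```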

In [None]:
for word in ['Data', 'science', 'with', 'Python']:
    print(word, len(word)) # function calls can also be arguments of print

Concatenating lists

In [None]:
l1 = ['a', 'b']
l2 = ['c', 'd']

In [None]:
l12 = l1 + l2
l12

<br>
 
### Let's play with what we have!

In [None]:
list_capitals = []
for i in range(65,91):
    list_capitals.append(chr(i))
print(list_capitals)

The `enumerate` function helps you get a counter. 

In [None]:
for k, v in enumerate(list_capitals):
    print(k, v)

Add some simple formatting: right-adjust k. This is what the string method `.rjust()` does. 

In [None]:
for k, v in enumerate(list_capitals):
    print(str(k).rjust(2)+': ', v)
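
Two small variations, shown on a shortened, illustrative list: `enumerate` accepts a `start` argument when counting should begin at 1, and the format specification `>2` inside an f-string right-adjusts just like `.rjust(2)`.

```python
letters = ['A', 'B', 'C']

lines = []
for position, letter in enumerate(letters, start=1):
    lines.append(f'{position:>2}: {letter}')  # right-adjusted in 2 characters

print('\n'.join(lines))
```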

## Some More Types

### Tuples

Tuples are like lists, except that they cannot be modified once created, that is they are *immutable*. 

In Python, tuples are created using the syntax `(..., ..., ...)`, or even `..., ...`:

In [None]:
point = (10, 20)

print(point, type(point))

In [None]:
point = 10, 20

print(point, type(point))

We can iterate over tuples just like over lists.

In [None]:
for p in point:
    print(p)

Access the members like those of a list.

In [None]:
print(point[0])

We can *unpack* a tuple by assigning it to a comma-separated list of variables:

In [None]:
x, y = point

print("x =", x)
print("y =", y)
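
Unpacking has two handy applications, sketched here with illustrative names: swapping two variables without a temporary one, and collecting the remaining elements with a starred name.

```python
x, y = 10, 20
x, y = y, x   # the right-hand side is packed into a tuple first, then unpacked

first, *rest = (1, 2, 3, 4)  # rest collects everything after the first element

print(x, y)
print(first, rest)
```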

If we try to assign a new value to an element in a tuple we get an error:

In [None]:
# This will raise a TypeError: tuples do not support item assignment
point[0] = 15

In [None]:
point[0]

## Writing Functions
**UDF** = 'User-Defined Function'

A function is a block of organized, reusable code that performs a single, well-defined action. Functions make your application more modular and allow a high degree of code reuse.

You can define functions to provide the required functionality. Here are simple rules to define a function in Python.

* Function blocks begin with the keyword ```def``` followed by the function name and parentheses ```( )```.

* Any input parameters or arguments should be placed within these parentheses.

* The first statement of a function can be an optional statement - the documentation string of the function or docstring.

* The code block within every function starts with a colon (```:```) and is **indented**.

* The statement ```return``` [expression] returns a value, or a series of values, a list, a dictionary, and so on. A ```return``` statement with no arguments is the same as ```return None```.

In [None]:
def add_one(number):
    x = number + 1
    return x

In [None]:
add_one(20)

You can return more than one object from a single function. 

In [None]:
def add_one_and_return_both(number):
    x = number
    y = x + 1
    return x, y

In [None]:
x, y = add_one_and_return_both(23)
print(x)
print(y)

Function arguments can have default values.

In [None]:
def number_to_the_power(number, exponent = 2):
    return number ** exponent

In [None]:
number_to_the_power(5)

In [None]:
number_to_the_power(5, 3)
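
Arguments can also be passed by name, which makes calls easier to read and allows any order. A self-contained sketch that repeats the function from above:

```python
def number_to_the_power(number, exponent=2):
    return number ** exponent

squared = number_to_the_power(number=5)            # the default exponent is used
cubed = number_to_the_power(exponent=3, number=5)  # order does not matter with names
mixed = number_to_the_power(2, exponent=10)        # positional first, then by name

print(squared, cubed, mixed)
```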

Functions can return objects other than numbers. Also, `docstrings` help you document your function. More on docstrings [here](https://www.datacamp.com/community/tutorials/docstrings-python).

In [None]:
def cast_listitems_to_string(items):
    """
    Casts the elements of a list to strings. 
    
    The function casts the elements of a list to strings,
    whatever their original type is. The list is modified in place.
    
    Parameters
    ----------
    items: list 
        A list of various data types.
        
    Returns
    -------
    items: list
        The same list, with every element cast to a string.
    """
    
    # The parameter is called items, because list is the name of a built-in type.
    for i in range(len(items)):
        items[i] = str(items[i]) # remember: lists are mutable
    return items

Now you can call the help() function to print the docs of your UDF.

In [None]:
help(cast_listitems_to_string)

In [None]:
ls_convertable = [1,2, 'a', math.cos(math.pi / 3)]

In [None]:
ls_convertable

In [None]:
ls_converted = cast_listitems_to_string(ls_convertable)

In [None]:
ls_converted
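
Note that the function above changes the list it receives, so after the call `ls_convertable` contains strings as well. When the original should stay untouched, one option is to build a new list instead. The function name below is hypothetical, and the sample list is only for illustration.

```python
def cast_to_strings_copy(items):
    """Return a new list of strings; the input list is left unchanged."""
    result = []
    for item in items:
        result.append(str(item))
    return result

original = [1, 2, 'a', 0.5]
converted = cast_to_strings_copy(original)

print(original)
print(converted)
```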

### Exercise 0
Cast tuple elements to strings. 

## List comprehensions: Creating lists using `for` loops:

A list comprehension mimics the mathematical notation for defining sets. For example:
$$ L=\lbrace x^2 : x \in \lbrace 0, 1, \ldots, 9\rbrace \rbrace.$$
This translates into:

In [None]:
L = [x**2 for x in range(0,10)]
L

You can also combine it with conditional statements. For example:
$$S = \lbrace x : x \in L \text{ and }  x  \text{ is odd} \rbrace.$$
This becomes:

In [None]:
[x for x in L if x%2 == 1]

You can also use an ``if else`` expression to choose between two values:

In [None]:
['odd' if x%2 == 1 else 'even' for x in L]
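
Every list comprehension can be written as an ordinary `for` loop. The sketch below spells out the filtering example both ways, so the two forms can be compared line by line.

```python
L = [x**2 for x in range(0, 10)]

# The loop version: start with an empty list and append matching elements.
odd_squares = []
for x in L:
    if x % 2 == 1:
        odd_squares.append(x)

# The comprehension version does the same in one line.
odd_squares_short = [x for x in L if x % 2 == 1]

print(odd_squares)
```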

### Sorting Lists: Simple and Customized Sorting
The simplest way is the `sorted()` function, which returns a new list and leaves the original unchanged. (As opposed to the `sort()` *method* above.)

In [None]:
a = [5, 1, 4, 3]
print(sorted(a))
print(a)

The `sorted()` function can be customized with the `key=` argument.

In [None]:
strs = ['ccc', 'aaaa', 'd', 'bb']
print(sorted(strs, key=len))

And it can also be *reversed*. 

In [None]:
strs = ['ccc', 'aaaa', 'd', 'bb']
print(sorted(strs, key=len, reverse = True))
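
Any function that takes one element and returns a value to compare can serve as a key. As a further sketch with an illustrative word list: by default capital letters sort before lower-case ones, and `key=str.lower` makes the sort case-insensitive.

```python
words = ['banana', 'apple', 'Cherry']

default_order = sorted(words)                 # capital letters come first
ignoring_case = sorted(words, key=str.lower)  # compare the lower-case forms

print(default_order)
print(ignoring_case)
```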

### Exercise 1
Sort by length, then in alphabetical order.

### Exercise 2
Sort by a twisted logic.

In [None]:
import urllib.request
request = urllib.request.Request('https://www.mnb.hu/Jegybanki_alapkamat_alakulasa')

result = urllib.request.urlopen(request)

text = result.read() # 'text' holds the HTML source of the page

In [None]:
text # what is this???

<br> 

**What did we get?**

In [None]:
type(text)

`bytes` objects need to be decoded with the `decode()` method. `utf-8` will *most likely* convert them to a readable format. 

In [None]:
text_decoded = text.decode("utf-8")
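
Encoding and decoding are inverse operations. The sketch below uses an illustrative Hungarian word: accented letters take more than one byte in `utf-8`, and the optional `errors='replace'` argument keeps decoding going when the chosen encoding does not fit.

```python
word = 'Árvíztűrő'

encoded = word.encode('utf-8')     # str -> bytes
decoded = encoded.decode('utf-8')  # bytes -> str

# Decoding with an unsuitable encoding: undecodable bytes become placeholder characters.
lenient = encoded.decode('ascii', errors='replace')

print(type(encoded), len(word), len(encoded))
print(decoded)
print(lenient)
```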

In [None]:
print(text_decoded.split(r'table class="datatable"')[1]) # Note the raw string as separator! 

In [None]:
text_decoded.split(r'table class="datatable"')[1].split('</table>')[0]

In [None]:
print(text_decoded.split(r'table class="datatable"')[1].split('</table>')[0])

In [None]:
date_of_last_change = text_decoded.split(r'table class="datatable"')[1].split('</table>')[0].split('<td>')[1].split('</td>')[0]

In [None]:
base_rate = text_decoded.split(r'table class="datatable"')[1].split('</table>')[0].split('<td>')[2].split('</td>')[0]

In [None]:
date_of_last_change

In [None]:
base_rate

In [None]:
print(f'The base rate was changed on {date_of_last_change} to {base_rate}.')
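
The repeated `split(...)[1].split(...)[0]` pattern can be wrapped into a small function. The helper `between` is hypothetical, and the HTML snippet contains made-up placeholder values (not real data from the website), so the sketch runs without a network connection.

```python
def between(text, start, end):
    """Return the part of text between the first start marker and the next end marker."""
    return text.split(start)[1].split(end)[0]

# Made-up sample with the same structure as the table scraped above.
sample = '<table class="datatable"><tr><td>2000.01.01</td><td>1.00%</td></tr></table>'

table = between(sample, 'table class="datatable"', '</table>')
cells = table.split('<td>')              # cells[0] is the text before the first cell

date_cell = cells[1].split('</td>')[0]
rate_cell = cells[2].split('</td>')[0]

print(f'The base rate was changed on {date_cell} to {rate_cell}.')
```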

### Exercise 3
Scrape the consumer price index by year. 