### Python and SQL: intro / SQL platforms - 2019 Z


### This material may not be copied or published elsewhere (including Facebook and other social media) without the permission of the author!

# Introduction to Python

Python's origins go back to December 1989. It was created by Guido van Rossum (the Python community's <b>Benevolent Dictator for Life</b>) as a hobby project to work on during the week around Christmas. Python is named after the British comedy show Monty Python's Flying Circus.

Python was born out of:
- the ABC language (used by a Dutch research institute in the 1980s),
- the Amoeba distributed operating system. Van Rossum created Python as a scripting language for Amoeba.

### History of Python
- Python 1.0 (1994)
- Python 2.0 (2000)
- Python 3.0 (2008)
- …..
- Python 3.5 (2015)
- Python 3.6 (2016)
- Python 3.7 (2018)
- Python 3.8 (2019)

File extensions: .py (script) or .ipynb (Jupyter notebook)

<img src="guido.jpg">


## Why Python ?

### Strengths

- Free 
- Portable (Linux, Windows, Mac, Cray, and more…) 
- Object-Oriented
- Ease of use 
- Many libraries
- Component integration


### Weaknesses

- Slower than C and C++


## Python features

* Dynamic typing 
* Automatic memory management
* Built-in data structures (lists, dictionaries, tuples, sets)
* Python C, C++ API

* Polymorphism - objects of different classes can be used in the same way (e.g. through the same method call), each one responding according to its own class.

* Operator overloading

* Multiple inheritance
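
The last three features can be illustrated together. This is only a minimal sketch: the class names `Vector`, `Walker`, `Swimmer` and `Duck` are made up for illustration and do not come from the course material.

```python
class Vector:
    """A tiny 2D vector used to show operator overloading."""

    def __init__(self, x, y):
        self.x = x
        self.y = y

    def __add__(self, other):
        # Called when two Vector objects are combined with "+".
        return Vector(self.x + other.x, self.y + other.y)

    def __repr__(self):
        return "Vector({}, {})".format(self.x, self.y)


class Walker:
    def move(self):
        return "walks"


class Swimmer:
    def move(self):
        return "swims"

    def dive(self):
        return "dives"


class Duck(Walker, Swimmer):
    """Multiple inheritance: Duck inherits from both Walker and Swimmer."""


# Operator overloading: "+" works on our own class.
v = Vector(1, 2) + Vector(3, 4)
print(v)

# Polymorphism: the same call works on objects of different classes.
moves = []
for animal in (Walker(), Swimmer(), Duck()):
    moves.append(animal.move())
print(moves)

# Duck gets dive() from Swimmer; move() comes from Walker, the first base class.
print(Duck().dive())
```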



## Python, IPython, Jupyter
Python is an interpreted programming language. Python has two basic modes:
- script 
- interactive.

<b>Script mode</b> <br>
In this mode finished .py files are run by the Python interpreter from start to end. <br>

<b>Interactive mode </b> <br> 
A command-line shell that gives immediate feedback for each statement while keeping the results of previously entered statements in memory. Each new line is evaluated in the context of everything entered before it. <br> 
<br>

<b>IPython + Jupyter </b> <br>
In script mode it is impossible to pause the program, keep its state, modify something and then resume. IPython (Interactive Python) was created to solve this problem. It allows interactive work on scripts and keeps objects in memory for the whole session, so it is always possible to return to them. You can also run other code or interrupt it, and everything stays in memory. Jupyter, on the other hand, is a graphical interface for running IPython.
<br>
## Other programming environments
There are many environments for working with Python, from simple editors (Notepad, vim, nano, Notepad++, Sublime Text), through Jupyter Notebook, to IDEs (PyCharm, Spyder etc.). Everyone is free to choose their preferred solution. During the course Jupyter Notebook is sufficient, because it is convenient, fast, has a user-friendly interface and makes it easy to write readable, structured text comments.


# Installing Jupyter using Anaconda 

Installing Python and Jupyter using the Anaconda Distribution is strongly recommended. 
Anaconda Distribution includes: 
- Python, 
- Jupyter Notebook, 
- other commonly used packages for scientific computing and data science

#### Install Anaconda
- First, download Anaconda from https://www.anaconda.com/download/. Choose the Anaconda version that matches your system architecture (32/64 bit). Downloading Anaconda's Python 3.7 version is recommended.

- Second, install the downloaded version of Anaconda, following the instructions from the download page.

#### Run Jupyter
To run Jupyter Notebook, run the following command in the Terminal (Mac/Linux) or Command Prompt (Windows: WinSymbol+R, then type cmd):
- jupyter notebook <br>
or 
- double-click the <b>start_jupyter.bat</b> file, which was provided by the instructor at lab1.



## Using Notebook

Using a notebook is very easy. Some important pieces of information:
* You run a single cell using Shift+Enter (focus moves to the next cell) or Ctrl+Enter (focus stays in the same cell)
* To run all cells, or e.g. all cells below the active one, click Cell and choose the appropriate Run option.

You may use the icons above:
* Rectangle: stop current action
* +: add new, empty cell below currently active cell
* Arrows up/down: move cells
* Scissors: cut cell; two pieces of paper: copy cell
* Dropdown menu: choose cell type (Markdown - text, Code - Python code)
* The kernel (notebook) is restarted using Kernel > Restart or the Refresh (circle) arrow. Restarting the kernel clears all variables and results held in memory.
* Once in a while, a notebook is saved automatically. However, in crucial moments it is good practice to save manually (Ctrl+S). You should also do this before running new code.

It is suggested that you get used to keyboard shortcuts. If you do anything repeatedly, you should search for a keyboard shortcut and start using it. Even if at the beginning it is slower, after a while work will become much faster.
* Ctrl+/ : comment a line
* Shift+Del or Ctrl+D : delete current line
* Tab : indent a line or multiple lines, Shift+Tab : remove indentation (both at the beginning of a line or if many lines have been chosen)
* Tab : autocomplete (at the end of a line)
* Shift+Tab : show documentation of an object (usually between parentheses: () )
* Help > Keyboard Shortcuts : show other keyboard shortcuts
* Some people may miss a shortcut to duplicate a line. Fortunately, there is an Open Source solution: [Line duplication](https://github.com/jupyter/notebook/issues/1816)
* If you don't like the classic Notebook theme, you can change it: [Themes](https://github.com/dunovank/jupyter-themes)

Keyboard shortcuts are also available in Notebook to work on cells. You can either be inside a cell (edit mode) or work on whole cells (command mode). Switch between the modes using the keys Esc/Enter. In command mode additional shortcuts are available, e.g.:
* A : add cell above
* B : add cell below
* C : copy a cell, X : cut, V : paste
* Shift+Up : choose cells above, Shift+Down : choose cells below.


# Basics: code organization and data structures
## Code organization
In Python, code organization is based on indentation, unlike in many other programming languages. We cannot use whitespace freely (apart from blank lines). <b>Code blocks are created using indentation levels, and not curly brackets {}.</b>

<b>Remark</b>
Use # to put comments into your code!

##### Be careful with indentation!

The code below is incorrect and will not run:

In [1]:
x = 1
    y = 2

IndentationError: unexpected indent (<ipython-input-1-93b1bf8a2710>, line 2)

In [4]:
# You need to remove unnecessary indentation to run the code.
x = 1
y = 2

print(x)
print(y)

1
2


At the beginning it may be frustrating, but you will get used to it soon. Thanks to this approach the code is always readable: indentation always reflects the structure, and the number of unnecessary characters (for example {}) and lines is limited. Semicolons (;) at the end of a line are not needed either.

**Tip: Tab and Shift+Tab allow you to indent one or multiple lines (for a single line the cursor must be at the beginning, unless you use the alternative shortcuts Ctrl+] and Ctrl+[, which also work on multiple lines).**

### Python keywords

<img src="keywords.jpg">

## Print function and "hello world"
The "print" function displays text.

In [3]:
print("Hello world! :)")

# Since Python 3 print is a function, so using parentheses is required.
# In older versions of Python the following code was also right:
# print "Hello world! :)"
# Usually a print statement without parentheses is the easiest way to distinguish Python 2 from 3.

Hello world! :)


## Variables
Python is an interpreted language (it does not require compiling) with a dynamic type system: you are not required to declare a variable's type. The interpreter determines it from the assigned value. <br>

<table><tr><th>Type</th> <th>Example</th>	<th>Description</th></tr>
<tr><th>int</th> <th>x = 1</th>	<th>integers (i.e., whole numbers)</th></tr>
<tr><th>float</th>	<th>x = 1.0</th><th>floating-point numbers (i.e., real numbers)</th></tr>
<tr><th>complex</th><th>x = 1 + 2j</th>	<th>Complex numbers (i.e., numbers with a real and an imaginary part)</th></tr>
<tr><th>bool</th><th>x = True</th>	<th>Boolean: True/False values</th></tr>
<tr><th>str</th><th>x = 'abc'</th>	<th>String: characters or text</th></tr>
<tr> <th>NoneType</th><th>x = None</th><th> Special object indicating nulls</th> </tr>
</table>

In [15]:
e = 2.72
pi = "3.14"
text = "Hello world!"
print("Type of variable e:", type(e),
      ", type of variable text: ", type(text),
      ", type of variable pi: ", type(pi))
# Open parenthesis of print function allows for multiple lines with any indentation.

Type of variable e: <class 'float'> , type of variable text:  <class 'str'> , type of variable pi:  <class 'str'>


The print function can take multiple arguments.

A variable's content can also be shown using string formatting:

In [18]:
#old style printing
print("old style")
print("Type of variable e: %s, type of variable text: %s, type of variable pi: %s"
      % (type(e), type(text), type(pi)))

#new style printing
print("\nnew style")
print('name={0},surname={1}'.format('Tom', 'Smith'))

old style
Type of variable e: <class 'float'>, type of variable text: <class 'str'>, type of variable pi: <class 'str'>

new style
name=Tom,surname=Smith


This way you can control the formatting better.
For example, we avoid the unnecessary space before a comma.
You can read more here: https://pyformat.info/
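
Since Python 3.6 there is also a third option, formatted string literals (f-strings), which embed expressions directly in the text. A short sketch; the variable names and values are illustrative:

```python
name = "Tom"
surname = "Smith"
e = 2.72

# Expressions inside {} are evaluated and inserted into the string.
line1 = f"name={name},surname={surname}"
print(line1)

# A format specification after ":" controls the presentation,
# here one digit after the decimal point.
line2 = f"e rounded to one decimal place: {e:.1f}"
print(line2)
```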

### Dynamic type system
Dynamic type system has both advantages:
* faster code writing
* less code

and disadvantages:
* longer running time
* possibility of errors which are difficult to debug.

Python allows easy type conversion (changing a value's type):

In [None]:
# Concatenating two strings using operator "+":
print(str(e) + pi)
# Adding two numbers:
print(e + float(pi))

In [None]:
# Fortunately, this is not possible (it raises a ValueError):
print(e + float(text))
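
The failed conversion raises a `ValueError`. Below is a minimal sketch of how such an error could be caught; the helper name `to_float` is hypothetical and not part of the course code.

```python
def to_float(value):
    """Return value converted to float, or None if the conversion fails."""
    try:
        return float(value)
    except ValueError:
        # float() raises ValueError for text that is not a number.
        return None


print(to_float("3.14"))
print(to_float("Hello world!"))

# Conversion also works in the other directions.
print(int("42") + 1)
print(str(42) + "1")
```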

## Operators
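Besides the obvious operators (+, -, /, \*), integer division (// for the quotient, % for the remainder) and exponentiation (\*\*) are available.

Comparison operators are the usual ones: >=, >, <=, <, ==, !=.

Logical operators are the keywords and, or, not. The symbols & (AND), | (OR), ^ (XOR) and ~ (NOT) are bitwise operators: they work bit by bit on integers (and element-wise on NumPy arrays).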
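
A short sketch of these operators in action. Note that `&`, `|`, `^` and `~` act bit by bit on integers, while Python also offers the keywords `and`, `or` and `not` for combining conditions. The values used here are arbitrary.

```python
# Arithmetic
quotient = 7 // 2
remainder = 7 % 2
power = 2 ** 10
true_division = 7 / 2
print(quotient, remainder, power, true_division)

# Comparison operators return bool values
print(3 >= 2, 3 == 4, 3 != 4)

# Keyword operators and / or / not work on truth values
x = 5
in_range = x > 0 and x < 10
print(in_range, not in_range, x < 0 or x > 3)

# & | ^ ~ work bit by bit on integers (6 is 110 and 3 is 011 in binary)
bits = (6 & 3, 6 | 3, 6 ^ 3, ~6)
print(bits)
```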

## Data structures
### Objects
Before describing other data structures you should know something about objects.

The difference between objects and functions is (simplifying) as follows:
**A function is a set of instructions to run. It does not have a state and cannot "exist". Objects exist, there may be many objects simultaneously, and each one may be in a different state.**

Put simply, an object is a complex element created based on the definition of a given class. A class may have multiple attributes (variables) and methods (functions). Both the attributes and the methods of an object are accessed using a dot.

``` python
# Show contents of attribute "variable1" being an element of object1 
print(object1.variable1)
# The method is similarly used
result = object1.function1(pi)
```
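
The snippet above uses placeholder names. A runnable sketch with a made-up class `Counter` (not part of the course material) shows two objects of the same class, each with its own state:

```python
class Counter:
    """A hypothetical class with one attribute and one method."""

    def __init__(self, start):
        # Attribute: every object stores its own value.
        self.value = start

    def increase(self, step):
        # Method: a function that works on the object's state.
        self.value = self.value + step
        return self.value


counter1 = Counter(0)
counter2 = Counter(100)

# Attributes and methods are accessed using a dot.
counter1.increase(5)
result = counter2.increase(1)

print(counter1.value)
print(counter2.value)
print(result)
```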

### Basic data structures
Four basic data structures in Python:
* lists
* tuples
* dictionaries
* sets

An array type exists in Python, but it is rarely used in practice. NumPy is used for large tables/numerical matrices, and other cases are easier to handle with lists. Full documentation is available here: https://docs.python.org/3/tutorial/datastructures.html

### Lists
Lists are a convenient and flexible method of storing data. Lists are dynamic and may change their state (are mutable). You can extend and modify them, which makes them very practical in everyday usage. This code shows their capabilities:

In [7]:
# Lists are created using square brackets
emptyList = []

# You can also create a new list object.
emptyList2 = list()
print(emptyList, emptyList2)

colors = ["red", "blue", "green", "orange"]

# Lists are indexed by a number
print(colors[0])
print(colors[:])

# Print elements [1,3)
print(colors[1:3])

[] []
red
['red', 'blue', 'green', 'orange']
['blue', 'green']


In [9]:
# You can append an element to the end of the list in two ways
colors[len(colors):] = ["yellow"]
print(colors)

# or use an existing method of a list object.
colors.append("black")
print(colors)

# You can also insert an element at a given position inside the list
colors.insert(2, "black")
print(colors)

['red', 'blue', 'green', 'orange', 'yellow']
['red', 'blue', 'green', 'orange', 'yellow', 'black']
['red', 'blue', 'black', 'green', 'orange', 'yellow', 'black']


In [10]:
# Count how many times an element occurs:
print("Count 'black':", colors.count("black"))

# Does an element exist in the list?
print("Is 'black' in the colors list?:", "black" in colors)
print("Is 'black' not in the colors list?:", "black" not in colors)

Count 'black': 2
Is 'black' in the colors list?: True
Is 'black' not in the colors list?: False


In [11]:
# There are two ways of deleting elements
numbers = [4, 5, 5, 6]
print(numbers)

# Delete the first element equal to a given value
numbers.remove(5)
print(numbers)

numbers = [4, 5, 6]
# Delete element using an index
numbers.pop(1)
print(numbers)

[4, 5, 5, 6]
[4, 5, 6]
[4, 6]


In [None]:
numbers = [4, 5, 6]
# You can also reverse a list
numbers.reverse()
print(numbers)

In [13]:
colors = ["red", "blue", "green"]
numbers = [4, 5, 6]
# Lists are very flexible. They allow you to have different data types in one list
mixedList = colors + numbers
print(mixedList)

# You can even create a list of lists and other combinations
mixedList1 = list(colors)
mixedList1.append(numbers)
print(mixedList1)
mixedList2 = []
mixedList2.append(colors)
mixedList2.append(numbers)
print(mixedList2)

['red', 'blue', 'green', 4, 5, 6]
['red', 'blue', 'green', [4, 5, 6]]
[['red', 'blue', 'green'], [4, 5, 6]]


In the cell above the function list() created a new list. Why did we have to do it to create mixedList1? What would happen if you wrote mixedList1 = colors instead of using the function?

You can test this and think about why the result is different. We will return to this question soon.

Obviously, sorting lists is also possible. How do you sort a list in reverse? Place the cursor on "sort" or inside the parentheses after "sort" and press Shift+Tab.

In [14]:
# Sort:
print('sorted')
colors.sort()
print(colors)

print('change the order')
colors.sort(reverse=True)
print(colors)

print("Length before clearing", len(colors))
# ... and clear:
colors.clear()
print(colors)

sorted
['blue', 'green', 'red']
change the order
['red', 'green', 'blue']
Length before clearing 3
[]


### Sets and lists
Sets are most closely related to lists. You can perform similar operations as in the case of lists. Changing one data structure to another is very simple.

In [3]:
# First, let's create two lists
colors1 = ['black', 'blue', 'green','yellow']
colors2 = ['black', 'green', 'orange', 'red']
# You can join two lists using addition
allColors = colors1 + colors2
print(allColors)

['black', 'blue', 'green', 'yellow', 'black', 'green', 'orange', 'red']


As you can see, "black" and "green" appear in the same list twice. What if we wanted to use lists as sets and avoid duplicates? We can convert the lists to sets and join those.
To avoid mistakes and maintain logical cohesion, the operator for joining sets (|) is different from the one for lists (+).

Pay attention to the different bracket type used for sets: {}. We can create new sets using the class name set() or by putting elements in {}. NOTICE: typing "var={}" creates a dictionary and not a set, see more below.

In [2]:
#Convert list to set
# Create an empty set.
emptySet = set()
print(emptySet)
colors3 = {'brown', 'navy'}
print(colors3)
allColors = set(colors1) | set(colors2) | colors3
print(allColors)
# Add one more color to allColors: use add() for a set (append() is for lists)
allColors.add("violet")
print(allColors)
# discard "red" from set (equivalent in lists: remove)
allColors.discard("red")
print(allColors)

# Convert set to list
allColors = list(allColors)
print(allColors)

set()
{'navy', 'brown'}
{'black', 'red', 'orange', 'navy', 'green', 'brown', 'blue', 'yellow'}
{'violet', 'yellow', 'black', 'navy', 'orange', 'green', 'brown', 'blue', 'red'}
{'violet', 'yellow', 'black', 'navy', 'orange', 'green', 'brown', 'blue'}
['violet', 'yellow', 'black', 'navy', 'orange', 'green', 'brown', 'blue']


<b>In-place operations on sets (for sets x, y):</b>
* Union: x.update(y), using an operator: x|=y
* Intersection: x.intersection_update(y), using an operator: x&=y
* Difference: x.difference_update(y), using an operator: x-=y
* Symmetric difference: x.symmetric_difference_update(y), using an operator: x^=y

For example:

In [4]:
#intersection
commonColors = set(colors1) & set(colors2)
print(commonColors)

#symmetric difference (elements that belong to exactly one of the sets)
notCommonColors = set(colors1) ^ set(colors2)
print(notCommonColors)

{'green', 'black'}
{'orange', 'blue', 'yellow', 'red'}
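
The `*_update` methods and the augmented operators listed earlier change the set in place, whereas `|`, `&`, `-` and `^` return a new set. A minimal sketch with small integer sets chosen only for illustration:

```python
x = {1, 2, 3, 4}
y = {3, 4, 5}

# Non-updating operators return a new set; x and y stay unchanged.
print(x | y, x & y, x - y, x ^ y)

# In-place variants modify x itself.
x |= y          # same as x.update(y)
print(x)
x -= {1, 2}     # same as x.difference_update({1, 2})
print(x)
x &= {4, 5, 6}  # same as x.intersection_update({4, 5, 6})
print(x)
x ^= {5, 7}     # same as x.symmetric_difference_update({5, 7})
print(x)
```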


Obviously you can check inclusion of sets:

In [5]:
print("Is colors2 a subset of colors1?", set(colors1) > set(colors2))
print("Is ['black', 'blue'] a subset of colors1?", set(colors1) > set(['black', 'blue']))

Is colors2 a subset of colors1? False
Is ['black', 'blue'] a subset of colors1? True


### Dictionaries
Dictionaries are similar to sets. A set guarantees that its elements are unique, so they can serve as keys (indexes). In a dictionary every key has an associated value (possibly None). Dictionaries are, in a way, an extension of sets: a set holds unique values, a dictionary holds unique keys with values attached. We can create an empty dict using {}.

In [None]:
emptyDict = {}
emptyDict2 = dict()
print(emptyDict, emptyDict2)
author = {'name': 'Tom', 'surname':'Silver', 'age': 22}
print(author)
# adding elements is simply defining value for a key
author["height"] = 192
print(author)
# key may also be a number
author[1] = "Python >> R"
print(author)
# you can delete a single element from a dictionary
del author["age"]
print(author)

In [None]:
# Just as in the case of sets you can join/update two dictionaries
authorsAge = {'age': 32}
author.update(authorsAge)
print(author)

Other set operations cannot be performed on dictionaries, because two dictionaries may hold different values for the same key, so many operations would be ambiguous.
It is, however, possible to perform such operations on the keys alone.

In [None]:
print(set(author.keys()))
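
The views returned by `keys()` already support set operators, so the keys of two dictionaries can be compared directly. A small sketch; the second dictionary `reviewer` is invented for illustration:

```python
author = {'name': 'Tom', 'surname': 'Silver', 'height': 192}
reviewer = {'name': 'Ann', 'age': 41}

# Keys present in both dictionaries
common_keys = author.keys() & reviewer.keys()
print(common_keys)

# Keys present only in author (sorted, because sets have no fixed order)
only_author = sorted(author.keys() - reviewer.keys())
print(only_author)

# A membership test on a dictionary checks its keys
print('age' in author)
```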

Sometimes you may want to print key/value pairs from a dictionary:

In [None]:
print(author.items())

### Tuples
Tuples are the last of the basic built-in data structures. In practice, this data structure is not used directly too often in the area of data analysis. Tuples are similar to lists in some ways: you can choose elements by index, and tuples may contain values of various types, be nested, etc.

Tuples use parentheses "(". As a reminder: lists use square brackets "[" and sets use curly brackets "{".

In [None]:
tuple1 = (5, 10, 15, "Hurray!")
print(tuple1)
print(tuple1[1])
print(tuple1[3])

#### Tuples immutability
 - tuples are static/immutable objects
 - they cannot be changed after being created, 
 - it is not possible to add elements to existing tuple (see code below).

This behavior is completely different from lists, sets and dictionaries, which are dynamic and mutable. Mutability makes those structures flexible and convenient for fast programming.

Why do tuples and other immutable objects exist? Because of their efficiency. Since they cannot be changed, they are simple to handle, so access to them can be fast and efficient. In practice the only case in which we will use tuples is returning multiple values from a function.

In [6]:
# This raises an error, because tuples cannot be modified
tuple1[1] = 5

TypeError: 'tuple' object does not support item assignment
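
Returning several values from a function, the use case mentioned above, relies on tuple packing and unpacking. A minimal sketch; the function `min_max` is a hypothetical helper:

```python
def min_max(values):
    """Return the smallest and the largest element as a tuple."""
    return min(values), max(values)


result = min_max([4, 5, 6])
print(result)
print(type(result))

# Unpacking assigns each tuple element to its own variable.
lowest, highest = result
print(lowest, highest)

# A tuple cannot be modified, but a new one can be built from it.
extended = result + (10,)
print(extended)
```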

### Strings
Strings are another immutable data type. They will be covered in more depth later on. In Python, both single ' and double " quotes may be used to create strings, as well as triple quotes (''' or """) for strings spanning multiple lines.

In [None]:
a = "hello"
b = 'world'
# You can print first four characters in order
print(a[0], a[1], a[2], a[3])
# Or at once
print(a[0:4])
# Adding two strings together creates a third object, a and b remain unchanged.
print(a + b)
c = '''This is a string
that contains
many
lines ... '''
print(c)
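
Immutability means that string methods never change the original object; they return a new string. A short sketch using standard `str` methods:

```python
a = "hello"

# Methods return a new string; a itself stays the same.
upper_a = a.upper()
replaced = a.replace("h", "j")
print(upper_a, replaced, a)

# Assigning to a single character is not allowed.
try:
    a[0] = "j"
except TypeError as error:
    print("TypeError:", error)

# Negative indices count from the end.
last_char = a[-1]
last_three = a[-3:]
print(last_char, last_three)
```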

### Conditional expressions

We use if ... elif ... else statements to define conditional expressions:

In [None]:
if expression1:
   statement(s)
   if expression2:
      statement(s)
   elif expression3:
      statement(s)
   else:
      statement(s)
elif expression4:
   statement(s)
else:
   statement(s)

In [4]:

x = 3
if x < 2:
    print("Value below 2")
elif x > 10:
    print("Value above 10")
else:
    print("Value between 2 and 10, inclusive")
    
var = 100
if var < 200:
    print("Expression value is less than 200")
    if var == 150:
        print("Which is 150")
    elif var == 100:
        print("Which is 100")
    elif var == 50:
        print("Which is 50")
elif var < 50:
    print("Expression value is less than 50")
else:
    print("Could not find true expression")
    

Value between 2 and 10, inclusive
Expression value is less than 200
Which is 100


### Loops

Two types of loops:
- for loop
- while loop

### For, ranges and iterators
There is an easy way to create ranges of numbers in Python. See a few examples using "for" and the built-in "range":

In [13]:
# If you want to simply print a range of numbers, you will see a "strange" result:
print(range(4))
# Output "range(0,4)" tells you what has been created.
# It does not tell you about all the elements it can show.

# Print numbers 0, 1, 2, 3
# Remark: 4 will not be printed below!
print("Print all elements in range(4): ")
for i in range(4):
    print(i)
# See two other examples:
print("Print all elements in range(2, 10, 2): ")
for i in range(2, 10, 2):
    print(i)

print("Print all elements in range(0, -11, -3): ")
for i in range(0, -11, -3):
    print(i)
    

range(0, 4)
Print all elements in range(4): 
0
1
2
3
Print all elements in range(2, 10, 2): 
2
4
6
8
Print all elements in range(0, -11, -3): 
0
-3
-6
-9
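
To see all elements of a range at once, it can be converted to a list. A brief sketch:

```python
# range is lazy: it produces numbers on demand instead of storing them all.
first = list(range(4))
evens = list(range(2, 10, 2))
descending = list(range(0, -11, -3))
print(first)
print(evens)
print(descending)

# len(), indexing and membership tests work without building the list.
r = range(2, 10, 2)
print(len(r), r[1], 6 in r)
```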


A for loop can also traverse a container (e.g. a list) directly, which is useful when you want to see what each element contains.

In [10]:
#1st approach
print("1st approach")    
colors = ["red", "blue", "green"]
for color in colors:
    print(color)
print("\n\n2nd approach")    
#2nd approach
for i in range(len(colors)):
    print(colors[i])

1st approach
red
blue
green


2nd approach
red
blue
green
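
When both the index and the element are needed, the built-in `enumerate` combines the two approaches. A small sketch; the `author` dictionary repeats values used earlier in this notebook:

```python
colors = ["red", "blue", "green"]

# enumerate yields (index, element) pairs.
labels = []
for i, color in enumerate(colors):
    labels.append("{0}: {1}".format(i, color))
    print(i, color)

# The same idea works for dictionaries with items().
author = {'name': 'Tom', 'surname': 'Silver'}
pairs = []
for key, value in author.items():
    pairs.append((key, value))
    print(key, "->", value)
```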


### while loop

In [5]:
import math
# Description of other functions available in the math module:
# https://docs.python.org/3/library/math.html

# Approximate e by (1 + 1/k)^k until the error drops below the tolerance
tol = 0.1
diff = 1
k = 1
while diff > tol:
    approx = math.pow(1 + 1 / k, k)
    diff = math.e - approx
    print(k, approx, diff)
    k += 1

1 2.0 0.7182818284590451
2 2.25 0.4682818284590451
3 2.37037037037037 0.3479114580886753
4 2.44140625 0.2768755784590451
5 2.4883199999999994 0.22996182845904567
6 2.5216263717421135 0.19665545671693163
7 2.546499697040712 0.17178213141833298
8 2.565784513950348 0.1524973145086972
9 2.5811747917131984 0.13710703674584668
10 2.5937424601000023 0.12453936835904278
11 2.6041990118975287 0.11408281656151642
12 2.613035290224676 0.10524653823436925
13 2.6206008878857308 0.09768094057331433


### Continue
Sometimes you may want to skip a loop iteration. You can use the continue statement for that. For example:

In [6]:
for i in range(11):
    if i % 3 == 0:
        continue
    else:
        print(i)

1
2
4
5
7
8
10


### Break
A loop (for or while) may be stopped using the break statement.

In [5]:
import math
# Description of other functions available in the math module:
# https://docs.python.org/3/library/math.html

# With tol = 0 the condition diff > tol stays true in practice,
# so break is needed to stop the loop
tol = 0
diff = 1
k = 1
while diff > tol:
    approx = math.pow(1 + 1 / k, k)
    diff = math.e - approx
    print(k, approx, diff)
    k += 1
    if k > 15:
        print("Value of tol (tolerance) is probably wrong... break.")
        break

1 2.0 0.7182818284590451
2 2.25 0.4682818284590451
3 2.37037037037037 0.3479114580886753
4 2.44140625 0.2768755784590451
5 2.4883199999999994 0.22996182845904567
6 2.5216263717421135 0.19665545671693163
7 2.546499697040712 0.17178213141833298
8 2.565784513950348 0.1524973145086972
9 2.5811747917131984 0.13710703674584668
10 2.5937424601000023 0.12453936835904278
11 2.6041990118975287 0.11408281656151642
12 2.613035290224676 0.10524653823436925
13 2.6206008878857308 0.09768094057331433
14 2.6271515563008685 0.0911302721581766
15 2.6328787177279187 0.08540311073112639
Value of tol (tolerance) is probably wrong... break.


## Functions

- A function block begins with the keyword <B>def</B>, followed by the function name and parentheses ().
- Any input arguments are placed inside the parentheses.
- The code block starts after a <B>colon :</B> and is indented.
- The return statement exits the function and optionally passes a value back to the caller.


In [9]:
total = 0
def add(arg1, arg2):
    # Add both parameters and return the result.
    # "total" assigned here is a local variable; the global "total" is not changed.
    total = arg1 + arg2
    print("Inside the function : ", total)
    return total

add(23, 45)
print(total)

Inside the function :  68
0
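
The output `0` shows that assigning to `total` inside the function creates a local variable and leaves the global one untouched. To use the result, capture the return value. A minimal sketch; the function name `add_numbers` and the default value are illustrative:

```python
def add_numbers(arg1, arg2=10):
    """Return the sum of both arguments; arg2 defaults to 10."""
    local_total = arg1 + arg2   # exists only while the function runs
    return local_total


total = 0
total = add_numbers(23, 45)   # store the returned value
print(total)

# Arguments can be passed by name; omitted ones use their default.
with_default = add_numbers(arg1=5)
by_name = add_numbers(arg2=1, arg1=2)
print(with_default)
print(by_name)
```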
