# SQL Databases

## Introduction

Database systems are used to store data.

By their nature, data can be divided into two basic categories:
- Homogeneous data
- Heterogeneous data

Homogeneous data are data whose items all share the same structure, unlike heterogeneous data, whose items differ in structure.

Homogeneous data can be compared to a table in Excel, which has a fixed number of columns, with the values stored in those columns.

Relational databases are suited to homogeneous data. If your data are not homogeneous and you want (or have) to use a relational database, you must homogenize them first.
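As a rough illustration of homogenization, the sketch below maps records with differing keys onto one fixed set of columns and fills missing values with `None`. The record contents and the `homogenize` helper are made up for this example.

```python
# Heterogeneous records: each dictionary has a different set of keys.
records = [
    {'name': 'Monique', 'email': 'monique@example.org'},
    {'name': 'Teri', 'phone': '555-555-5552'},
    {'name': 'Louis', 'email': 'louis@example.org', 'phone': '555-555-5554'},
]

def homogenize(records, columns):
    """Yield one tuple per record with a value for every column (None if missing)."""
    for record in records:
        yield tuple(record.get(column) for column in columns)

columns = ['name', 'email', 'phone']
rows = list(homogenize(records, columns))
for row in rows:
    print(row)
```

After this step every row has the same three fields, so the rows could be stored in a single table.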

**[SQL](https://www.w3schools.com/sql/)** (Structured Query Language) is a standardized query language that is implemented, to varying degrees, by database systems such as MySQL (and MariaDB), MS SQL, and PostgreSQL.

A database consists of tables, and the tables represent homogeneous data. Relations between tables are defined by foreign keys.

For database systems (RDBMS), the SQL language is defined; it can be understood as a set of commands for administering the database and for working with the data.
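To make tables and foreign keys more concrete, here is a minimal sketch using Python's built-in `sqlite3` module as a stand-in for a full database server. The table and column names (`program`, `student`, `program_id`) are illustrative assumptions. Note that SQLite enforces foreign keys only after `PRAGMA foreign_keys = ON`.

```python
import sqlite3

connection = sqlite3.connect(':memory:')
connection.execute('PRAGMA foreign_keys = ON')  # SQLite does not enforce foreign keys by default

connection.execute('CREATE TABLE program (id INTEGER PRIMARY KEY, name TEXT)')
connection.execute(
    'CREATE TABLE student ('
    ' id INTEGER PRIMARY KEY,'
    ' name TEXT,'
    ' program_id INTEGER REFERENCES program(id))'
)

connection.execute("INSERT INTO program VALUES (1, 'Programming')")
connection.execute("INSERT INTO student VALUES (1, 'Monique', 1)")  # valid: program 1 exists

try:
    connection.execute("INSERT INTO student VALUES (2, 'Teri', 99)")  # invalid: there is no program 99
    violation_detected = False
except sqlite3.IntegrityError:
    violation_detected = True

student_count = connection.execute('SELECT COUNT(*) FROM student').fetchone()[0]
print('violation detected:', violation_detected)
print('students stored:', student_count)
```

The database rejects the second insert because it would reference a row that does not exist in the related table.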

## A few thoughts on data flows
With large data volumes, it is undesirable, and often not even possible, to process the data set as a whole. An array (```list```, ```array```, etc.) is then processed element by element.

In [3]:
# you already know this approach
data = [0, 1, 2, 3] # data
def oldFashioned(data, cislo): # function that takes a list of values and returns a list of values
    result = []
    for item in data:
        result.append(item + cislo)
    return result

prictenoOld = oldFashioned(data, 2)
print(data, '-->', prictenoOld)

[0, 1, 2, 3] --> [2, 3, 4, 5]


In [4]:
# you have (almost certainly) not been taught this approach
def newWay(data, cislo):
    for item in data:
        yield item + cislo

result = newWay(data, 2)
print(result)

prictenoNew = list(result)
print(data, '-->', prictenoNew)    

<generator object newWay at 0x7fa7f519ac80>
[0, 1, 2, 3] --> [2, 3, 4, 5]


In [7]:
# the same generator, with prints showing when each partial result is produced
def newWay(data, cislo):
    for item in data:
        print("I am ready to send partial result")
        yield item + cislo
        print("I sent partial result")

result = newWay(data, 2)
print(result)


itemFromResult = next(result)
print(itemFromResult)
itemFromResult = next(result)
print(itemFromResult)
itemFromResult = next(result)
print(itemFromResult)
itemFromResult = next(result)
print(itemFromResult)

print('-'*30)
for item in result:
    print(item)
print('-'*30)


<generator object newWay at 0x7fa7f519e200>
I am ready to send partial result
2
I sent partial result
I am ready to send partial result
3
I sent partial result
I am ready to send partial result
4
I sent partial result
I am ready to send partial result
5
------------------------------
I sent partial result
------------------------------


### Detailed demonstration

In [8]:
def oldA(data, hodnota):
    result = []
    for item in data:
        print('old krat', item, hodnota)
        result.append(item * hodnota)
    return result

def oldB(data, hodnota):
    result = []
    for item in data:
        print('old plus', item, hodnota)
        result.append(item + hodnota)
    return result

def newA(data, hodnota):
    for item in data:
        print('new krat', item, hodnota)
        yield item * hodnota

def newB(data, hodnota):
    for item in data:
        print('new plus', item, hodnota)
        yield (item + hodnota)

inData = [0, 1, 2]
outDataOld = oldB(oldA(inData, 2), 3)
outDataNew = newB(newA(inData, 2), 3)
print('=' * 30)
print('=', 'Vysledky')
print('=' * 30)
print('outDataOld', outDataOld)
print('outDataNew', outDataNew)
print('outDataNew', list(outDataNew))

old krat 0 2
old krat 1 2
old krat 2 2
old plus 0 3
old plus 2 3
old plus 4 3
==============================
= Vysledky
==============================
outDataOld [3, 5, 7]
outDataNew <generator object newB at 0x7fa7f519e2e0>
new krat 0 2
new plus 0 3
new krat 1 2
new plus 2 3
new krat 2 2
new plus 4 3
outDataNew [3, 5, 7]


### Why ```list()```

In [None]:
prictenoWOList = (newWay(data, 2))
print(data, '-->', prictenoWOList)    
print('Spustime vypocet')
pricteno2List = list(newWay(data, 2))
print(data, '-->', pricteno2List)    


[0, 1, 2, 3] --> <generator object newWay at 0x7fccd237b360>
Spustime vypocet
I am ready to send partial result
I sent partial result
I am ready to send partial result
I sent partial result
I am ready to send partial result
I sent partial result
I am ready to send partial result
I sent partial result
[0, 1, 2, 3] --> [2, 3, 4, 5]


### Generators
Functions containing a ```yield``` expression are generator functions. They are available in many programming languages, for example Python, JavaScript, and C#. Calling such a function does not return a value directly; it returns a generator, an object with defined properties and methods. One of its methods (typically ```next```) can be called repeatedly to obtain the values that make up the processed sequence.

In the example, the ```list``` function is used to convert a generator into a list. Only at that moment does the computation actually run. Study the following lines of code carefully.

Beware of exhausted generators. A generator can be iterated over only once. Once you have processed a block of data supplied by the generator and moved on to the next block, the previous block is forgotten.

In [None]:
def demo(data, cislo):
    for item in data:
        print('pricitam', item, '+', cislo, '=', item + cislo)
        yield item + cislo

generator = demo(data, 2)
print('generator:', data, '-->', generator)    
print('Teprve ted spustime vypocet')
vysledek = list(generator)
print('výsledek', data, '-->', vysledek)        

generator: [0, 1, 2, 3] --> <generator object demo at 0x7fccd2c55db0>
Teprve ted spustime vypocet
pricitam 0 + 2 = 2
pricitam 1 + 2 = 3
pricitam 2 + 2 = 4
pricitam 3 + 2 = 5
výsledek [0, 1, 2, 3] --> [2, 3, 4, 5]
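The warning about exhausted generators can be checked with a short sketch. The function name `add_to_each` is chosen for this example only.

```python
def add_to_each(data, number):
    """Generator function: yields each item increased by `number`."""
    for item in data:
        yield item + number

generator = add_to_each([0, 1, 2, 3], 2)

first_pass = list(generator)   # consumes the generator completely
second_pass = list(generator)  # the generator is exhausted, nothing is left

print('first pass: ', first_pass)
print('second pass:', second_pass)
```

If the values are needed more than once, either store them in a list or create a fresh generator by calling the function again.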


### Generators II
Generators are effectively state machines. They are well suited to defining a sequence of operations over a data stream. The same principle underlies pipelines in UNIX-like operating systems.
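Chaining generators resembles a UNIX pipeline such as `cat | grep | head`. The sketch below shows three small generator stages connected in that spirit. The stage names `source`, `grep`, and `head` and the sample log lines are assumptions made for this example.

```python
def source(lines):
    """First stage: emit lines one by one."""
    for line in lines:
        yield line

def grep(pattern, lines):
    """Middle stage: pass through only the lines that contain the pattern."""
    for line in lines:
        if pattern in line:
            yield line

def head(count, lines):
    """Last stage: stop after the first `count` lines."""
    for index, line in enumerate(lines):
        if index >= count:
            return
        yield line

log = ['error: disk', 'info: ok', 'error: net', 'error: cpu']
pipeline = head(2, grep('error', source(log)))  # nothing is computed yet

result = list(pipeline)  # the stages run only now, item by item
print(result)
```

Each stage holds only one item at a time, so the same pipeline works for inputs that do not fit into memory.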

Consider the functions

$g_1(x)=x+1$

$g_2(x)=x+2$

Both functions can be generalized by the function

$f(x, y)=x+y$

because

$g_1(x)=f(x, 1)$

$g_2(x)=f(x, 2)$

We can define a functional $F$ for which

$F(f, 1)=g_1$

$F(f, 2)=g_2$

Functionals (higher-order functions) are the foundation of functional programming. A language such as F# can be used for functional programming, but elements of functional programming are also available (and often used) in Python, JavaScript, and C#.

In [10]:
def f(x, y):
    return x + y

def F(f, x):
    def g(y):
        return f(x, y)
    return g

g1 = F(f, 1)
g2 = F(f, 2)

print('f(5, 10) =', f(5, 10))
print('g1(3) =', g1(3))
print('g2(3) =', g2(3))

f(5, 10) = 15
g1(3) = 4
g2(3) = 5


In [11]:
result = F(f, 3)(3)
print(result)

6
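Python's standard library offers `functools.partial` for the same kind of partial application as the functional `F` above. A short sketch, with `f` redefined so that the block is self-contained:

```python
from functools import partial

def f(x, y):
    return x + y

g1 = partial(f, 1)  # fixes x = 1, so g1(y) == f(1, y)
g2 = partial(f, 2)  # fixes x = 2, so g2(y) == f(2, y)

print('g1(3) =', g1(3))
print('g2(3) =', g2(3))
```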


In [15]:
import pandas as pd
def displayData(data):
    df = pd.DataFrame(data)
    display(df)

### Example of homogeneous data


In [16]:
dataStudents = [
  (1,'Monique Davis',400,'Literature','Monique@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
  (2,'Teri Gutierrez',800,'Programming','Teri@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
  (3,'Spencer Pautier',1000,'Programming','Spencer@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
  (4,'Louis Ramsey',1200,'Programming','Louis@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
  (5,'Alvin Greene',1200,'Programming','Alvin@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
  (6,'Sophie Freeman',1200,'Programming','Sophie@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
  (7,'Edgar Frank \"Ted\" \"Codd\"',2400,'Computer Science','Edgar@someOtherSchool.edu','2017-08-16 15:35:33','2017-09-02 19:33:56'),
  (8,'Donald D. Chamberlin',2400,'Computer Science','Donald@someOtherSchool.edu','2017-08-16 15:35:33','2017-09-02 19:33:56'),
  (9,'Raymond F. Boyce',2400,'Computer Science','Raymond@someOtherSchool.edu','2017-08-16 15:35:33','2017-09-02 19:33:56')]

In [17]:
dataContactInfo = [
  (1,'Monique.Davis@freeCodeCamp.org','555-555-5551',97111),
  (2,'Teri.Gutierrez@freeCodeCamp.org','555-555-5552',97112),
  (3,'Spencer.Pautier@freeCodeCamp.org','555-555-5553',97113),
  (4,'Louis.Ramsey@freeCodeCamp.org','555-555-5554',0),
  (5,'Alvin.Green@freeCodeCamp.org','555-555-5555',97115),
  (6,'Sophie.Freeman@freeCodeCamp.org','555-555-5556',97116),
  (7,'Maximo.Smith@freeCodeCamp.org','555-555-5557',97117),
  (8,'Michael.Roach@freeCodeCamp.ort','555-555-5558',97118)
]

In [18]:
def tuple2dictionary(names, item):
    result = {}
    for name, value in zip(names, item):
        result[name] = value
    return result

def tuple2dictionarySeq(names, sequence):
    for item in sequence:
        yield tuple2dictionary(names, item)

namesStudents = ['studentID', 'FullName', 'sat_score', 'programOfStudy', 'schoolEmailAdr', 'rcd_Created', 'rcd_Updated']
namesContacts = ['studentID', 'studentEmailAddr',  'student-phone-cell', 'student-US-zipcode']

namedStudents = tuple2dictionarySeq(namesStudents, dataStudents)
namedContacts = tuple2dictionarySeq(namesContacts, dataContactInfo)

displayData(namedStudents)
displayData(namedContacts)

Unnamed: 0,studentID,FullName,sat_score,programOfStudy,schoolEmailAdr,rcd_Created,rcd_Updated
0,1,Monique Davis,400,Literature,Monique@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
1,2,Teri Gutierrez,800,Programming,Teri@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
2,3,Spencer Pautier,1000,Programming,Spencer@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
3,4,Louis Ramsey,1200,Programming,Louis@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
4,5,Alvin Greene,1200,Programming,Alvin@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
5,6,Sophie Freeman,1200,Programming,Sophie@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
6,7,"Edgar Frank ""Ted"" ""Codd""",2400,Computer Science,Edgar@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56
7,8,Donald D. Chamberlin,2400,Computer Science,Donald@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56
8,9,Raymond F. Boyce,2400,Computer Science,Raymond@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56


Unnamed: 0,studentID,studentEmailAddr,student-phone-cell,student-US-zipcode
0,1,Monique.Davis@freeCodeCamp.org,555-555-5551,97111
1,2,Teri.Gutierrez@freeCodeCamp.org,555-555-5552,97112
2,3,Spencer.Pautier@freeCodeCamp.org,555-555-5553,97113
3,4,Louis.Ramsey@freeCodeCamp.org,555-555-5554,0
4,5,Alvin.Green@freeCodeCamp.org,555-555-5555,97115
5,6,Sophie.Freeman@freeCodeCamp.org,555-555-5556,97116
6,7,Maximo.Smith@freeCodeCamp.org,555-555-5557,97117
7,8,Michael.Roach@freeCodeCamp.ort,555-555-5558,97118


## Relational algebra

Operators
- Union,
- Intersection,
- Difference (the similarity to set operations is no coincidence)

---
- Selection (SQL ```where```),
- Projection (SQL ```select```),
- Cartesian product (SQL ```join```),
- Rename (SQL ```as```)
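The first three operators behave like ordinary set operations as long as both relations have the same columns. A minimal sketch using Python sets of tuples; the two relations are small, made-up subsets of student rows:

```python
# Two relations with the same columns (studentID, FullName), stored as sets of tuples.
programming = {(2, 'Teri Gutierrez'), (3, 'Spencer Pautier'), (4, 'Louis Ramsey')}
high_score = {(3, 'Spencer Pautier'), (4, 'Louis Ramsey'), (8, 'Donald D. Chamberlin')}

union = programming | high_score         # rows present in at least one relation
intersection = programming & high_score  # rows present in both relations
difference = programming - high_score    # rows present only in the first relation

print('union       ', sorted(union))
print('intersection', sorted(intersection))
print('difference  ', sorted(difference))
```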


### Selection using Python

In [21]:
def createSelectPartial(queryF):
    def inner(generator):
        result = []
        for item in generator:
            if queryF(item):
                result.append(item)
        return result
    return inner

condition = lambda item: item < 5
print(condition(4))
print(condition(5))

dataSequence = [0, 5, 4, 8, 3, 2]
selector = createSelectPartial(condition)
result = selector(dataSequence)
print(result)

True
False
[0, 4, 3, 2]


In [22]:
def createSelectFull(queryF, generator):
    result = []
    for item in generator:
        if queryF(item):
            result.append(item)
    return result

def createSelectPartial(queryF):
    def inner(generator):
        result = []
        for item in generator:
            if queryF(item):
                result.append(item)
        return result
    return inner

def createSelectPartialEx(queryF):
    def inner(generator):
        for item in generator:
            if queryF(item):
                yield item
    return inner

studentsData = list(tuple2dictionarySeq(namesStudents, dataStudents))
#newData = createSelectFull(lambda item: item['sat_score'] >= 1000, studentsData)
#namedStudents = tuple2dictionarySeq(namesStudents, dataStudents)

mujFilter = createSelectPartialEx(lambda item: item['sat_score'] >= 1000)
filteredData = mujFilter(studentsData)

#filteredData = createSelectFull(lambda item: item['sat_score'] >= 1000, studentsData)

displayData(filteredData)

Unnamed: 0,studentID,FullName,sat_score,programOfStudy,schoolEmailAdr,rcd_Created,rcd_Updated
0,3,Spencer Pautier,1000,Programming,Spencer@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
1,4,Louis Ramsey,1200,Programming,Louis@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
2,5,Alvin Greene,1200,Programming,Alvin@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
3,6,Sophie Freeman,1200,Programming,Sophie@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
4,7,"Edgar Frank ""Ted"" ""Codd""",2400,Computer Science,Edgar@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56
5,8,Donald D. Chamberlin,2400,Computer Science,Donald@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56
6,9,Raymond F. Boyce,2400,Computer Science,Raymond@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56


In [23]:
print(studentsData)

[{'studentID': 1, 'FullName': 'Monique Davis', 'sat_score': 400, 'programOfStudy': 'Literature', 'schoolEmailAdr': 'Monique@someOtherSchool.edu', 'rcd_Created': '2017-08-16 15:34:50', 'rcd_Updated': '2017-09-02 19:33:56'}, {'studentID': 2, 'FullName': 'Teri Gutierrez', 'sat_score': 800, 'programOfStudy': 'Programming', 'schoolEmailAdr': 'Teri@someOtherSchool.edu', 'rcd_Created': '2017-08-16 15:34:50', 'rcd_Updated': '2017-09-02 19:33:56'}, {'studentID': 3, 'FullName': 'Spencer Pautier', 'sat_score': 1000, 'programOfStudy': 'Programming', 'schoolEmailAdr': 'Spencer@someOtherSchool.edu', 'rcd_Created': '2017-08-16 15:34:50', 'rcd_Updated': '2017-09-02 19:33:56'}, {'studentID': 4, 'FullName': 'Louis Ramsey', 'sat_score': 1200, 'programOfStudy': 'Programming', 'schoolEmailAdr': 'Louis@someOtherSchool.edu', 'rcd_Created': '2017-08-16 15:34:50', 'rcd_Updated': '2017-09-02 19:33:56'}, {'studentID': 5, 'FullName': 'Alvin Greene', 'sat_score': 1200, 'programOfStudy': 'Programming', 'schoolEmail

In [None]:
def createSelect(queryF):
    def selectF(generator):
        return filter(queryF, generator)
    return selectF

def createSelectEx(queryF): # same as createSelect
    def selectF(generator):
        return (item for item in generator if queryF(item)) # see https://docs.python.org/3/howto/functional.html
    return selectF

def createSelect2(queryF):
    def selectF(generator):
        for item in generator:
            if queryF(item):
                yield item
    return selectF

studentsWithHighScoreSelection = createSelect(lambda item: item['sat_score'] >= 1000)
namedStudents = tuple2dictionarySeq(namesStudents, dataStudents)

subsetResult = studentsWithHighScoreSelection(namedStudents)
displayData(subsetResult)

Unnamed: 0,studentID,FullName,sat_score,programOfStudy,schoolEmailAdr,rcd_Created,rcd_Updated
0,3,Spencer Pautier,1000,Programming,Spencer@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
1,4,Louis Ramsey,1200,Programming,Louis@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
2,5,Alvin Greene,1200,Programming,Alvin@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
3,6,Sophie Freeman,1200,Programming,Sophie@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
4,7,"Edgar Frank ""Ted"" ""Codd""",2400,Computer Science,Edgar@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56
5,8,Donald D. Chamberlin,2400,Computer Science,Donald@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56
6,9,Raymond F. Boyce,2400,Computer Science,Raymond@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56


### Projection using Python

In [24]:
def createProjection(mapF):
    def projectionF(generator):
        for item in generator:
            yield mapF(item)
    return projectionF

dataSequence = [0, 1, 2, 3, 4]
myProjection = createProjection(lambda x: x * x)
result = myProjection(dataSequence)
print(list(result))

[0, 1, 4, 9, 16]


In [26]:
from functools import partial

def createProjection(mapF):
    def projectionF(generator):
        for item in generator:
            yield mapF(item)
    return projectionF


#========================================
# This version may be easier to follow
def createProjectionAll(mapF, generator):
    for item in generator:
        yield mapF(item)

def createProjectionSpec(mapF):
    return partial(createProjectionAll, mapF)
#========================================


def createProjectionEx(mapF): # same as createProjection
    def projectionF(generator):
        return (mapF(item) for item in generator) # see https://docs.python.org/3/howto/functional.html / Generator expressions and list comprehensions
    return projectionF

def createProjection2(names):
    def projectionF(generator):
        for item in generator:
            result = {}
            for name in names:
                result[name] = item[name]
            yield result
    return projectionF

studentsData = list(tuple2dictionarySeq(namesStudents, dataStudents))
displayData(studentsData)

myProjection = createProjection2(['studentID', 'FullName'])
result = myProjection(studentsData)
displayData(result)

Unnamed: 0,studentID,FullName,sat_score,programOfStudy,schoolEmailAdr,rcd_Created,rcd_Updated
0,1,Monique Davis,400,Literature,Monique@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
1,2,Teri Gutierrez,800,Programming,Teri@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
2,3,Spencer Pautier,1000,Programming,Spencer@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
3,4,Louis Ramsey,1200,Programming,Louis@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
4,5,Alvin Greene,1200,Programming,Alvin@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
5,6,Sophie Freeman,1200,Programming,Sophie@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
6,7,"Edgar Frank ""Ted"" ""Codd""",2400,Computer Science,Edgar@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56
7,8,Donald D. Chamberlin,2400,Computer Science,Donald@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56
8,9,Raymond F. Boyce,2400,Computer Science,Raymond@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56


Unnamed: 0,studentID,FullName
0,1,Monique Davis
1,2,Teri Gutierrez
2,3,Spencer Pautier
3,4,Louis Ramsey
4,5,Alvin Greene
5,6,Sophie Freeman
6,7,"Edgar Frank ""Ted"" ""Codd"""
7,8,Donald D. Chamberlin
8,9,Raymond F. Boyce


In [None]:
namedStudents = tuple2dictionarySeq(namesStudents, dataStudents)

#someStudentsColumns = createProjection2(['studentID', 'FullName'])
someStudentsColumns = createProjection(lambda item: {'studentID': item['studentID'], 'FullName': item['FullName'], 'demo': item['FullName'] + item['FullName'] })
someColumnsResult = someStudentsColumns(namedStudents)
displayData(someColumnsResult)

#namedStudents = tuple2dictionarySeq(namesStudents, dataStudents)
#someStudentsColumns2 = createProjection2(['studentID', 'FullName', 'programOfStudy'])
#someColumnsResult2 = someStudentsColumns2(namedStudents)
#displayData(someColumnsResult2)

Unnamed: 0,studentID,FullName,demo
0,1,Monique Davis,Monique DavisMonique Davis
1,2,Teri Gutierrez,Teri GutierrezTeri Gutierrez
2,3,Spencer Pautier,Spencer PautierSpencer Pautier
3,4,Louis Ramsey,Louis RamseyLouis Ramsey
4,5,Alvin Greene,Alvin GreeneAlvin Greene
5,6,Sophie Freeman,Sophie FreemanSophie Freeman
6,7,"Edgar Frank ""Ted"" ""Codd""","Edgar Frank ""Ted"" ""Codd""Edgar Frank ""Ted"" ""Codd"""
7,8,Donald D. Chamberlin,Donald D. ChamberlinDonald D. Chamberlin
8,9,Raymond F. Boyce,Raymond F. BoyceRaymond F. Boyce


### Renaming using Python

In [None]:
def createRename(names):
    def renameF(generator):
        for item in generator:
            result = {**item}
            for old, new in names:
                del result[old]
                result[new] = item[old]
            yield result
    return renameF

namedStudents = tuple2dictionarySeq(namesStudents, dataStudents)
renamedColumnsStudents = createRename([('studentID', 'id'), ('FullName', 'name')])
renamedColumnsResults = renamedColumnsStudents(namedStudents)
displayData(renamedColumnsResults)



Unnamed: 0,sat_score,programOfStudy,schoolEmailAdr,rcd_Created,rcd_Updated,id,name
0,400,Literature,Monique@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56,1,Monique Davis
1,800,Programming,Teri@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56,2,Teri Gutierrez
2,1000,Programming,Spencer@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56,3,Spencer Pautier
3,1200,Programming,Louis@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56,4,Louis Ramsey
4,1200,Programming,Alvin@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56,5,Alvin Greene
5,1200,Programming,Sophie@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56,6,Sophie Freeman
6,2400,Computer Science,Edgar@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56,7,"Edgar Frank ""Ted"" ""Codd"""
7,2400,Computer Science,Donald@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56,8,Donald D. Chamberlin
8,2400,Computer Science,Raymond@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56,9,Raymond F. Boyce
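Because each operator takes a sequence of rows and returns a sequence of rows, the operators can be chained. The sketch below composes selection, projection, and renaming, which roughly corresponds to the SQL query `select studentID as id, FullName as name from student where sat_score >= 1000`. The `compose` helper and the three-row sample are assumptions made for this example; the operator factories are simplified re-implementations so that the block is self-contained.

```python
def createSelect(queryF):
    def selectF(rows):
        return (row for row in rows if queryF(row))
    return selectF

def createProjection(names):
    def projectionF(rows):
        return ({name: row[name] for name in names} for row in rows)
    return projectionF

def createRename(names):
    def renameF(rows):
        for row in rows:
            result = dict(row)
            for old, new in names:
                result[new] = result.pop(old)
            yield result
    return renameF

def compose(*operators):
    """Hypothetical helper: apply the operators left to right to a sequence of rows."""
    def composed(rows):
        for operator in operators:
            rows = operator(rows)
        return rows
    return composed

students = [
    {'studentID': 1, 'FullName': 'Monique Davis', 'sat_score': 400},
    {'studentID': 3, 'FullName': 'Spencer Pautier', 'sat_score': 1000},
    {'studentID': 9, 'FullName': 'Raymond F. Boyce', 'sat_score': 2400},
]

query = compose(
    createSelect(lambda row: row['sat_score'] >= 1000),
    createProjection(['studentID', 'FullName']),
    createRename([('studentID', 'id'), ('FullName', 'name')]),
)

result = list(query(students))
print(result)
```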


### Cartesian product using Python



In [27]:
from itertools import product
#https://docs.python.org/3/library/itertools.html#itertools.product
def createCartesian(joinF):
    def cartesian(firstG, secondG):
        return map(lambda item: {**item[0], **item[1]}, filter(joinF, product(firstG, secondG)))
    return cartesian

def createCartesianEx(joinF): # same as createCartesian
    def cartesian(firstG, secondG):
        result = ({**f, **g} for f in firstG for g in secondG if joinF((f, g)))
        return result
    return cartesian

def createCartesianReadable(joinF): # same as createCartesian
    def cartesian(firstG, secondG):
        for f in firstG:
            for g in secondG:
                if (joinF((f, g))):
                    yield {**f, **g}
    return cartesian

namedStudents = list(tuple2dictionarySeq(namesStudents, dataStudents))
namedContacts = list(tuple2dictionarySeq(namesContacts, dataContactInfo))

joinF = lambda item: item[0]['studentID'] == item[1]['studentID']
cartesian = createCartesianReadable(joinF)

cartesianResult = cartesian(namedStudents, namedContacts)
displayData(cartesianResult)

Unnamed: 0,studentID,FullName,sat_score,programOfStudy,schoolEmailAdr,rcd_Created,rcd_Updated,studentEmailAddr,student-phone-cell,student-US-zipcode
0,1,Monique Davis,400,Literature,Monique@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56,Monique.Davis@freeCodeCamp.org,555-555-5551,97111
1,2,Teri Gutierrez,800,Programming,Teri@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56,Teri.Gutierrez@freeCodeCamp.org,555-555-5552,97112
2,3,Spencer Pautier,1000,Programming,Spencer@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56,Spencer.Pautier@freeCodeCamp.org,555-555-5553,97113
3,4,Louis Ramsey,1200,Programming,Louis@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56,Louis.Ramsey@freeCodeCamp.org,555-555-5554,0
4,5,Alvin Greene,1200,Programming,Alvin@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56,Alvin.Green@freeCodeCamp.org,555-555-5555,97115
5,6,Sophie Freeman,1200,Programming,Sophie@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56,Sophie.Freeman@freeCodeCamp.org,555-555-5556,97116
6,7,"Edgar Frank ""Ted"" ""Codd""",2400,Computer Science,Edgar@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56,Maximo.Smith@freeCodeCamp.org,555-555-5557,97117
7,8,Donald D. Chamberlin,2400,Computer Science,Donald@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56,Michael.Roach@freeCodeCamp.ort,555-555-5558,97118
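One detail of the nested-loop variant is worth checking: the second input is iterated once for every row of the first input, so it must be re-iterable. This is presumably why the cell above converts both inputs with `list()`. The sketch below uses made-up two-row inputs and a reduced set of columns to show what happens when the second input is a generator.

```python
def createCartesianReadable(joinF):
    def cartesian(firstG, secondG):
        for f in firstG:
            for g in secondG:  # iterated again for every row of firstG
                if joinF((f, g)):
                    yield {**f, **g}
    return cartesian

students = [{'studentID': 1, 'FullName': 'Monique Davis'},
            {'studentID': 2, 'FullName': 'Teri Gutierrez'}]
contacts = [{'studentID': 1, 'student-phone-cell': '555-555-5551'},
            {'studentID': 2, 'student-phone-cell': '555-555-5552'}]

joinF = lambda item: item[0]['studentID'] == item[1]['studentID']
cartesian = createCartesianReadable(joinF)

with_lists = list(cartesian(students, contacts))
with_generator = list(cartesian(students, (contact for contact in contacts)))

print('lists:    ', len(with_lists), 'joined rows')
print('generator:', len(with_generator), 'joined rows')  # the generator is exhausted after the first student
```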


## MySQL

In the MySQL environment (see the first stack), use phpMyAdmin to run the following SQL script, taken from [here](https://github.com/SteveChevalier/Distilling-Data/blob/master/schema_data_01_Student%20Schema%20and%20Data.sql):
```sql
-- ---------------------------------------------------
-- Part I - Create and Load Student Schema
-- ---------------------------------------------------
-- Create Schema (database) and set as default
CREATE DATABASE IF NOT EXISTS `student_examples`;
USE `student_examples`;

-- create student and student contact tables
DROP TABLE IF EXISTS `student`; 
CREATE TABLE `student` (
  `studentID` int(11) NOT NULL AUTO_INCREMENT,
  `FullName` text,
  `sat_score` int(11) DEFAULT NULL,
  `programOfStudy` text,
  `schoolEmailAdr` text,
  `rcd_Created` datetime DEFAULT CURRENT_TIMESTAMP,
  `rcd_Updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`studentID`));
  
DROP TABLE IF EXISTS `student-contact-info`;
CREATE TABLE `student-contact-info` (
  `studentID` int(11) DEFAULT NULL,
  `studentEmailAddr` text,
  `student-phone-cell` text,
  `student-US-zipcode` int(11) DEFAULT NULL);

-- Load data
INSERT INTO `student` 
	VALUES (1,'Monique Davis',400,'Literature','Monique@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
		(2,'Teri Gutierrez',800,'Programming','Teri@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
        (3,'Spencer Pautier',1000,'Programming','Spencer@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
        (4,'Louis Ramsey',1200,'Programming','Louis@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
        (5,'Alvin Greene',1200,'Programming','Alvin@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
        (6,'Sophie Freeman',1200,'Programming','Sophie@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
        (7,'Edgar Frank \"Ted\"\" Codd\"',2400,'Computer Science','Edgar@someOtherSchool.edu','2017-08-16 15:35:33','2017-09-02 19:33:56'),
        (8,'Donald D. Chamberlin',2400,'Computer Science','Donald@someOtherSchool.edu','2017-08-16 15:35:33','2017-09-02 19:33:56'),
        (9,'Raymond F. Boyce',2400,'Computer Science','Raymond@someOtherSchool.edu','2017-08-16 15:35:33','2017-09-02 19:33:56');

INSERT INTO `student-contact-info` 
	VALUES (1,'Monique.Davis@freeCodeCamp.org','555-555-5551',97111),
    (2,'Teri.Gutierrez@freeCodeCamp.org','555-555-5552',97112),
    (3,'Spencer.Pautier@freeCodeCamp.org','555-555-5553',97113),
    (4,'Louis.Ramsey@freeCodeCamp.org','555-555-5554',0),
    (5,'Alvin.Green@freeCodeCamp.org','555-555-5555',97115),
    (6,'Sophie.Freeman@freeCodeCamp.org','555-555-5556',97116),
    (7,'Maximo.Smith@freeCodeCamp.org','555-555-5557',97117),
    (8,'Michael.Roach@freeCodeCamp.ort','555-555-5558',97118);


-- end Part I schema create and data import
```

### Selection using SQL
```sql
select * from `student` where `student`.`sat_score` >= 1000
```

### Projection using SQL
```sql
select `studentID`, `FullName` from `student`
```

### Renaming using SQL
```sql
select `student`.`sat_score`, `student`.`programOfStudy`,
  `student`.`schoolEmailAdr`, `student`.`rcd_Created`, 
  `student`.`rcd_Updated`, `student`.`studentID` as `id`,
  `student`.`FullName` as `name` from `student`
```


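### Cartesian product using SQL
```sql
select * from `student`
  inner join `student-contact-info`
  on `student`.`studentID` = `student-contact-info`.`studentID`
```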
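The queries in this section can also be tried without a MySQL server. The sketch below uses Python's built-in `sqlite3` module as a stand-in, so it is not the MySQL setup described above: the tables are reduced to a few columns and rows, and the contact table is named `student_contact_info` (an assumption, made to avoid quoting the hyphenated name).

```python
import sqlite3

connection = sqlite3.connect(':memory:')
connection.execute(
    'CREATE TABLE student (studentID INTEGER PRIMARY KEY, FullName TEXT, sat_score INTEGER)')
connection.execute(
    'CREATE TABLE student_contact_info (studentID INTEGER, studentEmailAddr TEXT)')

connection.executemany('INSERT INTO student VALUES (?, ?, ?)', [
    (1, 'Monique Davis', 400),
    (3, 'Spencer Pautier', 1000),
    (9, 'Raymond F. Boyce', 2400),
])
connection.executemany('INSERT INTO student_contact_info VALUES (?, ?)', [
    (1, 'Monique.Davis@freeCodeCamp.org'),
    (3, 'Spencer.Pautier@freeCodeCamp.org'),
])

# Selection: keep only the rows that satisfy a condition.
selection = connection.execute(
    'SELECT * FROM student WHERE sat_score >= 1000 ORDER BY studentID').fetchall()

# Projection: keep only some columns.
projection = connection.execute(
    'SELECT studentID, FullName FROM student ORDER BY studentID').fetchall()

# Renaming: give columns new names with AS.
cursor = connection.execute('SELECT studentID AS id, FullName AS name FROM student')
renamed_columns = [column[0] for column in cursor.description]

# Join: pair rows of both tables that share the same studentID.
joined = connection.execute(
    'SELECT student.FullName, student_contact_info.studentEmailAddr '
    'FROM student JOIN student_contact_info '
    'ON student.studentID = student_contact_info.studentID '
    'ORDER BY student.studentID').fetchall()

print(selection)
print(projection)
print(renamed_columns)
print(joined)
```

Student 9 has no contact row, so the inner join leaves that student out, just as the Python join earlier returns only the students that have contact information.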

## The role of the API

### Fast API

> **Recommended videos**
>
> [FastAPI Introduction - Build Your First Web App - Python Tutorial (12 minutes)](https://www.youtube.com/watch?v=0RS9W8MtZe4)
>
> [Let's Build a Fast, Modern Python API with FastAPI (1.5 h)](https://www.youtube.com/watch?v=sBVb4IB3O_U)

FastAPI has one major advantage over comparable systems and frameworks: it automatically publishes a description of the API as a **[Swagger](https://swagger.io/)** document.
Thanks to Swagger (or OpenAPI), [a wide range of tools](https://swagger.io/tools/swagger-codegen/) can be used to generate clients for the API you are building.

https://fastapi.tiangolo.com/tutorial/sql-databases/

#### SQL Alchemy

https://github.com/LeeBergstrand/Jupyter-SQLAlchemy-Tutorial/blob/master/Jupyter-SQLAlchemy.ipynb

In [41]:
#https://docs.sqlalchemy.org/en/13/orm/tutorial.html
#https://docs.sqlalchemy.org/en/14/orm/basic_relationships.html
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, String, BigInteger, Sequence, Table, ForeignKey, DateTime
from sqlalchemy.orm import relationship

BaseModel = declarative_base()

#### Models

In [42]:
unitedSequence = Sequence('all_id_seq')

class UserModel(BaseModel):
    __tablename__ = 'users'

    #id = Column(BigInteger, Sequence('users_id_seq'), primary_key=True)
    id = Column(BigInteger, unitedSequence, primary_key=True)
    name = Column(String)

    def __init__(self, name):
        self.name = name
        
class UserGroupModel(BaseModel):
    __tablename__ = 'usergroups'

    id = Column(BigInteger, unitedSequence, primary_key=True)
    user_id = Column(BigInteger, ForeignKey('users.id'), index=True)
    group_id = Column(BigInteger, ForeignKey('groups.id'), index=True)
    
    #user = relationship('UserModel', uselist=False, back_populates='groups', primaryjoin=user_id==UserModel.id)
    group = relationship('GroupModel', uselist=False, back_populates='users')#, primaryjoin=authorization_id==AuthorizationModel.id)
    

class GroupModel(BaseModel):
    __tablename__ = 'groups'
    
    id = Column(BigInteger, unitedSequence, primary_key=True)
    name = Column(String)
    
    users = relationship('UserGroupModel', back_populates='group', lazy='dynamic', primaryjoin=id==UserGroupModel.group_id)
        

#### Schemas

In [39]:
from typing import List, Optional

from pydantic import BaseModel as BaseSchema

class UserCreateSchema(BaseSchema):
    name: str
        
class UserIdSchema(UserCreateSchema):
    id: int

class UserGetSchema(BaseSchema):
    id: int
    name: str
    class Config:
        orm_mode = True #ensures appropriate translation from SQLAlchemy 
    pass

class UserPutSchema(BaseSchema):
    id: int
    name: str


#### Engine Init

In [None]:
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
#engine = create_engine('sqlite:///:memory:', echo=True)
#engine = create_engine('postgresql+psycopg2://user:password@hostname/database_name')
engine = create_engine('postgresql+psycopg2://postgres:example@postgres/jupyterII') 
Session = sessionmaker(bind=engine)
session = Session()
BaseModel.metadata.drop_all(engine)
BaseModel.metadata.create_all(engine)

#### CRUD Ops

In [44]:
def crudUserGet(db: Session, id: int):
    return db.query(UserModel).filter(UserModel.id==id).first()

def crudUserGetAll(db: Session, skip: int = 0, limit: int = 100):
    return db.query(UserModel).offset(skip).limit(limit).all()

def crudUserCreate(db: Session, user: UserCreateSchema):
    userRow = UserModel(name=user.name)
    db.add(userRow)
    db.commit()
    db.refresh(userRow)
    return userRow

def crudUserUpdate(db: Session, user):
    userToUpdate = db.query(UserModel).filter(UserModel.id==user.id).first()
    userToUpdate.name = user.name if user.name else userToUpdate.name
    db.commit()
    db.refresh(userToUpdate)
    return userToUpdate


#### Test

In [None]:
import random
import string

def get_random_string(length):
    letters = string.ascii_lowercase
    result = ''.join(random.choice(letters) for i in range(length))
    return result 

def PopulateUsers(count=10):
    for i in range(count):
        crudUserCreate(db=session, user=UserCreateSchema(name='user_' + get_random_string(20)))
        
PopulateUsers(10)

In [None]:
usersData = list(crudUserGetAll(db=session))
for index, userRow in enumerate(usersData):
    row = crudUserGet(db=session, id=userRow.id)
    print(index, '\t', row.id, row.name)

#### Server

In [28]:
!pip install uvicorn
!pip install fastapi
!pip install wait4it



#### Minimal Code

In [34]:
import uvicorn
from fastapi import FastAPI

app = FastAPI()#root_path='/api')

def run():
    uvicorn.run(app, port=9993, host='0.0.0.0', root_path='')

#### Helper Func for Notebook

In [35]:
# Code in this cell is just for (re)starting the API on a Process, and other compatibility stuff with Jupyter cells.
# Just ignore it!

from multiprocessing import Process
from wait4it import wait_for

_api_process = None

def start_api(runNew=True):
    """Stop the API if running; Start the API; Wait until API (port) is available (reachable)"""
    global _api_process
    if _api_process:
        _api_process.terminate()
        _api_process.join()
    
    if runNew:
        _api_process = Process(target=run, daemon=True)
        _api_process.start()
        wait_for(port=9993)

def delete_route(method: str, path: str):
    """Delete the given route from the API. This must be called on cells that re-define a route"""
    # iterate over a copy: removing from a list while iterating over it skips items
    for route in list(app.routes):
        if route.path == path and method in (getattr(route, 'methods', None) or ()):
            app.routes.remove(route)
    

In [31]:
def delete_all_routes():
    rr = [*app.routes]
    for item in rr:
        app.routes.remove(item)

#### First API Endpoint

In [36]:
@app.get("/api")
def get_root():
    return {"Hello": "World"}

start_api()

INFO:     Started server process [669]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:9993 (Press CTRL+C to quit)


INFO:     172.17.0.1:47700 - "GET / HTTP/1.1" 404 Not Found
INFO:     172.17.0.1:47898 - "GET /api HTTP/1.1" 200 OK


INFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [669]


In [38]:
# Call the root endpoint of the running API (uvicorn listens on port 9993)
import requests

r = requests.get("http://localhost:9993/api")
print("Status code:", r.status_code)
print("Response:", r.json())

Status code: 200
Response: {'Hello': 'World'}


In [37]:
start_api(False)

#### Database CRUD Endpoint

In [40]:
#delete_all_routes()

@app.get("/users/{id}", response_model=UserGetSchema)
#@app.get("/users/{id}")
def userGet(id: int):
    #result = crudUserGet(db=session, id=id)
    result = {'id': id, 'name': 'Hrbolek', 'password': 'extraultrahesozahesovane'}
    return result

@app.get("/users", response_model=List[UserGetSchema])
def userGetAll(skip: Optional[int]=0, limit: Optional[int]=100):
    #result = crudUserGetAll(db=session, skip=skip, limit=limit)
    #return result
    pass

@app.post("/users")#, response_model=UserIdSchema)
def userPost(user: UserCreateSchema):
    #print('userPut')
    #return crudUserCreate(db=session, user=user)
    pass

@app.put("/users", response_model=UserGetSchema)
def userPut(user: UserPutSchema):
    #result = crudUserUpdate(db=session, user=user)
    #return result
    pass

start_api()

INFO:     Started server process [691]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:9993 (Press CTRL+C to quit)


INFO:     172.17.0.1:49872 - "GET /users/45 HTTP/1.1" 200 OK
INFO:     172.17.0.1:50234 - "GET /users/56 HTTP/1.1" 200 OK
INFO:     172.17.0.1:51720 - "GET /docs HTTP/1.1" 200 OK
INFO:     172.17.0.1:51720 - "GET /openapi.json HTTP/1.1" 200 OK
INFO:     172.17.0.1:52160 - "GET /users/78 HTTP/1.1" 200 OK
INFO:     172.17.0.1:52346 - "GET /openapi.json HTTP/1.1" 200 OK


INFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [691]


In [41]:
start_api(False)

## Database Clusters

To build high-performance databases, individual servers are combined into clusters. Database systems support this clustering in different ways.

In connection with clusters you may encounter the following terms:
- Replication
- Connection pooling
- Load balancing
- Queries across multiple servers (query partitioning)
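
To make the term connection pooling more concrete, here is a minimal sketch, assuming a tiny hand-written pool over the standard-library `sqlite3` module. The class `SimplePool` and its `connection` method are hypothetical names used only for illustration; in practice one would normally rely on the pooling built into the driver, the ORM, or a dedicated pooler.

```python
import queue
import sqlite3
from contextlib import contextmanager


class SimplePool:
    """Hypothetical, illustrative pool: keeps a fixed number of open connections."""

    def __init__(self, database, size=2):
        self._free = queue.Queue()
        for _ in range(size):
            # connections are opened once and then reused
            self._free.put(sqlite3.connect(database))

    @contextmanager
    def connection(self):
        conn = self._free.get()      # blocks when all connections are in use
        try:
            yield conn
        finally:
            self._free.put(conn)     # hand the connection back instead of closing it


pool = SimplePool(':memory:', size=2)

with pool.connection() as first:
    first_id = id(first)
    value = first.execute('SELECT 1 + 1').fetchone()[0]

# both pooled connections are borrowed here; one of them is the one used above
with pool.connection() as second, pool.connection() as third:
    ids = {id(second), id(third)}

print(value, first_id in ids)
```

The point of the design is that opening a connection is paid for once, while each request only borrows and returns an already open connection.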

PostgreSQL
- https://www.postgresql.org/docs/9.5/creating-cluster.html
- https://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling

MySQL
- https://www.digitalocean.com/community/tutorials/how-to-create-a-multi-node-mysql-cluster-on-ubuntu-18-04
- https://dev.mysql.com/doc/refman/8.0/en/mysql-cluster.html

MSSQL
- https://docs.microsoft.com/en-us/sql/sql-server/failover-clusters/install/create-a-new-sql-server-failover-cluster-setup?view=sql-server-ver15


## LINQ

LINQ (Language Integrated Query) is a query language integrated into .NET. It makes it possible to apply relational algebra to data sources directly within the programming language.

C#:

https://docs.microsoft.com/cs-cz/dotnet/csharp/programming-guide/concepts/linq/

Visual Basic:

https://docs.microsoft.com/cs-cz/dotnet/visual-basic/programming-guide/language-features/linq/introduction-to-linq

It is very often used in .NET Core for database access.
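
LINQ itself is specific to .NET, but the idea of applying relational algebra to in-memory data can be sketched in Python, the language used in the rest of this notebook. The following is only an analogy, assuming two hypothetical lists of dictionaries (`users`, `groups`) that stand in for tables; it shows selection, projection, and a join written as comprehensions.

```python
# hypothetical sample data, standing in for two tables
users = [
    {'id': 1, 'name': 'Alice', 'group_id': 10},
    {'id': 2, 'name': 'Bob', 'group_id': 20},
    {'id': 3, 'name': 'Carol', 'group_id': 10},
]
groups = [
    {'id': 10, 'name': 'admins'},
    {'id': 20, 'name': 'guests'},
]

# selection (WHERE): keep only rows that satisfy a predicate
admins = [u for u in users if u['group_id'] == 10]

# projection (SELECT name): keep only some columns
names = [u['name'] for u in admins]

# join (users JOIN groups ON users.group_id = groups.id)
joined = [
    {'user': u['name'], 'group': g['name']}
    for u in users
    for g in groups
    if u['group_id'] == g['id']
]

print(names)
print(joined)
```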

## Normal Forms

> **Recommended video**
>
> https://www.youtube.com/watch?v=7B9FnIIIsQc
> 
> https://www.youtube.com/watch?v=xoTyrdT9SZI

# NoSQL Databases

MongoDB and CouchDB are databases that work with documents / data structures (e.g. JSON).

https://www.freelancinggig.com/blog/2018/04/19/couchdb-vs-mongodb-understanding-difference/

Special cases are databases such as Neo4j (a graph database) or Redis (an in-memory key-value store).

> **Required reading**
>
> https://en.wikipedia.org/wiki/Graph_database

> **Optional reading**
>
> https://neo4j.com/download-center/?ref=web-product-database/#community
>
> https://redis.io/

## An Aside on Asynchronous Programming

> **Recommended video**
>
> [Raymond Hettinger, Keynote on Concurrency, PyBay 2017 1h 14min](https://www.youtube.com/watch?v=9zinZmE3Ogk)

Inspired by / adapted from https://pybay.com/site_media/slides/raymond2017-keynote/threading.html

In [42]:
counter = 0

def worker():
    global counter
    oldValue = counter
    counter = oldValue + 1
    
    
for i in range(10):
    worker()
    
print('final value is', counter)

final value is 10


In [43]:
import threading

counter = 0

def worker():
    global counter
    oldValue = counter
    counter = oldValue + 1
    
    
for i in range(10):
    threading.Thread(target=worker).start()
    
print('final value is', counter)

final value is 10


In [45]:
import threading
import time
import random

def fuzzIt():
    time.sleep(random.randint(1, 5))

counter = 0

def worker():
    global counter
    fuzzIt()
    oldValue = counter
    fuzzIt()
    counter = oldValue + 1
    fuzzIt()
    
    
for i in range(10):
    threading.Thread(target=worker).start()
    
print('final value is', counter)

In [47]:
print('final value is', counter)

final value is 3
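
The lost updates above come from the unprotected read-modify-write on `counter`. One common remedy is to guard that critical section with a `threading.Lock` and to `join` the threads before reading the result. The following is a minimal sketch under those assumptions; the name `counterLock` is introduced here, and the sleeps are much shorter than in the cell above so that the example finishes quickly.

```python
import threading
import time
import random

counter = 0
counterLock = threading.Lock()   # protects the read-modify-write on counter

def fuzzIt():
    # a short random pause, only to provoke thread switches
    time.sleep(random.random() / 100)

def worker():
    global counter
    fuzzIt()
    with counterLock:            # only one thread at a time may enter
        oldValue = counter
        fuzzIt()
        counter = oldValue + 1
    fuzzIt()

threads = [threading.Thread(target=worker) for _ in range(10)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()                # wait for all workers before reading counter

print('final value is', counter)
```

With the lock in place every increment is applied, so the final value equals the number of workers regardless of how the threads are scheduled.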


In [52]:
import asyncio
import time

def mS(start=0):
    return time.time() - start

async def execute():
    await asyncio.sleep(1)
    return 2

result = execute()
print(result)

start = mS()
awaitedResult = await result
end = mS(start)
print(awaitedResult)
print('elapsed', end)

<coroutine object execute at 0x7fa7bba52f40>
2
elapsed 1.0011518001556396


In [56]:
def fuzzIt():
    time.sleep(random.randint(1, 5))


counter = 0
async def execute2():
    fuzzIt()
    await asyncio.sleep(1)
    fuzzIt()
    global counter
    fuzzIt()
    oldValue = counter
    fuzzIt()
    counter = oldValue + 1
    
    
tasks = []
for i in range(10):
    tasks.append(execute2())
    
start = mS()
results = await asyncio.gather(*tasks)
end = mS(start)
print('final value', counter)
print('elapsed', end)

final value 10
elapsed 136.11234211921692
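
The long elapsed time above is a consequence of calling the blocking `time.sleep` inside a coroutine: while one task sleeps, the event loop cannot run any other task. A minimal sketch of the non-blocking variant follows, assuming the waits are replaced by `await asyncio.sleep`; the names `fuzzItAsync`, `execute3`, and `main` are made up for this illustration. It is written as a plain script with `asyncio.run`; in a notebook cell one would `await main()` instead.

```python
import asyncio
import random
import time

counter = 0

async def fuzzItAsync():
    # non-blocking wait: control returns to the event loop while sleeping
    await asyncio.sleep(random.random() / 10)

async def execute3():
    global counter
    await fuzzItAsync()
    # there is no await between the read and the write,
    # so no other task can interleave at this point
    oldValue = counter
    counter = oldValue + 1

async def main():
    start = time.time()
    await asyncio.gather(*(execute3() for _ in range(10)))
    return time.time() - start

elapsed = asyncio.run(main())
print('final value', counter)
print('elapsed', elapsed)
```

Because the tasks wait concurrently, the total time is close to the longest single wait rather than the sum of all waits.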


## MongoDB

https://motor.readthedocs.io/en/stable/tutorial-asyncio.html

> **Recommended video**
>
> [MongoDB with Python Crash Course - Tutorial for Beginners 2h](https://www.youtube.com/watch?v=E-1xI85Zog8)

In [3]:
!pip install motor

Collecting motor
  Downloading motor-2.3.1-py3-none-any.whl (53 kB)
Collecting pymongo<4,>=3.11
  Downloading pymongo-3.11.3-cp38-cp38-manylinux2014_x86_64.whl (531 kB)
Installing collected packages: pymongo, motor
Successfully installed motor-2.3.1 pymongo-3.11.3


In [18]:
import getpass
mongoPassword = getpass.getpass()

 ·········


In [4]:
import motor.motor_asyncio

In [5]:
client = motor.motor_asyncio.AsyncIOMotorClient('mongo', 27017)

In [6]:
print(client)

AsyncIOMotorClient(MongoClient(host=['mongo:27017'], document_class=dict, tz_aware=False, connect=False, driver=DriverInfo(name='Motor', version='2.3.1', platform='asyncio')))


In [13]:
import pandas as pd

def displayData(data):
    df = pd.DataFrame(data)
    display(df)

### Connection

In [20]:
from pymongo import MongoClient
client = MongoClient('mongodb://%s:%s@192.168.1.6:27017' % ('root', mongoPassword))
db = client.admin
serverStatusResult = db.command("serverStatus")
displayData(serverStatusResult)

Unnamed: 0,host,version,process,pid,uptime,uptimeMillis,uptimeEstimate,localTime,asserts,connections,...,storageEngine,tcmalloc,trafficRecording,transactions,transportSecurity,twoPhaseCommitCoordinator,wiredTiger,mem,metrics,ok
regular,mongo,4.2.7,mongod,1,172051.0,172051422,172051,2021-04-08 08:33:51.495,0.0,,...,,,,,,,,,,1.0
warning,mongo,4.2.7,mongod,1,172051.0,172051422,172051,2021-04-08 08:33:51.495,0.0,,...,,,,,,,,,,1.0
msg,mongo,4.2.7,mongod,1,172051.0,172051422,172051,2021-04-08 08:33:51.495,0.0,,...,,,,,,,,,,1.0
user,mongo,4.2.7,mongod,1,172051.0,172051422,172051,2021-04-08 08:33:51.495,0.0,,...,,,,,,,,,,1.0
rollovers,mongo,4.2.7,mongod,1,172051.0,172051422,172051,2021-04-08 08:33:51.495,0.0,,...,,,,,,,,,,1.0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
operation,mongo,4.2.7,mongod,1,172051.0,172051422,172051,2021-04-08 08:33:51.495,,,...,,,,,,,,,"{'scanAndOrder': 0, 'writeConflicts': 0}",1.0
queryExecutor,mongo,4.2.7,mongod,1,172051.0,172051422,172051,2021-04-08 08:33:51.495,,,...,,,,,,,,,"{'scanned': 0, 'scannedObjects': 0}",1.0
record,mongo,4.2.7,mongod,1,172051.0,172051422,172051,2021-04-08 08:33:51.495,,,...,,,,,,,,,{'moves': 0},1.0
repl,mongo,4.2.7,mongod,1,172051.0,172051422,172051,2021-04-08 08:33:51.495,,,...,,,,,,,,,"{'executor': {'pool': {'inProgressCount': 0}, ...",1.0


### Connection Async

In [21]:
#uri = "mongodb://user:pass@localhost:27017/database_name"
uri = f"mongodb://root:{mongoPassword}@192.168.1.6:27017"
client = motor.motor_tornado.MotorClient(uri)
print(client)
dbs = list(await client.list_database_names())
print(dbs)

MotorClient(MongoClient(host=['192.168.1.6:27017'], document_class=dict, tz_aware=False, connect=False, driver=DriverInfo(name='Motor', version='2.3.1', platform='Tornado 6.1')))
['admin', 'calendar', 'config', 'internetArticles', 'local']


In [22]:
db = client['test_database']

In [23]:
result = await db.create_collection('test_collection')
print(result)

MotorCollection(Collection(Database(MongoClient(host=['192.168.1.6:27017'], document_class=dict, tz_aware=False, connect=False, driver=DriverInfo(name='Motor', version='2.3.1', platform='Tornado 6.1')), 'test_database'), 'test_collection'))


### Create

In [34]:
collection = db['test_collection']

async def do_insert():
    for i in range(10):
        document = {'key': 'value', 'i': i}
        result = await collection.insert_one(document)
        # InsertOneResult exposes the new document's _id as inserted_id (there is no .id)
        print('result %s' % repr(result.inserted_id))

await do_insert()

### Read

In [35]:
async def do_find_one():
    document = await db.test_collection.find_one({'i': {'$lt': 5}})
    print(document)
    
await do_find_one()

{'_id': ObjectId('606ec0a26b84c76e2a735461'), 'key': 'value', 'i': 0}


In [36]:
async def do_find():
    cursor = db.test_collection.find({'i': {'$lt': 5}}).sort('i')
    documents = await cursor.to_list(length=100)
    for document in documents:
        print(document)
        
await do_find()

{'_id': ObjectId('606ec0a26b84c76e2a735461'), 'key': 'value', 'i': 0}
{'_id': ObjectId('606ecb166b84c76e2a735462'), 'key': 'value', 'i': 0}
{'_id': ObjectId('606ecb4f6b84c76e2a735463'), 'key': 'value', 'i': 0}
{'_id': ObjectId('606ecb586b84c76e2a73546d'), 'key': 'value', 'i': 0}
{'_id': ObjectId('606ecb4f6b84c76e2a735464'), 'key': 'value', 'i': 1}
{'_id': ObjectId('606ecb4f6b84c76e2a735465'), 'key': 'value', 'i': 2}
{'_id': ObjectId('606ecb4f6b84c76e2a735466'), 'key': 'value', 'i': 3}
{'_id': ObjectId('606ecb4f6b84c76e2a735467'), 'key': 'value', 'i': 4}


In [37]:
async def do_findII():
    c = db.test_collection
    async for document in c.find({'i': {'$lt': 2}}):
        print(document)
        
await do_findII()

{'_id': ObjectId('606ec0a26b84c76e2a735461'), 'key': 'value', 'i': 0}
{'_id': ObjectId('606ecb166b84c76e2a735462'), 'key': 'value', 'i': 0}
{'_id': ObjectId('606ecb4f6b84c76e2a735463'), 'key': 'value', 'i': 0}
{'_id': ObjectId('606ecb4f6b84c76e2a735464'), 'key': 'value', 'i': 1}
{'_id': ObjectId('606ecb586b84c76e2a73546d'), 'key': 'value', 'i': 0}


In [38]:
async def do_findIII():
    cursor = db.test_collection.find({'i': {'$lt': 4}})
    # Modify the query before iterating
    cursor.sort('i', -1).skip(1).limit(2)
    async for document in cursor:
        print(document)
        
await do_findIII()

{'_id': ObjectId('606ecb4f6b84c76e2a735465'), 'key': 'value', 'i': 2}
{'_id': ObjectId('606ecb4f6b84c76e2a735464'), 'key': 'value', 'i': 1}


### Counting

In [39]:
async def do_count():
    n = await db.test_collection.count_documents({})
    print('%s documents in collection' % n)
    n = await db.test_collection.count_documents({'i': {'$gt': 1000}})
    print('%s documents where i > 1000' % n)
    
await do_count()

15 documents in collection
0 documents where i > 1000


### Update

In [41]:
async def do_replace():
    coll = db.test_collection
    old_document = await coll.find_one({'i': 9})
    print('found document: %s' % old_document)
    _id = old_document['_id']
    result = await coll.replace_one({'_id': _id}, {'key': 'newValue'})
    print('replaced %s document' % result.modified_count)
    new_document = await coll.find_one({'_id': _id})
    print('document is now %s' % new_document)
    
await do_replace()

found document: {'_id': ObjectId('606ecb4f6b84c76e2a73546c'), 'key': 'value', 'i': 9}
replaced 1 document
document is now {'_id': ObjectId('606ecb4f6b84c76e2a73546c'), 'key': 'newValue'}


In [43]:
async def do_update():
    coll = db.test_collection
    result = await coll.update_one({'i': 8}, {'$set': {'key': 'replacedValue'}})
    print('updated %s document' % result.modified_count)
    new_document = await coll.find_one({'i': 8})
    print('document is now %s' % new_document)
    
await do_update()

updated 1 document
document is now {'_id': ObjectId('606ecb4f6b84c76e2a73546b'), 'key': 'replacedValue', 'i': 8}


## Map / Reduce

Map: mapping, i.e. applying a function to every element of a data structure, is a process that can be parallelized.
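
Because each element is processed independently of the others, the work can be distributed over several workers. A minimal sketch, assuming the standard-library `concurrent.futures.ThreadPoolExecutor`; the function `slowSquare` is a hypothetical stand-in for any per-item operation.

```python
import time
from concurrent.futures import ThreadPoolExecutor

def slowSquare(number):
    time.sleep(0.1)              # stands in for slow, independent work on one item
    return number * number

data = range(8)

start = time.time()
sequential = list(map(slowSquare, data))
sequentialTime = time.time() - start

start = time.time()
with ThreadPoolExecutor(max_workers=8) as executor:
    # executor.map keeps the order of the inputs, just like the built-in map
    parallel = list(executor.map(slowSquare, data))
parallelTime = time.time() - start

print(sequential == parallel)
print('sequential', sequentialTime, 'parallel', parallelTime)
```

Threads suit work that mostly waits (I/O); for CPU-bound functions a process pool is the usual choice.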

In [47]:
import random
import string

def randomStr(length=10):
    return ''.join(random.choice(string.ascii_uppercase) for _ in range(length))

print(randomStr())

UKAKILZUNA


### Map

In [54]:
def intoDict(number):
    return {'id': number}

dataSequence = map(intoDict, range(10))
print(dataSequence)
mappedResult = list(dataSequence)
print(mappedResult)
displayData(mappedResult)

<map object at 0x7fbdfec81d30>
[{'id': 0}, {'id': 1}, {'id': 2}, {'id': 3}, {'id': 4}, {'id': 5}, {'id': 6}, {'id': 7}, {'id': 8}, {'id': 9}]


Unnamed: 0,id
0,0
1,1
2,2
3,3
4,4
5,5
6,6
7,7
8,8
9,9


In [55]:
def intoDict(number):
    return {'id': number}

def createName(item):
    return {**item, 'name': randomStr()}

dataSequence = map(intoDict, range(10))
mapped = map(createName, dataSequence)
print(mapped)
mappedResult = list(mapped)
print(mappedResult)
displayData(mappedResult)

<map object at 0x7fbdfec81940>
[{'id': 0, 'name': 'PPBNZSXFUC'}, {'id': 1, 'name': 'BUIPWPANKG'}, {'id': 2, 'name': 'URGJAUCNFP'}, {'id': 3, 'name': 'BWXHVSDVUA'}, {'id': 4, 'name': 'MCSWMPMUDM'}, {'id': 5, 'name': 'PAHOFSRKJE'}, {'id': 6, 'name': 'CTBDJRQLYE'}, {'id': 7, 'name': 'FYUTJCZIIE'}, {'id': 8, 'name': 'YDWVECAIMP'}, {'id': 9, 'name': 'OVDYUDUQVQ'}]


Unnamed: 0,id,name
0,0,PPBNZSXFUC
1,1,BUIPWPANKG
2,2,URGJAUCNFP
3,3,BWXHVSDVUA
4,4,MCSWMPMUDM
5,5,PAHOFSRKJE
6,6,CTBDJRQLYE
7,7,FYUTJCZIIE
8,8,YDWVECAIMP
9,9,OVDYUDUQVQ


In [62]:
def funcReduce(*funcList):
    def result(item):
        resultItem = item
        for func in funcList:
            resultItem = func(resultItem)
        return resultItem
    return result
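
The same composition can also be expressed with `functools.reduce`, which folds the list of functions over the input value. This is a sketch of an equivalent formulation, not a replacement; the name `funcReduceII` and the two small helper functions are introduced here only for illustration.

```python
from functools import reduce

def funcReduceII(*funcList):
    def result(item):
        # start with item and feed each intermediate value into the next function
        return reduce(lambda acc, func: func(acc), funcList, item)
    return result

def addOne(x):
    return x + 1

def double(x):
    return x * 2

# functions are applied left to right: first addOne, then double
addThenDouble = funcReduceII(addOne, double)
composed = list(map(addThenDouble, range(5)))
print(composed)
```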

In [63]:
def intoDict(number):
    return {'id': number}

def createName(item):
    return {**item, 'name': randomStr()}

allOps = funcReduce(intoDict, createName)
mapped = map(allOps, range(10))
print(mapped)
mappedResult = list(mapped)
print(mappedResult)
displayData(mappedResult)

<map object at 0x7fbdb77608b0>
[{'id': 0, 'name': 'JRAYARVUCW'}, {'id': 1, 'name': 'RILXEXAGEL'}, {'id': 2, 'name': 'AYVEZOUFBU'}, {'id': 3, 'name': 'BHNTPZFRIT'}, {'id': 4, 'name': 'FPHUTBCVKG'}, {'id': 5, 'name': 'EHYHNORDPH'}, {'id': 6, 'name': 'NEXAXNORXT'}, {'id': 7, 'name': 'CISXFKUGVO'}, {'id': 8, 'name': 'EUFYCOTYNU'}, {'id': 9, 'name': 'KCHUQFXCYP'}]


Unnamed: 0,id,name
0,0,JRAYARVUCW
1,1,RILXEXAGEL
2,2,AYVEZOUFBU
3,3,BHNTPZFRIT
4,4,FPHUTBCVKG
5,5,EHYHNORDPH
6,6,NEXAXNORXT
7,7,CISXFKUGVO
8,8,EUFYCOTYNU
9,9,KCHUQFXCYP


### Filter

In [56]:
def justSome(item):
    return item['name'] < 'C'

filteredData = filter(justSome, mappedResult)
print(filteredData)
filteredResult = list(filteredData)
print(filteredResult)
displayData(filteredResult)

<filter object at 0x7fbdfec81c70>
[{'id': 1, 'name': 'BUIPWPANKG'}, {'id': 3, 'name': 'BWXHVSDVUA'}]


Unnamed: 0,id,name
0,1,BUIPWPANKG
1,3,BWXHVSDVUA


### Reduce

In [60]:
from functools import reduce

def count(acc, item):
    return acc + 1

result = reduce(count, filteredResult, 0)
print(result)

2
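
`reduce` is not limited to counting: the accumulator can be any value. The sketch below works on hypothetical fixed rows instead of the random names above, so the result is reproducible. It chains `filter` and `reduce` to sum ids, and uses a dictionary accumulator to group ids by the first letter of the name; `sumIds` and `groupByInitial` are illustrative names.

```python
from functools import reduce

# hypothetical fixed rows, so that the result is reproducible
rows = [
    {'id': 0, 'name': 'ALPHA'},
    {'id': 1, 'name': 'BRAVO'},
    {'id': 2, 'name': 'AZURE'},
    {'id': 3, 'name': 'CHARLIE'},
]

def sumIds(acc, item):
    return acc + item['id']

def groupByInitial(acc, item):
    # build a new dict so the accumulator passed in is never mutated
    initial = item['name'][0]
    return {**acc, initial: acc.get(initial, []) + [item['id']]}

beforeC = filter(lambda item: item['name'] < 'C', rows)
total = reduce(sumIds, beforeC, 0)
groups = reduce(groupByInitial, rows, {})

print(total)
print(groups)
```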
