# SQL Databases

## Introduction

Database systems are used to store data.

Data can be divided by their nature into two basic categories:
- Homogeneous data
- Heterogeneous data

Homogeneous data are data that all share the same structure, unlike heterogeneous data, whose structure differs from item to item.

Homogeneous data can be compared to a table in Excel, which has a defined number of columns with the values stored in those columns.

Relational databases are suited to homogeneous data. If your data are not homogeneous and you want (or have) to use a relational database, the data must first be homogenized.
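
A minimal sketch of what homogenization can look like in Python. The sample records and the helper name `homogenize` are illustrative assumptions, not part of the course material: every record is extended to the same set of keys, and `None` stands in for the missing values.

```python
# Heterogeneous records: each dictionary has a different set of keys.
heterogeneous = [
    {'name': 'Alice', 'email': 'alice@example.com'},
    {'name': 'Bob', 'phone': '555-0100'},
    {'name': 'Carol'},
]

def homogenize(records):
    """Give every record the same keys; missing values become None."""
    # collect all keys, preserving the order of their first appearance
    columns = []
    for record in records:
        for key in record:
            if key not in columns:
                columns.append(key)
    return [{column: record.get(column) for column in columns} for record in records]

homogeneous = homogenize(heterogeneous)
for row in homogeneous:
    print(row)
```

After this step every row has the columns `name`, `email`, and `phone`, so the rows could be stored in a single table.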

**[SQL](https://www.w3schools.com/sql/)** (Structured Query Language) is a standardized query language that is implemented, to varying degrees, by database systems such as MySQL (and MariaDB), MS SQL, and PostgreSQL.

A database consists of tables, and each table represents homogeneous data. Relationships between tables are defined by foreign keys.

For relational database management systems (RDBMS), the SQL language is defined; it can be understood as a set of commands for administering the database and for working with the data.
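
As a small illustration of tables, a foreign key, and SQL commands, the following sketch uses Python's built-in `sqlite3` module with an in-memory database. SQLite and the table names `department` and `employee` are assumptions chosen for the example; the course itself works with MySQL and PostgreSQL, whose SQL dialects differ in details.

```python
import sqlite3

connection = sqlite3.connect(':memory:')        # throwaway in-memory database
connection.execute('PRAGMA foreign_keys = ON')  # SQLite enforces foreign keys only when asked

# data definition: two tables linked by a foreign key
connection.execute('CREATE TABLE department (id INTEGER PRIMARY KEY, name TEXT)')
connection.execute('''
    CREATE TABLE employee (
        id INTEGER PRIMARY KEY,
        name TEXT,
        department_id INTEGER REFERENCES department(id)
    )''')

# data manipulation: insert rows and query them
connection.execute("INSERT INTO department VALUES (1, 'Research')")
connection.execute("INSERT INTO employee VALUES (1, 'Alice', 1)")
rows = connection.execute('''
    SELECT employee.name, department.name
    FROM employee JOIN department ON employee.department_id = department.id
''').fetchall()
print(rows)

# a row pointing to a non-existent department violates the foreign key
try:
    connection.execute("INSERT INTO employee VALUES (2, 'Bob', 99)")
    violation = None
except sqlite3.IntegrityError as error:
    violation = str(error)
print(violation)
```

The first group of statements defines the structure, the second works with the data; the rejected insert shows what the foreign key is for.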

## A Few Thoughts on Data Flows
With large data volumes it is undesirable, and often not even possible, to process the data as a whole. Instead, a collection (```list```, ```array```, etc.) is processed element by element.

In [1]:
# you already know this approach
data = [0, 1, 2, 3] # input data
def oldFashioned(data, cislo): # function that takes a list of values and returns a list of values
    result = []
    for item in data:
        result.append(item + cislo)
    return result

prictenoOld = oldFashioned(data, 2)
print(data, '-->', prictenoOld)

[0, 1, 2, 3] --> [2, 3, 4, 5]


In [2]:
# you have (almost certainly) not been taught this approach
def newWay(data, cislo):
    for item in data:
        yield item + cislo

result = newWay(data, 2)
print(result)
prictenoNew = list(result)
print(data, '-->', prictenoNew)    

<generator object newWay at 0x7f729de7ddd0>
[0, 1, 2, 3] --> [2, 3, 4, 5]


In [3]:
# the same generator, now reporting when each partial result is produced
def newWay(data, cislo):
    for item in data:
        print("I am ready to send partial result")
        yield item + cislo
        print("I sent partial result")

result = newWay(data, 3)
print(result)


itemFromResult = next(result)
print(itemFromResult)
itemFromResult = next(result)
print(itemFromResult)
itemFromResult = next(result)
print(itemFromResult)
itemFromResult = next(result)
print(itemFromResult)

for item in result:
    print(item)



<generator object newWay at 0x7f729de7d820>
I am ready to send partial result
3
I sent partial result
I am ready to send partial result
4
I sent partial result
I am ready to send partial result
5
I sent partial result
I am ready to send partial result
6
I sent partial result


### Detailed Demonstration

In [4]:
def oldA(data, hodnota):
    result = []
    for item in data:
        print('old krat', item, hodnota)
        result.append(item * hodnota)
    return result

def oldB(data, hodnota):
    result = []
    for item in data:
        print('old plus', item, hodnota)
        result.append(item + hodnota)
    return result

def newA(data, hodnota):
    for item in data:
        print('new krat', item, hodnota)
        yield item * hodnota

def newB(data, hodnota):
    for item in data:
        print('new plus', item, hodnota)
        yield (item + hodnota)

inData = [0, 1, 2]
outDataOld = oldB(oldA(inData, 2), 3)
outDataNew = newB(newA(inData, 2), 3)
print('=' * 30)
print('=', 'Vysledky')
print('=' * 30)
print('outDataOld', outDataOld)
print('outDataNew', outDataNew)
#print('outDataNew', list(outDataNew))
for index, item in enumerate(outDataNew):
    print(index, item)

old krat 0 2
old krat 1 2
old krat 2 2
old plus 0 3
old plus 2 3
old plus 4 3
==============================
= Vysledky
==============================
outDataOld [3, 5, 7]
outDataNew <generator object newB at 0x7f729c5c0580>
new krat 0 2
new plus 0 3
0 3
new krat 1 2
new plus 2 3
1 5
new krat 2 2
new plus 4 3
2 7


### Why ```list()```

In [5]:
prictenoWOList = (newWay(data, 2))
print(data, '-->', prictenoWOList)    
print('Spustime vypocet')
pricteno2List = list(newWay(data, 2))
print(data, '-->', pricteno2List)    


[0, 1, 2, 3] --> <generator object newWay at 0x7f729c5c07b0>
Spustime vypocet
I am ready to send partial result
I sent partial result
I am ready to send partial result
I sent partial result
I am ready to send partial result
I sent partial result
I am ready to send partial result
I sent partial result
[0, 1, 2, 3] --> [2, 3, 4, 5]


### Generators
Functions containing a ```yield``` expression are generator functions. They exist in many programming languages, for example Python, JavaScript, and C#. Calling such a function does not return a computed value but a generator, which is an object with defined properties and methods. One of its methods (typically ```next```) returns, on repeated calls, the successive values of the sequence being processed.

In the example, the ```list``` function converts the generator into a list. Only at that moment is the computation actually carried out. Study the following lines of code carefully.

Beware of exhausted generators. A generator can be iterated over only once. As soon as you have processed one block of data supplied by the generator and move on to the next block, the previous one is forgotten.

In [6]:
def demo(data, cislo):
    for item in data:
        print('pricitam', item, '+', cislo, '=', item + cislo)
        yield item + cislo

generator = demo(data, 2)
print('generator:', data, '-->', generator)    
print('Teprve ted spustime vypocet')
vysledek = list(generator)
print('výsledek', data, '-->', vysledek)        

generator: [0, 1, 2, 3] --> <generator object demo at 0x7f729c5c0ac0>
Teprve ted spustime vypocet
pricitam 0 + 2 = 2
pricitam 1 + 2 = 3
pricitam 2 + 2 = 4
pricitam 3 + 2 = 5
výsledek [0, 1, 2, 3] --> [2, 3, 4, 5]
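
The warning about exhausted generators can be made concrete with a short sketch. The function name `plus` is only an illustrative stand-in for the generator functions used above.

```python
def plus(data, cislo):
    for item in data:
        yield item + cislo

generator = plus([0, 1, 2, 3], 2)
first = list(generator)   # consumes the generator
second = list(generator)  # nothing is left, so the result is empty
print(first)
print(second)

# when the values are needed more than once, materialize them once ...
values = list(plus([0, 1, 2, 3], 2))
# ... or create a fresh generator for every pass
total = sum(plus([0, 1, 2, 3], 2))
print(values, total)
```

The second call to `list` returns an empty list without raising any error, which is why this mistake is easy to overlook.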


### Generators II
Generators are effectively state machines. They are well suited to defining a sequence of operations over a data stream. The same principle of chaining processing stages underlies pipes in UNIX-like operating systems.
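
A small sketch of both ideas, with the hypothetical names `runningTotal` and `onlyEven`: the first generator keeps its state (the running sum) between successive `next` calls, and the two stages are chained in the way a shell chains commands with `|`.

```python
def runningTotal(sequence):
    total = 0                 # state kept between successive next() calls
    for item in sequence:
        total += item
        yield total

def onlyEven(sequence):
    for item in sequence:
        if item % 2 == 0:
            yield item

# stages are chained much like `producer | filter | consumer` in a UNIX shell
pipeline = runningTotal(onlyEven(range(10)))
firstValue = next(pipeline)
secondValue = next(pipeline)
rest = list(pipeline)         # continues from the state reached so far
print(firstValue, secondValue, rest)
```

No stage ever holds the whole sequence in memory; each value travels through the entire chain before the next one is requested.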

Consider the functions

$g_1(x)=x+1$

$g_2(x)=x+2$

Both functions can be generalized by the function

$f(x, y)=x+y$

because

$g_1(x)=f(x,1)$

$g_2(x)=f(x,2)$

We can define a functional $F$ for which

$F(f, 1)=g_1$

$F(f, 2)=g_2$

Functionals (higher-order functions) are the foundation of functional programming. A language such as F# can be used for functional programming, but elements of functional programming are also available (and often used) in Python, JavaScript, and C#.

In [7]:
def f(x, y):
    return x + y

def F(f, x):
    def g(y):
        return f(x, y)
    return g

g1 = F(f, 1)
g2 = F(f, 2)

print('f(5, 10) =', f(5, 10))
print('g1(3) =', g1(3))
print('g2(3) =', g2(3))

f(5, 10) = 15
g1(3) = 4
g2(3) = 5
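
For this particular kind of functional, fixing one argument of a function, the Python standard library already offers `functools.partial`. The sketch below repeats the example with it; it illustrates the equivalence and does not replace the hand-written `F`.

```python
from functools import partial

def f(x, y):
    return x + y

g1 = partial(f, 1)   # fixes x = 1, the remaining argument is y
g2 = partial(f, 2)   # fixes x = 2

print('g1(3) =', g1(3))
print('g2(3) =', g2(3))
```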


In [8]:
def decorator(f):
    def wrapped(arg):
        def result():
            return f(arg)
        return result
    return wrapped

def addOne(x):
    return x + 1

print(addOne(3))

def addTwo(x):
    return x + 2

modifiAddTwo = decorator(addTwo)
print(modifiAddTwo(3)())

@decorator
def addThree(x):
    return x + 3

print(addThree(1)())

4
5
4


In [9]:
def addOne(x):
    return x + 1


def convertToPipe(f):
    def result(sequence):
        for item in sequence:
            yield f(item)
    return result

data = list(range(9))
print(data)
vectorFunc = convertToPipe(addOne)
result = list(vectorFunc(data))
print(result)

[0, 1, 2, 3, 4, 5, 6, 7, 8]
[1, 2, 3, 4, 5, 6, 7, 8, 9]


In [10]:
def Q(*funcs):
    def result(sequence):
        out = sequence
        for func in funcs:
            out = func(out)
        return out
    return result

def convertToPipe(f):
    def result(sequence):
        for item in sequence:
            yield f(item)
    return result

@convertToPipe
def addOne(x):
    return x + 1

@convertToPipe
def multiplyByTwo(x):
    return x * 2


pipe = Q(addOne, multiplyByTwo)
data = list(range(9))
print(data)
out = list(pipe(data))
print(out)

[0, 1, 2, 3, 4, 5, 6, 7, 8]
[2, 4, 6, 8, 10, 12, 14, 16, 18]


In [11]:
import pandas as pd
def displayData(data):
    df = pd.DataFrame(data)
    display(df)

### Example of Homogeneous Data


In [12]:
dataStudents = [
  (1,'Monique Davis',400,'Literature','Monique@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
  (2,'Teri Gutierrez',800,'Programming','Teri@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
  (3,'Spencer Pautier',1000,'Programming','Spencer@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
  (4,'Louis Ramsey',1200,'Programming','Louis@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
  (5,'Alvin Greene',1200,'Programming','Alvin@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
  (6,'Sophie Freeman',1200,'Programming','Sophie@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
  (7,'Edgar Frank \"Ted\" \"Codd\"',2400,'Computer Science','Edgar@someOtherSchool.edu','2017-08-16 15:35:33','2017-09-02 19:33:56'),
  (8,'Donald D. Chamberlin',2400,'Computer Science','Donald@someOtherSchool.edu','2017-08-16 15:35:33','2017-09-02 19:33:56'),
  (9,'Raymond F. Boyce',2400,'Computer Science','Raymond@someOtherSchool.edu','2017-08-16 15:35:33','2017-09-02 19:33:56')]

In [13]:
dataContactInfo = [
  (1,'Monique.Davis@freeCodeCamp.org','555-555-5551',97111),
  (2,'Teri.Gutierrez@freeCodeCamp.org','555-555-5552',97112),
  (3,'Spencer.Pautier@freeCodeCamp.org','555-555-5553',97113),
  (4,'Louis.Ramsey@freeCodeCamp.org','555-555-5554',0),
  (5,'Alvin.Green@freeCodeCamp.org','555-555-5555',97115),
  (6,'Sophie.Freeman@freeCodeCamp.org','555-555-5556',97116),
  (7,'Maximo.Smith@freeCodeCamp.org','555-555-5557',97117),
  (8,'Michael.Roach@freeCodeCamp.ort','555-555-5558',97118)
]

In [14]:
def tuple2dictionary(names, item):
    result = {}
    for name, value in zip(names, item):
        result[name] = value
    return result

def tuple2dictionarySeq(names, sequence):
    for item in sequence:
        yield tuple2dictionary(names, item)

namesStudents = ['studentID', 'FullName', 'sat_score', 'programOfStudy', 'schoolEmailAdr', 'rcd_Created', 'rcd_Updated']
namesContacts = ['studentID', 'studentEmailAddr',  'student-phone-cell', 'student-US-zipcode']

namedStudents = tuple2dictionarySeq(namesStudents, dataStudents)
namedContacts = tuple2dictionarySeq(namesContacts, dataContactInfo)

displayData(namedStudents)
displayData(namedContacts)

Unnamed: 0,studentID,FullName,sat_score,programOfStudy,schoolEmailAdr,rcd_Created,rcd_Updated
0,1,Monique Davis,400,Literature,Monique@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
1,2,Teri Gutierrez,800,Programming,Teri@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
2,3,Spencer Pautier,1000,Programming,Spencer@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
3,4,Louis Ramsey,1200,Programming,Louis@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
4,5,Alvin Greene,1200,Programming,Alvin@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
5,6,Sophie Freeman,1200,Programming,Sophie@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
6,7,"Edgar Frank ""Ted"" ""Codd""",2400,Computer Science,Edgar@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56
7,8,Donald D. Chamberlin,2400,Computer Science,Donald@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56
8,9,Raymond F. Boyce,2400,Computer Science,Raymond@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56


Unnamed: 0,studentID,studentEmailAddr,student-phone-cell,student-US-zipcode
0,1,Monique.Davis@freeCodeCamp.org,555-555-5551,97111
1,2,Teri.Gutierrez@freeCodeCamp.org,555-555-5552,97112
2,3,Spencer.Pautier@freeCodeCamp.org,555-555-5553,97113
3,4,Louis.Ramsey@freeCodeCamp.org,555-555-5554,0
4,5,Alvin.Green@freeCodeCamp.org,555-555-5555,97115
5,6,Sophie.Freeman@freeCodeCamp.org,555-555-5556,97116
6,7,Maximo.Smith@freeCodeCamp.org,555-555-5557,97117
7,8,Michael.Roach@freeCodeCamp.ort,555-555-5558,97118


## Relational Algebra

Operators
- Union
- Intersection
- Difference (the similarity to set operations is no coincidence)

---

- Selection (SQL ```where```)
- Projection (SQL ```select```)
- Cartesian product (SQL ```join```)
- Rename (SQL ```as```)
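
The first three operators map directly onto Python sets, provided both relations have the same columns. The sketch below uses two small hand-picked sets of `(studentID, FullName)` rows; the variable names `programming` and `highScore` are illustrative assumptions.

```python
# each relation is a set of rows, each row is a tuple (studentID, FullName)
programming = {(2, 'Teri Gutierrez'), (3, 'Spencer Pautier'), (4, 'Louis Ramsey')}
highScore = {(3, 'Spencer Pautier'), (4, 'Louis Ramsey'), (8, 'Donald D. Chamberlin')}

union = programming | highScore          # rows present in at least one relation
intersection = programming & highScore   # rows present in both relations
difference = programming - highScore     # rows of the first relation missing in the second

print(sorted(union))
print(sorted(intersection))
print(sorted(difference))
```

Rows are tuples here because set members must be hashable; the dictionaries used in the rest of this notebook would have to be converted first.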


### Selection in Python

In [15]:
def createSelectFull(queryF, generator):
    result = []
    for item in generator:
        if queryF(item):
            result.append(item)
    return result

def createSelectPartial(queryF):
    def inner(generator):
        result = []
        for item in generator:
            if queryF(item):
                result.append(item)
        return result
    return inner


def createSelectPartialEx(queryF):
    def inner(generator):
        for item in generator:
            if queryF(item):
                yield item
    return inner

studentsData = list(tuple2dictionarySeq(namesStudents, dataStudents))
#newData = createSelectFull(lambda item: item['sat_score'] >= 1000, studentsData)
#namedStudents = tuple2dictionarySeq(namesStudents, dataStudents)

mujFilter = createSelectPartialEx(lambda item: item['sat_score'] >= 1000)
filteredData = mujFilter(studentsData)
print(filteredData)
#filteredData = createSelectFull(lambda item: item['sat_score'] >= 1000, studentsData)

displayData(filteredData)

<generator object createSelectPartialEx.<locals>.inner at 0x7f7261a82d60>


Unnamed: 0,studentID,FullName,sat_score,programOfStudy,schoolEmailAdr,rcd_Created,rcd_Updated
0,3,Spencer Pautier,1000,Programming,Spencer@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
1,4,Louis Ramsey,1200,Programming,Louis@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
2,5,Alvin Greene,1200,Programming,Alvin@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
3,6,Sophie Freeman,1200,Programming,Sophie@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
4,7,"Edgar Frank ""Ted"" ""Codd""",2400,Computer Science,Edgar@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56
5,8,Donald D. Chamberlin,2400,Computer Science,Donald@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56
6,9,Raymond F. Boyce,2400,Computer Science,Raymond@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56


In [16]:
def createSelect(queryF):
    def selectF(generator):
        return filter(queryF, generator)
    return selectF

def createSelectEx(queryF): # same as createSelect
    def selectF(generator):
        return (item for item in generator if queryF(item)) # see https://docs.python.org/3/howto/functional.html
    return selectF

def createSelect2(queryF):
    def selectF(generator):
        for item in generator:
            if queryF(item):
                yield item
    return selectF

studentsWithHighScoreSelection = createSelect(lambda item: item['sat_score'] >= 1000)
namedStudents = tuple2dictionarySeq(namesStudents, dataStudents)

subsetResult = studentsWithHighScoreSelection(namedStudents)
displayData(subsetResult)

Unnamed: 0,studentID,FullName,sat_score,programOfStudy,schoolEmailAdr,rcd_Created,rcd_Updated
0,3,Spencer Pautier,1000,Programming,Spencer@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
1,4,Louis Ramsey,1200,Programming,Louis@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
2,5,Alvin Greene,1200,Programming,Alvin@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
3,6,Sophie Freeman,1200,Programming,Sophie@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56
4,7,"Edgar Frank ""Ted"" ""Codd""",2400,Computer Science,Edgar@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56
5,8,Donald D. Chamberlin,2400,Computer Science,Donald@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56
6,9,Raymond F. Boyce,2400,Computer Science,Raymond@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56


### Projection in Python

In [17]:
from functools import partial

def createProjection(mapF):
    def projectionF(generator):
        for item in generator:
            yield mapF(item)
    return projectionF


#========================================
# This may be easier to follow
def createProjectionAll(mapF, generator):
    for item in generator:
        yield mapF(item)

def createProjectionSpec(mapF):
    return partial(createProjectionAll, mapF)
#========================================


def createProjectionEx(mapF): # same as createProjection
    def projectionF(generator):
        return (mapF(item) for item in generator) # see https://docs.python.org/3/howto/functional.html / Generator expressions and list comprehensions
    return projectionF

def createProjection2(names):
    def projectionF(generator):
        for item in generator:
            result = {}
            for name in names:
                result[name] = item[name]
            yield result
    return projectionF

namedStudents = tuple2dictionarySeq(namesStudents, dataStudents)

#someStudentsColumns = createProjection2(['studentID', 'FullName'])
someStudentsColumns = createProjection(lambda item: {'studentID': item['studentID'], 'FullName': item['FullName'], 'demo': item['FullName'] + item['FullName'] })
someColumnsResult = someStudentsColumns(namedStudents)
displayData(someColumnsResult)

#namedStudents = tuple2dictionarySeq(namesStudents, dataStudents)
#someStudentsColumns2 = createProjection2(['studentID', 'FullName', 'programOfStudy'])
#someColumnsResult2 = someStudentsColumns2(namedStudents)
#displayData(someColumnsResult2)

Unnamed: 0,studentID,FullName,demo
0,1,Monique Davis,Monique DavisMonique Davis
1,2,Teri Gutierrez,Teri GutierrezTeri Gutierrez
2,3,Spencer Pautier,Spencer PautierSpencer Pautier
3,4,Louis Ramsey,Louis RamseyLouis Ramsey
4,5,Alvin Greene,Alvin GreeneAlvin Greene
5,6,Sophie Freeman,Sophie FreemanSophie Freeman
6,7,"Edgar Frank ""Ted"" ""Codd""","Edgar Frank ""Ted"" ""Codd""Edgar Frank ""Ted"" ""Codd"""
7,8,Donald D. Chamberlin,Donald D. ChamberlinDonald D. Chamberlin
8,9,Raymond F. Boyce,Raymond F. BoyceRaymond F. Boyce


### Rename in Python

In [22]:
def createRename(names):
    def renameF(generator):
        for item in generator:
            result = {**item}
            for old, new in names:
                del result[old]
                result[new] = item[old]
            yield result
    return renameF


dataStudents = [
  (1,'Monique Davis',400,'Literature','Monique@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
  (2,'Teri Gutierrez',800,'Programming','Teri@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
  (3,'Spencer Pautier',1000,'Programming','Spencer@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
  (4,'Louis Ramsey',1200,'Programming','Louis@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
  (5,'Alvin Greene',1200,'Programming','Alvin@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
  (6,'Sophie Freeman',1200,'Programming','Sophie@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
  (7,'Edgar Frank \"Ted\" \"Codd\"',2400,'Computer Science','Edgar@someOtherSchool.edu','2017-08-16 15:35:33','2017-09-02 19:33:56'),
  (8,'Donald D. Chamberlin',2400,'Computer Science','Donald@someOtherSchool.edu','2017-08-16 15:35:33','2017-09-02 19:33:56'),
  (9,'Raymond F. Boyce',2400,'Computer Science','Raymond@someOtherSchool.edu','2017-08-16 15:35:33','2017-09-02 19:33:56')]

namedStudents = tuple2dictionarySeq(namesStudents, dataStudents)

renamedColumnsStudents = createRename([('studentID', 'id'), ('FullName', 'name')])
renamedColumnsResults = renamedColumnsStudents(namedStudents)
displayData(renamedColumnsResults)

Unnamed: 0,sat_score,programOfStudy,schoolEmailAdr,rcd_Created,rcd_Updated,id,name
0,400,Literature,Monique@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56,1,Monique Davis
1,800,Programming,Teri@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56,2,Teri Gutierrez
2,1000,Programming,Spencer@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56,3,Spencer Pautier
3,1200,Programming,Louis@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56,4,Louis Ramsey
4,1200,Programming,Alvin@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56,5,Alvin Greene
5,1200,Programming,Sophie@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56,6,Sophie Freeman
6,2400,Computer Science,Edgar@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56,7,"Edgar Frank ""Ted"" ""Codd"""
7,2400,Computer Science,Donald@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56,8,Donald D. Chamberlin
8,2400,Computer Science,Raymond@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56,9,Raymond F. Boyce


### Cartesian Product in Python



In [19]:
from itertools import product
#https://docs.python.org/3/library/itertools.html#itertools.product
def createCartesian(joinF):
    def cartesian(firstG, secondG):
        return map(lambda item: {**item[0], **item[1]}, filter(joinF, product(firstG, secondG)))
    return cartesian

def createCartesianEx(joinF): # same as createCartesian
    def cartesian(firstG, secondG):
        result = ({**f, **g} for f in firstG for g in secondG if joinF((f, g)))
        return result
    return cartesian

def createCartesianReadable(joinF): # same as createCartesian
    def cartesian(firstG, secondG):
        for f in firstG:
            for g in secondG:
                if (joinF((f, g))):
                    yield {**f, **g}
    return cartesian

# lists rather than generators: the nested loops iterate over the second sequence repeatedly
namedStudents = list(tuple2dictionarySeq(namesStudents, dataStudents))
namedContacts = list(tuple2dictionarySeq(namesContacts, dataContactInfo))

joinF = lambda item: item[0]['studentID'] == item[1]['studentID']
cartesian = createCartesianReadable(joinF)

cartesianResult = cartesian(namedStudents, namedContacts)
displayData(cartesianResult)

Unnamed: 0,studentID,FullName,sat_score,programOfStudy,schoolEmailAdr,rcd_Created,rcd_Updated,studentEmailAddr,student-phone-cell,student-US-zipcode
0,1,Monique Davis,400,Literature,Monique@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56,Monique.Davis@freeCodeCamp.org,555-555-5551,97111
1,2,Teri Gutierrez,800,Programming,Teri@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56,Teri.Gutierrez@freeCodeCamp.org,555-555-5552,97112
2,3,Spencer Pautier,1000,Programming,Spencer@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56,Spencer.Pautier@freeCodeCamp.org,555-555-5553,97113
3,4,Louis Ramsey,1200,Programming,Louis@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56,Louis.Ramsey@freeCodeCamp.org,555-555-5554,0
4,5,Alvin Greene,1200,Programming,Alvin@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56,Alvin.Green@freeCodeCamp.org,555-555-5555,97115
5,6,Sophie Freeman,1200,Programming,Sophie@someOtherSchool.edu,2017-08-16 15:34:50,2017-09-02 19:33:56,Sophie.Freeman@freeCodeCamp.org,555-555-5556,97116
6,7,"Edgar Frank ""Ted"" ""Codd""",2400,Computer Science,Edgar@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56,Maximo.Smith@freeCodeCamp.org,555-555-5557,97117
7,8,Donald D. Chamberlin,2400,Computer Science,Donald@someOtherSchool.edu,2017-08-16 15:35:33,2017-09-02 19:33:56,Michael.Roach@freeCodeCamp.ort,555-555-5558,97118
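
The individual operators become more useful once they are chained. The following self-contained sketch repeats simplified versions of `createSelect`, `createProjection`, and the pipe builder `Q` from the earlier cells and combines selection, projection, and rename into one query. The three sample rows are a reduced, illustrative subset of the student data.

```python
def createSelect(queryF):
    def selectF(rows):
        return (row for row in rows if queryF(row))
    return selectF

def createProjection(mapF):
    def projectionF(rows):
        return (mapF(row) for row in rows)
    return projectionF

def Q(*funcs):
    def result(rows):
        out = rows
        for func in funcs:
            out = func(out)
        return out
    return result

students = [
    {'studentID': 1, 'FullName': 'Monique Davis', 'sat_score': 400},
    {'studentID': 3, 'FullName': 'Spencer Pautier', 'sat_score': 1000},
    {'studentID': 9, 'FullName': 'Raymond F. Boyce', 'sat_score': 2400},
]

# roughly: select studentID as id, FullName as name from student where sat_score >= 1000
query = Q(
    createSelect(lambda row: row['sat_score'] >= 1000),
    createProjection(lambda row: {'id': row['studentID'], 'name': row['FullName']}),
)
result = list(query(students))
print(result)
```

Because every stage is lazy, nothing is computed until `list` consumes the final generator.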


## MySQL

In the MySQL environment (see the first stack), use phpMyAdmin to run the following SQL script, taken from [here](https://github.com/SteveChevalier/Distilling-Data/blob/master/schema_data_01_Student%20Schema%20and%20Data.sql):
```sql
-- ---------------------------------------------------
-- Part I - Create and Load Student Schema
-- ---------------------------------------------------
-- Create Schema (database) and set as default
CREATE DATABASE IF NOT EXISTS `student_examples`;
USE `student_examples`;

-- create student and student contact tables
DROP TABLE IF EXISTS `student`; 
CREATE TABLE `student` (
  `studentID` int(11) NOT NULL AUTO_INCREMENT,
  `FullName` text,
  `sat_score` int(11) DEFAULT NULL,
  `programOfStudy` text,
  `schoolEmailAdr` text,
  `rcd_Created` datetime DEFAULT CURRENT_TIMESTAMP,
  `rcd_Updated` timestamp NOT NULL DEFAULT CURRENT_TIMESTAMP ON UPDATE CURRENT_TIMESTAMP,
  PRIMARY KEY (`studentID`));
  
DROP TABLE IF EXISTS `student-contact-info`;
CREATE TABLE `student-contact-info` (
  `studentID` int(11) DEFAULT NULL,
  `studentEmailAddr` text,
  `student-phone-cell` text,
  `student-US-zipcode` int(11) DEFAULT NULL);

-- Load data
INSERT INTO `student` 
	VALUES (1,'Monique Davis',400,'Literature','Monique@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
		(2,'Teri Gutierrez',800,'Programming','Teri@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
        (3,'Spencer Pautier',1000,'Programming','Spencer@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
        (4,'Louis Ramsey',1200,'Programming','Louis@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
        (5,'Alvin Greene',1200,'Programming','Alvin@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
        (6,'Sophie Freeman',1200,'Programming','Sophie@someOtherSchool.edu','2017-08-16 15:34:50','2017-09-02 19:33:56'),
        (7,'Edgar Frank \"Ted\"\" Codd\"',2400,'Computer Science','Edgar@someOtherSchool.edu','2017-08-16 15:35:33','2017-09-02 19:33:56'),
        (8,'Donald D. Chamberlin',2400,'Computer Science','Donald@someOtherSchool.edu','2017-08-16 15:35:33','2017-09-02 19:33:56'),
        (9,'Raymond F. Boyce',2400,'Computer Science','Raymond@someOtherSchool.edu','2017-08-16 15:35:33','2017-09-02 19:33:56');

INSERT INTO `student-contact-info` 
	VALUES (1,'Monique.Davis@freeCodeCamp.org','555-555-5551',97111),
    (2,'Teri.Gutierrez@freeCodeCamp.org','555-555-5552',97112),
    (3,'Spencer.Pautier@freeCodeCamp.org','555-555-5553',97113),
    (4,'Louis.Ramsey@freeCodeCamp.org','555-555-5554',0),
    (5,'Alvin.Green@freeCodeCamp.org','555-555-5555',97115),
    (6,'Sophie.Freeman@freeCodeCamp.org','555-555-5556',97116),
    (7,'Maximo.Smith@freeCodeCamp.org','555-555-5557',97117),
    (8,'Michael.Roach@freeCodeCamp.ort','555-555-5558',97118);


-- end Part I schema create and data import
```

### Selection in SQL
```sql
select * from `student` where `student`.`sat_score` >= 1000
```

### Projection in SQL
```sql
select `studentID`, `FullName` from `student`
```

### Rename in SQL
```sql
select `student`.`sat_score`, `student`.`programOfStudy`,
  `student`.`schoolEmailAdr`, `student`.`rcd_Created`, 
  `student`.`rcd_Updated`, `student`.`studentID` as `id`,
  `student`.`FullName` as `name` from `student`
```


### Cartesian Product in SQL
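```sql
-- Cartesian product of both tables, restricted by the join condition
select * from `student`
  inner join `student-contact-info`
  on `student`.`studentID` = `student-contact-info`.`studentID`
```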
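
The difference between a full Cartesian product and a join can also be tried without a MySQL server. The sketch below uses Python's built-in `sqlite3` module. SQLite in place of MySQL, the reduced two-column tables, and the simplified table name `contact` (standing in for `student-contact-info`) are all assumptions made for the example.

```python
import sqlite3

connection = sqlite3.connect(':memory:')
connection.execute('CREATE TABLE student (studentID INTEGER, FullName TEXT)')
connection.execute('CREATE TABLE contact (studentID INTEGER, phone TEXT)')
connection.executemany('INSERT INTO student VALUES (?, ?)',
                       [(1, 'Monique Davis'), (2, 'Teri Gutierrez'), (9, 'Raymond F. Boyce')])
connection.executemany('INSERT INTO contact VALUES (?, ?)',
                       [(1, '555-555-5551'), (2, '555-555-5552')])

# full Cartesian product: every student paired with every contact row
product = connection.execute('SELECT * FROM student CROSS JOIN contact').fetchall()

# product restricted by the join condition: only matching studentID values remain
joined = connection.execute('''
    SELECT student.studentID, student.FullName, contact.phone
    FROM student JOIN contact ON student.studentID = contact.studentID
    ORDER BY student.studentID
''').fetchall()

print(len(product))
for row in joined:
    print(row)
```

Three students and two contact rows give six combinations, while the join keeps only the two pairs with equal `studentID`; the student without contact data drops out, just as in the Python version above.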

## Database Clusters

To build high-performance databases, individual servers are combined into clusters. Database systems support this in different ways.

In connection with clusters, you may come across the following terms:
- Replication
- Connection pooling
- Load balancing
- Query partitioning (a query executed across several servers)
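
Of these terms, connection pooling is the easiest to illustrate in a few lines. The toy sketch below is not how any particular database driver implements pooling; the class name `ConnectionPool` is hypothetical and `sqlite3` in-memory connections merely stand in for expensive network connections.

```python
import queue
import sqlite3
from contextlib import contextmanager

class ConnectionPool:
    """Toy pool: a fixed set of connections is created once and then reused."""
    def __init__(self, size):
        self._free = queue.Queue()
        for _ in range(size):
            self._free.put(sqlite3.connect(':memory:'))

    @contextmanager
    def connection(self):
        conn = self._free.get()       # blocks while all connections are in use
        try:
            yield conn
        finally:
            self._free.put(conn)      # hand the connection back instead of closing it

pool = ConnectionPool(size=2)
used = set()
for _ in range(5):
    with pool.connection() as conn:
        conn.execute('SELECT 1').fetchone()
        used.add(id(conn))            # remember which connection object served the request
print(len(used))
```

Five requests are served by only two connection objects, which is the whole point of a pool: the cost of opening a connection is paid once, not per request.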

PostgreSQL
- https://www.postgresql.org/docs/9.5/creating-cluster.html
- https://wiki.postgresql.org/wiki/Replication,_Clustering,_and_Connection_Pooling

MySQL
- https://www.digitalocean.com/community/tutorials/how-to-create-a-multi-node-mysql-cluster-on-ubuntu-18-04
- https://dev.mysql.com/doc/refman/8.0/en/mysql-cluster.html

MSSQL
- https://docs.microsoft.com/en-us/sql/sql-server/failover-clusters/install/create-a-new-sql-server-failover-cluster-setup?view=sql-server-ver15


## Layers on Top of the Database

### Models

- SQLAlchemy
- FastAPI
- Swagger
- RESTful interface

#### SQLAlchemy

https://github.com/LeeBergstrand/Jupyter-SQLAlchemy-Tutorial/blob/master/Jupyter-SQLAlchemy.ipynb

In [41]:
#https://docs.sqlalchemy.org/en/13/orm/tutorial.html
#https://docs.sqlalchemy.org/en/14/orm/basic_relationships.html
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, String, BigInteger, Sequence, Table, ForeignKey, DateTime
from sqlalchemy.orm import relationship

BaseModel = declarative_base()

#### Models

In [42]:
unitedSequence = Sequence('all_id_seq')

class UserModel(BaseModel):
    __tablename__ = 'users'

    #id = Column(BigInteger, Sequence('users_id_seq'), primary_key=True)
    id = Column(BigInteger, unitedSequence, primary_key=True)
    name = Column(String)

    def __init__(self, name):
        self.name = name
        
class UserGroupModel(BaseModel):
    __tablename__ = 'usergroups'

    id = Column(BigInteger, unitedSequence, primary_key=True)
    user_id = Column(BigInteger, ForeignKey('users.id'), index=True)
    group_id = Column(BigInteger, ForeignKey('groups.id'), index=True)
    
    #user = relationship('UserModel', uselist=False, back_populates='groups', primaryjoin=user_id==UserModel.id)
    group = relationship('GroupModel', uselist=False, back_populates='users')#, primaryjoin=authorization_id==AuthorizationModel.id)
    

class GroupModel(BaseModel):
    __tablename__ = 'groups'
    
    id = Column(BigInteger, unitedSequence, primary_key=True)
    name = Column(String)
    
    users = relationship('UserGroupModel', back_populates='group', lazy='dynamic', primaryjoin=id==UserGroupModel.group_id)
        

#### Schemas

In [43]:
from typing import List, Optional

from pydantic import BaseModel as BaseSchema

class UserCreateSchema(BaseSchema):
    name: str
        
class UserIdSchema(UserCreateSchema):
    id: int

class UserGetSchema(BaseSchema):
    id: int
    name: str
    class Config:
        orm_mode = True #ensures appropriate translation from SQLAlchemy 
    pass

class UserPutSchema(BaseSchema):
    id: int
    name: str


#### Engine Init

In [None]:
from sqlalchemy import create_engine
from sqlalchemy.orm import sessionmaker
#engine = create_engine('sqlite:///:memory:', echo=True)
#engine = create_engine('postgresql+psycopg2://user:password@hostname/database_name')
engine = create_engine('postgresql+psycopg2://postgres:example@postgres/jupyterII') 
Session = sessionmaker(bind=engine)
session = Session()
BaseModel.metadata.drop_all(engine)
BaseModel.metadata.create_all(engine)

#### CRUD Ops

In [44]:
from sqlalchemy.orm import Session as DbSession  # session class, used only for the type annotations

def crudUserGet(db: DbSession, id: int):
    return db.query(UserModel).filter(UserModel.id==id).first()

def crudUserGetAll(db: DbSession, skip: int = 0, limit: int = 100):
    return db.query(UserModel).offset(skip).limit(limit).all()

def crudUserCreate(db: DbSession, user: UserCreateSchema):
    userRow = UserModel(name=user.name)
    db.add(userRow)
    db.commit()
    db.refresh(userRow)
    return userRow

def crudUserUpdate(db: DbSession, user: UserPutSchema):
    userToUpdate = db.query(UserModel).filter(UserModel.id==user.id).first()
    if userToUpdate is None: # no user with the given id
        return None
    userToUpdate.name = user.name if user.name else userToUpdate.name
    db.commit()
    db.refresh(userToUpdate)
    return userToUpdate

#### Server

In [9]:
!pip install uvicorn
!pip install fastapi
!pip install wait4it

Collecting uvicorn
  Downloading uvicorn-0.13.4-py3-none-any.whl (46 kB)
Collecting h11>=0.8
  Downloading h11-0.12.0-py3-none-any.whl (54 kB)
Installing collected packages: h11, uvicorn
Successfully installed h11-0.12.0 uvicorn-0.13.4
Collecting fastapi
  Downloading fastapi-0.63.0-py3-none-any.whl (50 kB)
Collecting pydantic<2.0.0,>=1.0.0
  Downloading pydantic-1.8.1-cp38-cp38-manylinux2014_x86_64.whl (13.7 MB)
Collecting starlette==0.13.6
  Downloading starlette-0.13.6-py3-none-any.whl (59 kB)
Installing collected packages: starlette, pydantic, fastapi
Successfully installed fastapi-0.63.0 pydantic-1.8.1 starlette-0

#### Minimal Code

In [23]:
import uvicorn
from fastapi import FastAPI

app = FastAPI()#root_path='/api')

def run():
    uvicorn.run(app, port=9992, host='0.0.0.0', root_path='')

#### Helper Func for Notebook

In [24]:
# Code in this cell is just for (re)starting the API on a Process, and other compatibility stuff with Jupyter cells.
# Just ignore it!

from multiprocessing import Process
from wait4it import wait_for

_api_process = None

def start_api(runNew=True):
    """Stop the API if running; Start the API; Wait until API (port) is available (reachable)"""
    global _api_process
    if _api_process:
        _api_process.terminate()
        _api_process.join()
    
    if runNew:
        _api_process = Process(target=run, daemon=True)
        _api_process.start()
        wait_for(port=9992)

def delete_route(method: str, path: str):
    """Delete the given route from the API. This must be called on cells that re-define a route"""
    # iterate over a copy: removing items from the list being iterated would skip routes
    for route in list(app.routes):
        if method in (getattr(route, 'methods', None) or ()) and route.path == path:
            app.routes.remove(route)
    

In [25]:
def delete_all_routes():
    rr = [*app.routes]
    for item in rr:
        app.routes.remove(item)

#### First API Endpoint

In [26]:
delete_all_routes()

@app.get("/api")
def get_root():
    return {"Hello": "World"}

start_api()

INFO:     Started server process [483]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:9992 (Press CTRL+C to quit)


INFO:     172.17.0.1:46234 - "GET /docs HTTP/1.1" 200 OK
INFO:     172.17.0.1:46234 - "GET /openapi.json HTTP/1.1" 200 OK


INFO:     Shutting down
INFO:     Waiting for application shutdown.
INFO:     Application shutdown complete.
INFO:     Finished server process [483]


In [27]:
start_api(False)

#### Database CRUD Endpoint

In [53]:
delete_all_routes()

@app.get("/users/{id}", response_model=UserGetSchema)
#@app.get("/users/{id}")
def userGet(id: int):
    #result = crudUserGet(db=session, id=id)
    result = {'id': 45, 'name': 'Hrbolek', 'password': 'extraultrahesozahesovane'}
    return result

@app.get("/users", response_model=List[UserGetSchema])
def userGetAll(skip: Optional[int]=0, limit: Optional[int]=100):
    #result = crudUserGetAll(db=session, skip=skip, limit=limit)
    #return result
    pass

@app.post("/users")#, response_model=UserIdSchema)
def userPost(user: UserCreateSchema):
    #print('userPut')
    #return crudUserCreate(db=session, user=user)
    pass

@app.put("/users", response_model=UserGetSchema)
def userPut(user: UserPutSchema):
    #result = crudUserUpdate(db=session, user=user)
    #return result
    pass

start_api()

INFO:     Started server process [526]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:9992 (Press CTRL+C to quit)


INFO:     172.17.0.1:38450 - "GET /users/12 HTTP/1.1" 200 OK


In [51]:
start_api(False)

### Extensions

GraphQL

https://www.apollographql.com/docs/apollo-server/

A question of system integration.

## LINQ

LINQ (Language Integrated Query) is a query language integrated into .NET. It makes it possible to apply relational algebra to data sources directly within the programming language.

C#:

https://docs.microsoft.com/cs-cz/dotnet/csharp/programming-guide/concepts/linq/

Visual Basic:

https://docs.microsoft.com/cs-cz/dotnet/visual-basic/programming-guide/language-features/linq/introduction-to-linq

It is very often used in .NET Core for database access.

## Normal Forms

> **Recommended videos**
>
> https://www.youtube.com/watch?v=7B9FnIIIsQc
> 
> https://www.youtube.com/watch?v=xoTyrdT9SZI

## Additional Study Resources

https://www.freecodecamp.org/news/best-sql-database-tutorial/

https://www.freecodecamp.org/news/sql-foreign-key-vs-primary-key-explained-with-mysql-syntax-examples/

https://cs.wikipedia.org/wiki/Rela%C4%8Dn%C3%AD_algebra

https://www.w3schools.com/sql/

https://web.archive.org/web/20110124214346/http://www.dcs.fmph.uniba.sk/~plachetk/TEACHING/DB2009/db2009_4.pdf

