[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/prokaj/elte-python-2023/blob/main/2023-12-11.ipynb)   

A few imports that we will use later:

In [None]:
import importlib.util  # find_spec lives in the importlib.util submodule
import math
import graphviz
import matplotlib.pyplot as plt
from pprint import pprint
from tqdm.auto import tqdm
import itertools

from IPython.display import SVG, Image, display_html

if importlib.util.find_spec('pycosat') is None:
    ! pip install --quiet pycosat

if importlib.util.find_spec('ipytest') is None:
    ! pip install --quiet ipytest

import ipytest

ipytest.autoconfig()

## Exercises from the lecture

### Removing repetitions


Remove the adjacent repeated characters from a given string.

Example:

```python
"kukkkuuuurrrriiiikuuuuuuuu" -> "kukuriku".
```

(We have seen this before. This time let us look for a more elegant solution, e.g. with the help of functions from the `itertools` library.)


In [None]:
import itertools

In [None]:
s = "aabcdefg"
list(itertools.pairwise(s)), list(itertools.groupby(s))

In [None]:

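def rm_duplicates_a(string):
    # string[:1] instead of string[0], so the empty string does not raise IndexError
    return string[:1] + ''.join(b for a, b in itertools.pairwise(string) if a != b)

def rm_duplicates_b(string):
    # groupby yields one (key, group) pair per run of equal characters
    return ''.join(a for a, _ in itertools.groupby(string))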


In [None]:
%%ipytest

def test_rm_duplicates_a():
    s = "aabbbc"
    assert rm_duplicates_a(s) == "abc"

def test_rm_duplicates_b():
    s = "aabbbc"
    assert rm_duplicates_b(s) == "abc"


In [None]:
s = "kukkkuuuurrrriiiikuuuuuuuu"
print(f"{s=}, {rm_duplicates_a(s)=}, {rm_duplicates_b(s)=}")
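To see why the `groupby` variant works, it can help to look at the runs it produces. The following is a minimal sketch; the helper name `run_lengths` is made up for this illustration and is not part of the lecture code.

```python
import itertools

def run_lengths(string):
    """Return (character, run length) pairs for consecutive equal characters."""
    return [(key, sum(1 for _ in group)) for key, group in itertools.groupby(string)]

runs = run_lengths("kukkkuuuurrrriiiikuuuuuuuu")
# keeping only the keys gives the string without adjacent repetitions
deduplicated = ''.join(key for key, _ in runs)
print(runs)
print(deduplicated)
```

Each run contributes exactly one character to the result, which is what `rm_duplicates_b` relies on.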

### Computing with polynomials

Write a `Polynomial` class that stores a polynomial as the list of its coefficients.

Implement the addition of two such polynomials (that is, write the `__add__` method, and also the `__repr__` method so that we can see the "contents" of such an instance).

A polynomial is also a function. Implement the `__call__` method as well, which evaluates the polynomial at a given point.


In [None]:
from itertools import dropwhile
from fractions import Fraction


In [None]:
def format_term(k, coeff, fmt="x^{k}"):
    sep = "-" if coeff < 0 else "+"
    coeff = abs(coeff)
    c = f"{coeff}" if coeff != 1 else ""
    match k:
        case 0:
            x = ""
            c = f"{coeff}"
        case 1:
            x = "x"
        case _:
            x = fmt.format(k=k)
    return f"{c}{x}", sep

class Polynomial:
    def __init__(self, *coefficients):
        self.coeff = tuple(dropwhile(lambda x: x==0, coefficients))[::-1]

    def degree(self):
        return len(self.coeff)-1 if self.coeff else 0

    def __eq__(self, other):
        return isinstance(other, type(self)) and self.coeff == other.coeff

    def __add__(self, other):

        coeff_a = self.coeff
        coeff_b = other.coeff

        if len(coeff_a) < len(coeff_b):
            coeff_a, coeff_b = coeff_b, coeff_a

        coeff = list(coeff_a)

        for i, c in enumerate(coeff_b):
            coeff[i] += c

        return Polynomial(*reversed(coeff))

    def __sub__(self, other):
        return self+(-1)*other

    def __mul__(self, other):
        coeff = [0]*(len(self.coeff)+len(other.coeff)-1)
        for i, ca in enumerate(self.coeff):
            for j, cb in enumerate(other.coeff):
                coeff[i+j] += ca*cb

        return Polynomial(*reversed(coeff))

    def __rmul__(self, c):
        coeff = [c0*c for c0 in self.coeff]
        return Polynomial(*reversed(coeff))

    def leading_coeff(self):
        return self.coeff[-1] if self.coeff else 0

    def __divmod__(self, other):
        if not isinstance(other, type(self)):
            raise TypeError

        m = Polynomial()
        r = self
        main_coeff = other.leading_coeff()
        other_dg = other.degree()

        # stop once the remainder is zero: degree() also returns 0 for the zero polynomial
        while r.coeff and r.degree() >= other_dg:
            c = r.leading_coeff()/main_coeff
            m += c*monomial(r.degree()-other_dg)
            r = self-m*other
        return m, r

    def __mod__(self, other):
        return divmod(self, other)[1]

    def __floordiv__(self, other):
        return divmod(self, other)[0]

    def __call__(self, x):
        value = 0
        for c in reversed(self.coeff):
            value *= x
            value += c
        return value

    def as_token_list(self, formatter, fmt):
        coeffs = [(k, coeff) for k, coeff in enumerate(self.coeff) if coeff != 0]
        if len(coeffs) == 0:
            coeffs = [(0, 0)]
        tokens = [token for k, coeff in coeffs for token in formatter(k, coeff, fmt)]
        if tokens[-1] == "+":
            tokens.pop()
        tokens.reverse()
        return tokens

    def __str__(self):
        return ''.join(self.as_token_list(format_term, "x^{k}"))

    def __repr__(self):
        return f"{type(self).__name__}({', '.join(map(str, reversed(self.coeff)))})"

    def _repr_latex_(self):
        formula = ''.join(self.as_token_list(format_term, "x^{{{k}}}"))
        return rf"$x\mapsto {formula}$"  # raw string: \m is not a valid escape sequence


def monomial(degree, unit=1):
    coeff = [unit]+[0]*degree
    return Polynomial(*coeff)

A few examples:

In [None]:
p = Polynomial(-1, -0.0, -3)    # -> -x^2 - 3
q = Polynomial(2, 0, -1, 1)  # -> 2x^3 - x + 1
display(p)
display(q)
display(p*q)
display(2*p)
print(str(q))
## p**2

In [None]:
p = Fraction(1,1)*p
q = Fraction(1,1)*q

m, r = divmod(q, p)
display(m)
display(r)
display(p)
display(q)
display(m*p)
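The `__call__` method above evaluates the polynomial with Horner's scheme. As a standalone sketch on a plain coefficient list (highest degree first, the same order in which the `Polynomial` constructor takes its arguments), it could look as follows; the name `horner` is chosen here only for illustration.

```python
def horner(coefficients, x):
    """Evaluate a polynomial given by its coefficients, highest degree first."""
    value = 0
    for c in coefficients:
        # multiply the partial result by x, then add the next coefficient
        value = value * x + c
    return value

# 2x^3 - x + 1 at x = 2, evaluated as ((2*2 + 0)*2 - 1)*2 + 1
print(horner([2, 0, -1, 1], 2))
```

This needs only one multiplication and one addition per coefficient, and it never computes a power of `x` explicitly.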

#### An earlier exercise

Write a function that computes the sum of the $p$-th powers of the first $n$ natural numbers.

E.g. for `p = 0`,

```Python
def f0(n):
    return n
```

works, because $k^0=1$ for $k=1,\dots,n$, and these sum to exactly $n$.

If `p = 1`, then

```Python
def f1(n):
    return n*(n+1)//2
```

works, because $\sum_{k=1}^n k = n(n+1)/2$.

We have also learned the case `p = 2`:

```Python
def f2(n):
    return n*(n+1)*(2*n+1)//6
```

Can we write such a function for general $p$?

In [None]:
def mk_power_sum(p):
    def f(n):
        total = 0
        for k in range(1, n+1):
            total += k**p
        return total

    f.__doc__ = f"""
        computes the sum of the {p}-th powers
        """

    return f

In [None]:
f2_slow = mk_power_sum(2)

In [None]:
f2_slow?

In [None]:
[f2_slow(i) for i in range(0, 10)]

In [None]:
def f2_fast(n):
    return n*(n+1)*(2*n+1)//6

In [None]:
%timeit f2_slow(10_000)
%timeit f2_fast(10_000)

### Idea

$$
    \sum_{k=r}^n  \binom{k}{r} = \binom{n+1}{r+1}
$$

**Proof.**
Choose $r+1$ distinct numbers from $\{1,2,\dots,n+1\}$.

The total number of choices is
$$
\binom{n+1}{r+1}.
$$

Now count the cases again, grouped by the value of the largest chosen number.

If the largest number is $k+1$, then the remaining $r$ numbers are chosen from $\{1,2,\dots, k\}$. So the number of cases is
$$
    \sum_{k+1=r+1}^{n+1} \binom{k}{r} =  \sum_{k=r}^{n} \binom{k}{r},
$$
where the summation variable is $k$ instead of $k+1$.
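The identity is easy to check numerically for small values. A minimal sketch using `math.comb` from the standard library; the function name `hockey_stick_holds` is an assumption made for this example.

```python
import math

def hockey_stick_holds(n, r):
    """Check sum_{k=r}^n C(k, r) == C(n+1, r+1) for one pair (n, r)."""
    return sum(math.comb(k, r) for k in range(r, n + 1)) == math.comb(n + 1, r + 1)

# all pairs 0 <= r <= n < 30
print(all(hockey_stick_holds(n, r) for n in range(30) for r in range(n + 1)))
```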

The same in another form.

$$
\binom{k}{r} = \frac{1}{r!} k(k-1)\cdots(k-r+1) = \frac{1}{r!}p_r(k-r+1),\quad\text{where}\quad p_r(x) = x(x+1)\cdots(x+r-1)
$$
and
$$
    \sum_{j=1}^{n-r+1} \frac{1}{r!}p_r(j) = \frac{1}{(r+1)!}p_{r+1}(n+1-(r+1)+1)= \frac{1}{(r+1)!} p_{r+1}(n-r+1)\quad\text{for all $n\geq r$ and $r\geq 0$}
$$

The upper limit of the summation and the argument of $p_{r+1}$ coincide, so after renaming $n-r+1$ to $n$ and multiplying by $r!$,

$$
\sum_{j=1}^{n} p_r(j) = \frac{1}{r+1}p_{r+1}(n)
$$

**Linear algebra.**

$$
p_0\equiv 1,\quad p_1(x)=x,\quad p_2(x)=x(x+1),\quad\dots,\quad p_r(x)=x(x+1)\cdots(x+r-1)
$$

form a basis of the vector space of polynomials of degree at most $r$.

$$
    x^r = \sum_{i=0}^r a_i p_i(x)
$$
and
$$
    \sum_{k=1}^n k^r = \sum_{k=1}^n \sum_{i=0}^r a_i p_i(k) =  \sum_{i=0}^r a_i  \sum_{k=1}^n p_i(k) = \sum_{i=0}^r \frac{a_i}{i+1}  p_{i+1}(n)
$$
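As a worked instance, take $r=2$: since $p_2(x)=x^2+x$, we have $x^2=p_2(x)-p_1(x)$, and summing each $p_i$ with the rule above gives $\sum_{k=1}^n k^2 = p_3(n)/3 - p_2(n)/2$. The sketch below checks this against the direct sum; `rising` and `sum_of_squares` are names introduced only for this illustration.

```python
from fractions import Fraction

def rising(i, x):
    """p_i(x) = x (x+1) ... (x+i-1), with p_0 = 1."""
    result = 1
    for j in range(i):
        result *= x + j
    return result

def sum_of_squares(n):
    # x^2 = p_2(x) - p_1(x); p_2 sums to p_3(n)/3 and p_1 sums to p_2(n)/2
    return Fraction(rising(3, n), 3) - Fraction(rising(2, n), 2)

print([int(sum_of_squares(n)) for n in range(6)])
```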


### Summary



- A polynomial can be represented by its coefficients.
- We need a function that rewrites a polynomial given in the natural basis $1, x, x^2,\dots$ in the basis $p_0,p_1,\dots$.
- In the basis $p_0, p_1, \dots$ summation is easy: essentially, the coefficients just have to be shifted.
- A polynomial given in the basis $p_0,p_1,\dots$ is either converted back to the natural basis, or we write a function that evaluates it at a given point.

Note that $p_0$ is identically 1, $p_1(0)=0$, $p_2(0)=p_2(-1)=0$, and so on.

If $f=\sum_i a_i p_i$, then
$$
    f(0) = \sum_i a_i p_i(0) = a_0, \quad f(-1) = a_0 p_0(-1) + a_1 p_1(-1),\quad f(-k) = a_0 p_0(-k) + a_1 p_1(-k) + \cdots + a_k p_k(-k).
$$
from which

$$
\begin{align*}
    a_0 & = f(0)\\
    a_1 & = \frac{f(-1) - a_0 p_0(-1)}{p_1(-1)}\\
    \vdots\\
    a_k & = \frac{f(-k) - \sum_{j=0}^{k-1} a_j p_j(-k)}{p_{k}(-k)}\\
    \vdots
\end{align*}
$$  
We can also use that
$$
p_k(-k)=(-k)(-k+1)\cdots(-k+(k-1))=(-1)^k k!.
$$
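The triangular system above can be solved one coefficient at a time. Here is a minimal standalone sketch that works on a plain callable instead of the `Polynomial` class; the names `rising` and `rising_basis_coefficients` are hypothetical and chosen only for this example.

```python
from fractions import Fraction
import math

def rising(i, x):
    """p_i(x) = x (x+1) ... (x+i-1), with p_0 = 1."""
    result = 1
    for j in range(i):
        result *= x + j
    return result

def rising_basis_coefficients(f, degree):
    """Coefficients a_0..a_degree with f = sum a_i p_i, from f(0), f(-1), ..."""
    a = []
    for k in range(degree + 1):
        # contribution of the already known coefficients at the point -k
        known = sum(a_j * rising(j, -k) for j, a_j in enumerate(a))
        # divide by p_k(-k) = (-1)^k k!
        a.append(Fraction(f(-k) - known, (-1) ** k * math.factorial(k)))
    return a

# x^3 = p_1 - 3 p_2 + p_3
print(rising_basis_coefficients(lambda x: x ** 3, 3))
```

Using `Fraction` keeps the coefficients exact, in the same spirit as the `Fraction(1, 1)` unit used with `monomial` in the notebook.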

In [None]:
def basis():
    i = 0
    p = Polynomial(1)
    while True:
        yield p
        p *= Polynomial(1,i)
        i += 1


In [None]:
from itertools import islice

Let us check that
$$
p_k(-k)=(-k)(-k+1)\cdots(-k+(k-1))=(-1)^k k!.
$$


In [None]:
for i, p in zip(range(5), basis()):
    display(p)
    print(f"{i=}, {p(-i)=}")
    print("-"*20)

In [None]:
from typing import Callable

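def mk_fast_polynomial_sum(p: Polynomial) -> Polynomial:
    """return a polynomial q such that q(n)=sum_{k=1}^n p(k)"""
    q = Polynomial()   # the summed polynomial, built up term by term
    q0 = Polynomial()  # the part of p already expressed in the basis p_0, p_1, ...
    if p == q0:
        return q  # zero polynomial: the loop below would leave coeff undefined

    it = basis()
    i = 0

    while q0 != p:
        pi = next(it)
        if i > 0:
            # a_{i-1} p_{i-1} sums to (a_{i-1}/i) p_i
            q += (coeff/i)*pi
        # a_i from the triangular system, using the value at -i
        coeff = ((p(-i) - q0(-i))/pi(-i))
        q0 += coeff*pi
        i += 1

    q += (coeff/i)*next(it)
    return q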

def mk_fast_power_sum(k:int) -> Callable[[int], int]:
    coeff = mk_fast_polynomial_sum(monomial(k, unit=Fraction(1,1))).coeff
    f = math.lcm(*[c.denominator for c in coeff])
    p = Polynomial(*[int(c*f) for c in reversed(coeff)])
    def fun(x: int) -> int:
        return p(x)//f

    fun.__doc__ = f"computes sum_{{j=1}}^x j^{k}"
    return fun


#### Quick check.

In [None]:
%%ipytest

def test_fast_power_sum():
    for i in range(1, 4):
        f = mk_fast_power_sum(i)
        g = mk_power_sum(i)
        for n in range(1000):
            assert f(n) == g(n)

In [None]:
f2 = mk_fast_polynomial_sum(monomial(2, Fraction(1,1)))
f2a = mk_fast_power_sum(2)

In [None]:
%timeit f2_slow(10_000)
%timeit f2(10_000)
%timeit f2a(10_000)
%timeit f2_fast(10_000)

Computing with fractions is a bit slower!

How does `f2a` remember the computed values of `p` and `f`? They are stored in the attribute named `__closure__`.

In [None]:
print(f"{f2.__dict__}")
for cell in f2a.__closure__:
    print(cell.cell_contents)
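The same mechanism can be seen on a smaller function. In the sketch below, `make_scaler` is a made-up example, not part of the lecture code; `__code__.co_freevars` lists the captured names and `__closure__` holds the matching cells in the same order.

```python
def make_scaler(factor, offset):
    def scale(x):
        # factor and offset are free variables captured from the enclosing call
        return factor * x + offset
    return scale

scale = make_scaler(3, 1)
# pair every captured name with the content of its closure cell
captured = dict(zip(scale.__code__.co_freevars,
                    (cell.cell_contents for cell in scale.__closure__)))
print(captured, scale(2))
```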


In [None]:
f2a = mk_fast_power_sum(2)
print(f"{f2a.__closure__[0].cell_contents=}, {f2a(2)=}")
f2a.__closure__[0].cell_contents = 3
print(f"{f2a.__closure__[0].cell_contents=}, {f2a(2)=}")

In [None]:
help(mk_fast_power_sum(2))

In [None]:
[mk_fast_power_sum(2)(i) for i in range(5)]

In [None]:
mk_fast_polynomial_sum(monomial(8))

The same polynomial with rational coefficients:

In [None]:
mk_fast_polynomial_sum(monomial(8, unit=Fraction(1, 1)))

In [None]:
q = mk_fast_polynomial_sum(monomial(4, Fraction(1,1)))
display(q)
display(30*q)

The polynomials that compute the first few power sums:

In [None]:
for k in range(10):
    p_k = mk_fast_polynomial_sum(monomial(k, Fraction(1, 1)))
    f = math.lcm(*(c.denominator for c in p_k.coeff))
    print(f"{k=}, {p_k=!s:>50}, {f:>3}*p_k={f*p_k!s:>40}")


Is it true that the roots are rational?

$\sum_{i=0}^n i^k = P_{k+1}(n)$

For $k=0,1,2,3$, yes.

$$
\begin{aligned}
    x^2+x &= x(x+1)\\
    2x^3+3x^2+x &= x(x+1)(2x+1)\\
    x^4+2x^3+x^2&= x^2(x+1)^2
\end{aligned}
$$


In [None]:
p_5 = mk_fast_polynomial_sum(monomial(4, unit=Fraction(1, 1)))

print(f"{p_5(0)=!s}, {p_5(-1)=!s}, {p_5(Fraction(-1, 2))=!s}, {p_5.degree()=}")

We have found three roots. Let $q=x(x+1)(2x+1)$. Then the roots of $q$ are exactly the roots we found.

In [None]:
q = monomial(1)*Polynomial(1,1)*Polynomial(2,1)
display(q)
print(f"{q(0)=!s}, {q(-1)=!s}, {q(Fraction(-1, 2))=!s}")

In [None]:
p, r = divmod(30*p_5, q)
print(f"{p=!s}, {r=!s}")

Roots:
$$
    x_{1,2}=\frac{-3\pm\sqrt{3^2-4\cdot 3\cdot(-1)}}{2\cdot3}=\frac12\left(-1\pm\sqrt{\frac7{3}}\right)
$$
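The formula above solves the quadratic $3x^2+3x-1$. A short numerical sketch confirms the two roots and shows that the discriminant $21$ is not a perfect square, so these roots are not rational; the variable names are chosen for this example only.

```python
import math

disc = 3**2 - 4 * 3 * (-1)   # discriminant of 3x^2 + 3x - 1
roots = [(-3 + sign * math.sqrt(disc)) / (2 * 3) for sign in (1, -1)]
# a rational root would require the discriminant to be a perfect square
is_perfect_square = math.isqrt(disc) ** 2 == disc
print(roots, is_perfect_square)
```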

In [None]:
import numpy as np  # needed here; the notebook imports numpy only in a later cell

xs = np.linspace(-2, 1, 201)
plt.plot(xs, [p(x) for x in xs], label="p(x)")
plt.plot(xs, [10*p_5(x) for x in xs], label="10*p_5(x)")
plt.legend()
plt.grid()
plt.show()

# A graph-theoretic algorithm

## Problem

We are given an undirected graph with `n` vertices as a list of edges. The vertices are labelled from `0` to `n-1`, and each edge is given by its two endpoints.

We are also given a start vertex and a target vertex. We want to decide whether the target can be reached from the start along the edges of the graph.

We can think of the graph as a road network, and the question is whether we can get from `A` to `B`.

E.g. `n = 3`, edges `edges = [[0,1], [1,2], [2,0]]`, `A =  0`, `B = 2`.

A useful library for displaying graphs is `graphviz`.

In [None]:
import importlib.util
if importlib.util.find_spec('graphviz') is None:
    ! pip install graphviz
import graphviz

In [None]:
edges = [[0,1], [1,2], [2,0]]
g0 = graphviz.Graph()

g0.edges([(str(a), str(b)) for a, b in edges])
g0

The graph is connected, so for any `A`, `B` the answer is `True`.

In [None]:
n = 6
edges = [[0,1],[0,2],[3,5],[5,4],[4,3]]
A = 0
B = 5

g1 = graphviz.Graph()
g1.edges([(str(a), str(b)) for a, b in edges])
g1

There is no path between 0 and 5. The answer is `False`.
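For cross-checking the answers on small inputs, a plain breadth-first search is enough. Note that this is a different technique from the component-based approach the notes develop; the function `reachable` is an assumption introduced for this sketch.

```python
from collections import deque

def reachable(n, edges, start, target):
    """Breadth-first search on an undirected graph with vertices 0..n-1."""
    neighbours = [[] for _ in range(n)]
    for a, b in edges:
        neighbours[a].append(b)
        neighbours[b].append(a)
    seen = {start}
    queue = deque([start])
    while queue:
        v = queue.popleft()
        if v == target:
            return True
        for w in neighbours[v]:
            if w not in seen:
                seen.add(w)
                queue.append(w)
    return False

print(reachable(3, [[0, 1], [1, 2], [2, 0]], 0, 2))
print(reachable(6, [[0, 1], [0, 2], [3, 5], [5, 4], [4, 3]], 0, 5))
```

A search like this answers one query at a time, whereas the component bookkeeping answers many queries after a single pass over the edges.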

We want to compute connected components.



## Idea.

We start from the graph with no edges. Here every component consists of a single vertex.

From each component we choose a representative, and for every vertex $i$ we record which component it belongs to.




In [None]:
def show_graph(roots, direction = 'LR'):
    g = graphviz.Digraph(graph_attr={'rankdir': direction})
    g.edges((str(i), str(r)) for i, r in enumerate(roots))
    return g

In [None]:
n = 5
roots = [i for i in range(n)]
display(show_graph(roots, 'TD'))

If we add the edge $(0, 1)$, then $0$ and $1$ end up in the same component. We may choose either representative of the two merged components; let it be 1, for example.

In [None]:
roots[0] = 1
display(show_graph(roots, 'TD'))

If we now want to add the edge (0, 2), we cannot simply overwrite `roots[0]`. We have to find the representative of the component of `0`, which is 1, and then either set `roots[1]` to 2 or set `roots[2]` to 1.

In [None]:
def find(roots, a):
    while a != roots[a]:
        a = roots[a]
    return a

In [None]:
find(roots, 0), find(roots, 2)

In [None]:
def union(roots, a, b):
    ra = find(roots, a)
    rb = find(roots, b)
    roots[ra] = rb

In [None]:
union(roots, 0, 2)
print(f"After union(0, 2) {roots=}")
display(show_graph(roots))

union(roots, 3, 4)
print(f"After union(3, 4) {roots=}")
print(roots)
display(show_graph(roots))

union(roots, 3, 2)
print(f"After union(3, 2) {roots=}")
print(roots)
display(show_graph(roots))

After this, the question whether `B` can be reached from `A` is easy to decide. If `A` and `B` are in the same component, then there is a path between them in the original graph; otherwise there is not.

Example 1:
`n = 3`, edges `edges = [[0,1], [1,2], [2,0]]`, `A =  0`, `B = 2`.

In [None]:
def show_edges(edges, direction='LR'):
    g = graphviz.Graph(graph_attr={'rankdir': direction})
    g.edges([(str(a), str(b)) for a, b in edges])
    return g

In [None]:
n = 3
edges = [[0,1], [1,2], [2,0]]
A =  0
B = 2

display(show_edges(edges, 'TD'))
print("Original graph")
print("="*50)

roots = [i for i in range(n)]
for a, b in edges:
    union(roots, a, b)

display(show_graph(roots))
print(f"{A=} and {B=} are in {'the same component' if find(roots, A)==find(roots, B) else 'different components'}")

Example 2:

In [None]:
n = 6
edges = [[0,1],[0,2],[3,5],[5,4],[4,3]]
A = 0
B = 5

display(show_edges(edges))
print("Original graph")
print("="*50)

roots = [i for i in range(n)]
for a, b in edges:
    union(roots, a, b)

display(show_graph(roots))
print(f"{A=} and {B=} are in {'the same component' if find(roots, A)==find(roots, B) else 'different components'}")


We see that we first have to create the `roots` array and then work with it.

It makes sense to create a class:

In [None]:
class UnionFind:
    def __init__(self, n):
        self.roots = [i for i in range(n)]

    def find(self, a):
        while a != self.roots[a]:
            a = self.roots[a]
        return a

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra != rb:
            self.roots[ra] = rb
        return self

    def show(self, *args, **kwargs):
        display(show_graph(self.roots, *args, **kwargs))

What happens if we have a larger graph?

In [None]:
n = 10
edges = [(0,i+1) for i in range(n-1)]
display(show_edges(edges, 'TD'))

uf = UnionFind(n)
for a, b in edges:
    uf.union(a, b)

uf.show()

Every time we add the edge $(0, i)$, we have to find the representative of $0$. After $k$ edges have been added, this takes $k$ steps, so the total number of steps grows quadratically with the number of edges. If the graph has 10_000 vertices instead of 10, this becomes too slow.
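The growth of the chain can be measured directly. The sketch below replays the same unions on a plain list and counts the steps; `find_with_steps` and the `chain_*` names are hypothetical helpers for this illustration.

```python
def find_with_steps(roots, a):
    """Naive find that also reports how many links were followed."""
    steps = 0
    while a != roots[a]:
        a = roots[a]
        steps += 1
    return a, steps

chain_n = 10
chain_roots = list(range(chain_n))
total_steps = 0
for i in range(1, chain_n):
    ra, steps = find_with_steps(chain_roots, 0)
    total_steps += steps
    chain_roots[ra] = i   # the old representative now points to the new vertex
depth_of_0 = find_with_steps(chain_roots, 0)[1]
print(depth_of_0, total_steps)
```

For `chain_n` vertices the vertex `0` ends up `chain_n - 1` links away from its representative, and the total work is the sum `0 + 1 + ... + (chain_n - 2)`.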

### Possible improvements.

- When we look up the representative of $i$, we walk along the path leading to the representative. For every visited vertex the representative becomes known, so we can write it into the `roots` array (path compression).

- Attach the smaller component to the larger one, not the other way round. For this we have to keep track of the "size".

In [None]:
def find_better(roots, a):
    ra = roots[a]
    if a != ra:
        ra = find_better(roots, ra)
        roots[a] = ra
    return ra

def find_better_without_recursion(roots, a):
    stack = []

    ra = roots[a]
    while a != ra:
        stack.append(a)
        a = ra
        ra = roots[a]

    while stack:
        roots[stack.pop()] = ra

    return ra


def union_sizes(roots, sizes, a, b):
    ra = find_better(roots, a)
    rb = find_better(roots, b)
    if ra != rb:
        if sizes[ra] < sizes[rb]:
            ra, rb = rb, ra
        roots[rb] = ra
        sizes[ra] += sizes[rb]

def union_ranks(roots, ranks, a, b):
    ra = find_better(roots, a)
    rb = find_better(roots, b)
    if ra != rb:
        if ranks[ra] < ranks[rb]:
            ra, rb = rb, ra
        roots[rb] = ra
        if ranks[ra] == ranks[rb]:
            ranks[ra] += 1

In [None]:
n = 10
edges = [(0, i) for i in range(1, n)]
display(show_edges(edges, 'TD'))

roots = [i for i in range(n)]

for a, b in edges:
    ra = find_better(roots, a)
    rb = find_better(roots, b)
    roots[ra] = rb

display(show_graph(roots, 'LR'))


In [None]:
n = 10
edges = [(0, i) for i in range(1, n)]
display(show_edges(edges, 'TD'))

roots = [i for i in range(n)]

for a, b in edges:
    ra = find_better(roots, a)
    rb = find_better(roots, b)
    roots[rb] = ra

display(show_graph(roots, 'TD'))


In [None]:
n = 10
edges = [(0, i) for i in range(1, n)]
display(show_edges(edges, 'TD'))

roots = [i for i in range(n)]
sizes = [1]*n

for a, b in edges:
    union_sizes(roots, sizes, a, b)

display(show_graph(roots, 'TD'))


In [None]:
n = 10
edges = [(0, i) for i in range(1, n)]
display(show_edges(edges, 'TD'))

roots = [i for i in range(n)]
ranks = [0]*n

for a, b in edges:
    union_ranks(roots, ranks, a, b)

display(show_graph(roots, 'TD'))


### The usual implementation

In [None]:
class UnionFind:
    def __init__(self, n):
        self.roots = [i for i in range(n)]
        self.sizes = [1]*n

    def find(self, a):
        ra = self.roots[a]
        if a != ra:
            ra = self.find(ra)
            self.roots[a] = ra
        return ra

    def union(self, a, b):
        ra = self.find(a)
        rb = self.find(b)
        if ra != rb:
            if self.sizes[ra] < self.sizes[rb]:
                ra, rb = rb, ra
            self.roots[rb] = ra
            self.sizes[ra] += self.sizes[rb]


In [None]:
uf = UnionFind(10)
print(uf)
uf.union(1, 2)
print(uf)
uf.union(5, 9)
uf.union(6, 7)
print(uf)

uf

The `__str__` and `__repr__` methods

In [None]:
def as_set(seq):
    return f"{{{', '.join(map(str, seq))}}}"

def uf_str(self):
    components = {}
    for a in range(len(self.roots)):
        ra = self.find(a)
        if ra not in components:
            components[ra] = []
        components[ra].append(a)
    return f"{{{ ', '.join(map(as_set, components.values()))}}}"

def uf_repr(self):
    return f"{type(self).__name__}({len(self.roots)})"

# This also works: methods can be attached to an existing class
UnionFind.__str__ = uf_str
UnionFind.__repr__ = uf_repr

In [None]:
uf = UnionFind(10)
print(uf)
uf.union(1, 2)
print(uf)
uf.union(5, 9)
uf.union(6, 7)
print(uf)

uf

### Further questions

- Suppose we are interested in the number of components. How could we make it available in constant time?
- Suppose we are interested in the size of the largest component. How could we make it available in constant time?
- How would we check whether two partitions are identical?
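One possible answer to the first two questions is to update two counters inside `union`. The class below is only a sketch under that assumption: `CountingUnionFind`, `n_components` and `max_size` are names invented here, and `find` uses path halving, a variant of path compression, rather than the recursive version shown earlier.

```python
class CountingUnionFind:
    """Union-find that also tracks the component count and the largest size."""

    def __init__(self, n):
        self.roots = list(range(n))
        self.sizes = [1] * n
        self.n_components = n
        self.max_size = 1 if n > 0 else 0

    def find(self, a):
        while a != self.roots[a]:
            # path halving: point a to its grandparent, then move up
            self.roots[a] = self.roots[self.roots[a]]
            a = self.roots[a]
        return a

    def union(self, a, b):
        ra, rb = self.find(a), self.find(b)
        if ra == rb:
            return
        if self.sizes[ra] < self.sizes[rb]:
            ra, rb = rb, ra
        self.roots[rb] = ra
        self.sizes[ra] += self.sizes[rb]
        # both counters change only when two components are really merged
        self.n_components -= 1
        self.max_size = max(self.max_size, self.sizes[ra])

counting_uf = CountingUnionFind(6)
for a, b in [(0, 1), (0, 2), (3, 5), (5, 4), (4, 3)]:
    counting_uf.union(a, b)
print(counting_uf.n_components, counting_uf.max_size)
```

Reading either counter is then a plain attribute access, so it takes constant time.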

# Approximating the value of the factorial

How large is $n!$ when $n$ is large?

Idea:
$$
    \log n! = \sum_{k=1}^n \log k \approx \int_1^{?} \log x dx = \left[ x(\log x-1)\right]_{x=1}^{x=?}
$$

In [None]:
import matplotlib.pyplot as plt
import math

def subdivision(a, b, n):
    d = (b-a)/n
    return [a+i*d for i in range(n+1)]

def add_function_curve(f, a, b, n=100):
    xs = subdivision(a, b, n)
    fxs = [f(x) for x in  xs]
    plt.plot(xs, fxs, "r-")

## Rectangle approximation

In [None]:
k_values = list(range(1, 11))
# corners of the step function: the rectangle over [k, k+1] has height log(k)
step_xs = [k+i for k in k_values for i in range(2)]
step_ys = [math.log(k) for k in k_values for i in range(2)]
plt.fill_between(step_xs, step_ys, color='lightblue')

add_function_curve(math.log, 1, 11)


The error:

In [None]:
def primitive_function(x):
    return x*(math.log(x)-1)

def rectangle_error(k):
    return primitive_function(k+1)-primitive_function(k)-math.log(k)

def cummulative_error(n, error_fun=rectangle_error):
    return sum(error_fun(k) for k in range(1, n+1))

In [None]:
for n in [10, 100, 1000]:
    print(f"{n=}, {cummulative_error(n)=}")

### Improvement: trapezoid sum approximation

Instead of rectangles, on every unit interval we use the inscribed trapezoid.

In [None]:
k_values = list(range(1, 11))
# the polygon through the points (k, log k) is the top edge of the inscribed trapezoids
plt.fill_between(k_values, [math.log(k) for k in k_values], color='lightblue')

add_function_curve(math.log, 1, 10)


In [None]:
def error_fun(x):
    k, t = divmod(x, 1)
    return math.log(x) - ((1-t)*math.log(k) + t*math.log(k+1))

xs = subdivision(1, 10, 100)
plt.fill_between(xs, [error_fun(x) for x in xs], color='lightblue')
add_function_curve(error_fun, 1, 10, 500)


In [None]:
def modified_error_fun(x):
    k = x//1
    return k*k*error_fun(x)

for a in [1, 100, 1000, 10000]:
    b = a+10
    xs = subdivision(a, b, 1000)

    plt.fill_between(xs, [modified_error_fun(x) for x in xs], color='lightblue')
    add_function_curve(modified_error_fun, a, b, 1000)
    plt.show()

In [None]:
def trapezoid_error(k):
    return primitive_function(k+1)-primitive_function(k)-0.5*(math.log(k)+math.log(k+1))


In [None]:
for n in [10, 100, 1000]:
    print(f"{n = :>4}, {cummulative_error(n, trapezoid_error) = :.8f}")

## The error grows slowly. Can we give an upper bound for it?

$$
    \int_{k}^{k+1} \log x dx = \int_0^1 \log(k+x) dx
$$

The area of the inscribed trapezoid, written as an integral:

$$
    \int_0^1 x\log(k+1)+(1-x)\log(k) dx
$$

Can we estimate the difference of the two integrands?
$$
    \log(k+x) - (x\log(k+1)+(1-x)\log(k))
$$

Rewriting, we subtract $\log(k)$ from both terms:
$$
    \log(k+x) - \log(k) - (x\log(k+1)+(1-x)\log(k) -\log(k)) = \log(1+\tfrac{x}{k}) - x\log(1+\tfrac1k)
$$


In [None]:
xs = subdivision(0, 1, 100)

for k in range(1, 4):
    plt.fill_between(xs, [math.log(k+x)-math.log(k) for x in xs], [math.log(k+1)*x+math.log(k)*(1-x)-math.log(k) for x in xs], color="blue")

plt.grid()

The logarithm is concave: its derivative ($1/x$) is monotonically decreasing, and an inscribed chord always lies below the tangent lines drawn at its endpoints. In particular, the tangent at $1$ gives

$$
    \log (1+x)\leq x
$$

hence
$$
    \log(1+x/k) - x \log(1+1/k)\leq \frac xk - x\log(1+1/k) = x(1/k -\log(1+1/k))
$$

but
$$
    \log(1+1/k) = \log\frac{k+1}{k} = -\log\frac{k}{k+1} = -\log(1-1/(k+1)) \geq \frac1{k+1}
$$

Thus

$$
    \log(1+x/k) - x \log(1+1/k)\leq x\left(\frac1k-\frac1{k+1}\right)
$$
and
the error made on the $k$-th interval is at most
$$
    \int_{0}^{1} \log(k+x) - ((1-x)\log(k)+x\log(k+1))dx \leq \int_0^1 x dx \left(\frac1k -\frac1{k+1}\right) = \frac12\left(\frac1k -\frac1{k+1}\right)
$$
The sum of the errors is at most
$$
    \frac12 \sum_{k=1}^\infty \left(\frac1k -\frac1{k+1}\right) =\frac12
$$
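The per-interval bound $\frac12\left(\frac1k-\frac1{k+1}\right)$ can be checked numerically against the exact gap between the integral and the inscribed trapezoid. This is a self-contained sketch; `log_integral` and `trapezoid_gap` are names introduced for this example.

```python
import math

def log_integral(a, b):
    """Integral of log x over [a, b], using the antiderivative x (log x - 1)."""
    def antiderivative(x):
        return x * (math.log(x) - 1)
    return antiderivative(b) - antiderivative(a)

def trapezoid_gap(k):
    """Integral of log over [k, k+1] minus the area of the inscribed trapezoid."""
    return log_integral(k, k + 1) - 0.5 * (math.log(k) + math.log(k + 1))

bound_holds = all(trapezoid_gap(k) <= 0.5 * (1 / k - 1 / (k + 1))
                  for k in range(1, 1001))
total_gap = sum(trapezoid_gap(k) for k in range(1, 1001))
print(bound_holds, total_gap)
```

The computed total stays well below the bound $1/2$, which suggests that the estimate is far from tight.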

In summary: the inscribed trapezoids lie below the curve, so the integral exceeds the trapezoid sum and

$$
    \log n! = \int_1^n \log x dx + \frac12 \log n - r_n = n(\log(n) - 1) + 1 + \frac12 \log (n) - r_n
$$
where $r_n$ is the error of the approximation on the first $n-1$ unit intervals
$$
    r_n = \sum_{k=1}^{n-1} \int_0^1 \log(1+x/k)-x\log(1+1/k)dx \leq 1/2
$$
$(r_n)$ is monotonically increasing and bounded, so it has a limit.

Rewritten in terms of the factorial:

$$
    n! = \sqrt{n}\left(\frac{n}{e}\right)^n c_n
$$
where $c_n=e^{1-r_n}$, so $e^{1/2}\leq c_n\leq e$.

In analysis, the following appears as a consequence of the Wallis formula:

$$
\lim c_n = \sqrt{2\pi}
$$

This is the famous **Stirling** formula:
$$
 \frac{n!}{\sqrt{2\pi n}\left(\frac{n}{e}\right)^n} \to 1
$$
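The constant $c_n$, defined by $n! = \sqrt{n}\,(n/e)^n c_n$, can be computed for moderately large $n$ and compared with $\sqrt{2\pi}$. The sketch below works with logarithms through `math.lgamma` to avoid overflow; the function name `c` is chosen for this example only.

```python
import math

def c(n):
    """c_n = n! / (sqrt(n) (n/e)^n), computed through logarithms."""
    # lgamma(n + 1) is log(n!)
    log_c = math.lgamma(n + 1) - 0.5 * math.log(n) - n * (math.log(n) - 1)
    return math.exp(log_c)

for n in [1, 10, 100, 1000]:
    print(n, c(n), c(n) / math.sqrt(2 * math.pi))
```

The last column is the ratio from the Stirling formula, so it should approach 1 as `n` grows.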


### The Wallis formula with high-school tools


We could compute the limit using only high-school knowledge. For this we have to study the sequence
$$
    I_n = \int_{0}^{\pi/2} \cos^n(x) dx
$$
Integrating by parts:
$$
    I_{n+2} = \int_{0}^{\pi/2} (1-\sin^2x)\cos^n(x) dx
    = I_n - \int_{0}^{\pi/2} \sin^2(x)\cos^n(x) dx
$$
where
$$
\begin{aligned}
-\int_{0}^{\pi/2} \sin^2(x)\cos^n(x) dx
& =
\frac1{n+1} \int_{0}^{\pi/2} \sin(x)(\cos^{n+1}(x))' dx
\\
&=
\frac{1}{n+1}\left[\sin(x)\cos^{n+1}(x)\right]_{x=0}^{x=\pi/2}-\frac{1}{n+1}\int_0^{\pi/2}\cos^{n+2}(x)dx
\\
&= -\frac{1}{n+1} I_{n+2}
\end{aligned}
$$

**In summary**:

$$
    I_{n+2} = \int_{0}^{\pi/2} (1-\sin^2x)\cos^n(x) dx
    = I_n - \frac{1}{n+1}I_{n+2} = \frac{n+1}{n+2} I_n
$$

If $n=2k$ is even, then, using $I_0=\pi/2$,
$$
    I_{2k} = \frac{2k-1}{2k} I_{2k-2} = \frac{(2k-1)(2k-3)}{2k(2k-2)} I_{2k-4} =\cdots=\frac{(2k-1)!!}{2^k k!}I_0=\frac{1}{2^{2k}}\binom{2k}{k}\frac{\pi}2
$$
If $n=2k+1$ is odd, then, using $I_1=1$,
$$
    I_{2k+1} = \frac{2k}{2k+1} I_{2k-1} = \frac{(2k)(2k-2)}{(2k+1)(2k-1)} I_{2k-3} =\cdots=\frac{2^k k!}{(2k+1)!!}I_1=\frac{2^{2k}}{(2k+1)\binom{2k}{k}}=\frac{2^{2(k+1)}}{2(k+1)\binom{2(k+1)}{k+1}}
$$
Since $(I_n)$ is monotonically decreasing:
$$
    I_{2k-1}=\frac{2^{2k}}{2k\binom{2k}{k}}  > I_{2k} = \frac{1}{2^{2k}}\binom{2k}{k}\frac{\pi}2  > I_{2k+1}= \frac{2^k k!}{(2k+1)!!}I_1=\frac{2^{2k}}{(2k+1)\binom{2k}{k}}
$$
After rearranging,
$$
\frac{1}{(2k+1)\pi/2} < \left(\frac{1}{2^{2k}}\binom{2k}{k}\right)^2 < \frac{1}{2k\pi/2}
$$
and
$$
    \lim_{n\to\infty}\sqrt{n} \frac{1}{2^{2n}}\binom{2n}{n} = \frac{1}{\sqrt{\pi}}
$$
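The limit can be observed numerically with exact binomial coefficients. A minimal sketch, where `wallis_term` is a name introduced for this example:

```python
import math

def wallis_term(n):
    """sqrt(n) * C(2n, n) / 4^n; the two exact integers are divided first."""
    return math.sqrt(n) * (math.comb(2 * n, n) / 4 ** n)

for n in [1, 10, 100, 1000]:
    print(n, wallis_term(n), 1 / math.sqrt(math.pi))
```

Python divides the two large integers directly, so there is no overflow even though `4 ** 1000` is far outside the range of a float.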

On the other hand, using the approximation of the factorial:
$$
    \sqrt{n}\frac1{2^{2n}}\binom{2n}{n}
    = \sqrt{n} \frac1{2^{2n}} \frac{c_{2n} \sqrt{2n} \left(\frac{2n}{e}\right)^{2n}}{c_n^2 n\left(\frac{n}{e}\right)^{2n}}
    \to \lim_{n\to\infty} \frac{\sqrt{2}c_{2n}}{c_n^2} = \lim_{n\to\infty}\frac{\sqrt{2}}{c_n}=\frac1{\sqrt{\pi}}
$$
from which $\lim_n c_n =\sqrt{2\pi}$.



# Conway's Game of Life


The Game of Life is a cellular automaton created by mathematician John Conway in 1970. The game consists of a board of cells that are either on or off. One creates an initial configuration of these on/off states and observes how it evolves. There are four simple rules to determine the next state of the game board, given the current state:

- **Overpopulation**: if a living cell is surrounded by more than three living cells, it dies.
- **Stasis**: if a living cell is surrounded by two or three living cells, it survives.
- **Underpopulation**: if a living cell is surrounded by fewer than two living cells, it dies.
- **Reproduction**: if a dead cell is surrounded by exactly three living cells, it becomes a live cell.


Write a class for the game: for example, the `__init__` method should create the object that corresponds to the given configuration. There should be a `step` method that moves the system to its next state, and the `__str__` method should display the current state in some way.

Assume that the grid on which the system lives is an $n\times n$ grid that wraps around cyclically in both directions, that is, the coordinates are taken modulo $n$.


In [None]:
class ConwayGoL:

    def __init__(self, state):
        self.state = list(state)

    def step(self):
        return self

    def __repr__(self):
        return f"{type(self).__name__}({self.state})"

In [None]:
import random

init_state = [] ## ???
conway = ConwayGoL(init_state)

conway.step()


To describe the state of the game, we need to know for every point of an $m\times n$ grid whether it is occupied or not.

```python
m, n = 11, 11
state = [[0]*n for _ in range(m)]
```

A random initial state:
```python
state = [[random.randint(0,1) for _ in range(n)] for _ in range(m)]
```

In [None]:
def random_state(n, m, p):
    return [[int(random.random()<p) for _ in range(m)] for _ in range(n)]

state = random_state(5, 6, 0.2)
print(state)

A nicer display?

In [None]:
def as_matrix(lst, n):
    return [lst[i:i+n] for i in range(0, len(lst), n)]

print('\n'.join(''.join(map(str, line)) for line in state))


In [None]:
for symbols in [
    "\u2b1c\u2b1b",
    "·♥",
    "🟡🟥"
    ]:
    print('\n'.join(''.join(symbols[x] for x in line) for line in state))


In [None]:
import matplotlib.pyplot as plt

img = plt.matshow(state, cmap="Pastel1", vmax=1, vmin=0, alpha=0.8)
img.axes.axis("off")
n, m = len(state), len(state[0])
for pos in range(0, n+1):
    img.axes.axhline(y=pos-0.5, color="gray")
for pos in range(0, m+1):
    img.axes.axvline(x=pos-0.5, color="gray")

plt.show()


In [None]:

def cgol_str(self):
    symbols = "\u2b1c\u2b1b"
    return '\n'.join(''.join(symbols[x] for x in line) for line in self.state)

ConwayGoL.__str__ = cgol_str



In [None]:
conway = ConwayGoL(state)
print(conway)

For the `step` method we need to compute `cnt`, the number of occupied neighbours of a given cell. Once this is done,
the new state of cell $i$ is:

$$
    \text{state}_{t+1}[i]=
    \begin{cases}
    1 &\text{if $\text{cnt}[i]\in\{2,3\}$ and $\text{state}_t[i]=1$}\\
    1 &\text{if $\text{cnt}[i]\in\{3\}$ and $\text{state}_t[i]=0$}\\
    0 &\text{otherwise}
    \end{cases}
$$

In [None]:
def newstate(state, count):
    return [int((c==3)|((c==2) & (s==1))) for s, c in  zip(state, count)]

In [None]:
import ipytest
ipytest.autoconfig()

In [None]:
%%ipytest

def test_newstate():
    res = [0]*9
    res[2] = 1
    res[3] = 1
    assert newstate([1]*9, list(range(9))) == res
    res = [0]*9
    res[3] = 1
    assert newstate([0]*9, list(range(9))) == res


In [None]:
def count_neighbors(state):
    delta = [(0,-1), (0, 1), (1,-1), (1,0), (1,1), (-1,-1), (-1,0), (-1,1)]
    m, n = len(state), len(state[0])
    return [ [sum(state[(i+di) % m][(j+dj) % n] for di, dj in delta) for j in range(n)] for i in range(m)]

def cgol_step(self):
    counts = count_neighbors(self.state)
    self.state = [ newstate(line, cnt) for line, cnt in zip(self.state, counts) ]
    return self

ConwayGoL.step = cgol_step

In [None]:
@classmethod
def cgol_from_random_state(cls, n, m,  p):
    return cls(random_state(n, m, p))

ConwayGoL.from_random_state=cgol_from_random_state

In [None]:
conway = ConwayGoL.from_random_state(5, 10, 0.2)
print(conway)
print(*count_neighbors(conway.state), sep='\n')

In [None]:
conway = ConwayGoL.from_random_state(5, 5, 0.25)
print(conway)
print("-"*20)
print(conway.step())
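A random state makes it hard to judge whether a step is correct. A known pattern such as the "blinker" (three cells in a row that flip between horizontal and vertical) gives a deterministic check. The sketch below is self-contained and does not use the `ConwayGoL` class; `life_step` is a name made up for this illustration.

```python
def life_step(state):
    """One Game of Life step on a grid that wraps around in both directions."""
    m, n = len(state), len(state[0])
    offsets = [(di, dj) for di in (-1, 0, 1) for dj in (-1, 0, 1)
               if (di, dj) != (0, 0)]
    new = []
    for i in range(m):
        row = []
        for j in range(n):
            # count the living neighbours, indices taken modulo the grid size
            cnt = sum(state[(i + di) % m][(j + dj) % n] for di, dj in offsets)
            row.append(int(cnt == 3 or (cnt == 2 and state[i][j] == 1)))
        new.append(row)
    return new

horizontal = [[0] * 5, [0] * 5, [0, 1, 1, 1, 0], [0] * 5, [0] * 5]
vertical = life_step(horizontal)
print(*vertical, sep='\n')
```

Applying `life_step` once more should return the original horizontal pattern.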

Can we make something like an animation? In a Jupyter notebook it can be done, for example, as follows:

In [None]:
from ipywidgets import Output
from time import sleep


In [None]:

out = Output()
display(out)
conway = ConwayGoL.from_random_state(n=21, m=51, p=0.5)

for i in range(50):
    out.clear_output(True)
    with out:
        print(f"After {i} steps:\n{conway}")
    sleep(0.15)
    conway.step()


## The same with a `numpy` array

In [None]:
import numpy as np

### Random `state`

In [None]:
def random_state_np(m, n, p):
    return np.random.binomial(1, p, size=(m, n)).astype(np.int8)


In [None]:
print(random_state_np(11, 21, 0.2))

### `__str__` with Unicode characters

In [None]:

symbols_array = np.array(["\u2b1c", "\u2b1b"])

def str_state_np(state):
    return '\n'.join(map(''.join, symbols_array[state]))

In [None]:
print(str_state_np(random_state_np(11, 21, 0.2)))

### Neighbor counts with `pad`

In [None]:
def count_neighbors_np(state, mode='wrap'):
    count = np.pad(state, pad_width=((1,1), (1,1)), mode=mode)
    count = count[2:]+ count[1:-1] + count[:-2]
    count = count[:, 2:] + count[:, 1:-1] + count[:,:-2]
    return count-state
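
The padding trick can be cross-checked against a different formulation. The sketch below is only an illustration: `count_neighbors_pad` restates the function above for the wrapped case, and `count_neighbors_roll` is a hypothetical alternative that adds up the eight shifted copies of the grid with `np.roll`. Both should give the same counts on a wrapped grid.

```python
import numpy as np


def count_neighbors_pad(state):
    # Pad with wrapped copies of the borders, then sum 3x3 windows.
    padded = np.pad(state, pad_width=1, mode="wrap")
    rows = padded[2:] + padded[1:-1] + padded[:-2]
    total = rows[:, 2:] + rows[:, 1:-1] + rows[:, :-2]
    return total - state  # the window also contains the cell itself


def count_neighbors_roll(state):
    # Hypothetical alternative: sum the 8 cyclically shifted grids.
    total = np.zeros_like(state)
    for di in (-1, 0, 1):
        for dj in (-1, 0, 1):
            if (di, dj) != (0, 0):
                total += np.roll(state, shift=(di, dj), axis=(0, 1))
    return total


rng = np.random.default_rng(0)
state = rng.integers(0, 2, size=(6, 9), dtype=np.int8)
print(np.array_equal(count_neighbors_pad(state), count_neighbors_roll(state)))
```

The `pad` version also works with other boundary modes (e.g. `mode='constant'` for a grid with dead borders), which the `roll` version does not offer.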

In [None]:
%%ipytest

def test_count_neighbors():
    for _ in range(5):
        state_np = random_state_np(10, 10, 0.2)
        state = state_np.tolist()
        assert count_neighbors(state) == count_neighbors_np(state_np).tolist()

In [None]:
state = random_state_np(5, 8, 0.2)
print(str_state_np(state))
print(count_neighbors_np(state))

In [None]:
def new_state_np(state):
    count = count_neighbors_np(state)
    return ((count == 3)|((count == 2) & (state==1))).astype(np.int8)

In [None]:
%%ipytest

def test_step():
    for _ in range(5):
        state_np = random_state_np(10, 10, 0.2)
        state = state_np.tolist()
        cgol = ConwayGoL(state)
        assert cgol.step().state == new_state_np(state_np).tolist()


In [None]:
x = np.arange(5)
(x<4)&(x>2)

In [None]:
state0 = random_state_np(11, 21, 0.2)
state1 = new_state_np(state0)
print(str_state_np(state0))
print('='*50)
print(str_state_np(state1))

In [None]:
state = random_state_np(11, 21, 0.2)
plt.imshow(state, cmap='Pastel1_r', vmax=1, vmin=0)
plt.xticks(np.arange(state.shape[1]+1)-.5, minor=True)
plt.xticks([])
plt.yticks(np.arange(state.shape[0]+1)-.5, minor=True)
plt.yticks([])
plt.grid(which="minor", color="gray", linestyle='-', linewidth=1)
plt.axis()

for (i, j), cnt in np.ndenumerate(count_neighbors_np(state)):
    plt.text(j, i, str(cnt), ha="center", va="center")

plt.title("State (red/white) with neighbor counts")
plt.show()

In [None]:
plt.imshow(state, cmap='Pastel1_r', vmax=1, vmin=0)
plt.xticks(np.arange(state.shape[1]+1)-.5, minor=True)
plt.xticks([])
plt.yticks(np.arange(state.shape[0]+1)-.5, minor=True)
plt.yticks([])
plt.grid(which="minor", color="gray", linestyle='-', linewidth=1)
plt.axis()

for (i, j), cnt in np.ndenumerate(new_state_np(state)):
    plt.text(j, i, str(cnt), ha="center", va="center")

plt.title("State (red/white), new_state (0/1)")
plt.show()

In [None]:
out = Output()
display(out)
state = random_state_np(n=51, m=21, p=0.2)

for i in range(50):
    out.clear_output(True)
    sleep(0.15)
    state = new_state_np(state)
    with out:
        print(f"After {i+1} steps:\n{str_state_np(state)}")


## Command-line script

When working from the command line, we could do something like this:

In [None]:
%%writefile conway.py

import random


def random_state(m, n, p):
    return [ [ int(random.random()<p) for _ in range(n) ] for _ in range(m) ]

def count_neighbors(state):
    delta = [(0,-1), (0, 1), (1,-1), (1,0), (1,1), (-1,-1), (-1,0), (-1,1)]
    m, n = len(state), len(state[0])
    return [ [sum(state[(i+di) % m][(j+dj) % n] for di, dj in delta) for j in range(n)] for i in range(m)]

def newstate(state, count):
    return [int((c==3)|((c==2) & (s==1))) for s, c in  zip(state, count)]

class ConwayGoL:
    symbols = "\u2b1c\u2b1b"

    def __init__(self, state):
        self.state = list(state)

    def step(self):
        counts = count_neighbors(self.state)
        self.state = [ newstate(line, cnt) for line, cnt in zip(self.state, counts) ]
        return self


    def __str__(self):
        symbols = self.symbols
        return '\n'.join(''.join(symbols[x] for x in line) for line in self.state)

    def __repr__(self):
        return f"{type(self).__name__}({self.state})"

    @classmethod
    def from_random_state(cls, m, n, p):
        return cls(random_state(m, n, p))

    def is_empty(self):
        return not any(any(line) for line in self.state)


def clear_terminal(n):
    print(f"{chr(27)}[{n+1}A", end="")

def main(m=11, n=25, p=0.2, nsteps=10, clear_screen=clear_terminal):
    from time import sleep
    conway = ConwayGoL.from_random_state(m, n, p)
    for i in range(nsteps+1):
        if i>0:
            clear_screen(m)
        print(f"after {i} step:")
        print(conway)
        sleep(0.2)
        conway.step()
        if conway.is_empty():
            break

if __name__ == "__main__":
    main()


Once we have written something and want to use it, it is available via `import`.

### Can we pass parameters to a Python script?

When we run a Python script, the command line (the one that started the run) is available in the `argv` variable of the `sys` module.

In [None]:
import sys
sys.argv

In [None]:
! python -c 'import sys; print(sys.argv)' -alma


A very simple solution is to name each option after the parameter it sets and to give its value after an equals sign,
e.g. `-n=11 -m=25 -nsteps=10 -p=0.2`.

In [None]:
cmdline = "conway.py -n=11 -m=25 -nsteps=10 -p=0.2"
argv = cmdline.split()
params =[param.split("=") for param in argv[1:]]
params

We need to know the type of every parameter!

In [None]:
param_types={'-n': int, '-m': int, '-nsteps': int, '-p': float}
params = {k.replace("-",""): param_types[k](v)  for k, v in (param.split("=") for param in argv[1:])}
params

After this we can call the `main` function with the given parameters:

```
    main(**params)
```
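
The hand-made parsing can be wrapped into a small helper. This is only a sketch: the name `parse_simple` is made up for this example, and the option is spelled `-nsteps` so that the resulting keys match the keyword arguments of `main`. Unlike the one-liner, it reports arguments it cannot interpret.

```python
def parse_simple(argv, param_types):
    """Turn ['-n=11', '-p=0.2'] into {'n': 11, 'p': 0.2}."""
    params = {}
    for arg in argv:
        key, sep, value = arg.partition("=")
        if not sep or key not in param_types:
            raise ValueError(f"cannot parse argument: {arg!r}")
        # Convert the text to the declared type and drop the leading dash.
        params[key.lstrip("-")] = param_types[key](value)
    return params


param_types = {"-n": int, "-m": int, "-nsteps": int, "-p": float}
cmdline = "conway.py -n=11 -m=25 -nsteps=10 -p=0.2"
params = parse_simple(cmdline.split()[1:], param_types)
print(params)
```
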
What about default values, `help`, and so on?

We could write all of this ourselves, but there is no need: Python has a ready-made solution.

The `argparse` library does everything we need.

In [None]:
import argparse

help(argparse)

Replace the end of the `conway.py` file with this:
```python
if __name__ == "__main__":
    import argparse
    
    parser = argparse.ArgumentParser(description="Conway's Game of Life")

    parser.add_argument(
        '-n', '--nrows',
        type=int,
        default=11,
        help='number of rows'
        )

    parser.add_argument(
        '-m', '--ncols',
        type=int,
        default=25,
        help='number of columns'
        )

    parser.add_argument(
        '-p', '--density',
        type=float,
        default=0.2,
        help='initial density')
    
    parser.add_argument(
        '--nsteps',
        type=int,
        default=10,
        help='steps to display'
        )

    args = parser.parse_args()
    print(args)
    
    main(n=args.ncols, m=args.nrows, p=args.density, nsteps=args.nsteps)
```
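
The parser can also be tried inside the notebook, without a real command line, because `parse_args` accepts an explicit list of strings. A minimal sketch with the same options as above; the function name `build_parser` is made up for this example.

```python
import argparse


def build_parser():
    # Same options as in the script; names after '--' become attribute names.
    parser = argparse.ArgumentParser(description="Conway's Game of Life")
    parser.add_argument("-n", "--nrows", type=int, default=11, help="number of rows")
    parser.add_argument("-m", "--ncols", type=int, default=25, help="number of columns")
    parser.add_argument("-p", "--density", type=float, default=0.2, help="initial density")
    parser.add_argument("--nsteps", type=int, default=10, help="steps to display")
    return parser


# Options that are not given fall back to their defaults.
args = build_parser().parse_args(["-n", "5", "--density", "0.5"])
print(vars(args))
```

`vars(args)` turns the result into a dictionary, which is convenient when the option names have to be mapped onto function parameters.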
    

In [None]:
# import importlib
# importlib.reload(conway)

In [None]:
import conway

out1 = Output()
display(out1)

with out1:
    conway.main(clear_screen=lambda n: out1.clear_output(True))

The `argparse` library is not the most convenient one. Alternatives:

- [Docopt](http://docopt.org/)
- [Click](https://pypi.org/project/click/)
- [clize](https://github.com/epsy/clize)

and there are many others!

## We could have used a decorator here too


In the `ConwayGoL` example we added methods to our class after the fact. We could have done this with a decorator as well.

In [None]:
def conway_method(f):
    setattr(ConwayGoL, f.__name__, f)
    return f


@conway_method
def dummy_method(self):
    print("this is a message from the new method!")

c = ConwayGoL.from_random_state(10, 10, 0.2)
c.dummy_method()

We could also have avoided hard-coding the class into the decorator.

In [None]:
def new_method(cls):
    def decorator(f):
        setattr(cls, f.__name__, f)
        return f
    return decorator

@new_method(ConwayGoL)
def dummy_method(self):
    print("Note that the old value of dummy_method is overwritten!")

In [None]:
c.dummy_method()


# Pandas



Pandas is a Python library for working with tabular data.

In [None]:
import pandas as pd

The most important data type is the `DataFrame`.

It is something like a common generalization of a matrix and a list.

Within a column all elements have the same type, but different columns may have different types.

Columns have names, rows have an index.

In [None]:
df = pd.DataFrame({'numbers': list(range(26)),  'codes': list(range(65, 65+26)), 'letters': [chr(code) for code in range(65, 65+26)]})
df

In [None]:
df.info()

In [None]:
df.describe()

In [None]:
df.head()

In [None]:
df.tail()

The rows and columns of a `DataFrame` are of type `pd.Series`.

In [None]:
type(df['numbers']), type(df.numbers), type(df.iloc[0])

In [None]:
print(f"{df['codes']=}")
print(f"{df.numbers=}")
print(f"{df.iloc[0]=}")

If we select several columns or rows, the result is a smaller `DataFrame`.

In [None]:
df[:5][["codes", "letters"]]
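
The difference between selecting one item and selecting several can be checked directly. A small sketch on a made-up three-row frame (the data here is only for illustration):

```python
import pandas as pd

df_small = pd.DataFrame({"numbers": [0, 1, 2], "letters": ["A", "B", "C"]})

row = df_small.iloc[0]               # one row (by position)    -> Series
col = df_small["letters"]            # one column (by name)     -> Series
block = df_small.iloc[:2][["letters"]]  # row slice + column list -> DataFrame

print(type(row).__name__, type(col).__name__, type(block).__name__)
print(block.shape)
```

Note that a list with a single column name, `[["letters"]]`, still gives a `DataFrame`, while the bare name `["letters"]` gives a `Series`.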

## Reading data

The `pd.read_...` functions are used for reading.

In [None]:
[name for name in pd.__all__ if name.startswith('read')]

For writing, use the `to_...` methods of the `DataFrame` class.

In [None]:
import inspect

In [None]:
[name for name, _ in inspect.getmembers(pd.DataFrame) if name.startswith('to_')]

The most popular one is the `csv` format. It is a simple text format: every row of the table is one line of the text file, and the elements are separated by the `,` character.

In [None]:
filename = "/tmp/teszt.csv" 
df.to_csv(filename, index=False)
df0 = pd.read_csv(filename)
(df0==df).all(axis=None)

In [None]:
print(f"{filename} tartalma: ")
! cat {filename}
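
To see the raw format without pandas, the sketch below writes two made-up rows with the standard library `csv` module into an in-memory buffer and reads them back. It is only an illustration of the file layout, not of how pandas reads files.

```python
import csv
import io

rows = [
    {"numbers": 0, "codes": 65, "letters": "A"},
    {"numbers": 1, "codes": 66, "letters": "B"},
]

# Write: a header line, then one comma-separated line per row.
buffer = io.StringIO()
writer = csv.DictWriter(buffer, fieldnames=["numbers", "codes", "letters"],
                        lineterminator="\n")
writer.writeheader()
writer.writerows(rows)
text = buffer.getvalue()
print(text)

# Read back: every value comes back as a string.
records = list(csv.DictReader(io.StringIO(text)))
print(records[0])
```

The file itself stores no type information; this is why `pd.read_csv` has to infer the column types when it reads the data.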

In [None]:
import plotly.express as px
df = px.data.gapminder()

fig = px.scatter(
    df,
    #df.query("year==2007"), 
    x="gdpPercap", 
    y="lifeExp",
    size="pop",
    color="continent",
    hover_name="country", 
    log_x=True, 
    size_max=60,
    animation_frame="year",
    range_y=(20, 100),
    range_x=(200, 60000)
    )

fig.show()


In [None]:
df[df.country=="Hungary"]

## Data cleaning



Data that needs to be corrected:

- empty cells
- wrongly formatted data
- wrong values
- duplicates


The `data.csv` file contains an example of each of these errors.

In [None]:
%%writefile /tmp/data.csv
Duration,Date,Pulse,Maxpulse,Calories
60,'2020/12/01',110,130,409.1
60,'2020/12/02',117,145,479.0
60,'2020/12/03',103,135,340.0
45,'2020/12/04',109,175,282.4
45,'2020/12/05',117,148,406.0
60,'2020/12/06',102,127,300.0
60,'2020/12/07',110,136,374.0
450,'2020/12/08',104,134,253.3
30,'2020/12/09',109,133,195.1
60,'2020/12/10',98,124,269.0
60,'2020/12/11',103,147,329.3
60,'2020/12/12',100,120,250.7
60,'2020/12/12',100,120,250.7
60,'2020/12/13',106,128,345.3
60,'2020/12/14',104,132,379.3
60,'2020/12/15',98,123,275.0
60,'2020/12/16',98,120,215.2
60,'2020/12/17',100,120,300.0
45,'2020/12/18',90,112,
60,'2020/12/19',103,123,323.0
45,'2020/12/20',97,125,243.0
60,'2020/12/21',108,131,364.2
45,,100,119,282.0
60,'2020/12/23',130,101,300.0
45,'2020/12/24',105,132,246.0
60,'2020/12/25',102,126,334.5
60,20201226,100,120,250.0
60,'2020/12/27',92,118,241.0
60,'2020/12/28',103,132,
60,'2020/12/29',100,132,280.0
60,'2020/12/30',102,129,380.3
60,'2020/12/31',92,115,243.0


In [None]:
import pandas as pd

df = pd.read_csv('/tmp/data.csv')
print(f"{len(df)=}")
df

### 1. Removing rows with empty cells / filling empty cells

The `dropna()` method removes such rows. To overwrite the data frame itself, use `df.dropna(inplace=True)`. Let us try it on the data frame built from the `data.csv` file. As can also be seen in Excel, the last value is missing in several rows; in the Pandas data frame `NaN` stands in their place.

In [None]:
import pandas as pd

df = pd.read_csv('/tmp/data.csv')

new_df = df.dropna()

print(f"{len(new_df)=}")
new_df

Instead of deleting, we can put a new value into the empty cells, e.g.

In [None]:
df = pd.read_csv('/tmp/data.csv')

df.fillna(130, inplace = True)

df

If we only want to fill the empty cells of the `Calories` column with the value 130:

In [None]:
df = pd.read_csv('/tmp/data.csv')

df["Calories"] = df["Calories"].fillna(130)
df

It is common to replace missing values with some central value: `mean()` is the average of the values, `median()` is their median, and `mode()` is the most frequent value.

In [None]:
df = pd.read_csv('/tmp/data.csv')

x = df["Calories"].mean()

df["Calories"] = df["Calories"].fillna(x)

df
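
The three central values usually differ, so the choice matters. A small sketch with the standard library `statistics` module on made-up calorie values, where `None` plays the role of the empty cell:

```python
import statistics

calories = [300.0, 250.0, None, 300.0, 410.0]
known = [c for c in calories if c is not None]

mean = statistics.mean(known)      # average of the known values
median = statistics.median(known)  # middle value after sorting
mode = statistics.mode(known)      # most frequent value

# Fill the gap with the median, analogous to fillna(median) in pandas.
filled = [median if c is None else c for c in calories]
print(mean, median, mode)
print(filled)
```

The mean is pulled upwards by the single large value 410, while the median and the mode are not; for columns with outliers the median is often the safer replacement.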

### 2. Fixing / removing wrongly formatted data



If we do not want to remove the row that contains the wrongly formatted data, we may be able to convert the value to the desired format. In the data frame built from `data.csv`, rows 22 and 26 have this problem: in row 22 the date is missing, and in row 26 the date is not given as a string. The latter can be fixed with the `to_datetime` function.

In [None]:
df = pd.read_csv('/tmp/data.csv')

df['Date'] = pd.to_datetime(df['Date'], format="mixed")

df

Now row 22 contains `NaT` (not a time) instead of `NaN`, which indicates an empty cell. We delete the row that has this value in the `Date` column:

In [None]:
df.dropna(subset=['Date'], inplace = True)
df
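
The idea of accepting several date formats can be sketched in plain Python by trying the formats one after the other. This is not how pandas parses dates internally; `parse_date` and its list of formats are assumptions made for this example, and `None` plays the role of `NaT`.

```python
from datetime import datetime


def parse_date(text, formats=("%Y/%m/%d", "%Y%m%d")):
    # Hypothetical helper: strip the quotes found in the file, then
    # return the first successful parse, or None if no format matches.
    text = text.strip("'")
    for fmt in formats:
        try:
            return datetime.strptime(text, fmt)
        except ValueError:
            continue
    return None


print(parse_date("'2020/12/01'"))
print(parse_date("20201226"))
print(parse_date(""))
```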

### 3. Handling wrong data



A cell may contain correctly formatted data that is nevertheless wrong, e.g. because of a typo. For example, the value in row 7 of the `Duration` column is 450; a workout lasting 450 minutes is very unlikely, and presumably 45 was meant. Let us change it to that value:

In [None]:
df.loc[7, 'Duration'] = 45

With a large amount of data we of course cannot check the values one by one and correct every wrong one this way. In that case wrong data can be filtered out, for example, by fixing a reasonable lower / upper bound for the values of the column. E.g. a workout realistically does not last longer than 120 minutes, so we may decide that any value in this column that exceeds 120 is replaced by 120.

In [None]:
df = pd.read_csv('/tmp/data.csv')

df.loc[df["Duration"]>120, "Duration"] = 120

x = df["Calories"].mean()

df["Calories"] = df["Calories"].fillna(x)


df

The other option is to delete every row in which this error occurs:

In [None]:
df = pd.read_csv('/tmp/data.csv')
df = df[df.Duration<=120]
df
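
The two strategies, capping the value or dropping the row, side by side on a small made-up column. `Series.clip(upper=...)` is used here as a shorthand for the boolean-mask assignment shown earlier; the data is only for illustration.

```python
import pandas as pd

df_demo = pd.DataFrame({"Duration": [60, 450, 45, 130]})

# Strategy 1: cap every value above the bound at the bound.
clipped = df_demo["Duration"].clip(upper=120)

# Strategy 2: keep only the rows that respect the bound.
kept = df_demo[df_demo["Duration"] <= 120]

print(clipped.tolist())
print(kept["Duration"].tolist(), kept.index.tolist())
```

After filtering, the remaining rows keep their original index labels; `reset_index(drop=True)` renumbers them if that is preferred.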


### 4. Removing duplicates



Some rows of the table above appear twice, e.g. rows 11 and 12. One of them should be removed. First we determine which row indices are duplicates of some earlier row.

In [None]:
print(df.duplicated())

Removing them:

In [None]:
df.drop_duplicates(inplace = True)

In [None]:
df
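
How `duplicated()` marks rows is easiest to see on a tiny made-up frame. By default only the later copies are flagged; with `keep=False` every member of a duplicate group is flagged.

```python
import pandas as pd

df_dup = pd.DataFrame({"Duration": [60, 60, 45], "Pulse": [100, 100, 90]})

later_copies = df_dup.duplicated()           # first occurrence is not flagged
all_copies = df_dup.duplicated(keep=False)   # every repeated row is flagged
deduped = df_dup.drop_duplicates().reset_index(drop=True)

print(later_copies.tolist())
print(all_copies.tolist())
print(len(deduped))
```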

## Correlation between the data



The strength of the relationship between columns is indicated by the correlation computed with the `corr()` method.

In [None]:
df = pd.read_csv('/tmp/data.csv')
df = df[df.Duration<=120]
df['Date'] = pd.to_datetime(df['Date'], format="mixed")
df = df.dropna()

df.corr()
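
By default `corr()` computes the Pearson correlation coefficient for each pair of columns. What that number means can be sketched for two plain lists; the function `pearson` below is written only for this illustration.

```python
import math


def pearson(xs, ys):
    # Covariance divided by the product of the standard deviations
    # (the common 1/n factors cancel, so they are left out).
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = math.sqrt(sum((x - mean_x) ** 2 for x in xs))
    spread_y = math.sqrt(sum((y - mean_y) ** 2 for y in ys))
    return cov / (spread_x * spread_y)


print(pearson([1, 2, 3, 4], [2, 4, 6, 8]))  # perfectly increasing relationship
print(pearson([1, 2, 3], [3, 2, 1]))        # perfectly decreasing relationship
```

Values close to 1 or -1 indicate a strong linear relationship, values close to 0 a weak one.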

In [None]:
import seaborn as sns   


In [None]:
corr = df.corr()
sns.heatmap(corr, annot=corr.round(2), cmap=sns.diverging_palette(200, 20), vmin=-1, vmax=1)

In [None]:
sns.choose_diverging_palette()

In [None]:
sns.choose_light_palette()