# Descriptive Statistics

## One-Dimensional Data

In [37]:
from pandas import Series
import pandas as pd
import numpy as np

methodeA = Series([79.98, 80.04, 80.02, 80.04, 80.03, 80.03, 80.04, 79.97, 80.05, 80.03, 80.02, 80.00, 80.02])

methodeA.sort_values()

7     79.97
0     79.98
11    80.00
2     80.02
10    80.02
12    80.02
4     80.03
5     80.03
9     80.03
1     80.04
3     80.04
6     80.04
8     80.05
dtype: float64

### Mean

In [38]:
print(methodeA.mean())

80.02076923076923
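
As a cross-check, the arithmetic mean can also be computed directly from its definition, the sum of all values divided by their count. This is a minimal sketch; the name `manual_mean` is introduced here for illustration only.

```python
from pandas import Series

methodeA = Series([79.98, 80.04, 80.02, 80.04, 80.03, 80.03, 80.04,
                   79.97, 80.05, 80.03, 80.02, 80.00, 80.02])

# Arithmetic mean by definition: sum of the values divided by their count.
manual_mean = sum(methodeA) / len(methodeA)

# Both numbers should agree up to floating-point rounding.
print(manual_mean, methodeA.mean())
```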


### Variance

In [39]:
methodeA.var()

0.0005743589743590099
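
Note that `Series.var()` divides by $n - 1$ by default (`ddof=1`, the sample variance), whereas `numpy.var` divides by $n$ by default (`ddof=0`). The sketch below reproduces both by hand; the names `sample_var` and `population_var` are illustrative and not part of the original notebook.

```python
import numpy as np
from pandas import Series

methodeA = Series([79.98, 80.04, 80.02, 80.04, 80.03, 80.03, 80.04,
                   79.97, 80.05, 80.03, 80.02, 80.00, 80.02])

n = len(methodeA)
squared_deviations = (methodeA - methodeA.mean()) ** 2

# Sample variance: divide by n - 1 (pandas default, ddof=1).
sample_var = squared_deviations.sum() / (n - 1)

# Population variance: divide by n (NumPy default, ddof=0).
population_var = squared_deviations.sum() / n

print(sample_var, methodeA.var())
print(population_var, np.var(methodeA.to_numpy()))
```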

### Standard Deviation

In [40]:
methodeA.std()

0.023965787580611863
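
The standard deviation is the square root of the variance, so it is expressed in the same unit as the measurements. A short sketch of that relationship, with `std_from_var` as an illustrative name:

```python
import math
from pandas import Series

methodeA = Series([79.98, 80.04, 80.02, 80.04, 80.03, 80.03, 80.04,
                   79.97, 80.05, 80.03, 80.02, 80.00, 80.02])

# Square root of the sample variance (pandas uses ddof=1 for both var and std).
std_from_var = math.sqrt(methodeA.var())

print(std_from_var, methodeA.std())
```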

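### Median and order statistics $x_{(i)}$

The median is read off the sorted data, where $x_{(i)}$ denotes the $i$-th smallest value. Example with $x_1 = 3, x_2 = 7, x_3 = 2$: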

$$
x_{(1)} = x_3 = 2 \qquad x_{(2)} = x_1 = 3\qquad x_{(3)} = x_2 = 7
$$
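
The small example above can be reproduced in pandas. This is a sketch only: the labels `x_1`, `x_2`, `x_3` are used as index values purely to make the reordering visible, and the variable names are illustrative.

```python
from pandas import Series

# The three observations, labelled by their original position.
x = Series([3, 7, 2], index=["x_1", "x_2", "x_3"])

# Sorting yields the order statistics x_(1) <= x_(2) <= x_(3);
# the index shows which original observation ended up at each rank.
ordered = x.sort_values()
print(ordered)

# For an odd number of values, the median is the middle order statistic.
middle = ordered.iloc[len(ordered) // 2]
print(middle, x.median())
```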

In [41]:
methodeA.sort_values()

7     79.97
0     79.98
11    80.00
2     80.02
10    80.02
12    80.02
4     80.03
5     80.03
9     80.03
1     80.04
3     80.04
6     80.04
8     80.05
dtype: float64

In [42]:
methodeA.median() # same as: methodeA.quantile(0.5)

80.03

#### Even number of elements

In [43]:
methodeB = Series([80.02, 79.94, 79.98, 79.97, 79.98, 80.03, 79.95, 79.97])
methodeB.median()

79.975
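
With an even number of values there is no single middle element, so the median is the average of the two middle order statistics. A minimal sketch that checks this against `methodeB.median()`; `manual_median` is an illustrative name.

```python
from pandas import Series

methodeB = Series([80.02, 79.94, 79.98, 79.97, 79.98, 80.03, 79.95, 79.97])

ordered = methodeB.sort_values().to_numpy()
n = len(ordered)  # 8 values, so the middle pair is x_(4) and x_(5)

# Zero-based indices n//2 - 1 and n//2 address the two middle values.
manual_median = (ordered[n // 2 - 1] + ordered[n // 2]) / 2

print(manual_median, methodeB.median())
```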

### Quartiles

`methodeA` contains $13$ elements, so the position of the lower quartile is $13 \cdot 0.25 = 3.25$. Because a position must be an integer, it is rounded up to the **next larger** one, which gives $x_{(4)}$.
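
The rounding rule above can be sketched as a small function. `quantile_by_rank` is a hypothetical helper, not a pandas function, and it makes one simplifying assumption that the text does not cover: when $n \cdot \alpha$ is already an integer, that rank is used as is.

```python
import math
from pandas import Series

methodeA = Series([79.98, 80.04, 80.02, 80.04, 80.03, 80.03, 80.04,
                   79.97, 80.05, 80.03, 80.02, 80.00, 80.02])

def quantile_by_rank(data, alpha):
    """Hypothetical helper: return x_(k) with k = ceil(n * alpha)."""
    ordered = data.sort_values().to_numpy()
    rank = math.ceil(len(ordered) * alpha)  # 1-based rank, rounded up
    return ordered[rank - 1]                # convert to a 0-based index

# 13 * 0.25 = 3.25 rounds up to rank 4; 13 * 0.75 = 9.75 rounds up to rank 10.
print(quantile_by_rank(methodeA, 0.25))
print(quantile_by_rank(methodeA, 0.75))
```

pandas itself locates a quantile at the zero-based position $(n-1)\cdot\alpha$ rather than at $n \cdot \alpha$, so the two approaches agree for this data set but need not agree in general.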

In [44]:
print("methodeA:",
      methodeA.quantile(0.25, interpolation="higher"),
      methodeA.median(),
      methodeA.quantile(0.75, interpolation="higher"))  # same rounding rule as for the lower quartile
print("methodeB:", methodeB.quantile(0.25), methodeB.median(), methodeB.quantile(0.75))

methodeA: 80.02 80.03 80.04
methodeB: 79.965 79.975 79.99000000000001


### Interquartile Range

In [45]:
q75, q25 = methodeA.quantile(q = [.75, .25], interpolation="lower")

iqr = q75 - q25
iqr

0.020000000000010232

### Empirical $\alpha$-quantiles ($0 < \alpha < 1$)

In [46]:
print(methodeA.quantile(q=0.1, interpolation="lower"))
print(methodeA.quantile(q=0.7, interpolation="lower"))

79.98
80.03
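
The `interpolation` argument decides what happens when the requested quantile falls between two sorted values. The sketch below compares the options that `Series.quantile` accepts for $\alpha = 0.1$, where the position lies between $79.98$ and $80.00$; the dictionary `results` is an illustrative name.

```python
from pandas import Series

methodeA = Series([79.98, 80.04, 80.02, 80.04, 80.03, 80.03, 80.04,
                   79.97, 80.05, 80.03, 80.02, 80.00, 80.02])

# Evaluate the same quantile with every interpolation method.
results = {}
for method in ["lower", "higher", "nearest", "midpoint", "linear"]:
    results[method] = methodeA.quantile(q=0.1, interpolation=method)
    print(method, results[method])
```

`lower` and `higher` always return an actual data value, while `midpoint` and `linear` may return a value that does not occur in the data.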


---

In [47]:
# noten: a sample of 24 grades
noten = Series([4.2, 2.3, 5.6, 4.5, 4.8, 3.9, 5.9, 2.4, 5.9, 6, 4, 3.7, 5, 5.2, 4.5, 3.6, 5, 6, 2.8, 3.3, 5.5, 4.2, 4.9, 5.1])
# Quantiles at 20 %, 40 %, 60 %, 80 % and 100 %
noten.quantile(q = np.linspace(.2,1,5), interpolation="lower")

0.2    3.6
0.4    4.2
0.6    4.9
0.8    5.5
1.0    6.0
dtype: float64