import sys
#print(sys.executable)  # Shows the path to the Python interpreter
import matplotlib.pyplot as plt
import matplotlib.image as mpimg
import matplotlib.patches as patches
import plotly.graph_objects as go
import numpy as np
import base64
from PIL import Image
import io
from IPython.display import HTML

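def img_to_base64(img_path):
    """Read an image file and return it as a base64-encoded data URI."""
    with open(img_path, 'rb') as f:
        encoded = base64.b64encode(f.read()).decode('utf-8')
    # Note: the MIME type is hardcoded, so this is only correct for PNG files.
    return f'data:image/png;base64,{encoded}'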

---
title: "Bundestagsmining - When Representatives No Longer Come from the People"
format: 
  revealjs:
    css: custom.css  # Only once, here!
    footer: "Lena Aretz | yourcupofdata | 26.09.2025"
    slide-number: true
    logo: "./images/yourcuopfdata_logo.png" 
    include-in-header:
      text: |
        <style>
        .reveal .slide-logo {
          max-height: 70px !important;
          right: 12px !important;
          bottom: 12px !important;
        }
        </style>
    toc-title: "Contents"
author: "Lena Aretz (yourcupofdata.com <br> aretzlena@gmail.com)"
date: today
execute:
    echo: false
    toc: true
---

# Table of Contents

1. Motivation
2. Data
3. EDA
4. Data Cleaning
5. Data Visualization and Results
6. Deployment

## Who am I?
![](./images/lena_profil_01.png){height="80%"}

![](attachment:lena_profil_02.png)

# Motivation


## Friedrich Merz

::: {.image-container}
![](./images/portrait_merz.jpg){.portrait-image}

::: {.fragment .overlay-label}
**Lawyer**
:::
:::

## Olaf Scholz
::: {.image-container}
![](./images/portrait_scholz.jpg){.portrait-image}

::: {.fragment .overlay-label}
**Lawyer**
:::
:::

## Boris Pistorius

::: {.image-container}
![](./images/portrait_pistorius.jpg){.portrait-image}

::: {.fragment .overlay-label}
**Lawyer**
:::
:::

# Similar Projects

---

:::: {.columns}

::: {.column width="50%"}

![](./images/tagesschau_mining_01.png){width="90%"}

![](./images/tagesschau_mining_02.png){width="90%"}

**David Englert** - *Software Engineer*
:::

::: {.column width="50%"}

![](./images/spiegel_mining.png){width="80%"}

**David Kriesel** - *Data Scientist*


![](./images/bundestagsmine.png){width="80%"}

**Kevin Bönisch** - *Software Developer*
:::

::::

# Workflow

::: {.workflow-container}

::: {.workflow-step .fragment}
::: {.step-number}
1
:::
::: {.step-content}
**Data Research**
:::
:::


::: {.workflow-arrow .fragment}
<span style="font-size: 3rem;">→</span>
:::

::: {.workflow-step .fragment}
::: {.step-number}
2
:::
::: {.step-content}
**EDA**
:::
:::

::: {.workflow-arrow .fragment}
<span style="font-size: 3rem;">→</span>
:::

::: {.workflow-step .fragment}
::: {.step-number}
3
:::
::: {.step-content}
**Data Cleaning**
:::
:::

::: {.workflow-arrow .fragment}
<span style="font-size: 3rem;">→</span>
:::

::: {.workflow-step .fragment}
::: {.step-number}
4
:::
::: {.step-content}
**Analysis / Model**
:::
:::

::: {.workflow-arrow .fragment}
<span style="font-size: 3rem">→</span>
:::

::: {.workflow-step .fragment}
::: {.step-number}
5
:::
::: {.step-content}
**Deployment**
:::
:::

:::

# Data Research

## Workflow

::: {.workflow-container}

::: {.workflow-step .highlight}
::: {.step-number}
1
:::
::: {.step-content}
**Data Research**
:::
:::


::: {.workflow-arrow }
<span style="font-size: 3rem;">→</span>
:::

::: {.workflow-step}
::: {.step-number}
2
:::
::: {.step-content}
**EDA**
:::
:::

::: {.workflow-arrow }
<span style="font-size: 3rem;">→</span>
:::

::: {.workflow-step}
::: {.step-number}
3
:::
::: {.step-content}
**Data Cleaning**
:::
:::

::: {.workflow-arrow }
<span style="font-size: 3rem;">→</span>
:::

::: {.workflow-step}
::: {.step-number}
4
:::
::: {.step-content}
**Analysis**
:::
:::

::: {.workflow-arrow }
<span style="font-size: 3rem">→</span>
:::

::: {.workflow-step}
::: {.step-number}
5
:::
::: {.step-content}
**Deployment**
:::
:::

:::

## Open Data

![](./images/bundestag_open_data.png)

## XML
![](./images/merz_xml_01.png)

## XML
![](./images/merz_xml_02.png)

## Parse XML
![](./images/parse_xml.png)
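
A minimal sketch of what such a parsing step could look like with Python's standard library. The tag names (`MDB`, `NACHNAME`, `BERUF`, and so on) and the inline sample record are assumptions for illustration; they are not taken from the actual file or from the code shown on the slide.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample record; the tag names are assumed, not verified.
SAMPLE_XML = """
<DOCUMENT>
  <MDB>
    <ID>1</ID>
    <NAMEN><NAME><NACHNAME>Mustermann</NACHNAME><VORNAME>Max</VORNAME></NAME></NAMEN>
    <BIOGRAFISCHE_ANGABEN>
      <GESCHLECHT>männlich</GESCHLECHT>
      <BERUF>Rechtsanwalt</BERUF>
      <RELIGION></RELIGION>
    </BIOGRAFISCHE_ANGABEN>
  </MDB>
</DOCUMENT>
"""


def parse_members(xml_text):
    """Flatten each <MDB> element into a dict of selected fields."""
    root = ET.fromstring(xml_text.strip())
    rows = []
    for mdb in root.iter("MDB"):
        rows.append({
            "id": mdb.findtext("ID"),
            "last_name": mdb.findtext(".//NACHNAME"),
            "first_name": mdb.findtext(".//VORNAME"),
            "gender": mdb.findtext(".//GESCHLECHT"),
            "occupation": mdb.findtext(".//BERUF"),
            # Empty or absent tags are normalized to None.
            "religion": mdb.findtext(".//RELIGION") or None,
        })
    return rows


members = parse_members(SAMPLE_XML)
```

A list of flat dicts like `members` can be passed directly to `pandas.DataFrame(...)` for the exploration steps that follow.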

# EDA

## Workflow

::: {.workflow-container}

::: {.workflow-step}
::: {.step-number}
1
:::
::: {.step-content}
**Data Research**
:::
:::


::: {.workflow-arrow }
<span style="font-size: 3rem;">→</span>
:::

::: {.workflow-step .highlight}
::: {.step-number}
2
:::
::: {.step-content}
**EDA**
:::
:::

::: {.workflow-arrow }
<span style="font-size: 3rem;">→</span>
:::

::: {.workflow-step}
::: {.step-number}
3
:::
::: {.step-content}
**Data Cleaning**
:::
:::

::: {.workflow-arrow }
<span style="font-size: 3rem;">→</span>
:::

::: {.workflow-step}
::: {.step-number}
4
:::
::: {.step-content}
**Analysis / Model**
:::
:::

::: {.workflow-arrow }
<span style="font-size: 3rem">→</span>
:::

::: {.workflow-step}
::: {.step-number}
5
:::
::: {.step-content}
**Deployment**
:::
:::

:::

## Data Frame

![](./images/eda_df_01.png)

## Data Frame
![](./images/eda_df_02.png)

---

## Missing Values
![](./images/missing_values.png)
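
As a rough illustration of the check behind a chart like this, the sketch below counts missing entries per column. The records and column names are hypothetical, and treating empty strings as missing is an assumption about how gaps appear in the data.

```python
def count_missing(rows):
    """Count None or blank-string values per column across a list of dicts."""
    counts = {}
    for row in rows:
        for column, value in row.items():
            is_missing = value is None or (isinstance(value, str) and not value.strip())
            counts[column] = counts.get(column, 0) + int(is_missing)
    return counts


# Hypothetical records for illustration only.
rows = [
    {"name": "A", "religion": None, "occupation": "Rechtsanwalt"},
    {"name": "B", "religion": "", "occupation": "Ärztin"},
    {"name": "C", "religion": "katholisch", "occupation": None},
]
missing = count_missing(rows)
```

With pandas, `df.isna().sum()` gives a comparable per-column count, although it does not treat empty strings as missing.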

---

![](./images/eda_geschlecht.png)


## Number of Members of Parliament
![](./images/wrapper_barplot.png)

## Number of Members of Parliament
![](./images/eda_anzahl_abgeordneter.png)

## Parties

::: {.content-container}
![](./images/eda_parteien.png)

<div class="meme-corner">
![](./images/meme_homer_01.png){width="400px"}
</div>
:::

## Religion


::: {.content-container}
![](./images/eda_religion.png)

<div class="meme-corner">
![](./images/meme_homer_02.png){width="400px"}
</div>
:::

## Marital Status

::: {.image-container}
![](./images/eda_familienstand_02.png)

::: {.fragment .info-box}
**61 distinct values!**
:::
:::

<div class="meme-corner">
![](./images/meme_homer_03.png){width="400px"}
</div>



## Occupations

::: {.image-container}
![](./images/eda_berufe_long.png)

::: {.fragment .info-box}
**2405 distinct values!**
:::
:::

<div class="meme-corner">
![](./images/meme_homer_04.png){width="400px"}
</div>
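
To illustrate how a free-text column ends up with so many distinct values, here is a small sketch that counts raw variants. The sample entries are hypothetical; the point is that spelling, gendered forms, whitespace, and combined entries each count as a separate value.

```python
from collections import Counter

# Hypothetical free-text entries; the real column has far more variants.
occupations = [
    "Rechtsanwalt",
    "Rechtsanwältin",
    "rechtsanwalt ",
    "Rechtsanwalt, Notar",
    "Rechtsanwalt",
]

raw_counts = Counter(occupations)
n_distinct = len(raw_counts)              # number of distinct raw strings
most_common = raw_counts.most_common(1)   # most frequent raw string with its count
```

Only two of the five entries are identical, so four distinct values remain for what is essentially one profession.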

# Data Cleaning

## Workflow

::: {.workflow-container}

::: {.workflow-step}
::: {.step-number}
1
:::
::: {.step-content}
**Data Research**
:::
:::


::: {.workflow-arrow }
<span style="font-size: 3rem;">→</span>
:::

::: {.workflow-step}
::: {.step-number}
2
:::
::: {.step-content}
**EDA**
:::
:::

::: {.workflow-arrow }
<span style="font-size: 3rem;">→</span>
:::

::: {.workflow-step .highlight}
::: {.step-number}
3
:::
::: {.step-content }
**Data Cleaning**
:::
:::

::: {.workflow-arrow }
<span style="font-size: 3rem;">→</span>
:::

::: {.workflow-step}
::: {.step-number}
4
:::
::: {.step-content}
**Analysis / Model**
:::
:::

::: {.workflow-arrow }
<span style="font-size: 3rem">→</span>
:::

::: {.workflow-step}
::: {.step-number}
5
:::
::: {.step-content}
**Deployment**
:::
:::

:::

## Parties
![](./images/preproc_parteien.png)

## Religions
![](./images/preproc_religionen_01.png)

## Religions
![](./images/preproc_religionen_02.png)

## Occupations
![](./images/preproc_berufe_basic.png)

## Occupations
![](./images/preproc_berufe_gender.png)

## Occupations
![](./images/preproc_berufe_mapping.png)
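
The three occupation slides (basic cleanup, gendered forms, mapping) suggest a pipeline along the following lines. This is only a sketch under assumptions: the lookup tables, category names, and the choice to keep just the first listed occupation are hypothetical and not the rules used in the project.

```python
# All rules below are hypothetical examples, not the project's actual mapping.
FEMALE_TO_BASE = {
    "rechtsanwältin": "rechtsanwalt",
    "ärztin": "arzt",
    "lehrerin": "lehrer",
}
CATEGORY_KEYWORDS = {
    "Legal": ("rechtsanwalt", "jurist", "richter", "notar"),
    "Medical": ("arzt", "krankenpfleger"),
}


def clean_occupation(raw):
    """Map a free-text occupation to a coarse category, or 'Other'."""
    # Step 1: basic cleanup; keep only the first listed occupation.
    first = raw.split(",")[0].strip().lower()
    # Step 2: collapse gendered forms onto one base form.
    base = FEMALE_TO_BASE.get(first, first)
    # Step 3: keyword lookup into coarse categories.
    for category, keywords in CATEGORY_KEYWORDS.items():
        if any(keyword in base for keyword in keywords):
            return category
    return "Other"


cleaned = [clean_occupation(o) for o in ["Rechtsanwältin", " Arzt", "Lehrer, Autor"]]
```

Substring matching keeps the tables short, since "arzt" also catches compounds such as "zahnarzt", but it can misclassify unrelated words, so the resulting categories are worth spot-checking by hand.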

# Data Visualization

## Workflow

::: {.workflow-container}

::: {.workflow-step}
::: {.step-number}
1
:::
::: {.step-content}
**Data Research**
:::
:::


::: {.workflow-arrow }
<span style="font-size: 3rem;">→</span>
:::

::: {.workflow-step}
::: {.step-number}
2
:::
::: {.step-content}
**EDA**
:::
:::

::: {.workflow-arrow }
<span style="font-size: 3rem;">→</span>
:::

::: {.workflow-step}
::: {.step-number}
3
:::
::: {.step-content }
**Data Cleaning**
:::
:::

::: {.workflow-arrow }
<span style="font-size: 3rem;">→</span>
:::

::: {.workflow-step .highlight}
::: {.step-number}
4
:::
::: {.step-content}
**Dashboard**
:::
:::

::: {.workflow-arrow }
<span style="font-size: 3rem">→</span>
:::

::: {.workflow-step}
::: {.step-number}
5
:::
::: {.step-content}
**Analysis / Model**
:::
:::

:::

## Tools
![](./images/visualisierung_tools_01.png)

## Tools
![](./images/visualisierung_tools_02.png)

## Dashboard
![](./images/bundestagsmining.png)

# Results

## Gender
![](./images/gender_01.png)

## Gender
![](./images/gender_02.png)

## Gender
![](./images/gender_03.png)

## Length of Stay in Parliament
![](./images/klebefaktor_01.png)

## Stickiness Factor
![](./images/klebefaktor_02.png)

## Schäuble Factor
![](./images/klebefaktor_03.png)

## Schäuble Factor
![](./images/klebefaktor_04.png)

## Lawyers
![](./images/beruf_jurist.png)

## Medical Staff
![](./images/beruf_medizin.png)

## IT Professionals
![](./images/beruf_it.png)

# Deployment

## Workflow

::: {.workflow-container}

::: {.workflow-step}
::: {.step-number}
1
:::
::: {.step-content}
**Data Research**
:::
:::


::: {.workflow-arrow }
<span style="font-size: 3rem;">→</span>
:::

::: {.workflow-step}
::: {.step-number}
2
:::
::: {.step-content}
**EDA**
:::
:::

::: {.workflow-arrow }
<span style="font-size: 3rem;">→</span>
:::

::: {.workflow-step}
::: {.step-number}
3
:::
::: {.step-content }
**Data Cleaning**
:::
:::

::: {.workflow-arrow }
<span style="font-size: 3rem;">→</span>
:::

::: {.workflow-step}
::: {.step-number}
4
:::
::: {.step-content}
**Analysis / Model**
:::
:::

::: {.workflow-arrow }
<span style="font-size: 3rem">→</span>
:::

::: {.workflow-step  .highlight}
::: {.step-number}
5
:::
::: {.step-content}
**Deployment**
:::
:::

:::

## Cloud
![](./images/cloud.webp)

## Docker
![](./images/deployment_dockerfile.png)

## Docker
![](./images/deployment_docker_build.png)

## Docker
![](./images/deployment_docker_push.png)

## Cloud
![](./images/deployment_google_cloud.png)

# Workflow Extended
![](./images/workflow_extended.png)

# Project Effort
![](./images/projektaufwand_01.png)

# Project Effort
![](./images/projektaufwand_02.png)

# Conclusion
![](./images/drake_no.png)

# Conclusion
![](./images/drake_yes.png)

# Conclusion

---

## **3 things you really need:**

::: {.fragment}
### 1. Look at your data
- Thorough "manual" data exploration
- Clean preprocessing
:::

::: {.fragment}
### 2. Learn Python + pandas
- Setup: pandas + plotly / seaborn / matplotlib + Jupyter + git
:::

::: {.fragment}
### 3. Make it interactive
- Learn a web app framework: Dash, Streamlit, or Gradio
:::

::: {.fragment}
### ... and if you want to be the king
- Learn deployment / DevOps
:::
