# DDJ-Beispiele

A curated list of data journalism examples:

https://docs.google.com/spreadsheets/d/1Sf0pkpan0sMeq_pva1AV6rkTva9xr0sgw3ThX_2VkYA/edit?usp=sharing

This script generates a Markdown file from the Google spreadsheet.

## Setup

In [1]:
import pandas as pd

In [2]:
pd.set_option("display.max_colwidth", 200)

## Fetch data

**List of Articles**

In [3]:
url = "https://docs.google.com/spreadsheets/d/1Sf0pkpan0sMeq_pva1AV6rkTva9xr0sgw3ThX_2VkYA/export?format=csv"

In [4]:
df = pd.read_csv(url)

In [5]:
df.head(5)

Unnamed: 0,Jahr,Quelle,Titel,Kategorie,Link
0,2018,Bloomberg,Pick Your Own Brexit,Interactive,https://www.bloomberg.com/graphics/2018-pick-your-own-brexit/
1,2016,The Pudding,Film Dialogue,Textanalyse,https://pudding.cool/2017/03/film-dialogue/
2,2017,Guardian,Bussed Out,Mapping,https://www.theguardian.com/us-news/ng-interactive/2017/dec/20/bussed-out-america-moves-homeless-people-country-study
3,2018,Washington Post,America is more diverse than ever — but still segregated,Mapping,https://www.theguardian.com/us-news/ng-interactive/2017/dec/20/bussed-out-america-moves-homeless-people-country-study
4,2015,Guardian,How well do you really know your country?,Interactive,https://www.theguardian.com/world/ng-interactive/2015/dec/02/how-well-do-you-really-know-your-country-take-our-quiz


**List of Categories**

In [6]:
url = "https://docs.google.com/spreadsheets/d/1Sf0pkpan0sMeq_pva1AV6rkTva9xr0sgw3ThX_2VkYA/export?gid=1349399595&format=csv"


In [7]:
df_c = pd.read_csv(url)

In [8]:
df_c

Unnamed: 0,Nr,Kategorie,Beschreibung
0,1,Recherche,"BGÖ-Anfrage, Scraping und weitere besondere Methoden der Daten-Zusammenstellung"
1,2,Textanalyse,"Worthäufigkeiten, Sentiment-Analyse und weitere computerlinguistische Techniken"
2,3,Dataviz,"Storytelling mit Daten: besondere Plots, ungewöhnliche Chartformen, etc."
3,4,Mapping,"Choropleth, Point maps und allerlei weitere geografische Darstellungsformen"
4,5,Interactive,"Arbeiten, bei denen der User zum Mitmachen aufgefordert wird, insb. auch mit Gamification"
5,6,Algorithmen,Diskussion von Machine Learning und anderen Algorithmen
6,7,Crowdsourcing,Datengewinnung mihilfe der User
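
Both export URLs above follow the same pattern: the spreadsheet ID, the `/export` endpoint, a `format=csv` parameter, and optionally a `gid` that selects a specific sheet tab. A minimal sketch of a helper that builds such URLs follows. The function name `export_csv_url` is hypothetical and not part of the original notebook, and the URL pattern is inferred only from the two URLs used here.

```python
from typing import Optional
from urllib.parse import urlencode

# Spreadsheet ID taken from the URLs used in this notebook.
SHEET_ID = "1Sf0pkpan0sMeq_pva1AV6rkTva9xr0sgw3ThX_2VkYA"


def export_csv_url(sheet_id: str, gid: Optional[str] = None) -> str:
    """Build a CSV export URL (hypothetical helper; pattern copied from the URLs above)."""
    params = {}
    if gid is not None:
        # gid selects one tab; without it the notebook's first URL returns the article list.
        params["gid"] = gid
    params["format"] = "csv"
    return f"https://docs.google.com/spreadsheets/d/{sheet_id}/export?{urlencode(params)}"


articles_url = export_csv_url(SHEET_ID)
categories_url = export_csv_url(SHEET_ID, gid="1349399595")
```

Either URL can then be passed to `pd.read_csv` exactly as in the cells above.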


## Prepare File

We prepare two sections:
- Header (contains the title)
- Body (contains the actual content with articles sorted by category)

### Header

In [9]:
header = """# DDJ-Beispiele

Zusammengestellt für den MAZ-Kurs "Datenjournalismus"

"""

### Format Rows

In [10]:
# Build one Markdown list item per article: "- **Source:** [Title](Link) (Year)"
df['Row'] = "- **" + df['Quelle'] + ":** [" + df['Titel'] + "](" + df['Link'] + ") (" + df['Jahr'].astype(str) + ")\n"
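
To make the resulting line format easier to see, here is a small pure-Python sketch of the same template applied to a single article. The function `format_row` is a hypothetical stand-in for the vectorized pandas expression above; the sample values come from the first row of the article list.

```python
def format_row(source: str, title: str, link: str, year: int) -> str:
    """Return one Markdown list item in the same shape as the 'Row' column."""
    return f"- **{source}:** [{title}]({link}) ({year})\n"


row = format_row(
    "Bloomberg",
    "Pick Your Own Brexit",
    "https://www.bloomberg.com/graphics/2018-pick-your-own-brexit/",
    2018,
)
print(row, end="")
# - **Bloomberg:** [Pick Your Own Brexit](https://www.bloomberg.com/graphics/2018-pick-your-own-brexit/) (2018)
```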

### Loop over Categories, Rows

In [11]:
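body = ""
# Emit the categories in the order given by the 'Nr' column.
for category in df_c.sort_values('Nr')['Kategorie'].to_list():
    body += "## " + category + "\n"
    # "".join() instead of .sum(): summing an empty selection can yield 0 rather than "",
    # which would break the string concatenation for a category without articles.
    body += "".join(df_c.loc[df_c['Kategorie'] == category, 'Beschreibung']) + "\n"
    body += "".join(df[df['Kategorie'] == category].sort_values(['Jahr', 'Quelle', 'Titel'])['Row'])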
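
The loop can be tried out without network access on small in-memory tables. The sketch below uses made-up rows and descriptions (only the column names match the notebook) and includes a category with no articles, the case where concatenating with `"".join()` is safer than relying on `Series.sum()`.

```python
import pandas as pd

# Tiny stand-ins for the two sheets (illustrative values, not the real data).
articles = pd.DataFrame({
    "Kategorie": ["Mapping", "Interactive", "Mapping"],
    "Row": ["- B\n", "- C\n", "- A\n"],
})
categories = pd.DataFrame({
    "Nr": [2, 1, 3],
    "Kategorie": ["Mapping", "Interactive", "Crowdsourcing"],
    "Beschreibung": ["Maps", "Quizzes", "User data"],
})

parts = []
for cat in categories.sort_values("Nr").itertuples(index=False):
    # Select the pre-formatted rows for this category; may be empty.
    rows = articles.loc[articles["Kategorie"] == cat.Kategorie, "Row"]
    parts.append(f"## {cat.Kategorie}\n{cat.Beschreibung}\n" + "".join(rows))

body = "".join(parts)
print(body)
```

The "Crowdsourcing" section ends up with a heading and description but no list items, instead of raising an error.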
        

## Write to File

In [12]:
file_out = "DDJ-Beispiele.md"

In [13]:
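# Write explicitly as UTF-8 so umlauts are not subject to the platform's default encoding.
with open(file_out, "w", encoding="utf-8") as f: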
    f.write(header)
    f.write(body)
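
As a quick sanity check, the write step can be exercised against a temporary directory and read back. This is a sketch under the assumption that the output should be UTF-8; the `header` matches the one defined above, while the short `body` is a shortened illustrative sample rather than the real generated content.

```python
import tempfile
from pathlib import Path

header = '# DDJ-Beispiele\n\nZusammengestellt für den MAZ-Kurs "Datenjournalismus"\n\n'
# Illustrative sample body; the real one is built in the loop over categories.
body = "## Recherche\nBGÖ-Anfrage, Scraping und weitere besondere Methoden der Daten-Zusammenstellung\n"

with tempfile.TemporaryDirectory() as tmp:
    out_path = Path(tmp) / "DDJ-Beispiele.md"
    with open(out_path, "w", encoding="utf-8") as f:
        f.write(header)
        f.write(body)
    # Read the file back with the same encoding to confirm a lossless round trip.
    written = out_path.read_text(encoding="utf-8")
```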