# Introduction
This is part 3 of the [Data Journalism Workflow tutorial series](https://www.kaggle.com/iamleonie/data-journalism-workflow).
In this notebook you will learn how to bring all the gathered information into order and how to craft a storyline.

![Screenshot%202021-03-23%20at%2019.53.02.png](attachment:Screenshot%202021-03-23%20at%2019.53.02.png)

There are a few resources [1, 2, 3] available on different types of data stories. Although they share similar underlying ideas, they cover different aspects. The following is my summary and interpretation of [1, 2, 3].
- [Profiling](#Profiling)
- [Comparison](#Comparison)
- [Relationships](#Relationships)
- [Change Over Time](#Change-Over-Time)

Note that these types of data stories are not necessarily standalone. Often, a data story includes multiple aspects, and the lines between these categories can blur easily. They are intended to give you some **starting points to brainstorm your story ideas**.

# Profiling

Profiling is probably the most straightforward approach to crafting a story from data. Here, you simply explore the data and describe the topic. 
To avoid creating a boring report that nobody cares about, you can approach this type of story from different angles:

First, you can try the **novelty approach [2] if you have never-before-seen data**. This requires you to stay up to date with the latest news and scrape your own data, but it can be quite rewarding. Great examples of the novelty angle are [WallStreetBets Reddit Posts Analysis by @gpreda](https://www.kaggle.com/gpreda/wallstreetbets-reddit-posts-analysis) and [Coronavirus (COVID-19) Visualization & Prediction by @therealcyberlord](https://www.kaggle.com/therealcyberlord/coronavirus-covid-19-visualization-prediction).

Another approach is to analyse and **describe a specific group**. You could start small and zoom out, start big and drill down as suggested in [3], or describe the most common group, which is called the archetype in [2]. When you are doing this type of analysis, take care not to oversimplify. Great examples of this are [Who 'Excels' on Kaggle? by @janinekadar](https://www.kaggle.com/janinekadar/who-excels-on-kaggle) and [Geek Girls Rising : Myth or Reality! by @parulpandey](https://www.kaggle.com/parulpandey/geek-girls-rising-myth-or-reality).

Last but not least, when you are focusing on one aspect, you could describe the **scale** [1]. You could focus on answering questions like 'How big of a problem is this issue?' or 'How big is the impact of this?'.

--- 

Let's brainstorm what story ideas we could apply to the [Netflix Movies and TV Shows dataset](https://www.kaggle.com/shivamb/netflix-shows) based on our [previous findings](https://www.kaggle.com/iamleonie/data-journalism-exploratory-data-analysis-2-5):
- Focus on TV shows added to Netflix in the past year and profile them: 
    - What do they have in common? 
    - What does it say about our zeitgeist?
    - What is the most popular category at the moment?
- Focus on specific groups, such as:
    - Romantic TV Shows in Asia
    - North American Actors
- Focus on the scale:
    - For how many seasons is a TV show typically produced?
    - How popular is a TV show based on the number of countries it is available in?
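
To make the profiling ideas above a bit more concrete, here is a minimal pandas sketch. It assumes the dataset has the columns `type`, `listed_in` and `duration` (as I understand the public Netflix dataset to be structured); the four rows below are invented purely for illustration.

```python
import pandas as pd

# Hypothetical mini-sample that mimics the assumed column names of the
# Netflix dataset. The values are made up for illustration only.
df = pd.DataFrame({
    "type": ["TV Show", "TV Show", "Movie", "TV Show"],
    "title": ["A", "B", "C", "D"],
    "listed_in": ["TV Dramas, Romantic TV Shows", "TV Dramas", "Comedies", "Kids' TV"],
    "duration": ["2 Seasons", "1 Season", "90 min", "11 Seasons"],
})

# Keep only the TV shows for this profile.
tv = df[df["type"] == "TV Show"].copy()

# Profile: a show can belong to several categories, so split the
# comma-separated string and count each category separately.
category_counts = tv["listed_in"].str.split(", ").explode().value_counts()

# Scale: pull the number of seasons out of the duration text.
tv["seasons"] = tv["duration"].str.extract(r"(\d+)", expand=False).astype(int)

print(category_counts)
print(tv[["title", "seasons"]])
```

With the real data you would replace the sample frame with the loaded CSV and might first filter on the date a show was added to focus on the past year.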

# Comparison

When you are profiling, you might notice that you feel tempted to make some comparisons. Comparisons, too, can be approached from different angles.

First, you can simply **highlight the contrasts and explore different variations** in the data [1, 3]. If you go a step further, you could also **rank** groups by a specific metric [1]. For example, you could ask who is the best or who the bottom three are, and then explore the reasons behind it, as in [Why is Israel a leader in vaccination? 💉 by @michau96](https://www.kaggle.com/michau96/why-is-israel-a-leader-in-vaccination). If you want to take it to another level, you could even **come up with a custom metric** to rank your data by, as in [Quantify the Madness, a study of competitiveness by @lucabasa](https://www.kaggle.com/lucabasa/quantify-the-madness-a-study-of-competitiveness).

Lastly, when talking about comparisons, we must also consider **outliers** [2, 3]. Analysing outliers in depth can be challenging but also rewarding if you find something interesting. This notebook, [Candidate Profiles for Cinderella Upset Potential by @alexsadowski](https://www.kaggle.com/alexsadowski/candidate-profiles-for-cinderella-upset-potential), might give you some inspiration.

--- 

Let's brainstorm what story ideas we could apply to the [Netflix Movies and TV Shows dataset](https://www.kaggle.com/shivamb/netflix-shows) based on our [previous findings](https://www.kaggle.com/iamleonie/data-journalism-exploratory-data-analysis-2-5):
- Highlighting contrasts and variations
    - How do countries differ in the most popular TV show categories?
    - Is there a difference between TV dramas across countries?
- Ranking
    - Which TV show categories are most popular/least popular?
    - Which is the TV show with the most diverse cast?
- Outliers
    - What makes a TV show run for more than 10 seasons?
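
As a rough sketch of how ranking and outlier questions like these could be started in pandas, consider the following. The `shows` frame and its `seasons` column are hypothetical (in the real dataset you would have to derive the season count first), and the threshold of 10 seasons is simply taken from the question above, not from any statistical rule.

```python
import pandas as pd

# Hypothetical, hand-made sample. "seasons" is an assumed, pre-computed column.
shows = pd.DataFrame({
    "title": ["A", "B", "C", "D", "E"],
    "country": ["South Korea", "United States", "United States", "Japan", "United States"],
    "seasons": [1, 2, 15, 1, 3],
})

# Ranking: count shows per country, sorted from most to least.
ranking = shows["country"].value_counts()
top_country = ranking.index[0]

# Outliers: shows that run for more than 10 seasons.
long_runners = shows[shows["seasons"] > 10]

print(ranking)
print(long_runners)
```

Once you have the outliers isolated, the interesting journalistic work is to look at what they have in common, for example their categories or countries.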

# Relationships

Instead of focusing on one group or comparing one characteristic, you can also **explore different relationships between characteristics** [1]. A nice example using only heatmaps is this notebook [A story told through a heatmap by @tkubacka](https://www.kaggle.com/tkubacka/a-story-told-through-a-heatmap). 

If you want to take it one step further, you can go into **why and how** something behaves a certain way, as in this notebook: [[Plotly] Analyzing Why do Space Missions Fail? by @foolofatook](https://www.kaggle.com/foolofatook/plotly-analyzing-why-do-space-missions-fail).

--- 

Let's brainstorm what story ideas we could apply to the [Netflix Movies and TV Shows dataset](https://www.kaggle.com/shivamb/netflix-shows) based on our [previous findings](https://www.kaggle.com/iamleonie/data-journalism-exploratory-data-analysis-2-5):
- Relationships
    - Is there a connection between directors and actors?
- Why and how?
    - Why are romantic dramas more popular in Asian countries? How are Asian romantic dramas different from the rest of the world?
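
One possible way to start on the director and actor question is to count how often each director works with each actor. The sketch below assumes columns named `director` and `cast`, with the cast stored as a comma-separated string; the names in the sample are placeholders.

```python
import pandas as pd

# Hypothetical sample with assumed column names; names are placeholders.
df = pd.DataFrame({
    "director": ["Dir X", "Dir X", "Dir Y"],
    "cast": ["Actor 1, Actor 2", "Actor 1", "Actor 2, Actor 3"],
})

# One row per (director, actor) pair, then count how often each pair occurs.
pairs = (
    df.assign(actor=df["cast"].str.split(", "))
      .explode("actor")
      .groupby(["director", "actor"])
      .size()
      .sort_values(ascending=False)
)

print(pairs)
```

Pairs with a high count hint at recurring collaborations, which you could then visualise as a heatmap or a network.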

# Change Over Time

This type of data story is related to the previous one, since we are again looking at the relationship between two characteristics. In this case, one of the characteristics is time. Based on the data, you can explore the **change over time** and check whether there is a trend [1, 2]. Is the feature increasing, decreasing, or in stasis over time?

If you are comparing different groups, you might want to explore their relationship to each other as well. For example, are groups A and B both steadily increasing with a constant offset, or is one group catching up to the other? Is there an intersection? If yes, explore it in depth [3]. A nice example of this type of data story is this notebook: [Tools of the Trade: A Short History by @haakakak](https://www.kaggle.com/haakakak/tools-of-the-trade-a-short-history/).

You could even take it one step further and experiment with **forecasting**.
One example is this [analytics challenge](https://www.kaggle.com/c/acea-water-prediction/) on Kaggle, where the objective was to make predictions about water supplies in different water bodies.

---

Let's brainstorm what story ideas we could apply to the [Netflix Movies and TV Shows dataset](https://www.kaggle.com/shivamb/netflix-shows) based on our [previous findings](https://www.kaggle.com/iamleonie/data-journalism-exploratory-data-analysis-2-5):
- What was the change (increase, decrease, stasis)?
    - Has there been a change in the days on which TV shows are added?
- Why did it change?
    - What does the cast of a TV show tell us about diversity?
- How will it change in the future?
    - Can we predict what TV shows Netflix might produce next?
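
For the question about the days on which TV shows are added, a first look could be a table of weekdays per year. This is only a sketch: it assumes a `date_added` column with dates written like "January 1, 2019", and the four dates below are made up. Real data may need extra cleaning (for example missing values or stray whitespace) before parsing.

```python
import pandas as pd

# Hypothetical sample; the column name and date format are assumptions.
df = pd.DataFrame({
    "date_added": [
        "January 1, 2019",
        "March 15, 2019",
        "August 14, 2020",
        "September 25, 2020",
    ],
})

# Parse the text dates into proper timestamps.
added = pd.to_datetime(df["date_added"], format="%B %d, %Y")

# Count how many titles were added on each weekday, per year.
year = added.dt.year.rename("year")
weekday = added.dt.day_name().rename("weekday")
by_year_weekday = pd.crosstab(year, weekday)

print(by_year_weekday)
```

If the distribution of weekdays shifts from one year to the next, that shift is your starting point for asking why it changed.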

# References
[1] [Bradshaw, Paul (2020): From Relationships to Ranking: Angles for Your Next Data Story](https://gijn.org/2020/08/18/from-relationships-to-ranking-angles-for-your-next-data-story/)

[2] [Flowers, Andrew (2017): The Six Types of Data Journalism Stories](https://www.youtube.com/watch?v=4zLo12JdeOA)

[3] [Kang, Martha (2015): Exploring the 7 Different Types of Data Stories](http://mediashift.org/2015/06/exploring-the-7-different-types-of-data-stories/)

🚀 Let's continue with [Lesson 4: Visualization & Storytelling](https://www.kaggle.com/iamleonie/data-journalism-visualization-storytelling-4-5/)