## Data storytelling

Data storytelling is a powerful mechanism for sharing insights supported by a compelling narrative and effective visualizations.

Data storytelling is not about spinning results; it's about making them stick. For this to work, the results should be:
- simple (pruning the message to its core)
- concrete (can be described or detected by human senses)
- credible (can be put to the test)

### *Fundamentals of data storytelling*

`Two concepts` help us build the story's core. 
1. **3-minute story:** if we only had three minutes to tell a story, what would we focus on? 
2. **The big idea:** state the story's unique point of view in one sentence. 
 
These concepts help us articulate our story clearly and concisely. 

There are `three central elements` for any data story:


1. The data
    - Only include results (e.g. predictions) and findings (e.g. data analysis) that support the story's big idea
    - The data should be relevant, i.e., it should apply to the situation at hand
    - The data should be accurate and reliable


2. The narrative: Only include the key points needed to drive change.
    - Main point
      - Avoid disconnected facts
      - One central insight
    - Explanatory context
      - Understand background and audience
      - Clarify facts to that audience
    - Linear sequence: Each data point presented builds on the previous one until the conclusion is reached.

3. The visuals
    - Should be simple
    - Should be engaging
    - *Must not be misleading*

### *Translating technical results*

1. **Awareness:** Be aware of your audience. What is their background? What do they know? Will they understand technical jargon if you explain it? How much do they need to understand to drive change? Explaining why we chose the variables to predict is not useful for a non-technical person; instead, we should explain the context in which our model works. Our story should help the audience understand what our project is about and what change needs to be made for tangible outcomes. Listing the correlation coefficients could be overwhelming; it's better to explain the interaction between the variables (how they depend on each other, what happens to one if we change another, etc.).

2. **The ADEPT technique**
   - Analogy: Use a familiar concept to explain a new one.
   - Diagram: Use simple, easy-to-understand visuals.
   - Example: The illustrative power of a concrete example is always effective.
   - Plain English: Avoid jargon and technical terms as much as possible.
   - Technical definition: Use technical terms only when necessary. Be aware that not all technical details can be removed, so include a reference guide or definitions in the report. Be active: explain the same thing differently if necessary, try to answer the audience's questions, stay humble, and don't be condescending.

3. **Focus on impact:** Even if we are interested in showing the technical details, a persuasive story usually focuses on impact rather than process. For example, if we are talking about adopting a new technology, we should focus on the impact that technology has on the technical team's time and deliverables. Or if we are explaining how our model predicts an outcome, we should focus on the benefits, such as financial benefits.
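
As a small illustration of the Awareness and Plain English points above, the sketch below turns a correlation coefficient into a sentence a non-technical audience can follow. The function name `describe_relationship`, the example variable names, and the strength cut-offs (0.3 and 0.7) are assumptions made for illustration, not a standard.

```python
def describe_relationship(x_name: str, y_name: str, r: float) -> str:
    """Turn a correlation coefficient into a plain-English sentence.

    The 0.3 / 0.7 cut-offs are illustrative assumptions, not a standard.
    """
    if not -1.0 <= r <= 1.0:
        raise ValueError("a correlation coefficient must lie in [-1, 1]")

    strength = abs(r)
    if strength < 0.3:
        return f"{x_name} and {y_name} show little to no linear relationship."

    # Describe strength and direction in everyday words instead of a number.
    qualifier = "strongly" if strength >= 0.7 else "moderately"
    direction = "rise" if r > 0 else "fall"
    return f"When {x_name} goes up, {y_name} tends to {direction} ({qualifier} related)."


# Placeholder variable names and value, chosen only for the example.
print(describe_relationship("advertising spend", "revenue", 0.82))
```

The wording "tends to" is deliberate: a correlation describes how two variables move together, not that one causes the other.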

### *Impacting the decision-making process*

<u>`Narrative structure`</u>

1. **Background**
   - What motivated the analysis?
   - What changed in the previous situation that called for an analysis?
   - Who or what is the analysis focused on? (e.g. a specific customer group or a specific product)
2. **Initial insights:** Provide evidence of the factors that contribute to the problem. Only include relevant information.
3. **Deeper insights:** Provide more supporting evidence and data, as long as it helps explain on a deeper level the cause of the problem.
4. **Climax:** All the evidence should lead to the climax, the moment when we introduce the central finding of our analysis. It should state clearly what could happen if nothing changes.
5. **Next steps:** After the main finding is revealed, we should finish by exploring potential solutions and opportunities and recommending a course of action. If we want to impact the decision-making process, we need to be proactive and guide the audience through what to do with our results.

<u>`Selecting correct visualizations and presenting them`</u>


1. **Identify your audience's level of expertise**. Choose the right level of detail and the right visualizations accordingly.
2. Give a **simple, clear and proper title** to the visualization. When in doubt you can follow the *y vs x format*.
3. Anticipate any questions the audience might have and **provide the answers**. Explain the data source, the context, the units, the time period etc. In short, **prove that the data is reliable**.
4. Make a **strong and general statement about the findings**, i.e., the **main takeaway** of the visualization.
5. Highlight **supporting data**.
6. **Separately highlight** any odd trend or any feature that you want **to draw attention to**.
7. Explain the **impact of the findings**. What does it mean for the business? What should be done next? What are the next steps?
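
The checklist above can be sketched as a small container for the text that accompanies a chart, so that the title, data provenance, and takeaway are written down before any plotting starts. The `ChartText` class and all field values below are hypothetical placeholders; the resulting strings could be passed to whatever plotting library is in use.

```python
from dataclasses import dataclass


@dataclass
class ChartText:
    """Text elements that accompany one visualization (hypothetical helper)."""
    y_label: str   # quantity on the y-axis, including units
    x_label: str   # quantity on the x-axis, including units
    source: str    # where the data comes from
    period: str    # time period the data covers
    takeaway: str  # the single main message of the chart

    def title(self) -> str:
        # Title in the "y vs x" format.
        return f"{self.y_label} vs {self.x_label}"

    def caption(self) -> str:
        # Answers the "is this data reliable?" questions up front.
        return f"Source: {self.source}. Period: {self.period}."


# Placeholder values, chosen only for the example.
chart = ChartText(
    y_label="Monthly revenue (USD)",
    x_label="Advertising spend (USD)",
    source="internal sales database",
    period="Jan 2023 to Dec 2023",
    takeaway="Revenue grows with advertising spend.",
)
print(chart.title())
print(chart.caption())
print(chart.takeaway)
```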

## Data preparation for communication

### *Choosing the right format*

A good communication format shows **key information** from our project in a way that is **engaging** and **easy to understand**.

There are two main formats that are very common: written reports and oral presentations. Most data science projects will require a written report and an oral presentation, but in the end it depends on the situation and project at hand. 

**There are several things to consider when sharing findings:**

1. The audience (their background, why they are interested in the project, whether they care about the results or the impact, and how they will use the findings)
2. The content to include (results, conclusions, recommendations, methods)
3. Special requirements to take into account (time constraints, whether they will report to someone else and need a copy to back up their claims)
4. The channel to use (format: notebook, blog post, report, presentation, etc.; delivery mechanism: email, in person, Slack, etc.; size of the audience)

- #### Oral presentations

<u>Advantages</u>
1. You get to build a relationship with the audience
2. Non-verbal cues can be used to emphasize important points and also to gauge the audience's reaction
3. Immediate feedback

<u>Disadvantages</u>
1. No permanent record of communication
2. Not suitable for long messages

- #### Written reports

<u>Advantages</u>
1. Permanent record of communication, so the message can be analyzed over the longer term
2. Easy to share with large audiences
3. Less prone to emotional reactions
4. Suitable for sharing code with technical stakeholders for review or replication

<u>Disadvantages</u>
1. Hard to see if the message was understood
2. No immediate feedback