# Written reports

- Informational
    - Factual information
    - Short
    - Not strict structure
    - Inform about facts
- Analytical
    - Analysis (relationships/recommendations)
    - Varies (short or long)
    - Strict structure
    - Data-driven decisions
- Final report
    - Elements
        - Data analysis
        - Findings and results
        - Visuals
    - Format
        - Strict structure
        - Long
    - Audience
        - Details
- Summary report
    - Elements
        - Key findings and recommendations
        - Visuals
    - Format
        - Short (< 5 pages)
        - Summary of final report
        - Strict structure
        - Link to main document
    - Audience
        - No need for technical details


# Report structure: Journal context

- Introduction
    - Purpose of the report
        - Show the analysis of the product reviews
        - Rating prediction based on review
    - Contextual information
        - Why we performed the analysis
        - What motivated this report
        - Increase in negative reviews
    - Question of analysis
        - Summarize the answers to the research questions
        - Factors contributing to bad user experience
- Body
    - Data
        - Description of most relevant data
        - Tables
    - Methods
        - Methods used to analyze data
        - NLP and Random Forest
    - Analysis
        - Selected model
        - Visuals
        - Graphs with most common words
    - Results
        - Description of analysis
        - Visuals for evaluation
- Conclusions
    - Restate question
    - Summarize most important results from analysis
    - Add recommendations : For next steps
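
The "graphs with most common words" item in the Analysis section could start from a simple word count. The sketch below is only an illustration: the reviews, the ratings, the stop-word list, and the function name `most_common_words` are all invented here and are not part of the original analysis.

```python
from collections import Counter
import re

# Hypothetical reviews with star ratings, invented purely for illustration.
REVIEWS = [
    ("Shipping was delayed and the box arrived damaged", 1),
    ("Delayed shipping again, very disappointed", 2),
    ("Great product, fast delivery", 5),
    ("Works as described, great value", 4),
]

# A tiny, hand-picked stop-word list (an assumption, not a standard list).
STOP_WORDS = {"was", "and", "the", "as", "very", "again"}


def most_common_words(reviews, max_rating, top_n=3):
    """Count words in reviews rated at or below max_rating."""
    counts = Counter()
    for text, rating in reviews:
        if rating <= max_rating:
            words = re.findall(r"[a-z']+", text.lower())
            counts.update(w for w in words if w not in STOP_WORDS)
    return counts.most_common(top_n)


# Words that dominate the negative reviews (ratings of 1 or 2).
negative_top = most_common_words(REVIEWS, max_rating=2)
print(negative_top)
```

The resulting (word, count) pairs are the kind of data a bar chart of most common words would be drawn from.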


# Report structure: Business context


- 1-3-25 Approach
    - 1 page of abstract
    - 3 pages max of executive summary
    - 25 pages of detail

# Remember the audience

- People with little time (Customer or internal collaborator)
    - Read Introduction
    - Read Conclusion
    - Scan body
- People with liabilities (Executive team)
    - Scan introduction 
    - Scan conclusions
    - Scan Recommendations
- Technical stakeholder
    - Body of report (Understand and validate methods and analysis)

# Reproducibility

- A report must be clear 
- A report must be reproducible
- Example:
    - Baking a cake with the same recipe, process, and ingredients should give the same cake with the same flavors
    - In a Jupyter notebook, the same analysis should produce the same results
- Prevents duplication of effort
- Build upon preexisting work
- Focus on new challenges
- Peer review
- Tool agnostic (Does not matter which platform, language)
- Best practices
    1. Keep track of how results were produced
        - Well documented scripts (Comments in code)
        - List packages and environment used
        - Version control (git)
    2. Avoid manual data manipulation
        - Never change data directly with an editor
        - Save all versions of the dataset
        - Save raw data along with intermediate steps (helps to track transformations)
        - Adapt and resolve issues (from previous versions)
        - Example: Data imputation (impute with the mean, later find that 0 is the best choice)
    3. Control randomness
        - Set a random seed so the output is reproducible
        - Controls confounding variables (a change in results is due to the model and not to randomness)
    4. Interpretability
        - Understand the cause of a decision, or be able to predict the model's results
        - Story with a compelling narrative: helps readers understand the findings
        - Link with reproducibility: the conclusion can be reproduced
    5. Cite bibliography correctly
        - Other people's work
        - Basic information required to identify and locate a specific publication
        - Different styles but same underlying logic
            - Book: Author Name (Year). Title. Publisher.
            - Journal Article: Author Name. (Year) 'Article Title.' Journal Title, Volume Number, Issue Number, Page Numbers.
            - Website: Author Name. Date of Publication, 'Title of Page/Work.' Title of Website, Location
        - APA style (Most Common):
            - In text citations (author, date)
        - Reference management tools
            - Easier to keep track
            - Change between styles
            - Search for reference online
            - Options: EndNote, Mendeley, RefWorks
        - Business context
            - Less strict
            - Simpler use of references (hyperlinks)
            - Information must be available and retrievable
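
Best practices 1 to 3 above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not a prescribed workflow: the income values, the seed, and the helper names (`environment_summary`, `impute`, `sample_rows`) are hypothetical.

```python
import platform
import random
import statistics
import sys

SEED = 42  # arbitrary value; any fixed integer works


def environment_summary():
    """Record the interpreter and platform used to produce the results."""
    return {
        "python": sys.version.split()[0],
        "platform": platform.platform(),
    }


def impute(values, fill_value):
    """Return a new list with None replaced; the raw list is left untouched."""
    return [fill_value if v is None else v for v in values]


def sample_rows(rows, k, seed=SEED):
    """Draw a reproducible random sample using a dedicated, seeded generator."""
    rng = random.Random(seed)
    return rng.sample(rows, k)


# Hypothetical raw data, kept unchanged so every transformation can be traced.
raw_income = [1200, None, 1800, None, 1500]

# Two versions of the same imputation step, both derived from the raw data.
observed = [v for v in raw_income if v is not None]
income_mean_imputed = impute(raw_income, statistics.mean(observed))
income_zero_imputed = impute(raw_income, 0)

# The same seed gives the same sample on every run.
first_sample = sample_rows(list(range(100)), k=5)
second_sample = sample_rows(list(range(100)), k=5)

print(environment_summary())
print(income_mean_imputed)
print(first_sample == second_sample)
```

Because each imputed version is built from `raw_income` in code rather than edited by hand, switching from the mean to 0 is a one-line change that leaves the earlier version recoverable.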

# Precision

- Concise
    - Use concrete nouns
    - Avoid "this", "that", "it" (unclear pronouns add cognitive load and distract readers from the insights)
    - Example: <s>This</s> The model shows an accuracy of 80% when predicting customer churn.
- Precise
    - Active voice: emphasis on the author
    - Passive voice: stuffy and hard to read
    - Academic (passive) vs. business (active) context
    - Eliminate redundant adjectives and adverbs (phrases that say the same thing twice)
    - Examples: "introduce a new", "done previously" (not recommended)
    - Avoid run-on sentences: two or more independent clauses connected incorrectly (two sentences joined by a comma)
        - Make two sentences
        - Use a dependent clause
- Avoid misleading or confusing statements
- Meaningful message
- No empty phrases:
    - It is interesting to note that
    - The fact that
    - It should be pointed out that
    - It is well known that
    - It is obvious that
    - Contain no information
    - Distracting
    - Should be removed
    - Example: <s>Another important point is the fact that</s> Negative ratings were associated with the words "delayed" and "shipping"
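
A draft can be checked for the empty phrases listed above with a simple substring search. The function name `find_empty_phrases` is hypothetical, and the check is a rough sketch that only catches these exact phrases.

```python
# Empty phrases taken from the list above, lowercased for matching.
EMPTY_PHRASES = [
    "it is interesting to note that",
    "the fact that",
    "it should be pointed out that",
    "it is well known that",
    "it is obvious that",
]


def find_empty_phrases(text):
    """Return the empty phrases that occur in text, ignoring case."""
    lowered = text.lower()
    return [phrase for phrase in EMPTY_PHRASES if phrase in lowered]


draft = (
    "Another important point is the fact that negative ratings "
    'were associated with the words "delayed" and "shipping".'
)
flagged = find_empty_phrases(draft)
print(flagged)
```

Any phrase that is flagged is a candidate for deletion; the sentence usually reads better once it starts directly with the finding.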

# Example: Credit risk

<center><img src="images/01.011.jpg"  style="width: 400px; height: 300px;"/></center>


- Credit risk: probability of defaulting
- Loanme bank wants to predict if a customer is likely to default
- Raw data available
- Exploratory data analysis
- Model training and evaluation
- Audience
    - Non-technical stakeholders
    - Bank decision-makers
- Story
    - Background (What problem motivated the analysis): Increase in the default percentage over the last 5 years.
    - Background (What is the focus of the analysis): Predicting which customers have a high probability of default.
    - Insight (Evidence of what contributed to the problem): People with more unemployment periods tend to default more
    - Insight (Supporting evidence of what contributed to the problem): People with lower income tend to default more
    - Climax (Central findings): It is possible to predict which people are more likely to default with an accuracy of 95%
    - Next steps (Potential solution): Run a trial on a control population
- Translate technical results
    - Simplify for non-technical audience
    - Role: Financing Department Director
    - Interest: Decision on implementing an automated loan rejection system
    - Appropriate data: Relationship between unemployment or income and loan default; percentage of customers defaulting over the next months
    - Statistics: Median age and income, percentage of change
- Visuals
    - Boxplot with age vs. default condition 
    - Lineplot with % change defaulting customers
- Presentation Format
    - Who? Financial Department director
    - Why? Important decisions ahead
    - Content : Key findings and recommendations
    - Channel: Send the results before the meeting 
    - Written report (Summary report for non-technical audience) 
    - Analytical report (We are presenting analysis)
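
The statistics named above (median age, median income, percentage of change) can be computed with the standard library. The sketch below uses invented numbers only to show the calculation; none of the values come from the Loanme data.

```python
import statistics


def percent_change(old, new):
    """Percentage change from old to new."""
    return (new - old) / old * 100


# Hypothetical values, invented for illustration only.
ages = [23, 35, 41, 52, 29]
incomes = [1800, 2500, 3200, 4100, 2100]
default_rate_by_year = [4.0, 4.5, 5.0, 5.5, 6.0]  # percent of customers defaulting

median_age = statistics.median(ages)
median_income = statistics.median(incomes)
# Change between the first and the last year of the series.
change = percent_change(default_rate_by_year[0], default_rate_by_year[-1])

print(f"Median age: {median_age}")
print(f"Median income: {median_income}")
print(f"Change in default rate: {change:.1f}%")
```

A handful of summary numbers like these is usually enough for a non-technical decision-maker; the full distributions belong in the boxplot and line plot.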

# Summary report structure


- Introduction
    - Purpose
    - Contextual information
    - Question of analysis
- Body
    - Data
    - Results: Key findings
- Conclusions
    - Restate question
    - Central insight
    - Add recommendations