# Analysis

Template notebook for analysis, including some commonly used starter code and formatting techniques.
- generated from https://github.com/pawlodkowski/analysis_template

## Contents<a class="anchor" id="contents"></a>

1. [Background](#background)

2. [Executive Summary](#summary)

3. [Q1](#q1)

    a. [Q1a](#1a)
    
    b. [Q1b](#1b)

4. [Q2](#q2)

    a. [Q2a](#2a)
    
    b. [Q2b](#2b)
    
5. [Appendix](#Appendix)

6. [Footnotes](#Footnotes)

In [None]:
import os
import re
from typing import List, Tuple

import numpy as np
import pandas as pd
import plotly.express as px
import plotly.figure_factory as ff
import sqlalchemy
import yaml
from IPython.display import Markdown, display, display_html
from scipy import stats

pd.options.plotting.backend = "plotly"

def get_db_url(file='database.yml', conn='dwh') -> str:
    """
    Returns database URL from credential variables found in .yml file.
    """
    path_options = [os.path.expanduser('~') + '/.rport/' + file,
                    os.getcwd() + '/' + file,
                    '/root/' + file]
    for opt in path_options:
        if os.path.isfile(opt):
            with open(opt) as data:
                creds = yaml.safe_load(data)
            break
    else:
        err = '\nCredentials file not found from the list of options:\n'
        for opt in path_options:
            err += f'\t- {opt}\n'
        raise Exception(err)
         
    c = creds.get(conn)
    if not c:
        raise Exception(f'\nNo data found for connection named {conn}!\n')
        
    db_url = f"""\
            {c['driver'].lower()}://{c['username']}:\
            {c['password']}@{c['host']}:{c['port']}/{c['database']}"""
    # Raw string avoids the invalid "\s" escape-sequence warning
    return re.sub(r'\s+', '', db_url)

DB_URL = get_db_url()
CON = sqlalchemy.create_engine(DB_URL)

#modify or delete
TEST_PARAMS = {'start_date':'2022-01-01 00:00:00'}
              
SQL = """
with dummy_query as (
  select 
    date_trunc('day', series)::date as day,
    round((random() * 9 + 1)::numeric, 1) as num
  from generate_series(%(start_date)s::timestamp,
                       %(start_date)s::timestamp + interval '7 days', 
                       '1 day'::interval
                       ) series
)
select * from dummy_query
"""
df = pd.read_sql(SQL, CON, params=TEST_PARAMS)
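
`get_db_url` interpolates the password into the URL verbatim, so a password containing reserved characters such as `@` or `/` would yield a malformed URL. Below is a minimal sketch of one way to guard against that with the standard library. `build_db_url` is a hypothetical helper that is not part of the template, the credentials are made up, and the dictionary keys are assumed to match those read in `get_db_url`.

```python
from urllib.parse import quote_plus


def build_db_url(creds: dict) -> str:
    """Hypothetical helper: assemble a database URL, percent-encoding
    the username and password so reserved characters survive parsing."""
    user = quote_plus(str(creds["username"]))
    password = quote_plus(str(creds["password"]))
    return (
        f"{creds['driver'].lower()}://{user}:{password}"
        f"@{creds['host']}:{creds['port']}/{creds['database']}"
    )


# Made-up credentials, for illustration only.
example = {
    "driver": "PostgreSQL",
    "username": "analyst",
    "password": "p@ss/word",
    "host": "localhost",
    "port": 5432,
    "database": "dwh",
}
print(build_db_url(example))
```

Building the string with adjacent f-strings also removes the need to strip whitespace afterwards with `re.sub`.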

## Background<a class="anchor" id="background"></a>

In [None]:
text = f"""
Here is the text for the background. And here is the first footnote<a href="#footnote1"><sup>1</sup></a>.
"""
display(Markdown(text))

## Executive Summary<a class="anchor" id="summary"></a>

In [None]:
text = f"""
Here is the text for the executive summary. And here is the second footnote<a href="#footnote2"><sup>2</sup></a>.

For more specific details (like technical definitions) that don't belong in a high-level summary, you can point the reader to the
[Appendix](#Appendix).
"""
display(Markdown(text))

## Q1 (Heading Level 2) <a class="anchor" id="q1"></a>

In [None]:
text = f"""
Here is the text for the first question. And here is the third footnote<a href="#footnote3"><sup>3</sup></a>.

Question-level headings (i.e. sections) should be heading level 2 (i.e. `<h2>` in HTML or `##` in Markdown), **along with the executive summary and
table of contents**.
- This is essentially the "top-level", as `<h1>` is reserved for the title of the report.
"""
display(Markdown(text))

### Q1a (Heading Level 3) <a class="anchor" id="1a"></a>

In [None]:
text = f"""
Sub-sections within a question should be heading level 3 (i.e. `<h3>` in HTML or `###` in Markdown).
"""
display(Markdown(text))

#### Sub-Sub-Section (Heading Level 4)<a class="anchor" id="descriptive_name"></a>

In [None]:
text = f"""
Sub-sections within a sub-section within a question should be heading level 4 (i.e. `<h4>` in HTML or `####` in Markdown).
"""
display(Markdown(text))

##### Sub-Sub-Sub-Section (Heading Level 5)<a class="anchor" id="descriptive_name"></a>

In [None]:
text = f"""
The deepest level allowed is heading level 5 (i.e. `<h5>` in HTML or `#####` in Markdown). 

- This is the deepest heading level that my 
[flowkey nbconvert template](https://github.com/pawlodkowski/nbconvert_flowkey/blob/master/share/jupyter/nbconvert/templates/flowkey/index.html.j2)
targets with styling and anchor links for easier navigation and internal hyperlinking.
- Anything deeper is rendered as regular text in the HTML report (_which won't look nice_).
"""
display(Markdown(text))
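
The level-5 limit described above can be checked before a report is exported. The following is a minimal sketch, assuming headings are written in `#` (ATX) style at the start of a line. `find_too_deep_headings` and `MAX_HEADING_LEVEL` are hypothetical names that are not part of the template or of nbconvert.

```python
import re

MAX_HEADING_LEVEL = 5  # deepest level the report template is said to style

# An ATX heading: 1-6 '#' characters at line start, then spaces/tabs, then the title.
HEADING_RE = re.compile(r"^(#{1,6})[ \t]+(.*)$", re.MULTILINE)


def find_too_deep_headings(markdown_text: str, max_level: int = MAX_HEADING_LEVEL):
    """Return (level, title) pairs for headings deeper than max_level."""
    return [
        (len(hashes), title.strip())
        for hashes, title in HEADING_RE.findall(markdown_text)
        if len(hashes) > max_level
    ]


sample = "## Q1\n### Q1a\n###### Too deep\n"
print(find_too_deep_headings(sample))
```

Running this over the source of each Markdown cell would flag any heading that would otherwise fall back to plain text in the HTML report.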

### Q1b (Heading Level 3) <a class="anchor" id="1b"></a> 

In [None]:
text = f"""
some text here...
"""
display(Markdown(text))

## Q2 (Heading Level 2)<a class="anchor" id="q2"></a>

In [None]:
text = f"""
some text here...
"""
display(Markdown(text))

### Q2a (Heading Level 3)<a class="anchor" id="2a"></a>

In [None]:
text = f"""
some text here...
"""
display(Markdown(text))

### Q2b (Heading Level 3) <a class="anchor" id="2b"></a> 

In [None]:
text = f"""
some text here...
"""
display(Markdown(text))

## Appendix<a class="anchor" id="Appendix"></a>

In [None]:
text = f"""
This is the text for the appendix. 
"""
display(Markdown(text))

## Footnotes<a class="anchor" id="Footnotes"></a>

**<a name="footnote1">1</a>:** _Here is the text for footnote 1._

**<a name="footnote2">2</a>:** _Here is the text for footnote 2._

**<a name="footnote3">3</a>:** _Here is the text for footnote 3._
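
Each footnote link in the text targets `#footnoteN`, so the `name` on the matching entry has to carry the same number as its label. A minimal sketch of generating both sides from one number, so they cannot drift apart, is shown here. `footnote_ref` and `footnote_def` are hypothetical helpers, not part of the template.

```python
def footnote_ref(n: int) -> str:
    """Inline superscript link pointing at footnote n."""
    return f'<a href="#footnote{n}"><sup>{n}</sup></a>'


def footnote_def(n: int, body: str) -> str:
    """Footnote entry whose anchor name matches footnote_ref(n)."""
    return f'**<a name="footnote{n}">{n}</a>:** _{body}_'


# Example: the third footnote, as it would appear in the text and in this section.
print(f"Here is the third footnote{footnote_ref(3)}.")
print(footnote_def(3, "Here is the text for footnote 3."))
```

The returned strings can be interpolated into the `text` f-strings that are passed to `display(Markdown(text))`.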

In [None]:
from datetime import datetime, timezone

# Timezone-aware UTC timestamp (datetime.utcnow() is deprecated since Python 3.12)
display_html(
    '<div id="timestamp_container"\
    style="display: flex; justify-content: center; align-items: center; font-size:smaller; color: dimgray">\
    <p><i>Report generated at: <span id="timestamp" style="font-weight:bold; margin:5px">{} (UTC)</span></i></p></div>'\
    .format(datetime.now(timezone.utc).strftime('%Y-%m-%d @ %H:%M:%S')),
    raw=True
)