# NameX Daily Stats

Use **!pip** in a cell to install any libraries that were not installed when the environment was created.

This notebook assumes you have installed the packages listed in requirements.txt (`pip install -r requirements.txt`) before launching Jupyter.

The contents of requirements.txt should be:

`jupyter
psycopg2-binary
sqlalchemy
ipython-sql
simplejson
pandas
matplotlib
spacy
papermill
schedule`
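
A quick way to confirm the environment matches the list above is to ask Python which distributions are installed. The following is a minimal sketch, not part of the original notebook; `missing_packages` is a hypothetical helper name, and the package names are copied from the listing above.

```python
from importlib import metadata

# Distribution names copied from the requirements.txt listing above.
REQUIRED = [
    "jupyter", "psycopg2-binary", "sqlalchemy", "ipython-sql", "simplejson",
    "pandas", "matplotlib", "spacy", "papermill", "schedule",
]


def missing_packages(names):
    """Return the distribution names that are not installed (hypothetical helper)."""
    missing = []
    for name in names:
        try:
            metadata.version(name)  # raises if the distribution is absent
        except metadata.PackageNotFoundError:
            missing.append(name)
    return missing


# Anything printed here could then be installed from a cell with !pip.
print("missing:", missing_packages(REQUIRED))
```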

We need to load these libraries into the notebook in order to query, load, manipulate and view the data.

In [10]:
import os
import logging
import sqlalchemy
import simplejson
import pandas as pd
import csv
import matplotlib
from datetime import datetime, timedelta
from IPython.display import HTML, display

%load_ext sql
%config SqlMagic.displaylimit = 5

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


Read in the connection details for DEV, TEST, or PROD, depending on which database you want to run the stats against. Only one of the cells below should be uncommented at a time.

In [11]:
# Local Credentials
# with open("creds-dev-forward.json.nogit") as fh:
#     creds = simplejson.loads(fh.read())

In [12]:
# DEV Credentials
with open("creds-dev.json.nogit") as fh:
    creds = simplejson.loads(fh.read())

In [13]:
# TEST Credentials
# with open("creds-test.json.nogit") as fh:
#     creds = simplejson.loads(fh.read())

In [14]:
# PROD Credentials
# with open("creds-prod.json.nogit") as fh:
#     creds = simplejson.loads(fh.read())
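
Instead of commenting cells in and out, the credentials file could be chosen by name. This is only a sketch under assumptions: the environment keys (`local`, `dev`, `test`, `prod`) and the `load_creds` helper are hypothetical, the file names come from the cells above, and the standard-library `json` module stands in for `simplejson`.

```python
import json
import os

# File names taken from the cells above; the keys are hypothetical labels.
CREDS_FILES = {
    "local": "creds-dev-forward.json.nogit",
    "dev": "creds-dev.json.nogit",
    "test": "creds-test.json.nogit",
    "prod": "creds-prod.json.nogit",
}


def load_creds(env, base_dir="."):
    """Load the credentials dict for one environment (hypothetical helper)."""
    if env not in CREDS_FILES:
        raise ValueError("unknown environment: {!r}".format(env))
    path = os.path.join(base_dir, CREDS_FILES[env])
    with open(path) as fh:
        return json.load(fh)


# Usage in the notebook would then be a single line, for example:
# creds = load_creds("dev")
```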

This creates the connection to the database and prepares the Jupyter SQL magic.

In [15]:
# Local overrides: uncomment these lines to point at a local database instead
# creds['username'] = 'postgres'
# creds['password'] = 'postgres'
# creds['hostname'] = 'localhost'
# creds['port_num'] = '54323'
# creds['db_name'] = 'namex'

creds['username'] = 'userHQH'
creds['password'] = 'cOsEhpcJeigX2Its'
creds['hostname'] = 'postgresql-dev'
creds['port_num'] = '5432'
creds['db_name'] ='namex'

connect_to_db = (
    'postgresql://'
    + creds['username'] + ':' + creds['password'] + '@'
    + creds['hostname'] + ':' + creds['port_num'] + '/' + creds['db_name']
)
logging.debug("##########connect_to_db in namex-daily-report is {}".format(connect_to_db))
%sql $connect_to_db

'Connected: postgres@namex'
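
Plain string concatenation produces an invalid URL if the password contains characters such as `@`, `:` or `/`. The sketch below shows one way to percent-encode the user name and password with the standard library, and to produce a masked form that is safe to log. `build_postgres_url` is a hypothetical helper, and the sample credentials are made up for illustration; the dictionary keys match the `creds` keys used above.

```python
from urllib.parse import quote_plus


def build_postgres_url(creds, mask_password=False):
    """Build a postgresql:// URL from a creds dict (hypothetical helper).

    With mask_password=True the password is replaced by '***' so the
    result can be written to a log.
    """
    password = "***" if mask_password else quote_plus(creds["password"])
    return "postgresql://{}:{}@{}:{}/{}".format(
        quote_plus(creds["username"]),
        password,
        creds["hostname"],
        creds["port_num"],
        creds["db_name"],
    )


# Made-up example values, not real credentials.
example = {
    "username": "postgres",
    "password": "p@ss:word/1",
    "hostname": "localhost",
    "port_num": "54323",
    "db_name": "namex",
}
print(build_postgres_url(example, mask_password=True))
```

The masked form could be passed to `logging.debug`, while the unmasked form is the one handed to `%sql`.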

The simplest query we can run to confirm that the libraries are loaded and the DB connection is working:

In [16]:
%%sql 
select now() AT TIME ZONE 'PST' as current_date

 * postgresql://postgres:***@localhost:54323/namex
1 rows affected.


current_date
2019-11-19 14:43:59.702440


Daily totals for the specified date. In the following query, `current_date - 0` means today, `current_date - 1` means yesterday, `current_date - 2` means the day before yesterday, and so on. As written, the query compares against `current_date`, so it reports on today.
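
The same day offset can be mirrored on the Python side, for example to name the output file after the day being reported. This is a sketch under assumptions: `report_date` and `daily_filename` are hypothetical helpers, the file name pattern follows the `daily_totals_YYYY-MM-DD.csv` form used later in this notebook, and the date comes from the local machine rather than from the database server, so the two can differ around midnight or across time zones.

```python
from datetime import date, timedelta


def report_date(days_ago=0, today=None):
    """Return the date `days_ago` days before `today` (hypothetical helper)."""
    if days_ago < 0:
        raise ValueError("days_ago must be zero or positive")
    if today is None:
        today = date.today()
    return today - timedelta(days=days_ago)


def daily_filename(days_ago=0, today=None):
    """Build the CSV name for the reported day, e.g. daily_totals_2019-11-18.csv."""
    return "daily_totals_{}.csv".format(report_date(days_ago, today).strftime("%Y-%m-%d"))


# 0 is today, 1 is yesterday, matching 'current_date - 0' and 'current_date - 1'.
print(daily_filename(1, today=date(2019, 11, 19)))
```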

In [17]:
%%sql stat_daily_completed  <<
SELECT r.user_id
     , to_char(date(r.last_update AT TIME ZONE 'PST'), 'YY-Mon-DD') AS Examined_Date
     , (select username from users u where u.id=r.user_id) AS EXAMINER
     , count(r.*) FILTER (WHERE r.state_cd = 'APPROVED')  AS APPROVED
     , count(r.*) FILTER (WHERE r.state_cd = 'REJECTED')  AS REJECTED
     , count(r.*) FILTER (WHERE r.state_cd = 'CONDITIONAL')  AS CONDITIONAL
     , count(r.*) FILTER (WHERE r.state_cd = 'CANCELLED')  AS CANCELLED
     , count(r.*) FILTER (WHERE r.priority_cd = 'Y')  AS PRIORITIES
     , count(r.*) + count(r.*) FILTER (WHERE r.priority_cd = 'Y')   AS total      
FROM requests r
WHERE r.user_id != 1
AND date(r.last_update AT TIME ZONE 'PST') = current_date
AND r.state_cd IN ('APPROVED','REJECTED','CONDITIONAL','CANCELLED')
GROUP BY r.user_id, date(r.last_update AT TIME ZONE 'PST')

 * postgresql://postgres:***@localhost:54323/namex
0 rows affected.
Returning data to local variable stat_daily_completed


In [18]:
edt = stat_daily_completed.DataFrame()

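if not edt.empty:
    # Strip the 'idir/' prefix from examiner usernames
    edt['examiner'] = edt['examiner'].str.replace('idir/', '')
    # Approved and conditional requests both count toward the approval rate
    edt['approved_%'] = ((edt.approved + edt.conditional) / edt.total * 100).round(1)
    edt['rejected_%'] = (edt.rejected / edt.total * 100).round(1)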
    
    with pd.option_context('display.max_rows', None, 'display.max_columns', None):
        display(HTML(edt.to_html()))
        print('grand total', edt['total'].sum())
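
The percentage columns can be checked by hand for a single examiner row. The sketch below restates the same arithmetic in plain Python. `completion_rates` is a hypothetical helper, the numbers are made up, and Python's `round` may differ from the pandas `round` in exact half cases. Note that in the query `total` is the row count plus the priority count, so the two percentages need not sum to 100 when priorities are present.

```python
def completion_rates(approved, conditional, rejected, total):
    """Return (approved_%, rejected_%) for one row, or (None, None) if total is 0."""
    if total == 0:
        return None, None
    approved_pct = round((approved + conditional) / total * 100, 1)
    rejected_pct = round(rejected / total * 100, 1)
    return approved_pct, rejected_pct


# Made-up row: 6 approved, 2 conditional, 2 rejected, 0 cancelled, 2 priorities.
rows_counted = 6 + 2 + 2 + 0
total = rows_counted + 2  # the query adds the priority count to the row count
print(completion_rates(6, 2, 2, total))
```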
        
    

Save to CSV
    

In [19]:
# filename = 'daily_totals_' + datetime.strftime(datetime.now() - timedelta(0), '%Y-%m-%d') +'.csv'
filename = 'daily_totals_' + datetime.strftime(datetime.now(), '%Y-%m-%d') +'.csv'
edt.to_csv(filename, sep=',', encoding='utf-8', index=False)

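# No rows were returned: append a marker row to the file.
if edt.empty:
    # newline='' lets the csv module control line endings itself
    with open(filename, 'a', newline='') as f:
        writer = csv.writer(f)
        writer.writerow(('No Data Retrieved', ''))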
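
The write-then-append pattern above can also be expressed as a single pass with the `csv` module alone. This is an illustrative sketch, not the notebook's method: `write_daily_csv` is a hypothetical helper that takes a header and a list of row tuples instead of a DataFrame, and the example file name and column names are made up.

```python
import csv


def write_daily_csv(filename, header, rows):
    """Write header and rows; if there are no rows, write a marker row instead."""
    # newline='' is what the csv module documentation recommends for file objects.
    with open(filename, "w", newline="", encoding="utf-8") as f:
        writer = csv.writer(f)
        writer.writerow(header)
        if rows:
            writer.writerows(rows)
        else:
            writer.writerow(("No Data Retrieved", ""))


# Example call with no data (file name and columns are placeholders):
# write_daily_csv("daily_totals_example.csv", ("examiner", "total"), [])
```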