In [17]:
%%capture
!pip install -r ../requirements.txt

In [18]:
%load_ext sql
%sql postgresql://corise:corise@localhost:5432/dbt
%config SqlMagic.displaylimit=5
%config SqlMagic.displaycon = False

The sql extension is already loaded. To reload it, use:
  %reload_ext sql


In [19]:
# List the databases on the server (psql-style meta-command)
%sql \l

4 rows affected.


Name,Owner,Encoding,Collate,Ctype,Access privileges
dbt,postgres,UTF8,C.UTF-8,C.UTF-8,=Tc/postgres postgres=CTc/postgres corise=CTc/postgres
postgres,postgres,UTF8,C.UTF-8,C.UTF-8,
template0,postgres,UTF8,C.UTF-8,C.UTF-8,=c/postgres postgres=CTc/postgres
template1,postgres,UTF8,C.UTF-8,C.UTF-8,=c/postgres postgres=CTc/postgres


In [20]:
%sql \d

8 rows affected.


Schema,Name,Type,Owner
public,addresses,table,postgres
public,events,table,postgres
public,order_items,table,postgres
public,orders,table,postgres
public,products,table,postgres


## Questions to answer:

- How many users do we have?
- On average, how many orders do we receive per hour?
- On average, how long does an order take from being placed to being delivered?
- How many users have only made one purchase? Two purchases? Three+ purchases?
- On average, how many unique sessions do we have per hour?

In [21]:
#How many users do we have?
%sql select count(*) from public.users

1 rows affected.


count
130


In [22]:
# On average, how many orders do we receive per hour?
# Count orders per hour, then average those hourly counts.

%sql with orders_by_hour as (SELECT count(*) as order_count, date_trunc('hour', created_at) as hour_group FROM orders GROUP BY date_trunc('hour', created_at)) select ROUND(AVG(order_count),2) from orders_by_hour

1 rows affected.


round
8.16
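
Note that this average covers only hours in which at least one order was placed, because `GROUP BY` produces no row for an empty hour. A minimal sketch of the same bucketing in plain Python makes that behaviour visible. The timestamps are made up for illustration (they are not rows from `orders`), and `avg_per_active_hour` is a hypothetical helper name.

```python
from collections import Counter
from datetime import datetime


def avg_per_active_hour(timestamps):
    """Average number of events per hour bucket, ignoring empty hours."""
    # Truncate each timestamp to the hour, mirroring date_trunc('hour', ...)
    buckets = Counter(
        ts.replace(minute=0, second=0, microsecond=0) for ts in timestamps
    )
    if not buckets:
        return None
    return round(sum(buckets.values()) / len(buckets), 2)


# Hypothetical sample: 3 orders in the 10:00 hour, none at 11:00, 1 at 12:00
sample = [
    datetime(2021, 2, 10, 10, 5),
    datetime(2021, 2, 10, 10, 30),
    datetime(2021, 2, 10, 10, 59),
    datetime(2021, 2, 10, 12, 15),
]
print(avg_per_active_hour(sample))  # 2.0 (4 orders over 2 active hours)
```

If quiet hours should pull the average down, the hourly counts would need to be joined against a complete series of hours first.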


On average, how long does an order take from being placed to being delivered?
Query plan: compute the difference between each order's creation time and delivery time (excluding orders that have not been delivered yet), then average those differences. The query below reports the result in hours.

In [23]:
%%sql
WITH delivery_times_in_hours AS (
    -- Seconds between placement and delivery, converted to hours
    SELECT EXTRACT(EPOCH FROM (delivered_at - created_at)) / 3600 AS diff_time_hour
    FROM orders
    WHERE delivered_at IS NOT NULL
)

SELECT AVG(diff_time_hour) AS avg_hours
FROM delivery_times_in_hours

1 rows affected.


avg_hours
94.21958456973294
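
The reported 94.22 hours is a little under four days (94.22 / 24 ≈ 3.93). As a cross-check of the logic, here is a small Python sketch of the same calculation: subtract timestamps, convert seconds to hours, skip undelivered orders, and average. The sample rows are invented and `avg_delivery_hours` is a hypothetical helper, not part of the project.

```python
from datetime import datetime


def avg_delivery_hours(orders):
    """Mean hours from created_at to delivered_at, skipping undelivered orders."""
    hours = [
        # total_seconds() / 3600 mirrors EXTRACT(EPOCH FROM ...) / 3600
        (delivered - created).total_seconds() / 3600
        for created, delivered in orders
        if delivered is not None
    ]
    return sum(hours) / len(hours) if hours else None


# Hypothetical (created_at, delivered_at) pairs
sample = [
    (datetime(2021, 2, 10, 8, 0), datetime(2021, 2, 12, 8, 0)),  # 48 hours
    (datetime(2021, 2, 10, 9, 0), datetime(2021, 2, 14, 9, 0)),  # 96 hours
    (datetime(2021, 2, 11, 9, 0), None),                         # not delivered yet
]
print(avg_delivery_hours(sample))  # 72.0
```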


How many users have only made one purchase? Two purchases? Three+ purchases?

Query plan: count orders per user, bucket each user into 1, 2, or 3+ purchases with a CASE expression, then count the users in each bucket.


In [26]:
%%sql

WITH user_purchase_count AS (
    -- Inner join: users with no orders are excluded
    SELECT
        u.user_id,
        count(*) AS purchases
    FROM users u
    JOIN orders o ON o.user_id = u.user_id
    GROUP BY u.user_id
)

SELECT
    CASE WHEN purchases = 1 THEN '1'
         WHEN purchases = 2 THEN '2'
         ELSE '3+' END AS no_of_purchases,
    count(user_id) AS count_of_users
FROM user_purchase_count
GROUP BY no_of_purchases
ORDER BY no_of_purchases


3 rows affected.


no_of_purchases,count_of_users
1,25
2,22
3+,81
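
The three buckets add up to 25 + 22 + 81 = 128 users, two fewer than the 130 rows counted in `users`. That is consistent with the inner join dropping users who have no orders, though this notebook does not verify it directly. The bucketing itself can be sketched in Python as follows, using invented user ids and a hypothetical helper named `bucket_users`.

```python
from collections import Counter


def bucket_users(order_user_ids):
    """Count users per purchase bucket, given one user_id per order."""
    purchases = Counter(order_user_ids)  # number of orders per user

    def label(n):
        # Same branches as the CASE expression in the query
        if n == 1:
            return "1"
        if n == 2:
            return "2"
        return "3+"

    buckets = Counter(label(n) for n in purchases.values())
    return dict(sorted(buckets.items()))


# Hypothetical orders: a has 1, b has 2, c has 3, d has 4
sample = ["a", "b", "b", "c", "c", "c", "d", "d", "d", "d"]
print(bucket_users(sample))  # {'1': 1, '2': 1, '3+': 2}
```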


On average, how many unique sessions do we have per hour?

In [27]:
%%sql

WITH sessions_by_hour AS (
    SELECT
        COUNT(DISTINCT session_id) AS distinct_sessions,
        date_trunc('hour', created_at) AS hour_grouped
    FROM events
    WHERE created_at IS NOT NULL
    GROUP BY hour_grouped
)

SELECT ROUND(AVG(distinct_sessions), 2) AS avg_sessions_per_hour
FROM sessions_by_hour

1 rows affected.


avg_sessions_per_hour
7.27
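
One property of this query is worth spelling out: a session whose events span an hour boundary is counted once in each hour it touches, and hours with no events are left out of the average. The Python sketch below reproduces that logic on invented events. `avg_distinct_sessions_per_hour` is a hypothetical name, and the tuples stand in for `(session_id, created_at)`.

```python
from collections import defaultdict
from datetime import datetime


def avg_distinct_sessions_per_hour(events):
    """Average count of distinct session ids per hour bucket."""
    sessions_by_hour = defaultdict(set)
    for session_id, created_at in events:
        if created_at is None:  # mirrors WHERE created_at IS NOT NULL
            continue
        hour = created_at.replace(minute=0, second=0, microsecond=0)
        sessions_by_hour[hour].add(session_id)  # a set keeps ids distinct
    if not sessions_by_hour:
        return None
    counts = [len(ids) for ids in sessions_by_hour.values()]
    return round(sum(counts) / len(counts), 2)


# Hypothetical events
sample = [
    ("s1", datetime(2021, 2, 10, 10, 1)),
    ("s1", datetime(2021, 2, 10, 10, 20)),  # same session, same hour: counted once
    ("s2", datetime(2021, 2, 10, 10, 45)),
    ("s2", datetime(2021, 2, 10, 11, 5)),   # s2 spans two hours: counted in both
    ("s3", None),                           # no timestamp: ignored
]
print(avg_distinct_sessions_per_hour(sample))  # 1.5
```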
