In [38]:
%reload_ext sql

# Connect the notebook to the database
%sql postgresql://user:password@localhost:5432/postgres


### 1. Select the first 10 rows from the events table

In [39]:
%%sql
SELECT * FROM events LIMIT 10;

 * postgresql://user:***@localhost:5432/postgres
10 rows affected.


app_id,event_type,timestamp
7,impression,2020-02-25 16:00:00
2,click,2020-05-15 14:01:00
9,impression,2020-07-22 15:00:00
4,click,2021-06-10 17:01:00
5,impression,2021-01-07 08:00:00
6,click,2020-05-12 19:01:00
1,click,2021-03-09 13:03:00
8,impression,2021-01-05 10:00:00
3,click,2021-06-01 09:02:00
9,impression,2021-02-02 08:05:00


### 2. Select all 'click' events and order them by app_id

In [44]:
%%sql
SELECT * FROM events 
WHERE event_type = 'click'
ORDER BY app_id;

 * postgresql://user:***@localhost:5432/postgres
20 rows affected.


app_id,event_type,timestamp
1,click,2021-03-09 13:03:00
1,click,2020-01-05 10:01:00
1,click,2020-01-01 10:01:00
2,click,2020-05-15 14:01:00
2,click,2020-02-02 08:06:00
2,click,2020-04-25 15:02:00
2,click,2020-03-30 18:02:00
2,click,2020-05-17 14:01:00
3,click,2021-06-01 09:02:00
3,click,2020-04-22 13:01:00


### 3. Write a query to return the impressions that happened in 2020, along with the hour, minute and second of each, ordered by app_id

In [50]:
%%sql
SELECT app_id,
    event_type,
    timestamp,
    EXTRACT(HOUR FROM timestamp) AS hour,
    EXTRACT(MINUTE FROM timestamp) AS minute,
    EXTRACT(SECOND FROM timestamp) AS second
FROM events
WHERE EXTRACT(YEAR FROM timestamp) = 2020
    AND event_type = 'impression'
ORDER BY app_id;


 * postgresql://user:***@localhost:5432/postgres
4 rows affected.


app_id,event_type,timestamp,hour,minute,second
4,impression,2020-03-30 18:01:00,18,1,0.0
6,impression,2020-03-14 09:00:00,9,0,0.0
7,impression,2020-02-25 16:00:00,16,0,0.0
9,impression,2020-07-22 15:00:00,15,0,0.0


### 4. Write a query to count impressions per app_id, ordered by app_id descending.

In [55]:
%%sql 
SELECT app_id,
    SUM(CASE WHEN event_type = 'impression' THEN 1 ELSE 0 END) as count_impressions
FROM events
GROUP BY app_id
ORDER BY app_id DESC;

 * postgresql://user:***@localhost:5432/postgres
9 rows affected.


app_id,count_impressions
9,5
8,3
7,4
6,3
5,4
4,5
3,2
2,1
1,3


### 5. Extract year and month from timestamp and count events per month.

In [74]:
%%sql 
SELECT EXTRACT(YEAR FROM timestamp) as year,
    EXTRACT(MONTH FROM timestamp) as month,
    COUNT(*) AS events_month
FROM events
GROUP BY year, month
ORDER BY year, month;

 * postgresql://user:***@localhost:5432/postgres
13 rows affected.


year,month,events_month
2020,1,4
2020,2,4
2020,3,4
2020,4,2
2020,5,3
2020,7,1
2021,1,5
2021,2,8
2021,3,7
2021,4,3


### 6. Show each app_id with more than 3 impressions, ordered by impressions

In [83]:
%%sql
SELECT app_id, SUM(CASE WHEN event_type = 'impression' THEN 1 ELSE 0 END) as impressions
FROM events
GROUP BY app_id
HAVING SUM(CASE WHEN event_type = 'impression' THEN 1 ELSE 0 END) > 3 
ORDER BY impressions;

 * postgresql://user:***@localhost:5432/postgres
4 rows affected.


app_id,impressions
5,4
7,4
9,5
4,5


### 7. Extract the year from timestamp and count total events per year.

In [91]:
%%sql
SELECT EXTRACT(YEAR FROM timestamp) as year,
    COUNT(*) AS event_count
FROM events
GROUP BY EXTRACT(YEAR FROM timestamp);

 * postgresql://user:***@localhost:5432/postgres
2 rows affected.


year,event_count
2021,32
2020,18


### 8. Write a query to group by app_id and event_type and count events. Order by app_id

In [96]:
%%sql 
SELECT app_id, COUNT(event_type) as events
FROM events
GROUP BY app_id, event_type
ORDER BY app_id;

 * postgresql://user:***@localhost:5432/postgres
18 rows affected.


app_id,events
1,3
1,3
2,1
2,5
3,3
3,2
4,5
4,1
5,4
5,1
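
The output above does not say which row is the click count and which is the impression count, because `event_type` is grouped on but not selected. Below is a minimal sketch of a labeled variant, assuming Python's built-in `sqlite3` module as a stand-in for the PostgreSQL database; the sample rows and the name `labeled_counts` are hypothetical and are not the notebook's data.

```python
import sqlite3

# Hypothetical sample rows (app_id, event_type); not the notebook's actual data.
rows = [
    (1, "click"), (1, "click"), (1, "impression"),
    (2, "impression"), (2, "impression"), (2, "click"),
]

# Assumption: an in-memory SQLite database stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (app_id INTEGER, event_type TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)", rows)

# Selecting event_type next to the count labels each group.
labeled_counts = conn.execute(
    """
    SELECT app_id, event_type, COUNT(*) AS events
    FROM events
    GROUP BY app_id, event_type
    ORDER BY app_id, event_type
    """
).fetchall()

for row in labeled_counts:
    print(row)
```

Ordering by `event_type` as well as `app_id` keeps the two rows of each app in a stable order.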


### 9. Get the clickthrough rate per app in 2021.


   <p align="center">
  <img src="../imgs/ctr.png" alt="ctr" width="500" />
</p>

#### Write input and output of the query

Logical execution order of SQL clauses:
“Frida Went Gathering Her Small Delicious Orange Lollipops”

F → FROM
W → WHERE
G → GROUP BY
H → HAVING
S → SELECT
D → DISTINCT
O → ORDER BY
L → LIMIT
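
One consequence of this order is that WHERE filters individual rows before grouping, while HAVING filters whole groups after aggregation. The sketch below illustrates this, assuming Python's built-in `sqlite3` module as a stand-in for PostgreSQL; the sample rows and the name `where_then_having` are hypothetical.

```python
import sqlite3

# Hypothetical sample rows (app_id, event_type); not the notebook's actual data.
rows = [
    (1, "click"), (1, "impression"), (1, "impression"),
    (2, "impression"), (2, "click"), (2, "click"),
    (3, "impression"),
]

# Assumption: an in-memory SQLite database stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE events (app_id INTEGER, event_type TEXT)")
conn.executemany("INSERT INTO events VALUES (?, ?)", rows)

where_then_having = conn.execute(
    """
    SELECT app_id, COUNT(*) AS impressions
    FROM events
    WHERE event_type = 'impression'   -- row filter, applied before grouping
    GROUP BY app_id
    HAVING COUNT(*) > 1               -- group filter, applied after aggregation
    ORDER BY app_id
    LIMIT 5
    """
).fetchall()

print(where_then_having)  # prints [(1, 2)]
```

Apps 2 and 3 each keep one impression row after the WHERE step, so their groups exist but are dropped by HAVING.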

Input: the three columns of events (app_id, event_type, timestamp).
Output: app_id, the number of events where event_type = 'click', the number of events where event_type = 'impression', and ctr = (clicks / impressions) * 100.

Steps:
1. from events
2. where the year of timestamp is 2021
3. group by app_id
4. select app_id, SUM(CASE WHEN event_type = 'click' ...), SUM(CASE WHEN event_type = 'impression' ...), and ctr as defined above



In [40]:
%%sql 
SELECT app_id, 
    SUM(CASE WHEN event_type = 'click' THEN 1 ELSE 0 END) as total_clicks,
    SUM(CASE WHEN event_type = 'impression' THEN 1 ELSE 0 END) as total_impressions,
    ROUND(
        (SUM(CASE WHEN event_type = 'click' THEN 1 ELSE 0 END)::numeric
        / NULLIF(SUM(CASE WHEN event_type = 'impression' THEN 1 ELSE 0 END),0))*100, 2
    ) as ctr
FROM events
WHERE EXTRACT(YEAR FROM timestamp) = 2021
GROUP BY app_id
ORDER BY app_id;

 * postgresql://user:***@localhost:5432/postgres
9 rows affected.


app_id,total_clicks,total_impressions,ctr
1,1,3,33.33
2,0,1,0.0
3,1,2,50.0
4,1,4,25.0
5,0,4,0.0
6,1,2,50.0
7,1,3,33.33
8,1,3,33.33
9,0,4,0.0
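
The query above uses a `::numeric` cast and `NULLIF(..., 0)`. The sketch below shows why, assuming Python's built-in `sqlite3` module as a stand-in for PostgreSQL, with `CAST(... AS REAL)` in place of `::numeric`. The `totals` table and its rows are hypothetical. One difference to keep in mind: SQLite returns NULL on division by zero even without the guard, whereas PostgreSQL raises a division-by-zero error, so `NULLIF` matters more there.

```python
import sqlite3

# Hypothetical per-app totals (app_id, clicks, impressions); not the notebook's data.
totals = [(1, 1, 3), (2, 0, 1), (3, 2, 0)]

# Assumption: an in-memory SQLite database stands in for PostgreSQL.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE totals (app_id INTEGER, clicks INTEGER, impressions INTEGER)"
)
conn.executemany("INSERT INTO totals VALUES (?, ?, ?)", totals)

ctr_rows = conn.execute(
    """
    SELECT app_id,
        -- integer / integer truncates toward zero, so 1 / 3 becomes 0
        clicks / NULLIF(impressions, 0) AS integer_ratio,
        -- casting the numerator first keeps the fractional part
        ROUND(CAST(clicks AS REAL) / NULLIF(impressions, 0) * 100, 2) AS ctr
    FROM totals
    ORDER BY app_id
    """
).fetchall()

for row in ctr_rows:
    print(row)
```

For app 3, `NULLIF` turns the zero denominator into NULL, so both computed columns come back as `None` instead of a number.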
