### Table:
|visitor_id|page_name|visit_datetime|conversion_flag|
|:--------:|:--------:|:--------:|:--------:|
|123|A|11/1/2019 9:00:00|0|
|123|A|11/1/2019 9:20:00|1|
|123|B|11/1/2019 9:30:00|1|
|...|...|...|...|

### Questions:
* Find average conversion rate of visitors
* Find conversion rate by the first page of visit
* Find conversion rate by the last page of visit
* Find conversion rate by the number of pages a visitor goes to
* Find conversion rate by the page path users take

In [1]:
%load_ext sql

#### * Find average conversion rate of visitors
First, I determine whether each visitor *ultimately* converted by taking `MAX(conversion_flag)` per `visitor_id`. The conversion rate is then the number of visitors whose `conversion_flag` ends up as `1` divided by the total number of visitors. The final `SELECT` needs no `GROUP BY` because the CTE already returns one row per `visitor_id`.

In [None]:
%%sql postgresql://postgres:postgrepassword@localhost/
WITH conversion AS (
	SELECT
		visitor_id,
		MAX(conversion_flag) AS converted
	FROM
		visitor_table
	GROUP BY
		visitor_id)
SELECT
	-- 1.0 keeps the division from truncating to an integer
	SUM(CASE WHEN converted = 1 THEN 1.0 ELSE 0 END)/COUNT(*) AS Conversion_Rate
FROM
	conversion;
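
To check the arithmetic without a PostgreSQL server, here is a minimal sketch that runs the same CTE on an in-memory SQLite database. SQLite is only a stand-in engine, the ISO-formatted timestamps are my choice so that text comparison sorts correctly, and visitors `456` and `789` are hypothetical rows added for illustration. It computes the rate twice, once with integer operands and once with a decimal numerator.

```python
import sqlite3

# Visitor 123 mirrors the sample table; 456 and 789 are hypothetical.
ROWS = [
    (123, "A", "2019-11-01 09:00:00", 0),
    (123, "A", "2019-11-01 09:20:00", 1),
    (123, "B", "2019-11-01 09:30:00", 1),
    (456, "B", "2019-11-01 10:00:00", 0),
    (789, "A", "2019-11-01 11:00:00", 0),
    (789, "C", "2019-11-01 11:05:00", 0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visitor_table ("
    "visitor_id INTEGER, page_name TEXT, "
    "visit_datetime TEXT, conversion_flag INTEGER)"
)
conn.executemany("INSERT INTO visitor_table VALUES (?, ?, ?, ?)", ROWS)

QUERY = """
WITH conversion AS (
    SELECT visitor_id, MAX(conversion_flag) AS converted
    FROM visitor_table
    GROUP BY visitor_id)
SELECT
    -- integer / integer truncates
    SUM(CASE WHEN converted = 1 THEN 1 ELSE 0 END) / COUNT(*) AS int_rate,
    -- a decimal numerator keeps the fraction
    SUM(CASE WHEN converted = 1 THEN 1.0 ELSE 0 END) / COUNT(*) AS rate
FROM conversion
"""

int_rate, rate = conn.execute(QUERY).fetchone()
print(int_rate, round(rate, 4))
```

On this sample one of three visitors converts, so the integer version truncates to `0` while the decimal version gives roughly `0.33`.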


#### * Find conversion rate by the first page of visit
First, I get each visitor's first `visit_datetime` and whether the visitor *ultimately* converted. The final query joins this CTE back to the original table on the exact `visit_datetime` to recover the page of that first visit. Grouping by `page_name`, I calculate the `Conversion_Rate` the same way as in the first question.

In [None]:
%%sql postgresql://postgres:postgrepassword@localhost/
WITH
	conversion AS (
		SELECT
			visitor_id,
			MIN(visit_datetime) AS first_visit_date,
			MAX(conversion_flag) AS converted
		FROM
			visitor_table
		GROUP BY
			visitor_id)
SELECT
	vt.page_name AS page,
	SUM(CASE WHEN c.converted = 1 THEN 1.0 ELSE 0 END)/COUNT(*) AS Conversion_Rate
FROM
	conversion c
INNER JOIN
	visitor_table AS vt ON c.visitor_id = vt.visitor_id AND c.first_visit_date = vt.visit_datetime
GROUP BY
	vt.page_name
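
The join back on the first timestamp can be tried out on a few rows. This is a sketch under assumptions: SQLite stands in for PostgreSQL, timestamps are stored as ISO text, and visitors `456` and `789` are hypothetical. The `ORDER BY` is added only to make the output stable.

```python
import sqlite3

# Visitor 123 mirrors the sample table; 456 and 789 are hypothetical.
ROWS = [
    (123, "A", "2019-11-01 09:00:00", 0),
    (123, "A", "2019-11-01 09:20:00", 1),
    (123, "B", "2019-11-01 09:30:00", 1),
    (456, "B", "2019-11-01 10:00:00", 0),
    (789, "A", "2019-11-01 11:00:00", 0),
    (789, "C", "2019-11-01 11:05:00", 0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visitor_table ("
    "visitor_id INTEGER, page_name TEXT, "
    "visit_datetime TEXT, conversion_flag INTEGER)"
)
conn.executemany("INSERT INTO visitor_table VALUES (?, ?, ?, ?)", ROWS)

FIRST_PAGE_SQL = """
WITH conversion AS (
    SELECT visitor_id,
           MIN(visit_datetime) AS first_visit_date,
           MAX(conversion_flag) AS converted
    FROM visitor_table
    GROUP BY visitor_id)
SELECT vt.page_name AS page,
       SUM(CASE WHEN c.converted = 1 THEN 1.0 ELSE 0 END) / COUNT(*)
           AS conversion_rate
FROM conversion AS c
INNER JOIN visitor_table AS vt
    ON c.visitor_id = vt.visitor_id
   AND c.first_visit_date = vt.visit_datetime
GROUP BY vt.page_name
ORDER BY vt.page_name
"""

# Map each first page to the share of its visitors who converted.
first_page_rates = dict(conn.execute(FIRST_PAGE_SQL).fetchall())
print(first_page_rates)
```

Two sample visitors start on page `A` and one of them converts, while the only visitor starting on `B` does not.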


#### * Find conversion rate by the last page of visit
Using a similar method to the previous question, I take the `MAX()` of `visit_datetime` in the `last_page_visited` CTE, but only over rows where the `conversion_flag` is still `0`. This way I do not capture any navigation *after* the `conversion_flag` turns to `1`. The final query starts from this CTE, which should contain every `visitor_id` (assuming **no visitor** has `conversion_flag` set to `1` from their very first row), and joins it to a second CTE that only captures whether each `visitor_id` converted.

In [None]:
%%sql postgresql://postgres:postgrepassword@localhost/
WITH
	last_page_visited AS (
		SELECT
			visitor_id,
			MAX(visit_datetime) AS last_visit_date
		FROM
			visitor_table
		WHERE
			conversion_flag = 0
		GROUP BY
			visitor_id),
	conversion AS (
		SELECT
			visitor_id,
			MAX(conversion_flag) AS converted
		FROM
			visitor_table
		GROUP BY
			visitor_id)
SELECT
	vt.page_name,
	SUM(CASE WHEN c.converted = 1 THEN 1.0 ELSE 0 END)/COUNT(*) AS Conversion_Rate
FROM
	last_page_visited AS lp
LEFT JOIN	
	visitor_table AS vt ON lp.visitor_id = vt.visitor_id AND lp.last_visit_date = vt.visit_datetime
LEFT JOIN
	conversion AS c ON lp.visitor_id = c.visitor_id
GROUP BY
	vt.page_name
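
The assumption stated above matters: a visitor whose very first row already has `conversion_flag = 1` never enters `last_page_visited` and silently drops out of the result. A small sketch makes this visible, with SQLite standing in for PostgreSQL and with hypothetical visitors `456` and `999` (the latter converts on its only row). The extra `visitors` column is my addition for the check.

```python
import sqlite3

# Visitor 123 mirrors the sample table; 456 and 999 are hypothetical.
ROWS = [
    (123, "A", "2019-11-01 09:00:00", 0),
    (123, "A", "2019-11-01 09:20:00", 1),
    (123, "B", "2019-11-01 09:30:00", 1),
    (456, "B", "2019-11-01 10:00:00", 0),
    (999, "C", "2019-11-01 12:00:00", 1),  # flag is 1 from the first row
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visitor_table ("
    "visitor_id INTEGER, page_name TEXT, "
    "visit_datetime TEXT, conversion_flag INTEGER)"
)
conn.executemany("INSERT INTO visitor_table VALUES (?, ?, ?, ?)", ROWS)

LAST_PAGE_SQL = """
WITH last_page_visited AS (
    SELECT visitor_id, MAX(visit_datetime) AS last_visit_date
    FROM visitor_table
    WHERE conversion_flag = 0
    GROUP BY visitor_id),
conversion AS (
    SELECT visitor_id, MAX(conversion_flag) AS converted
    FROM visitor_table
    GROUP BY visitor_id)
SELECT vt.page_name,
       SUM(CASE WHEN c.converted = 1 THEN 1.0 ELSE 0 END) / COUNT(*)
           AS conversion_rate,
       COUNT(*) AS visitors
FROM last_page_visited AS lp
LEFT JOIN visitor_table AS vt
    ON lp.visitor_id = vt.visitor_id
   AND lp.last_visit_date = vt.visit_datetime
LEFT JOIN conversion AS c ON lp.visitor_id = c.visitor_id
GROUP BY vt.page_name
ORDER BY vt.page_name
"""

last_page_rows = conn.execute(LAST_PAGE_SQL).fetchall()
total_visitors = conn.execute(
    "SELECT COUNT(DISTINCT visitor_id) FROM visitor_table"
).fetchone()[0]

# Compare how many visitors the result covers with how many exist.
counted = sum(row[2] for row in last_page_rows)
print(last_page_rows, counted, total_visitors)
```

If `counted` is lower than `total_visitors`, some visitors had no row with `conversion_flag = 0`, and the assumption does not hold for that data.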


#### * Find conversion rate by the number of pages a visitor goes to
As in the previous question, I create a CTE that takes the `COUNT()` of pages visited while the `conversion_flag` is still `0`. Note that `COUNT(*)` counts rows, so repeat views of the same page are each counted. I then join this CTE to another one that captures whether each visitor converted.

In [None]:
%%sql postgresql://postgres:postgrepassword@localhost/
WITH
	count_page_visited AS (
		SELECT
			COUNT(*) AS total_page_visited,
			visitor_id
		FROM
			visitor_table
		WHERE
			conversion_flag = 0
		GROUP BY
			visitor_id),
	conversion AS (
		SELECT
			visitor_id,
			MAX(conversion_flag) AS converted
		FROM
			visitor_table
		GROUP BY
			visitor_id)
SELECT
	cpv.total_page_visited,
	SUM(CASE WHEN c.converted = 1 THEN 1.0 ELSE 0 END)/COUNT(*) AS Conversion_Rate
FROM
	count_page_visited AS cpv
LEFT JOIN
	conversion AS c ON cpv.visitor_id = c.visitor_id
GROUP BY
	cpv.total_page_visited
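
Whether "number of pages" means page views or distinct pages changes the buckets. The sketch below runs the query both ways on an in-memory SQLite database (a stand-in for PostgreSQL). Visitors `456` and `789` are hypothetical, and the helper `rate_by_page_count` is a name I made up for this illustration.

```python
import sqlite3

# Visitor 123 mirrors the sample table; 456 and 789 are hypothetical.
# Visitor 789 views page A twice, so views and distinct pages differ.
ROWS = [
    (123, "A", "2019-11-01 09:00:00", 0),
    (123, "A", "2019-11-01 09:20:00", 1),
    (123, "B", "2019-11-01 09:30:00", 1),
    (456, "B", "2019-11-01 10:00:00", 0),
    (789, "A", "2019-11-01 11:00:00", 0),
    (789, "C", "2019-11-01 11:05:00", 0),
    (789, "A", "2019-11-01 11:10:00", 0),
]

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE visitor_table ("
    "visitor_id INTEGER, page_name TEXT, "
    "visit_datetime TEXT, conversion_flag INTEGER)"
)
conn.executemany("INSERT INTO visitor_table VALUES (?, ?, ?, ?)", ROWS)


def rate_by_page_count(count_expr):
    """Return {page_count: conversion_rate}.

    count_expr is interpolated into the SQL text, so only pass
    trusted literals such as the two used below.
    """
    sql = f"""
    WITH count_page_visited AS (
        SELECT {count_expr} AS total_page_visited, visitor_id
        FROM visitor_table
        WHERE conversion_flag = 0
        GROUP BY visitor_id),
    conversion AS (
        SELECT visitor_id, MAX(conversion_flag) AS converted
        FROM visitor_table
        GROUP BY visitor_id)
    SELECT cpv.total_page_visited,
           SUM(CASE WHEN c.converted = 1 THEN 1.0 ELSE 0 END) / COUNT(*)
    FROM count_page_visited AS cpv
    LEFT JOIN conversion AS c ON cpv.visitor_id = c.visitor_id
    GROUP BY cpv.total_page_visited
    """
    return dict(conn.execute(sql).fetchall())


by_views = rate_by_page_count("COUNT(*)")
by_distinct = rate_by_page_count("COUNT(DISTINCT page_name)")
print(by_views, by_distinct)
```

With these rows, visitor `789` lands in the 3-page bucket when counting views and in the 2-page bucket when counting distinct pages. Which one is right depends on how the question is meant.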


#### * Find conversion rate by the page path users take
This one is a little tricky because I'm not sure whether `ARRAY_TO_STRING()` and `ARRAY_AGG()` are available on other engines; I used PostgreSQL. Again, I capture the path only while the `conversion_flag` is still at `0`.

In [None]:
%%sql postgresql://postgres:postgrepassword@localhost/
WITH
	path AS (
		SELECT
			-- ORDER BY inside the aggregate keeps pages in visit order
			ARRAY_TO_STRING(ARRAY_AGG(page_name ORDER BY visit_datetime), ' -> ') AS nav_path,
			visitor_id
		FROM
			visitor_table
		WHERE
			conversion_flag = 0
		GROUP BY
			visitor_id
	),
	conversion AS (
		SELECT
			visitor_id,
			MAX(conversion_flag) AS converted
		FROM
			visitor_table
		GROUP BY
			visitor_id)
SELECT
	p.nav_path,
	SUM(CASE WHEN c.converted = 1 THEN 1.0 ELSE 0 END)/COUNT(*) AS Conversion_Rate
FROM
	path AS p
LEFT JOIN
	conversion AS c ON p.visitor_id = c.visitor_id
GROUP BY
	p.nav_path
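
For engines without array aggregates, the same path logic can be sketched in plain Python. This is an illustration, not a replacement for the query above: the rows are deliberately given out of chronological order to show that the path depends on sorting by `visit_datetime`, visitors `456` and `789` are hypothetical, and `conversion_rate_by_path` is a made-up helper name.

```python
from collections import defaultdict
from itertools import groupby
from operator import itemgetter

# (visitor_id, page_name, visit_datetime, conversion_flag)
# Visitor 123 mirrors the sample table; 456 and 789 are hypothetical.
ROWS = [
    (789, "C", "2019-11-01 11:05:00", 0),
    (123, "B", "2019-11-01 09:30:00", 1),
    (123, "A", "2019-11-01 09:00:00", 0),
    (789, "A", "2019-11-01 11:00:00", 0),
    (123, "A", "2019-11-01 09:20:00", 1),
    (456, "A", "2019-11-01 10:00:00", 0),
]


def conversion_rate_by_path(rows):
    """Return {path: conversion_rate} using pre-conversion pages only."""
    # Sort by visitor, then by time; ISO timestamps sort correctly as text.
    ordered = sorted(rows, key=itemgetter(0, 2))
    totals = defaultdict(lambda: [0, 0])  # path -> [converted, visitors]
    for _, visits in groupby(ordered, key=itemgetter(0)):
        visits = list(visits)
        # Keep only pages seen while conversion_flag was still 0.
        path = " -> ".join(page for _, page, _, flag in visits if flag == 0)
        if not path:
            # Like the WHERE clause in the query: no flag-0 rows, no path.
            continue
        converted = max(flag for *_, flag in visits)
        totals[path][0] += converted
        totals[path][1] += 1
    return {path: conv / n for path, (conv, n) in totals.items()}


path_rates = conversion_rate_by_path(ROWS)
print(path_rates)
```

Here visitors `123` and `456` share the path `A` and one of them converts, while visitor `789` follows `A -> C` without converting.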