# E-Commerce Analytics - Maven Fuzzy Factory

## 3. User Behaviour Analytics

This section analyses user engagement and user experience to understand customer satisfaction and loyalty.

In [2]:
%%sql

-- Time difference between first and second repeat sessions

-- STEP 1: Identify first sessions
-- STEP 2: Identify second sessions (repeat sessions)
-- STEP 3: Aggregate the days from first to second sessions
-- STEP 4: Find the average, min and max days from first to second sessions

-- STEP 1: Identify first sessions
DROP TABLE IF EXISTS first_sessions;
CREATE TEMPORARY TABLE first_sessions
SELECT
    created_at AS first_created_at,
    website_session_id AS first_session_id,
    user_id
FROM website_sessions
WHERE created_at > '2014-01-01'
    AND created_at <= '2014-11-03'
    AND is_repeat_session = 0 ; -- first sessions

-- STEP 2: Identify second sessions (repeat sessions)    
DROP TABLE IF EXISTS second_sessions;
CREATE TEMPORARY TABLE second_sessions    
SELECT 
    website_sessions.created_at AS second_created_at,
    website_sessions.user_id,
    website_sessions.website_session_id AS second_session_id
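FROM
    first_sessions
    -- The WHERE filter on website_sessions below discards unmatched rows,
    -- so this LEFT JOIN behaves like an INNER JOIN. Every repeat session
    -- in the window is kept, not only each user's second session.
    LEFT JOIN website_sessions
    ON first_sessions.user_id = website_sessions.user_id
    AND website_sessions.is_repeat_session = 1
    AND website_sessions.website_session_id > first_sessions.first_session_id
WHERE website_sessions.created_at > '2014-01-01'
    AND website_sessions.created_at <= '2014-11-03';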

-- STEP 3: Aggregate the days from first to second sessions
DROP TABLE IF EXISTS users_first_to_second;
CREATE TEMPORARY TABLE users_first_to_second
SELECT
    user_id,
    DATEDIFF(second_created_at, first_created_at) AS days_first_to_second_session
FROM (
SELECT
    first_sessions.user_id,
    first_sessions.first_created_at,
    first_sessions.first_session_id,
    second_sessions.second_created_at,
    second_sessions.second_session_id
FROM 
    first_sessions
    LEFT JOIN second_sessions
    ON first_sessions.user_id = second_sessions.user_id
) AS first_second;

-- STEP 4: Find the average, min and max days from first to second sessions
SELECT
    AVG(days_first_to_second_session) AS avg_days_first_to_second,
    MIN(days_first_to_second_session) AS min_days_first_to_second,
    MAX(days_first_to_second_session) AS max_days_first_to_second
FROM users_first_to_second;

 * mysql+pymysql://root:***@127.0.0.1/mavenfuzzyfactory
0 rows affected.
146514 rows affected.
0 rows affected.
28907 rows affected.
0 rows affected.
156238 rows affected.
1 rows affected.


| avg_days_first_to_second | min_days_first_to_second | max_days_first_to_second |
|---|---|---|
| 42.4391 | 1 | 124 |


Insight: On average, returning customers take a little over a month (about 42 days) to come back to the website, with gaps ranging from 1 to 124 days. This might suggest a roughly monthly purchase cycle for the goods, and the longest gap of 124 days could indicate a seasonal pattern. There are opportunities to improve customer engagement and encourage more frequent visits.
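The query above keeps every repeat session in the window, and the row counts show 156,238 joined rows against 146,514 first sessions, so users with several repeat visits contribute more than one gap to the average. A minimal sketch of restricting the calculation to each user's earliest repeat session follows. It is an illustration under stated assumptions, not the notebook's method: it uses an in-memory SQLite database with a hypothetical three-user sample in place of the MySQL `website_sessions` table, and SQLite's `julianday` in place of MySQL's `DATEDIFF`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE website_sessions (
        website_session_id INTEGER PRIMARY KEY,
        created_at TEXT,
        user_id INTEGER,
        is_repeat_session INTEGER
    )
    """
)
# Hypothetical sample rows, not taken from the Maven Fuzzy Factory data.
conn.executemany(
    "INSERT INTO website_sessions VALUES (?, ?, ?, ?)",
    [
        (1, "2014-01-05 10:00:00", 100, 0),
        (2, "2014-01-15 09:00:00", 100, 1),  # second session, 10 days later
        (3, "2014-02-04 09:00:00", 100, 1),  # third session, must be ignored
        (4, "2014-01-10 12:00:00", 200, 0),
        (5, "2014-01-30 12:00:00", 200, 1),  # second session, 20 days later
        (6, "2014-03-01 08:00:00", 300, 0),  # never returns
    ],
)

query = """
WITH first_sessions AS (
    SELECT user_id,
           website_session_id AS first_session_id,
           created_at AS first_created_at
    FROM website_sessions
    WHERE is_repeat_session = 0
),
second_sessions AS (
    -- MIN() keeps only the earliest repeat session per user
    SELECT f.user_id,
           f.first_created_at,
           MIN(r.website_session_id) AS second_session_id
    FROM first_sessions AS f
    JOIN website_sessions AS r
      ON r.user_id = f.user_id
     AND r.is_repeat_session = 1
     AND r.website_session_id > f.first_session_id
    GROUP BY f.user_id, f.first_created_at
),
gaps AS (
    -- date() drops the time of day so the gap counts whole calendar days
    SELECT julianday(date(w.created_at))
           - julianday(date(s.first_created_at)) AS days_first_to_second
    FROM second_sessions AS s
    JOIN website_sessions AS w
      ON w.website_session_id = s.second_session_id
)
SELECT AVG(days_first_to_second),
       MIN(days_first_to_second),
       MAX(days_first_to_second)
FROM gaps
"""

avg_days, min_days, max_days = conn.execute(query).fetchone()
print(avg_days, min_days, max_days)
```

In the sample, user 100's third session and user 300, who never returns, are both excluded, so only the two first-to-second gaps enter the aggregate.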

In [5]:
%%sql

-- User engagement analysis by channel types

SELECT
    channel_group,
    COUNT(DISTINCT CASE WHEN is_repeat_session = 0 THEN website_session_id ELSE NULL END) AS new_sessions,
    COUNT(DISTINCT CASE WHEN is_repeat_session = 1 THEN website_session_id ELSE NULL END) AS repeat_sessions
FROM (
SELECT
    website_session_id,
    is_repeat_session,
    CASE
        WHEN utm_source IS NULL AND http_referer IN ('https://www.gsearch.com','https://www.bsearch.com') THEN 'organic_search'
        WHEN utm_campaign = 'brand' THEN 'paid_brand'
        WHEN utm_campaign = 'nonbrand' THEN 'paid_nonbrand'
        WHEN utm_source = 'socialbook' THEN 'paid_social'
        WHEN utm_source IS NULL AND http_referer IS NULL THEN 'direct_type_in'
    END AS channel_group
FROM website_sessions
WHERE created_at >= '2014-01-01'
    AND created_at < '2014-11-05'
) AS channel_mix
GROUP BY 1;

 * mysql+pymysql://root:***@127.0.0.1/mavenfuzzyfactory
5 rows affected.


| channel_group | new_sessions | repeat_sessions |
|---|---|---|
| direct_type_in | 6591 | 10564 |
| organic_search | 7139 | 11507 |
| paid_brand | 6432 | 11027 |
| paid_nonbrand | 119950 | 0 |
| paid_social | 7652 | 0 |


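Insight: The Paid Nonbrand channel drives the large majority of first-time visits (119,950 new sessions) but records no repeat sessions, and the same holds for Paid Social (7,652 new, 0 repeat). All repeat sessions arrive through Direct Type-in, Organic Search and Paid Brand, each with roughly 10,500 to 11,500 repeat sessions, which points to solid brand recognition and loyalty among returning users. Because each session is grouped by its own channel rather than by the channel that first acquired the user, these figures show how users come back, not which acquisition channel retains best. Optimising marketing initiatives to improve retention among users acquired through paid channels could be a key focus moving forward.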
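To make the channel comparison concrete, the share of each channel's sessions that are repeat sessions can be computed from the reported counts. The sketch below is plain arithmetic on the figures in the result above; the helper name `repeat_share` is introduced here for illustration only.

```python
# (new_sessions, repeat_sessions) per channel, copied from the query result.
channel_sessions = {
    "direct_type_in": (6591, 10564),
    "organic_search": (7139, 11507),
    "paid_brand": (6432, 11027),
    "paid_nonbrand": (119950, 0),
    "paid_social": (7652, 0),
}


def repeat_share(new_sessions: int, repeat_sessions: int) -> float:
    """Fraction of a channel's sessions that are repeat sessions."""
    total = new_sessions + repeat_sessions
    return repeat_sessions / total if total else 0.0


shares = {
    channel: repeat_share(new, rep)
    for channel, (new, rep) in channel_sessions.items()
}
total_repeat = sum(rep for _, rep in channel_sessions.values())

# Print channels from the highest repeat share to the lowest.
for channel, share in sorted(shares.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{channel:<15} {share:6.1%}")
print(f"total repeat sessions: {total_repeat}")
```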

In [6]:
%%sql

-- Conversion Rate and Sales by repeat sessions

SELECT
    website_sessions.is_repeat_session AS repeat_sessions,
    COUNT(DISTINCT website_sessions.website_session_id) AS sessions,
    COUNT(DISTINCT orders.order_id)/COUNT(DISTINCT website_sessions.website_session_id) AS conv_rt,
    SUM(orders.price_usd)/COUNT(DISTINCT website_sessions.website_session_id) AS rev_per_session
FROM website_sessions
    LEFT JOIN orders
    ON orders.website_session_id = website_sessions.website_session_id
WHERE website_sessions.created_at > '2014-01-01'
    AND website_sessions.created_at <= '2014-11-08'
GROUP BY 1;

 * mysql+pymysql://root:***@127.0.0.1/mavenfuzzyfactory
2 rows affected.


| repeat_sessions | sessions | conv_rt | rev_per_session |
|---|---|---|---|
| 0 | 149787 | 0.068 | 4.343754 |
| 1 | 33577 | 0.0811 | 5.168828 |


Insight: Repeat sessions convert at a higher rate than new sessions (8.11% versus 6.80%) and generate more revenue per session ($5.17 versus $4.34).
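The size of the gap is easier to judge as a relative lift. The sketch below is plain arithmetic on the two result rows above; the helper name `relative_lift` is introduced here for illustration only.

```python
# Metrics copied from the query result (is_repeat_session = 0 and 1).
new_sessions = {"conv_rt": 0.068, "rev_per_session": 4.343754}
repeat_sessions = {"conv_rt": 0.0811, "rev_per_session": 5.168828}


def relative_lift(baseline: float, value: float) -> float:
    """Relative change of value over baseline, e.g. 0.10 means +10%."""
    return value / baseline - 1


conv_lift = relative_lift(new_sessions["conv_rt"], repeat_sessions["conv_rt"])
rev_lift = relative_lift(
    new_sessions["rev_per_session"], repeat_sessions["rev_per_session"]
)

print(f"conversion rate lift: {conv_lift:.1%}")
print(f"revenue per session lift: {rev_lift:.1%}")
```

Both metrics come out roughly 19% higher for repeat sessions. This is a descriptive comparison only, with no significance test applied.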