# Business Questions (Part 1)

In [None]:
import os
from pathlib import Path

from dotenv import load_dotenv
from sqlalchemy import create_engine
from snowflake.sqlalchemy import URL

In [None]:
PROJ_ROOT = Path().resolve().parents[3]
env_file_dir = PROJ_ROOT / '.env'
_ = load_dotenv(env_file_dir, verbose=True)
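Because every connection setting in this notebook comes from the environment, it can help to confirm that the variables are present before connecting. The following is a minimal sketch, not part of the original workflow: the helper name `missing_env_vars` is hypothetical, and the variable names are the ones read elsewhere in this notebook.

```python
import os

# Environment variables this notebook reads when building the connection.
REQUIRED_VARS = [
    "UPLIMIT_SNOWFLAKE_ACCOUNT",
    "UPLIMIT_SNOWFLAKE_USER",
    "UPLIMIT_SNOWFLAKE_PASS",
    "UPLIMIT_SNOWFLAKE_WAREHOUSE",
    "UPLIMIT_SNOWFLAKE_ROLE",
    "UPLIMIT_SNOWFLAKE_DB_NAME",
    "UPLIMIT_SNOWFLAKE_SCHEMA",
]


def missing_env_vars(names, environ=os.environ):
    """Return the names that are unset or empty in `environ` (hypothetical helper)."""
    return [name for name in names if not environ.get(name)]


# Names still missing after load_dotenv has run; an empty list means all are set.
missing = missing_env_vars(REQUIRED_VARS)
```

If `missing` is non-empty, the `.env` file path or its contents are the first things to check.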

## About

Answer the Part 1 business questions about the Greenery e-commerce data, using the dbt models created in my personal Snowflake schema for the week 2 project.

### Notes

1. This notebook supports <kbd>Run</kbd> > <kbd>Run All Cells</kbd>.

## User Inputs

In [None]:
schema_name = os.getenv("UPLIMIT_SNOWFLAKE_SCHEMA")

In [None]:
engine = create_engine(
    URL(
        drivername="driver",
        account=os.getenv("UPLIMIT_SNOWFLAKE_ACCOUNT"),
        user=os.getenv("UPLIMIT_SNOWFLAKE_USER"),
        password=os.getenv("UPLIMIT_SNOWFLAKE_PASS"),
        warehouse=os.getenv("UPLIMIT_SNOWFLAKE_WAREHOUSE"),
        role=os.getenv("UPLIMIT_SNOWFLAKE_ROLE"),
        database=os.getenv("UPLIMIT_SNOWFLAKE_DB_NAME"),
        schema=schema_name,
    )
)

## Connect

Load Jupyter SQL extension

In [None]:
%load_ext sql

Set the maximum number of rows to be displayed to `None` (shows all rows)

In [None]:
%config SqlMagic.displaylimit = None

Connect to the Snowflake database

In [None]:
%sql engine --alias connection

## Queries

### Question 1

**What is our user repeat rate? Repeat Rate = Users who purchased 2 or more times / users who purchased**

In [None]:
%%sql
WITH repeat_purchasers AS (
    SELECT user_id,
           COUNT(*) AS num_orders,
           True AS is_repeat_purchaser
    FROM stg_postgres_orders
    GROUP BY user_id
    HAVING num_orders >= 2
),
non_repeat_purchasers AS (
    SELECT user_id,
           COUNT(*) AS num_orders,
           False AS is_repeat_purchaser
    FROM stg_postgres_orders
    GROUP BY user_id
    HAVING num_orders = 1
),
repeat_purchaser_total AS (
    SELECT COUNT(*) AS num_repeat_purchasers
    FROM repeat_purchasers
),
non_repeat_purchaser_total AS (
    SELECT COUNT(*) AS num_non_repeat_purchasers
    FROM non_repeat_purchasers
),
repeat_purchaser_rate AS (
    SELECT *,
           (num_non_repeat_purchasers+num_repeat_purchasers) AS purchasers,
           100*num_repeat_purchasers/purchasers AS repeat_purchaser_rate
    FROM repeat_purchaser_total rp
    CROSS JOIN non_repeat_purchaser_total nrp
)
SELECT *
FROM repeat_purchaser_rate
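As a cross-check on the query above, the same ratio can be sketched in plain Python. This is a minimal illustration, assuming an in-memory list of `user_id` values with one entry per order; the sample IDs and the helper name `repeat_rate` are hypothetical and do not come from the Greenery data.

```python
from collections import Counter


def repeat_rate(order_user_ids):
    """Return the percentage of purchasers who placed two or more orders."""
    # Count orders per user; every user counted here purchased at least once.
    orders_per_user = Counter(order_user_ids)
    purchasers = len(orders_per_user)
    if purchasers == 0:
        return 0.0
    repeat_purchasers = sum(1 for n in orders_per_user.values() if n >= 2)
    return 100 * repeat_purchasers / purchasers


# Hypothetical sample: users "a" and "b" ordered more than once, "c" and "d" once.
sample_orders = ["a", "a", "b", "c", "b", "d", "a"]
sample_rate = repeat_rate(sample_orders)
```

Here two of the four purchasers are repeat purchasers, so `sample_rate` is 50.0. Counting each user once in the denominator matches the definition in the question: users who purchased, not total orders.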

### Question 2

**What are good indicators of a user who will likely purchase again? What about indicators of users who are likely NOT to purchase again? If you had more data, what features would you want to look into to answer this question? NOTE: This is a hypothetical question vs. something we can analyze in our Greenery data set. Think about what exploratory analysis you would do to approach this question.**

#### Indicators of Users Who are Likely to Purchase Again

1. User has purchased frequently in the trailing 3, 6, or 12 months
2. Whether the user writes positive product reviews or leaves positive ratings (if applicable)
   - if product ratings can be left on the Greenery store site, then users who have previously purchased and who write and/or react positively to (e.g. like, thumbs up, etc.) those ratings are more likely to purchase again
3. Whether users are subscribers
   - similar to [Amazon's *Subscribe & Save Subscription*](https://www.amazon.ca/gp/help/customer/display.html?nodeId=GJ2LTMLFGGMH67M7)
4. Whether users frequently participate in promotions

#### Indicators of Users Who are Not Likely to Purchase Again

1. User metrics for behaviour on site are poor
   - bounce rate is high
   - time on site is low
   - views few product pages
   - high cart abandonment rate
2. Whether the user writes negative product reviews or leaves negative ratings (if applicable)
   - if product ratings can be left on the Greenery store site, then users who have previously purchased but who write and/or react negatively to (e.g. thumbs down, etc.) those ratings are less likely to purchase again
3. Poor experience with Greenery platform
   - frequent and long delays in receiving their orders
   - high product return rate
4. Whether users infrequently participate in promotions

#### Features to Look into with More Data

1. Past purchasing frequency (number of orders) of user
2. Whether user is subscriber
3. Past participation in promotions
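The purchasing-frequency feature could be explored with a sketch like the one below, which counts one user's orders in trailing windows. It is an illustration under stated assumptions: the helper name `trailing_order_counts` and the order dates are hypothetical, and the 3, 6, and 12 month windows are approximated as 90, 180, and 365 days.

```python
from datetime import date, timedelta


def trailing_order_counts(order_dates, as_of, windows_days=(90, 180, 365)):
    """Count a user's orders in each trailing window ending at `as_of`."""
    counts = {}
    for days in windows_days:
        start = as_of - timedelta(days=days)
        # An order counts if it falls after the window start and on or before `as_of`.
        counts[days] = sum(1 for d in order_dates if start < d <= as_of)
    return counts


# Hypothetical order history for a single user.
user_orders = [
    date(2023, 1, 10),
    date(2023, 6, 1),
    date(2023, 9, 15),
    date(2023, 10, 20),
]
features = trailing_order_counts(user_orders, as_of=date(2023, 11, 1))
```

For this sample, `features` is `{90: 2, 180: 3, 365: 4}`. Computing these counts per user would give one candidate input for an exploratory comparison of repeat and non-repeat purchasers.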

## Disconnect

Close connection

In [None]:
%sql --close connection