# Practice 4 - Payment types and median number of installments
This notebook uses two tables from the Olist database: olist_orders_dataset (renamed to orders) and olist_order_payments_dataset (renamed to order_payments).<br>
It answers three questions:

1. **What is the most popular payment type?**
2. **How do payment types change every year?**
3. **What is the median number of payment installments each year?**

## Connect and load in the database

In [1]:
%load_ext sql
%sql mysql+mysqlconnector://root:***@localhost/olist

'Connected: root@olist'

### Relational Schema <br>
<img src="files/photos/P4.png">

## SQL queries

### Popular payment types

In [2]:
%%sql
SELECT payment_type,
       COUNT(order_id) AS num_payments
FROM order_payments
GROUP BY payment_type
ORDER BY num_payments DESC;

 * mysql+mysqlconnector://root:***@localhost/olist
5 rows affected.


payment_type,num_payments
credit_card,76795
boleto,19784
voucher,5775
debit_card,1529
not_defined,3


Most payments are made by credit card, and boleto is the second most popular payment type. You can find more explanation of boleto [here](https://www.kaggle.com/olistbr/brazilian-ecommerce/discussion/70607).
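
To put a number on "most", the counts returned by the query can be turned into shares. This is a minimal Python sketch that copies the counts from the result table by hand; the variable names `payment_counts` and `payment_shares` are only illustrative and are not part of the notebook's SQL workflow.

```python
# Counts copied from the query result above.
payment_counts = {
    "credit_card": 76795,
    "boleto": 19784,
    "voucher": 5775,
    "debit_card": 1529,
    "not_defined": 3,
}

total_payments = sum(payment_counts.values())

# Share of each payment type as a percentage, rounded to 2 decimals.
payment_shares = {
    payment_type: round(count / total_payments * 100, 2)
    for payment_type, count in payment_counts.items()
}

for payment_type, share in payment_shares.items():
    print(f"{payment_type}: {share}%")
```

With these counts, credit cards account for roughly 74% of payment records and boleto for roughly 19%.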

In [3]:
%%sql
SELECT SUM(IF(num_methods = 1,1,0)) AS one_method,
       SUM(IF(num_methods = 2,1,0)) AS two_methods,
       SUM(IF(num_methods = 3,1,0)) AS three_methods,
       SUM(IF(num_methods = 4,1,0)) AS four_methods,
       SUM(IF(num_methods > 4,1,0)) AS more_methods
FROM (SELECT order_id,
      COUNT(DISTINCT payment_type) AS num_methods
      FROM order_payments
      GROUP BY order_id) a;

 * mysql+mysqlconnector://root:***@localhost/olist
1 rows affected.


one_method,two_methods,three_methods,four_methods,more_methods
97194,2246,0,0,0


Customers use at most two payment types for one order, and over 97% of orders are paid with a single payment type.
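
The share of single-method orders follows directly from the two non-zero columns of the result. A quick check in Python, with the values copied by hand from the output above (the variable names are illustrative):

```python
# Values copied from the query result above.
one_method = 97194
two_methods = 2246

total_orders = one_method + two_methods

# Percentage of orders paid with exactly one payment type.
single_method_share = round(one_method / total_orders * 100, 2)

print(f"{single_method_share}% of {total_orders} orders use one payment type")
```
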
Next, I would like to know how the use of payment types changes over time.

In [4]:
%%sql
SELECT payment_methods,
       SUM(IF(YEAR(order_purchase_timestamp) = 2016,1,0))
       AS year_2016,
       SUM(IF(YEAR(order_purchase_timestamp) = 2017,1,0))
       AS year_2017,
       SUM(IF(YEAR(order_purchase_timestamp) = 2018,1,0))
       AS year_2018,
       ROUND((SUM(IF(YEAR(order_purchase_timestamp) = 2018,1,0)) -
              SUM(IF(YEAR(order_purchase_timestamp) = 2017,1,0)))/
              SUM(IF(YEAR(order_purchase_timestamp) = 2017,1,0))*100,2)
       AS percentage_change_17_18
FROM (SELECT order_id,
             GROUP_CONCAT(DISTINCT payment_type ORDER BY payment_type)
             AS payment_methods
      FROM order_payments
      GROUP BY order_id) a
JOIN orders o
ON a.order_id = o.order_id
GROUP BY payment_methods;

 * mysql+mysqlconnector://root:***@localhost/olist
7 rows affected.


payment_methods,year_2016,year_2017,year_2018,percentage_change_17_18
credit_card,252,33246,40761,22.6
boleto,63,9508,10213,7.41
"credit_card,voucher",5,1169,1071,-8.38
debit_card,2,422,1103,161.37
voucher,6,756,859,13.62
not_defined,0,0,3,
"credit_card,debit_card",0,0,1,


From 2017 to 2018, debit card payments increased by 161.37%, while orders combining a credit card and a voucher decreased slightly, by 8.38%. The other single-type payments increased by at most 22.60%.
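
The `percentage_change_17_18` column is computed as (2018 count - 2017 count) / 2017 count * 100. The sketch below reproduces that arithmetic in Python from the counts in the result table. The helper `percentage_change` is a hypothetical name, and returning `None` for a zero 2017 count is an assumption chosen to mirror the empty cells in the output.

```python
def percentage_change(previous, current):
    """Return the change from previous to current in percent, or None if previous is 0."""
    if previous == 0:
        # Assumption: mirrors the empty (NULL) cells shown in the query output.
        return None
    return round((current - previous) / previous * 100, 2)


# (2017 count, 2018 count) pairs copied from the query result above.
counts_2017_2018 = {
    "credit_card": (33246, 40761),
    "boleto": (9508, 10213),
    "credit_card,voucher": (1169, 1071),
    "debit_card": (422, 1103),
    "voucher": (756, 859),
    "not_defined": (0, 3),
    "credit_card,debit_card": (0, 1),
}

changes = {
    methods: percentage_change(previous, current)
    for methods, (previous, current) in counts_2017_2018.items()
}

for methods, change in changes.items():
    print(methods, change)
```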

### Median number of payment installments

Customers can choose the number of installments at checkout. Let's see what the minimum and maximum numbers of installments are.

In [5]:
%%sql
SELECT MIN(payment_installments) AS min_installments,
       MAX(payment_installments) AS max_installments
FROM order_payments;

 * mysql+mysqlconnector://root:***@localhost/olist
1 rows affected.


min_installments,max_installments
0,24


Some payments are recorded with 0 installments. The number of orders with such a payment is:

In [6]:
%%sql
SELECT COUNT(DISTINCT order_id) AS num_orders
FROM order_payments
WHERE payment_installments = 0;

 * mysql+mysqlconnector://root:***@localhost/olist
1 rows affected.


num_orders
2


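Only 2 orders have a payment with 0 installments, so nearly every payment has at least one installment. The query below calculates the median number of installments for each year, taken over all payment records (an order with several payments contributes several rows).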

In [7]:
%%sql
SELECT year,
       ROUND(AVG(payment_installments),0) AS median_installments
FROM (SELECT ROW_NUMBER()
             OVER(PARTITION BY a.year
             ORDER BY a.payment_installments) AS count_of_group,
             a.year, a.payment_installments, b.total_of_group
      FROM (SELECT YEAR(order_purchase_timestamp) AS year,
                   payment_installments
            FROM orders o JOIN order_payments op
            ON o.order_id = op.order_id) a
      JOIN (SELECT year,
                   COUNT(payment_installments)
                   AS total_of_group
            FROM (SELECT YEAR(order_purchase_timestamp) AS year,
                         payment_installments
                  FROM orders o JOIN order_payments op
                  ON o.order_id = op.order_id) a
            GROUP BY year) b
      ON a.year = b.year) c
WHERE count_of_group BETWEEN total_of_group/2 AND total_of_group/2 + 1
GROUP BY year
ORDER BY year;

 * mysql+mysqlconnector://root:***@localhost/olist
3 rows affected.


year,median_installments
2016,2
2017,2
2018,1
