title: Usage of trading platforms
author: Vladas Jankus 
date: 2021-06-27
region: EU  
tags: product, trade, trading, invest, stocks, coinbase, crypto, cryptocurrency, Trade Republic, eToro, Flatex, Comdirect, Onvista, Scalable Capital, Bux, Smartbroker, Lynx
summary: A descriptive analysis of our current customer base and their interaction with coinbase and a selected group of trading platforms - Trade Republic, eToro, Flatex, Comdirect, Onvista, Scalable Capital, Bux, Smartbroker and Lynx. Side by side comparison of coinbase and trading platforms

In [1]:
!pip install duckdb

import pandas as pd
import altair as alt
from utils.datalib_database import df_from_sql
from IPython.display import HTML, Markdown as md
import duckdb


In [2]:
raw_df = df_from_sql(
    "redshiftreader",
    """
        select
            to_char(created, 'yyyy-mm') as month,
            'SEPA ' || type as type,
            user_created,
            case
                when partner_bic = 'SOBKDEBBXXX'
                    then 'trade_republic'
                when partner_bic = 'BIWBDE33XXX'
                    then 'degiro'
                when partner_bic = 'BDWBDEMMXXX'
                    then 'scalable_capital'
                when partner_bic = 'MHSBDEHBXXX'
                    then 'justtrade'
                when partner_bic = 'COUTGB22XXX'
                    then 'etoro'
                when partner_bic = 'DABBDEMMXXX'
                    then 'smartbroker'
                when partner_bic = 'BOURDEFFXXX'
                    then 'onvista'
                when left(partner_bic, 8) = 'COBADEHD'
                    then 'comdirect'
            end as broker,
            case broker
                when 'comdirect' then 1
                when 'trade_republic' then 2
                when 'etoro' then 3
                when 'degiro' then 4
                when 'justtrade' then 5
                when 'scalable_capital' then 6
                when 'smartbroker' then 7
                when 'onvista' then 8
            end as row_order,
            count(*) as transactions,
            sum(bank_balance_impact) as value_eur
        from etl_reporting.zr_transaction
        left join (
            select 
                account_id,
                user_created,
                /* If there is only one account owner, this will always be 1.
                If there are multiple potential account owners we assume the
                first one who was created is the actual first one. */
                row_number() over (partition by account_id order by created) as rn
            from cr_user_account as cua 
            where coalesce(cua.user_role, 'OWNER') = 'OWNER'
        ) as cua 
        on cua.account_id = zr_transaction.account_id 
        and rn = 1
        where created < date('2021-07-01')
            and (
                partner_bic in (
                    'SOBKDEBBXXX', 'BIWBDE33XXX', 'BDWBDEMMXXX', 'MHSBDEHBXXX', 'COUTGB22XXX',
                    'DABBDEMMXXX', 'BOURDEFFXXX'
                )
                or left(partner_bic, 8) = 'COBADEHD'
            )
            and (bank_balance_impact > 1 or bank_balance_impact < -1)
        group by 1, 2, 3, 4, 5
        union all
        select
            to_char(created, 'yyyy-mm') as month,
            'CARD' as type,
            user_created,
            case
                when merchant_name = 'Trade Republic'
                    then 'trade_republic'
                when merchant_name ilike '%etoro%'
                    then 'etoro'
                when merchant_name ilike '%onvista%'
                    then 'onvista'
                when merchant_name ilike '%comdirect%'
                    then 'comdirect'
            end as broker,
            case broker
                when 'comdirect' then 1
                when 'trade_republic' then 2
                when 'etoro' then 3
                when 'degiro' then 4
                when 'justtrade' then 5
                when 'scalable_capital' then 6
                when 'smartbroker' then 7
                when 'onvista' then 8
                when 'bux' then 9
            end as row_order,
            count(*) as transactions,
            sum(amount_cents_eur) / -100.0 as value_eur
        from dbt.zrh_card_transactions
        where type = 'PT'
            and created < date('2021-07-01')
            and (
                merchant_name = 'Trade Republic'
                or merchant_name ilike '%etoro%'
                or merchant_name ilike '%onvista%'
                or merchant_name ilike '%comdirect%'
            )
            and amount_cents_eur > 100
        group by 1, 2, 3, 4, 5
    """,
)



In [3]:
raw_df_coinbase = df_from_sql(
    "redshiftreader",
    """
        select
            to_char(created, 'yyyy-mm') as month,
            'SEPA ' || type as type,
            user_created,
            case when partner_name ilike '%coinbase%' then 'coinbase' else 'kraken' end as broker,
            case when broker = 'coinbase' then 1 else 2 end as row_order,
            count(*) as transactions,
            sum(bank_balance_impact) as value_eur
        from etl_reporting.zr_transaction
        left join (
            select 
                account_id,
                user_created,
                /* If there is only one account owner, this will always be 1.
                If there are multiple potential account owners we assume the
                first one who was created is the actual first one. */
                row_number() over (partition by account_id order by created) as rn
            from cr_user_account as cua 
            where coalesce(cua.user_role, 'OWNER') = 'OWNER'
        ) as cua 
        on cua.account_id = zr_transaction.account_id 
        and rn = 1
        where created < date('2021-07-01')
            and (
                partner_name ilike '%coinbase%'
                or partner_iban in (
                    'LI08088110102720K000E', 'GB29CLJU04130729900313', 'GB60CLJU00997129900160', 
                    'DE31700222000071788512', 'CH7008799935590051814'
                )
            )
            and (bank_balance_impact > 1 or bank_balance_impact < -1)
        group by 1, 2, 3, 4, 5
        union all
        select
            to_char(created, 'yyyy-mm') as month,
            'CARD' as type,
            user_created,
            case when merchant_name = 'KRAKEN EXCHANGE' then 'kraken' else 'coinbase' end as broker,
            case when broker = 'coinbase' then 1 else 2 end as row_order,
            count(*) as transactions,
            sum(amount_cents_eur) / -100.0 as value_eur
        from dbt.zrh_card_transactions
        where type = 'PT'
            and created < date('2021-07-01')
            and (merchant_name ilike '%coinbase%' or merchant_name = 'KRAKEN EXCHANGE')
            and amount_cents_eur > 100
        group by 1, 2, 3, 4, 5
    """,
)



In [4]:
out_percentiles = df_from_sql(
    "redshiftreader",
    """
        with percentiles as (
            select
                month,
                percentile_cont(0.10) within group (order by abs(value_eur)) as perc_10,
                percentile_cont(0.25) within group (order by abs(value_eur)) as perc_25,
                percentile_cont(0.5) within group (order by abs(value_eur)) as perc_50,
                percentile_cont(0.75) within group (order by abs(value_eur)) as perc_75,
                percentile_cont(0.9) within group (order by abs(value_eur)) as perc_90
            from (select
                      to_char(created, 'yyyy-mm') as month,
                      user_created,
                      sum(bank_balance_impact) as value_eur
                  from etl_reporting.zr_transaction
                    left join (
                        select 
                            account_id,
                            user_created,
                            /* If there is only one account owner, this will always be 1.
                            If there are multiple potential account owners we assume the
                            first one who was created is the actual first one. */
                            row_number() over (partition by account_id order by created) as rn
                        from cr_user_account as cua 
                        where coalesce(cua.user_role, 'OWNER') = 'OWNER'
                    ) as cua 
                    on cua.account_id = zr_transaction.account_id 
                    and rn = 1
                  where created < date('2021-07-01')
                    and created > date('2020-01-01')
                    and (
                        partner_bic in (
                            'SOBKDEBBXXX', 'BIWBDE33XXX', 'BDWBDEMMXXX', 'MHSBDEHBXXX', 'COUTGB22XXX',
                            'DABBDEMMXXX', 'BOURDEFFXXX'
                        )
                        or left(partner_bic, 8) = 'COBADEHD'
                    )
                  group by 1, 2
                  union all
                  select
                      to_char(created, 'yyyy-mm') as month,
                      user_created,
                      sum(amount_cents_eur) / -100.0 as value_eur
                  from dbt.zrh_card_transactions
                  where type = 'PT'
                    and created < date('2021-07-01')
                    and created > date('2020-01-01')
                    and (
                        merchant_name = 'Trade Republic'
                        or merchant_name ilike '%etoro%'
                        or merchant_name ilike '%onvista%'
                        or merchant_name ilike '%comdirect%'
                    )
                  group by 1, 2)
            where month >= '2020-01'  -- month is a 'yyyy-mm' string, so compare against the same format
              and value_eur < -1
            group by 1
        )
            select month,'10th percentile' as percentile, perc_10 as value from percentiles
            union all select month, '25th percentile' as percentile, perc_25 as value from percentiles
            union all select month, 'median' as percentile, perc_50 as value from percentiles
            union all select month, '75th percentile' as percentile, perc_75 as value from percentiles
            union all select month, '90th percentile' as percentile, perc_90 as value from percentiles
    """,
)



In [5]:
out_percentiles_coinbase = df_from_sql(
    "redshiftreader",
    """
        with percentiles as (
            select
                month,
                percentile_cont(0.10) within group (order by abs(value_eur)) as perc_10,
                percentile_cont(0.25) within group (order by abs(value_eur)) as perc_25,
                percentile_cont(0.5) within group (order by abs(value_eur)) as perc_50,
                percentile_cont(0.75) within group (order by abs(value_eur)) as perc_75,
                percentile_cont(0.9) within group (order by abs(value_eur)) as perc_90
            from (select
                      to_char(created, 'yyyy-mm') as month,
                      user_created,
                      sum(bank_balance_impact) as value_eur
                  from etl_reporting.zr_transaction
        left join (
            select 
                account_id,
                user_created,
                /* If there is only one account owner, this will always be 1.
                If there are multiple potential account owners we assume the
                first one who was created is the actual first one. */
                row_number() over (partition by account_id order by created) as rn
            from cr_user_account as cua 
            where coalesce(cua.user_role, 'OWNER') = 'OWNER'
        ) as cua 
        on cua.account_id = zr_transaction.account_id 
        and rn = 1
                  where created < date('2021-07-01')
                    and created > date('2020-01-01')
                    and (
                        partner_name ilike '%coinbase%'
                        or partner_iban in (
                            'LI08088110102720K000E', 'GB29CLJU04130729900313', 'GB60CLJU00997129900160', 
                            'DE31700222000071788512', 'CH7008799935590051814'
                        )
                    )
                  group by 1, 2
                  union all
                  select
                      to_char(created, 'yyyy-mm') as month,
                      user_created,
                      sum(amount_cents_eur) / -100.0 as value_eur
                  from dbt.zrh_card_transactions
                  where type = 'PT'
                    and created < date('2021-07-01')
                    and created > date('2020-01-01')
                    and (merchant_name ilike '%coinbase%' or merchant_name = 'KRAKEN EXCHANGE')
                  group by 1, 2)
            where month >= '2020-01'  -- month is a 'yyyy-mm' string, so compare against the same format
              and value_eur < -1
            group by 1
        )
            select month,'10th percentile' as percentile, perc_10 as value from percentiles
            union all select month, '25th percentile' as percentile, perc_25 as value from percentiles
            union all select month, 'median' as percentile, perc_50 as value from percentiles
            union all select month, '75th percentile' as percentile, perc_75 as value from percentiles
            union all select month, '90th percentile' as percentile, perc_90 as value from percentiles
    """,
)



In [6]:
in_percentiles = df_from_sql(
    "redshiftreader",
    """
        with percentiles as (
            select
                month,
                percentile_cont(0.10) within group (order by abs(value_eur)) as perc_10,
                percentile_cont(0.25) within group (order by abs(value_eur)) as perc_25,
                percentile_cont(0.5) within group (order by abs(value_eur)) as perc_50,
                percentile_cont(0.75) within group (order by abs(value_eur)) as perc_75,
                percentile_cont(0.9) within group (order by abs(value_eur)) as perc_90
            from (select
                      to_char(created, 'yyyy-mm') as month,
                      user_created,
                      sum(bank_balance_impact) as value_eur
                  from etl_reporting.zr_transaction
        left join (
            select 
                account_id,
                user_created,
                /* If there is only one account owner, this will always be 1.
                If there are multiple potential account owners we assume the
                first one who was created is the actual first one. */
                row_number() over (partition by account_id order by created) as rn
            from cr_user_account as cua 
            where coalesce(cua.user_role, 'OWNER') = 'OWNER'
        ) as cua 
        on cua.account_id = zr_transaction.account_id 
        and rn = 1
                  where created < date('2021-07-01')
                    and created > date('2020-01-01')
                    and type = 'CT'
                    and (
                        partner_bic in (
                            'SOBKDEBBXXX', 'BIWBDE33XXX', 'BDWBDEMMXXX', 'MHSBDEHBXXX', 'COUTGB22XXX',
                            'DABBDEMMXXX', 'BOURDEFFXXX'
                        )
                        or left(partner_bic, 8) = 'COBADEHD'
                    )
                  group by 1, 2)
            where month >= '2020-01'  -- month is a 'yyyy-mm' string, so compare against the same format
              and value_eur >= 1
            group by 1
        )
            select month,'10th percentile' as percentile, perc_10 as value from percentiles
            union all select month, '25th percentile' as percentile, perc_25 as value from percentiles
            union all select month, 'median' as percentile, perc_50 as value from percentiles
            union all select month, '75th percentile' as percentile, perc_75 as value from percentiles
            union all select month, '90th percentile' as percentile, perc_90 as value from percentiles
    """,
)



In [7]:
in_percentiles_coinbase = df_from_sql(
    "redshiftreader",
    """
        with percentiles as (
            select
                month,
                percentile_cont(0.10) within group (order by abs(value_eur)) as perc_10,
                percentile_cont(0.25) within group (order by abs(value_eur)) as perc_25,
                percentile_cont(0.5) within group (order by abs(value_eur)) as perc_50,
                percentile_cont(0.75) within group (order by abs(value_eur)) as perc_75,
                percentile_cont(0.9) within group (order by abs(value_eur)) as perc_90
            from (select
                      to_char(created, 'yyyy-mm') as month,
                      user_created,
                      sum(bank_balance_impact) as value_eur
                  from etl_reporting.zr_transaction
        left join (
            select 
                account_id,
                user_created,
                /* If there is only one account owner, this will always be 1.
                If there are multiple potential account owners we assume the
                first one who was created is the actual first one. */
                row_number() over (partition by account_id order by created) as rn
            from cr_user_account as cua 
            where coalesce(cua.user_role, 'OWNER') = 'OWNER'
        ) as cua 
        on cua.account_id = zr_transaction.account_id 
        and rn = 1
                  where created < date('2021-07-01')
                    and created > date('2020-01-01')
                    and type = 'CT'
                    and (
                        partner_name ilike '%coinbase%'
                        or partner_iban in (
                            'LI08088110102720K000E', 'GB29CLJU04130729900313', 'GB60CLJU00997129900160', 
                            'DE31700222000071788512', 'CH7008799935590051814'
                        )
                    )
                  group by 1, 2)
            where month >= '2020-01'  -- month is a 'yyyy-mm' string, so compare against the same format
              and value_eur >= 1
            group by 1
        )
            select month,'10th percentile' as percentile, perc_10 as value from percentiles
            union all select month, '25th percentile' as percentile, perc_25 as value from percentiles
            union all select month, 'median' as percentile, perc_50 as value from percentiles
            union all select month, '75th percentile' as percentile, perc_75 as value from percentiles
            union all select month, '90th percentile' as percentile, perc_90 as value from percentiles
    """,
)



In [8]:
con = duckdb.connect(database=":memory:", read_only=False)

In [9]:
con.register("raw_data", raw_df)
con.register("raw_data_coinbase", raw_df_coinbase)


# How N26 customers are using trading brokers
Vladas Jankus<br/>
2021-06-27

### Contents:
* [Introduction](#1)
* [Monthly volumes](#2)
* [Transaction types](#3)
  * [Transaction type split](#31)
  * [Incoming and outgoing split](#32)
  * [Monthly outbound volume percentiles](#33)
  * [Monthly inbound volume percentiles](#34)
* [Customer view](#4)
  * [Customers split by lifetime transactions](#41)
  * [Customer behaviour by experience](#42)
  * [Customers who stop trading](#43)

## Introduction <a class="anchor" id="1"></a>

The main purpose of this analysis is to look at how our customers use the most popular stock and crypto brokers. Stock brokers and crypto brokers were two separate requests, but they are similar enough in nature to be covered together. Every chart is shown twice: once for stock brokers and once for crypto brokers.

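The trading platforms used in this analysis are Trade Republic, eToro, Degiro, Comdirect, Onvista, Scalable Capital, Smartbroker and Justtrade. Their transactions were identified by BIC (for SEPA) or by merchant name (for cards). The crypto platforms, Coinbase and Kraken, were identified by partner name or IBAN (for SEPA) or by merchant name (for cards).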
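The identification rules can be sketched in plain Python as follows. The BICs and merchant patterns are copied from the SQL in this notebook; the function and constant names (`broker_from_bic`, `broker_from_merchant`, `BROKER_BY_BIC`, `CARD_MERCHANT_PATTERNS`) are hypothetical and only serve the illustration.

```python
from typing import Optional

# BIC -> broker, copied from the SQL CASE expression used for SEPA transactions.
BROKER_BY_BIC = {
    "SOBKDEBBXXX": "trade_republic",
    "BIWBDE33XXX": "degiro",
    "BDWBDEMMXXX": "scalable_capital",
    "MHSBDEHBXXX": "justtrade",
    "COUTGB22XXX": "etoro",
    "DABBDEMMXXX": "smartbroker",
    "BOURDEFFXXX": "onvista",
}

# Substrings matched case-insensitively against card merchant names
# (the SQL uses `ilike '%...%'` for these three brokers).
CARD_MERCHANT_PATTERNS = {
    "etoro": "etoro",
    "onvista": "onvista",
    "comdirect": "comdirect",
}


def broker_from_bic(partner_bic: Optional[str]) -> Optional[str]:
    """Return the broker for a SEPA partner BIC, or None if it is not tracked."""
    if not partner_bic:
        return None
    if partner_bic in BROKER_BY_BIC:
        return BROKER_BY_BIC[partner_bic]
    # Comdirect is matched on the first 8 characters of the BIC.
    if partner_bic[:8] == "COBADEHD":
        return "comdirect"
    return None


def broker_from_merchant(merchant_name: Optional[str]) -> Optional[str]:
    """Return the broker for a card merchant name, or None if it is not tracked."""
    if not merchant_name:
        return None
    # Trade Republic is matched exactly (case-sensitive) in the SQL.
    if merchant_name == "Trade Republic":
        return "trade_republic"
    lowered = merchant_name.lower()
    for pattern, broker in CARD_MERCHANT_PATTERNS.items():
        if pattern in lowered:
            return broker
    return None
```

Note that only four of the brokers can be recognised on the card side, so card volumes for the remaining brokers are not captured by these rules.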

Volumes include SEPA CT, DD, DT and card transactions.

There is a high number of transactions below 1 EUR because brokers use small transactions to verify accounts. These transactions are excluded from this analysis.
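As a minimal sketch, the exclusion mirrors the SQL conditions `bank_balance_impact > 1 or bank_balance_impact < -1` (SEPA, in EUR) and `amount_cents_eur > 100` (cards, in cents). The helper names and the sample amounts below are hypothetical.

```python
def is_material_sepa(bank_balance_impact_eur: float) -> bool:
    """Keep SEPA transactions whose absolute value is strictly above 1 EUR."""
    return abs(bank_balance_impact_eur) > 1


def is_material_card(amount_cents_eur: int) -> bool:
    """Keep card transactions strictly above 100 cents (1 EUR)."""
    return amount_cents_eur > 100


# Made-up amounts: verification transfers and regular transfers, in EUR.
sepa_amounts = [0.01, -0.5, 1.0, -25.0, 500.0]
kept = [amount for amount in sepa_amounts if is_material_sepa(amount)]
```

Because both comparisons are strict, a transaction of exactly 1 EUR is excluded as well.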

<b>Limitations:</b> relying on the BIC captures all transactions to the bank. Volumes may therefore be overestimated in some cases, because we do not know what share of the transactions to these banks is for brokerage purposes. However, we assume that this share is close to 100% in all cases.

## Monthly volumes <a class="anchor" id="2"></a>

First, let's look at the overall volumes our customers exchange with brokers. The charts below display the count and volume of monthly transactions. Colors distinguish the brokers. Note that this is the absolute sum of all transaction types, incoming and outgoing. (Scroll right to see more charts.)
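To illustrate why the absolute sum is used, the sketch below contrasts a net sum with a gross (absolute) sum. The two rows are made-up values, not data from this analysis; negative values stand for money leaving the N26 account, as in the queries above.

```python
# Hypothetical aggregated rows for one broker and month.
rows = [
    {"month": "2021-01", "broker": "trade_republic", "value_eur": -1000.0},  # sent to the broker
    {"month": "2021-01", "broker": "trade_republic", "value_eur": 400.0},    # received from the broker
]

# Net flow: incoming and outgoing amounts partly cancel each other out.
net_eur = sum(row["value_eur"] for row in rows)

# Gross activity: every euro moved counts, regardless of direction.
gross_eur = sum(abs(row["value_eur"]) for row in rows)
```

Here the net flow is -600 EUR, while the gross activity shown in the charts would be 1400 EUR.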

In [47]:
charts_col1 = []
charts_col2 = []

broker_activity = con.execute(
    """
        select
            month,
            broker,
            row_order,
            sum(abs(value_eur)) as activity_eur,
            sum(transactions) as transactions
        from raw_data
        where month >= '2020-01'
        group by 1, 2, 3
    """
).fetchdf()

broker_activity_cb = con.execute(
    """
        select
            month,
            broker,
            row_order,
            sum(abs(value_eur)) as activity_eur,
            sum(transactions) as transactions
        from raw_data_coinbase
        where month >= '2020-01'
        group by 1, 2, 3
    """
).fetchdf()

charts_col1.append(
    alt.Chart(broker_activity.rename(columns={"broker": "Broker"}))
    .mark_area(opacity=0.7, interpolate="natural")
    .encode(
        x=alt.X("month:T", title=None),
        y=alt.Y("activity_eur:Q", title="Eur (absolute)"),
        color=alt.Color(
            "Broker:N",
            sort=[
                "comdirect",
                "trade_republic",
                "etoro",
                "degiro",
                "justtrade",
                "scalable_capital",
                "smartbroker",
                "onvista",
                "bux",
            ],
        ),
        order=alt.Order("row_order", sort="ascending"),
    )
    .properties(
        width=800,
        height=300,
        title="Trading platforms: Volume (EUR) in and out of brokers.",
    )
)

charts_col2.append(
    alt.Chart(broker_activity_cb.rename(columns={"broker": "Broker"}))
    .mark_area(opacity=0.7, interpolate="natural")
    .encode(
        x=alt.X("month:T", title=None),
        y=alt.Y("activity_eur:Q", title="Eur (absolute)"),
        color=alt.Color(
            "Broker:N",
            sort=["coinbase", "kraken"],
        ),
        order=alt.Order("row_order", sort="ascending"),
    )
    .properties(
        width=800,
        height=300,
        title="Crypto platforms: Volume (EUR) in and out of brokers.",
    )
)

charts_col1.append(
    alt.Chart(broker_activity.rename(columns={"broker": "Broker"}))
    .mark_area(opacity=0.7, interpolate="natural")
    .encode(
        x=alt.X("month:T", title=None),
        y=alt.Y("transactions:Q", title="Transaction count"),
        color=alt.Color(
            "Broker:N",
            sort=[
                "comdirect",
                "trade_republic",
                "etoro",
                "degiro",
                "justtrade",
                "scalable_capital",
                "smartbroker",
                "onvista",
                "bux",
            ],
        ),
        order=alt.Order("row_order", sort="ascending"),
    )
    .properties(
        width=800,
        height=300,
        title="Trading platforms: Transaction count in and out of brokers.",
    )
)

charts_col2.append(
    alt.Chart(broker_activity_cb.rename(columns={"broker": "Broker"}))
    .mark_area(opacity=0.7, interpolate="natural")
    .encode(
        x=alt.X("month:T", title=None),
        y=alt.Y("transactions:Q", title="Transaction count"),
        color=alt.Color(
            "Broker:N",
            sort=["coinbase", "kraken"],
        ),
        order=alt.Order("row_order", sort="ascending"),
    )
    .properties(
        width=800,
        height=300,
        title="Crypto platforms: Transaction count in and out of brokers.",
    )
)

alt.hconcat(*charts_col1).display()
alt.hconcat(*charts_col2).display()

There is an obvious increase in activity since the beginning of 2021. A spike in volume is also visible in March 2020, when the quarantine was announced and people started investing/saving more.

Looking at the color split for trading platforms, two brokers, Trade Republic and Comdirect, account for most of our customers' transactions.

We will not go deeper into the broker split in this analysis, and will focus more on the customer side.

## Transaction types <a class="anchor" id="3"></a>

### Transaction type split <a class="anchor" id="31"></a>

The charts above show all transaction types summed together. The charts below show the split by transaction type. These are mostly SEPA CT, SEPA DT and card transactions. 'Other' contains some direct debits, debit rejections and card presentment rejections.

In [48]:
charts_col1 = []
charts_col2 = []

tx_type = con.execute(
    """
        select
            month,
            case when type in ('SEPA CT', 'CARD', 'SEPA DT') then type else 'Other' end as type,
            sum(abs(value_eur)) as activity_eur,
            sum(transactions) as transactions
        from raw_data
        where month >= '2020-01'
        group by 1, 2
    """
).fetchdf()
tx_type_cb = con.execute(
    """
        select
            month,
            case when type in ('SEPA CT', 'CARD', 'SEPA DT') then type else 'Other' end as type,
            sum(abs(value_eur)) as activity_eur,
            sum(transactions) as transactions
        from raw_data_coinbase
        where month >= '2020-01'
        group by 1, 2
    """
).fetchdf()

charts_col1.append(
    alt.Chart(tx_type.rename(columns={"type": "Type"}))
    .mark_area(opacity=0.6, interpolate="natural")
    .encode(
        x=alt.X("month:T", title=None),
        y=alt.Y("activity_eur:Q", stack=None, title="Eur"),
        color=alt.Color("Type:N"),
    )
    .properties(width=800, height=300, title="Trading Platforms: Transaction volume")
)

charts_col2.append(
    alt.Chart(tx_type_cb.rename(columns={"type": "Type"}))
    .mark_area(opacity=0.6, interpolate="natural")
    .encode(
        x=alt.X("month:T", title=None),
        y=alt.Y("activity_eur:Q", stack=None, title="Eur"),
        color=alt.Color("Type:N"),
    )
    .properties(width=800, height=300, title="Crypto platforms: Transaction volume")
)

charts_col1.append(
    alt.Chart(tx_type.rename(columns={"type": "Type"}))
    .mark_area(opacity=0.6, interpolate="natural")
    .encode(
        x=alt.X("month:T", title=None),
        y=alt.Y("transactions:Q", stack=None, title="Transaction count"),
        color=alt.Color("Type:N"),
    )
    .properties(width=800, height=300, title="Trading Platforms: Transaction count")
)

charts_col2.append(
    alt.Chart(tx_type_cb.rename(columns={"type": "Type"}))
    .mark_area(opacity=0.6, interpolate="natural")
    .encode(
        x=alt.X("month:T", title=None),
        y=alt.Y("transactions:Q", stack=None, title="Transaction count"),
        color=alt.Color("Type:N"),
    )
    .properties(width=800, height=300, title="Crypto platforms: Transaction count")
)

alt.hconcat(*charts_col1).display()
alt.hconcat(*charts_col2).display()

Trading platforms have a much smaller share of card transactions than crypto platforms. There are visibly slightly more debit transactions than credit ones, so customers send a bit more to the platforms than they withdraw via SEPA.

Crypto platforms have a very high count of card transactions with comparatively little volume, which indicates many low-value transactions. (Note: transactions under 1 EUR are ignored in this analysis.)

### Incoming and outgoing split <a class="anchor" id="32"></a>

The charts below display the volume in EUR by transfer direction.

In [49]:
tx_type = con.execute(
    """
        select
            month,
            case when value_eur >= 0 then 'In' else 'Out' end as tx_direction,
            sum(value_eur) as activity_eur
        from raw_data
        where month >= '2020-01'  -- month is a 'yyyy-mm' string
        group by 1, 2
    """
).fetchdf()
tx_type_cb = con.execute(
    """
        select
            month,
            case when value_eur >= 0 then 'In' else 'Out' end as tx_direction,
            sum(value_eur) as activity_eur
        from raw_data_coinbase
        where month >= '2020-01'  -- month is a 'yyyy-mm' string
        group by 1, 2
    """
).fetchdf()

alt.Chart(tx_type.rename(columns={"tx_direction": "Transaction Direction"})).mark_area(
    opacity=0.7, interpolate="natural"
).encode(
    x=alt.X("month:T", title=None),
    y=alt.Y("activity_eur:Q", title="Eur"),
    color=alt.Color("Transaction Direction:N"),
).properties(
    width=750, height=300, title="Trading Platforms: Volume in or out of N26 account"
).display()

alt.Chart(
    tx_type_cb.rename(columns={"tx_direction": "Transaction Direction"})
).mark_area(opacity=0.7, interpolate="natural").encode(
    x=alt.X("month:T", title=None),
    y=alt.Y("activity_eur:Q", title="Eur"),
    color=alt.Color("Transaction Direction:N"),
).properties(
    width=750, height=300, title="Crypto platforms: Volume in or out of N26 account"
).display()

This clearly shows that customers are mostly moving money into trading platforms: inbound transactions (broker -> N26) are lower than outbound ones. This applies to both crypto and stock brokers.
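
The direction split relies only on the sign of `value_eur`. As a minimal sketch, assuming (as in the queries above) that a non-negative `value_eur` means money arriving in the N26 account, the same aggregation and the resulting net flow could be reproduced in pandas. The toy rows and the `net_eur` column are invented for illustration and are not taken from the analysis.

```python
import pandas as pd

# Hypothetical toy rows; column names mirror raw_data, values are invented.
toy = pd.DataFrame(
    {
        "month": ["2021-01", "2021-01", "2021-01", "2021-02", "2021-02"],
        "value_eur": [-500.0, 200.0, -300.0, -1000.0, 400.0],
    }
)

# Same rule as the SQL: value_eur >= 0 is 'In' (broker -> N26), otherwise 'Out'.
toy["tx_direction"] = toy["value_eur"].ge(0).map({True: "In", False: "Out"})

# Signed volume per month, one column per direction.
by_direction = toy.pivot_table(
    index="month", columns="tx_direction", values="value_eur", aggfunc="sum"
).fillna(0.0)

# Net flow: a negative value means more money left the account than came back.
by_direction["net_eur"] = by_direction["In"] + by_direction["Out"]
print(by_direction)
```

A negative `net_eur` in every month would correspond to the pattern described above, where outbound volume exceeds inbound volume.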

The charts below display the average value (EUR) of inbound and outbound transactions.

In [50]:
tx_type = con.execute(
    """
        select
            month,
            case when value_eur >= 0 then 'In' else 'Out' end as tx_direction,
            avg(abs(value_eur)) as average_activity_eur
        from raw_data
        where month >= '2020-01'  -- month is a 'yyyy-mm' string
        group by 1, 2
    """
).fetchdf()
tx_type_cb = con.execute(
    """
        select
            month,
            case when value_eur >= 0 then 'In' else 'Out' end as tx_direction,
            avg(abs(value_eur)) as average_activity_eur
        from raw_data_coinbase
        where month >= '2020-01'  -- month is a 'yyyy-mm' string
        group by 1, 2
    """
).fetchdf()

alt.Chart(tx_type.rename(columns={"tx_direction": "Transaction Direction"})).mark_line(
    interpolate="natural"
).encode(
    x=alt.X("month:T", title=None),
    y=alt.Y("average_activity_eur:Q", title="Eur"),
    color=alt.Color("Transaction Direction:N"),
).properties(
    width=750, height=300, title="Trading platforms: Average transaction value"
).display()

alt.Chart(
    tx_type_cb.rename(columns={"tx_direction": "Transaction Direction"})
).mark_line(interpolate="natural").encode(
    x=alt.X("month:T", title=None),
    y=alt.Y("average_activity_eur:Q", title="Eur"),
    color=alt.Color("Transaction Direction:N"),
).properties(
    width=750, height=300, title="Crypto platforms: Average transaction value"
).display()

Interestingly, crypto users have, on average, a significantly higher inbound transaction value. For trading platforms, average inbound and outbound transaction values are at around the same level.

Note that an average of over 1k EUR per transaction looks very high. This could suggest that the data is heavily skewed, which we look at in the next section.
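
To illustrate why a high average hints at skew, here is a small sketch on invented numbers (not data from this analysis): a handful of very large transactions pulls the mean far above the median.

```python
import pandas as pd

# Hypothetical transaction values in EUR: many small ones and two very large ones.
values = pd.Series([50, 80, 100, 150, 200, 250, 300, 400, 5000, 8000], dtype="float")

mean_eur = values.mean()
median_eur = values.median()
p90_eur = values.quantile(0.9)

# In a right-skewed distribution the mean sits well above the median.
print(f"mean={mean_eur:.0f} median={median_eur:.0f} p90={p90_eur:.0f}")
```

This is why percentiles are a more robust summary than the mean for this kind of data.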

### Monthly outbound volume percentiles <a class="anchor" id="33"></a>

In [51]:
alt.Chart(out_percentiles.rename(columns={"percentile": "Percentile"})).mark_line(
    interpolate="natural"
).encode(
    x=alt.X("month:T", title=None),
    y=alt.Y("value:Q", title="Eur"),
    color=alt.Color("Percentile:N"),
).properties(
    width=750, height=300, title="Trading platforms: Outbound volume percentiles"
).display()

alt.Chart(
    out_percentiles_coinbase.rename(columns={"percentile": "Percentile"})
).mark_line(interpolate="natural").encode(
    x=alt.X("month:T", title=None),
    y=alt.Y("value:Q", title="Eur"),
    color=alt.Color("Percentile:N"),
).properties(
    width=750, height=300, title="Crypto platforms: Outbound volume percentiles"
).display()

The median for outbound transactions (from N26 to broker) is around 300-400 EUR monthly. The data is skewed to the right, as the average of 1.2k-1.4k EUR (one section above) is a lot higher than the median. The top 10% of customers transfer over 2.5k-3k EUR monthly to brokers.

For crypto brokers, the median is just under 200 EUR, while the average was around 1.5k EUR (one section above). The top 10% of customers invest over 2k-2.5k EUR per transaction.

### Monthly inbound volume percentiles <a class="anchor" id="34"></a>

Below is the same chart but for inbound transactions (from broker to N26).

In [52]:
alt.Chart(in_percentiles.rename(columns={"percentile": "Percentile"})).mark_line(
    interpolate="natural"
).encode(
    x=alt.X("month:T", title=None),
    y=alt.Y("value:Q", title="Eur"),
    color=alt.Color("Percentile:N"),
).properties(
    width=750, height=300, title="Trading Platforms: Inbound volume percentiles"
).display()

alt.Chart(
    in_percentiles_coinbase.rename(columns={"percentile": "Percentile"})
).mark_line(interpolate="natural").encode(
    x=alt.X("month:T", title=None),
    y=alt.Y("value:Q", title="Eur"),
    color=alt.Color("Percentile:N"),
).properties(
    width=750, height=300, title="Crypto platforms: Inbound volume percentiles"
).display()

For trading platforms, the percentiles look broadly similar to those of outbound transactions. The median is slightly higher, around 500 EUR per customer monthly, and the top 10% of customers take more than 2.5k EUR monthly out of brokers.

For crypto platforms, inbound transactions have a median of around 700 EUR in recent months. The top 10% of customers take out around 7k-10k EUR per month, which seems really high.

## Customer view <a class="anchor" id="4"></a>

Now let's look at the customer details.

The charts below display the number of distinct customers per month. Colors show how many transactions a customer makes per month.

In [53]:
customer_tx = con.execute(
    """
        with step1 as (
            select
                month,
                user_created,
                sum(transactions) as transactions
            from raw_data
            group by 1,2
        )
        select 
            month,
            case 
                when transactions = 1 then '1'
                when transactions = 2 then '2'
                when transactions = 3 then '3'
                else '4+'
            end as transactions,
            count(*) as customers
        from step1
        where month >= '2020-01'  -- month is a 'yyyy-mm' string
        group by 1, 2
    """
).fetchdf()

customer_tx_cb = con.execute(
    """
        with step1 as (
            select
                month,
                user_created,
                sum(transactions) as transactions
            from raw_data_coinbase
            group by 1,2
        )
        select 
            month,
            case 
                when transactions = 1 then '1'
                when transactions = 2 then '2'
                when transactions = 3 then '3'
                else '4+'
            end as transactions,
            count(*) as customers
        from step1
        where month >= '2020-01'  -- month is a 'yyyy-mm' string
        group by 1, 2
    """
).fetchdf()

alt.Chart(customer_tx).mark_area(opacity=0.7, interpolate="natural").encode(
    x=alt.X("month:T", title=None),
    y=alt.Y("customers:Q", title="Customers"),
    color=alt.Color("transactions"),
).properties(
    width=750, height=300, title="Trading platforms: Customers interacting with brokers"
).display()

alt.Chart(customer_tx_cb).mark_area(opacity=0.7, interpolate="natural").encode(
    x=alt.X("month:T", title=None),
    y=alt.Y("customers:Q", title="Customers"),
    color=alt.Color("transactions"),
).properties(
    width=750, height=300, title="Crypto platforms: Customers interacting with coinbase"
).display()

In recent months, around 70k customers per month interact with trading platforms and around 20k-30k with crypto platforms. The number of monthly active customers appears to have grown significantly just before 2021, which caused the increase in overall volumes we saw before. More than half of the customers make only 1 transaction per month.

### Customers split by lifetime transactions <a class="anchor" id="41"></a>

The charts below display the distribution of customers by their lifetime transaction count.

In [54]:
customer_xp = con.execute(
    """
        with step1 as (
            select
                user_created,
                sum(transactions) as transactions
            from raw_data
            group by 1
        ), step2 as (
            select 
                transactions,
                count(*) as customers
            from step1
            group by 1
        ), step3 as (
            select
                transactions,
                customers::float / sum(customers) over () as customer_rate
            from step2
        ), step4 as (
            select
                *,
                sum(customer_rate) over (order by transactions rows unbounded preceding) as cumulative_rate
            from step3
        )
        select * from step4 where transactions < 15
    """
).fetchdf()

customer_xp_cb = con.execute(
    """
        with step1 as (
            select
                user_created,
                sum(transactions) as transactions
            from raw_data_coinbase
            group by 1
        ), step2 as (
            select 
                transactions,
                count(*) as customers
            from step1
            group by 1
        ), step3 as (
            select
                transactions,
                customers::float / sum(customers) over () as customer_rate
            from step2
        ), step4 as (
            select
                *,
                sum(customer_rate) over (order by transactions rows unbounded preceding) as cumulative_rate
            from step3
        )
        select * from step4 where transactions < 15
    """
).fetchdf()


# trading platforms
base = alt.Chart(customer_xp).encode(
    alt.X("transactions:O", axis=alt.Axis(title="Transactions"))
)

bar = base.mark_bar(opacity=0.7, color="#57A44C").encode(
    alt.Y("customer_rate", axis=alt.Axis(title="Customer %", titleColor="#57A44C"))
)

line = base.mark_line(stroke="#5276A7", interpolate="monotone").encode(
    alt.Y("cumulative_rate", axis=alt.Axis(title="Cumulative %", titleColor="#5276A7"))
)

alt.layer(bar, line).resolve_scale(y="independent").properties(
    width=750, height=300, title="Trading platforms: Customer transaction distribution"
).display()


# coinbase
base = alt.Chart(customer_xp_cb).encode(
    alt.X("transactions:O", axis=alt.Axis(title="Transactions"))
)

bar = base.mark_bar(opacity=0.7, color="#57A44C").encode(
    alt.Y("customer_rate", axis=alt.Axis(title="Customer %", titleColor="#57A44C"))
)

line = base.mark_line(stroke="#5276A7", interpolate="monotone").encode(
    alt.Y("cumulative_rate", axis=alt.Axis(title="Cumulative %", titleColor="#5276A7"))
)

alt.layer(bar, line).resolve_scale(y="independent").properties(
    width=750, height=300, title="Crypto platforms: Customer transaction distribution"
).display()

For trading platforms, around 30% of customers have made only 1 transaction to brokers, and 80% have made 10 or fewer transactions. For crypto platforms, only around 25% of customers have made 1 transaction, and 90% have made 14 or fewer transactions.
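
The share and cumulative share behind these statements can be sketched in pandas as well. This is only an illustration on invented counts, assuming one row per customer with their lifetime transaction count; it mirrors the `step2` to `step4` logic of the query rather than replacing it.

```python
import pandas as pd

# Hypothetical lifetime transaction counts, one entry per customer.
lifetime_tx = pd.Series([1, 1, 1, 2, 2, 3, 5, 8, 12, 40], name="transactions")

# Share of customers at each transaction count (step2 and step3 in the SQL).
customer_rate = lifetime_tx.value_counts(normalize=True).sort_index()

# Running total over increasing transaction counts (step4 in the SQL).
distribution = pd.DataFrame(
    {"customer_rate": customer_rate, "cumulative_rate": customer_rate.cumsum()}
)

# As in the query, only the head of the distribution is kept for plotting.
print(distribution[distribution.index < 15])
```

Reading the `cumulative_rate` column at a given transaction count answers questions of the form "what share of customers made N or fewer transactions".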

The tables below group all customers into 4 buckets based on their transaction count:

In [55]:
print("Trading platforms: Customer split by transaction count")
cst_group = con.execute(
    """
        with step1 as (
            select
                user_created,
                sum(transactions) as transactions
            from raw_data
            group by 1
        ), step2 as (
            select 
                case 
                    when transactions = 1 then '1'
                    when transactions <= 3 then '2-3'
                    when transactions <= 7 then '4-7'
                    when transactions >= 8 then '8+'
                end as tx_group,
                case 
                    when transactions = 1 then '1'
                    when transactions <= 3 then '2'
                    when transactions <= 7 then '3'
                    when transactions >= 8 then '4'
                end as row_order,
                count(*) as customers
            from step1
            group by 1, 2
        )
        select
            tx_group,
            customers,
            round((customers::numeric / sum(customers) over ()) * 100)::text || '%' as cst_share
        from step2
        order by row_order
    """
).fetchdf()

display(
    cst_group.rename(
        columns={
            "tx_group": "# of Transactions",
            "customers": "Customers",
            "cst_share": "% of customers",
        }
    ).style.hide_index()
)

print("Crypto platforms: Customer split by transaction count")
cst_group = con.execute(
    """
        with step1 as (
            select
                user_created,
                sum(transactions) as transactions
            from raw_data_coinbase
            group by 1
        ), step2 as (
            select 
                case 
                    when transactions = 1 then '1'
                    when transactions <= 3 then '2-3'
                    when transactions <= 7 then '4-7'
                    when transactions >= 8 then '8+'
                end as tx_group,
                case 
                    when transactions = 1 then '1'
                    when transactions <= 3 then '2'
                    when transactions <= 7 then '3'
                    when transactions >= 8 then '4'
                end as row_order,
                count(*) as customers
            from step1
            group by 1, 2
        )
        select
            tx_group,
            customers,
            round((customers::numeric / sum(customers) over ()) * 100)::text || '%' as cst_share
        from step2
        order by row_order
    """
).fetchdf()

display(
    cst_group.rename(
        columns={
            "tx_group": "# of Transactions",
            "customers": "Customers",
            "cst_share": "% of customers",
        }
    ).style.hide_index()
)

Trading platforms: Customer split by transaction count

| # of Transactions | Customers | % of customers |
|---|---|---|
| 1 | 86968 | 31.0% |
| 2-3 | 64689 | 23.0% |
| 4-7 | 50221 | 18.0% |
| 8+ | 82142 | 29.0% |


Crypto platforms: Customer split by transaction count

| # of Transactions | Customers | % of customers |
|---|---|---|
| 1 | 33823 | 26.0% |
| 2-3 | 37873 | 29.0% |
| 4-7 | 30789 | 23.0% |
| 8+ | 29495 | 22.0% |
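
For reference, the same four buckets could be built in pandas with `pd.cut`. This is a hedged sketch on invented counts, not the code that produced the tables above; the variable names are illustrative.

```python
import pandas as pd

# Hypothetical lifetime transaction counts, one entry per customer.
lifetime_tx = pd.Series([1, 1, 2, 3, 4, 7, 8, 25])

# Right-closed bins (0, 1], (1, 3], (3, 7], (7, inf) match the SQL case expression.
tx_group = pd.cut(
    lifetime_tx,
    bins=[0, 1, 3, 7, float("inf")],
    labels=["1", "2-3", "4-7", "8+"],
)

# The categories are ordered, so sorting by index replaces the row_order column.
summary = tx_group.value_counts().sort_index().rename("customers").to_frame()
summary["cst_share"] = (summary["customers"] / summary["customers"].sum() * 100).round()
print(summary)
```

Using an ordered categorical keeps the bucket order stable without a separate sort key.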


### Customer behaviour by experience <a class="anchor" id="42"></a>

The charts below display the monthly volume distribution by customer experience (lifetime transaction count).

In [56]:
cst_group = con.execute(
    """
        with cst_group as (
            select
                user_created,
                case 
                    when sum(transactions) = 1 then '1'
                    when sum(transactions) <= 3 then '2-3'
                    when sum(transactions) <= 7 then '4-7'
                    when sum(transactions) >= 8 then '8+'
                end as tx_group
            from raw_data
            group by 1
        )
        select 
            month,
            tx_group,
            sum(abs(value_eur)) as eur,
            sum(abs(value_eur))/sum(transactions) as transactions  -- average EUR per transaction, despite the column name
        from raw_data
        left join cst_group
            using (user_created)
        where month >= '2020-01'  -- month is a 'yyyy-mm' string
        group by 1, 2
    """
).fetchdf()

cst_group_cb = con.execute(
    """
        with cst_group as (
            select
                user_created,
                case 
                    when sum(transactions) = 1 then '1'
                    when sum(transactions) <= 3 then '2-3'
                    when sum(transactions) <= 7 then '4-7'
                    when sum(transactions) >= 8 then '8+'
                end as tx_group
            from raw_data_coinbase
            group by 1
        )
        select 
            month,
            tx_group,
            sum(abs(value_eur)) as eur,
            sum(abs(value_eur))/sum(transactions) as transactions  -- average EUR per transaction, despite the column name
        from raw_data_coinbase
        left join cst_group
            using (user_created)
        where month >= '2020-01'  -- month is a 'yyyy-mm' string
        group by 1, 2
    """
).fetchdf()

alt.Chart(cst_group.rename(columns={"tx_group": "# of cst tx"})).mark_area(
    opacity=0.7, interpolate="natural"
).encode(
    x=alt.X("month:T", title=None),
    y=alt.Y("eur:Q", title="Eur (absolute)", stack="normalize"),
    color=alt.Color("# of cst tx:N", sort=["1", "2-3", "4-7", "8+"]),
).properties(
    width=800, height=300, title="Trading platforms: Volume (EUR) by customer group."
).display()

alt.Chart(cst_group_cb.rename(columns={"tx_group": "# of cst tx"})).mark_area(
    opacity=0.7, interpolate="natural"
).encode(
    x=alt.X("month:T", title=None),
    y=alt.Y("eur:Q", title="Eur (absolute)", stack="normalize"),
    color=alt.Color("# of cst tx:N", sort=["1", "2-3", "4-7", "8+"]),
).properties(
    width=800, height=300, title="Crypto platforms: Volume (EUR) by customer group."
).display()

The ratios appear to stay roughly the same over time: customers with 8+ lifetime transactions generate around 80% of volume for both crypto and trading platforms.

The slight downward trend in recent months exists because we group customers by their lifetime transactions, and recent months probably contain a higher share of less experienced customers, especially given the large increase in customers at the beginning of 2021 that we saw earlier. This is more visible for crypto platforms.

Let's look at the average transaction value by customer group in the charts below:

In [57]:
alt.Chart(cst_group.rename(columns={"tx_group": "# of cst tx"})).mark_line(
    interpolate="natural"
).encode(
    x=alt.X("month:T", title=None),
    y=alt.Y(
        "transactions:Q",
        scale=alt.Scale(domain=(1, 1400), clamp=True),
        title="Eur (absolute)",
    ),
    color=alt.Color("# of cst tx:N", sort=["1", "2-3", "4-7", "8+"]),
).properties(
    width=800,
    height=300,
    title="Trading platforms: Average transaction value by customer group.",
).display()

alt.Chart(cst_group_cb.rename(columns={"tx_group": "# of cst tx"})).mark_line(
    interpolate="natural"
).encode(
    x=alt.X("month:T", title=None),
    y=alt.Y("transactions:Q", title="Eur (absolute)"),
    color=alt.Color("# of cst tx:N", sort=["1", "2-3", "4-7", "8+"]),
).properties(
    width=800,
    height=300,
    title="Crypto platforms: Average transaction value by customer group.",
).display()

For crypto platforms there is quite a large variance in average value per transaction across all groups; for trading platforms it is much smaller.

Interestingly, customers with 8+ transactions have consistently lower average values for trading platforms, but higher ones for crypto. This could suggest that more experienced customers are the ones making the very high-volume inbound transactions we saw earlier.

### Customers who stop trading <a class="anchor" id="43"></a>

The last thing we will look at is customers who stop trading. Since roughly 26-31% of customers make only 1 lifetime transaction (see the tables above), we can assume that a high number of customers try trading and then stop.

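The charts below display the number of monthly active customers, split by the quarter in which they first interacted with a platform. The starting quarter is derived from the full history, which starts in 2018, while the charts themselves cover 2020 onwards.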

In [58]:
active_cst = con.execute(
    """
        with cst_group as (
            select
                user_created,
                min(month) as start_month
            from raw_data
            group by 1
        )
        select 
            month,
            case 
                when start_month < '2020-01' then 'Before 2020' 
                else 
                    case
                        when right(start_month, 2) in ('01', '02', '03') then left(start_month, 5)::text || 'Q1'
                        when right(start_month, 2) in ('04', '05', '06') then left(start_month, 5)::text || 'Q2'
                        when right(start_month, 2) in ('07', '08', '09') then left(start_month, 5)::text || 'Q3'
                        when right(start_month, 2) in ('10', '11', '12') then left(start_month, 5)::text || 'Q4'
                    end
            end as start_month,
            count(distinct user_created) as customers
        from raw_data
        left join cst_group
            using (user_created)
        where month >= '2020-01'  -- month is a 'yyyy-mm' string
        group by 1, 2
    """
).fetchdf()

active_cst_cb = con.execute(
    """
        with cst_group as (
            select
                user_created,
                min(month) as start_month
            from raw_data_coinbase
            group by 1
        )
        select 
            month,
            case 
                when start_month < '2020-01' then 'Before 2020' 
                else 
                    case
                        when right(start_month, 2) in ('01', '02', '03') then left(start_month, 5)::text || 'Q1'
                        when right(start_month, 2) in ('04', '05', '06') then left(start_month, 5)::text || 'Q2'
                        when right(start_month, 2) in ('07', '08', '09') then left(start_month, 5)::text || 'Q3'
                        when right(start_month, 2) in ('10', '11', '12') then left(start_month, 5)::text || 'Q4'
                    end
            end as start_month,
            count(distinct user_created) as customers
        from raw_data_coinbase
        left join cst_group
            using (user_created)
        where month >= '2020-01'  -- month is a 'yyyy-mm' string
        group by 1, 2
    """
).fetchdf()

alt.Chart(active_cst.rename(columns={"start_month": "Starting quarter"})).mark_area(
    opacity=0.7, interpolate="monotone"
).encode(
    x=alt.X("month:T", title=None),
    y=alt.Y("customers:Q", title="Customers"),
    color=alt.Color(
        "Starting quarter:N",
    ),
    order=alt.Order("Starting quarter", sort="descending"),
).properties(
    width=800, height=300, title="Trading platforms: Customers by activity start time"
).display()

alt.Chart(active_cst_cb.rename(columns={"start_month": "Starting quarter"})).mark_area(
    opacity=0.7, interpolate="monotone"
).encode(
    x=alt.X("month:T", title=None),
    y=alt.Y("customers:Q", title="Customers"),
    color=alt.Color("Starting quarter:N"),
    order=alt.Order("Starting quarter", sort="descending"),
).properties(
    width=800, height=300, title="Crypto platforms: Customers by activity start time"
).display()

For trading platforms, around 30% of customers started before 2020. Each following quarter contributed a similar number of new customers.

For crypto platforms the churn looks a lot higher, because there were significantly fewer active customers in June 2021. Only around 15% of customers started before 2020, which makes sense given the very high increase in activity just before 2021.
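
The cohort logic used for these charts can also be sketched in pandas. The example below is an illustration on invented rows, assuming `month` is a `'yyyy-mm'` string as in `raw_data`; the helper `start_cohort` and the toy data are hypothetical and not part of the original notebook.

```python
import pandas as pd

# Hypothetical activity rows; month is a 'yyyy-mm' string as in raw_data.
toy = pd.DataFrame(
    {
        "user_created": ["a", "a", "b", "b", "c", "d"],
        "month": ["2019-11", "2021-05", "2020-02", "2021-05", "2021-04", "2021-05"],
    }
)


def start_cohort(start_month: str) -> str:
    """Map a 'yyyy-mm' first-activity month to 'Before 2020' or 'yyyy-Qn'."""
    if start_month < "2020-01":
        return "Before 2020"
    year, month = start_month.split("-")
    return f"{year}-Q{(int(month) - 1) // 3 + 1}"


# First active month per customer, then its cohort label.
first_month = toy.groupby("user_created")["month"].min()
toy["start_cohort"] = toy["user_created"].map(first_month).map(start_cohort)

# Distinct active customers per month and cohort. The comparison is
# lexicographic, so '2020-01' (not '2020-01-01') keeps January 2020.
active = (
    toy[toy["month"] >= "2020-01"]
    .groupby(["month", "start_cohort"])["user_created"]
    .nunique()
    .rename("customers")
    .reset_index()
)
print(active)
```

A cohort whose customer count shrinks month over month in such a table is what the text above reads as churn.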

In [59]:
HTML(
    """
<script>
    code_show=true; 
    function code_toggle() {
        if (code_show){
            $('div.input').hide();
        } else {
            $('div.input').show();
            }
        code_show = !code_show
 
        $('div.output_subarea').css("text-align", "center"); 
        $('body').css("font-family", "Montserrat, sans-serif");
        $('h1').css("font-family", "Karla, sans-serif");
        $('h2').css("font-family", "Karla, sans-serif");
    } 
    $( document ).ready(code_toggle);
    
    </script>
    <form action="javascript:code_toggle()">
        <input type="submit" value="Click here to toggle on/off the raw code.">
    </form>
"""
)