## SI Calculation: an example using generated sample trades

Here we will generate sample trades from the RTS 2 Annex III taxonomy.  Each sample trade is then enriched with the information needed run an SI calculation.

Once the trade data is assembled the the data normally provided by the regulator is synthesised.

Lastly, the SI calculations are run

The SI calculation includes a number of tests.  See the official word:
https://ec.europa.eu/transparency/regdoc/rep/3/2016/EN/3-2016-2398-EN-F1-1.PDF

# Step 1 - Prepare the trade data

The first step is to use the RTS 2 Annex III taxonomy to generate some sample trades.


In [1]:
import rts2_annex3
import random
import json

random.seed()

root = rts2_annex3.class_root

asset_class = root.asset_class_by_name("Credit Derivatives")

# Ask the Asset class to generate some sample trade
sample_trades = asset_class.make_test_samples(number=500)

# Print the one of the generated trades
print(vars(random.choice(sample_trades)))


{'sub_asset_class_name': 'Bespoke basket credit default swap (CDS)', 'asset_class_name': 'Credit Derivatives'}


## LEIs

In a real firm with real trades we would need to know the LEI (Legal Entity Identifier) of the legal entity which did each trade because SI status is reported distinctly for each legal entity (LEI).

Quite often firms will do trades within a single legal entity, perhaps to move risk from one trading desk to another.  These are called intra-entity trades and must be filtered out before the SI calculation.  For this example we'll say that all the trades we generated are inter-entity trades, trades between distinct legal entities, so we count them all.

In this example we'll use just one LEI, and not even a valid one, but it will suffice for the example.

In [2]:
# Add our LEI to each trade
our_lei = 'Our_LEI_here'
for sample_trade in sample_trades:
    sample_trade.our_lei = our_lei

# Print the one of the modified sample trades
print(vars(random.choice(sample_trades)))

{'sub_asset_class_name': 'Other credit derivatives', 'asset_class_name': 'Credit Derivatives', 'our_lei': 'Our_LEI_here'}


## Trade Date
The SI calculation includes checks for frequency, the number of trades done in a single week.  To work that out we need a trade date for each trade.  Here we'll just use a few dates and add these to our sample trades.

In [3]:
# We give each sample trade a trade date in a 30 day range of dates
# and an ISO week number (c.f. https://en.wikipedia.org/wiki/ISO_week_date)

import datetime

sample_dates = []
today = datetime.date.today()
for day_number in range(-30, 0):
    a_date =  today + datetime.timedelta(day_number)
    if a_date.weekday() < 6:
        sample_dates.append(a_date)

for sample_trade in sample_trades:
    selected_date = random.choice(sample_dates)
    sample_trade.trade_date = selected_date
    sample_trade.trade_date_week = selected_date.isocalendar()[1]
    
# Print the one of the modified sample trades
print(vars(random.choice(sample_trades)))

{'cds_sub_class': 'cds_sub_class.value', 'from_date': datetime.date(2018, 4, 25), 'sub_asset_class_name': 'Single name CDS options', 'to_date': datetime.date(2018, 5, 25), 'asset_class_name': 'Credit Derivatives', 'trade_date': datetime.date(2018, 4, 18), 'trade_date_week': 16, 'our_lei': 'Our_LEI_here'}


## MIC
The Market Identifier Code (MIC) is the ISO 10383 ID for a trading venue, for example a stock exchange.  The regulator is expected to provide a list of MIC values which identify venues which are recognised for the purposes of the SI calculation.  Trades which are done on vs. off recognised venues are counted differently.

In [4]:
# We define our MICs.  A MIC value is always 4 charcters in length.  The values used
# here are made-up nonsense, but good enough for an illustration

eea_mics = ['EEA1', 'EEA2', 'EEA3']
non_eea_mics = ['OFF1', 'OFF2', 'OFF3', 'OFF4']
all_mics = eea_mics + non_eea_mics

# Add a MIC to each sample trade
for sample_trade in sample_trades:
    sample_trade.mic = random.choice(all_mics)

# Print the one of the modified sample trades
print(vars(random.choice(sample_trades)))

{'cds_sub_class': 'cds_sub_class.value', 'from_date': datetime.date(2018, 4, 25), 'mic': 'OFF1', 'sub_asset_class_name': 'Single name CDS options', 'to_date': datetime.date(2018, 5, 25), 'asset_class_name': 'Credit Derivatives', 'trade_date': datetime.date(2018, 4, 20), 'trade_date_week': 16, 'our_lei': 'Our_LEI_here'}


## Own Account

We need to know if a trade was done on the firms own account.  Such trades are counted differently. 

In [5]:
# Own Account is simply a boolean.  Either this is a trade which the regulator views
# as being on own account, or not.  I use a random boolean with a probability.

own_account_probability = 0.25

for sample_trade in sample_trades:
    sample_trade.own_account = random.random() < own_account_probability
    
# Print the one of the modified sample trades
print(vars(random.choice(sample_trades)))

{'sub_asset_class_name': 'Other credit derivatives', 'own_account': True, 'asset_class_name': 'Credit Derivatives', 'mic': 'EEA2', 'trade_date': datetime.date(2018, 4, 5), 'trade_date_week': 14, 'our_lei': 'Our_LEI_here'}


## Client Order

We need to know if a trade was done in response to a client order.  Such trades are counted differently. 

In [6]:
# Client Order is also simply a boolean.  Either this is a trade which was done
# in response to a client order, or not.  I use a random boolean.

client_order_probability = 0.5

for sample_trade in sample_trades:
    sample_trade.client_order = random.random() < client_order_probability
    
# Print the one of the modified sample trades
print(vars(random.choice(sample_trades)))

{'client_order': True, 'sub_asset_class_name': 'Other credit derivatives', 'own_account': False, 'asset_class_name': 'Credit Derivatives', 'mic': 'OFF2', 'trade_date': datetime.date(2018, 4, 23), 'trade_date_week': 17, 'our_lei': 'Our_LEI_here'}


## EUR Notional
Another measure used by the SI calculation is the EUR notional value of each trade.  Here we assign a notional value to each trade.

In [7]:
# Add a random-ish Euro Notional amount of n million EUR to each trade
notional_amounts = [x * 1000000 for x in [1, 1, 1, 2, 2, 5, 10, 25]]

for sample_trade in sample_trades:
    sample_trade.eur_notional = random.choice(notional_amounts)

# Print one of the modified sample trades
print(vars(random.choice(sample_trades)))

{'from_date': datetime.date(2018, 4, 25), 'own_account': True, 'mic': 'OFF3', 'cds_index_sub_class': 'cds_index_sub_class.value', 'eur_notional': 10000000, 'sub_asset_class_name': 'CDS index options', 'to_date': datetime.date(2018, 10, 22), 'asset_class_name': 'Credit Derivatives', 'client_order': True, 'trade_date': datetime.date(2018, 3, 28), 'trade_date_week': 13, 'our_lei': 'Our_LEI_here'}


## RTS 2 Annex III Classification
The last step before we start the SI calculation is to add the RTS 2 Annex III classification to each trade.

In [8]:
# Now classify each trade and add the JSON classification back to the trade
for sample_trade in sample_trades:
    classification = root.classification_for(subject=sample_trade)
    json_classification = json.dumps(classification.classification_dict())
    sample_trade.rts2_classification = json_classification

print(random.choice(sample_trades).rts2_classification)

{"RTS2 version": "EU 2017/583 of 14 July 2016", "Asset class": "Credit Derivatives", "Sub-asset class": "CDS index options", "Segmentation criterion 1 description": "CDS index sub-class as specified for the sub-asset class of index credit default swap (CDS )", "Segmentation criterion 1": "cds_index_sub_class.value", "Segmentation criterion 2 description": "time maturity bucket of the option defined as follows:", "Segmentation criterion 2": "Maturity bucket 1: Zero to 6 months"}


In [None]:
# We define a class to represent an RTS2 sub class in the context of an
# SI calculation.  Instances can answer all the questions for three tests
# To keep the variable names a bit shorter I call 

class SubClass(object):
    def __init__(self, classification)
        self.classification = classification
        self.trades = []
        self._results = None
        self._is_liquid = None
        self._client_own_otc_count = None
        self._client_own_otc_avg_weekly_count = None
        self._client_own_otc_notional = None
        
        

## Put the trade data in to Pandas tables

The SI calculation requires a number of selections of the trade population.  See the comments below for details of each selection.

In [9]:
# Put the essential information for each trade into a Pandas table.

import pandas as pd

def si_details_from_sample(sample_trade):
    return dict(
        lei=sample_trade.our_lei,
        trade_date=sample_trade.trade_date,
        trade_date_week=sample_trade.trade_date_week,
        mic=sample_trade.mic,
        own_account=sample_trade.own_account,
        client_order=sample_trade.client_order,
        eur_notional=sample_trade.eur_notional,
        rts2_classification=sample_trade.rts2_classification,
        )

# The set of all trades (by LEI if there is more than one)
all_trades = pd.DataFrame.from_records([si_details_from_sample(s) for s in sample_trades])

# The subset of all trades which were not done on an EEA venue (i.e. OTC trades)
otc_trades = all_trades[~all_trades.mic.isin(eea_mics)]

# The subset of OTC trades which were done on the banks own account
own_account_otc_trades = otc_trades[otc_trades.own_account]

# The subset of own account OTC trades which were done in response to client orders
client_own_account_otc_trades = own_account_otc_trades[own_account_otc_trades.client_order]


# Step 2 - The Regulator Supplied Data

The regulator is expected to provide information about each sub class:
* Is the sub class liquid?
* How many trades of that sub class were done in the whole EU?
* What is the total EUR notional value traded in that sub class in the whole EU?

We don't have any regulator supplied data here so we synthesise some.

In [10]:
# For every RTS 2 sub class we need to decide if it is liquid or not
# We simply generate a random true/false for each sub class and use
# a dictionary to hold the result so we can look it up later.

distinct_sub_classes = all_trades.rts2_classification.unique()
liquidity_dictionary = dict()
for sub_class in distinct_sub_classes:
    is_liquid = random.random() < 0.5
    liquidity_dictionary[sub_class] = is_liquid
liquidity_dictionary.values()


dict_values([False, False, False, True, True, True, True, False, True, True, False, True])

In [41]:
# The values which will be compared with the EU trade count and sum(eur_notional)
# are the counts and totals of the own account OTC trades which were done in 
# response to client orders.  We synthesise the test EU numbers from these.

# First we get the counts and sum of notional, grouping by sub class (RTS 2 string)
notional_by_sub_class = client_own_account_otc_trades[['rts2_classification', 'eur_notional']]\
    .groupby(by='rts2_classification')
sums_series = notional_by_sub_class.agg(['count', 'sum'])
sums_df = pd.DataFrame(sums_series)

sums_df.columns = sums_df.columns.get_level_values(0)
sums_df = sums_df.reset_index()
sums_df.columns = pd.Index(
    ['rts2_classification', 'trade_count', 'trade_notional_sum'], 
    dtype='object')

# We add a column for the EU trade count for each sub class.  For this exercise, the
# threshold for being an SI is if our LEO count for the subclass is >= 2.5% of
# the EU count.  The EU figure is randomly set to be a bit more or a bit less than
# will trigger SI status.

sums_df['eu_count'] = sums_df['trade_count']\
    .apply(lambda x: x * 40 + random.choice([x * -1, x]) )

# Add a column for the EU notional for each sub class.  The threshold for
# notional is 1% of the EU figure.  Again the EU number is randomly tweaked
sums_df['eu_eur_notional'] = sums_df['trade_notional_sum']\
    .apply(lambda x: x * 100 + random.choice([x * -1, x]) )


# Add a column for the average number of trades per week
min_week_number = all_trades['trade_date_week'].min()
max_week_number = all_trades['trade_date_week'].max()
number_of_weeks = max_week_number - min_week_number + 1
sums_df['avg_weekly_trades'] = sums_df['trade_count']\
    .apply(lambda x: x /  number_of_weeks)

# Add a column which indicates if the subclass is liquid
sums_df['is_liquid'] = sums_df['rts2_classification']\
    .apply(lambda x: liquidity_dictionary[x] )

sums_df.head(2)

Unnamed: 0,rts2_classification,trade_count,trade_notional_sum,eu_count,eu_eur_notional,avg_weekly_trades,is_liquid
0,"{""RTS2 version"": ""EU 2017/583 of 14 July 2016""...",8,82000000,328,8118000000,1.6,False
1,"{""RTS2 version"": ""EU 2017/583 of 14 July 2016""...",5,49000000,205,4949000000,1.0,False


# Step 3 - Do the SI calculation

The "calculation" is really a set of filters which might catch an RTS 2 subclass for an LEI

1. If the RTS 2 Annex III sub class is liquid
   - and the count of client own-account otc trades >= 2.5% of eu_rts2_trade_count
   - and average weekly number of client own-account otc trades >= 1
2. If the RTS 2 Annex III sub class is not liquid 
   - and average weekly number of client own-account otc trade >= 1
3. If the sum of EUR notional for client own-account otc trades is
   - \>= 25% of all trades notional for the LEI
   - **or** >= 1% of EU trade notional

In [38]:
# Filter 1:  This is looking at the trades we have in client_own_account_otc_trades.
# We need to join the liquidity, EU trade count and weekly trade avereages data 
# with trades data frame, then we can for which the following three things are true:
#    The RTS 2 sub class is liquid
#    The count of our trades for that sub class >= 2.5% of the EU count
#    More than one trade per week was done on average

x1 = sums_df.copy()
x1[(x1.is_liquid) 
   & (x1.trade_count >= (x1.eu_count * 0.025))
   & (x1.avg_weekly_trades >= 1)]


Unnamed: 0,rts2_classification,trade_count,trade_notional_sum,eu_count,eu_eur_notional,avg_weekly_trades,is_liquid
3,"{""RTS2 version"": ""EU 2017/583 of 14 July 2016""...",8,78000000,312,7722000000,1.6,True
4,"{""RTS2 version"": ""EU 2017/583 of 14 July 2016""...",6,55000000,234,5445000000,1.2,True


In [40]:
# Filter 2 just looks at the average weekly number of trades for non liquid sub classes

x2 = sums_df.copy()
x2[(~x2.is_liquid) 
   & (x1.avg_weekly_trades > 1)]


Unnamed: 0,rts2_classification,trade_count,trade_notional_sum,eu_count,eu_eur_notional,avg_weekly_trades,is_liquid
0,"{""RTS2 version"": ""EU 2017/583 of 14 July 2016""...",8,82000000,312,8118000000,1.6,False


In [None]:
# Filter 3
# If the sum of EUR notional for client own-account otc trades is
#    >= 25% of all trades notional for the LEI
#    or >= 1% of EU trade notional
