# Corporate actions tracker

Function: This corporate actions app aggregates the dividend actions for a portfolio. It is an initial step towards market surveillance and risk management.

This Jupyter notebook lists the steps for inputting a hypothetical client’s portfolios into DataFusion. It forms part of the training and familiarization session for a practical DataFusion use case. Python is used to automate data input and linking in DataFusion.


## Procedure

The first step is to input data into DataFusion. This first requires a token generated under the OAuth2 security protocol. A token expires after 60 minutes without use. The postcurl function below automates token retrieval from the DataFusion server. Please enter your own user ID and password.

In [45]:
# -*- coding: utf-8 -*-

import requests, json, os, re  # re is needed by htmlparse below
from ast import literal_eval
from requests.packages.urllib3.exceptions import InsecureRequestWarning
requests.packages.urllib3.disable_warnings(InsecureRequestWarning)

def htmlparse(url):
    urlr=re.sub(':', '%3A',url)
    urlr=re.sub('/', '%2F',urlr)
    urlr=re.sub(' ', '%20',urlr)
    urlr=re.sub('#', '%23',urlr)
    return urlr

def postcurl():
    url = "https://dds-test.thomsonreuters.com/app/oauth/token"
    headers = {'content-type': 'application/json'}
    data = json.dumps({"username":"eric.tham@thomsonreuters.com", "password":"datafusion"})
    ans = requests.post(url, headers= headers, data =data, verify = False )
    ans = ans.json()  # parse the JSON response body (literal_eval fails on true/false/null)
    
    return ans['access_token']

token = postcurl()
print "Token generated is " + token

def posturl(url):
    headers = {'Content-Type': 'application/json',  'Authorization' : 'Bearer ' + token }
    response = requests.post(url, headers= headers, verify = False )
    
    print "Response code is " + str(response.status_code) # response code if ok should be 2xx 
    print response.text


Token generated is slieekkma3u657n0od5mka23uub2gm43
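Because the token expires after 60 minutes without use, a long session may need to fetch a new one. The sketch below shows one way to do that, assuming the idle window described above. `TokenCache` and its parameters are hypothetical names introduced for illustration, not part of DataFusion; the `fetch` argument would be a function such as `postcurl`.

```python
import time

class TokenCache(object):
    """Re-fetch a token once it has been idle longer than max_idle_seconds."""

    def __init__(self, fetch, max_idle_seconds=60 * 60, clock=time.time):
        self._fetch = fetch                # callable returning a fresh token, e.g. postcurl
        self._max_idle = max_idle_seconds  # assumed idle window (60 minutes per the text)
        self._clock = clock                # injectable so the logic can be checked without waiting
        self._token = None
        self._last_used = None

    def get(self):
        now = self._clock()
        idle_too_long = (self._last_used is None
                         or (now - self._last_used) > self._max_idle)
        if self._token is None or idle_too_long:
            self._token = self._fetch()
        self._last_used = now  # every use restarts the idle window
        return self._token
```

With this in place, `cache = TokenCache(postcurl)` followed by `cache.get()` before each request would return the current token or request a new one when the old one has probably lapsed.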


Generate a context ID. A context is like a data source in DataFusion: it encapsulates all the entities and predicates for a particular purpose.

In [None]:
contextname = "TestUse3" # please enter a new context name
url = "https://dds-test.thomsonreuters.com/app/api/context?name=" + contextname
print url
posturl(url)

The context needs to be created only once. This can be done through the API at https://dds-test.thomsonreuters.com/app/api#!/context/postContext
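The context name is passed in the query string, so a name containing spaces or `#` would need percent-encoding. The following is a minimal sketch using the standard library; `context_url` and `BASE_URL` are hypothetical helper names, and it assumes the server accepts standard form encoding (spaces as `+`) for the `name` parameter.

```python
try:
    from urllib.parse import urlencode  # Python 3
except ImportError:
    from urllib import urlencode        # Python 2

BASE_URL = "https://dds-test.thomsonreuters.com/app/api"

def context_url(name, base_url=BASE_URL):
    """Return the context-creation URL with the context name encoded."""
    return base_url + "/context?" + urlencode({"name": name})

print(context_url("TestUse3"))
```

For a plain name such as `TestUse3` the result is identical to the URL built by string concatenation above.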

The next step involves the portfolio creation. The portfolio details are as below:
    
| Account | Acc_Open | Acc_Close | Acc_Type | Acc_Ccy | Acc_limit | Acc_limit_ccy |
|---|---|---|---|---|---|---|
| ACC_TRDSG_001 | 2016-01-23 | 2030-01-01 | Client | Multi | 10000000 | USD |
| ACC_TRDSG_002 | 2016-01-24 | 2030-01-02 | Client | Multi | 500000 | USD |
| ACC_TRDSG_003 | 2015-01-25 | 2030-01-03 | Prop | HKD | 4000000 | USD |

This table is used to create the attached create_account.nt file. The triples for #ACC_TRDSG_001 are shown below.

<http://sg.thomsonreuters.com/account/#ACC_TRDSG_001> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.entagen.com/ns/type/account> . 
<http://sg.thomsonreuters.com/account/#ACC_TRDSG_001> <http://www.w3.org/2000/01/rdf-schema#label> "ACC_TRDSG_001" .
<http://sg.thomsonreuters.com/account/#ACC_TRDSG_001> <http://sg.thomsonreuters.com/account/pred/acc_open> "2016-01-23" .
<http://sg.thomsonreuters.com/account/#ACC_TRDSG_001> <http://sg.thomsonreuters.com/account/pred/acc_close> "2030-01-01" .
<http://sg.thomsonreuters.com/account/#ACC_TRDSG_001> <http://sg.thomsonreuters.com/account/pred/acc_type> "Client" .
<http://sg.thomsonreuters.com/account/#ACC_TRDSG_001> <http://sg.thomsonreuters.com/account/pred/acc_ccy> "Multi" .

This is the .nt (N-Triples) format for RDF files. The first triple gives the entity type of #ACC_TRDSG_001, following standard RDF 1.1 syntax; in this case it belongs to the master entity type "Account". The second triple states the label. The remaining triples are properties of the entity. The account can be created through the API at https://dds-test.thomsonreuters.com/app/api#!/context/insertFile or with the postfile(url, datafile, token) function below.
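The mapping from one table row to its triples can be sketched as below. This is an illustration rather than the script that produced create_account.nt: `account_triples` is a hypothetical helper, it emits only the four property predicates shown in the example (the Acc_limit columns are left out because their predicates are not shown), and it does not escape quotes inside literal values.

```python
RDF_TYPE = "http://www.w3.org/1999/02/22-rdf-syntax-ns#type"
RDFS_LABEL = "http://www.w3.org/2000/01/rdf-schema#label"
ACCOUNT_TYPE = "http://rdf.entagen.com/ns/type/account"
ACCOUNT_NS = "http://sg.thomsonreuters.com/account/"

def account_triples(account, acc_open, acc_close, acc_type, acc_ccy):
    """Return the N-Triples lines describing one account row."""
    subject = "<%s#%s>" % (ACCOUNT_NS, account)
    lines = [
        # First triple: entity type; second triple: label
        "%s <%s> <%s> ." % (subject, RDF_TYPE, ACCOUNT_TYPE),
        '%s <%s> "%s" .' % (subject, RDFS_LABEL, account),
    ]
    # Remaining triples: one literal property per column
    for pred, value in [("acc_open", acc_open), ("acc_close", acc_close),
                        ("acc_type", acc_type), ("acc_ccy", acc_ccy)]:
        lines.append('%s <%spred/%s> "%s" .' % (subject, ACCOUNT_NS, pred, value))
    return "\n".join(lines) + "\n"

print(account_triples("ACC_TRDSG_001", "2016-01-23", "2030-01-01", "Client", "Multi"))
```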

In [67]:
def postfile(url, datafile, token):   
    headers = {'Authorization' : 'Bearer ' + token }
    
    files = {'file': ('datafile.csv', datafile, 'multipart/form-data')}
    response = requests.post(url, headers = headers, files = files,  verify = False) #, data =values)
    print "Response code is " + str(response.status_code)
    print response.text
    return response

cwd = os.getcwd()
fname = cwd + "\\create_account.nt"
fhand = open(fname, "rb")
context_id = 154    #CorpAction1 ID
data = fhand
print "Present directory is = " + cwd
url_postfile = "https://dds-test.thomsonreuters.com/app/api/context/" + str(context_id) + "/rdf/file?importFormat=N-Triples"
res = postfile(url_postfile, data , token)
fhand.close()

Present directory is = C:\Users\u6038155\Documents\Products\DataFusion\Corporate Action
Response code is 200
{"id":154,"name":"CorpAction1","uri":"http://rdf.entagen.com/context/1474011354459/CorpAction1","lastBatchReceived":null,"predicates":[{"id":17638,"uri":"http://sg.thomsonreuters.com/account/pred/acc_close"},{"id":17639,"uri":"http://www.w3.org/1999/02/22-rdf-syntax-ns#type"},{"id":17640,"uri":"http://www.w3.org/2000/01/rdf-schema#label"},{"id":17643,"uri":"http://sg.thomsonreuters.com/account/pred/acc_open"},{"id":17642,"uri":"http://sg.thomsonreuters.com/account/pred/acc_type"},{"id":17641,"uri":"http://sg.thomsonreuters.com/account/pred/acc_ccy"}],"rdfTypes":[{"id":2377,"uri":"http://rdf.entagen.com/ns/type/account","shortName":"account","totalCount":3,"mappedLabelCount":1,"mappedStitchValueCount":0,"mappedEntityTypeId":23}],"remoteContextUri":null,"jdbcRepository":null}


Check the status of the RDF file import with a GET request.

In [60]:
def geturl(url, token):
    headers = {'Authorization' : 'Bearer ' + token }
    response = requests.get(url, headers = headers, verify = False) 
    print "Response code is " + str(response.status_code)
    print response.text
    return response

url = "https://dds-test.thomsonreuters.com/app/api/context/154/rdf/status"
response = geturl(url, token)

Response code is 200
{"completed":"100","status":"100% -- Saving context metadata..."}


<Response [200]>
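For larger files the import may not be finished on the first check. A polling loop along the following lines could wait for completion. It is a sketch that assumes the status endpoint always reports `"completed": "100"` when done, as in the output above; `wait_for_import` and its parameters are hypothetical names.

```python
import time

def wait_for_import(get_status, poll_seconds=2.0, max_polls=30, sleep=time.sleep):
    """Call get_status() until it reports completed == "100"; return the final status."""
    status = None
    for _ in range(max_polls):
        status = get_status()
        if str(status.get("completed")) == "100":
            return status
        sleep(poll_seconds)  # wait before asking again
    raise RuntimeError("import not complete after %d polls; last status: %r"
                       % (max_polls, status))
```

In this notebook it could be called as `wait_for_import(lambda: geturl(url, token).json())`, since `geturl` returns the `requests` response object.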

The portfolio accounts are now created in DataFusion. The next step is to create the corporate action entities, which are linked to the instrument entities. Example triples are:

<http://sg.thomsonreuters.com/corpaction/#8590932301> <http://www.w3.org/1999/02/22-rdf-syntax-ns#type> <http://rdf.entagen.com/ns/type/corporateaction> .
<http://sg.thomsonreuters.com/corpaction/#8590932301> <http://www.w3.org/2000/01/rdf-schema#label> "Apple Ord Shs CorporateAction" .
<http://sg.thomsonreuters.com/corpaction/#8590932301> <https://permid.org/PermIdLink> <https://permid.org/1-8590932301> .

These are the triples for the Apple Ord Shs corporate action. They are stored in the corporate_action.nt file, which is created from corporate_action.csv by the Python function below. The first two triples are the standard type definition and label. The third triple links the corporate action to the correct instrument.

In [70]:
import pandas as pd
def createCorpAction(fname):
    df = pd.read_csv(fname, sep=",")  # read_csv opens and closes the file itself
    suburi = "<http://sg.thomsonreuters.com/corpaction/"
    
    df_permid = df[["PermId", "Instruments"]].drop_duplicates()
    rdf_1 = ""
    
    for rid, val in df_permid.iterrows():
        rdftemp = suburi + "#" + str(val["PermId"]) + "> "
        rdftemp = rdftemp + "<http://www.w3.org/1999/02/22-rdf-syntax-ns#type> "
        rdftemp = rdftemp + "<http://rdf.entagen.com/ns/type/corporateaction> .\n"
        
        rdftemp = rdftemp + suburi + "#" + str(val["PermId"]) + "> "
        rdftemp = rdftemp + "<http://www.w3.org/2000/01/rdf-schema#label> "
        rdftemp = rdftemp + "\"" + val["Instruments"] + " CorporateAction" + "\" .\n"
        
        rdftemp = rdftemp + suburi + "#" + str(val["PermId"]) + "> "
        rdftemp = rdftemp + "<https://permid.org/PermIdLink> "  # change this to 
        # rdftemp = rdftemp + "<http://instru.thomsonreuters.com/pred/CorporateActionLink> "
        rdftemp = rdftemp + "<https://permid.org/1-" + str(val["PermId"]) + "> .\n"
        
        rdf_1 = rdf_1 + rdftemp
    
    rdf_2 = ""
    for rid, val in df.iterrows():
        rdftemp = suburi + "#" + str(val["PermId"]) + "> "  # start a fresh block for each row
        rdftemp = rdftemp + "<http://account.thomsonreuters.com/CorporateActionType> "
        rdftemp = rdftemp + "\"" + str(val["CorpActionType"]) + "\" .\n"

        rdftemp = rdftemp + suburi + "#" + str(val["PermId"]) + "> "
        rdftemp = rdftemp + "<http://account.thomsonreuters.com/" + str(val["Date"]) + "/CorporateActionDate> " 
        rdftemp = rdftemp + "\"" + str(val["Date"]) + "\" .\n"        
        
        rdf_2 = rdf_2 + rdftemp
    
    return rdf_1 + "\n" + rdf_2

fname = cwd + "\\corporate_action.csv"
fout = cwd + "\\corporate_action.nt"
rdf = createCorpAction(fname)

# Close the output file so the triples are flushed to disk before it is read again
rdfout = open(fout, "w")
rdfout.write(rdf)
rdfout.close()

print "File corporate_action.nt successfully created"

File corporate_action.nt successfully created


The corporate_action.nt file is then input into DataFusion.

In [72]:
rdfout = open(fout, "r")
res = postfile(url_postfile, rdfout , token)
rdfout.close()

Response code is 200
{"id":154,"name":"CorpAction1","uri":"http://rdf.entagen.com/context/1474011354459/CorpAction1","lastBatchReceived":null,"predicates":[{"id":17658,"uri":"http://account.thomsonreuters.com/2016-10-16/CorporateActionDate"},{"id":17641,"uri":"http://sg.thomsonreuters.com/account/pred/acc_ccy"},{"id":17642,"uri":"http://sg.thomsonreuters.com/account/pred/acc_type"},{"id":17662,"uri":"http://account.thomsonreuters.com/2016-10-19/CorporateActionDate"},{"id":17643,"uri":"http://sg.thomsonreuters.com/account/pred/acc_open"},{"id":17645,"uri":"http://account.thomsonreuters.com/2016-10-27/CorporateActionDate"},{"id":17646,"uri":"http://account.thomsonreuters.com/2016-10-25/CorporateActionDate"},{"id":17652,"uri":"http://account.thomsonreuters.com/2016-10-28/CorporateActionDate"},{"id":17660,"uri":"http://account.thomsonreuters.com/2016-10-22/CorporateActionDate"},{"id":17661,"uri":"http://account.thomsonreuters.com/2016-10-24/CorporateActionDate"},{"id":17639,"uri":"http://w

At this juncture, we have created the accounts and the corporate actions. The corporate actions are also linked to the instruments through a predicate. We further illustrate how to enter additional properties for an existing instrument. 
<img src="images/instrument_addppty.JPG"> 
As shown here, the existing instrument does not have ccy and price properties. It has also been linked to a corporate action.

The ccy and price are input by posting the add_instru_details.nt file.

<https://permid.org/1-8590932301> <http://instru.thomsonreuters.com/pred/ccy> "USD".
<https://permid.org/1-8590932301> <http://instru.thomsonreuters.com/pred/price> "10.4".
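Property triples like these can be generated for any instrument. The sketch below assumes the same predicate namespace as the two lines above; `instrument_property_triples` is a hypothetical helper, and it writes a space before the closing full stop, which N-Triples also accepts.

```python
INSTRUMENT_PRED_NS = "http://instru.thomsonreuters.com/pred/"

def instrument_property_triples(permid, properties):
    """Return N-Triples lines adding literal properties to an existing instrument."""
    subject = "<https://permid.org/1-%s>" % permid
    lines = []
    for name in sorted(properties):  # sorted for a stable, reproducible output order
        lines.append('%s <%s%s> "%s" .'
                     % (subject, INSTRUMENT_PRED_NS, name, properties[name]))
    return "\n".join(lines) + "\n"

print(instrument_property_triples("8590932301", {"ccy": "USD", "price": "10.4"}))
```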

In [74]:
fname = cwd + "\\add_instru_details.nt"
rdf = open(fname, "r")
res = postfile(url_postfile, rdf , token)
rdf.close()

Response code is 200
{"id":154,"name":"CorpAction1","uri":"http://rdf.entagen.com/context/1474011354459/CorpAction1","lastBatchReceived":null,"predicates":[{"id":17652,"uri":"http://account.thomsonreuters.com/2016-10-28/CorporateActionDate"},{"id":17654,"uri":"http://account.thomsonreuters.com/2016-10-18/CorporateActionDate"},{"id":17641,"uri":"http://sg.thomsonreuters.com/account/pred/acc_ccy"},{"id":17658,"uri":"http://account.thomsonreuters.com/2016-10-16/CorporateActionDate"},{"id":17662,"uri":"http://account.thomsonreuters.com/2016-10-19/CorporateActionDate"},{"id":17639,"uri":"http://www.w3.org/1999/02/22-rdf-syntax-ns#type"},{"id":17638,"uri":"http://sg.thomsonreuters.com/account/pred/acc_close"},{"id":17660,"uri":"http://account.thomsonreuters.com/2016-10-22/CorporateActionDate"},{"id":17645,"uri":"http://account.thomsonreuters.com/2016-10-27/CorporateActionDate"},{"id":17655,"uri":"http://account.thomsonreuters.com/2016-10-30/CorporateActionDate"},{"id":17647,"uri":"http://acc

The next stage is to enter the instrument records for a portfolio. For example, portfolio account #ACC_TRDSG_001 holds the instruments below. The triples only identify which instruments are in the portfolio.
The quantity and the open/close dates still need to be input as annotations on the predicate.
<http://sg.thomsonreuters.com/account/#ACC_TRDSG_001> <http://pred.thomsonreuters.com/account/trades> <https://permid.org/1-8590932301> .
<http://sg.thomsonreuters.com/account/#ACC_TRDSG_001> <http://pred.thomsonreuters.com/account/trades> <https://permid.org/1-21475210032> .
<http://sg.thomsonreuters.com/account/#ACC_TRDSG_001> <http://pred.thomsonreuters.com/account/trades> <https://permid.org/1-8591112257> .
<http://sg.thomsonreuters.com/account/#ACC_TRDSG_001> <http://pred.thomsonreuters.com/account/trades> <https://permid.org/1-8590934012> .
<http://sg.thomsonreuters.com/account/#ACC_TRDSG_001> <http://pred.thomsonreuters.com/account/trades> <https://permid.org/1-21475875541> .
<http://sg.thomsonreuters.com/account/#ACC_TRDSG_001> <http://pred.thomsonreuters.com/account/trades> <https://permid.org/1-8590928320> .
<http://sg.thomsonreuters.com/account/#ACC_TRDSG_001> <http://pred.thomsonreuters.com/account/trades> <https://permid.org/1-8589980998> .

In [81]:
def createRDF_positions(fname):
    fhand = open(fname)
    df = pd.read_csv(fhand, sep=",")
    suburi = "<http://sg.thomsonreuters.com/account/#"
    fhand.close()
    
    jsonout = ""    
    for rid, val in df.iterrows():
        jsontemp = suburi + val["Account"] + "> "
        jsontemp = jsontemp + "<http://pred.thomsonreuters.com/account/trades> "
        jsontemp = jsontemp + "<https://permid.org/1-" + str(val["Instruments_PermID"]) + "> .\n"
        
        jsonout = jsonout + jsontemp
    
    return jsonout      # string output

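fname_1 = cwd + "\\create_positions.csv"
fname_2 = cwd + "\\create_positions.nt"

# Write the position triples to create_positions.nt before posting the file
rdfout = open(fname_2, "w")
rdfout.write(createRDF_positions(fname_1))
rdfout.close()

rdf = open(fname_2, "r")
res = postfile(url_postfile, rdf , token)
rdf.close()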

Response code is 200
{"id":154,"name":"CorpAction1","uri":"http://rdf.entagen.com/context/1474011354459/CorpAction1","lastBatchReceived":null,"predicates":[{"id":17639,"uri":"http://www.w3.org/1999/02/22-rdf-syntax-ns#type"},{"id":17658,"uri":"http://account.thomsonreuters.com/2016-10-16/CorporateActionDate"},{"id":17667,"uri":"http://pred.thomsonreuters.com/account/trades"},{"id":17641,"uri":"http://sg.thomsonreuters.com/account/pred/acc_ccy"},{"id":17652,"uri":"http://account.thomsonreuters.com/2016-10-28/CorporateActionDate"},{"id":17662,"uri":"http://account.thomsonreuters.com/2016-10-19/CorporateActionDate"},{"id":17644,"uri":"http://account.thomsonreuters.com/2016-10-29/CorporateActionDate"},{"id":17647,"uri":"http://account.thomsonreuters.com/2016-10-31/CorporateActionDate"},{"id":17648,"uri":"http://account.thomsonreuters.com/2016-10-21/CorporateActionDate"},{"id":17656,"uri":"http://account.thomsonreuters.com/2016-10-20/CorporateActionDate"},{"id":17660,"uri":"http://account.t

In [80]:
def createAnnotate(fname):
    fhand = open(fname)
    df = pd.read_csv(fhand, sep=",")
    fhand.close()
    
    subjecturi = "<http://sg.thomsonreuters.com/account/#"
    predicateUri = "<https://permid.org/PermIdLink>"  # doesn't end with "/"
    objectUri = "<https://permid.org/1-"
    
    items = []
    
    for rid, val in df.iterrows():
        annotate = {}
        annotate["subject"]= subjecturi + val["Account"] + ">"
        annotate["predicate"]= predicateUri
        annotate["object"]= objectUri + str(val["Instruments_PermID"]) + ">"
        annotate["score"]= -1
        annotate["activeRange"]= -1     # this will be the life of the instrument in the account
        properties={"Qty" : {"value": str(val["Qty"]) , "type": "NUMBER"} }

        annotate["properties"]= properties
        items.append(annotate)
    
    annotatejson = {"items" : items}
       
    return annotatejson 

def postannotate(annotate, token_DF):   
    url = "https://dds-test.thomsonreuters.com/app/api/annotation/bulk"
    headers = {'Content-Type': 'application/json', 'Accept': 'text/plain', 'Authorization' : 'Bearer ' + token_DF }
    response = requests.post(url, data = json.dumps(annotate), headers=headers, verify = False) #, data =values)
    print "Response code is " + str(response.status_code)  # response code if ok should be 2xx
    print response.text

fname_1 = cwd + "\\create_positions.csv"
fname_3 = cwd + "\\trade_annotate.json"
annotate = createAnnotate(fname_1)
fhand_3 = open(fname_3, "w")

json.dump(annotate, fhand_3)
fhand_3.close()

postannotate(annotate, token)

Response code is 500
[HTML body of the server's "System Error" page omitted]