# Using Custom Entities to track Portfolio data quality

In this notebook, we demonstrate how Custom Entities can be used to support data quality checks on Portfolios.
We will create a new Custom Entity type named Portfolio_DataQuality, which stores a list of attributes describing the 'quality' of a Portfolio.
Each Portfolio will be linked (via the [Relationship feature](https://support.lusid.com/knowledgebase/article/KA-01679/)) to one or more Portfolio_DataQuality entities.

In [1]:
from lusidtools.jupyter_tools import toggle_code

"""Custom Entities in LUSID 

Illustrates the use of Custom Entities.

Attributes
----------
Custom Entities
Portfolios
Data Quality
Relationships
"""

toggle_code("Toggle Docstring")

In [2]:
import os
import pandas as pd
import numpy as np
from datetime import datetime, timezone, date
import io
import json
import pytz
from IPython.core.display import HTML

# Then import the key modules from the LUSID package (i.e. The LUSID SDK)
import lusid as lu
import lusid.models as lm
import fbnsdkutilities.utilities as utils

# And use absolute imports to import key functions from Lusid-Python-Tools and other helper packages

from lusidjam import RefreshingToken
from lusidtools.cocoon.cocoon import load_from_data_frame
from lusidtools.pandas_utils.lusid_pandas import lusid_response_to_data_frame
from lusidtools.jupyter_tools import StopExecution
from lusidtools.cocoon.cocoon_printer import (
    format_portfolios_response,
)

# Set DataFrame display formats
pd.set_option("max_colwidth", 100)
pd.set_option("display.max_columns", None)
pd.set_option("display.max_rows", None)
pd.options.display.float_format = "{:,.2f}".format
display(HTML("<style>.container { width:90% !important; }</style>"))

# Set the secrets path
secrets_path = os.getenv("FBN_SECRETS_PATH")

# For running the notebook locally
if secrets_path is None:
    secrets_path = os.path.join(os.path.dirname(os.getcwd()), "secrets.json")

api_factory = utils.ApiClientFactory(
        lu,
        token=RefreshingToken(),
        api_secrets_filename = secrets_path,
        app_name="LusidJupyterNotebook")

api_status = pd.DataFrame(
    api_factory.build(lu.ApplicationMetadataApi).get_lusid_versions().to_dict()
)

display(api_status)

Unnamed: 0,api_version,build_version,excel_version,links
0,v0,0.6.11252.0,0.5.3234,"{'relation': 'RequestLogs', 'href': 'http://fbn-ci.lusid.com/app/insights/logs/0HMQD9IG2C6IP:000..."


## Table of Contents
* [1. Create Portfolios](#-Create-Portfolios)
* [2. Create Custom Entity Portfolio_DataQuality](#-Create-CustomEntities)
* [3. Display Portfolios Data Quality](#-Display-DataQuality)
* [4. Create Relationship {Portfolio_DataQuality, Portfolio}](#-Create-Relationships)
* [5. Display Portfolio relationships](#-Display-Relationships)


# 1. Create Portfolios<a class="anchor" id="-Create-Portfolios"></a>

We'll start by upserting the two Portfolios listed in 'data/custom_entities_portfolios.csv':
- Portfolio X
- Portfolio Y

In [3]:
#read and create portfolios

portfolio_df = pd.read_csv('data/custom_entities_portfolios.csv')
scope = 'Fbn_CE'
result = load_from_data_frame(
    api_factory=api_factory,
    scope=scope,
    data_frame=portfolio_df,
    mapping_required={
        "code": "code",
        "display_name": "display_name",
        "base_currency": "base_currency"
    },
    mapping_optional={
        "created": "$2020-01-01T00:00:00+00:00"
    },
    file_type="portfolios",
)

succ, failed = format_portfolios_response(result)
pd.DataFrame(data=[{"success": len(succ), "failed": len(failed)}])

Unnamed: 0,success,failed
0,2,0
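
Before calling `load_from_data_frame`, it can help to confirm that the CSV contains every column named in `mapping_required`. The following is a minimal sketch using only the standard library. The helper name `missing_required_columns` and the inline sample rows (including the GBP and USD currencies) are hypothetical and are not taken from the notebook's data file.

```python
import csv
import io

# Column names taken from the mapping_required dictionary used above
REQUIRED_COLUMNS = ("code", "display_name", "base_currency")


def missing_required_columns(csv_text, required=REQUIRED_COLUMNS):
    """Return the required column names that are absent from the CSV header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    header = reader.fieldnames or []  # fieldnames is None for an empty file
    return [name for name in required if name not in header]


# Hypothetical sample; the real file is data/custom_entities_portfolios.csv
sample_csv = (
    "code,display_name,base_currency\n"
    "PortfolioX,Portfolio X,GBP\n"
    "PortfolioY,Portfolio Y,USD\n"
)

print(missing_required_columns(sample_csv))                      # []
print(missing_required_columns("code,display_name\nA,Alpha\n"))  # ['base_currency']
```

An empty result means the header satisfies the required mapping; it says nothing about whether the values in each row are valid.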


# 2. Create Custom Entity Portfolio_DataQuality<a class="anchor" id="-Create-CustomEntities"></a>

This section creates a new entity type that models the attributes needed to check a portfolio's data quality. You can learn more about LUSID Custom Entities [here](https://support.lusid.com/knowledgebase/article/KA-01750/en-us).

In [4]:
#create a new custom entity definition Portfolio_DataQuality

custom_entity_definitions_api = api_factory.build(lu.CustomEntityDefinitionsApi)

entity_type_name = 'Portfolio_DataQuality'
display_name = 'Portfolio DataQuality'
description = 'Defines the quality of the associated Portfolio'
field_schema = [
    lm.CustomEntityFieldDefinition(name='portfolioCode', lifetime='Perpetual', type = 'String', required=True),
    lm.CustomEntityFieldDefinition(name='effectiveDate', lifetime='Perpetual', type = 'DateTime', required=True),
    lm.CustomEntityFieldDefinition(name='returnsDataNOFReceived', lifetime='Perpetual', type = 'Boolean', required=False),
    lm.CustomEntityFieldDefinition(name='returnsDataGOFReceived', lifetime='Perpetual', type = 'Boolean', required=False),
    lm.CustomEntityFieldDefinition(name='performanceDataQuality', lifetime='Perpetual', type = 'String', required=False),
    lm.CustomEntityFieldDefinition(name='holdingDataReceived', lifetime='Perpetual', type = 'Boolean', required=False),
    lm.CustomEntityFieldDefinition(name='unitPriceDataReceived', lifetime='Perpetual', type = 'Boolean', required=False)
]

try:
    new_definition_response = custom_entity_definitions_api.create_custom_entity_definition(
        lm.CustomEntityDefinitionRequest(entity_type_name,display_name,description,field_schema))

    print(f'Custom Entity {new_definition_response.entity_type} has been created')
    
except lu.ApiException as e:
    if json.loads(e.body)["code"] != 791: # CustomEntityDefinitionAlreadyExists
        raise e
        
    print(f'Custom Entity {entity_type_name} already exists')
    

Custom Entity Portfolio_DataQuality already exists
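
The field schema above can also be checked locally before any values are sent to LUSID. The sketch below mirrors the seven fields in a plain dictionary and reports missing required fields, type mismatches, and unknown field names. The function name `validate_fields` and the mapping of 'String', 'DateTime' and 'Boolean' to Python's `str`, `date` and `bool` are assumptions made for illustration; this is not how LUSID itself validates entities.

```python
from datetime import date

# Mirrors the field_schema defined above: name -> (Python type, required).
# Mapping 'String'/'DateTime'/'Boolean' to str/date/bool is an assumption of this sketch.
FIELD_SCHEMA = {
    "portfolioCode": (str, True),
    "effectiveDate": (date, True),  # datetime is a subclass of date, so both pass
    "returnsDataNOFReceived": (bool, False),
    "returnsDataGOFReceived": (bool, False),
    "performanceDataQuality": (str, False),
    "holdingDataReceived": (bool, False),
    "unitPriceDataReceived": (bool, False),
}


def validate_fields(fields):
    """Return a list of problems; an empty list means the values fit the schema."""
    problems = []
    for name, (expected_type, required) in FIELD_SCHEMA.items():
        if name not in fields:
            if required:
                problems.append(f"missing required field '{name}'")
        elif not isinstance(fields[name], expected_type):
            problems.append(
                f"field '{name}' should be {expected_type.__name__}, "
                f"got {type(fields[name]).__name__}"
            )
    problems.extend(
        f"unknown field '{name}'" for name in fields if name not in FIELD_SCHEMA
    )
    return problems


print(validate_fields({"portfolioCode": "PortfolioX", "effectiveDate": date(2022, 6, 6)}))
# []
print(validate_fields({"portfolioCode": "PortfolioX", "holdingDataReceived": "yes"}))
# ["missing required field 'effectiveDate'", "field 'holdingDataReceived' should be bool, got str"]
```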


In order to upsert and retrieve Custom Entities, we need to set up a new identifier property named `portfolioDqId`. It is used to uniquely identify Custom Entity instances.

In [5]:
#create a unique identifier for Portfolio_DataQuality

property_definitions_api = api_factory.build(lu.PropertyDefinitionsApi)

property_definition = lm.CreatePropertyDefinitionRequest(
    domain="CustomEntity",
    scope=scope,
    code='portfolioDqId',
    display_name='Portfolio_DataQuality Id',
    constraint_style="Identifier",
    data_type_id=lm.ResourceId(scope="system", code="string"),
)

try:
    property_definitions_api.create_property_definition(
        create_property_definition_request=property_definition
    )
except lu.ApiException as e:
    # Re-raise anything other than "already exists" so real failures are not silently swallowed
    if json.loads(e.body)["name"] != "PropertyAlreadyExists":
        raise
    print(
        f"Property {property_definition.domain}/{property_definition.scope}/{property_definition.code} already exists"
    )


Property CustomEntity/Fbn_CE/portfolioDqId already exists
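
The cells in this notebook repeat one pattern: attempt a create call, then inspect the JSON error body to decide whether the failure only means "already exists". A small helper can make that decision explicit. The sketch below is a hypothetical utility (the name `is_already_exists` is not part of the LUSID SDK) and the sample bodies contain only the `name` and `code` keys that the notebook reads; real error bodies carry more fields.

```python
import json


def is_already_exists(error_body, *, code=None, name=None):
    """Return True when a JSON error body matches the given error code or name."""
    try:
        body = json.loads(error_body)
    except (TypeError, ValueError):
        # Not JSON (or None): treat it as an unexpected failure
        return False
    if not isinstance(body, dict):
        return False
    if code is not None and body.get("code") == code:
        return True
    if name is not None and body.get("name") == name:
        return True
    return False


# Simplified, hypothetical bodies using the values checked in the cells above
print(is_already_exists('{"name": "PropertyAlreadyExists"}', name="PropertyAlreadyExists"))  # True
print(is_already_exists('{"code": 791}', code=791))                                         # True
print(is_already_exists("service unavailable", code=791))                                   # False
```

In an `except lu.ApiException as e:` block, the caller would re-raise whenever `is_already_exists(e.body, ...)` returns False, so that unrelated failures are not hidden.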


We can now upsert `~Portfolio_DataQuality` instances for PortfolioX and PortfolioY. Each instance represents a data quality check for one portfolio on a specific day.

In [6]:
#upsert Portfolio_DataQuality instances

def create_data_quality_entity(scope,portfolioCode,effective_date,display_name,
    returnsDataNOFReceived,returnsDataGOFReceived,performanceDataQuality,holdingDataReceived,unitPriceDataReceived):
    try:
        custom_entities_api = api_factory.build(lu.CustomEntitiesApi)

        custom_entity_id= lm.CustomEntityId(
            identifier_scope= scope,
            identifier_type='portfolioDqId',
            identifier_value=f'port-dq-check-{scope}-{portfolioCode}-{effective_date}')
        fields = [
            lm.CustomEntityField('portfolioCode',portfolioCode),
            lm.CustomEntityField('effectiveDate',effective_date),
            lm.CustomEntityField('returnsDataNOFReceived',returnsDataNOFReceived),
            lm.CustomEntityField('returnsDataGOFReceived',returnsDataGOFReceived),
            lm.CustomEntityField('performanceDataQuality',performanceDataQuality),
            lm.CustomEntityField('holdingDataReceived',holdingDataReceived),
            lm.CustomEntityField('unitPriceDataReceived',unitPriceDataReceived)
        ]

        custom_entities_api.upsert_custom_entity('~Portfolio_DataQuality',lm.CustomEntityRequest(
            display_name=display_name,
            description=f'Data Quality of {portfolioCode} on {effective_date}',
            identifiers=[custom_entity_id],
            fields=fields
        ))

        print(f"Data Quality with portfolioDqId: port-dq-check-{scope}-{portfolioCode}-{effective_date} has been created")

    except lu.ApiException as e:
        body = json.loads(e.body)
        if body["code"] != 667:  # 667 = RelationDefinitionAlreadyExists; any other error prints the full body
            print(body)
        else:
            print(body['title'])

#DataQuality of Portfolio X
create_data_quality_entity(scope,'PortfolioX', date(2022, 6, 6), 'PortfolioX DQ', 
    False, False, 'Nothing to report',False, False)
create_data_quality_entity(scope,'PortfolioX', date(2022, 6, 7), 'PortfolioX DQ', 
    False, True, 'Nothing to report',False, False)
create_data_quality_entity(scope,'PortfolioX', date(2022, 6, 8), 'PortfolioX DQ', 
    True, True, 'Nothing to report',True, False)

#DataQuality of Portfolio Y
create_data_quality_entity(scope,'PortfolioY', date(2022, 6, 6), 'PortfolioY DQ', 
    True, True, 'Quality approved',False, False)

Data Quality with portfolioDqId: port-dq-check-Fbn_CE-PortfolioX-2022-06-06 has been created
Data Quality with portfolioDqId: port-dq-check-Fbn_CE-PortfolioX-2022-06-07 has been created
Data Quality with portfolioDqId: port-dq-check-Fbn_CE-PortfolioX-2022-06-08 has been created
Data Quality with portfolioDqId: port-dq-check-Fbn_CE-PortfolioY-2022-06-06 has been created
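
The identifier value follows the fixed pattern `port-dq-check-{scope}-{portfolioCode}-{effective_date}`, so it can be built and partially parsed with two small helpers. This is a sketch only; the function names `build_dq_identifier` and `effective_date_from_identifier` are hypothetical and do not appear in the notebook.

```python
from datetime import date

PREFIX = "port-dq-check"


def build_dq_identifier(scope, portfolio_code, effective_date):
    """Build the identifier value using the same pattern as the upsert cell."""
    return f"{PREFIX}-{scope}-{portfolio_code}-{effective_date.isoformat()}"


def effective_date_from_identifier(identifier):
    """Recover the effective date from the trailing YYYY-MM-DD part."""
    return date.fromisoformat(identifier[-10:])


identifier = build_dq_identifier("Fbn_CE", "PortfolioX", date(2022, 6, 6))
print(identifier)                                  # port-dq-check-Fbn_CE-PortfolioX-2022-06-06
print(effective_date_from_identifier(identifier))  # 2022-06-06
```

Only the date is parsed back, from the fixed-length suffix, because a scope or portfolio code may itself contain hyphens and splitting on `-` would then be ambiguous.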


# 3. Display Portfolios Data Quality<a class="anchor" id="-Display-DataQuality"></a>

`CustomEntitiesApi` lets us query custom entities by specifying an entity type and a filter. Here, we retrieve the `~Portfolio_DataQuality` instances whose `portfolioCode` field equals 'PortfolioX'.

In [7]:
#retrieve DataQuality of Portfolio X
custom_entities_api = api_factory.build(lu.CustomEntitiesApi)
data_quality_list = custom_entities_api.list_custom_entities(
    entity_type = "~Portfolio_DataQuality", 
    filter = "fields[portfolioCode] eq 'PortfolioX'").values

output_data_quality = {dq.name:[] for dq in data_quality_list[0].fields}
identifiers = []

for dq in data_quality_list:
    identifiers.append(dq.identifiers[0].identifier_value)
    for f in dq.fields:
        output_data_quality[f.name].append(f.value)

 
pd.DataFrame(output_data_quality, index = identifiers)

Unnamed: 0,effectiveDate,holdingDataReceived,performanceDataQuality,portfolioCode,returnsDataGOFReceived,returnsDataNOFReceived,unitPriceDataReceived
port-dq-check-Fbn_CE-PortfolioX-2022-06-06,2022-06-06T00:00:00.0000000+00:00,False,Nothing to report,PortfolioX,False,False,False
port-dq-check-Fbn_CE-PortfolioX-2022-06-07,2022-06-07T00:00:00.0000000+00:00,False,Nothing to report,PortfolioX,True,False,False
port-dq-check-Fbn_CE-PortfolioX-2022-06-08,2022-06-08T00:00:00.0000000+00:00,True,Nothing to report,PortfolioX,True,True,False
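
Once the checks are in tabular form, a natural next step is to list which data items are still outstanding for each day. The sketch below copies the four Boolean fields from the table above into plain dictionaries, so it runs without a LUSID connection. The helper name `outstanding_fields` is hypothetical.

```python
# Boolean fields of Portfolio_DataQuality, in the order they appear in the field schema
BOOLEAN_FIELDS = (
    "returnsDataNOFReceived",
    "returnsDataGOFReceived",
    "holdingDataReceived",
    "unitPriceDataReceived",
)

# Values copied from the PortfolioX table above
checks = {
    "port-dq-check-Fbn_CE-PortfolioX-2022-06-06": {
        "returnsDataNOFReceived": False, "returnsDataGOFReceived": False,
        "holdingDataReceived": False, "unitPriceDataReceived": False,
    },
    "port-dq-check-Fbn_CE-PortfolioX-2022-06-07": {
        "returnsDataNOFReceived": False, "returnsDataGOFReceived": True,
        "holdingDataReceived": False, "unitPriceDataReceived": False,
    },
    "port-dq-check-Fbn_CE-PortfolioX-2022-06-08": {
        "returnsDataNOFReceived": True, "returnsDataGOFReceived": True,
        "holdingDataReceived": True, "unitPriceDataReceived": False,
    },
}


def outstanding_fields(check):
    """Return the Boolean fields that are False or absent for one check."""
    return [name for name in BOOLEAN_FIELDS if not check.get(name, False)]


for identifier, check in checks.items():
    print(identifier, outstanding_fields(check))
```

For the 2022-06-08 check, only `unitPriceDataReceived` remains outstanding.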


# 4. Create Relationship {Portfolio_DataQuality, Portfolio}<a class="anchor" id="-Create-Relationships"></a>

Relationships provide a means to link the ~Portfolio_DataQuality Custom Entity to the Portfolio entity. In this example, multiple ~Portfolio_DataQuality instances can be associated with one Portfolio, each instance representing the data quality check for one day. You can find out more about Relationships [here](https://support.lusid.com/knowledgebase/article/KA-01679/).

In [8]:
#create a new relationship definition between Portfolio and Portfolio_DataQuality

relationship_definitions_api = api_factory.build(lu.RelationshipDefinitionsApi)

try:
    relationship_response = relationship_definitions_api.create_relationship_definition(
        lm.CreateRelationshipDefinitionRequest(
            scope = scope,
            code = 'Data_Quality',
            source_entity_type = '~Portfolio_DataQuality',
            target_entity_type = 'Portfolio',
            display_name = 'Data Quality',
            outward_description = 'checks',
            inward_description = 'is checked by',
            life_time = 'Perpetual',
            relationship_cardinality = 'ManyToOne'
        )
    )
    print(f'Relationship definition {relationship_response.relationship_definition_id} has been created ')
except lu.ApiException as e:
    body = json.loads(e.body)
    if body["code"] != 667:  # RelationDefinitionAlreadyExists
        print(body)
    else:
        print(body['title'])

Relation definition with scope 'Fbn_CE' and code 'Data_Quality' already exists


After creating the `Data_Quality` relationship definition, we can now link ~Portfolio_DataQuality instances to PortfolioX and PortfolioY.

In [9]:
#create relationship instances {PortfolioX, PortfolioX DQ} and {PortfolioY, PortfolioY DQ}
def create_dataquality_relationship(portfolio_scope,portfolio_code,portfolio_DQ_scope, portfolio_DQ_code):
    relationships_api = api_factory.build(lu.RelationshipsApi)

    relationship_scope = scope
    relationship_code = 'Data_Quality'

    source_entity_id =  {
        'idTypeScope': portfolio_DQ_scope,
        'idTypeCode': 'portfolioDqId',
        'code':portfolio_DQ_code}
    target_entity_id = {
        'scope': portfolio_scope,
        'code': portfolio_code}
    try:
        response = relationships_api.create_relationship(
            relationship_scope,
            relationship_code,
            lm.CreateRelationshipRequest(
                source_entity_id = source_entity_id,
                target_entity_id = target_entity_id
            ))
        print(f'relationship {response.relationship_definition_id.code}: {response.source_entity.entity_id} - {response.target_entity.entity_id} has been created.')
    except lu.ApiException as e:
        body = json.loads(e.body)
        if body["code"] != 667:  # RelationDefinitionAlreadyExists
            print(body)
        else:
            print(body['title'])

#create relationship  {PortfolioX, PortfolioX DQ}
create_dataquality_relationship(scope,'PortfolioX',scope,'port-dq-check-Fbn_CE-PortfolioX-2022-06-06')
create_dataquality_relationship(scope,'PortfolioX',scope,'port-dq-check-Fbn_CE-PortfolioX-2022-06-07')
create_dataquality_relationship(scope,'PortfolioX',scope,'port-dq-check-Fbn_CE-PortfolioX-2022-06-08')  

#create relationship  {PortfolioY, PortfolioY DQ}
create_dataquality_relationship(scope,'PortfolioY',scope,'port-dq-check-Fbn_CE-PortfolioY-2022-06-06')



relationship Data_Quality: {'identifierScope': 'Fbn_CE', 'identifierType': 'portfolioDqId', 'identifierValue': 'port-dq-check-Fbn_CE-PortfolioX-2022-06-06'} - {'scope': 'Fbn_CE', 'code': 'PortfolioX'} has been created.
relationship Data_Quality: {'identifierScope': 'Fbn_CE', 'identifierType': 'portfolioDqId', 'identifierValue': 'port-dq-check-Fbn_CE-PortfolioX-2022-06-07'} - {'scope': 'Fbn_CE', 'code': 'PortfolioX'} has been created.
relationship Data_Quality: {'identifierScope': 'Fbn_CE', 'identifierType': 'portfolioDqId', 'identifierValue': 'port-dq-check-Fbn_CE-PortfolioX-2022-06-08'} - {'scope': 'Fbn_CE', 'code': 'PortfolioX'} has been created.
relationship Data_Quality: {'identifierScope': 'Fbn_CE', 'identifierType': 'portfolioDqId', 'identifierValue': 'port-dq-check-Fbn_CE-PortfolioY-2022-06-06'} - {'scope': 'Fbn_CE', 'code': 'PortfolioY'} has been created.
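
As this notebook uses it, the `ManyToOne` cardinality means several ~Portfolio_DataQuality sources point at one Portfolio target, while each source points at a single Portfolio. The sketch below checks that shape on the four links created above, using plain tuples. It is a local consistency check with hypothetical helper names, not a description of how LUSID enforces cardinality.

```python
from collections import defaultdict

# (source Portfolio_DataQuality identifier, target portfolio code), as created above
links = [
    ("port-dq-check-Fbn_CE-PortfolioX-2022-06-06", "PortfolioX"),
    ("port-dq-check-Fbn_CE-PortfolioX-2022-06-07", "PortfolioX"),
    ("port-dq-check-Fbn_CE-PortfolioX-2022-06-08", "PortfolioX"),
    ("port-dq-check-Fbn_CE-PortfolioY-2022-06-06", "PortfolioY"),
]


def sources_by_target(pairs):
    """Group source identifiers under the target they are linked to."""
    grouped = defaultdict(list)
    for source, target in pairs:
        grouped[target].append(source)
    return dict(grouped)


def sources_with_multiple_targets(pairs):
    """Return sources linked to more than one target, which would break many-to-one."""
    targets = defaultdict(set)
    for source, target in pairs:
        targets[source].add(target)
    return sorted(source for source, linked in targets.items() if len(linked) > 1)


print({target: len(sources) for target, sources in sources_by_target(links).items()})
# {'PortfolioX': 3, 'PortfolioY': 1}
print(sources_with_multiple_targets(links))
# []
```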


# 5. Display Portfolio relationships<a class="anchor" id="-Display-Relationships"></a>

Using `PortfoliosApi`, we can easily retrieve all Entities (native and custom) linked to a Portfolio. Here, we get all relationships involving PortfolioX. This can be used to get ~Portfolio_DataQuality unique identifiers.

In [10]:
#get portfolio relationships
portfolios_api = api_factory.build(lu.PortfoliosApi)

relationships = portfolios_api.get_portfolio_relationships(scope,'PortfolioX').values
column1 = 'entity_type'
column2 = 'relationship_definition_id'
column3 = 'related_entity_identifier'

output_relationships = {column1:[], column2:[], column3:[]}

for relationship in relationships:
    output_relationships[column1].append(relationship.related_entity.entity_type)
    output_relationships[column2].append(f'{relationship.relationship_definition_id.scope}/{relationship.relationship_definition_id.code}')
    output_relationships[column3].append(f'{relationship.related_entity.identifiers[0].identifier_scope}/{relationship.related_entity.identifiers[0].identifier_type}/{relationship.related_entity.identifiers[0].identifier_value}')

pd.DataFrame(output_relationships)


Unnamed: 0,entity_type,relationship_definition_id,related_entity_identifier
0,~Portfolio_DataQuality,Fbn_CE/Data_Quality,Fbn_CE/portfolioDqId/port-dq-check-Fbn_CE-PortfolioX-2022-06-08
1,~Portfolio_DataQuality,Fbn_CE/Data_Quality,Fbn_CE/portfolioDqId/port-dq-check-Fbn_CE-PortfolioX-2022-06-06
2,~Portfolio_DataQuality,Fbn_CE/Data_Quality,Fbn_CE/portfolioDqId/port-dq-check-Fbn_CE-PortfolioX-2022-06-07
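
The relationships in the table above are not returned in date order (08, 06, 07), so a consumer that wants the most recent check has to sort them. The sketch below splits each `scope/type/value` string and picks the latest check by the date suffix of the identifier value. The helper names are hypothetical, and the approach assumes the identifier values keep the `...-YYYY-MM-DD` pattern used in this notebook.

```python
from datetime import date

# Identifier strings copied from the table above, in the order returned
related = [
    "Fbn_CE/portfolioDqId/port-dq-check-Fbn_CE-PortfolioX-2022-06-08",
    "Fbn_CE/portfolioDqId/port-dq-check-Fbn_CE-PortfolioX-2022-06-06",
    "Fbn_CE/portfolioDqId/port-dq-check-Fbn_CE-PortfolioX-2022-06-07",
]


def split_identifier(text):
    """Split 'scope/type/value' into its three parts."""
    scope, id_type, value = text.split("/", 2)
    return scope, id_type, value


def check_date(text):
    """Read the effective date from the last ten characters of the identifier."""
    return date.fromisoformat(text[-10:])


chronological = sorted(related, key=check_date)
latest_scope, latest_type, latest_value = split_identifier(chronological[-1])

print([check_date(text).isoformat() for text in chronological])
# ['2022-06-06', '2022-06-07', '2022-06-08']
print(latest_value)
# port-dq-check-Fbn_CE-PortfolioX-2022-06-08
```

The recovered scope, type and value are the three pieces needed to look up the corresponding ~Portfolio_DataQuality entity.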
