# Introduction to the data

In this exercise we combine the issuances and holdings datasets, and we fetch Level 2 legal-entity data from the [GLEIF API](https://documenter.getpostman.com/view/7679680/SVYrrxuU?version=latest#40ef2ec4-b8bd-46de-8ad5-5359ed828242), to connect the corporate dots and simulate a financial-markets analysis of defaults.

<u>NOTE</u>: The provided granular datasets *issuances.csv* and *holdings.csv* contain artificial data: the ISINs in the first dataset were randomly generated, and instruments were randomly assigned to holder companies. We are pretending to own two datasets that provide granular securities data from the issuance and the holdings perspectives.

In [14]:
import pandas as pd

In [15]:
issuances_df = pd.read_csv('issuances.csv', sep=';')
holdings_df = pd.read_csv('holdings.csv')

*issuances.csv* provides ISIN-by-ISIN information, i.e. for each security it gives the following details:
- *isin*: the ID of the security;
- *issue_date*: the date of issuance;
- *publication_price*: the publication price of the security;
- *volume*: the number of outstanding shares;
- *market_capitalization*: the market value of the company's outstanding shares;
- *issuer_lei*: the ID (LEI) of the issuer company.

In [16]:
issuances_df.head()

Unnamed: 0,isin,issue_date,publication_price,volume,market_capitalization,issuer_lei
0,ES9912477644,11/05/2023,-100.0,2000,-200000.0,529900QDG42IVLG7C683
1,DK5728198744,30/09/2020,-11.0,100,-1100.0,529900RF2834I0MC2L78
2,AT9885255322,14/07/2023,-10.0,3546,-35460.0,549300SDDBLYMENCB570
3,LU4963238630,23/04/2020,-5.0,5,-25.0,2138003FL72FCO53LM68
4,LT4925090717,12/01/2023,-0.1,2,-0.2,549300X7KXMBP1V42N81
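
The rows above suggest two properties worth checking before any analysis: *issue_date* looks like a day/month/year string, and *market_capitalization* looks like *publication_price* multiplied by *volume*. The following is a minimal sketch of both checks. It runs on three rows copied by hand from the output above rather than on the real file, and both properties are assumptions inferred from the sample, not documented facts about the dataset.

```python
import numpy as np
import pandas as pd

# Three rows copied from the head() output above (toy stand-in for issuances_df).
sample = pd.DataFrame({
    "isin": ["ES9912477644", "DK5728198744", "AT9885255322"],
    "issue_date": ["11/05/2023", "30/09/2020", "14/07/2023"],
    "publication_price": [-100.0, -11.0, -10.0],
    "volume": [2000, 100, 3546],
    "market_capitalization": [-200000.0, -1100.0, -35460.0],
})

# Assumption: dates are day/month/year, as "30/09/2020" suggests.
sample["issue_date"] = pd.to_datetime(sample["issue_date"], format="%d/%m/%Y")

# Assumption: market_capitalization = publication_price * volume.
expected_cap = sample["publication_price"] * sample["volume"]
cap_is_consistent = np.isclose(sample["market_capitalization"], expected_cap)

# The artificial data contains negative prices; flag them explicitly.
has_negative_price = sample["publication_price"] < 0
```

The same three steps could be applied to `issuances_df` directly; rows where `cap_is_consistent` is false or `has_negative_price` is true would then deserve a closer look.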


*holdings.csv* provides information about who holds which security:
- *isin*: the ID of the security;
- *holder_lei*: the ID (LEI) of the holder company.

In [17]:
holdings_df.head()

Unnamed: 0,isin,holder_lei
0,SK4383059521,254900RGIHNSOOTM0L46
1,SK4383059521,2138006HP23N8GYS9Q88
2,SK4383059521,529900JYUND014UQ0P58
3,SK4383059521,549300M7R6X5LJOH0491
4,CH8229968527,254900WJJF84AQA4EN11


The two datasets describe two different aspects of the financial market, and joining them gives a broader perspective on it.

If we merge them on the *ISIN*, we observe the market from the securities' point of view: given one security, we can find information on both its issuer and its holders.

We could also join on the company (i.e. *issuer_lei* on *holder_lei*) to see whether an issuer company is also a holder of securities issued by another company.
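
The two join strategies can be sketched on a toy example. The identifiers below (`XX0000000001`, `LEI_A`, and so on) are made up for illustration and do not come from the provided files; only the column names match the datasets.

```python
import pandas as pd

# Toy data with made-up identifiers (not taken from the real files).
issuances = pd.DataFrame({
    "isin": ["XX0000000001", "XX0000000002"],
    "issuer_lei": ["LEI_A", "LEI_B"],
})
holdings = pd.DataFrame({
    "isin": ["XX0000000001", "XX0000000001", "XX0000000002"],
    "holder_lei": ["LEI_B", "LEI_C", "LEI_C"],
})

# Security view: one row per (security, holder) pair, with the issuer attached.
by_security = issuances.merge(holdings, on="isin", how="inner")

# Company view: issuers that also appear as holders of some security.
# The suffixes distinguish the security issued from the security held.
issuer_and_holder = issuances.merge(
    holdings,
    left_on="issuer_lei",
    right_on="holder_lei",
    suffixes=("_issued", "_held"),
)
```

In this toy case `LEI_B` issues one security and holds another, so it is the only company in `issuer_and_holder`.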

## Companies data

While various instrument details are available, no details on the companies were provided. To collect them, we can use the [GLEIF API](https://documenter.getpostman.com/view/7679680/SVYrrxuU?version=latest#40ef2ec4-b8bd-46de-8ad5-5359ed828242).

The GLEIF API gives developers access to full LEI Data search engine functionality, including filters, full-text and single-field searches of Level 1 (LEI Record) Data, retrieval of LEI Records (including links to their Level 2 data, where available), based on a search of their associated Level 2 (relationship) data, and "fuzzy" matching of important data fields such as names and addresses.

Requests are HTTP REST calls, following the JSON API specification.
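
Such a call can be wrapped in a small helper. This is a sketch, not part of the GLEIF tooling: `fetch_lei_record` and `GLEIF_LEI_RECORDS_URL` are hypothetical names, the 10-second timeout is an arbitrary choice, and `session` is assumed to be a `requests.Session` or any object with a compatible `get` method.

```python
GLEIF_LEI_RECORDS_URL = "https://api.gleif.org/api/v1/lei-records/"


def fetch_lei_record(session, lei, timeout=10):
    """Fetch one LEI record as a dict.

    `session` is expected to be a requests.Session (or a compatible object).
    """
    response = session.get(
        GLEIF_LEI_RECORDS_URL + lei,
        headers={"Accept": "application/vnd.api+json"},  # JSON:API media type
        timeout=timeout,  # avoid hanging forever on a stalled connection
    )
    # Raise on 4xx/5xx instead of silently parsing an error document.
    response.raise_for_status()
    return response.json()
```

Passing the session in as an argument keeps one connection pool for many lookups and makes the helper easy to exercise with a stub object instead of the live API.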

In [18]:
import requests

In [23]:
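# Reuse a single HTTP session for all calls to the GLEIF API
session = requests.Session()
url = "https://api.gleif.org/api/v1/lei-records/"
company_id = "2W8N8UU78PMDQKZENC08"

In [24]:
# Request the LEI record of one company; despite its name, session_json holds the HTTP response
session_json = session.get(url + company_id, headers={"Accept": "application/vnd.api+json"})

In [25]:
# Parse the response body into a Python dict
session_json.json()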

{'meta': {'goldenCopy': {'publishDate': '2024-02-25T08:00:00Z'}},
 'data': {'type': 'lei-records',
  'id': '2W8N8UU78PMDQKZENC08',
  'attributes': {'lei': '2W8N8UU78PMDQKZENC08',
   'entity': {'legalName': {'name': 'INTESA SANPAOLO SPA',
     'language': 'it-IT'},
    'otherNames': [],
    'transliteratedOtherNames': [],
    'legalAddress': {'language': 'it-IT',
     'addressLines': ['PIAZZA SAN CARLO, 156'],
     'addressNumber': None,
     'addressNumberWithinBuilding': None,
     'mailRouting': None,
     'city': 'TORINO',
     'region': 'IT-TO',
     'country': 'IT',
     'postalCode': '10121'},
    'headquartersAddress': {'language': 'it-IT',
     'addressLines': ['PIAZZA SAN CARLO, 156'],
     'addressNumber': None,
     'addressNumberWithinBuilding': None,
     'mailRouting': None,
     'city': 'TORINO',
     'region': 'IT-TO',
     'country': 'IT',
     'postalCode': '10121'},
    'registeredAt': {'id': 'RA000407', 'other': None},
    'registeredAs': '00799960158',
    'jurisdi

Let's suppose we want to know the <b>company name</b> and <b>legal address country</b>.
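
Both values sit under `data` → `attributes` → `entity` in the response shown above. A minimal sketch of the lookup follows; `extract_name_and_country` is a hypothetical helper name, and `sample_record` is a trimmed copy of the response above so that the example runs without calling the API.

```python
def extract_name_and_country(record):
    """Return (legal name, legal address country) from a lei-records document."""
    entity = record["data"]["attributes"]["entity"]
    return entity["legalName"]["name"], entity["legalAddress"]["country"]


# Trimmed copy of the response shown above, so the sketch runs offline.
sample_record = {
    "data": {
        "attributes": {
            "entity": {
                "legalName": {"name": "INTESA SANPAOLO SPA", "language": "it-IT"},
                "legalAddress": {"city": "TORINO", "country": "IT"},
            }
        }
    }
}

name, country = extract_name_and_country(sample_record)
```

With the live response, the same function can be called as `extract_name_and_country(session_json.json())`.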

In [16]:
# Inner join (the pd.merge default): keeps only ISINs present in both datasets
df = pd.merge(issuances_df, holdings_df, on='isin')

In [17]:
df

Unnamed: 0,isin,issue_date,publication_price,volume,market_capitalization,issuer_lei,holder_lei
0,LV8029671369,17/01/2023,0.000,4003448,0.000,2138006HP23N8GYS9Q88,815600311DA821549008
1,LV8029671369,17/01/2023,0.000,4003448,0.000,2138006HP23N8GYS9Q88,213800WR5NOVOEYFLN72
2,LV8029671369,17/01/2023,0.000,4003448,0.000,2138006HP23N8GYS9Q88,529900V9M5N0BGGB2Z36
3,LV8029671369,17/01/2023,0.000,4003448,0.000,2138006HP23N8GYS9Q88,529900U28ZJDY22FJA32
4,GB9433873554,13/12/2022,0.000,401456,0.000,635400NPTAOBMC2LW363,529900YHRPQXXQ333M09
...,...,...,...,...,...,...,...
1995,SI3034685711,25/01/2019,221.311,32549,7203451.739,549300YX4S1LLSMK2627,549300U08TTKCUFWTC83
1996,HU8177021197,11/07/2019,246.670,9105,2245930.350,549300DIVO0BXH2SBY91,5493004SYPRAVRVNK561
1997,HU8177021197,11/07/2019,246.670,9105,2245930.350,549300DIVO0BXH2SBY91,549300711LWQZIF20Y32
1998,HU8177021197,11/07/2019,246.670,9105,2245930.350,549300DIVO0BXH2SBY91,529900M16K5ETQQEXE12


To collect further information about 

References:
- [^1] GLEIF API Introduction: [https://documenter.getpostman.com/view/7679680/SVYrrxuU?version=latest#intro](https://documenter.getpostman.com/view/7679680/SVYrrxuU?version=latest#intro)