# Python: Open PermID APIs

## Overview

This example demonstrates how to use the Python OpenPermID library. The library covers all features of the Open PermID APIs, including Record Matching, Entity Search, and Intelligent Tagging.

PermID is short for “Permanent Identifier”, a machine-readable number assigned to entities, securities, organizations (companies, government agencies, universities, etc.), quotes, individuals, and more. It is designed so that machines can reference related information programmatically. Open PermID is publicly available for free at [https://permid.org/](https://permid.org/).

The Python OpenPermID library is available on [pypi.org](https://pypi.org/project/OpenPermID/). It can be installed with the following **pip** command.

```
pip install OpenPermID
```
To use the Python OpenPermID library, the application needs to create an OpenPermID object and set an access token on it. The access token can be retrieved after logging in to the [Open PermID](https://permid.org/) website.
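
Hard-coding the token in a notebook makes it easy to leak. As a minimal sketch, the token could instead be read from an environment variable. The variable name `PERMID_ACCESS_TOKEN` and the helper `load_access_token` are assumptions made for illustration; they are not part of the OpenPermID library.

```python
import os

# Hypothetical variable name; choose whatever fits your environment.
TOKEN_ENV_VAR = "PERMID_ACCESS_TOKEN"


def load_access_token(env=None):
    """Return the token stored in the environment, or raise if it is missing."""
    env = os.environ if env is None else env
    token = env.get(TOKEN_ENV_VAR, "").strip()
    if not token:
        raise RuntimeError(f"Set {TOKEN_ENV_VAR} before creating the client")
    return token


# Intended use with the library (not executed here):
#   opid = OpenPermID()
#   opid.set_access_token(load_access_token())
```

The optional `env` argument only exists so the helper can be exercised with a plain dictionary instead of the real process environment.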

In [1]:
from OpenPermID import OpenPermID

opid = OpenPermID()
opid.set_access_token("<ACCESS TOKEN>")

## 1. Entity Search

This function searches for an entity's PermID using a query string.
```
search(q, entityType='all', format="dataframe", start=1, num=5, order='rel')
```
|Parameter Name|Required|Description|
|--------------|--------|-----------|
|q|Yes|A query string to search for. It could be either just the search string value, or prefix it with "<fieldname>:" to constrain the search to a specific field, such as "**refinitiv**", "**ticker:IBM**", and "**ticker: msft AND exchange:NSM**". For a list of all available fields, please refer to the [PermID User guide](https://developers.refinitiv.com/open-permid/permid-entity-search/docs?content=4885&type=documentation_item).|
|entityType|No|The type of entity to search for. Possible values are **all**, **organization**, **instrument**, or **quote**. The default value is **all**|
|format|No|The format of the output. Possible values are **dataframe**, **json**, or **xml**. The default value is **dataframe**|
|start|No|The index of the first result returned, in the list of results ordered according to the order parameter. The index is 1-based. The default value is 1.|
|num|No|The maximum number of results returned for each entity (separately). Possible values are 5, 10, 20, 50, and 100. The default value is 5.|
|order|No|The order of the search results. Possible values are **rel** (Descending order of relevance), **az** (Ascending alphabetical order of the entity name), or **za** ( Descending alphabetical order of the entity name). The default value is **rel**.|

This function returns a tuple containing a result and an error string. Depending on the **format** parameter, the result is a data frame, a JSON string, or an XML string. With the **dataframe** format, an **entityType** of **all** returns multiple data frames indexed by entity type (**quotes**, **organizations**, and **instruments**); any other entity type returns a single data frame.
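
The field-constrained form of the **q** parameter, such as "ticker: msft AND exchange:NSM", can be assembled from keyword arguments rather than by hand. The helper below is a sketch only: `build_field_query` is a hypothetical name, it is not part of the library, and it does not quote or escape values, so it assumes simple values without spaces.

```python
def build_field_query(**fields):
    """Join field:value terms with AND, in the order they were given."""
    if not fields:
        raise ValueError("at least one field is required")
    return " AND ".join(f"{name}:{value}" for name, value in fields.items())


query = build_field_query(ticker="msft", exchange="NSM")
print(query)  # ticker:msft AND exchange:NSM

# Intended use with the library (not executed here):
#   output, err = opid.search(query, entityType="quote")
```

A plain free-text search needs no helper; the string is passed to **search** as-is.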

The following code calls the **search** method to search for a "Refinitiv" string with the default parameters.


In [2]:
output,err = opid.search('Refinitiv')

### Display the organizations' entities

In [3]:
output['organizations']

| |@id|hasURL|orgSubtype|organizationName|
|---|---|---|---|---|
|0|https://permid.org/1-8589934184||Company|Refinitiv UK Financial Ltd|
|1|https://permid.org/1-5000120664|https://www.refinitiv.com/ja|Company|Refinitiv Japan KK|
|2|https://permid.org/1-4296693138||Company|Refinitiv Asia Pte Ltd|
|3|https://permid.org/1-4295921907||Investment Manager/Advisor|Refinitiv Global Markets Inc|
|4|https://permid.org/1-5000693632||Company|Refinitiv de Mexico SA de CV|


### Display the instruments' entities

In [4]:
output['instruments']

| |@id|assetClass|hasName|isIssuedBy|isIssuedByName|
|---|---|---|---|---|---|
|0|https://permid.org/1-21661915727|Ordinary Shares|Refinitiv US Holdings Ord Shs (Unlisted)|https://permid.org/1-5064690523|Refinitiv US Holdings Inc|
|1|https://permid.org/1-21705613453|Ordinary Shares|Refinitiv US Holdings Class C Ord Shs (Unlisted)|https://permid.org/1-5064690523|Refinitiv US Holdings Inc|
|2|https://permid.org/1-21705464733|Preference Shares|Refinitiv US Holdings 10% Prf Shs (Unlisted)|https://permid.org/1-5064690523|Refinitiv US Holdings Inc|
|3|https://permid.org/1-21705464732|Preference Shares|Refinitiv US Holdings 14.5% Prf Shs (Unlisted)|https://permid.org/1-5064690523|Refinitiv US Holdings Inc|


### Display the quotes' entities

In [5]:
output['quotes']

| |@id|assetClass|hasName|hasRIC|isQuoteOf|isQuoteOfInstrumentName|
|---|---|---|---|---|---|---|
|0|https://permid.org/1-21705613554|Ordinary Shares|REFINITIV C ORD|RFTb.UNL|https://permid.org/1-21705613453|Refinitiv US Holdings Class C Ord Shs (Unlisted)|
|1|https://permid.org/1-21661916398|Ordinary Shares|REFINITIV ORD|RFT.UNL|https://permid.org/1-21661915727|Refinitiv US Holdings Ord Shs (Unlisted)|


## 2. Entity Lookup

If you know an entity's PermID, you can use the **lookup** method to retrieve the entity description.

It accepts three parameters:

|Parameter Name|Required|Description|
|--------------|--------|-----------|
|id|Yes|The PermID to look up, e.g. 1-5064690523|
|format|No|The format of the output. Possible values are **dataframe**, **json-ld**, or **turtle**. The default value is **dataframe**|
|orient|No|The format of the returned data frame. Possible values are **row**, or **column**. The default value is **row**|

This function returns a tuple containing a result and an error string. The result is a data frame, a JSON-LD string, or a Turtle string, depending on the **format** parameter.
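
The search results above report each PermID as a URL in the **@id** column (for example `https://permid.org/1-5064690523`), while the **id** parameter example uses the bare `1-5064690523` form. A small conversion helper, sketched here under the assumption that the identifier is always the last path segment of a permid.org URL, bridges the two. The function name `permid_from_url` is hypothetical and not part of the library.

```python
from urllib.parse import urlparse


def permid_from_url(url):
    """Return the path segment of a permid.org URL, e.g. '1-5064690523'."""
    parsed = urlparse(url)
    if parsed.netloc != "permid.org":
        raise ValueError(f"not a permid.org URL: {url!r}")
    permid = parsed.path.strip("/")
    if not permid or "/" in permid:
        raise ValueError(f"no single identifier in URL: {url!r}")
    return permid


print(permid_from_url("https://permid.org/1-5064690523"))  # 1-5064690523

# Intended use with the library (not executed here):
#   output, err = opid.lookup(permid_from_url(row["@id"]))
```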

The following code calls the **lookup** method to retrieve the entity information of the 1-5064690523 PermID with the **column** orient parameter.

In [6]:
output,err = opid.lookup("1-5064690523", orient="column")
output

| |1-5064690523|
|---|---|
|@id|https://permid.org/1-5064690523|
|@type|tr-org:Organization|
|mdaas:HeadquartersAddress|3 Times Sq\n\n\nNEW YORK\nNEW YORK\n10036-6564\n|
|mdaas:RegisteredAddress|200 Bellevue Pkwy Ste 210\n\n\nWILMINGTON\nDEL...|
|tr-common:hasPermId|5064690523|
|hasPrimaryInstrument|https://permid.org/1-21661915727|
|hasActivityStatus|tr-org:statusActive|
|tr-org:hasHeadquartersPhoneNumber|16462234000|
|tr-org:hasLEI|549300NF240HXJO7N016|
|hasLatestOrganizationFoundedDate|2018-03-16T00:00:00Z|


## 3. Record Matching

The PermID Record Matching API allows you to match Person, Organization, Instrument, and Quote records with Refinitiv’s PermIDs.


```Python
match(data,dataType='Organization',numberOfMatchesPerRecord=1,raw_output=False)
```
|Parameter Name|Required|Description|
|--------------|--------|-----------|
|data|Yes|A CSV string or data frame for matching. For formats of the CSV string, please refer to the [PermID User guide](https://developers.refinitiv.com/open-permid/permid-entity-search/docs?content=4885&type=documentation_item).|
|dataType|No|The type of entity to match. Possible values are **Person**, **Organization**, **Instrument**, or **Quote**. The default value is **Organization**.|
|numberOfMatchesPerRecord|No|The number of possible matches to output for each record in the input. The maximum is 5. The default value is 1.|
|raw_output|No|A boolean value set to retrieve a result as a JSON string instead of a data frame. The default value is False which returns a data frame.|

This function returns a tuple containing a result and an error string. The result is a data frame or a JSON string, depending on the **raw_output** parameter.
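
When the records to match already live in Python objects, the CSV string for the **data** parameter can be generated with the standard `csv` module, which takes care of quoting values that contain commas. This is a sketch: the column names are copied from the organization example in this article, and `organizations_to_csv` is a hypothetical helper, not a library function.

```python
import csv
import io

# Column names copied from the organization example in this article.
ORG_COLUMNS = ["LocalID", "Standard Identifier", "Name", "Country", "Street",
               "City", "PostalCode", "State", "Website"]


def organizations_to_csv(records):
    """Serialize a list of dicts to CSV text; missing columns become empty."""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=ORG_COLUMNS, restval="",
                            lineterminator="\n")
    writer.writeheader()
    writer.writerows(records)
    return buffer.getvalue()


csv_text = organizations_to_csv([
    {"LocalID": 1, "Name": "Apple", "Country": "US",
     "Street": "Apple Campus, 1 Infinite Loop", "City": "Cupertino"},
    {"LocalID": 2, "Standard Identifier": "Ticker:MSFT"},
])
print(csv_text)

# Intended use with the library (not executed here):
#   output, err = opid.match(csv_text)
```

Because `DictWriter` raises on unknown keys by default, a misspelled column name fails early instead of silently producing a malformed row.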

The following code calls the **match** method to match the organization entities with a CSV string.

In [7]:
organization="""
LocalID,Standard Identifier,Name,Country,Street,City,PostalCode,State,Website
1,,Apple,US,"Apple Campus, 1 Infinite Loop",Cupertino,95014,California,
2,,Apple,,,,,,
3,,Teva Pharmaceutical Industries Ltd,IL,,Petah Tikva,,,
4,,Tata Sky,IN,,,,,
5,RIC:IBM.N|Ticker:IBM,,,,,,,
6,Ticker:MSFT,,,,,,,
7,LEI:INR2EJN1ERAN0W5ZP974,,,,,,,
8,Ticker:FB&&Exchange:NSM,,,,,,,
9,Ticker:AAPL&&MIC:XNGS,,,,,,,
"""
output,err = opid.match(organization)
output

Unnamed: 0,Input_City,Input_Country,Input_LocalID,Input_Name,Input_PostalCode,Input_Standard Identifier,Input_State,Input_Street,Match Level,Match OpenPermID,Match Ordinal,Match OrgName,Match Score,Original Row Number,ProcessingStatus
0,Cupertino,US,1,Apple,95014.0,,California,"Apple Campus, 1 Infinite Loop",Excellent,https://permid.org/1-4295905573,1,Apple Inc,98%,2,OK
1,,,2,Apple,,,,,Excellent,https://permid.org/1-4295905573,1,Apple Inc,92%,3,OK
2,Petah Tikva,IL,3,Teva Pharmaceutical Industries Ltd,,,,,Excellent,https://permid.org/1-4295875158,1,Teva Pharmaceutical Industries Ltd,99%,4,OK
3,,IN,4,Tata Sky,,,,,Excellent,https://permid.org/1-4297589397,1,Tata Sky Ltd,92%,5,OK
4,,,5,,,RIC:IBM.N|Ticker:IBM,,,Excellent,https://permid.org/1-4295904307,1,International Business Machines Corp,100%,6,OK
5,,,6,,,Ticker:MSFT,,,Excellent,https://permid.org/1-4295907168,1,Microsoft Corp,100%,7,OK
6,,,7,,,LEI:INR2EJN1ERAN0W5ZP974,,,Excellent,https://permid.org/1-4295907168,1,Microsoft Corp,100%,8,OK
7,,,8,,,Ticker:FB&&Exchange:NSM,,,Excellent,https://permid.org/1-4297297477,1,Facebook Inc,100%,9,OK
8,,,9,,,Ticker:AAPL&&MIC:XNGS,,,Excellent,https://permid.org/1-4295905573,1,Apple Inc,100%,10,OK


The following code calls the **match** method to match the person entities with a data frame.

In [8]:
import pandas as pd

# Column names used for the person records in this example.
columns = ['LocalID',
           'FirstName',
           'MiddleName',
           'PreferredName',
           'LastName',
           'CompanyPermID',
           'CompanyName',
           'NamePrefix',
           'NameSuffix']
# DataFrame.append was removed in pandas 2.0, so build the frame from a list of rows.
rows = [
    ['1','Satya','','','Nadella','','Microsoft Corp','',''],
    ['2','Satya','','','Nadella','4295907168','','',''],
    ['3','Martin','','','Jetter','','International Business Machines Corp','',''],
    ['4','Bill','','','Gates','','Microsoft Corp','',''],
]
person = pd.DataFrame(rows, columns=columns)
output,err = opid.match(person, dataType='Person')
output

Unnamed: 0,Input_First Name,Input_Last Name,Input_LocalID,Input_OrgName,Input_OrgOpenPermID,Match First Name,Match Last Name,Match Level,Match OpenPermID,Match Ordinal,Match OrgName,Match OrgOpenPermID,Match Score,Original Row Number,ProcessingStatus
0,Satya,Nadella,1,Microsoft Corp,,Satya,Nadella,Good,https://permid.org/1-34413262612,1,MICROSOFT CORPORATION,https://permid.org/1-4295907168,0.75,2,OK
1,Satya,Nadella,2,,https://permid.org/1-4295907168,Satya,Nadella,Excellent,https://permid.org/1-34413262612,1,MICROSOFT CORPORATION,https://permid.org/1-4295907168,0.95,3,OK
2,Martin,Jetter,3,International Business Machines Corp,,Martin,Jetter,Excellent,https://permid.org/1-34418338814,1,INTERNATIONAL BUSINESS MACHINES CORPORATION,https://permid.org/1-4295904307,0.83,4,OK
3,Bill,Gates,4,Microsoft Corp,,William,Gates,Good,https://permid.org/1-34413157709,1,MICROSOFT CORPORATION,https://permid.org/1-4295907168,0.71,5,OK


## 4. Record Matching File

This method is similar to the **match** method above. It matches Person, Organization, Instrument, and Quote records with Refinitiv’s PermIDs. However, instead of a string or data frame, it accepts the name of a file that contains the records to be matched.

```Python
matchFile(filename,dataType='Organization',numberOfMatchesPerRecord=1,raw_output=False)
```

|Parameter Name|Required|Description|
|--------------|--------|-----------|
|filename|Yes|The name of the CSV file containing the records to be matched. Templates for the CSV files can be downloaded from the [Record Matching](https://permid.org/match) website.|
|dataType|No|The type of entity to match. Possible values are **Person**, **Organization**, **Instrument**, or **Quote**. The default value is **Organization**.|
|numberOfMatchesPerRecord|No|The number of possible matches to output for each record in the input. The maximum is 5. The default value is 1.|
|raw_output|No|A boolean value set to retrieve a result as a JSON string instead of a data frame. The default value is False which returns a data frame.|

This function returns a tuple containing a result and an error string. The result is a data frame or a JSON string, depending on the **raw_output** parameter.
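
Every method in this library returns the same (result, error) pair, so scripts that prefer exceptions can wrap the calls in a small adapter. The sketch below assumes that the error element is `None` or empty when the call succeeds; this article does not state that, so check it against the library's behavior. `PermIDError` and `unwrap` are hypothetical names.

```python
class PermIDError(RuntimeError):
    """Hypothetical exception type used only by this sketch."""


def unwrap(result_and_error):
    """Return the result, or raise if the error element is non-empty."""
    output, err = result_and_error
    if err:  # assumption: None or "" signals success
        raise PermIDError(str(err))
    return output


# Intended use with the library (not executed here):
#   matches = unwrap(opid.matchFile("Organization_input.csv"))
```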

The following code calls the **matchFile** method to match records in an organization CSV file.

In [9]:
output,err = opid.matchFile("Organization_input.csv")
output

Unnamed: 0,Input_City,Input_Country,Input_LocalID,Input_Name,Input_PostalCode,Input_Standard Identifier,Input_State,Input_Street,Match Level,Match OpenPermID,Match Ordinal,Match OrgName,Match Score,Original Row Number,ProcessingStatus
0,Cupertino,US,1,Apple,95014.0,,California,"Apple Campus, 1 Infinite Loop",Excellent,https://permid.org/1-4295905573,1,Apple Inc,98%,2,OK
1,,,2,Apple,,,,,Excellent,https://permid.org/1-4295905573,1,Apple Inc,92%,3,OK
2,Petah Tikva,IL,3,Teva Pharmaceutical Industries Ltd,,,,,Excellent,https://permid.org/1-4295875158,1,Teva Pharmaceutical Industries Ltd,99%,4,OK
3,,IN,4,Tata Sky,,,,,Excellent,https://permid.org/1-4297589397,1,Tata Sky Ltd,92%,5,OK
4,,,5,,,RIC:IBM.N|Ticker:IBM,,,Excellent,https://permid.org/1-4295904307,1,International Business Machines Corp,100%,6,OK
5,,,6,,,Ticker:MSFT,,,Excellent,https://permid.org/1-4295907168,1,Microsoft Corp,100%,7,OK
6,,,7,,,LEI:INR2EJN1ERAN0W5ZP974,,,Excellent,https://permid.org/1-4295907168,1,Microsoft Corp,100%,8,OK
7,,,8,,,Ticker:FB&&Exchange:NSM,,,Excellent,https://permid.org/1-4297297477,1,Facebook Inc,100%,9,OK
8,,,9,,,Ticker:AAPL&&MIC:XNGS,,,Excellent,https://permid.org/1-4295905573,1,Apple Inc,100%,10,OK


## 5. Intelligent Tagging

This method allows you to tag free-text documents with rich semantic metadata, by identifying and tagging entities, events, and topics.
```
calais(text, language='English', contentType='raw', outputFormat='json')
```
|Parameter Name|Required|Description|
|--------------|--------|-----------|
|text|Yes|Content to be tagged. It could be raw text, html, xml, or pdf|
|language|No|Indicates the language of the input text. Currently, possible values are **English**, **Chinese**, **French**, **German**, **Japanese**, or **Spanish**. The default value is **English**.|
|contentType|No|Indicates the content type of the input text. Possible values are **raw**, **html**, **xml**, or **pdf**. The default value is **raw**.|
|outputFormat|No|Defines the output response format. Possible values are **json**, **rdf**, or **n3**. The default value is **json**.|

This function returns a tuple containing a result and an error string. The result is a JSON, RDF, or N3 string, depending on the **outputFormat** parameter.
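
With the **json** output format, the returned string can be parsed and summarized with the standard `json` module. The sketch below assumes that each tagged item is a top-level entry carrying `_typeGroup` and `name` fields; verify this against a real response before relying on it. The sample document is hand-written for illustration and is not actual API output, and `names_by_type_group` is a hypothetical helper.

```python
import json
from collections import defaultdict


def names_by_type_group(json_text):
    """Group the 'name' of each tagged entry by its '_typeGroup' value."""
    document = json.loads(json_text)
    groups = defaultdict(list)
    for entry in document.values():
        # Skip entries that do not look like tagged items (assumed layout).
        if isinstance(entry, dict) and "_typeGroup" in entry and "name" in entry:
            groups[entry["_typeGroup"]].append(entry["name"])
    return dict(groups)


# Hand-written stand-in, NOT a real response from the service.
sample = json.dumps({
    "doc": {"info": {}},
    "http://example.com/1": {"_typeGroup": "entities", "name": "Federal Reserve"},
    "http://example.com/2": {"_typeGroup": "topics", "name": "Business_Finance"},
})
print(names_by_type_group(sample))

# Intended use with the library (not executed here):
#   output, err = opid.calais(raw_text)
#   print(names_by_type_group(output))
```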

The following code calls the **calais** method to tag the raw text.

In [10]:
raw_text ="""
TOKYO (Reuters) - Financial markets reeled on Thursday as stocks dived and oil slumped after U.S. President Donald Trump took the dramatic step of banning travel from Europe to stem the spread of coronavirus, threatening more disruptions to trade and the world economy.

With the pandemic wreaking havoc on daily life of millions worldwide, investors were also disappointed by the lack of broad measures in Trump's plan to fight the pathogen, prompting traders to bet of further aggressive easing by the Federal Reserve.

Euro Stoxx 50 futures STXEc1 plunged 8.3% to their lowest levels since mid-2016. They were last down 6.9% while investors rushed to safe-haven assets from bonds to gold to the yen and the Swiss franc.

U.S. S&P 500 futures ESc1 plummeted as much as 4.9% in Asia and last traded down 3.6%, a day after the S&P 500 .SPX lost 4.89%, leaving the index on the brink of entering bear market territory, defined as a 20% fall from a recent top.

MSCI's broadest gauge of world shares, ACWI .MIWD00000PUS, could follow suit, having fallen 19.2% so far from its record peak hit only a month ago.
"""
output,err = opid.calais(raw_text)
print(output)



## 6. Quota

The Open PermID APIs have a daily quota limit. There is no dedicated API for retrieving quota information; instead, it is reported in the HTTP headers of each response.
```
   x-permid-quota-daily: 5000
   x-permid-quota-used: 18
```
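
The remaining daily allowance follows directly from these two headers. As a sketch for code that talks to the REST API without this library, the helper below computes it from any mapping of header names to values; `remaining_quota` is a hypothetical name, and header names are lowercased first because HTTP header names are case-insensitive.

```python
def remaining_quota(headers):
    """Return daily quota minus used quota from a header mapping."""
    lowered = {name.lower(): value for name, value in headers.items()}
    daily = int(lowered["x-permid-quota-daily"])
    used = int(lowered["x-permid-quota-used"])
    return daily - used


print(remaining_quota({"x-permid-quota-daily": "5000",
                       "x-permid-quota-used": "18"}))  # 4982
```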
This library records this quota information and users can retrieve it by calling the following method.
```
get_usage()
```
This method returns a data frame containing the quota information recorded by this library.

In [11]:
opid.get_usage()

| |Quota Daily|Quota Used|Time|
|---|---|---|---|
|0|5000|1|Fri, 20 Mar 2020 04:15:32 GMT|
|1|5000|2|Fri, 20 Mar 2020 04:15:45 GMT|
|2|5000|3|Fri, 20 Mar 2020 04:15:51 GMT|
|3|5000|3|Fri, 20 Mar 2020 04:15:58 GMT|
|4|5000|4|Fri, 20 Mar 2020 04:16:05 GMT|
|5|5000|4|Fri, 20 Mar 2020 04:16:13 GMT|


## Summary

Open PermID provides REST APIs to look up, search, match, and tag PermID entities. This example demonstrates how to use the Python OpenPermID library. To use the library, you need an access token, which is available for free after registering at the [PermID](https://permid.org/) website. The library is easy to use, and its source code is available on [GitHub](https://github.com/Refinitiv-API-Samples/Article.OpenPermID.Python.APIs).

For the complete usage guide, please refer to [Python: Open PermID APIs](https://developers.refinitiv.com/article/python-open-permid-apis).
