**This is an example notebook for querying the OpenSanctions API**

*Procedure:*

1. Query the individual sanctions lists from the API and write a loop to access all the days from July 2021 onward 

2. Parse the results into a dataframe, then filter and clean it

3. Write a function that compares each day with the previous day and flags additions and deletions. Store these in a new column holding the addition or removal date.

4. Merge all dataframes for all lists (UK, EU and US) together and aggregate onto a monthly level 

5. Create a separate dataframe for all designations concerning Russian entities 
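Step 3 can be sketched as an outer merge of two daily snapshots on the entity `id`. This is only a minimal sketch: the function name `flag_changes`, the column names `change` and `change_date`, and the toy snapshots are assumptions for illustration, and it assumes each `id` appears once per snapshot.

```python
import pandas as pd


def flag_changes(previous: pd.DataFrame, current: pd.DataFrame, date: str) -> pd.DataFrame:
    """Compare two daily snapshots on `id` and flag additions and removals."""
    # An outer merge keeps ids found in either snapshot; the `_merge`
    # indicator column records which side each id came from.
    merged = previous[["id"]].merge(
        current[["id"]], on="id", how="outer", indicator=True
    )
    changes = merged[merged["_merge"] != "both"].copy()
    # 'right_only' ids are new on `date`; 'left_only' ids have disappeared.
    changes["change"] = changes["_merge"].astype(str).map(
        {"right_only": "added", "left_only": "removed"}
    )
    changes["change_date"] = pd.Timestamp(date)
    return changes[["id", "change", "change_date"]].reset_index(drop=True)


# Toy snapshots with made-up ids, standing in for two consecutive days.
day1 = pd.DataFrame({"id": ["a", "b", "c"]})
day2 = pd.DataFrame({"id": ["b", "c", "d"]})
changes = flag_changes(day1, day2, "2021-07-22")
print(changes)
```

Concatenating the daily `changes` frames and grouping by `change_date.dt.to_period("M")` and `change` would then give the monthly counts described in step 4.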

 
datasets of interest:

us_ofac_sdn

us_ofac_cons

us_bis_denied


eu_fsf

eu_sanctions_map


gb_hwt_sanctions
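For the later merge across issuers, it may help to keep the dataset names above in one structure. The sketch below groups them by issuing authority and builds one search URL per dataset, following the `/search/<dataset>` pattern used for the SDN list in this notebook. The names `DATASETS`, `SEARCH_URL` and `search_urls` are hypothetical, and whether every dataset is accepted as a search scope is an assumption that has not been checked against the API.

```python
# Dataset identifiers from the list above, grouped by issuing authority.
DATASETS = {
    "US": ["us_ofac_sdn", "us_ofac_cons", "us_bis_denied"],
    "EU": ["eu_fsf", "eu_sanctions_map"],
    "UK": ["gb_hwt_sanctions"],
}

# URL pattern taken from the SDN query in this notebook (assumed to
# generalise to the other dataset names).
SEARCH_URL = "https://api.opensanctions.org/search/{dataset}"


def search_urls(datasets=DATASETS):
    """Return a {dataset name: search URL} mapping for every listed dataset."""
    return {
        name: SEARCH_URL.format(dataset=name)
        for names in datasets.values()
        for name in names
    }


for name, url in search_urls().items():
    print(name, url)
```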

In [1]:
#import packages
import requests
import json
import pandas as pd
import numpy as np
import os

**Start with the SDN list**

In [18]:
# basic API query
# Other endpoint patterns, kept here for reference:
#   https://api.opensanctions.org/reconcile/us_ofac_sdn?api_key=YOUR_API_KEY
#   https://api.opensanctions.org/search/us_ofac_sdn?q=John+Doe&api_key=YOUR_API_KEY
url = "https://api.opensanctions.org/search/us_ofac_sdn?api_key=e7efa0f0189f850f7c949d2858b12296"
response = requests.get(url)

# usually you want the response from the API to be JSON, since it is the easiest to process/parse
data = response.json()

In [18]:
# modified API query
url = "https://api.opensanctions.org/search/us_ofac_sdn?api_key=e7efa0f0189f850f7c949d2858b12296"
# set the parameters for the query
params = {'last_updated': '2018-04-01', 'end': '2018-04-08'}
response = requests.get(url, params=params)
# query for 'createdAt' OR 'first_seen' OR 'modifiedAt'

# usually you want the response from the API to be JSON, since it is the easiest to process/parse
data = response.json()

In [None]:
#loop over pages, set limit manually?
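One possible way to loop over pages is sketched below. The response carries `limit` and `offset` fields, so the sketch assumes both can also be sent as query parameters. The helper names `iter_results` and `fetch_sdn_page` and the `max_offset` cap are hypothetical. The paginator takes the fetch function as an argument so it can be tried without network access.

```python
import os
from typing import Callable, Dict, Iterator, Optional

import requests


def iter_results(fetch_page: Callable[[int, int], Dict], limit: int = 100,
                 max_offset: Optional[int] = None) -> Iterator[Dict]:
    """Yield each entry of `results`, advancing `offset` by `limit` per request."""
    offset = 0
    while max_offset is None or offset < max_offset:
        page = fetch_page(limit, offset)
        results = page.get("results", [])
        yield from results
        # A short (or empty) page means there is nothing left to fetch.
        if len(results) < limit:
            break
        offset += limit


def fetch_sdn_page(limit: int, offset: int) -> Dict:
    """Fetch one page of the SDN search (assumes `limit`/`offset` parameters)."""
    response = requests.get(
        "https://api.opensanctions.org/search/us_ofac_sdn",
        params={"limit": limit, "offset": offset},
        headers={"Authorization": f"ApiKey {os.environ.get('OPENSANCTIONS_API_KEY')}"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()
```

A call such as `list(iter_results(fetch_sdn_page, limit=100, max_offset=1000))` would then collect the first ten pages. Note that a `total` with `'relation': 'gte'` only gives a lower bound, so the loop stops on a short page rather than on `total`.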

In [19]:
data

{'limit': 10,
 'offset': 0,
 'total': {'value': 10000, 'relation': 'gte'},
 'results': [{'id': 'NK-7kyY9QMMGmLek7Y6PPudHx',
   'caption': 'JSC V. Tikhomirov Scientific Research Institute of Instrument Design',
   'schema': 'Company',
   'properties': {'jurisdiction': ['ru'],
    'createdAt': ['2023-01-25'],
    'address': ['3 Ul. Gagarina, 140180 Zhukovski',
     '140180, Российская Федерация, Московская обл., г. Жуковский, ул Гагарина, д. 3',
     'Gagarin Str, 3, 140180 Zhukovsky',
     '140180, Russian Federation, Moscow region, Zhukovsky, Gagarina street, 3',
     'УЛИЦА ГАГАРИНА, Жуковский, Московская область',
     'Д. 3, ГАГАРИНА, МОСКОВСКАЯ, 140180',
     '140180, Російська Федерація, Московська обл., м. Жуковський, вул. Гагаріна, д. 3'],
    'country': ['ru'],
    'notes': ['OGRN: 1025001627859',
     'JSC “V.V. Tikhomirov Research Institute of Instrument Engineering” is a military-industrial complex of Russia. It develops weapon control systems for fighter aircraft and medium

In [4]:
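# parse into df
# Passing the whole response dict to pd.DataFrame raises
# "ValueError: Mixing dicts with non-Series may lead to ambiguous ordering.",
# because 'total' is a dict while 'limit' and 'offset' are scalars.
# The entities themselves are the list stored under 'results':
df = pd.DataFrame(data['results'])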
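The entities sit in the `results` list of the response, and each property value is itself a list, so some flattening is needed before filtering. The sketch below shows one way to do it. The helper name `results_to_frame`, the choice of columns, and joining multiple values with `"; "` are assumptions. The first sample entity is trimmed from the output above and the second is made up for contrast.

```python
import pandas as pd


def results_to_frame(payload, columns=("country", "createdAt")):
    """Flatten the `results` list of a search response into one row per entity."""
    rows = []
    for entity in payload.get("results", []):
        props = entity.get("properties", {})
        row = {
            "id": entity.get("id"),
            "caption": entity.get("caption"),
            "schema": entity.get("schema"),
        }
        # Property values are lists; join them so each cell holds one string.
        for name in columns:
            row[name] = "; ".join(props.get(name, []))
        rows.append(row)
    return pd.DataFrame(rows, columns=["id", "caption", "schema", *columns])


SAMPLE = {
    "limit": 10,
    "offset": 0,
    "results": [
        {
            "id": "NK-7kyY9QMMGmLek7Y6PPudHx",
            "caption": "JSC V. Tikhomirov Scientific Research Institute of Instrument Design",
            "schema": "Company",
            "properties": {"jurisdiction": ["ru"], "createdAt": ["2023-01-25"], "country": ["ru"]},
        },
        # Made-up entry, only here to show the filter excluding something.
        {"id": "example-id", "caption": "Example Ltd", "schema": "Company",
         "properties": {"country": ["gb"]}},
    ],
}

df = results_to_frame(SAMPLE)
# Step 5: keep entities whose country codes include 'ru'.
russian = df[df["country"].str.split("; ").apply(lambda codes: "ru" in codes)]
print(russian[["id", "caption", "createdAt"]])
```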

**OpenSanctions archive data - 07/2021 to 12/2023**

In [None]:
#loop over API to create access to all dates since 2021/07/21
#use query for "last_change" variable
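A first step for this loop could be generating one entry per day for the period covered here (2021-07-21 through 2023-12-31). The sketch below only builds the dates and a URL per day and makes no requests. The `ARCHIVE_URL` template is an assumption about how dated snapshots are laid out and should be checked against the OpenSanctions documentation before use. `snapshot_urls` is a hypothetical helper name.

```python
import pandas as pd

# Assumed layout of dated archive snapshots; verify before relying on it.
ARCHIVE_URL = "https://data.opensanctions.org/datasets/{date}/{dataset}/entities.ftm.json"


def snapshot_urls(dataset, start="2021-07-21", end="2023-12-31"):
    """Return an ordered {ISO date: URL} mapping with one entry per calendar day."""
    days = pd.date_range(start, end, freq="D")
    return {
        day.date().isoformat(): ARCHIVE_URL.format(
            date=day.strftime("%Y%m%d"), dataset=dataset
        )
        for day in days
    }


urls = snapshot_urls("us_ofac_sdn")
first_day = next(iter(urls))
print(first_day, urls[first_day])
```

Iterating over `urls.items()` in order yields consecutive days, which is what a day-over-day comparison needs.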

In [None]:
import os
import requests
from pprint import pprint

# The OpenSanctions service API. This endpoint will only do sanctions checks.
URL = "https://api.opensanctions.org/match/sanctions?algorithm=best"

# Read an environment variable to get the API key:
API_KEY = os.environ.get("OPENSANCTIONS_API_KEY")

# Create an HTTP session which manages connections and defines shared header configuration:
session = requests.Session()
session.headers['Authorization'] = f"ApiKey {API_KEY}"

# A query for a person with a specific name and birth date. Note multiple names given 
# in different alphabets:
EXAMPLE_1 = {
    "schema": "Person",
    "properties": {
        "name": ["Arkadiii Romanovich Rotenberg", "Ротенберг Аркадий"],
        "birthDate": ["1951"],
    },
}

# Similarly, a company search using just a name and jurisdiction.
EXAMPLE_2 = {
    "schema": "Company",
    "properties": {
        "name": ["Stroygazmontazh"],
        "jurisdiction": ["Russia"],
    },
}

# We put both of these queries into a matching batch, giving each of them an
# ID that we can recognize it by later:
BATCH = {"queries": {"q1": EXAMPLE_1, "q2": EXAMPLE_2}}

# This configures the scoring system. "fuzzy" is related only to the pre-retrieval
# of entities and can be turned off for a performance boost.
PARAMS = {"algorithm": "best", "fuzzy": "false"}

# Send the batch off to the API and raise an exception for a non-OK response code.
response = session.post(URL, json=BATCH, params=PARAMS)
response.raise_for_status()

responses = response.json().get("responses")

# The responses will include a set of results for each entity, and a parsed version of
# the original query:
example_1_response = responses.get("q1")
example_2_response = responses.get("q2")

# You can use the returned query to debug if the API correctly parsed and interpreted 
# the queries you provided. If any of the fields or values are missing, it's an
# indication their format wasn't accepted by the system.
pprint(example_2_response["query"])

# The results are a list of entities, formatted using the same structure as your
# query examples. By default, the API will at most return five potential matches.
for result in example_2_response['results']:
    pprint(result)