## XBRL US API - ACFR statements by report  

### Authenticate for access token
Click in the gray code cell below, then click the Run button above to execute the cell. Type your XBRL US Web account email, account password, Client ID, and secret as noted, pressing the Enter key on the keyboard after each entry.

XBRL US limits the number of records returned per request to improve efficiency, so this script loops to collect all data for a query from the Public Filings Database. **Non-members might not be able to retrieve all data for a query**; join XBRL US for comprehensive access: https://xbrl.us/join.
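
The paging idea described above can be sketched without touching the API. This is a minimal illustration, not the notebook's own loop: `collect_all`, `fake_fetch` and `fake_records` are hypothetical names, and the stand-in data is made up purely to show when the loop stops.

```python
def collect_all(fetch_page, limit):
    """Collect rows page by page until a short page signals the end."""
    rows = []
    offset = 0
    while True:
        # fetch_page is a hypothetical callable that returns a list of rows
        page = fetch_page(offset, limit)
        rows.extend(page)
        if len(page) < limit:
            # fewer rows than the limit means there is nothing left to request
            break
        offset += limit
    return rows


# Stand-in for the API (assumption): 250 made-up records served in slices.
fake_records = list(range(250))


def fake_fetch(offset, limit):
    return fake_records[offset:offset + limit]


all_rows = collect_all(fake_fetch, limit=100)
print(len(all_rows))
```

The query cell further down follows the same pattern, with the offset passed to the API through the `fields` parameter.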

In [2]:
import os, re, sys, json
import requests
import pandas as pd
from IPython.display import display, HTML
import numpy as np
import getpass
from datetime import datetime
import urllib
from urllib.parse import urlencode

# prompt for credentials; password, Client ID and secret are not echoed
print('Enter your XBRL US Web account email: ')
email = input()
password = getpass.getpass(prompt='Password: ')
clientid = getpass.getpass(prompt='Client ID: ')
secret = getpass.getpass(prompt='Secret: ')

body_auth = {'username' : ''.join(email),
            'client_id': ''.join(clientid),
            'client_secret' : ''.join(secret),
            'password' : ''.join(password),
            'grant_type' : 'password',
            'platform' : 'ipynb' }

payload = urlencode(body_auth)
url = 'https://api.xbrl.us/oauth2/token'
headers = {"Content-Type": "application/x-www-form-urlencoded"}

res = requests.request("POST", url, data=payload, headers=headers)
auth_json = res.json()

if 'error' in auth_json:
    print ("\n\nThere was a problem generating an access token with these credentials. Run the first cell again to enter credentials.")
else:
    # store the tokens only when authentication succeeded; reading them from
    # an error response would raise a KeyError
    access_token = auth_json['access_token']
    refresh_token = auth_json['refresh_token']
    print ("\n\nYour access token expires in 60 minutes. After it expires, run the cell immediately below this one to generate a new token and continue to use the query cell. \n\nFor now, skip ahead to the section 'Make a Query'.")
newaccess = ''
newrefresh = ''
#print('access token: ' + access_token + ' refresh token: ' + refresh_token)

Enter your XBRL US Web account email: 


Your access token expires in 60 minutes. After it expires, run the cell immediately below this one to generate a new token and continue to use the query cell. 

For now, skip ahead to the section 'Make a Query'.


#### Refresh token
The cell below is only needed to refresh an expired access token after 60 minutes. When the access token no longer returns results, run the cell below to refresh the access token or re-enter credentials by running the cell above. Until the refresh token process is needed, **skip ahead to _Make a Query_**.
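
One way to decide when the refresh cell is needed is to record when the token was issued and compare against the 60-minute lifetime. The sketch below is an assumption-laden illustration: `token_expired`, `TOKEN_LIFETIME` and the one-minute safety margin are hypothetical and not part of the XBRL US API.

```python
from datetime import datetime, timedelta

# Lifetime stated in this notebook; the safety margin below is an assumption.
TOKEN_LIFETIME = timedelta(minutes=60)


def token_expired(issued_at, now=None, margin=timedelta(minutes=1)):
    """Return True when the token is at, or within `margin` of, its lifetime."""
    now = now or datetime.now()
    return now - issued_at >= TOKEN_LIFETIME - margin


issued = datetime(2024, 4, 15, 13, 0, 0)
print(token_expired(issued, now=datetime(2024, 4, 15, 13, 30, 0)))  # False
print(token_expired(issued, now=datetime(2024, 4, 15, 14, 0, 0)))   # True
```

To use a check like this, store `datetime.now()` right after a successful authentication and test it before running a query.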


In [2]:
# use the most recent refresh token; 'token' itself is not defined before this line
token = newrefresh if newrefresh != '' else refresh_token

refresh_auth = {'client_id': ''.join(clientid),
            'client_secret' : ''.join(secret),
            'grant_type' : 'refresh_token',
            'platform' : 'ipynb',
            'refresh_token' : ''.join(token) }
refreshres = requests.post(url, data=refresh_auth)
refresh_json = refreshres.json()
access_token = refresh_json['access_token']
refresh_token = refresh_json['refresh_token']
#print('access token: ' + access_token + 'refresh token: ' + refresh_token)
print('Your access token is refreshed for 60 minutes. If it expires again, run this cell to generate a new token and continue to use the query cells below.')
print(access_token)

NameError: name 'newrefresh' is not defined

### Make a query
After the access token confirmation appears above, you can modify the query below and use the **_Cell >> Run_** menu option with the cell **immediately below this text** to run the query for updated results.

The sample results are from a set of ACFR reports posted to the XBRL US Public Filings Database.  To test for results quickly, modify the **_report\_ids_** to shorten the list, and change the **_XBRL\_Elements_** to return different data from an ACFR statement.
  
Refer to XBRL API documentation at https://xbrlus.github.io/xbrl-api/#/Facts/getFactDetails for other endpoints and parameters to filter and return.
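
To see how the filter and field lists turn into a request, the sketch below assembles a search URL in the same shape as the one the query cell prints when it finishes. `build_search_url` and `example_url` are hypothetical names used only for illustration; the query cell itself passes `params` to `requests.get` rather than building the string by hand.

```python
from urllib.parse import urlencode, unquote


def build_search_url(endpoint, report_ids, fields, description):
    """Assemble a search URL from the filter values and the fields to return."""
    params = {
        'report.id': ','.join(report_ids),   # filter: which reports to query
        'fields': ','.join(fields),          # which columns to return
        'unique': '',                        # flag parameter with no value
        'cube.description': description,     # filter: which statement
    }
    return 'https://api.xbrl.us/api/v1/' + endpoint + '/search?' + urlencode(params)


example_url = build_search_url('cube', ['677268', '677267'],
                               ['report.id', 'fact.value'], '200110')
# unquote() turns the percent-encoded commas back into readable text
print(unquote(example_url))
```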

In [3]:
import requests
import pandas as pd
from datetime import datetime
import urllib.parse
from IPython.display import display, HTML

# Define the parameters for the filter and fields to be returned,
# run the loop to return results
offset_value = 0
res_df = []

# Define which endpoint to use
endpoint = 'cube' #taxonomy presentation linkbase + facts

# Define the parameters of the query

# statementIDs = [801150, 300000, 705000, 801100, 400000, 300690, 300691, 804000,
#        200110, 709100, 404000, 805050, 704100, 700000, 100000, 300001,
#        805000, 804050, 605000, 604000, 606000, 607000, 704000, 702000,
#        601000, 600000, 602000, 501200, 502000, 501800, 603000, 803000,
#        803050, 707000, 709000, 200060, 200300, 200050, 200000, 100700,
#        100400, 100100, 100800, 101100, 200040, 300200, 500000, 500800,
#        500600, 300600, 300500, 300745, 300740, 300710]

statementIDs = [404000, 300690, 200000, 801150, 200110]

# query for ACFR reports, sort by year descending and name ascending
# https://api.xbrl.us/api/v1/report/search?report.source-name=ACFR&fields=report.entity-name,report.period-focus,report.year-focus,report.filing-date,report.id,report.entry-url,report.source-name

report_ids = [
    '677268',  # County of Ogemaw: https://xbrlus.github.io/acfr/ixviewer/ix.html?doc=../samples/100/Ogemaw-20210930-Annual-Accounts.htm
    '677267'  # Flint, Michigan: https://xbrlus.github.io/acfr/ixviewer/ix.html?doc=../samples/107/FLINTF652021.htm
    #'591765',  # William Rainey Harper College: https://xbrlus.github.io/acfr/ixviewer/ix.html?doc=../samples/106/HARPER2021.htm
    #'591766',  # Oakton Community College: https://xbrlus.github.io/acfr/ixviewer/ix.html?doc=../samples/77/OAKTON2021.html
    #'591767'  # College of DuPage: https://xbrlus.github.io/acfr/ixviewer/ix.html?doc=../samples/82/COD2021.htm
]

# query for unique Statements in the 2022 GRIP Taxonomy
# https://api.xbrl.us/api/v1/dts/729592/network/search?network.link-name=presentationLink&fields=network.role-description.sort(ASC),dts.id&unique

XBRL_Elements = [
    str(x) for x in statementIDs
]

# Define data fields to return (multi-sort based on order)

fields = [
    # this is the list of the characteristics of the data being returned by the query
    'report.id',
    'period.fiscal-year',
    'cube.description.sort(ASC)',
    'cube.tree-sequence.sort(ASC)',
    'report.entity-name',
    'dimensions.count',
    'dimension-pair',
    'cube.primary-local-name',
    'fact.value',
    'unit'
]

params = {
    # this is the list of what's being queried against the endpoint
    'report.id': ','.join(report_ids),
    'fields': ','.join(fields),
    'unique': ''
}

# Create query and loop for all results - code below does not need to be changed
search_endpoint = 'https://api.xbrl.us/api/v1/' + endpoint + '/search'
orig_fields = params['fields']
query_start = datetime.now()

for xbrl_element in XBRL_Elements:
    params['cube.description'] = xbrl_element
    # start each statement from the first page; otherwise the offset reached
    # by the previous statement would carry over and skip rows
    offset_value = 0
    params['fields'] = orig_fields
    while True:
        res = requests.get(search_endpoint, params=params, headers={'Authorization': 'Bearer {}'.format(access_token)})
        res_json = res.json()
        if 'error' in res_json:
            print('There was an error: {}'.format(res_json['error_description']))
            break

        print("up to", str(offset_value + res_json['paging']['limit']), "records are found so far ...")

        res_df += res_json['data']

        if res_json['paging']['count'] < res_json['paging']['limit']:
            # a short page means there are no more rows for this statement
            print(" - this set contained fewer than the", res_json['paging']['limit'], "possible, only",
                  str(res_json['paging']['count']), "records.")
            break
        else:
            # request the next page by appending the offset to the fields list
            offset_value += res_json['paging']['limit']
            params['fields'] = orig_fields + ',' + endpoint + '.offset({})'.format(offset_value)
            # stop at the row cap: 10 pages at a limit of 100, 4 pages at a limit of 500
            if 100 == res_json['paging']['limit'] and offset_value == 10 * res_json['paging']['limit']:
                break
            elif 500 == res_json['paging']['limit'] and offset_value == 4 * res_json['paging']['limit']:
                break

if 'error' not in res_json:
    current_datetime = datetime.now().replace(microsecond=0)
    time_taken = current_datetime - query_start
    index = pd.DataFrame(res_df).index
    total_rows = len(index)
    your_limit = res_json['paging']['limit']
    limit_message = "If the results below match the limit noted above, you might not be seeing all rows, and should consider upgrading (https://xbrl.us/access-token).\n"

    if your_limit == 100:
        print("\nThis non-Member account has a limit of ", 10 * your_limit,
              " rows per query from our Public Filings Database. " + limit_message)
    elif your_limit == 500:
        print("\nThis Basic Individual Member account has a limit of ", 4 * your_limit,
              " rows per query from our Public Filings Database. " + limit_message)

    print("\nAt " + current_datetime.strftime("%c") + ", the query finished with  ", str(total_rows),
          "  rows returned in " + str(time_taken) + " for \n" + urllib.parse.unquote(res.url))

    df = pd.DataFrame(res_df)

up to 5000 records are found so far ...
 - this set contained fewer than the 5000 possible, only 206 records.
up to 5000 records are found so far ...
 - this set contained fewer than the 5000 possible, only 135 records.
up to 5000 records are found so far ...
 - this set contained fewer than the 5000 possible, only 145 records.
up to 5000 records are found so far ...
 - this set contained fewer than the 5000 possible, only 293 records.
up to 5000 records are found so far ...
 - this set contained fewer than the 5000 possible, only 90 records.

At Mon Apr 15 13:58:26 2024, the query finished with   869   rows returned in 0:00:09.351173 for 
https://api.xbrl.us/api/v1/cube/search?report.id=677268,677267&fields=report.id,period.fiscal-year,cube.description.sort(ASC),cube.tree-sequence.sort(ASC),report.entity-name,dimensions.count,dimension-pair,cube.primary-local-name,fact.value,unit&unique=&cube.description=200110


In [4]:
# create separate column for cube id
df['cube_id'] = df['cube.description'].apply(lambda x: x.split('-')[0].strip())


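# convert fact values to numbers; values that cannot be parsed become NaN
df['fact.value'] = pd.to_numeric(df['fact.value'], errors='coerce')
# keep only the rows with a numeric fact value
df = df[df['fact.value'].notna()]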

In [5]:
df.to_csv('xbrl_data.csv', index=False)