# Companies House API
### Filing history example
---
This example shows how to collect the filing history of a given company for a selected category and save it to a CSV file.

Categories in the API have a finer granularity than those shown on the website.  
Category labels are listed on the [filing history resource](https://developer-specs.company-information.service.gov.uk/companies-house-public-data-api/resources/filinghistorylist?v=latest) documentation page, which describes the resource used to make the API calls.

The `filingHistoryList` categories are:
- accounts
- address
- annual-return
- capital
- change-of-name
- incorporation
- liquidation
- miscellaneous
- mortgage
- officers
- resolution

#### Script execution
Invoke the script with the *company number* and the *category* as arguments.

<div class="alert alert-block alert-info">
    Because the script takes command-line arguments, do <b>NOT</b> run it in a Jupyter notebook.<br/>  
    Run it in a shell instead, for example the <i>Anaconda Prompt</i>.
</div>

For example:

```bash

## Example Usage for TESCO
## Tesco Company number: 00445790
##
## Retrieving "incorporation" records

tony@tony:~$ python filing_history.py 00445790 incorporation

total records: 129. Downloading 2 pages of 100 records each.
https://api.company-information.service.gov.uk/company/00445790/filing-history?items_per_page=100&category=incorporation
https://api.company-information.service.gov.uk/company/00445790/filing-history?items_per_page=100&category=incorporation&start_index=100

00445790_incorporation.csv

tony@tony:~$

```

In this case, the script downloaded two batches of up to 100 records each (the API returns at most 100 records per call) and saved the results to *00445790_incorporation.csv*, which you can open in Excel. The CSV formatting is still rough, but it is enough to give you an idea of the data.
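The batch arithmetic can be worked through in isolation. The sketch below assumes that `start_index` is a zero-based offset into the full result set, which is how the Companies House documentation describes it, and that each call returns at most 100 records. `page_offsets` is a hypothetical helper, not a function from the script.

```python
import math


def page_offsets(total_count: int, page_size: int = 100) -> list[int]:
    """Return the start_index value for each page needed to cover total_count records."""
    pages = math.ceil(total_count / page_size)
    return [page * page_size for page in range(pages)]


if __name__ == "__main__":
    # 129 records need two calls: records 0-99 and records 100-128.
    print(page_offsets(129))  # [0, 100]
```

With 129 records this gives two pages, which matches the "Downloading 2 pages" message in the example output.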

The code is below. Copy and paste it into a Python file (for instance `filing_history.py`) and **remember to amend the line** `api_key = "YOUR-API-KEY"` **with your API key**.  
Then run the command shown above in the shell or Anaconda Prompt.

In [None]:
#!/usr/bin/env python 
"""\
Companies House API test script
Usage: python filing_history.py <company_number> <filing history category>
"""

import sys

import pandas as pd
import requests


# Functions
def mround(n, r):
    # Rounds n up to the next multiple of r (e.g. 129 -> 200 for r = 100)
    ###
    
    n -= n % -r
    return n

def calculatePages(n):
    # Calculates number of pages
    ###
    
    return int(mround(n, 100)/100)


def historyPage(url, calculate_pages = False):
    # Returns list [pages, items dataframe]
    ###
    
    response = requests.get(url, headers = HEADERS)
    
    if response.status_code == 200:
        data = response.json()
        
        if calculate_pages:
            records = data.get("total_count")
            pages = calculatePages(data.get("total_count"))
            print(f"total records: {records}. Downloading {pages} pages of 100 records each.")
        else:
            pages = -1
    
    else:
        print("Error occurred. Status code: ", response.status_code)
        sys.exit(1)
    
    print(url)
    return [pages, pd.json_normalize(data.get("items"))]


# Main

api_key = "YOUR-API-KEY"
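# Expect exactly two arguments: the company number and the category
if len(sys.argv) != 3:
    print(__doc__)
    sys.exit(1)

company_number = sys.argv[1]
category = sys.argv[2]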

URL = f"https://api.company-information.service.gov.uk/company/{company_number}/filing-history?items_per_page=100&category={category}"
HEADERS = {"Authorization": api_key, "Accept": "application/json"}

# history object is a custom "object" which looks like this [pages, data] created by the historyPage function 
history_object = historyPage(URL, calculate_pages = True)

pages = history_object[0] 
history_df = history_object[1]
history_columns_order = history_df.columns

for page in range(1, pages):
    # start_index is the zero-based offset of the first record, not a page number
    url = URL + f"&start_index={page * 100}"
    
    history_object = historyPage(url) 
    new_history_df = history_object[1]
    #new_history_df = new_history_df[history_df.columns] #rearrange columns
    
    history_df = pd.concat([history_df, new_history_df])

filename = f"{company_number}_{category}.csv"
history_df.to_csv(filename, index = False)
print(filename)
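
A note on authentication: the script sends the API key as the raw value of the `Authorization` header. The Companies House developer documentation describes HTTP Basic authentication, with the API key as the username and an empty password. If the raw header is rejected, something along these lines may help. This is a sketch under that assumption; `basic_auth_header` is a hypothetical helper, not part of the script.

```python
import base64


def basic_auth_header(api_key: str) -> dict[str, str]:
    """Build request headers for HTTP Basic auth with the key as username and no password."""
    # Basic auth encodes "username:password"; the password is left empty here.
    token = base64.b64encode(f"{api_key}:".encode("ascii")).decode("ascii")
    return {"Authorization": f"Basic {token}", "Accept": "application/json"}


if __name__ == "__main__":
    # "YOUR-API-KEY" is a placeholder, as in the script.
    print(basic_auth_header("YOUR-API-KEY"))
```

The resulting dictionary could replace `HEADERS` in the script. With `requests`, passing `auth=(api_key, "")` to `requests.get` produces the same header without building it by hand.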