### Notebook to load and analyze Federal Court of Appeal cases

Sean Rehaag

License: Creative Commons Attribution-NonCommercial 4.0 International [(CC BY-NC 4.0)](https://creativecommons.org/licenses/by-nc/4.0/)

Dataset & Code to be cited as: 

    Sean Rehaag, "Federal Court of Appeal Bulk Decisions Dataset" (2023), online: Refugee Law Laboratory <https://refugeelab.ca/bulk-data/fca/>.

Notes:

(1) Data Source: [Federal Court of Appeal](https://www.fca-caf.gc.ca). 

(2) Unofficial Data: The data are unofficial reproductions of materials on the Federal Court of Appeal website. Links to official versions are included in the dataset.

(3) Non-Affiliation / Endorsement: The data have been collected and reproduced without any affiliation with or endorsement from the Federal Court of Appeal.

(4) Non-Commercial Use: As indicated in the license, the data may be used for non-commercial purposes only (with attribution). For commercial use, see the Federal Court of Appeal website's [Terms of Use](https://www.fca-caf.gc.ca/fca-caf_eng/important_eng.html).

(5) Accuracy: The data were collected and processed programmatically for the purposes of academic research. While we make best efforts to ensure accuracy, data gathering of this kind inevitably involves errors. As such, the data should be viewed as preliminary information intended to prompt further research and discussion, rather than as definitive information.

(6) Limitation: The dataset only includes cases with a neutral citation, which came into use in 2001.

(7) Delay: Decisions may take many months to be translated (sometimes over a year). As a result, in the most recent years, decisions may only be available in one language.

### Requirements:

    pip install pandas
    pip install requests

(Written on Python 3.9.12)


In [1]:
# import libraries
import pandas as pd
import json
import pathlib
import requests

# Set variables
start_year = 2001  # First year of data sought (2001 +)
end_year = 2022  # Last year of data sought (2022 -)
language = None  # language of cases sought ('en', 'fr', or None for both)


### Load Data

Two Options: Local & Remote

In [2]:
# OPTION 1: Load data locally via cloned repo

# First, clone git repo
# Then run this code to load data

# Set path to data
data_path = pathlib.Path('DATA/YEARLY/')

# load data (all years, json files)
results = []
for year in range(start_year, end_year+1):
    # read as UTF-8 explicitly so accented characters load correctly on any platform
    with open(data_path / f'{year}.json', encoding='utf-8') as f:
        results.extend(json.load(f))

# convert to dataframe
df = pd.DataFrame(results)

# filter by language if applicable
if language:
    df = df[df['language'] == language]
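
The local loading loop above stops with an error if any yearly file is missing from the cloned repo. A minimal sketch of a more tolerant loader follows. The function name `load_local_years` and the made-up sample records are illustrative assumptions, not part of the dataset or the original notebook.

```python
import json
import pathlib
import tempfile

import pandas as pd


def load_local_years(data_path, start_year, end_year, language=None):
    """Hypothetical helper: load yearly JSON files, skipping missing years."""
    records = []
    for year in range(start_year, end_year + 1):
        year_file = pathlib.Path(data_path) / f'{year}.json'
        if not year_file.exists():
            print(f'Skipping {year}: {year_file} not found')
            continue
        # read as UTF-8 so accented characters load correctly
        with open(year_file, encoding='utf-8') as f:
            records.extend(json.load(f))
    df = pd.DataFrame(records)
    # filter by language if applicable
    if language and not df.empty:
        df = df[df['language'] == language]
    return df


# Demonstration on made-up records written to a temporary folder
with tempfile.TemporaryDirectory() as tmp:
    tmp_path = pathlib.Path(tmp)
    sample = {
        2001: [{'citation': 'EXAMPLE 1', 'year': 2001, 'language': 'en'},
               {'citation': 'EXAMPLE 2', 'year': 2001, 'language': 'fr'}],
        2003: [{'citation': 'EXAMPLE 3', 'year': 2003, 'language': 'en'}],
    }
    for year, rows in sample.items():
        with open(tmp_path / f'{year}.json', 'w', encoding='utf-8') as f:
            json.dump(rows, f)
    # 2002 has no file here, so it is skipped with a message
    demo_all = load_local_years(tmp_path, 2001, 2003)
    demo_en = load_local_years(tmp_path, 2001, 2003, language='en')
```

With the real data, the call would presumably look like `load_local_years('DATA/YEARLY/', start_year, end_year, language)`.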

In [3]:
# OPTION 2: Load data remotely from GitHub without cloning repo
# Note: load time varies depending on internet connection (approx 300 MB of data)

base_url = 'https://raw.githubusercontent.com/Refugee-Law-Lab/fca_bulk_data/master/DATA/YEARLY/'

# load data
results = []
for year in range(start_year, end_year+1):
    url = base_url + f'{year}.json'
    results.extend(requests.get(url).json())

# convert to dataframe
df = pd.DataFrame(results)

# filter by language if applicable
if language:
    df = df[df['language'] == language]
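
Because the remote option downloads a large amount of data, a failed or stalled request can be hard to diagnose. The sketch below shows one possible way to add a timeout and an HTTP status check. The names `fetch_year` and `load_remote_years`, the `fetch` parameter, and the 60-second timeout are assumptions made for illustration. The demonstration at the end uses a fake fetch function, so it runs without network access.

```python
import pandas as pd
import requests

BASE_URL = 'https://raw.githubusercontent.com/Refugee-Law-Lab/fca_bulk_data/master/DATA/YEARLY/'


def fetch_year(year, session, timeout=60):
    """Hypothetical helper: download one yearly JSON file as a list of records."""
    response = session.get(f'{BASE_URL}{year}.json', timeout=timeout)
    response.raise_for_status()  # raise an exception on 404 and other HTTP errors
    return response.json()


def load_remote_years(start_year, end_year, language=None, fetch=None):
    """Hypothetical helper: load all years into one dataframe.

    `fetch` is an optional callable taking a year and returning a list of
    records; it defaults to a real download through a shared session.
    """
    records = []
    with requests.Session() as session:
        if fetch is None:
            fetch = lambda year: fetch_year(year, session)
        for year in range(start_year, end_year + 1):
            records.extend(fetch(year))
    df = pd.DataFrame(records)
    # filter by language if applicable
    if language and not df.empty:
        df = df[df['language'] == language]
    return df


# Offline demonstration with made-up records (no download takes place)
def fake_fetch(year):
    return [{'citation': f'EXAMPLE {year}', 'year': year, 'language': 'en'}]


demo_remote = load_remote_years(2001, 2002, fetch=fake_fetch)
```

To download the real files, the call would be `load_remote_years(start_year, end_year, language)` with no `fetch` argument.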

### Analyze Data

In [4]:
# View dataframe
df

Unnamed: 0,citation,citation2,year,name,language,decision_date,source_url,scraped_timestamp,unofficial_text
0,2001 FCA 1,,2001,Bouvidard Ltée. c. Commission de l'assurance e...,en,2001-02-01,https://decisions.fca-caf.gc.ca/fca-caf/decisi...,2022-08-29,Bouvidard Ltée. c. Commission de l'assurance e...
1,2001 FCA 10,,2001,Mclean v. Canada (Minister of Citizenship and ...,en,2001-02-08,https://decisions.fca-caf.gc.ca/fca-caf/decisi...,2022-08-29,Mclean v. Canada (Minister of Citizenship and ...
2,2001 FCA 100,,2001,Mai v. Canada (Minister of Citizenship and Imm...,en,2001-04-02,https://decisions.fca-caf.gc.ca/fca-caf/decisi...,2022-08-29,Mai v. Canada (Minister of Citizenship and Imm...
3,2001 FCA 101,,2001,Mitchell Verification Services Group Inc. v. C...,en,2001-04-03,https://decisions.fca-caf.gc.ca/fca-caf/decisi...,2022-08-29,Mitchell Verification Services Group Inc. v. C...
4,2001 FCA 102,,2001,Pontbriand v. Canada (Attorney General),en,2001-04-03,https://decisions.fca-caf.gc.ca/fca-caf/decisi...,2022-08-29,Pontbriand v. Canada (Attorney General)\nCourt...
...,...,...,...,...,...,...,...,...,...
13885,2022 CAF 93,,2022,AllStaff Inc. c. Canada (Procureur Général),fr,2022-05-30,https://decisions.fca-caf.gc.ca/fca-caf/decisi...,2023-04-13,AllStaff Inc. c. Canada (Procureur général)\nB...
13886,2022 CAF 96,,2022,"CSX Transportation, Inc. c. ABB Inc.",fr,2022-05-31,https://decisions.fca-caf.gc.ca/fca-caf/decisi...,2023-04-13,"CSX Transportation, Inc. c. ABB Inc.\nBase de ..."
13887,2022 CAF 97,,2022,Thomson c. Canada (Procureur général),fr,2022-06-01,https://decisions.fca-caf.gc.ca/fca-caf/decisi...,2023-04-13,Thomson c. Canada (Procureur général)\nBase de...
13888,2022 CAF 98,,2022,Brown c. Ontario (Tribunal de l'aide sociale),fr,2022-06-01,https://decisions.fca-caf.gc.ca/fca-caf/decisi...,2023-04-13,Brown c. Ontario (Tribunal de l'aide sociale)\...


In [5]:
# language counts
df['language'].value_counts()

en    6990
fr    6900
Name: language, dtype: int64

In [6]:
# Yearly counts
year_counts = df['year'].value_counts().sort_index()
for year, count in year_counts.items():
    print(f'{year}: {count}')


2001: 790
2002: 1001
2003: 947
2004: 870
2005: 859
2006: 832
2007: 810
2008: 810
2009: 663
2010: 636
2011: 642
2012: 579
2013: 514
2014: 494
2015: 483
2016: 559
2017: 456
2018: 387
2019: 496
2020: 371
2021: 409
2022: 282
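
Note (7) says that recent decisions may be available in only one language because of translation delays. One way to look for this is to cross-tabulate year against language. The sketch below uses a small made-up dataframe (`sample_df`) in place of the real `df`; the values are illustrative only and are not results from the dataset.

```python
import pandas as pd

# Made-up sample with the same 'year' and 'language' columns as the dataset
sample_df = pd.DataFrame({
    'year': [2021, 2021, 2022, 2022, 2022],
    'language': ['en', 'fr', 'en', 'en', 'fr'],
})

# Rows are years, columns are languages, cells are decision counts
counts_by_year_language = pd.crosstab(sample_df['year'], sample_df['language'])

# A non-zero difference suggests decisions not yet available in both languages
counts_by_year_language['difference'] = (
    counts_by_year_language['en'] - counts_by_year_language['fr']
)
print(counts_by_year_language)
```

Replacing `sample_df` with `df` (loaded with `language = None`) should give the same breakdown for the full dataset.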
