### Notebook to load and analyze Social Security Tribunal of Canada cases

Author: Sean Rehaag

License: Creative Commons Attribution-NonCommercial 4.0 International [(CC BY-NC 4.0)](https://creativecommons.org/licenses/by-nc/4.0/). 

Dataset & Code to be cited as: 

    Sean Rehaag, "Social Security Tribunal Bulk Decisions Dataset" (2023), online: Refugee Law Laboratory <https://refugeelab.ca/bulk-data/sst>.

### Notes:

(1) Data Source: [Social Security Tribunal of Canada](https://www.sst-tss.gc.ca).

(2) Unofficial Data: The data are unofficial reproductions of materials on the Social Security Tribunal website. Links to official versions are included in the dataset.

(3) Non-Affiliation / Endorsement: The data has been collected and reproduced without any affiliation or endorsement from the Social Security Tribunal of Canada.

(4) Non-Commercial Use: As indicated in the license, the data may be used for non-commercial purposes only (with attribution). For commercial use, see the Social Security Tribunal of Canada website's [Terms of Use](https://www.sst-tss.gc.ca/en/terms-and-conditions).

(5) Accuracy: Data was collected and processed programmatically for the purposes of academic research. While we make best efforts to ensure accuracy, data gathering of this kind inevitably involves errors. As such, the data should be viewed as preliminary information intended to prompt further research and discussion, rather than as definitive information.

### Requirements:

    pip install pandas

### If using parquet

    pip install pyarrow

### If loading remotely (other than via Hugging Face)
    
    pip install requests

### If loading remotely via Hugging Face

    pip install datasets
    

(Written on Python 3.9.12)
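Before running the loading cells, it can help to confirm which of the packages listed above are importable in the current environment. The following is a minimal sketch using only the standard library; the helper name `missing_packages` and the `OPTIONAL_PACKAGES` grouping are illustrative assumptions and not part of the original notebook.

```python
import importlib.util

# Packages named in the requirements above, keyed by what they are needed for.
# (This grouping is an illustrative assumption, not part of the notebook.)
OPTIONAL_PACKAGES = {
    'core': ['pandas'],
    'parquet': ['pyarrow'],
    'remote': ['requests'],
    'hugging_face': ['datasets'],
}


def missing_packages(names):
    """Return the top-level package names that cannot be found for import."""
    return [name for name in names if importlib.util.find_spec(name) is None]


# Report, per purpose, whether the needed packages are available.
for purpose, names in OPTIONAL_PACKAGES.items():
    missing = missing_packages(names)
    status = 'ok' if not missing else 'missing: ' + ', '.join(missing)
    print(f'{purpose}: {status}')
```

Only the packages for the chosen loading option need to be present; a "missing" line for an unused option can be ignored.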

### Load Data

Four Options: Local & Remote

In [6]:
# OPTION 1: Load Hugging Face dataset

from datasets import load_dataset
import pandas as pd

dataset = load_dataset("refugee-law-lab/canadian-legal-data", split="train", data_dir="SST")

# convert to dataframe
df = pd.DataFrame(dataset)

df


Unnamed: 0,citation,citation2,dataset,year,name,language,document_date,source_url,scraped_timestamp,unofficial_text,other
0,2013 SSTAD 1,,SST,2013,K.O. v. Minister of Human Resources and Skills...,en,2013-05-14,https://decisions.sst-tss.gc.ca/sst-tss/cppd-r...,2023-07-26,K.O. v. Minister of Human Resources and Skills...,
1,2013 SSTAD 10,,SST,2013,S. H. v. Minister of Human Resources and Skill...,en,2013-11-27,https://decisions.sst-tss.gc.ca/sst-tss/cppd-r...,2023-07-26,S. H. v. Minister of Human Resources and Skill...,
2,2013 SSTAD 11,,SST,2013,M. H. v. Minister of Human Resources and Skill...,en,2013-11-27,https://decisions.sst-tss.gc.ca/sst-tss/cppd-r...,2023-07-26,M. H. v. Minister of Human Resources and Skill...,
3,2013 SSTAD 12,,SST,2013,L. L. and Minister of Human Resources and Skil...,en,2013-12-31,https://decisions.sst-tss.gc.ca/sst-tss/cpp-rp...,2023-07-27,L. L. and Minister of Human Resources and Skil...,
4,2013 SSTAD 13,,SST,2013,R. S. v. Minister of Human Resources and Skill...,en,2013-12-17,https://decisions.sst-tss.gc.ca/sst-tss/cppd-r...,2023-07-26,R. S. v. Minister of Human Resources and Skill...,
...,...,...,...,...,...,...,...,...,...,...,...
26545,2023 TSS 993,,SST,2023,JP c Commission de l’assurance-emploi du Canada,fr,2023-07-27,https://decisions.sst-tss.gc.ca/sst-tss/ei-ae/...,2023-12-04,JP c Commission de l’assurance-emploi du Canad...,
26546,2023 TSS 994,,SST,2023,JP c Commission de l’assurance-emploi du Canada,fr,2023-01-30,https://decisions.sst-tss.gc.ca/sst-tss/ei-ae/...,2023-12-04,JP c Commission de l’assurance-emploi du Canad...,
26547,2023 TSS 997,,SST,2023,HR c Commission de l’assurance-emploi du Canada,fr,2023-07-28,https://decisions.sst-tss.gc.ca/sst-tss/ei-ae/...,2023-12-04,HR c Commission de l’assurance-emploi du Canad...,
26548,2023 TSS 998,,SST,2023,HR c Commission de l’assurance-emploi du Canada,fr,2023-05-19,https://decisions.sst-tss.gc.ca/sst-tss/ei-ae/...,2023-12-04,HR c Commission de l’assurance-emploi du Canad...,


In [7]:
# OPTION 2: Load parquet data remotely from Hugging Face without cloning repo
import pandas as pd
import requests
from io import BytesIO

url = 'https://huggingface.co/datasets/refugee-law-lab/canadian-legal-data/resolve/main/SST/train.parquet'

# load data (raise an error if the download fails)
results = requests.get(url)
results.raise_for_status()

# convert to dataframe
df = pd.read_parquet(BytesIO(results.content))

df

# (if code fails, add engine='pyarrow' to read_parquet() function)

Unnamed: 0,citation,citation2,dataset,year,name,language,document_date,source_url,scraped_timestamp,unofficial_text,other
0,2013 SSTAD 1,,SST,2013,K.O. v. Minister of Human Resources and Skills...,en,2013-05-14,https://decisions.sst-tss.gc.ca/sst-tss/cppd-r...,2023-07-26,K.O. v. Minister of Human Resources and Skills...,
1,2013 SSTAD 10,,SST,2013,S. H. v. Minister of Human Resources and Skill...,en,2013-11-27,https://decisions.sst-tss.gc.ca/sst-tss/cppd-r...,2023-07-26,S. H. v. Minister of Human Resources and Skill...,
2,2013 SSTAD 11,,SST,2013,M. H. v. Minister of Human Resources and Skill...,en,2013-11-27,https://decisions.sst-tss.gc.ca/sst-tss/cppd-r...,2023-07-26,M. H. v. Minister of Human Resources and Skill...,
3,2013 SSTAD 12,,SST,2013,L. L. and Minister of Human Resources and Skil...,en,2013-12-31,https://decisions.sst-tss.gc.ca/sst-tss/cpp-rp...,2023-07-27,L. L. and Minister of Human Resources and Skil...,
4,2013 SSTAD 13,,SST,2013,R. S. v. Minister of Human Resources and Skill...,en,2013-12-17,https://decisions.sst-tss.gc.ca/sst-tss/cppd-r...,2023-07-26,R. S. v. Minister of Human Resources and Skill...,
...,...,...,...,...,...,...,...,...,...,...,...
26545,2023 TSS 993,,SST,2023,JP c Commission de l’assurance-emploi du Canada,fr,2023-07-27,https://decisions.sst-tss.gc.ca/sst-tss/ei-ae/...,2023-12-04,JP c Commission de l’assurance-emploi du Canad...,
26546,2023 TSS 994,,SST,2023,JP c Commission de l’assurance-emploi du Canada,fr,2023-01-30,https://decisions.sst-tss.gc.ca/sst-tss/ei-ae/...,2023-12-04,JP c Commission de l’assurance-emploi du Canad...,
26547,2023 TSS 997,,SST,2023,HR c Commission de l’assurance-emploi du Canada,fr,2023-07-28,https://decisions.sst-tss.gc.ca/sst-tss/ei-ae/...,2023-12-04,HR c Commission de l’assurance-emploi du Canad...,
26548,2023 TSS 998,,SST,2023,HR c Commission de l’assurance-emploi du Canada,fr,2023-05-19,https://decisions.sst-tss.gc.ca/sst-tss/ei-ae/...,2023-12-04,HR c Commission de l’assurance-emploi du Canad...,


In [9]:
# OPTION 3: Load data remotely from GitHub without cloning repo
# Note: load time varies depending on internet connection (approx 600 MB of data for all years/languages)
# This is the slowest loading option.

import pandas as pd
import requests

# Set variables
start_year = 2013  # First year of data sought (2013 or later)
end_year = 2023  # Last year of data sought (2023 or earlier)

base_url = 'https://raw.githubusercontent.com/Refugee-Law-Lab/sst_bulk_data/master/DATA/YEARLY/'

# load data
results = []
for year in range(start_year, end_year+1):
    url = base_url + f'{year}.json'
    results.extend(requests.get(url).json())

# convert to dataframe
df = pd.DataFrame(results)

df

Unnamed: 0,citation,citation2,dataset,year,name,language,document_date,source_url,scraped_timestamp,unofficial_text,other
0,2013 SSTAD 1,,SST,2013,K.O. v. Minister of Human Resources and Skills...,en,2013-05-14,https://decisions.sst-tss.gc.ca/sst-tss/cppd-r...,2023-07-26,K.O. v. Minister of Human Resources and Skills...,
1,2013 SSTAD 10,,SST,2013,S. H. v. Minister of Human Resources and Skill...,en,2013-11-27,https://decisions.sst-tss.gc.ca/sst-tss/cppd-r...,2023-07-26,S. H. v. Minister of Human Resources and Skill...,
2,2013 SSTAD 11,,SST,2013,M. H. v. Minister of Human Resources and Skill...,en,2013-11-27,https://decisions.sst-tss.gc.ca/sst-tss/cppd-r...,2023-07-26,M. H. v. Minister of Human Resources and Skill...,
3,2013 SSTAD 12,,SST,2013,L. L. and Minister of Human Resources and Skil...,en,2013-12-31,https://decisions.sst-tss.gc.ca/sst-tss/cpp-rp...,2023-07-27,L. L. and Minister of Human Resources and Skil...,
4,2013 SSTAD 13,,SST,2013,R. S. v. Minister of Human Resources and Skill...,en,2013-12-17,https://decisions.sst-tss.gc.ca/sst-tss/cppd-r...,2023-07-26,R. S. v. Minister of Human Resources and Skill...,
...,...,...,...,...,...,...,...,...,...,...,...
26545,2023 TSS 993,,SST,2023,JP c Commission de l’assurance-emploi du Canada,fr,2023-07-27,https://decisions.sst-tss.gc.ca/sst-tss/ei-ae/...,2023-12-04,JP c Commission de l’assurance-emploi du Canad...,
26546,2023 TSS 994,,SST,2023,JP c Commission de l’assurance-emploi du Canada,fr,2023-01-30,https://decisions.sst-tss.gc.ca/sst-tss/ei-ae/...,2023-12-04,JP c Commission de l’assurance-emploi du Canad...,
26547,2023 TSS 997,,SST,2023,HR c Commission de l’assurance-emploi du Canada,fr,2023-07-28,https://decisions.sst-tss.gc.ca/sst-tss/ei-ae/...,2023-12-04,HR c Commission de l’assurance-emploi du Canad...,
26548,2023 TSS 998,,SST,2023,HR c Commission de l’assurance-emploi du Canada,fr,2023-05-19,https://decisions.sst-tss.gc.ca/sst-tss/ei-ae/...,2023-12-04,HR c Commission de l’assurance-emploi du Canad...,
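Because Option 3 downloads one JSON file per year, a failed or partial download can be hard to spot. The following is a minimal sketch of a more defensive variant, assuming the same yearly file layout as the cell above. The helper names (`year_urls`, `load_years`, `fetch_json_with_requests`), the 60-second timeout, and the 2013 to 2023 bounds check are assumptions for illustration, not part of the original notebook.

```python
BASE_URL = 'https://raw.githubusercontent.com/Refugee-Law-Lab/sst_bulk_data/master/DATA/YEARLY/'
FIRST_YEAR, LAST_YEAR = 2013, 2023  # range stated in the Option 3 comments


def year_urls(start_year, end_year):
    """Build the list of yearly JSON URLs, rejecting out-of-range years."""
    if not (FIRST_YEAR <= start_year <= end_year <= LAST_YEAR):
        raise ValueError(
            f'years must satisfy {FIRST_YEAR} <= start <= end <= {LAST_YEAR}'
        )
    return [f'{BASE_URL}{year}.json' for year in range(start_year, end_year + 1)]


def load_years(start_year, end_year, fetch_json):
    """Collect records for each year using the supplied fetch function."""
    records = []
    for url in year_urls(start_year, end_year):
        records.extend(fetch_json(url))
    return records


def fetch_json_with_requests(url):
    """Download one file, failing loudly on HTTP errors or a stalled connection."""
    import requests  # imported lazily so the helpers above work without it

    response = requests.get(url, timeout=60)  # timeout value is an assumption
    response.raise_for_status()
    return response.json()


# Example usage (performs real downloads, so it is left commented out):
# import pandas as pd
# df = pd.DataFrame(load_years(2013, 2014, fetch_json_with_requests))
```

Passing the fetch function as an argument keeps the URL-building logic testable without network access.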


In [9]:
# OPTION 4: Load data locally via cloned repo

# First, clone git repo
# Then run this code to load data

import pandas as pd
import json
import pathlib

# Set variables
start_year = 2013  # First year of data sought (2013 or later)
end_year = 2023  # Last year of data sought (2023 or earlier)

# Set path to data
data_path = pathlib.Path('DATA/YEARLY/')

# load data
results = []
for year in range(start_year, end_year+1):
    with open(data_path / f'{year}.json', encoding='utf-8') as f:
        results.extend(json.load(f))

# convert to dataframe
df = pd.DataFrame(results)

df


Unnamed: 0,citation,citation2,dataset,year,name,language,document_date,source_url,scraped_timestamp,unofficial_text,other
0,2013 SSTAD 1,,SST,2013,K.O. v. Minister of Human Resources and Skills...,en,2013-05-14,https://decisions.sst-tss.gc.ca/sst-tss/cppd-r...,2023-07-26,K.O. v. Minister of Human Resources and Skills...,
1,2013 SSTAD 10,,SST,2013,S. H. v. Minister of Human Resources and Skill...,en,2013-11-27,https://decisions.sst-tss.gc.ca/sst-tss/cppd-r...,2023-07-26,S. H. v. Minister of Human Resources and Skill...,
2,2013 SSTAD 11,,SST,2013,M. H. v. Minister of Human Resources and Skill...,en,2013-11-27,https://decisions.sst-tss.gc.ca/sst-tss/cppd-r...,2023-07-26,M. H. v. Minister of Human Resources and Skill...,
3,2013 SSTAD 12,,SST,2013,L. L. and Minister of Human Resources and Skil...,en,2013-12-31,https://decisions.sst-tss.gc.ca/sst-tss/cpp-rp...,2023-07-27,L. L. and Minister of Human Resources and Skil...,
4,2013 SSTAD 13,,SST,2013,R. S. v. Minister of Human Resources and Skill...,en,2013-12-17,https://decisions.sst-tss.gc.ca/sst-tss/cppd-r...,2023-07-26,R. S. v. Minister of Human Resources and Skill...,
...,...,...,...,...,...,...,...,...,...,...,...
26545,2023 TSS 993,,SST,2023,JP c Commission de l’assurance-emploi du Canada,fr,2023-07-27,https://decisions.sst-tss.gc.ca/sst-tss/ei-ae/...,2023-12-04,JP c Commission de l’assurance-emploi du Canad...,
26546,2023 TSS 994,,SST,2023,JP c Commission de l’assurance-emploi du Canada,fr,2023-01-30,https://decisions.sst-tss.gc.ca/sst-tss/ei-ae/...,2023-12-04,JP c Commission de l’assurance-emploi du Canad...,
26547,2023 TSS 997,,SST,2023,HR c Commission de l’assurance-emploi du Canada,fr,2023-07-28,https://decisions.sst-tss.gc.ca/sst-tss/ei-ae/...,2023-12-04,HR c Commission de l’assurance-emploi du Canad...,
26548,2023 TSS 998,,SST,2023,HR c Commission de l’assurance-emploi du Canada,fr,2023-05-19,https://decisions.sst-tss.gc.ca/sst-tss/ei-ae/...,2023-12-04,HR c Commission de l’assurance-emploi du Canad...,


### Analyze Data

In [10]:
# View dataframe
df.head()

Unnamed: 0,citation,citation2,dataset,year,name,language,document_date,source_url,scraped_timestamp,unofficial_text,other
0,2013 SSTAD 1,,SST,2013,K.O. v. Minister of Human Resources and Skills...,en,2013-05-14,https://decisions.sst-tss.gc.ca/sst-tss/cppd-r...,2023-07-26,K.O. v. Minister of Human Resources and Skills...,
1,2013 SSTAD 10,,SST,2013,S. H. v. Minister of Human Resources and Skill...,en,2013-11-27,https://decisions.sst-tss.gc.ca/sst-tss/cppd-r...,2023-07-26,S. H. v. Minister of Human Resources and Skill...,
2,2013 SSTAD 11,,SST,2013,M. H. v. Minister of Human Resources and Skill...,en,2013-11-27,https://decisions.sst-tss.gc.ca/sst-tss/cppd-r...,2023-07-26,M. H. v. Minister of Human Resources and Skill...,
3,2013 SSTAD 12,,SST,2013,L. L. and Minister of Human Resources and Skil...,en,2013-12-31,https://decisions.sst-tss.gc.ca/sst-tss/cpp-rp...,2023-07-27,L. L. and Minister of Human Resources and Skil...,
4,2013 SSTAD 13,,SST,2013,R. S. v. Minister of Human Resources and Skill...,en,2013-12-17,https://decisions.sst-tss.gc.ca/sst-tss/cppd-r...,2023-07-26,R. S. v. Minister of Human Resources and Skill...,


In [11]:
df.tail()

Unnamed: 0,citation,citation2,dataset,year,name,language,document_date,source_url,scraped_timestamp,unofficial_text,other
26545,2023 TSS 993,,SST,2023,JP c Commission de l’assurance-emploi du Canada,fr,2023-07-27,https://decisions.sst-tss.gc.ca/sst-tss/ei-ae/...,2023-12-04,JP c Commission de l’assurance-emploi du Canad...,
26546,2023 TSS 994,,SST,2023,JP c Commission de l’assurance-emploi du Canada,fr,2023-01-30,https://decisions.sst-tss.gc.ca/sst-tss/ei-ae/...,2023-12-04,JP c Commission de l’assurance-emploi du Canad...,
26547,2023 TSS 997,,SST,2023,HR c Commission de l’assurance-emploi du Canada,fr,2023-07-28,https://decisions.sst-tss.gc.ca/sst-tss/ei-ae/...,2023-12-04,HR c Commission de l’assurance-emploi du Canad...,
26548,2023 TSS 998,,SST,2023,HR c Commission de l’assurance-emploi du Canada,fr,2023-05-19,https://decisions.sst-tss.gc.ca/sst-tss/ei-ae/...,2023-12-04,HR c Commission de l’assurance-emploi du Canad...,
26549,2023 TSS 999,,SST,2023,SG c Commission de l’assurance-emploi du Canada,fr,2023-07-28,https://decisions.sst-tss.gc.ca/sst-tss/ei-ae/...,2023-12-04,SG c Commission de l’assurance-emploi du Canad...,
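Since the data was gathered programmatically and may contain errors (see the Accuracy note), a few quick checks on the loaded dataframe can flag obvious problems before analysis. The following is a rough sketch, assuming the column names shown in the output above. The function name `basic_checks` and the toy rows are hypothetical and are used here only so the example runs without downloading the dataset.

```python
import pandas as pd


def basic_checks(df):
    """Return simple counts that may point to scraping or processing problems."""
    text = df['unofficial_text'].fillna('').astype(str)
    return {
        'rows': len(df),
        'duplicate_citations': int(df['citation'].duplicated().sum()),
        'empty_text': int((text.str.strip() == '').sum()),
        'missing_dates': int(df['document_date'].isna().sum()),
    }


# Toy rows standing in for the real dataframe (all values are made up).
toy = pd.DataFrame({
    'citation': ['A 1', 'A 1', 'A 2'],
    'document_date': ['2013-05-14', None, '2013-11-27'],
    'unofficial_text': ['Some text', '   ', None],
})

print(basic_checks(toy))
```

On the real data, the same function could be called as `basic_checks(df)`; nonzero counts are prompts for closer inspection rather than proof of an error.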


In [11]:
# language counts
df['language'].value_counts()

language
en    13284
fr    13266
Name: count, dtype: int64

In [12]:
# Yearly counts, in chronological order
year_counts = df['year'].value_counts().sort_index()
for year, count in year_counts.items():
    print(f'{year}: {count}')


2013: 54
2014: 1221
2015: 3710
2016: 2753
2017: 3171
2018: 2605
2019: 3176
2020: 2461
2021: 2003
2022: 3454
2023: 1942
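The language and yearly counts above can also be combined into a single table. The following is a small sketch using `pd.crosstab`, assuming the `year` and `language` columns shown earlier. The function name `counts_by_year_and_language` and the toy rows are illustrative only and do not reflect the real figures.

```python
import pandas as pd


def counts_by_year_and_language(df):
    """Cross-tabulate decisions: one row per year, one column per language."""
    return pd.crosstab(df['year'], df['language'])


# Toy rows standing in for the real dataframe (all values are made up).
toy = pd.DataFrame({
    'year': [2013, 2013, 2014, 2014, 2014],
    'language': ['en', 'fr', 'en', 'en', 'fr'],
})

by_year_language = counts_by_year_and_language(toy)
print(by_year_language)
```

On the loaded data, `counts_by_year_and_language(df)` would show whether the English and French counts stay balanced within each year.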


In [13]:
# select 5 random rows from df, iterate through them and print unofficial text
import random
random.seed(999)

random_rows = random.sample(range(0, len(df)), 5)

# iterate through random rows and print unofficial text
for row in random_rows:
    print('##################################')
    print(df.iloc[row]['citation'])
    print(df.iloc[row]['source_url'])
    print(df.iloc[row]['document_date'])
    print(df.iloc[row]['year'])
    print(df.iloc[row]['language'])
    print('##################################')
    print()
    print(df.iloc[row]['unofficial_text'])
    print()
    print()
    print('____________________________________________________________________________')
    print()
    print()
    print()


##################################
2023 TSS 102
https://decisions.sst-tss.gc.ca/sst-tss/ei-ae/fr/item/523828/index.do
2023-07-21
2023
fr
##################################

SM c Commission de l’assurance emploi du Canada
Collection
Assurance-emploi (AE)
Date de décision
2023-07-21
Référence neutre
2023 TSS 102
Numéro de référence
GE-23-193
Membre
Teresa Day
Division
Division générale
Décision
Appel rejeté
Décisions connexes
TSS - SM c Commission de l’assurance emploi du Canada - 2023 TSS 92 - 2023-09-20 - Division d’appel
Contenu de la décision
[TRADUCTION]
Citation : SM c Commission de l’assurance-emploi du Canada, 2023 TSS 102
Tribunal de la sécurité sociale du Canada Division générale, section de l’assurance-emploi
Décision
Appelante : S. M.
Représentant : Christopher Hall
Intimée : Commission de l’assurance-emploi du Canada
Décision portée en appel : Décision découlant de la révision de la Commission de l’assurance emploi du Canada (555064) datée du 9 décembre 2022 (communiquée par