# QCEW Fetch Data Kit

## Introduction

This Jupyter notebook automates the processing of Quarterly Census of Employment and Wages (QCEW) data provided by the [U.S. Bureau of Labor Statistics](https://www.bls.gov/cew/). The datasets processed by this notebook contain detailed employment statistics, including the number of establishments, employment levels, total quarterly wages, and more, broken down by industry and ownership sector for each county, as defined by the [technical documentation](https://www.bls.gov/cew/additional-resources/open-data/csv-data-slices.htm). The notebook separates annual and quarterly records, concatenates each group into its own wide-form '.csv' file, and also saves each table in long form. See the Parameters section below to select the variables to be included in the extract.

## Process Outline

The process carried out by this workflow can be described as follows:
  - The script uses data gathered by the user from the [QCEW website](https://www.bls.gov/cew/about-data/data-availability.htm) or fetched automatically with [morpc-qcew-fetch](https://github.com/morpc/morpc-qcew-fetch). The input can contain a mix of quarterly and annual QCEW files.
  - The script detects the quarterly datasets and concatenates them into one wide-form '.csv'; the annual datasets are concatenated into another.
  - Wide-form data is processed to create new long-form '.csv's for output.
  - For each processed long-form and wide-form '.csv', a .resource.yaml file is created, following the [Frictionless Data Resource specification](https://framework.frictionlessdata.io/docs/framework/resource.html). This file includes metadata about the CSV file, such as its name, path, format, and the schema it conforms to, as well as a hash code for integrity checking. Additionally, it contains descriptive information about the dataset and references to its source.
  - The YAML files for schemas and resource descriptors are used to make data more usable by simplifying its publication and consumption. By adhering to Frictionless standards, the script ensures that the datasets it produces are easily shareable, validatable, and integrable into a wide range of data tools and platforms.
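
The size and hash fields mentioned above can be illustrated with a minimal sketch that uses only the standard library. The notebook itself calls `morpc.md5`, whose implementation is not shown here, so `file_md5` and `describe_csv` are hypothetical stand-ins, and treating the hash as a plain hexadecimal MD5 digest is an assumption.

```python
import hashlib
import os
import tempfile


def file_md5(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, read in chunks to limit memory use."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def describe_csv(path, name, schema_filename):
    """Build a descriptor-like dict (hypothetical helper, not a frictionless call)."""
    return {
        "name": name,
        "path": os.path.basename(path),
        "format": "csv",
        "mediatype": "text/csv",
        "encoding": "utf-8",
        "bytes": os.path.getsize(path),
        "hash": file_md5(path),
        "schema": schema_filename,
    }


# Demonstrate on a throwaway file so the sketch runs anywhere
sample_bytes = b"area_fips,year\n39041,2020\n"
with tempfile.TemporaryDirectory() as tmp:
    sample_path = os.path.join(tmp, "sample.csv")
    with open(sample_path, "wb") as f:
        f.write(sample_bytes)
    descriptor = describe_csv(sample_path, "sample", "sample.schema.yaml")

print(descriptor["bytes"], descriptor["hash"])
```

A consumer can recompute the size and digest of the downloaded file and compare them with the recorded values to detect truncated or altered data.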

## Setup

### Import required packages

In [1]:
import os
import pandas as pd
import frictionless
import requests
import sys
sys.path.append(os.path.normpath("../../morpc-common"))
import morpc

### Parameters

#### Static parameters

In [2]:
# List of identifying columns for long-form tables
id_vars = [
    'area_fips', 'own_code', 'industry_code', 'agglvl_code', 'size_code', 'year', 'qtr', 'disclosure_code', 'lq_disclosure_code', 'oty_disclosure_code'
]

# Location where output files will be saved
OUTPUT_DIR = os.path.normpath("./output_data")

# Location where input files must be placed
INPUT_DIR = os.path.normpath("./input_data")

# File name for long-form quarterly table
QCEW_QUARTERLY_LONG_OUTPUT_NAME  = "qcew_quarterly_long.csv" 
# File name for wide-form quarterly table
QCEW_QUARTERLY_WIDE_OUTPUT_NAME  = "qcew_quarterly_wide.csv" 

# File name for long-form annual table
QCEW_ANNUAL_LONG_OUTPUT_NAME  = "qcew_annual_long.csv" 
# File name for wide-form annual table
QCEW_ANNUAL_WIDE_OUTPUT_NAME  = "qcew_annual_wide.csv" 

# Quarterly data paths
QCEW_QUARTERLY_LONG_OUTPUT_PATH = os.path.join(OUTPUT_DIR, QCEW_QUARTERLY_LONG_OUTPUT_NAME)
QCEW_QUARTERLY_WIDE_OUTPUT_PATH = os.path.join(OUTPUT_DIR, QCEW_QUARTERLY_WIDE_OUTPUT_NAME)

QCEW_QUARTERLY_LONG_OUTPUT_RESOURCE = "qcew_quarterly_long.resource.yaml" 
QCEW_QUARTERLY_LONG_OUTPUT_RESOURCE_PATH = os.path.join(OUTPUT_DIR, QCEW_QUARTERLY_LONG_OUTPUT_RESOURCE)
QCEW_QUARTERLY_WIDE_OUTPUT_RESOURCE = "qcew_quarterly_wide.resource.yaml"
QCEW_QUARTERLY_WIDE_OUTPUT_RESOURCE_PATH = os.path.join(OUTPUT_DIR, QCEW_QUARTERLY_WIDE_OUTPUT_RESOURCE)

# Annual paths
QCEW_ANNUAL_LONG_OUTPUT_PATH = os.path.join(OUTPUT_DIR, QCEW_ANNUAL_LONG_OUTPUT_NAME)
QCEW_ANNUAL_WIDE_OUTPUT_PATH = os.path.join(OUTPUT_DIR, QCEW_ANNUAL_WIDE_OUTPUT_NAME)

QCEW_ANNUAL_LONG_OUTPUT_RESOURCE = "qcew_annual_long.resource.yaml" 
QCEW_ANNUAL_LONG_OUTPUT_RESOURCE_PATH = os.path.join(OUTPUT_DIR, QCEW_ANNUAL_LONG_OUTPUT_RESOURCE)
QCEW_ANNUAL_WIDE_OUTPUT_RESOURCE = "qcew_annual_wide.resource.yaml"
QCEW_ANNUAL_WIDE_OUTPUT_RESOURCE_PATH = os.path.join(OUTPUT_DIR, QCEW_ANNUAL_WIDE_OUTPUT_RESOURCE)

# Define quarterly and annual schema directories from local copies
QUARTERLY_TABLE_SCHEMA_FILENAME = "morpc-qcew-quarterly.schema.yaml"
QUARTERLY_TABLE_SCHEMA_PATH = os.path.join(OUTPUT_DIR, QUARTERLY_TABLE_SCHEMA_FILENAME)
ANNUAL_TABLE_SCHEMA_FILENAME = "morpc-qcew-annual.schema.yaml"
ANNUAL_TABLE_SCHEMA_PATH = os.path.join(OUTPUT_DIR, ANNUAL_TABLE_SCHEMA_FILENAME)
LONG_TABLE_SCHEMA_FILENAME = "morpc-qcew-long.schema.yaml"
LONG_TABLE_SCHEMA_PATH = os.path.join(OUTPUT_DIR, LONG_TABLE_SCHEMA_FILENAME)

# Documentation URL for the QCEW data - static because it points to the general documentation page
QCEW_TABLE_DOC_URL="https://www.bls.gov/cew/additional-resources/open-data/csv-data-slices.htm"

Create a function to check whether a value is numeric.

In [3]:
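def is_numeric(val):
    """Check if the value can be converted to a float."""
    try:
        float(val)
        return True
    except (TypeError, ValueError):
        # TypeError covers values such as None; ValueError covers strings like "A"
        return False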
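
As a quick illustration of how this check distinguishes quarterly from annual records, the sketch below applies it to a few sample 'qtr' values. The function is repeated so the example runs on its own, and this copy also catches `TypeError` so that `None` is treated as non-numeric; the sample values are illustrative only.

```python
def is_numeric(val):
    """Return True if val can be converted to a float."""
    try:
        float(val)
        return True
    except (TypeError, ValueError):
        return False


# Quarterly files carry 1-4 in 'qtr'; annual files carry the letter "A"
sample_values = [1, "3", "A", None]
flags = [is_numeric(v) for v in sample_values]
print(flags)
```

Note that a missing value that pandas reads as NaN is itself a float, so it would count as numeric under this check.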

### Define inputs

QCEW data to process must come from the [QCEW website](https://www.bls.gov/cew/about-data/data-availability.htm) or be fetched automatically with [morpc-qcew-fetch](https://github.com/morpc/morpc-qcew-fetch), and must be placed in the 'input_data' directory. With 'morpc-qcew-fetch', users can automatically obtain QCEW records for multiple regions and years based on user-specified parameters.
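
For readers gathering files by hand, the sketch below builds the download address for one county and period. The URL pattern is an assumption based on the BLS open data "CSV data slices" documentation and should be verified against that page before use; `qcew_area_slice_url` is a hypothetical helper and is not part of morpc-qcew-fetch. No network request is made.

```python
def qcew_area_slice_url(year, qtr, area_fips):
    """Build the URL for one QCEW area slice (hypothetical helper).

    qtr is 1-4 for a quarterly file or "a" for the annual averages file.
    The URL pattern is assumed from the BLS open data documentation.
    """
    qtr_token = str(qtr).lower()
    if qtr_token not in {"1", "2", "3", "4", "a"}:
        raise ValueError(f"qtr must be 1-4 or 'a', got {qtr!r}")
    # Area codes are five characters; pad numeric FIPS codes with leading zeros
    area = str(area_fips).zfill(5)
    return f"https://data.bls.gov/cew/data/api/{year}/{qtr_token}/area/{area}.csv"


quarterly_url = qcew_area_slice_url(2020, 3, 39073)
annual_url = qcew_area_slice_url(2020, "A", 39041)
print(quarterly_url)
print(annual_url)
```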

In [4]:
print("Annual schema file stored in: {}".format(ANNUAL_TABLE_SCHEMA_PATH))
print("Quarterly schema file stored in: {}".format(QUARTERLY_TABLE_SCHEMA_PATH))
print("Long schema file stored in: {}".format(LONG_TABLE_SCHEMA_PATH))
print("QCEW files to be compiled must be stored in: {}".format(INPUT_DIR))

Annual schema file stored in: output_data\morpc-qcew-annual.schema.yaml
Quarterly schema file stored in: output_data\morpc-qcew-quarterly.schema.yaml
Long schema file stored in: output_data\morpc-qcew-long.schema.yaml
QCEW files to be compiled must be stored in: input_data


### Define outputs

In [5]:
print("Long quarterly QCEW data will be saved to: {}".format(QCEW_QUARTERLY_LONG_OUTPUT_PATH))
print("Long quarterly QCEW data resource files will be saved to: {}".format(QCEW_QUARTERLY_LONG_OUTPUT_RESOURCE_PATH))
print("Wide quarterly QCEW data will be saved to: {}".format(QCEW_QUARTERLY_WIDE_OUTPUT_PATH))
print("Wide quarterly QCEW data resource files will be saved to: {}".format(QCEW_QUARTERLY_WIDE_OUTPUT_RESOURCE_PATH))
print("")
print("Long annual QCEW data will be saved to: {}".format(QCEW_ANNUAL_LONG_OUTPUT_PATH))
print("Long annual QCEW data resource files will be saved to: {}".format(QCEW_ANNUAL_LONG_OUTPUT_RESOURCE_PATH))
print("Wide annual QCEW data will be saved to: {}".format(QCEW_ANNUAL_WIDE_OUTPUT_PATH))
print("Wide annual QCEW data resource files will be saved to: {}".format(QCEW_ANNUAL_WIDE_OUTPUT_RESOURCE_PATH))

Long quarterly QCEW data will be saved to: output_data\qcew_quarterly_long.csv
Long quarterly QCEW data resource files will be saved to: output_data\qcew_quarterly_long.resource.yaml
Wide quarterly QCEW data will be saved to: output_data\qcew_quarterly_wide.csv
Wide quarterly QCEW data resource files will be saved to: output_data\qcew_quarterly_wide.resource.yaml

Long annual QCEW data will be saved to: output_data\qcew_annual_long.csv
Long annual QCEW data resource files will be saved to: output_data\qcew_annual_long.resource.yaml
Wide annual QCEW data will be saved to: output_data\qcew_annual_wide.csv
Wide annual QCEW data resource files will be saved to: output_data\qcew_annual_wide.resource.yaml


## Main code

### Separating quarterly and annual data and concatenating into wide-form data tables

This script reads every '.csv' file in "input_data" and sorts the files into quarterly and annual data based on the 'qtr' column. Each group is concatenated into a separate wide-form '.csv', one for annual data and one for quarterly data. The resulting tables are validated against their respective schemas when the resource files are created later in the notebook.

In [6]:
numeric_dfs = []  # List to store data frames with numeric 'qtr'
non_numeric_dfs = []  # List to store other data frames

# Iterate over all files in the given directory
for filename in os.listdir(os.path.normpath(INPUT_DIR)):
    if filename.endswith('.csv'):
        file_path = os.path.join(os.path.normpath(INPUT_DIR), filename)
        try:
            df = pd.read_csv(file_path)

            # Check if any entry in 'qtr' is numeric
            if df['qtr'].apply(is_numeric).any():
                numeric_dfs.append(df)
            else:
                non_numeric_dfs.append(df)
        except Exception as e:
            print(f"Error processing file {filename}: {e}")

# Concatenate all data frames for numeric 'qtr' into one CSV
if numeric_dfs:
    pd.concat(numeric_dfs).to_csv(QCEW_QUARTERLY_WIDE_OUTPUT_PATH, index=False)
    print(f"All quarterly QCEW CSV files have been concatenated into {QCEW_QUARTERLY_WIDE_OUTPUT_PATH}")
else:
    print("No quarterly QCEW CSV files to process.")
    
# Concatenate all other data frames into another CSV
if non_numeric_dfs:
    pd.concat(non_numeric_dfs).to_csv(QCEW_ANNUAL_WIDE_OUTPUT_PATH, index=False)
    print(f"All annual QCEW CSV files have been concatenated into {QCEW_ANNUAL_WIDE_OUTPUT_PATH}")
else:
    print("No annual QCEW CSV files to process.")

All quarterly QCEW CSV files have been concatenated into output_data\qcew_quarterly_wide.csv
All annual QCEW CSV files have been concatenated into output_data\qcew_annual_wide.csv


In [7]:
# pd.concat raises ValueError on an empty list, so test the lists themselves
if numeric_dfs:
    display(pd.concat(numeric_dfs).head())

if non_numeric_dfs:
    display(pd.concat(non_numeric_dfs).head())

Unnamed: 0,area_fips,own_code,industry_code,agglvl_code,size_code,year,qtr,disclosure_code,qtrly_estabs,month1_emplvl,...,oty_month3_emplvl_chg,oty_month3_emplvl_pct_chg,oty_total_qtrly_wages_chg,oty_total_qtrly_wages_pct_chg,oty_taxable_qtrly_wages_chg,oty_taxable_qtrly_wages_pct_chg,oty_qtrly_contributions_chg,oty_qtrly_contributions_pct_chg,oty_avg_wkly_wage_chg,oty_avg_wkly_wage_pct_chg
0,39073,0,10,70,0,2020,3,,566,6717,...,76,1.2,3045842,5.6,1292087,18.7,33691,22.7,32,5.1
1,39073,1,10,71,0,2020,3,,8,48,...,9,18.0,52004,10.1,0,0.0,0,0.0,-59,-7.2
2,39073,1,102,72,0,2020,3,,8,48,...,9,18.0,52004,10.1,0,0.0,0,0.0,-59,-7.2
3,39073,1,1021,73,0,2020,3,,7,45,...,-1,-2.2,-3402,-0.7,0,0.0,0,0.0,0,0.0
4,39073,1,1028,73,0,2020,3,,1,3,...,10,250.0,55406,344.5,0,0.0,0,0.0,34,8.3


Unnamed: 0,area_fips,own_code,industry_code,agglvl_code,size_code,year,qtr,disclosure_code,annual_avg_estabs,annual_avg_emplvl,...,oty_total_annual_wages_chg,oty_total_annual_wages_pct_chg,oty_taxable_annual_wages_chg,oty_taxable_annual_wages_pct_chg,oty_annual_contributions_chg,oty_annual_contributions_pct_chg,oty_annual_avg_wkly_wage_chg,oty_annual_avg_wkly_wage_pct_chg,oty_avg_annual_pay_chg,oty_avg_annual_pay_pct_chg
0,39041,0,10,70,0,2020,A,,5891,84743,...,75101458,1.4,-80289035,-8.8,-2060679,-11.7,92,8.3,4810,8.4
1,39041,1,10,71,0,2020,A,,16,264,...,570740,3.8,0,0.0,0,0.0,-8,-0.7,-444,-0.8
2,39041,1,102,72,0,2020,A,,16,264,...,570740,3.8,0,0.0,0,0.0,-8,-0.7,-444,-0.8
3,39041,1,1021,73,0,2020,A,,9,194,...,85916,0.8,0,0.0,0,0.0,35,3.2,1831,3.2
4,39041,1,1024,73,0,2020,A,,1,13,...,55378,8.0,0,0.0,0,0.0,56,5.1,2908,5.1


### Creating long-form annual table, if wide-form annual data exists

Melt the original wide-form annual data into a long-form table where each row corresponds to a single variable for a given county, year, establishment ownership, establishment size, industry, aggregation code, and set of disclosure codes.
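
To make the reshaping concrete, here is a minimal sketch of the same melt on a toy table. The column names mirror QCEW annual fields, but the identifier list is shortened (`toy_id_vars` is a stand-in for the notebook's `id_vars`) and the values are invented for illustration.

```python
import pandas as pd

# Toy wide-form table: one row per county, one column per measure
wide = pd.DataFrame({
    "area_fips": [39041, 39049],
    "year": [2020, 2020],
    "qtr": ["A", "A"],
    "annual_avg_estabs": [10, 20],      # invented values
    "annual_avg_emplvl": [100, 200],    # invented values
})
toy_id_vars = ["area_fips", "year", "qtr"]

# Every column that is not an identifier becomes a (variable, value) pair
value_vars = list(wide.columns.difference(toy_id_vars))
long = pd.melt(wide, id_vars=toy_id_vars, value_vars=value_vars,
               var_name="variable", value_name="value")

print(long)
```

The long table has one row per combination of input row and value column, so two counties with two measures yield four rows.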

In [8]:
# Load the annual wide CSV
df_annual = pd.read_csv(QCEW_ANNUAL_WIDE_OUTPUT_PATH)

# Stays None when there is nothing to melt, so the next cell can test it safely
df_annual_long = None

# Verify if there's data in the annual DataFrame
if not df_annual.empty:
    # Every column that is not an identifier is treated as a value column
    value_vars_annual = df_annual.columns.difference(id_vars)
    df_annual_long = pd.melt(df_annual, id_vars=id_vars, value_vars=value_vars_annual, 
                             var_name='variable', value_name='value')
    df_annual_long.to_csv(QCEW_ANNUAL_LONG_OUTPUT_PATH, index=False)
    print(f"All annual QCEW data have been saved in long form to: {QCEW_ANNUAL_LONG_OUTPUT_PATH}")
else: 
    print("No annual data to melt")

In [9]:
if df_annual_long is not None:
    display((df_annual_long).head())

Unnamed: 0,area_fips,own_code,industry_code,agglvl_code,size_code,year,qtr,disclosure_code,lq_disclosure_code,oty_disclosure_code,variable,value
0,39041,0,10,70,0,2020,A,,,,annual_avg_emplvl,84743.0
1,39041,1,10,71,0,2020,A,,,,annual_avg_emplvl,264.0
2,39041,1,102,72,0,2020,A,,,,annual_avg_emplvl,264.0
3,39041,1,1021,73,0,2020,A,,,,annual_avg_emplvl,194.0
4,39041,1,1024,73,0,2020,A,,,,annual_avg_emplvl,13.0


### Creating long-form quarterly table, if wide-form quarterly data exists

Melt the original wide-form quarterly data into a long-form table where each row corresponds to a single variable for a given county, year, quarter, establishment ownership, establishment size, industry, aggregation code, and set of disclosure codes.

In [10]:
# Load the quarterly wide CSV
df_quarterly = pd.read_csv(QCEW_QUARTERLY_WIDE_OUTPUT_PATH)

# Stays None when there is nothing to melt, so the next cell can test it safely
df_quarterly_long = None

# Verify if there's data in the quarterly DataFrame
if not df_quarterly.empty:
    # Every column that is not an identifier is treated as a value column
    value_vars_quarterly = df_quarterly.columns.difference(id_vars)
    df_quarterly_long = pd.melt(df_quarterly, id_vars=id_vars, value_vars=value_vars_quarterly, 
                                var_name='variable', value_name='value')
    df_quarterly_long.to_csv(QCEW_QUARTERLY_LONG_OUTPUT_PATH, index=False)
    print(f"All quarterly QCEW data have been saved in long form to: {QCEW_QUARTERLY_LONG_OUTPUT_PATH}")
else: 
    print("No quarterly data to melt")

In [11]:
# Preview the quarterly long-form table (not the annual one)
if df_quarterly_long is not None:
    display(df_quarterly_long.head())


## Create and validate resource file for annual wide-form table, if it exists

In [12]:
df_wide_annual = pd.read_csv(QCEW_ANNUAL_WIDE_OUTPUT_PATH)

# Finding the maximum and minimum values in the 'year' column
max_year = df_wide_annual['year'].max()
min_year = df_wide_annual['year'].min()

# Build the title from the range of years covered by the data
title = f"Compiled QCEW County Data, {min_year}-{max_year} (wide form)"
description = "Employment and wage data for counties in the MORPC region, derived from the U.S. Bureau of Labor Statistics."

# Resource creation for WIDE ANNUAL
if not df_wide_annual.empty:
    acsResource = {
        "name": "qcew_annual_wide",
        "title": title,
        "description": description,
        "path": QCEW_ANNUAL_WIDE_OUTPUT_NAME,
        "format": "csv",
        "mediatype": "text/csv",
        "encoding": "utf-8",
        "bytes": os.path.getsize(QCEW_ANNUAL_WIDE_OUTPUT_PATH),
        "hash": morpc.md5(QCEW_ANNUAL_WIDE_OUTPUT_PATH),
        "schema": ANNUAL_TABLE_SCHEMA_FILENAME,
        "profile":'tabular-data-resource'
    }
    
    # Create the resource object
    resource = frictionless.Resource(acsResource)

    print("Writing resource file to {}".format(QCEW_ANNUAL_WIDE_OUTPUT_RESOURCE_PATH))
    cwd = os.getcwd()
    os.chdir(os.path.dirname(QCEW_ANNUAL_WIDE_OUTPUT_RESOURCE_PATH))
    dummy = resource.to_yaml(os.path.basename(QCEW_ANNUAL_WIDE_OUTPUT_RESOURCE_PATH))
    os.chdir(cwd)
    
    print("Validating resource on disk (including data and schema). This may take some time.")
    resourceOnDisk = frictionless.Resource(QCEW_ANNUAL_WIDE_OUTPUT_RESOURCE_PATH)
    results = resourceOnDisk.validate()
    if(results.valid):
        print("Resource is valid\n")
    else:
        print("ERROR: Resource is NOT valid. Errors follow.\n")
        print(results)
        raise RuntimeError

Writing resource file to output_data\qcew_annual_wide.resource.yaml
Validating resource on disk (including data and schema). This may take some time.
Resource is valid
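
The cell above switches into the output directory before writing the resource file and switches back afterwards. If the write step raised an exception, the notebook would be left in the output directory. A minimal sketch of a safer pattern follows; `working_directory` is a hypothetical helper, not part of the notebook or of frictionless, and the file written here is a placeholder.

```python
import contextlib
import os
import tempfile


@contextlib.contextmanager
def working_directory(path):
    """Temporarily switch the current working directory (hypothetical helper)."""
    previous = os.getcwd()
    os.chdir(path)
    try:
        yield
    finally:
        # Always restore the original directory, even if the body raises
        os.chdir(previous)


start_dir = os.getcwd()
with tempfile.TemporaryDirectory() as tmp:
    with working_directory(tmp):
        # Inside the block, relative paths resolve against tmp
        with open("example.resource.yaml", "w") as f:
            f.write("name: example\n")
    written = os.path.exists(os.path.join(tmp, "example.resource.yaml"))

print(written, os.getcwd() == start_dir)
```

On Python 3.11 and later, `contextlib.chdir` in the standard library provides the same behavior.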



## Create and validate resource file for quarterly wide-form table, if it exists

In [13]:
df_wide_quarterly = pd.read_csv(QCEW_QUARTERLY_WIDE_OUTPUT_PATH)

# Finding the maximum and minimum values in the 'year' column
max_year = df_wide_quarterly['year'].max()
min_year = df_wide_quarterly['year'].min()

# Build the title from the range of years covered by the data
title = f"Compiled QCEW County Data, {min_year}-{max_year} (wide form)"
description = "Employment and wage data for counties in the MORPC region, derived from the U.S. Bureau of Labor Statistics."

# Resource creation for WIDE QUARTERLY
if not df_wide_quarterly.empty:
    acsResource = {
        "name": "qcew_quarterly_wide",
        "title": title,
        "description": description,
        "path": QCEW_QUARTERLY_WIDE_OUTPUT_NAME,
        "format": "csv",
        "mediatype": "text/csv",
        "encoding": "utf-8",
        "bytes": os.path.getsize(QCEW_QUARTERLY_WIDE_OUTPUT_PATH),
        "hash": morpc.md5(QCEW_QUARTERLY_WIDE_OUTPUT_PATH),
        "schema": QUARTERLY_TABLE_SCHEMA_FILENAME,
        "profile":'tabular-data-resource'
    }
    
    # Create the resource object
    resource = frictionless.Resource(acsResource)

    print("Writing resource file to {}".format(QCEW_QUARTERLY_WIDE_OUTPUT_RESOURCE_PATH))
    cwd = os.getcwd()
    os.chdir(os.path.dirname(QCEW_QUARTERLY_WIDE_OUTPUT_RESOURCE_PATH))
    dummy = resource.to_yaml(os.path.basename(QCEW_QUARTERLY_WIDE_OUTPUT_RESOURCE_PATH))
    os.chdir(cwd)
    
    print("Validating resource on disk (including data and schema). This may take some time.")
    resourceOnDisk = frictionless.Resource(QCEW_QUARTERLY_WIDE_OUTPUT_RESOURCE_PATH)
    results = resourceOnDisk.validate()
    if(results.valid):
        print("Resource is valid\n")
    else:
        print("ERROR: Resource is NOT valid. Errors follow.\n")
        print(results)
        raise RuntimeError

Writing resource file to output_data\qcew_quarterly_wide.resource.yaml
Validating resource on disk (including data and schema). This may take some time.
Resource is valid



## Create and validate resource file for annual long-form table, if it exists

In [14]:
df_long_annual = pd.read_csv(QCEW_ANNUAL_LONG_OUTPUT_PATH)

# Finding the maximum and minimum values in the 'year' column
max_year = df_long_annual['year'].max()
min_year = df_long_annual['year'].min()

# Build the title from the range of years covered by the data
title = f"Compiled QCEW County Data, {min_year}-{max_year} (long form)"
description = "Employment and wage data for counties in the MORPC region, derived from the U.S. Bureau of Labor Statistics."

# Resource creation for LONG ANNUAL
if not df_long_annual.empty:
    acsResource = {
        "name": "qcew_annual_long",
        "title": title,
        "description": description,
        "path": QCEW_ANNUAL_LONG_OUTPUT_NAME,
        "format": "csv",
        "mediatype": "text/csv",
        "encoding": "utf-8",
        "bytes": os.path.getsize(QCEW_ANNUAL_LONG_OUTPUT_PATH),
        "hash": morpc.md5(QCEW_ANNUAL_LONG_OUTPUT_PATH),
        "schema": LONG_TABLE_SCHEMA_FILENAME,
        "profile":'tabular-data-resource'
    }
    
    # Create the resource object
    resource = frictionless.Resource(acsResource)

    print("Writing resource file to {}".format(QCEW_ANNUAL_LONG_OUTPUT_RESOURCE_PATH))
    cwd = os.getcwd()
    os.chdir(os.path.dirname(QCEW_ANNUAL_LONG_OUTPUT_RESOURCE_PATH))
    dummy = resource.to_yaml(os.path.basename(QCEW_ANNUAL_LONG_OUTPUT_RESOURCE_PATH))
    os.chdir(cwd)
    
    print("Validating resource on disk (including data and schema). This may take some time.")
    resourceOnDisk = frictionless.Resource(QCEW_ANNUAL_LONG_OUTPUT_RESOURCE_PATH)
    results = resourceOnDisk.validate()
    if(results.valid):
        print("Resource is valid\n")
    else:
        print("ERROR: Resource is NOT valid. Errors follow.\n")
        print(results)
        raise RuntimeError

Writing resource file to output_data\qcew_annual_long.resource.yaml
Validating resource on disk (including data and schema). This may take some time.
Resource is valid



## Create and validate resource file for quarterly long-form table, if it exists

In [15]:
df_long_quarterly = pd.read_csv(QCEW_QUARTERLY_LONG_OUTPUT_PATH)

# Finding the maximum and minimum values in the 'year' column
max_year = df_long_quarterly['year'].max()
min_year = df_long_quarterly['year'].min()

# Build the title from the range of years covered by the data
title = f"Compiled QCEW County Data, {min_year}-{max_year} (long form)"
description = "Employment and wage data for counties in the MORPC region, derived from the U.S. Bureau of Labor Statistics."

# Resource creation for LONG QUARTERLY
if not df_long_quarterly.empty:
    
    acsResource = {
      "name": "qcew_quarterly_long",
      "title": title,
      "description": description,
      "path": QCEW_QUARTERLY_LONG_OUTPUT_NAME,
      "format": "csv",
      "mediatype": "text/csv",
      "encoding": "utf-8",
      "bytes": os.path.getsize(QCEW_QUARTERLY_LONG_OUTPUT_PATH),
      "hash": morpc.md5(QCEW_QUARTERLY_LONG_OUTPUT_PATH),
      "schema": LONG_TABLE_SCHEMA_FILENAME,
      "profile":'tabular-data-resource'
    }
    
    # Create the resource object
    resource = frictionless.Resource(acsResource)

    print("Writing resource file to {}".format(QCEW_QUARTERLY_LONG_OUTPUT_RESOURCE_PATH))
    cwd = os.getcwd()
    os.chdir(os.path.dirname(QCEW_QUARTERLY_LONG_OUTPUT_RESOURCE_PATH))
    dummy = resource.to_yaml(os.path.basename(QCEW_QUARTERLY_LONG_OUTPUT_RESOURCE_PATH))
    os.chdir(cwd)
    
    print("Validating resource on disk (including data and schema). This may take some time.")
    resourceOnDisk = frictionless.Resource(QCEW_QUARTERLY_LONG_OUTPUT_RESOURCE_PATH)
    results = resourceOnDisk.validate()
    if(results.valid):
        print("Resource is valid\n")
    else:
        print("ERROR: Resource is NOT valid. Errors follow.\n")
        print(results)
        raise RuntimeError

Writing resource file to output_data\qcew_quarterly_long.resource.yaml
Validating resource on disk (including data and schema). This may take some time.
Resource is valid

