The BlackCat API came with no documentation, so this notebook inspects what it returns and in what format. It contains:  
1. The code for inspecting the data.
2. Steps for checking whether the data has changed in 2024 (and later years).
  
Everything here is inferred from the API responses. Contact BlackCat for authoritative instructions.

In [1]:
import requests
import json
import pandas as pd
import numpy as np
import pendulum
import re

**NOTE: the URL ends with the year. Change it to whichever year of data you want to get.**

In [5]:
api_url = "https://services.blackcattransit.com/api/APIModules/GetNTDReportsByYear/BCG_CA/2023"
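
Since only the last path segment changes from year to year, the URL can be built from the year instead of edited by hand. This is a minimal sketch: `BASE_URL`, `AGENCY`, and `build_url` are names made up here, and it assumes other years follow the same URL pattern as the 2023 URL above.

```python
# Hypothetical helper: builds the endpoint URL for a given report year.
BASE_URL = "https://services.blackcattransit.com/api/APIModules/GetNTDReportsByYear"
AGENCY = "BCG_CA"  # copied from the URL above


def build_url(year: int) -> str:
    """Return the GetNTDReportsByYear URL for one report year."""
    return f"{BASE_URL}/{AGENCY}/{year}"


api_url = build_url(2023)
print(api_url)
```

The resulting string can be passed to `requests.get` exactly like the hard-coded `api_url`.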

In [6]:
response = requests.get(api_url)

In [7]:
response

<Response [200]>

In [8]:
blob = response.json()

In [9]:
blob

[{'ReportId': 834,
  'Organization': 'Fresno County Rural Transit Agency',
  'ReportPeriod': '2023',
  'ReportStatus': 'Submitted',
  'ReportLastModifiedDate': '9/18/2023 11:55:34 AM',
  'NTDReportingStationsAndMaintenance': {'Data': [{'Id': 163,
     'ReportId': 834,
     'ServiceMode': 'Bus (MB) (Fixed Route)',
     'PTOwnedByServiceProvider': None,
     'PTOwnedByPublicAgency': None,
     'PTLeasedByPublicAgency': None,
     'PTLeasedByServiceProvider': None,
     'DOOwned': None,
     'DOLeasedByPublicAgency': None,
     'DOLeasedFromPrivateEntity': None,
     'LastModifiedDate': '2023-09-18T16:36:36.67'},
    {'Id': 164,
     'ReportId': 834,
     'ServiceMode': 'Demand Response (DR)',
     'PTOwnedByServiceProvider': None,
     'PTOwnedByPublicAgency': None,
     'PTLeasedByPublicAgency': None,
     'PTLeasedByServiceProvider': None,
     'DOOwned': None,
     'DOLeasedByPublicAgency': None,
     'DOLeasedFromPrivateEntity': None,
     'LastModifiedDate': '2023-09-18T16:36:36.67'

**Get table and org list**

In [10]:
# type(blob) #list
len(blob)  # number of reports returned (one dict per report)

# type(blob[0]) #dict

# blob[0]

# blob['Tables']

85

In [11]:
tables = []

# # For listing out ONLY the python dictionary keys in the blob (these are the tables) that start with "NTD"
# for k, v in blob[0].items():
#     if k.startswith("NTD"):
#         tables.append(k)

# For listing out ALL the dict keys:
for k, v in blob[0].items():
    tables.append(k)

print(tables)

['ReportId', 'Organization', 'ReportPeriod', 'ReportStatus', 'ReportLastModifiedDate', 'NTDReportingStationsAndMaintenance', 'NTDTransitAssetManagementA15', 'NTDAssetAndResourceInfo', 'NTDReportingP10', 'NTDReportingP20', 'NTDReportingP50', 'NTDReportingA35', 'NTDReportingRR20_Intercity', 'NTDReportingRR20_Rural', 'NTDReportingRR20_Urban_Tribal', 'NTDReportingTAMNarrative', 'SS60']


### Inspect whether any tables have changed from last year (2023)

1. Pull up the external tables yaml at airflow/dags/create_external_tables/ntd_report_validation/external_table_all_ntdreports.yml
2. Pull up the external_blackcat.all_ntdreports table in BigQuery
  
To help with steps 3 and 4 below, use the cell below to paste in table names one at a time and inspect the API data for the year of interest. If a table shows no data, step through the JSON list by changing `blob[0]` to `blob[1]`, `blob[2]`, etc.  
  
3. Compare the table names above with the table list in that schema. NOTE: table names in BigQuery are not *exactly* the same as in the API; they are all lowercase with `_data` added.
4. Compare the individual columns within each of the above tables to the columns in the schema.
  
Change the schema in the yaml as needed to reflect the data. Do not remove any old column names; only add new columns and/or tables.
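
Steps 3 and 4 can be partly automated by collecting, for every table, the column names seen across all reports, so an empty table in `blob[0]` does not hide columns that appear in another report. The sketch below is only an illustration: `to_bigquery_name` and `columns_by_table` are hypothetical helpers, the sample data is made up, and the naming rule is simply our reading of the note in step 3.

```python
def to_bigquery_name(api_table: str) -> str:
    """Lowercase the API table name and append `_data` (rule as described in step 3)."""
    return api_table.lower() + "_data"


def columns_by_table(reports: list) -> dict:
    """Map each API table name to the sorted column names seen in any report's rows."""
    columns = {}
    for report in reports:
        for key, value in report.items():
            # Tables are the top-level keys whose value is a dict holding a 'Data' list.
            if not isinstance(value, dict) or "Data" not in value:
                continue
            seen = columns.setdefault(key, set())
            for row in value["Data"] or []:
                seen.update(row.keys())
    return {table: sorted(cols) for table, cols in columns.items()}


# Made-up sample shaped like the API response, not real data.
sample = [
    {"ReportId": 1, "Organization": "Org A", "SS60": {"Data": []}},
    {"ReportId": 2, "Organization": "Org B",
     "SS60": {"Data": [{"Id": 5, "Item": "Operator Injuries", "Quantity": None}]}},
]

for table, cols in columns_by_table(sample).items():
    print(to_bigquery_name(table), cols)
```

Running `columns_by_table(blob)` on the real response gives one list per table that can be compared by eye against the yaml schema.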

In [135]:
blob[0]['SS60']['Data']

[{'Id': 47,
  'ItemId': 1,
  'ReportId': 834,
  'Item': 'Major Safety and Security Events',
  'Type': None,
  'CssClass': None,
  'TransitVehicleAssualts': 0,
  'RevenueFacilityAssualts': 0,
  'NonRevenueFacilityAssualts': 0,
  'OtherLocationAssualts': 0,
  'MajorEvents': None,
  'Fatalities': None,
  'Injuries': None,
  'Quantity': None,
  'LastModifiedDate': '2023-09-18T16:20:42.953'},
 {'Id': 48,
  'ItemId': 2,
  'ReportId': 834,
  'Item': 'Non-Major Events (non-injury)',
  'Type': None,
  'CssClass': None,
  'TransitVehicleAssualts': 0,
  'RevenueFacilityAssualts': 0,
  'NonRevenueFacilityAssualts': 0,
  'OtherLocationAssualts': 0,
  'MajorEvents': None,
  'Fatalities': None,
  'Injuries': None,
  'Quantity': None,
  'LastModifiedDate': '2023-09-18T16:20:42.97'},
 {'Id': 49,
  'ItemId': 3,
  'ReportId': 834,
  'Item': 'Operator Injuries',
  'Type': None,
  'CssClass': None,
  'TransitVehicleAssualts': 0,
  'RevenueFacilityAssualts': 0,
  'NonRevenueFacilityAssualts': 0,
  'OtherLoc

#### Extract the org name and details from the blob
This extra code shows how to list the organizations present in the API response at any given time, in case it is helpful. 

In [99]:
for x in blob:
    report_id = x.get('ReportId')
    org = x.get('Organization')
    period = x.get('ReportPeriod')
    status = x.get('ReportStatus')
    last_mod_string = x.get('ReportLastModifiedDate')
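    # from_format assumes UTC when no tz is passed, so in_tz shifts the clock time to Pacific.
    # Whether the API timestamps really are UTC is unconfirmed.
    last_mod = pendulum.from_format(last_mod_string, 'MM/DD/YYYY HH:mm:ss A').in_tz('America/Los_Angeles')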
    iso = last_mod.to_iso8601_string()
    print(f"Report details: ID {report_id}, org {org}, report period {period}, status {status}, last modified on {last_mod_string}.")
#     print(f"New datetime {last_mod}")
#     print(f"iso is {iso}")

Report details: ID 677, org Marin County Transit District, report period 2022, status Approved, last modified on 9/20/2022 6:15:50 PM.
Report details: ID 679, org County of Shasta Department of Public Works, report period 2022, status Approved, last modified on 9/21/2022 11:41:39 AM.
Report details: ID 692, org San Diego Metropolitan Transit System, report period 2022, status Approved, last modified on 9/27/2022 2:56:05 PM.
Report details: ID 702, org County of Los Angeles - Department of Public Works, report period 2022, status Approved, last modified on 9/29/2022 4:00:26 PM.
Report details: ID 706, org Santa Barbara County Association of Governments, report period 2022, status Approved, last modified on 9/30/2022 3:56:39 PM.
Report details: ID 713, org Yosemite Area Regional Transportation System , report period 2022, status Approved, last modified on 10/4/2022 2:33:48 PM.
Report details: ID 718, org Transit Joint Powers Authority for Merced County, report period 2022, status Approve

#### Quick check on what orgs are in this API, and how many have RR-20 info

In [12]:
org_data = []

for x in blob:
    report_id = x.get('ReportId')
    org = x.get('Organization')
    period = x.get('ReportPeriod')
    status = x.get('ReportStatus')
    last_mod = pendulum.from_format(x.get('ReportLastModifiedDate'), 'MM/DD/YYYY HH:mm:ss A').in_tz('America/Los_Angeles')
    iso = last_mod.to_iso8601_string()
    
    
    # Each RR-20 table is a dict whose 'Data' key holds the list of rows.
    # Reading 'Data' directly avoids carrying over a count from the previous report.
    rural_n = len(x['NTDReportingRR20_Rural']['Data'])
    city_n = len(x['NTDReportingRR20_Intercity']['Data'])
    urban_n = len(x['NTDReportingRR20_Urban_Tribal']['Data'])
    
    org_info = pd.DataFrame(data=[[report_id, org, period, status, iso, rural_n, city_n, urban_n]], 
                            columns=['report_id', 'organization', 'report_period', 'report_status', 'last_modified', 
                                     'rr20_rural_rows', 'rr20_intercity_rows', 'rr20_urban-tribal_rows'])
#     whole_df = pd.concat([org_info, raw_df], axis=1).sort_values(by='organization')
    
    org_data.append(org_info)


In [14]:
newapi = pd.concat(org_data)
print(len(newapi))
newapi.head()

85


Unnamed: 0,report_id,organization,report_period,report_status,last_modified,rr20_rural_rows,rr20_intercity_rows,rr20_urban-tribal_rows
0,834,Fresno County Rural Transit Agency,2023,Submitted,2023-09-18T04:55:34-07:00,51,0,0
0,841,City of Needles,2023,Submitted,2023-09-18T07:35:31-07:00,47,0,0
0,843,Alpine County Community Development,2023,Submitted,2023-09-18T10:07:11-07:00,47,0,0
0,847,County of Los Angeles - Department of Public W...,2023,Submitted,2023-09-19T02:50:41-07:00,0,0,9
0,848,Santa Cruz Metropolitan Transit District,2023,Submitted,2023-09-19T03:00:14-07:00,0,0,9
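
The `last_modified` values above are several hours earlier than the raw strings (for example `11:55:34 AM` became `04:55:34-07:00`). That is because `pendulum.from_format` treats the string as UTC when no `tz` is passed, and `in_tz` then converts it to Pacific time. We do not know which timezone BlackCat actually writes, so the sketch below only contrasts the two readings. It uses the standard library rather than pendulum, and `API_FORMAT` and `parse_last_modified` are names made up for illustration.

```python
from datetime import datetime
from zoneinfo import ZoneInfo

API_FORMAT = "%m/%d/%Y %I:%M:%S %p"  # e.g. '9/18/2023 11:55:34 AM'


def parse_last_modified(text: str, source_tz: str) -> datetime:
    """Parse an API timestamp and label it with the timezone we assume it was written in."""
    return datetime.strptime(text, API_FORMAT).replace(tzinfo=ZoneInfo(source_tz))


pacific = ZoneInfo("America/Los_Angeles")
raw = "9/18/2023 11:55:34 AM"

# Reading 1: the API writes UTC (what the cells above implicitly assume).
as_utc = parse_last_modified(raw, "UTC").astimezone(pacific)
# Reading 2: the API already writes Pacific time.
as_pacific = parse_last_modified(raw, "America/Los_Angeles")

print(as_utc.isoformat())      # 2023-09-18T04:55:34-07:00
print(as_pacific.isoformat())  # 2023-09-18T11:55:34-07:00
```

If BlackCat confirms the timestamps are already Pacific, the equivalent pendulum change would be to pass `tz='America/Los_Angeles'` to `from_format` instead of calling `in_tz` afterwards.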


In [15]:
newapi.to_csv('../data/newapi_rr20_11-27-23.csv')

## Convert API data to dataframes
This section uses the test API to work out the conversion.

Put the entire blob into a single dataframe. This is the approach Cal-ITP recommends; they prefer that any transformations and splitting into tables happen afterwards in dbt.  
Downsides:
* Many columns hold nested data (lists and dictionaries). Basically, each NTD report table is in ONE column.
* The column names change because of the nesting and the repeated column names.
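
To make the two downsides concrete, here is a small sketch on made-up data shaped like the API response (not real reports). It shows the renamed `<table>.Data` column, the list held in each cell, and how `max_level=0` keeps the original top-level names at the cost of leaving a dict in each cell.

```python
import pandas as pd

# Made-up sample shaped like the API response, not real data.
sample = [
    {
        "ReportId": 1,
        "Organization": "Org A",
        "NTDReportingP10": {"Data": [{"Id": 10, "ReportId": 1}, {"Id": 11, "ReportId": 1}]},
    },
    {"ReportId": 2, "Organization": "Org B", "NTDReportingP10": {"Data": []}},
]

# Default behaviour: nested dicts are expanded, so the table column becomes '<table>.Data'.
flat = pd.json_normalize(sample)
print(list(flat.columns))

# Each cell in that column still holds the whole list of row dicts for one report.
print(type(flat.loc[0, "NTDReportingP10.Data"]))

# max_level=0 leaves the top-level keys untouched; each cell is then the {'Data': [...]} dict.
top = pd.json_normalize(sample, max_level=0)
print(list(top.columns))
```

Either way the per-table rows stay nested, which is why the unnesting is left to the later transformation step.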

In [8]:
df = pd.json_normalize(blob)
df

Unnamed: 0,ReportId,Organization,ReportPeriod,ReportStatus,ReportLastModifiedDate,NTDReportingStationsAndMaintenance.Data,NTDTransitAssetManagementA15.Data,NTDAssetAndResourceInfo.Data,NTDReportingP10.Data,NTDReportingP20.Data,NTDReportingP50.Data,NTDReportingA35.Data,NTDReportingRR20_Intercity.Data,NTDReportingRR20_Rural.Data,NTDReportingRR20_Urban_Tribal.Data,NTDReportingTAMNarrative.Data,SS60.Data
0,677,Marin County Transit District,2022,Approved,9/20/2022 6:15:50 PM,[],[],[],[],"[{'Id': 0, 'ReportId': 677, 'ServiceMode': 'Bu...","[{'Id': 0, 'ReportId': 677, 'Mode': {'id': 0, ...",[],[],[],"[{'Id': 1, 'ItemId': 8, 'ReportId': 677, 'Item...",[],[]
1,679,County of Shasta Department of Public Works,2022,Approved,9/21/2022 11:41:39 AM,"[{'Id': 4, 'ReportId': 679, 'ServiceMode': 'Bu...",[],"[{'Id': 19, 'VehicleId': 16178, 'ReportId': 67...",[],"[{'Id': 0, 'ReportId': 679, 'ServiceMode': 'Bu...","[{'Id': 0, 'ReportId': 679, 'Mode': {'id': 0, ...",[],[],"[{'Id': 70, 'ReportId': 679, 'Item': 'Bus (MB)...",[],"[{'Id': 4, 'ReportId': 679, 'Type': 'Revenue V...",[]
2,692,San Diego Metropolitan Transit System,2022,Approved,9/27/2022 2:56:05 PM,[],[],[],[],"[{'Id': 0, 'ReportId': 692, 'ServiceMode': 'Bu...","[{'Id': 0, 'ReportId': 692, 'Mode': {'id': 0, ...",[],[],[],"[{'Id': 10, 'ItemId': 8, 'ReportId': 692, 'Ite...",[],[]
3,702,County of Los Angeles - Department of Public W...,2022,Approved,9/29/2022 4:00:26 PM,[],[],[],[],"[{'Id': 0, 'ReportId': 702, 'ServiceMode': 'Bu...","[{'Id': 0, 'ReportId': 702, 'Mode': {'id': 0, ...",[],[],[],"[{'Id': 19, 'ItemId': 8, 'ReportId': 702, 'Ite...",[],[]
4,706,Santa Barbara County Association of Governments,2022,Approved,9/30/2022 3:56:39 PM,[],[],[],[],"[{'Id': 0, 'ReportId': 706, 'ServiceMode': 'Co...","[{'Id': 0, 'ReportId': 706, 'Mode': {'id': 0, ...",[],[],[],"[{'Id': 28, 'ItemId': 8, 'ReportId': 706, 'Ite...",[],[]
5,713,Yosemite Area Regional Transportation System,2022,Approved,10/4/2022 2:33:48 PM,"[{'Id': 48, 'ReportId': 713, 'ServiceMode': 'B...","[{'Id': 19, 'FacilityId': 118, 'ReportId': 713...","[{'Id': 4904, 'VehicleId': 13444, 'ReportId': ...",[],"[{'Id': 0, 'ReportId': 713, 'ServiceMode': 'Bu...","[{'Id': 0, 'ReportId': 713, 'Mode': {'id': 0, ...",[],[],"[{'Id': 990, 'ReportId': 713, 'Item': 'Bus (MB...",[],"[{'Id': 54, 'ReportId': 713, 'Type': 'Revenue ...",[]
6,718,Transit Joint Powers Authority for Merced County,2022,Approved,10/5/2022 12:16:36 PM,[],[],[],[],"[{'Id': 0, 'ReportId': 718, 'ServiceMode': 'Bu...","[{'Id': 0, 'ReportId': 718, 'Mode': {'id': 0, ...",[],[],[],"[{'Id': 37, 'ItemId': 8, 'ReportId': 718, 'Ite...",[],[]
7,725,City of Tehachapi,2022,Approved,10/5/2022 4:18:10 PM,"[{'Id': 69, 'ReportId': 725, 'ServiceMode': 'D...",[],[],[],"[{'Id': 0, 'ReportId': 725, 'ServiceMode': 'De...","[{'Id': 0, 'ReportId': 725, 'Mode': {'id': 0, ...",[],[],"[{'Id': 1467, 'ReportId': 725, 'Item': 'Demand...",[],[],[]
8,726,Sonoma County Transit,2022,Approved,10/5/2022 4:53:27 PM,[],[],[],[],"[{'Id': 0, 'ReportId': 726, 'ServiceMode': 'Bu...","[{'Id': 0, 'ReportId': 726, 'Mode': {'id': 0, ...",[],[],[],"[{'Id': 64, 'ItemId': 8, 'ReportId': 726, 'Ite...",[],[]
9,727,City of Santa Maria,2022,Approved,10/5/2022 5:28:39 PM,[],[],[],[],"[{'Id': 0, 'ReportId': 727, 'ServiceMode': 'Bu...","[{'Id': 0, 'ReportId': 727, 'Mode': {'id': 0, ...",[],[],[],"[{'Id': 82, 'ItemId': 8, 'ReportId': 727, 'Ite...",[],[]


In [20]:
df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 32 entries, 0 to 31
Data columns (total 17 columns):
 #   Column                                   Non-Null Count  Dtype 
---  ------                                   --------------  ----- 
 0   ReportId                                 32 non-null     int64 
 1   Organization                             32 non-null     object
 2   ReportPeriod                             32 non-null     object
 3   ReportStatus                             32 non-null     object
 4   ReportLastModifiedDate                   32 non-null     object
 5   NTDReportingStationsAndMaintenance.Data  32 non-null     object
 6   NTDTransitAssetManagementA15.Data        32 non-null     object
 7   NTDAssetAndResourceInfo.Data             32 non-null     object
 8   NTDReportingP10.Data                     32 non-null     object
 9   NTDReportingP20.Data                     32 non-null     object
 10  NTDReportingP50.Data                     32 non-null     object


In [21]:
df['ReportLastModifiedDate'] =  df['ReportLastModifiedDate'].astype('datetime64[ns]')
# Alternative with an explicit strftime-style format:
# df['ReportLastModifiedDate'] = pd.to_datetime(df['ReportLastModifiedDate'], format='%m/%d/%Y %I:%M:%S %p')

In [22]:
df

Unnamed: 0,ReportId,Organization,ReportPeriod,ReportStatus,ReportLastModifiedDate,NTDReportingStationsAndMaintenance.Data,NTDTransitAssetManagementA15.Data,NTDAssetAndResourceInfo.Data,NTDReportingP10.Data,NTDReportingP20.Data,NTDReportingP50.Data,NTDReportingA35.Data,NTDReportingRR20_Intercity.Data,NTDReportingRR20_Rural.Data,NTDReportingRR20_Urban_Tribal.Data,NTDReportingTAMNarrative.Data,SS60.Data
0,677,Marin County Transit District,2022,Approved,2022-09-20 18:15:50,[],[],[],[],"[{'Id': 0, 'ReportId': 677, 'ServiceMode': 'Bu...","[{'Id': 0, 'ReportId': 677, 'Mode': {'id': 0, ...",[],[],[],"[{'Id': 1, 'ItemId': 8, 'ReportId': 677, 'Item...",[],[]
1,679,County of Shasta Department of Public Works,2022,Approved,2022-09-21 11:41:39,"[{'Id': 4, 'ReportId': 679, 'ServiceMode': 'Bu...",[],"[{'Id': 19, 'VehicleId': 16178, 'ReportId': 67...",[],"[{'Id': 0, 'ReportId': 679, 'ServiceMode': 'Bu...","[{'Id': 0, 'ReportId': 679, 'Mode': {'id': 0, ...",[],[],"[{'Id': 70, 'ReportId': 679, 'Item': 'Bus (MB)...",[],"[{'Id': 4, 'ReportId': 679, 'Type': 'Revenue V...",[]
2,692,San Diego Metropolitan Transit System,2022,Approved,2022-09-27 14:56:05,[],[],[],[],"[{'Id': 0, 'ReportId': 692, 'ServiceMode': 'Bu...","[{'Id': 0, 'ReportId': 692, 'Mode': {'id': 0, ...",[],[],[],"[{'Id': 10, 'ItemId': 8, 'ReportId': 692, 'Ite...",[],[]
3,702,County of Los Angeles - Department of Public W...,2022,Approved,2022-09-29 16:00:26,[],[],[],[],"[{'Id': 0, 'ReportId': 702, 'ServiceMode': 'Bu...","[{'Id': 0, 'ReportId': 702, 'Mode': {'id': 0, ...",[],[],[],"[{'Id': 19, 'ItemId': 8, 'ReportId': 702, 'Ite...",[],[]
4,706,Santa Barbara County Association of Governments,2022,Approved,2022-09-30 15:56:39,[],[],[],[],"[{'Id': 0, 'ReportId': 706, 'ServiceMode': 'Co...","[{'Id': 0, 'ReportId': 706, 'Mode': {'id': 0, ...",[],[],[],"[{'Id': 28, 'ItemId': 8, 'ReportId': 706, 'Ite...",[],[]
5,713,Yosemite Area Regional Transportation System,2022,Approved,2022-10-04 14:33:48,"[{'Id': 48, 'ReportId': 713, 'ServiceMode': 'B...","[{'Id': 19, 'FacilityId': 118, 'ReportId': 713...","[{'Id': 4904, 'VehicleId': 13444, 'ReportId': ...",[],"[{'Id': 0, 'ReportId': 713, 'ServiceMode': 'Bu...","[{'Id': 0, 'ReportId': 713, 'Mode': {'id': 0, ...",[],[],"[{'Id': 990, 'ReportId': 713, 'Item': 'Bus (MB...",[],"[{'Id': 54, 'ReportId': 713, 'Type': 'Revenue ...",[]
6,718,Transit Joint Powers Authority for Merced County,2022,Approved,2022-10-05 12:16:36,[],[],[],[],"[{'Id': 0, 'ReportId': 718, 'ServiceMode': 'Bu...","[{'Id': 0, 'ReportId': 718, 'Mode': {'id': 0, ...",[],[],[],"[{'Id': 37, 'ItemId': 8, 'ReportId': 718, 'Ite...",[],[]
7,725,City of Tehachapi,2022,Approved,2022-10-05 16:18:10,"[{'Id': 69, 'ReportId': 725, 'ServiceMode': 'D...",[],[],[],"[{'Id': 0, 'ReportId': 725, 'ServiceMode': 'De...","[{'Id': 0, 'ReportId': 725, 'Mode': {'id': 0, ...",[],[],"[{'Id': 1467, 'ReportId': 725, 'Item': 'Demand...",[],[],[]
8,726,Sonoma County Transit,2022,Approved,2022-10-05 16:53:27,[],[],[],[],"[{'Id': 0, 'ReportId': 726, 'ServiceMode': 'Bu...","[{'Id': 0, 'ReportId': 726, 'Mode': {'id': 0, ...",[],[],[],"[{'Id': 64, 'ItemId': 8, 'ReportId': 726, 'Ite...",[],[]
9,727,City of Santa Maria,2022,Approved,2022-10-05 17:28:39,[],[],[],[],"[{'Id': 0, 'ReportId': 727, 'ServiceMode': 'Bu...","[{'Id': 0, 'ReportId': 727, 'Mode': {'id': 0, ...",[],[],[],"[{'Id': 82, 'ItemId': 8, 'ReportId': 727, 'Ite...",[],[]


In [228]:
user_dict = blob[0]['NTDReportingP50']['Data']
user_dict # a list of dictionaries. Each dict is one row of data.

[{'Id': 0,
  'ReportId': 677,
  'Mode': {'id': 0,
   'Text': 'Bus (MB) (Fixed Route)',
   'Value': '3',
   'Group': None,
   'BoolValue': False},
  'Type': {'id': 0,
   'Text': 'PT - Purchased Transportation',
   'Value': '2',
   'Group': None,
   'BoolValue': False},
  'WebLink': None,
  'FilePath': None,
  'LastModifiedDate': None},
 {'Id': 0,
  'ReportId': 677,
  'Mode': {'id': 0,
   'Text': 'Demand Response (DR)',
   'Value': '9',
   'Group': None,
   'BoolValue': False},
  'Type': {'id': 0,
   'Text': 'PT - Purchased Transportation',
   'Value': '2',
   'Group': None,
   'BoolValue': False},
  'WebLink': None,
  'FilePath': None,
  'LastModifiedDate': None}]

In [229]:
raw_df = pd.DataFrame.from_dict(user_dict)
raw_df

Unnamed: 0,Id,ReportId,Mode,Type,WebLink,FilePath,LastModifiedDate
0,0,677,"{'id': 0, 'Text': 'Bus (MB) (Fixed Route)', 'V...","{'id': 0, 'Text': 'PT - Purchased Transportati...",,,
1,0,677,"{'id': 0, 'Text': 'Demand Response (DR)', 'Val...","{'id': 0, 'Text': 'PT - Purchased Transportati...",,,


However, in several tables the rows have columns whose values are nested dictionaries.  
  
The following code explores ways to unnest them and expand the dataframe rows. **NOTE: WE DID NOT USE THIS APPROACH IN PRODUCTION. We decided to unnest tables using SQL instead, in the `staging` dbt models.**

In [235]:
pd.json_normalize(user_dict)

# This expands columns instead of expanding rows. Not exactly what we want.

Unnamed: 0,Id,ReportId,WebLink,FilePath,LastModifiedDate,Mode.id,Mode.Text,Mode.Value,Mode.Group,Mode.BoolValue,Type.id,Type.Text,Type.Value,Type.Group,Type.BoolValue
0,0,677,,,,0,Bus (MB) (Fixed Route),3,,False,0,PT - Purchased Transportation,2,,False
1,0,677,,,,0,Demand Response (DR),9,,False,0,PT - Purchased Transportation,2,,False


In [243]:
# We only really want the "Text" value in the dictionaries in the "Mode" and "Type" columns.
# user_dict[0]['Mode']
user_dict[0]

{'Id': 0,
 'ReportId': 677,
 'Mode': {'id': 0,
  'Text': 'Bus (MB) (Fixed Route)',
  'Value': '3',
  'Group': None,
  'BoolValue': False},
 'Type': {'id': 0,
  'Text': 'PT - Purchased Transportation',
  'Value': '2',
  'Group': None,
  'BoolValue': False},
 'WebLink': None,
 'FilePath': None,
 'LastModifiedDate': None}

In [245]:
# How to replace certain values in a key:value pair of an existing python dictionary.
original = user_dict[0]
copy = {**original, 'Mode': original['Mode']['Text'], 
        'Type': original['Type']['Text']}
copy

{'Id': 0,
 'ReportId': 677,
 'Mode': 'Bus (MB) (Fixed Route)',
 'Type': 'PT - Purchased Transportation',
 'WebLink': None,
 'FilePath': None,
 'LastModifiedDate': None}

----
Done! This works, but it is not ideal: the keys to change are hard-coded rather than found by iterating. It is fine as long as we know which dictionary items in each table are nested.  

In [249]:
# Trying loop of creating new dict from old dict.
# New dict will not be nested - checks for a nested dict in each value; for each nested dict, 
# we extract only the k,v pair where the key == 'Text' 

copy_test = {**original}
for k,v in copy_test.items():
    if type(v) is dict:
        copy_test[k] = copy_test[k]['Text']
        
copy_test

{'Id': 0,
 'ReportId': 677,
 'Mode': 'Bus (MB) (Fixed Route)',
 'Type': 'PT - Purchased Transportation',
 'WebLink': None,
 'FilePath': None,
 'LastModifiedDate': None}

In [251]:
## Worked! Now try the above loop over an entire JSON data table.
## NOTE: this modifies user_dict (and therefore blob) in place.

for x in user_dict:
    for k,v in x.items():
        if type(v) is dict:
            x[k] = x[k]['Text']

In [252]:
raw_df = pd.DataFrame.from_dict(user_dict)
raw_df

Unnamed: 0,Id,ReportId,Mode,Type,WebLink,FilePath,LastModifiedDate
0,0,677,Bus (MB) (Fixed Route),PT - Purchased Transportation,,,
1,0,677,Demand Response (DR),PT - Purchased Transportation,,,


#### The table is now one level deep and in the desired format.
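
The loop over `user_dict` rewrites the rows in place, so the nested `Mode` and `Type` dictionaries are gone from `blob` afterwards. If the original response needs to stay intact, a non-mutating variant is one option. This is only a sketch: `flatten_rows` is a hypothetical helper, the sample rows are made up, and like the loops above it assumes every nested dictionary has a `Text` key worth keeping.

```python
def flatten_rows(rows: list, field: str = "Text") -> list:
    """Return new row dicts where each nested dict value is replaced by its `field` entry."""
    flat_rows = []
    for row in rows:
        flat_rows.append(
            {
                key: value.get(field) if isinstance(value, dict) else value
                for key, value in row.items()
            }
        )
    return flat_rows


# Made-up rows shaped like the NTDReportingP50 data above.
rows = [
    {"Id": 0, "Mode": {"id": 0, "Text": "Demand Response (DR)", "Value": "9"}, "WebLink": None},
]

flat = flatten_rows(rows)
print(flat)
print(rows[0]["Mode"]["Text"])  # the input rows are left untouched
```

The returned list can be passed to `pd.DataFrame` in the same way as `user_dict`.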