# Run Functions to Add Information to Projects

To run the data through the script, update the `my_file` path to the most recent FMIS and QMRS export uploaded to GCS, then run the function in the `Export Data` section with your dataframe and the current date. The aggregated data will then be available in GCS.

In [1]:
import pandas as pd
from siuba import *

import _script_utils

In [2]:
pd.set_option("display.max_columns", 100)
pd.set_option('display.max_colwidth', None)

## Read in Data and Function Development / Test Function

For the following function:
* update `my_file` to the file name of the most recent FMIS & QMRS export
* the second argument is the unique recipient identifier; it should stay the same for subsequent exports
* the third argument is the aggregation level for the data. Unless otherwise specified, it should be `agg`, which gives one row per project

In [3]:
GCS_FILE_PATH  = 'gs://calitp-analytics-data/data-analyses/dla/dla-iija'

In [4]:
my_file = "FMIS_Projects_Universe_IIJA_Reporting_03012024_ToDLA.xlsx"

In [5]:
df = _script_utils.run_script(my_file, 'summary_recipient_defined_text_field_1_value', 'agg')

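# Strip a trailing ".0" from the locode. The dot is escaped and the pattern anchored,
# so only a literal ".0" suffix is removed regardless of the pandas regex default.
df['implementing_agency_locode'] = df['implementing_agency_locode'].str.replace(r'\.0$', '', regex=True)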
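
The locode cleanup is easy to get subtly wrong, because the default of the `regex` argument to `Series.str.replace` changed in pandas 2.0 (from `True` to `False`). The following is a minimal sketch on made-up locode strings, not the project data, comparing an unescaped pattern treated as a regex with an escaped, anchored one. The series name `locodes` and its values are assumptions for illustration only.

```python
import pandas as pd

# Hypothetical locode values, as they might look after a float column is cast to string.
locodes = pd.Series(["5958.0", "5004.0", "1000.0", None])

# Unescaped pattern treated as a regex: "." matches ANY character,
# so pairs such as "50" or "10" are removed as well as the ".0" suffix.
naive = locodes.str.replace(".0", "", regex=True)

# Escaped and anchored pattern: only a literal ".0" at the end of the string is removed.
cleaned = locodes.str.replace(r"\.0$", "", regex=True)

print(naive.tolist()[:3])    # ['5958', '04', '']
print(cleaned.tolist()[:3])  # ['5958', '5004', '1000']
```

Missing values pass through unchanged in both cases, so rows without a locode stay empty.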


### Testing the data

In [6]:
df.sample(3)

Unnamed: 0,fmis_transaction_date,project_number,implementing_agency,summary_recipient_defined_text_field_1_value,program_code,program_code_description,recipient_project_number,improvement_type,improvement_type_description,old_project_title_desc,obligations_amount,congressional_district,district,county_code,county_name,county_name_abbrev,implementing_agency_locode,rtpa_name,mpo_name,new_project_title,new_description_col
1308,45175,31L1001,California,S ER NONE,ER01,Emergency Relieve Funding,0117000079S,6,4R - Restoration & Rehabilitation,"IN HUMBOLDT CO., ABOUT 6 MI W OF WILLOW CREEK AT CEDAR CREEK ROAD. EMERGENCY RELIEF - EMERGENCY OPENING STORM DAMAGE REPAIRS.",32379,|02|,|01|,23,Humboldt County,|HUM|,,,,Road Restoration & Rehabilitation in Humboldt County,"Road Restoration & Rehabilitation in Humboldt County, part of the program(s) Emergency Relieve Funding. (Federal Project ID: 31L1001)."
1018,45103,5958094,Imperial County,L5958SCAG,Y001,National Highway Performance Program (NHPP),1115000144L,15|16|43,Preliminary Engineering|Right of Way|Utilities,"FORRESTER ROAD OVER WESTSIDE MAIN CANAL 0.6 MILES NORTH OF KEYSTONE ROAD, BR. NO. 58C-0014 REPLACE EXISTING TWO LANES BRIDGE WITH A NEW TWO LANE BRID",2464055,|51|,|11|,25,Imperial County,|IMP|,5958.0,Imperial County Transportation Commission,Southern California Association Of Governments,Preliminary Engineering Projects in Imperial County,"Preliminary Engineering Projects in Imperial County, part of the program(s) National Highway Performance Program (NHPP). (Federal Project ID: 5958094)."
214,44767,5933143,Alameda County,L5933MTC,Y240,Surface Transportation Block Grant,0418000118L,17|28,Construction Engineering|Facilities for Pedestrians and Bicycles,"IN CASTRO VALLEY: ON ANITA AVENUE BETWEEN CASTRO VALLEY BLVD. AND SOMERSET AVENUE CONSTRUCT SIDE WALKS,CURBS, GUTTERS, DRIVEWAYS, PEDESTRIAN RAMPS AN",2000000,|15|,|04|,1,Alameda County,|ALA|,5933.0,Metropolitan Transportation Commission,Metropolitan Transportation Commission,Construct Pedestrian Safety Improvements in Alameda County,"Construct Pedestrian Safety Improvements in Alameda County, part of the program(s) Surface Transportation Block Grant. (Federal Project ID: 5933143)."


In [7]:
## when grouping by funding program (one project can have multiple rows), len is 1612
len(df)

1968

In [8]:
## check one project with multiple funding codes
df >> filter(_.project_number == '5004049')

Unnamed: 0,fmis_transaction_date,project_number,implementing_agency,summary_recipient_defined_text_field_1_value,program_code,program_code_description,recipient_project_number,improvement_type,improvement_type_description,old_project_title_desc,obligations_amount,congressional_district,district,county_code,county_name,county_name_abbrev,implementing_agency_locode,rtpa_name,mpo_name,new_project_title,new_description_col
1129,45133,5004049,San Diego,L5004SANDAG,Y001|Y110|Y908|Y909,National Highway Performance Program (NHPP)|Bridge Formula Program|Bridge Replacement and Rehabilitation Program,11955780L,10|17,Bridge Replacement - Added Capacity|Construction Engineering,"WEST MISSION BAY DRIVE OVER THE SAN DIEGO RIVER BRIDGE REPLACEMENT, BR. NO. 57C-0023",69715548,|52|,|11|,73,San Diego County,|SD|,4,San Diego Association of Governments,San Diego Association Of Governments,Replace Bridge in San Diego,"Replace Bridge in San Diego, part of the program(s) National Highway Performance Program (NHPP), and the Bridge Formula Program, and the Bridge Replacement and Rehabilitation Program. (Federal Project ID: 5004049)."
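
At the `agg` level, multiple funding codes for one project are stored in a single pipe-delimited `program_code` string, as in the row above. A minimal sketch of how such rows could be found across the whole dataframe, using a toy frame built from two values shown in this notebook. The names `sample`, `n_program_codes`, and `multi` are assumptions for illustration and are not part of `_script_utils`.

```python
import pandas as pd

# Toy frame mirroring two rows from the output above (only the relevant columns).
sample = pd.DataFrame(
    {
        "project_number": ["5004049", "5933143"],
        "program_code": ["Y001|Y110|Y908|Y909", "Y240"],
    }
)

# A single-character separator is treated as a literal by str.split,
# so "|" does not need escaping here.
sample["n_program_codes"] = sample["program_code"].str.split("|").str.len()

# Keep only the projects funded by more than one program code.
multi = sample[sample["n_program_codes"] > 1]
print(multi)
```

The same two lines applied to `df` would list every aggregated project that draws on more than one program code.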


## Export Data

In [10]:
### update the export file name below (date in the name) before exporting to GCS

In [9]:
# _script_utils.export_to_gcs(df, "IIJA_FMIS_AllProjects_03012024_ToDLA_agg.csv")
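
Since the export file name carries the date, it can be built from a date object rather than typed by hand. This is a minimal sketch: `build_export_name` is a hypothetical helper, not part of `_script_utils`, and it assumes the date stamp in the file name is `MMDDYYYY` (so `03012024` is read as March 1, 2024).

```python
from datetime import date


def build_export_name(run_date: date, level: str = "agg") -> str:
    """Return an export file name following the pattern used in this notebook.

    Hypothetical helper; assumes the date stamp format is MMDDYYYY.
    """
    return f"IIJA_FMIS_AllProjects_{run_date:%m%d%Y}_ToDLA_{level}.csv"


export_name = build_export_name(date(2024, 3, 1))
print(export_name)  # IIJA_FMIS_AllProjects_03012024_ToDLA_agg.csv
```

Passing `date.today()` would stamp the file with the current date, and the resulting string can be given as the second argument to `_script_utils.export_to_gcs`.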