# 5311 Insights
<hr style="border:2px solid #8CBCCB">


In [1]:
import warnings

import pandas as pd

# Project helper scripts
import _data_prep as data_prep
import _utils

# Notebook display formatting
pd.options.display.max_rows = 100
# Show floats with thousands separators
pd.options.display.float_format = '{:,}'.format

# Silence library warnings in the notebook output
warnings.filterwarnings("ignore")

In [2]:
#Color Palette
LIGHT_BLUE ="#8CBCCB"
DARK_BLUE = "#2EA8CE"
YELLOW = "#F4D837"
GREEN = "#51BF9D"
PURPLE = "#9487C0"

In [3]:
#Cleaned & aggregated DF 
aggregated = data_prep.aggregated_df()
final = data_prep.final_df()

## Analyze 5311 by District

In [4]:
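# Totals per Caltrans district; fleet age is the median of the agencies' average ages
district = (aggregated
            .groupby(['caltrans_district'])
            .agg({
                'doors_sum': 'sum',
                'allocationamount': 'sum',
                'expendedamount': 'sum',
                'total_vehicles': 'sum',
                'average_age_of_fleet__in_years_': 'median'
            })
            .reset_index()
           )

# Convert snake_case column names to display labels
district = _utils.cols_cleanup(district)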

In [5]:
(district
 .style.bar(subset=["Allocationamount"], color=LIGHT_BLUE)
 .bar(subset=["Expendedamount"], color=PURPLE)
 .bar(subset=["Total Vehicles"], color=YELLOW)
 .bar(subset=["Doors Sum"], color=GREEN)
)

| | Caltrans District | Doors Sum | Allocationamount | Expendedamount | Total Vehicles | Average Age Of Fleet In Years |
|---|---|---|---|---|---|---|
| 0 | 01 - Eureka | 193.0 | 18514520.0 | 12328651.03 | 177.0 | 5.424279 |
| 1 | 02 - Redding | 78.0 | 6293412.0 | 4625264.02 | 62.0 | 5.938889 |
| 2 | 03 - Marysville | 511.0 | 16444169.0 | 11525387.78 | 353.0 | 6.592634 |
| 3 | 04 - Oakland | 3044.0 | 10945699.28 | 7418303.0 | 2145.0 | 7.201646 |
| 4 | 05 - San Luis Obispo | 681.0 | 16604962.0 | 11601678.88 | 531.0 | 7.12843 |
| 5 | 06 - Fresno | 639.0 | 29962524.0 | 21368584.63 | 507.0 | 6.691667 |
| 6 | 07 - Los Angeles | 12.0 | 2706782.0 | 2169754.17 | 7.0 | 7.5 |
| 7 | 08 - San Bernardino | 1264.0 | 17885468.0 | 13913177.69 | 1076.0 | 4.0 |
| 8 | 09 - Bishop | 146.0 | 3830444.0 | 2467504.56 | 112.0 | 11.0 |
| 9 | 10 - Stockton | 440.0 | 11420717.0 | 8459654.4 | 335.0 | 6.156062 |
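
The display labels in the table above (for example `average_age_of_fleet__in_years_` shown as "Average Age Of Fleet In Years") come from `_utils.cols_cleanup`, whose source is not included in this notebook. The sketch below is a guess at one way to produce the same labels. The function name `cols_cleanup_sketch` and its body are assumptions for illustration, not the project's actual helper.

```python
import pandas as pd


def cols_cleanup_sketch(df: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical stand-in for _utils.cols_cleanup: tidy column labels."""
    out = df.copy()
    out.columns = (
        out.columns
        .str.replace("_", " ", regex=False)     # snake_case -> spaced words
        .str.replace(r"\s+", " ", regex=True)   # collapse doubled spaces
        .str.strip()                            # drop leading/trailing spaces
        .str.title()                            # capitalize each word
    )
    return out


# Toy frame that reuses column names seen in this notebook
demo = pd.DataFrame(
    columns=["caltrans_district", "average_age_of_fleet__in_years_", "GTFS"]
)
cleaned_labels = list(cols_cleanup_sketch(demo).columns)
print(cleaned_labels)
```

Title-casing also explains why the `GTFS` column appears as `Gtfs` in later outputs.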


## What are the GTFS statuses of applicants?

In [6]:
GTFS_orgs = (aggregated.groupby(['GTFS'])
             .agg({'organization_name':'nunique'})
             .reset_index()
             .rename(columns = {'organization_name': 'Count_of_Organizations'})
             .sort_values('Count_of_Organizations', ascending = False)
            )
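
The same count-of-organizations pattern (group, `nunique`, rename, sort) recurs in several later cells. A minimal sketch on made-up data shows what it returns; the `toy` frame and its organization names are invented for illustration and are not drawn from the 5311 data.

```python
import pandas as pd

# Invented rows: organization "A" appears twice but should be counted once
toy = pd.DataFrame({
    "organization_name": ["A", "A", "B", "C"],
    "GTFS": [
        "Static OK_RT OK",
        "Static OK_RT OK",
        "Static OK_RT OK",
        "Static OK_RT Incomplete",
    ],
})

toy_counts = (toy
              .groupby(["GTFS"])
              .agg({"organization_name": "nunique"})   # distinct organizations per status
              .reset_index()
              .rename(columns={"organization_name": "Count_of_Organizations"})
              .sort_values("Count_of_Organizations", ascending=False)
             )
print(toy_counts)
```

Using `nunique` rather than `count` matters whenever an organization has more than one row in the source frame, for example one row per funding year or project.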

In [7]:
_utils.basic_bar_chart(GTFS_orgs, 'Count_of_Organizations', 'GTFS', 'GTFS') 


## Which Caltrans District has the most applicants?

In [8]:
Orgs_in_district = (aggregated
                    .groupby(['caltrans_district'])
                    .agg({'organization_name':'nunique'})
                    .reset_index()
                    .rename(columns = {'organization_name':'count_of_organizations'})
                    .sort_values(by='count_of_organizations', ascending=False)
                   )

In [9]:
_utils.basic_bar_chart(Orgs_in_district,'caltrans_district','count_of_organizations','caltrans_district') 


## What is the GTFS status by Fleet Size?

In [10]:
vehicle_size = ['vehicles_percent_older_than_9',
                'vehicles_percent_older_than_15',
                'vehicles_percent_0_to_9']

In [11]:
_utils.multi_charts((_utils.aggregation_one(final, 'GTFS')), 'GTFS', vehicle_size)


## Analyze fleet size by vehicle age.

In [12]:
fleet = _utils.aggregation_one(final,'fleet_size')

In [13]:
fleet = _utils.cols_cleanup(fleet
         .loc[fleet['fleet_size'] != 'No Info']) 


In [14]:
fleet[['Fleet Size','Vehicles Older Than 9', 'Vehicles Older Than 15',
       'Vehicles 0 To 9',]].style.background_gradient(cmap = 'BuGn' )

| | Fleet Size | Vehicles Older Than 9 | Vehicles Older Than 15 | Vehicles 0 To 9 |
|---|---|---|---|---|
| 0 | Large | 14216.0 | 4768.0 | 34058.0 |
| 1 | Medium | 2340.0 | 451.0 | 10712.0 |
| 3 | Small | 282.0 | 35.0 | 990.0 |


## What is the fleet size by agency?

In [15]:
fleet_agencies = (aggregated
                  .groupby(['fleet_size'])
                  .agg({'organization_name':'nunique'})
                  .reset_index()
                  .rename(columns = {'organization_name':'Total Organizations'})
                 )

In [16]:
_utils.basic_bar_chart(fleet_agencies,'fleet_size','Total Organizations','fleet_size') 


## What are the most common reporter types?

In [17]:
Reporter_type_agg = (aggregated
                     .groupby(['reporter_type'])
                     .agg({'organization_name':'nunique'})
                     .reset_index()
                     .rename(columns = {'organization_name':'Count_of_Agencies'})
                    )

In [18]:
_utils.basic_bar_chart(Reporter_type_agg,'Count_of_Agencies', 'reporter_type', 'reporter_type') 


## Which organization received the most funds overall? What is their GTFS Status?

In [19]:
Most_Money = (aggregated
              .groupby(['organization_name','GTFS'])
              .agg({'allocationamount':'sum'})
              .rename(columns = {'allocationamount': '5311_funds_received'})
              .reset_index()
              .sort_values('5311_funds_received', ascending = False)
              .head(10)
             )

In [20]:
_utils.basic_bar_chart(Most_Money,'5311_funds_received', 'organization_name','organization_name') 


In [21]:
most_money_list = Most_Money.organization_name.tolist()
gtfs_most_funded = aggregated[aggregated["organization_name"].isin(most_money_list)]
gtfs_most_funded = _utils.cols_cleanup(gtfs_most_funded[['organization_name','GTFS']]).sort_values('Gtfs')
gtfs_most_funded

| | Organization Name | Gtfs |
|---|---|---|
| 42 | Fresno County Rural Transit Agency | Static Incomplete_RT Incomplete |
| 58 | Monterey-Salinas Transit | Static Incomplete_RT OK |
| 2 | Butte County Association of Governments | Static OK_RT Incomplete |
| 47 | Kern Regional Transit | Static OK_RT Incomplete |
| 56 | Mendocino Transit Authority | Static OK_RT Incomplete |
| 60 | Mountain Area Regional Transit Authority | Static OK_RT Incomplete |
| 79 | Transit Joint Powers Authority for Merced County | Static OK_RT Incomplete |
| 45 | Humboldt Transit Authority | Static OK_RT OK |
| 50 | Lake Transit Authority | Static OK_RT OK |
| 83 | Victor Valley Transit Authority | Static OK_RT OK |


## Which subset of 5311 is the most popular, using funds requested as a metric?

In [22]:
bc_funds = (final
            .groupby(['funding_program'])
            .agg({'organization_name':'nunique',
                  'allocationamount':'sum'})
            .rename(columns = {'organization_name':'Count_of_Organizations'
                               ,'allocationamount':'total_sum'})
            .reset_index()
            .sort_values(by =['total_sum'])
             )

In [23]:
_utils.basic_bar_chart(bc_funds,'total_sum', 'funding_program', 'funding_program') 

