# Number of Trips to the New Employment Centers
#### Purpose:
To find the number of trips to the new employment centers in San Diego County on weekdays in fall 2019, broken down by factors such as primary mode, trip purpose, and vehicle type.

#### Data Source:
Two data sources were used, both from Replica Place Studies (please log in to your Replica account):

1. Replica Place Studies: https://studio.replicahq.com/places/studies/p544f96
2. Replica Place Studies: https://studio.replicahq.com/places/studies/ulztfb3


#### Transformations being performed:
Two datasets were downloaded separately from Replica Place Studies. They were then uploaded to J Drive. 

#### Location of Outputs:
J:\DataScience\DSEconProdDessem\EC2\Replica\Trips_to_EC_fall_2019_thursday\Outputs

#### Author: 
Navid Hedayati (navid.hedayati@sandag.org)

#### Date Created
3/2/2023

# Import Libraries

In [1]:
# Necessary libraries
import pandas as pd
import numpy as np
import geopandas as gpd

# Read Data
Two datasets were read. They are called replica and replica_2.

In [2]:
replica = pd.read_csv(r"J:\DataScience\DSEconProdDessem\EC2\Replica\Trips_to_EC_fall_2019_thursday\replica-first_study-02_14_23-trips_dataset\replica-first_study-02_14_23-trips_dataset.csv")

In [3]:
replica_2 = pd.read_csv(r"J:\DataScience\DSEconProdDessem\EC2\Replica\Trips_to_EC_fall_2019_thursday\replica-num_trips_blockgrps_to_ecs_fall_19_thusrday-02_23_23-trips_dataset\replica-num_trips_blockgrps_to_ecs_fall_19_thusrday-02_23_23-trips_dataset.csv")

In [4]:
# dataset dimensions
replica.shape

(7557621, 39)

In [5]:
# dataset dimensions
replica_2.shape

(7557621, 37)
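The two datasets have the same number of rows but a different number of columns (39 versus 37). A minimal sketch of how the differing columns could be listed is shown below. The two small frames are hypothetical stand-ins that carry only a few of the column names seen in the previews, not the real data.

```python
import pandas as pd

# Hypothetical stand-ins: empty frames with a subset of the real column names.
first_demo = pd.DataFrame(
    columns=["origin_bgrp", "destination_bgrp", "destination_cty", "destination_custom"]
)
second_demo = pd.DataFrame(columns=["origin_bgrp", "destination_custom"])

# Index.difference returns the labels present in one index but not the other.
only_in_first = first_demo.columns.difference(second_demo.columns).tolist()
only_in_second = second_demo.columns.difference(first_demo.columns).tolist()

print("Only in first:", only_in_first)
print("Only in second:", only_in_second)
```

Running the same two `difference` calls on `replica.columns` and `replica_2.columns` would list the fields that exist in only one of the downloads.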

In [6]:
replica.head()

Unnamed: 0,origin_bgrp,origin_cty,origin_st,destination_bgrp,destination_cty,destination_custom,primary_mode,trip_purpose,previous_trip_purpose,trip_start_time,...,trip_taker_available_vehicles,trip_taker_resident_type,trip_taker_home_bgrp,trip_taker_home_trct,trip_taker_home_cty,trip_taker_home_st,trip_taker_work_bgrp,trip_taker_work_trct,trip_taker_work_cty,trip_taker_work_st
0,"1 (Tract 5520.01, Los Angeles, CA)","Los Angeles County, CA",California,"1 (Tract 187, San Diego, CA)","San Diego County, CA",Marine Corps Base Camp Pendleton,private_auto,work,home,07:39:00,...,three_plus,core,"1 (Tract 5520.01, Los Angeles, CA)","5520.01 (Los Angeles, CA)","Los Angeles County, CA",California,"1 (Tract 187, San Diego, CA)","187 (San Diego, CA)","San Diego County, CA",California
1,"2 (Tract 1105, Orange, CA)","Orange County, CA",California,"1 (Tract 187, San Diego, CA)","San Diego County, CA",Marine Corps Base Camp Pendleton,private_auto,social,shop,10:35:51,...,\N,visitor,\N,\N,\N,\N,\N,\N,\N,\N
2,"2 (Tract 5736.01, Los Angeles, CA)","Los Angeles County, CA",California,"1 (Tract 187, San Diego, CA)","San Diego County, CA",Marine Corps Base Camp Pendleton,private_auto,social,social,14:31:22,...,two,core,"5 (Tract 5549, Los Angeles, CA)","5549 (Los Angeles, CA)","Los Angeles County, CA",California,\N,\N,\N,\N
3,"2 (Tract 5039.01, Los Angeles, CA)","Los Angeles County, CA",California,"1 (Tract 187, San Diego, CA)","San Diego County, CA",Marine Corps Base Camp Pendleton,carpool,work,social,14:45:47,...,three_plus,core,"5 (Tract 423.13, Orange, CA)","423.13 (Orange, CA)","Orange County, CA",California,"1 (Tract 187, San Diego, CA)","187 (San Diego, CA)","San Diego County, CA",California
4,"4 (Tract 5705.02, Los Angeles, CA)","Los Angeles County, CA",California,"1 (Tract 187, San Diego, CA)","San Diego County, CA",Marine Corps Base Camp Pendleton,carpool,work,home,06:25:00,...,one,core,"4 (Tract 5705.02, Los Angeles, CA)","5705.02 (Los Angeles, CA)","Los Angeles County, CA",California,"1 (Tract 187, San Diego, CA)","187 (San Diego, CA)","San Diego County, CA",California


In [7]:
replica_2.head()

Unnamed: 0,origin_bgrp,origin_cty,origin_st,destination_custom,primary_mode,trip_purpose,previous_trip_purpose,trip_start_time,trip_end_time,trip_duration_minutes,...,trip_taker_available_vehicles,trip_taker_resident_type,trip_taker_home_bgrp,trip_taker_home_trct,trip_taker_home_cty,trip_taker_home_st,trip_taker_work_bgrp,trip_taker_work_trct,trip_taker_work_cty,trip_taker_work_st
0,Outside of region,Outside of region,Outside of region,Ocean Beach,other_travel_mode,lodging,other_activity_type,17:17:22,06:44:40,807,...,\N,visitor,\N,\N,\N,\N,\N,\N,\N,\N
1,Outside of region,Outside of region,Outside of region,Ocean Beach,other_travel_mode,lodging,social,02:18:02,15:43:40,805,...,\N,visitor,\N,\N,\N,\N,\N,\N,\N,\N
2,Outside of region,Outside of region,Outside of region,Pacific Beach,other_travel_mode,home,work,03:48:50,17:02:40,793,...,one,core,"5 (Tract 80.06, San Diego, CA)","80.06 (San Diego, CA)","San Diego County, CA",California,Outside of region,Outside of region,Outside of region,Outside of region
3,Outside of region,Outside of region,Outside of region,San Diego Airport,other_travel_mode,region_departure,social,15:26:21,04:54:05,807,...,\N,visitor,\N,\N,\N,\N,\N,\N,\N,\N
4,Outside of region,Outside of region,Outside of region,Downtown,other_travel_mode,social,social,02:35:01,16:05:33,810,...,two,core,"1 (Tract 186.10, San Diego, CA)","186.10 (San Diego, CA)","San Diego County, CA",California,"3 (Tract 185.13, San Diego, CA)","185.13 (San Diego, CA)","San Diego County, CA",California
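Both previews show the literal string `\N` where a value is unavailable. pandas does not treat `\N` as missing by default, so it is read as ordinary text. The sketch below shows one way it could be parsed as NaN with the `na_values` parameter of `pd.read_csv`. The three-row sample is made up for illustration. Note that this is an optional variation: `groupby` drops NaN keys by default, so counts for fields that contain `\N` would change.

```python
import io
import pandas as pd

# Hypothetical three-row sample mimicking the export; not the real file.
sample_csv = io.StringIO(
    "origin_bgrp,destination_custom,trip_taker_home_st\n"
    '"1 (Tract 187, San Diego, CA)",Alpine,California\n'
    "Outside of region,Downtown,\\N\n"
    '"2 (Tract 117, San Diego, CA)",Alpine,\\N\n'
)

# na_values adds the literal string \N to the markers that are parsed as NaN.
trips_demo = pd.read_csv(sample_csv, na_values=["\\N"])

missing_home_state = int(trips_demo["trip_taker_home_st"].isna().sum())
print(missing_home_state)
```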


# Number of Trips by Primary Mode
The goal of this section is to get the number of trips to each employment center by primary mode. The primary modes include biking, carpool, commercial, on-demand auto, and other travel modes.

In [8]:
# Group the replica dataset by the destination_custom (i.e., the new employment centers) and primary_mode fields. Then aggregate it to get the count of trips in each primary mode.
trips_primary_mode = replica.groupby(['destination_custom', 'primary_mode']).agg({'count'}).reset_index()[[ 'destination_custom', 'primary_mode', 'origin_bgrp']].rename(columns={'origin_bgrp':'trips'})

In [9]:
trips_primary_mode.shape

(809, 3)

In [10]:
trips_primary_mode.head()

Unnamed: 0_level_0,destination_custom,primary_mode,trips
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,count
0,Alpine,biking,75
1,Alpine,carpool,4283
2,Alpine,commercial,1510
3,Alpine,on_demand_auto,316
4,Alpine,other_travel_mode,288
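The two header rows in the output above come from `.agg({'count'})`, which returns two-level column labels. As a sketch of an alternative, `groupby(...).size()` gives a flat three-column table directly. The small frame is hypothetical. One difference to keep in mind: `size` counts every row in a group, while `count` on `origin_bgrp` skips rows where that field is NaN.

```python
import pandas as pd

# Hypothetical sample with the same field names as the replica dataset.
demo = pd.DataFrame({
    "destination_custom": ["Alpine", "Alpine", "Alpine", "Balboa Park"],
    "primary_mode": ["biking", "carpool", "carpool", "biking"],
    "origin_bgrp": ["a", "b", "c", "d"],
})

# size() counts rows per group; reset_index(name=...) names the count column.
trips_flat = (
    demo.groupby(["destination_custom", "primary_mode"])
    .size()
    .reset_index(name="trips")
)

print(trips_flat.columns.tolist())
print(trips_flat)
```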


In [9]:
# Save the outputs in a csv file
trips_primary_mode.to_csv("trips_primary_mode.csv",sep=",")
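By default `to_csv` also writes the row index as an extra first column. The sketch below shows how a table could be saved without it and read back. The temporary folder is only a stand-in for the output location, and the two rows are copied from the preview above.

```python
import os
import tempfile
import pandas as pd

# Two rows copied from the preview above, in a flat three-column layout.
demo = pd.DataFrame({
    "destination_custom": ["Alpine", "Alpine"],
    "primary_mode": ["biking", "carpool"],
    "trips": [75, 4283],
})

# Hypothetical stand-in for the output folder.
out_dir = tempfile.mkdtemp()
out_path = os.path.join(out_dir, "trips_primary_mode.csv")

# index=False leaves the row index out of the file.
demo.to_csv(out_path, index=False)

round_trip = pd.read_csv(out_path)
print(round_trip.columns.tolist())
```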

# Number of Trips by Purpose
The goal of this section is to get the number of trips to each employment center by trip purpose, such as commercial, eat, and home.

In [11]:
# Group the replica dataset by the destination_custom (i.e., the new employment centers) and trip_purpose fields. Then aggregate it to get the count of trips in each trip purpose.
trips_purpose = replica.groupby(['destination_custom','trip_purpose']).agg({'count'}).reset_index()[['destination_custom', 'trip_purpose', 'origin_bgrp']].rename(columns={'origin_bgrp':'trips'})

In [12]:
trips_purpose.shape

(1085, 3)

In [13]:
trips_purpose.head()

Unnamed: 0_level_0,destination_custom,trip_purpose,trips
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,count
0,Alpine,commercial,1523
1,Alpine,eat,3504
2,Alpine,home,4912
3,Alpine,lodging,142
4,Alpine,maintenance,1548
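The same counts can also be laid out with one row per employment center and one column per purpose, which may be easier to scan. This is a sketch using `pd.crosstab` on a hypothetical sample. It is not part of the original workflow.

```python
import pandas as pd

# Hypothetical sample with the same field names as the replica dataset.
demo = pd.DataFrame({
    "destination_custom": ["Alpine", "Alpine", "Alpine", "Downtown"],
    "trip_purpose": ["eat", "eat", "home", "home"],
})

# crosstab counts rows for every (center, purpose) pair; absent pairs become 0.
purpose_wide = pd.crosstab(demo["destination_custom"], demo["trip_purpose"])

print(purpose_wide)
```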


In [13]:
# Save outputs in a csv file
trips_purpose.to_csv("trips_purpose.csv", sep=",")

# Number of Trips by Previous Trip Purpose 
The goal of this section is to get the number of trips to each new employment center by previous trip purpose. 

In [82]:
# Group the replica dataset by the destination_custom (i.e., the new employment centers) and previous_trip_purpose fields. Then aggregate it to get the count of trips for each previous trip purpose.
previous_trips_purpose = replica.groupby(['destination_custom','previous_trip_purpose']).agg({'count'}).reset_index()[['destination_custom','previous_trip_purpose', 'origin_bgrp']].rename(columns={'origin_bgrp':'trips'})

In [83]:
previous_trips_purpose.shape

(1259, 3)

In [84]:
previous_trips_purpose.head()

Unnamed: 0_level_0,destination_custom,previous_trip_purpose,trips
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,count
0,Alpine,\N,32
1,Alpine,commercial,1523
2,Alpine,eat,1970
3,Alpine,home,7289
4,Alpine,lodging,312


In [135]:
# Save the outputs in a csv file
previous_trips_purpose.to_csv("previous_trips_purpose.csv",sep=",")

# Number of Trips by Vehicle Type
The goal of this section is to get the number of trips to each employment center by vehicle type. The vehicle types are heavy_commercial, medium_commercial, and unknown_vehicle_type.

In [14]:
# Group the replica dataset by the destination_custom (i.e., the new employment centers) and vehicle_type fields. Then aggregate it to get the count of trips in each vehicle_type.
trips_vehicle_type = replica.groupby(['destination_custom','vehicle_type']).agg({'count'}).reset_index()[['destination_custom','vehicle_type', 'origin_bgrp']].rename(columns={'origin_bgrp':'trips'})

In [15]:
trips_vehicle_type.shape

(340, 3)

In [16]:
trips_vehicle_type.head()

Unnamed: 0_level_0,destination_custom,vehicle_type,trips
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,count
0,Alpine,heavy_commercial,23
1,Alpine,medium_commercial,1487
2,Alpine,unknown_vehicle_type,21581
3,Balboa Park,heavy_commercial,5
4,Balboa Park,medium_commercial,680


In [137]:
# Save the outputs in a csv file
trips_vehicle_type.to_csv("trips_vehicle_type.csv",sep=",")

# Number of Trips by Origin Land Use
The goal of this section is to get the number of trips to each employment center by the trip's origin land use, taken here from the origin_building_use field. This field has categories such as civic_institutional, education, multi_family, and single family.

In [17]:
# Group the replica dataset by the destination_custom (i.e., the new employment centers) and origin_building_use fields. Then aggregate it to get the count of trips in each trip origin land use category.
trips_origin_building_use = replica.groupby(['destination_custom','origin_building_use']).agg({'count'}).reset_index()[['destination_custom','origin_building_use', 'origin_bgrp']].rename(columns={'origin_bgrp':'trips'})

In [18]:
trips_origin_building_use.shape

(1350, 3)

In [19]:
trips_origin_building_use.head()

Unnamed: 0_level_0,destination_custom,origin_building_use,trips
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,count
0,Alpine,civic_institutional,531
1,Alpine,education,1638
2,Alpine,healthcare,179
3,Alpine,industrial,322
4,Alpine,multi_family,1099


In [141]:
# Save the outputs in a csv file
trips_origin_building_use.to_csv("trips_origin_building_use.csv",sep=",")

# Number of Trips by Destination Land Use
The goal of this section is to get the number of trips to each employment center by the trip's destination land use. The destination_land_use field has categories such as civic_institutional, education, healthcare, and mixed_use.

In [20]:
# Group the replica dataset by the destination_custom (i.e., the new employment centers) and destination_land_use fields. Then aggregate it to get the count of trips in each trip destination land use category.
trips_destination_land_use = replica.groupby(['destination_custom','destination_land_use']).agg({'count'}).reset_index()[['destination_custom','destination_land_use', 'origin_bgrp']].rename(columns={'origin_bgrp':'trips'})

In [21]:
trips_destination_land_use.shape

(1150, 3)

In [22]:
trips_destination_land_use.head()

Unnamed: 0_level_0,destination_custom,destination_land_use,trips
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,count
0,Alpine,civic_institutional,427
1,Alpine,education,659
2,Alpine,healthcare,12
3,Alpine,industrial,107
4,Alpine,mixed_use,1661


In [143]:
# Save the outputs in a csv file
trips_destination_land_use.to_csv("trips_destination_land_use.csv",sep=",")

# Number of Trips by Destination Building Use
The goal of this section is to get the number of trips to each employment center by the trip's destination building use. The destination_building_use field has categories such as civic_institutional, education, healthcare, and industrial.

In [23]:
# Group the replica dataset by the destination_custom (i.e., the new employment centers) and destination_building_use fields. Then aggregate it to get the count of trips in each trip destination building use category.
trips_destination_building_use = replica.groupby(['destination_custom','destination_building_use']).agg({'count'}).reset_index()[['destination_custom','destination_building_use', 'origin_bgrp']].rename(columns={'origin_bgrp':'trips'})

In [24]:
trips_destination_building_use.shape

(1151, 3)

In [25]:
trips_destination_building_use.head()

Unnamed: 0_level_0,destination_custom,destination_building_use,trips
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,count
0,Alpine,civic_institutional,591
1,Alpine,education,695
2,Alpine,healthcare,168
3,Alpine,industrial,119
4,Alpine,multi_family,773


In [145]:
# Save the outputs in a csv file
trips_destination_building_use.to_csv("trips_destination_building_use.csv",sep=",")
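The categorical sections above all repeat one pattern: group by `destination_custom` and one other field, then count the rows. As a sketch, that pattern could be wrapped in a small helper and applied to a list of fields. The function name `count_trips` and the sample frame are hypothetical, and the helper uses `size()`, which counts all rows in a group.

```python
import pandas as pd

def count_trips(df, field, center_col="destination_custom"):
    """Count rows per (center, field) pair and return a flat three-column table."""
    return df.groupby([center_col, field]).size().reset_index(name="trips")

# Hypothetical sample with field names taken from the replica dataset.
demo = pd.DataFrame({
    "destination_custom": ["Alpine", "Alpine", "Alpine", "Balboa Park"],
    "primary_mode": ["carpool", "carpool", "biking", "biking"],
    "trip_purpose": ["eat", "home", "home", "eat"],
})

# One table per field, keyed by field name.
fields = ["primary_mode", "trip_purpose"]
tables = {field: count_trips(demo, field) for field in fields}

for field, table in tables.items():
    print(field)
    print(table)
```

Each table could then be written out with a file name built from the field, for example `f"trips_{field}.csv"`.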

# Number of Trips by Duration
The goal of this section is to get the number of trips to each employment center by trip duration. The trip_duration_minutes field was used to create a new field called trip_duration. This field has five categories: 0-5_min, 5-15_min, 15-30_min, 30-60_min, and 60+_min.

In [26]:
# Group the replica dataset by the destination_custom (i.e., the new employment centers) and trip_duration_minutes fields. Then aggregate it to get the count of trips for each trip_duration_minutes value.
trips_duration = replica.groupby(['destination_custom','trip_duration_minutes']).agg({'count'}).reset_index()[['destination_custom','trip_duration_minutes', 'origin_bgrp']].rename(columns={'origin_bgrp':'trips'})

In [27]:
trips_duration.shape

(36020, 3)

In [28]:
trips_duration.head()

Unnamed: 0_level_0,destination_custom,trip_duration_minutes,trips
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,count
0,Alpine,0,988
1,Alpine,1,1446
2,Alpine,2,1547
3,Alpine,3,1466
4,Alpine,4,1138


In [29]:
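# Recode 0-minute trips as 1 minute so that they fall inside the first bin; pd.cut excludes the left edge (0) by default.
trips_duration['trip_duration_minutes'] = trips_duration['trip_duration_minutes'].replace(0,1)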

In [30]:
# The trip_duration_minutes field was used to create a new field called trip_duration. This field has five categories: 0-5_min, 5-15_min, 15-30_min, 30-60_min, and 60+_min.
trips_duration['trip_duration'] = pd.cut(x=trips_duration['trip_duration_minutes'], bins=[0,5,15,30,60,np.inf], labels=['0-5_min', '5-15_min', '15-30_min','30-60_min','60+_min'], ordered=True)

In [31]:
trips_duration.head()

Unnamed: 0_level_0,destination_custom,trip_duration_minutes,trips,trip_duration
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,count,Unnamed: 4_level_1
0,Alpine,1,988,0-5_min
1,Alpine,1,1446,0-5_min
2,Alpine,2,1547,0-5_min
3,Alpine,3,1466,0-5_min
4,Alpine,4,1138,0-5_min
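With its default settings `pd.cut` builds right-closed bins such as (0, 5], so a duration of exactly 0 falls outside the first bin. That is why 0 was replaced with 1 before binning. A sketch of an alternative that keeps the original values is to pass `include_lowest=True`, which makes the first bin include its left edge. The sample durations below are made up to exercise the bin edges.

```python
import numpy as np
import pandas as pd

# Hypothetical durations chosen to sit on and around the bin edges.
minutes = pd.Series([0, 5, 6, 15, 30, 31, 60, 61, 807])
labels = ["0-5_min", "5-15_min", "15-30_min", "30-60_min", "60+_min"]

# include_lowest=True puts 0 in the first bin without recoding it.
duration_bin = pd.cut(
    minutes,
    bins=[0, 5, 15, 30, 60, np.inf],
    labels=labels,
    include_lowest=True,
)

print(pd.DataFrame({"minutes": minutes, "trip_duration": duration_bin}))
```

Because the bins are right-closed, a trip of exactly 5 minutes is labeled 0-5_min and a trip of exactly 60 minutes is labeled 30-60_min.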


In [32]:
trips_duration = trips_duration.drop(columns='trip_duration_minutes')

In [33]:
trips_duration = trips_duration.groupby(['destination_custom','trip_duration']).agg({'sum'}) 

In [36]:
trips_duration.head(10)

Unnamed: 0_level_0,Unnamed: 1_level_0,trips
Unnamed: 0_level_1,Unnamed: 1_level_1,count
Unnamed: 0_level_2,Unnamed: 1_level_2,sum
destination_custom,trip_duration,Unnamed: 2_level_3
Alpine,0-5_min,7562
Alpine,5-15_min,6021
Alpine,15-30_min,5303
Alpine,30-60_min,3409
Alpine,60+_min,796
Balboa Park,0-5_min,5879
Balboa Park,5-15_min,9181
Balboa Park,15-30_min,7339
Balboa Park,30-60_min,2665
Balboa Park,60+_min,1395
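The three header levels in the output above result from applying `.agg({'sum'})` to a table that already has two-level column labels. As a sketch, assuming a flat table with a single `trips` column, selecting that column and calling `sum()` gives a flat result. The sample values are partly made up. Passing `observed=False` explicitly keeps category combinations that have no rows (as zeros) and avoids relying on the pandas default for categorical keys.

```python
import pandas as pd

# Hypothetical flat table: one row per (center, duration value), already binned.
demo = pd.DataFrame({
    "destination_custom": ["Alpine", "Alpine", "Alpine", "Balboa Park"],
    "trip_duration": pd.Categorical(
        ["0-5_min", "0-5_min", "5-15_min", "0-5_min"],
        categories=["0-5_min", "5-15_min"],
        ordered=True,
    ),
    "trips": [988, 1446, 300, 50],
})

# Sum the counts within each (center, duration bin) pair.
summed = (
    demo.groupby(["destination_custom", "trip_duration"], observed=False)["trips"]
    .sum()
    .reset_index()
)

print(summed)
```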


In [26]:
# Save the outputs in a csv file
trips_duration.to_csv("trips_duration.csv", sep = ",")

# Number of Trips by Distance
The goal of this section is to get the number of trips to each employment center by trip distance in miles. The trip_distance_miles field was used to create a new field called trip_distance. This field has six categories: less_than_1_mile, 1-2_miles, 2-5_miles, 5-10_miles, 10-25_miles, and 25+_miles.

In [37]:
# Group the replica dataset by the destination_custom (i.e., the new employment centers) and trip_distance_miles fields. Then aggregate it to get the count of trips for each trip_distance_miles value.
trips_distance = replica.groupby(['destination_custom','trip_distance_miles']).agg({'count'}).reset_index()[['destination_custom','trip_distance_miles', 'origin_bgrp']].rename(columns={'origin_bgrp':'trips'})

In [38]:
trips_distance.shape

(137596, 3)

In [39]:
trips_distance.head()

Unnamed: 0_level_0,destination_custom,trip_distance_miles,trips
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,count
0,Alpine,0.0,508
1,Alpine,0.1,602
2,Alpine,0.2,413
3,Alpine,0.3,419
4,Alpine,0.4,450


In [40]:
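# Recode 0.0-mile trips as 0.1 miles so that they fall inside the first bin; pd.cut excludes the left edge (0) by default.
trips_distance['trip_distance_miles'] = trips_distance['trip_distance_miles'].replace(0.0,0.1)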

In [41]:
# The trip_distance_miles field was used to create a new field called trip_distance. This field has six categories: less_than_1_mile, 1-2_miles, 2-5_miles, 5-10_miles, 10-25_miles, and 25+_miles.
trips_distance['trip_distance'] = pd.cut(x=trips_distance['trip_distance_miles'], bins=[0,1,2,5,10,25,np.inf], labels=['less_than_1_mile','1-2_miles','2-5_miles','5-10_miles','10-25_miles','25+_miles'], ordered=True)

In [42]:
trips_distance.head()

Unnamed: 0_level_0,destination_custom,trip_distance_miles,trips,trip_distance
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,count,Unnamed: 4_level_1
0,Alpine,0.1,508,less_than_1_mile
1,Alpine,0.1,602,less_than_1_mile
2,Alpine,0.2,413,less_than_1_mile
3,Alpine,0.3,419,less_than_1_mile
4,Alpine,0.4,450,less_than_1_mile
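One detail of the binning is worth spelling out: with the default right-closed bins, a trip of exactly 1.0 mile is labeled less_than_1_mile. The sketch below compares that default with `right=False`, which makes the bins left-closed, so that 1.0 falls into 1-2_miles and 0.0 falls into the first bin without recoding. The sample distances are made up, and which convention is preferable is a judgment call for the analyst.

```python
import numpy as np
import pandas as pd

bins = [0, 1, 2, 5, 10, 25, np.inf]
labels = ["less_than_1_mile", "1-2_miles", "2-5_miles",
          "5-10_miles", "10-25_miles", "25+_miles"]

# Hypothetical distances chosen to sit on the bin edges.
miles = pd.Series([0.0, 0.9, 1.0, 2.0, 5.0, 10.0, 25.0, 40.0])

# Default (right-closed) bins: exactly 1.0 mile lands in the first category.
default_label = pd.cut(pd.Series([1.0]), bins=bins, labels=labels).astype(str).iloc[0]

# Left-closed bins: [0, 1), [1, 2), ... so 0.0 and 1.0 need no special handling.
left_closed = pd.cut(miles, bins=bins, labels=labels, right=False)

print(default_label)
print(pd.DataFrame({"miles": miles, "trip_distance": left_closed}))
```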


In [43]:
trips_distance = trips_distance.drop(columns='trip_distance_miles')

In [44]:
trips_distance = trips_distance.groupby(['destination_custom','trip_distance']).agg({'sum'})

In [45]:
trips_distance.head(12)

Unnamed: 0_level_0,Unnamed: 1_level_0,trips
Unnamed: 0_level_1,Unnamed: 1_level_1,count
Unnamed: 0_level_2,Unnamed: 1_level_2,sum
destination_custom,trip_distance,Unnamed: 2_level_3
Alpine,less_than_1_mile,4687
Alpine,1-2_miles,2772
Alpine,2-5_miles,2890
Alpine,5-10_miles,2143
Alpine,10-25_miles,6301
Alpine,25+_miles,4298
Balboa Park,less_than_1_mile,4726
Balboa Park,1-2_miles,1735
Balboa Park,2-5_miles,5469
Balboa Park,5-10_miles,5093


In [26]:
# Save the outputs in a csv file 
trips_distance.to_csv("trips_distance.csv",sep=",")

# Number of Trips by Block Groups
The goal of this section is to get the number of trips to each employment center from each block group. 

In [46]:
replica_2.shape

(7557621, 37)

In [47]:
# Group the replica_2 dataset by the origin_bgrp and destination_custom (i.e., the new employment centers) fields. Then aggregate it to get the count of trips from each block group to each employment center.
trips_bgrp_to_EC = replica_2.groupby(['origin_bgrp','destination_custom']).agg({'count'}).reset_index()[['origin_bgrp','destination_custom','origin_cty']].rename(columns={'origin_cty':'trips'})

In [48]:
trips_bgrp_to_EC.shape

(218271, 3)

In [49]:
trips_bgrp_to_EC.head()

Unnamed: 0_level_0,origin_bgrp,destination_custom,trips
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,count
0,"0 (Tract 5766.02, Los Angeles, CA)",Oceanside Civic Center,1
1,"0 (Tract 5766.02, Los Angeles, CA)",Otay Mesa East,1
2,"0 (Tract 9901, San Diego, CA)",Carlsbad Village,1
3,"0 (Tract 9901, San Diego, CA)",Chula Vista Northwest,1
4,"0 (Tract 9901, San Diego, CA)",Chula Vista Otay,1


In [50]:
trips_bgrp_to_EC = trips_bgrp_to_EC.sort_values(by=['destination_custom'])

In [51]:
trips_bgrp_to_EC.head()

Unnamed: 0_level_0,origin_bgrp,destination_custom,trips
Unnamed: 0_level_1,Unnamed: 1_level_1,Unnamed: 2_level_1,count
14228,"1 (Tract 153.01, San Diego, CA)",Alpine,16
44281,"1 (Tract 32.09, San Diego, CA)",Alpine,8
86384,"2 (Tract 117, San Diego, CA)",Alpine,2
13611,"1 (Tract 149.02, San Diego, CA)",Alpine,82
86351,"2 (Tract 117, Imperial, CA)",Alpine,6
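The `origin_bgrp` labels combine a block group number, a tract, a county, and a state in one string, for example "1 (Tract 153.01, San Diego, CA)". If those parts are needed separately (for instance, to tell tract 117 in San Diego from tract 117 in Imperial), they could be split with a regular expression. This is a sketch: the pattern is an assumption based only on the labels visible in the previews, and labels that do not match it, such as "Outside of region", come back as NaN.

```python
import pandas as pd

# Labels copied from the previews above.
origin_bgrp = pd.Series([
    "1 (Tract 153.01, San Diego, CA)",
    "2 (Tract 117, Imperial, CA)",
    "Outside of region",
])

# Assumed label layout: "<bgrp> (Tract <tract>, <county>, <state>)".
pattern = (
    r"^(?P<bgrp>\d+) \(Tract (?P<tract>[\d.]+), "
    r"(?P<county>[^,]+), (?P<state>[A-Z]{2})\)$"
)

# str.extract returns one column per named group; non-matching rows are NaN.
parsed = origin_bgrp.str.extract(pattern)

print(parsed)
```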


In [52]:
trips_bgrp_to_EC.shape

(218271, 3)
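A quick consistency check for a table like this is that the trip counts add up to the number of rows in the input, provided no grouping key is missing. The sketch below shows the check on a hypothetical five-row sample. On the real data the total would be compared with the 7,557,621 rows of `replica_2`.

```python
import pandas as pd

# Hypothetical sample with the same field names as replica_2.
demo = pd.DataFrame({
    "origin_bgrp": ["a", "a", "b", "b", "b"],
    "destination_custom": ["Alpine", "Alpine", "Alpine", "Downtown", "Downtown"],
})

counts = (
    demo.groupby(["origin_bgrp", "destination_custom"])
    .size()
    .reset_index(name="trips")
)

# Every input row should be counted exactly once.
total_trips = int(counts["trips"].sum())
print(total_trips, len(demo))
assert total_trips == len(demo)
```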

In [66]:
# Save the outputs in a csv file
trips_bgrp_to_EC.to_csv("trips_bgrp_to_EC.csv",sep=",")