# 2023 PROJECT: Del Giudice Francesca, Pantò Martina, Righetti Gaia

# 2023 Project

You have to work on the [ZTBus: A Large Dataset of Time-Resolved City Bus Driving Missions](https://www.research-collection.ethz.ch/handle/20.500.11850/626723) repository.

It contains:
*  [metaData.csv](https://www.research-collection.ethz.ch/bitstream/handle/20.500.11850/626723/metaData.csv?sequence=1&isAllowed=y), shortly *trips*
*  several other files containing detailed data on some bus parameters, whose name is in the *trips* file. Those files can be downloaded as a [zip file](https://www.research-collection.ethz.ch/bitstream/handle/20.500.11850/626723/ZTBus_compressed.zip?sequence=3&isAllowed=y). Let us call those datasets the *details* datasets.

### Notes

1.    It is mandatory to use GitHub for developing the project.
1.    The project must be a jupyter notebook.
1.    There is no restriction on the libraries that can be used, nor on the Python version.
1.    All questions on the project **must** be asked in a public channel on [Zulip](https://focs.zulipchat.com).
1.    At most 3 students can be in each group. You must create the groups by yourself. You can use the Zulip channel to create the groups.
1.    You do not have to send me the project *before* the discussion.

In [3]:
import pandas as pd

In [4]:
trips = pd.read_csv("metaData.csv", na_values="-", parse_dates=["startTime_iso"])
trips.head()

Unnamed: 0,name,busNumber,startTime_iso,startTime_unix,endTime_iso,endTime_unix,drivenDistance,busRoute,energyConsumption,itcs_numberOfPassengers_mean,itcs_numberOfPassengers_min,itcs_numberOfPassengers_max,status_gridIsAvailable_mean,temperature_ambient_mean,temperature_ambient_min,temperature_ambient_max
0,B183_2019-04-30_03-18-56_2019-04-30_08-44-20,183,2019-04-30 03:18:56+00:00,1556594336,2019-04-30T08:44:20Z,1556613860,77213.87,,478585200.0,5.53886,0,20,0.74064,282.378,281.15,293.15
1,B183_2019-04-30_13-22-07_2019-04-30_17-54-02,183,2019-04-30 13:22:07+00:00,1556630527,2019-04-30T17:54:02Z,1556646842,59029.6,31.0,402258500.0,33.11458,4,74,0.855234,287.5443,285.15,293.15
2,B183_2019-05-01_05-58-51_2019-05-01_22-32-30,183,2019-05-01 05:58:51+00:00,1556690331,2019-05-01T22:32:30Z,1556749950,240900.4,33.0,1445733000.0,19.68914,0,55,0.77786,288.749,280.15,294.15
3,B183_2019-05-03_02-50-21_2019-05-03_05-53-20,183,2019-05-03 02:50:21+00:00,1556851821,2019-05-03T05:53:20Z,1556862800,42565.48,,281986700.0,1.685185,0,8,0.767122,282.4129,281.15,292.15
4,B183_2019-05-03_15-41-57_2019-05-03_23-06-24,183,2019-05-03 15:41:57+00:00,1556898117,2019-05-03T23:06:24Z,1556924784,125277.2,72.0,620725800.0,23.75357,1,67,0.907342,284.7325,282.15,287.15


In [5]:
trips.shape

(1409, 16)

In [6]:
trips.columns

Index(['name', 'busNumber', 'startTime_iso', 'startTime_unix', 'endTime_iso',
       'endTime_unix', 'drivenDistance', 'busRoute', 'energyConsumption',
       'itcs_numberOfPassengers_mean', 'itcs_numberOfPassengers_min',
       'itcs_numberOfPassengers_max', 'status_gridIsAvailable_mean',
       'temperature_ambient_mean', 'temperature_ambient_min',
       'temperature_ambient_max'],
      dtype='object')

### 1. Extract all trips with `busRoute` 83

In [7]:
trips[trips["busRoute"]=="83"]

Unnamed: 0,name,busNumber,startTime_iso,startTime_unix,endTime_iso,endTime_unix,drivenDistance,busRoute,energyConsumption,itcs_numberOfPassengers_mean,itcs_numberOfPassengers_min,itcs_numberOfPassengers_max,status_gridIsAvailable_mean,temperature_ambient_mean,temperature_ambient_min,temperature_ambient_max
154,B183_2020-03-03_04-42-38_2020-03-03_19-44-51,183,2020-03-03 04:42:38+00:00,1583210558,2020-03-03T19:44:51Z,1583264691,225047.90,83,1.544278e+09,23.47531,0,118,0.472180,280.5450,279.15,289.1500
155,B183_2020-03-06_04-53-23_2020-03-06_19-44-42,183,2020-03-06 04:53:23+00:00,1583470403,2020-03-06T19:44:42Z,1583523882,224512.30,83,1.631816e+09,17.41578,0,69,0.451028,279.8850,278.15,289.1500
157,B183_2020-03-09_14-16-13_2020-03-09_19-34-17,183,2020-03-09 14:16:13+00:00,1583763373,2020-03-09T19:34:17Z,1583782457,77824.36,83,5.406013e+08,23.18182,0,74,0.460099,281.0489,279.15,291.1500
158,B183_2020-03-10_04-50-03_2020-03-10_19-51-25,183,2020-03-10 04:50:03+00:00,1583815803,2020-03-10T19:51:25Z,1583869885,225095.80,83,1.692171e+09,20.96410,0,86,0.475233,279.8363,279.15,291.1500
159,B183_2020-03-12_04-56-41_2020-03-12_19-44-57,183,2020-03-12 04:56:41+00:00,1583989001,2020-03-12T19:44:57Z,1584042297,224181.20,83,1.145860e+09,17.21235,0,80,0.340882,287.3445,282.15,291.1500
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1399,B208_2022-11-30_04-47-53_2022-11-30_19-50-22,208,2022-11-30 04:47:53+00:00,1669783673,2022-11-30T19:50:22Z,1669837822,223165.00,83,1.560888e+09,27.89066,2,100,0.456196,280.6948,279.15,293.1500
1400,B208_2022-12-01_05-19-41_2022-12-01_18-20-57,208,2022-12-01 05:19:41+00:00,1669871981,2022-12-01T18:20:57Z,1669918857,190196.00,83,1.418847e+09,26.03927,0,96,0.450413,279.7655,279.15,292.1500
1401,B208_2022-12-02_04-47-48_2022-12-02_19-40-01,208,2022-12-02 04:47:48+00:00,1669956468,2022-12-02T19:40:01Z,1670010001,224473.40,83,1.611150e+09,24.80384,2,91,0.438693,279.7888,279.15,291.1500
1405,B208_2022-12-07_05-13-02_2022-12-07_19-19-53,208,2022-12-07 05:13:02+00:00,1670389982,2022-12-07T19:19:53Z,1670440793,210041.60,83,1.536697e+09,28.78539,0,115,0.434858,279.5283,278.15,292.6655


### 2. Extract all trips where `busRoute` is not a number

In [9]:
clean_trips = trips[trips['busRoute'].notnull()]
clean_trips.head()

Unnamed: 0,name,busNumber,startTime_iso,startTime_unix,endTime_iso,endTime_unix,drivenDistance,busRoute,energyConsumption,itcs_numberOfPassengers_mean,itcs_numberOfPassengers_min,itcs_numberOfPassengers_max,status_gridIsAvailable_mean,temperature_ambient_mean,temperature_ambient_min,temperature_ambient_max
1,B183_2019-04-30_13-22-07_2019-04-30_17-54-02,183,2019-04-30 13:22:07+00:00,1556630527,2019-04-30T17:54:02Z,1556646842,59029.6,31,402258500.0,33.11458,4,74,0.855234,287.5443,285.15,293.15
2,B183_2019-05-01_05-58-51_2019-05-01_22-32-30,183,2019-05-01 05:58:51+00:00,1556690331,2019-05-01T22:32:30Z,1556749950,240900.4,33,1445733000.0,19.68914,0,55,0.77786,288.749,280.15,294.15
4,B183_2019-05-03_15-41-57_2019-05-03_23-06-24,183,2019-05-03 15:41:57+00:00,1556898117,2019-05-03T23:06:24Z,1556924784,125277.2,72,620725800.0,23.75357,1,67,0.907342,284.7325,282.15,287.15
5,B183_2019-05-05_07-41-02_2019-05-05_23-20-07,183,2019-05-05 07:41:02+00:00,1557042062,2019-05-05T23:20:07Z,1557098407,283206.9,46,1661700000.0,16.49925,0,74,0.997746,280.1668,277.15,291.15
6,B183_2019-05-06_03-10-43_2019-05-06_19-20-34,183,2019-05-06 03:10:43+00:00,1557112243,2019-05-06T19:20:34Z,1557170434,224131.6,31,1388008000.0,28.03509,0,83,0.87103,282.2435,277.15,291.15


In [10]:
clean_trips[clean_trips['busRoute'].str.contains('\D')]

Unnamed: 0,name,busNumber,startTime_iso,startTime_unix,endTime_iso,endTime_unix,drivenDistance,busRoute,energyConsumption,itcs_numberOfPassengers_mean,itcs_numberOfPassengers_min,itcs_numberOfPassengers_max,status_gridIsAvailable_mean,temperature_ambient_mean,temperature_ambient_min,temperature_ambient_max
533,B183_2021-12-18_23-37-00_2021-12-19_03-38-35,183,2021-12-18 23:37:00+00:00,1639870620,2021-12-19T03:38:35Z,1639885115,76216.06,N4,481350300.0,9.198582,0,37,0.491653,276.8632,275.15,288.15
553,B183_2022-01-07_23-40-43_2022-01-08_03-31-21,183,2022-01-07 23:40:43+00:00,1641598843,2022-01-08T03:31:21Z,1641612681,68557.06,N2,453625100.0,4.626984,0,13,0.427488,276.9673,275.15,287.15
554,B183_2022-01-08_23-40-17_2022-01-09_03-35-32,183,2022-01-08 23:40:17+00:00,1641685217,2022-01-09T03:35:32Z,1641699332,67962.92,N2,475383300.0,7.495495,0,26,0.515514,278.5645,277.15,288.15
561,B183_2022-01-15_23-41-46_2022-01-16_03-40-23,183,2022-01-15 23:41:46+00:00,1642290106,2022-01-16T03:40:23Z,1642304423,77156.70,N1,525168300.0,6.512500,0,32,0.473809,274.9937,273.15,286.15
568,B183_2022-01-21_23-35-40_2022-01-22_03-26-24,183,2022-01-21 23:35:40+00:00,1642808140,2022-01-22T03:26:24Z,1642821984,71917.75,N2,455476000.0,5.357143,0,23,0.493608,275.3073,274.15,281.15
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1373,B208_2022-10-21_22-38-32_2022-10-22_02-42-21,208,2022-10-21 22:38:32+00:00,1666391912,2022-10-22T02:42:21Z,1666406541,78567.16,N1,434776600.0,16.333330,0,45,0.431852,289.2550,288.15,296.15
1374,B208_2022-10-22_22-34-45_2022-10-23_02-29-59,208,2022-10-22 22:34:45+00:00,1666478085,2022-10-23T02:29:59Z,1666492199,73427.97,N2,399773700.0,17.710530,0,57,0.443358,287.3486,285.15,295.15
1394,B208_2022-11-25_23-35-16_2022-11-26_03-30-39,208,2022-11-25 23:35:16+00:00,1669419316,2022-11-26T03:30:39Z,1669433439,72911.26,N2,447553400.0,11.216670,1,32,0.465024,281.3884,280.15,293.15
1407,B208_2022-12-09_23-55-12_2022-12-10_03-24-28,208,2022-12-09 23:55:12+00:00,1670630112,2022-12-10T03:24:28Z,1670642668,59548.57,N1,451916500.0,20.105260,0,74,0.495739,279.4540,277.15,291.15


### 3. For each (busNumber, busRoute) pair, determine the number of trips

In [11]:
trips.groupby(['busNumber','busRoute']).size() 

busNumber  busRoute
183        31           12
           32           12
           33          130
           46          104
           72          114
           83          441
           N1           10
           N2           19
           N4           11
208        31            5
           32           14
           33           25
           46           19
           72           44
           83          405
           N1            6
           N2           20
           N4            7
dtype: int64

### 4. For each trip, compute the ratio between the energy consumption and the average number of passengers

In [12]:
trips["ratio"]=trips['energyConsumption']/trips['itcs_numberOfPassengers_mean']
trips["ratio"]

0       8.640500e+07
1       1.214747e+07
2       7.342794e+07
3       1.673328e+08
4       2.613190e+07
            ...     
1404    1.070215e+07
1405    5.338462e+07
1406    4.738047e+07
1407    2.247753e+07
1408    3.345769e+07
Name: ratio, Length: 1409, dtype: float64

### 5. For each station (`itcs_stopName`), determine the average number of passengers.

In [13]:
import zipfile as zp

In [15]:
archive = zp.ZipFile("ZTBus_compressed.zip",mode='r') 
details=[]

In [20]:
for i in archive.namelist():
    if i.endswith('.csv'):
        df=pd.read_csv(archive.open(i), usecols=["itcs_stopName", "itcs_numberOfPassengers", "gnss_altitude", "status_haltBrakeIsActive", "status_parkBrakeIsActive", "temperature_ambient","odometry_vehicleSpeed"],
                   	na_values="-", low_memory=False)
        df["name"]=i
    details.append(df)

In [23]:
complete_details= pd.concat(details)
clean_complete_details = complete_details.dropna(subset=["itcs_stopName"])

MemoryError: Unable to allocate 743. MiB for an array with shape (2, 48674462) and data type float64

In [24]:
clean_complete_details .groupby("itcs_stopName")["itcs_numberOfPassengers"].mean()  

NameError: name 'clean_complete_details' is not defined

### 6. For each station, determine the buses that have stopped there at least once.

In [25]:
import re

def find_bus(row):
    m=re.match("B(?P<busNumber>\d*).*", row)
    if m:
            return int(m.group("busNumber"))
    else:
            return -1

In [26]:
clean_complete_details ["busNumber"] = clean_complete_details ["name"].apply(find_bus)

NameError: name 'clean_complete_details' is not defined

In [27]:
clean_complete_details .groupby(['itcs_stopName','busNumber']).size()[clean_complete_details .groupby(['itcs_stopName','busNumber']).size()>=1]

NameError: name 'clean_complete_details' is not defined

### 7. For each station, determine the buses that have stopped there at least ten times.

In [28]:
clean_complete_details .groupby(['itcs_stopName','busNumber']).size()[clean_complete_details .groupby(['itcs_stopName','busNumber']).size()>=10]

NameError: name 'clean_complete_details' is not defined

### 9. For each (route, bus) pair, compute the ratio between the overall energy consumption and the overall driven distance. 

In [29]:
sum_ec=trips.groupby(["busRoute","busNumber"])["energyConsumption"].sum()
sum_dd=trips.groupby(["busRoute","busNumber"])["drivenDistance"].sum()
energy_ratio=sum_ec/sum_dd
energy_ratio

busRoute  busNumber
31        183          6086.891785
          208          5290.862713
32        183          6174.585331
          208          5491.247671
33        183          5970.747507
          208          5639.244537
46        183          5631.035959
          208          5583.110707
72        183          5899.029643
          208          5410.318878
83        183          5844.977154
          208          5819.638891
N1        183          5980.920966
          208          5640.061883
N2        183          5701.850332
          208          5405.152963
N4        183          6154.986937
          208          6067.190112
dtype: float64

### 10. Starting from the results of the previous point, for each route compute the buses with max and min energy ratio, and save the difference between these ratios in a dataframe.

In [30]:
sum_ec_df=trips.groupby(["busRoute", "busNumber"], as_index = False)["energyConsumption"].sum()
sum_dd_df=trips.groupby(["busRoute", "busNumber"], as_index = False)["drivenDistance"].sum()

In [31]:
merge_ec_dd=pd.merge(sum_ec_df, sum_dd_df, on=["busRoute","busNumber"])
merge_ec_dd

Unnamed: 0,busRoute,busNumber,energyConsumption,drivenDistance
0,31,183,13877820000.0,2279952.0
1,31,208,5535865000.0,1046306.68
2,32,183,13933200000.0,2256540.1
3,32,208,12833400000.0,2337064.21
4,33,183,176199600000.0,29510481.41
5,33,208,34748420000.0,6161892.39
6,46,183,102623700000.0,18224657.94
7,46,208,23373660000.0,4186493.7
8,72,183,139436800000.0,23637238.45
9,72,208,51003820000.0,9427138.15


In [32]:
merge_ec_dd["energyRatio"]=merge_ec_dd["energyConsumption"]/merge_ec_dd["drivenDistance"]
merge_ec_dd

Unnamed: 0,busRoute,busNumber,energyConsumption,drivenDistance,energyRatio
0,31,183,13877820000.0,2279952.0,6086.891785
1,31,208,5535865000.0,1046306.68,5290.862713
2,32,183,13933200000.0,2256540.1,6174.585331
3,32,208,12833400000.0,2337064.21,5491.247671
4,33,183,176199600000.0,29510481.41,5970.747507
5,33,208,34748420000.0,6161892.39,5639.244537
6,46,183,102623700000.0,18224657.94,5631.035959
7,46,208,23373660000.0,4186493.7,5583.110707
8,72,183,139436800000.0,23637238.45,5899.029643
9,72,208,51003820000.0,9427138.15,5410.318878


In [33]:
merge_min=merge_ec_dd.groupby("busRoute", as_index=False)[["busNumber","energyRatio"]].min()
merge_max=merge_ec_dd.groupby("busRoute", as_index=False)[["busNumber","energyRatio"]].max()

In [34]:
merge_tot=pd.merge(merge_min, merge_max, on="busRoute" , suffixes=["_min", "_max"])
merge_tot

Unnamed: 0,busRoute,busNumber_min,energyRatio_min,busNumber_max,energyRatio_max
0,31,183,5290.862713,208,6086.891785
1,32,183,5491.247671,208,6174.585331
2,33,183,5639.244537,208,5970.747507
3,46,183,5583.110707,208,5631.035959
4,72,183,5410.318878,208,5899.029643
5,83,183,5819.638891,208,5844.977154
6,N1,183,5640.061883,208,5980.920966
7,N2,183,5405.152963,208,5701.850332
8,N4,183,6067.190112,208,6154.986937


In [35]:
merge_tot["difference_er"]=merge_tot["energyRatio_max"]-merge_tot["energyRatio_min"]
merge_tot

Unnamed: 0,busRoute,busNumber_min,energyRatio_min,busNumber_max,energyRatio_max,difference_er
0,31,183,5290.862713,208,6086.891785,796.029072
1,32,183,5491.247671,208,6174.585331,683.33766
2,33,183,5639.244537,208,5970.747507,331.502969
3,46,183,5583.110707,208,5631.035959,47.925252
4,72,183,5410.318878,208,5899.029643,488.710765
5,83,183,5819.638891,208,5844.977154,25.338263
6,N1,183,5640.061883,208,5980.920966,340.859083
7,N2,183,5405.152963,208,5701.850332,296.697369
8,N4,183,6067.190112,208,6154.986937,87.796825


### 11. Find the bus maximizing the difference computed in the previous point.

In [36]:
merge_tot.loc[merge_tot["difference_er"].idxmax()][["busRoute","difference_er"]]

busRoute                 31
difference_er    796.029072
Name: 0, dtype: object

### 12. Extract the rows of the details such that the `gnss_altitude` differs from the value in the preceding row. Store also the difference in the variable `altitude_variation`.

In [38]:
clean_complete_details_2= complete_details.dropna(subset = ["gnss_altitude"])

NameError: name 'complete_details' is not defined

In [39]:
clean_complete_details_2["gnss_altitude"].head(10)

NameError: name 'clean_complete_details_2' is not defined

In [40]:
clean_complete_details_2["altitude_variation"]=clean_complete_details_2['gnss_altitude'].diff()

NameError: name 'clean_complete_details_2' is not defined

In [41]:
clean_complete_details_2[clean_complete_details_2["altitude_variation"]!=0]

NameError: name 'clean_complete_details_2' is not defined

### 13. For each details dataset, compute the sum of the absolute value (i.e. the sign is not considered) of `altitude_variation`.

In [44]:
clean_complete_details_2["altitude_variation_abs"]=clean_complete_details_2["altitude_variation"].abs()

NameError: name 'clean_complete_details_2' is not defined

In [None]:
clean_complete_details _2.groupby("name")["altitude_variation_abs"].sum()

### 14.  For each month of the year, compute the average ambient temperature

In [45]:
trips.dtypes

name                                         object
busNumber                                     int64
startTime_iso                   datetime64[ns, UTC]
startTime_unix                                int64
endTime_iso                                  object
endTime_unix                                  int64
drivenDistance                              float64
busRoute                                     object
energyConsumption                           float64
itcs_numberOfPassengers_mean                float64
itcs_numberOfPassengers_min                   int64
itcs_numberOfPassengers_max                   int64
status_gridIsAvailable_mean                 float64
temperature_ambient_mean                    float64
temperature_ambient_min                     float64
temperature_ambient_max                     float64
ratio                                       float64
dtype: object

In [46]:
trips['month']=trips['startTime_iso'].dt.month
trips

Unnamed: 0,name,busNumber,startTime_iso,startTime_unix,endTime_iso,endTime_unix,drivenDistance,busRoute,energyConsumption,itcs_numberOfPassengers_mean,itcs_numberOfPassengers_min,itcs_numberOfPassengers_max,status_gridIsAvailable_mean,temperature_ambient_mean,temperature_ambient_min,temperature_ambient_max,ratio,month
0,B183_2019-04-30_03-18-56_2019-04-30_08-44-20,183,2019-04-30 03:18:56+00:00,1556594336,2019-04-30T08:44:20Z,1556613860,77213.87,,4.785852e+08,5.538860,0,20,0.740640,282.3780,281.15,293.1500,8.640500e+07,4
1,B183_2019-04-30_13-22-07_2019-04-30_17-54-02,183,2019-04-30 13:22:07+00:00,1556630527,2019-04-30T17:54:02Z,1556646842,59029.60,31,4.022585e+08,33.114580,4,74,0.855234,287.5443,285.15,293.1500,1.214747e+07,4
2,B183_2019-05-01_05-58-51_2019-05-01_22-32-30,183,2019-05-01 05:58:51+00:00,1556690331,2019-05-01T22:32:30Z,1556749950,240900.40,33,1.445733e+09,19.689140,0,55,0.777860,288.7490,280.15,294.1500,7.342794e+07,5
3,B183_2019-05-03_02-50-21_2019-05-03_05-53-20,183,2019-05-03 02:50:21+00:00,1556851821,2019-05-03T05:53:20Z,1556862800,42565.48,,2.819867e+08,1.685185,0,8,0.767122,282.4129,281.15,292.1500,1.673328e+08,5
4,B183_2019-05-03_15-41-57_2019-05-03_23-06-24,183,2019-05-03 15:41:57+00:00,1556898117,2019-05-03T23:06:24Z,1556924784,125277.20,72,6.207258e+08,23.753570,1,67,0.907342,284.7325,282.15,287.1500,2.613190e+07,5
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
1404,B208_2022-12-06_14-43-49_2022-12-06_18-22-52,208,2022-12-06 14:43:49+00:00,1670337829,2022-12-06T18:22:52Z,1670350972,51798.78,32,4.260419e+08,39.808990,0,83,0.739349,279.6404,278.15,291.1500,1.070215e+07,12
1405,B208_2022-12-07_05-13-02_2022-12-07_19-19-53,208,2022-12-07 05:13:02+00:00,1670389982,2022-12-07T19:19:53Z,1670440793,210041.60,83,1.536697e+09,28.785390,0,115,0.434858,279.5283,278.15,292.6655,5.338462e+07,12
1406,B208_2022-12-08_05-22-20_2022-12-08_18-39-15,208,2022-12-08 05:22:20+00:00,1670476940,2022-12-08T18:39:15Z,1670524755,190372.70,83,1.415700e+09,29.879400,0,102,0.439916,279.1724,277.15,292.1500,4.738047e+07,12
1407,B208_2022-12-09_23-55-12_2022-12-10_03-24-28,208,2022-12-09 23:55:12+00:00,1670630112,2022-12-10T03:24:28Z,1670642668,59548.57,N1,4.519165e+08,20.105260,0,74,0.495739,279.4540,277.15,291.1500,2.247753e+07,12


In [47]:
trips.groupby('month')['temperature_ambient_mean'].mean()

month
1     278.958474
2     280.529928
3     283.319157
4     286.608216
5     290.051582
6     296.143297
7     297.376967
8     295.847943
9     292.362038
10    287.714072
11    281.884420
12    279.381561
Name: temperature_ambient_mean, dtype: float64

### 15. For each bus compute the total time when the halt brake is active and the total time when the park brake is active. Compute also the ratio between those two times.

In [48]:
complete_details["busNumber"] = complete_details["name"].apply(find_bus)

NameError: name 'complete_details' is not defined

In [49]:
sum_Halt_hour = (complete_details.groupby("busNumber")["status_haltBrakeIsActive"].sum())/3600 
sum_Halt_hour

NameError: name 'complete_details' is not defined

In [59]:
sum_Park_hour = (complete_details.groupby("busNumber")["status_parkBrakeIsActive"].sum())/3600 
sum_Park_hour

NameError: name 'complete_details' is not defined

In [51]:
ratio_Brakes = sum_Halt_hour/sum_Park_hour
ratio_Brakes

NameError: name 'sum_Halt_hour' is not defined

### 16. For each pair of stops that are consecutive in at least a trip, compute the average speed achieved when going from the first to the second stop.

In [52]:
import statistics

def mean_speed(i1,i2, df_mean):
    """ Returns the mean of the values of the variable "odometry_vehicleSpeed" of the dataframe df_mean from index i1 to 12"""
    lista=list()
    for i in range(i1,i2):
        lista.append(df_mean["odometry_vehicleSpeed"][i])
        
    return statistics.mean(lista)

In [53]:
index=1
for i in details[0:5]:
# for i in details:
    i_nonasitcs=i.dropna(subset=["itcs_stopName"])
    list_indexes=i_nonasitcs.index.values.tolist()
    

    for j in range(0,len(list_indexes)-1):
        mean_indexes=mean_speed(list_indexes[j],list_indexes[j+1], i)
        print(i_nonasitcs["itcs_stopName"][list_indexes[j]],
              i_nonasitcs["itcs_stopName"][list_indexes[j+1]],
              list_indexes[j],
              list_indexes[j+1],
              mean_indexes)
   
        avg_speed_df.at[index,"FirstStop"]=i_nonasitcs["itcs_stopName"][list_indexes[j]]
        

Zürich, Herdernstrasse Zürich, Hardplatz 246 322 6.710928429


NameError: name 'avg_speed_df' is not defined

In [57]:
avg_speed_df.at[index,"SecondStop"]=i_nonasitcs["itcs_stopName"][list_indexes[j+1]]
        avg_speed_df.at[index,"Index_FirstStop"]=list_indexes[j]
        avg_speed_df.at[index,"Index_SecondStop"]=list_indexes[j+1]
        avg_speed_df.at[index,"AvgSpeed"]=mean_indexes
   
        index+=1

IndentationError: unexpected indent (1041190612.py, line 2)

In [58]:
avg_speed_df

NameError: name 'avg_speed_df' is not defined