# Maintenance Schedule for Specialty Markings
The purpose of this notebook is to create a proposed maintenance schedule for specialty markings.

#### Disclaimer
This product is for informational purposes and may not have been prepared for or be suitable for legal, engineering, or surveying purposes. It does not represent an on-the-ground survey and represents only the approximate relative location of property boundaries. This product has been produced by Austin Transportation Department for the sole purpose of geographic reference. No warranty is made by the City of Austin regarding specific accuracy or completeness.

## Imports

In [107]:
import pandas as pd
import xlrd

## Constants

In [108]:
FILE = r'C:\Users\Govs\Projects\Files\OMA_Maint_Table'

## Setup Table
The first step is to set up a table listing the counts of specialty markings separated into four categories:

| Type | Years | Method |
| --- | --- | --- |
| CBD | 4 | CBD Polygon Intersect Counts |
| Signal | 4 | Signalized intersection join based on Intersection ID |
| Bike | 6 | Categorize by specialty markings type and subtype |
| Other | 6 | When no type is applicable |

Let's create four columns listing the annual number of specialty markings to be maintained in each category, based on its year cycle. A fifth column will list the annual number of assets maintained for that grid.

In [109]:
# List variables
field_list = ['GRIDS_200_ID','MAJORITY_DISTRICT',
              'SPECIALTY_COUNT_CBD','SPECIALTY_COUNT_SIGNAL','SPECIALTY_COUNT_BIKE','SPECIALTY_COUNT_OTHER']
t = ['CBD','SIGNAL','BIKE','OTHER']

# Create dataframe, keeping only the fields listed above
df = pd.read_excel(FILE + '.xls')
df = df.filter(field_list)

# Total specialty markings per grid across the four categories
df['TOTAL'] = df[field_list[2:]].sum(axis=1)
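
The annual columns described above could be derived as in the following minimal sketch. It assumes that the annual count is simply the category count divided evenly by its cycle length from the category table; the `toy` table and its values are made up for illustration, and the `ANNUAL_COUNT_` and `ANNUAL_TOTAL` names follow the column names hinted at elsewhere in this notebook.

```python
import pandas as pd

# Cycle lengths in years, taken from the category table above.
CYCLE_YEARS = {'CBD': 4, 'SIGNAL': 4, 'BIKE': 6, 'OTHER': 6}

# Toy data standing in for the real grid table (values are made up).
toy = pd.DataFrame({
    'GRIDS_200_ID': [1, 2],
    'SPECIALTY_COUNT_CBD': [8, 0],
    'SPECIALTY_COUNT_SIGNAL': [4, 12],
    'SPECIALTY_COUNT_BIKE': [6, 0],
    'SPECIALTY_COUNT_OTHER': [12, 18],
})

# Assumption: each category's count is spread evenly over its cycle.
for name, years in CYCLE_YEARS.items():
    toy['ANNUAL_COUNT_' + name] = toy['SPECIALTY_COUNT_' + name] / years

# Fifth column: annual number of assets maintained for the grid.
annual_cols = ['ANNUAL_COUNT_' + name for name in CYCLE_YEARS]
toy['ANNUAL_TOTAL'] = toy[annual_cols].sum(axis=1)
```

Dividing by the cycle length gives a planning average rather than an actual yearly workload, since a grid is maintained in full during its assigned year.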

Let's display the first 10 rows to see the total number of specialty markings in each grid, then compute the average total of assets per grid.

In [110]:
display(df.head(10).filter(['GRIDS_200_ID','MAJORITY_DISTRICT','TOTAL']).set_index('GRIDS_200_ID'))

| GRIDS_200_ID | MAJORITY_DISTRICT | TOTAL |
| --- | --- | --- |
| 724 | 6 | 0 |
| 725 | 6 | 0 |
| 726 | 6 | 0 |
| 727 | 6 | 134 |
| 728 | 6 | 157 |
| 729 | 6 | 404 |
| 730 | 6 | 0 |
| 756 | 7 | 2 |
| 757 | 7 | 0 |
| 758 | 6 | 0 |


In [111]:
grid_avg = int(df['TOTAL'].mean())
print("The average total of assets for each grid is {}.".format(str(grid_avg)))

The average total of assets for each grid is 73.


Let's group this table by the <b>MAJORITY_DISTRICT</b> field and list the total number of grids in each district.

In [112]:
cols = ['SPECIALTY_COUNT_CBD','SPECIALTY_COUNT_SIGNAL','SPECIALTY_COUNT_BIKE','SPECIALTY_COUNT_OTHER']
# District labels are matched as strings
i = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']

# Sum each category per district, ordered by district number
districts = df.groupby('MAJORITY_DISTRICT')[cols].sum().reindex(i)
districts['TOTAL_GRIDS'] = df.groupby('MAJORITY_DISTRICT')['GRIDS_200_ID'].nunique()
districts['TOTAL'] = districts[cols].sum(axis=1)
#districts['ANNUAL_DISTRICT_TOTAL'] = districts[cols[4:8]].sum(axis=1)
districts['GRIDS_LIST'] = df.groupby('MAJORITY_DISTRICT')['GRIDS_200_ID'].unique()

display(districts.filter(['TOTAL_GRIDS','TOTAL']))

| MAJORITY_DISTRICT | TOTAL_GRIDS | TOTAL |
| --- | --- | --- |
| 1 | 74 | 3481 |
| 2 | 83 | 1881 |
| 3 | 18 | 2269 |
| 4 | 11 | 2761 |
| 5 | 38 | 3066 |
| 6 | 90 | 3563 |
| 7 | 39 | 6333 |
| 8 | 74 | 2382 |
| 9 | 15 | 8536 |
| 10 | 52 | 2553 |


In [113]:
district_avg = int(districts['TOTAL'].mean())
print("The average total of assets for each district is {}.".format(str(district_avg)))

The average total of assets for each district is 3682.


## Summarize Category per District
The next steps after setting up the table are to:
 - Determine the grids affected by each category
 - Categorize grids by majority district

In [114]:
## Method used to create tables
def category_table(name,df):
    i = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
    count_col = 'SPECIALTY_COUNT_' + name
    # Keep only the grids that contain at least one marking of this category
    cat_df = df.query('{} > 0'.format(count_col))
    grouped = cat_df.groupby('MAJORITY_DISTRICT')
    district = grouped[[count_col]].sum()
    district['GRIDS_LIST'] = grouped['GRIDS_200_ID'].unique()
    district['TOTAL_GRIDS'] = grouped['GRIDS_200_ID'].nunique()
    # Districts without this category show up as NaN rows
    return district.reindex(i)
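
To make the two steps concrete, here is a small sketch of the same filter-then-group pattern on a made-up table. The `toy` data and the three-district index are assumptions for illustration only; the column names match the ones used in this notebook.

```python
import pandas as pd

# Toy grid table (made-up values) with the notebook's column names.
toy = pd.DataFrame({
    'GRIDS_200_ID': [10, 11, 12, 13],
    'MAJORITY_DISTRICT': ['1', '1', '2', '2'],
    'SPECIALTY_COUNT_BIKE': [5, 0, 7, 3],
})

col = 'SPECIALTY_COUNT_BIKE'

# Step 1: determine the grids affected by the category.
active = toy[toy[col] > 0]

# Step 2: categorize those grids by majority district.
by_district = active.groupby('MAJORITY_DISTRICT')
summary = by_district[[col]].sum()
summary['GRIDS_LIST'] = by_district['GRIDS_200_ID'].unique()
summary['TOTAL_GRIDS'] = by_district['GRIDS_200_ID'].nunique()

# Reindex so districts without the category still appear (as NaN rows).
summary = summary.reindex(['1', '2', '3'])
```

Grid 11 has no bike markings, so it is excluded before grouping, and district 3 appears as an empty row in the same way that districts 1 to 8 and 10 do in the CBD table.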

### CBD

In [115]:
cbd_district = category_table('CBD',df)
display(cbd_district)

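| MAJORITY_DISTRICT | SPECIALTY_COUNT_CBD | GRIDS_LIST | TOTAL_GRIDS |
| --- | --- | --- | --- |
| 1 | NaN | NaN | NaN |
| 2 | NaN | NaN | NaN |
| 3 | NaN | NaN | NaN |
| 4 | NaN | NaN | NaN |
| 5 | NaN | NaN | NaN |
| 6 | NaN | NaN | NaN |
| 7 | NaN | NaN | NaN |
| 8 | NaN | NaN | NaN |
| 9 | 2453.0 | [1551, 1465, 1466, 1518, 1519] | 5.0 |
| 10 | NaN | NaN | NaN |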


### Signals

In [116]:
sig_district = category_table('SIGNAL',df)
display(sig_district)

| MAJORITY_DISTRICT | SPECIALTY_COUNT_SIGNAL | GRIDS_LIST | TOTAL_GRIDS |
| --- | --- | --- | --- |
| 1 | 794 | [1116, 1117, 1118, 1119, 1062, 1423, 1278, 127... | 21 |
| 2 | 493 | [1734, 1763, 1764, 1765, 1766, 1767, 1768, 178... | 21 |
| 3 | 564 | [1605, 1735, 1639, 1640, 1641, 1676, 1677, 167... | 13 |
| 4 | 734 | [1120, 1121, 1150, 1151, 1064, 1207, 1208, 124... | 11 |
| 5 | 834 | [1608, 1736, 1737, 1769, 1642, 1680, 1681, 169... | 20 |
| 6 | 1256 | [727, 728, 729, 760, 761, 792, 793, 634, 635, ... | 34 |
| 7 | 2104 | [974, 975, 976, 977, 1092, 1152, 990, 993, 994... | 31 |
| 8 | 513 | [1558, 1560, 1587, 1610, 1612, 1615, 1739, 177... | 28 |
| 9 | 1624 | [1396, 1425, 1426, 1371, 1372, 1383, 1606, 160... | 15 |
| 10 | 609 | [1093, 1122, 1123, 1124, 1153, 1154, 1067, 938... | 20 |


### Bike

In [117]:
bike_district = category_table('BIKE',df)
display(bike_district)

| MAJORITY_DISTRICT | SPECIALTY_COUNT_BIKE | GRIDS_LIST | TOTAL_GRIDS |
| --- | --- | --- | --- |
| 1 | 1369 | [1115, 1116, 1117, 1118, 1119, 1149, 1026, 106... | 30 |
| 2 | 597 | [1734, 1764, 1765, 1766, 1767, 1768, 1784, 178... | 13 |
| 3 | 793 | [1605, 1735, 1639, 1640, 1641, 1677, 1678, 167... | 13 |
| 4 | 781 | [1120, 1121, 1150, 1151, 1064, 1207, 1208, 124... | 11 |
| 5 | 912 | [1608, 1736, 1737, 1769, 1770, 1642, 1680, 168... | 23 |
| 6 | 1088 | [727, 728, 729, 760, 761, 792, 793, 634, 635, ... | 33 |
| 7 | 1596 | [974, 975, 976, 977, 1092, 1152, 990, 993, 994... | 34 |
| 8 | 854 | [1587, 1739, 1772, 1773, 1775, 1777, 1779, 178... | 20 |
| 9 | 1514 | [1396, 1425, 1426, 1371, 1372, 1383, 1606, 160... | 15 |
| 10 | 787 | [979, 1122, 1124, 1153, 1154, 1031, 1032, 1067... | 23 |


### Other

In [118]:
other_district = category_table('OTHER',df)
display(other_district)

| MAJORITY_DISTRICT | SPECIALTY_COUNT_OTHER | GRIDS_LIST | TOTAL_GRIDS |
| --- | --- | --- | --- |
| 1 | 1318 | [1115, 1116, 1117, 1118, 1119, 1149, 1026, 106... | 42 |
| 2 | 791 | [1730, 1732, 1733, 1734, 1761, 1765, 1766, 176... | 29 |
| 3 | 912 | [1604, 1605, 1735, 1639, 1640, 1641, 1676, 167... | 15 |
| 4 | 1246 | [1120, 1121, 1150, 1151, 1064, 1207, 1208, 124... | 11 |
| 5 | 1320 | [1608, 1736, 1737, 1769, 1770, 1642, 1680, 168... | 27 |
| 6 | 1219 | [727, 728, 729, 760, 761, 792, 633, 634, 635, ... | 43 |
| 7 | 2633 | [756, 974, 975, 976, 977, 1092, 1152, 990, 993... | 35 |
| 8 | 1015 | [1429, 1557, 1558, 1560, 1562, 1587, 1610, 161... | 39 |
| 9 | 2945 | [1396, 1425, 1426, 1371, 1372, 1383, 1606, 160... | 15 |
| 10 | 1157 | [978, 979, 1093, 1094, 1122, 1123, 1153, 1154,... | 28 |


## Sort grids by average
After setting up the tables, assign grids to maintenance years so that each year carries an equitable share of the assets for each specialty marking type.

Note that the total number of assets varies by district.

In [119]:
# Method to apply for CBD, Signals, Bike, and Other
def sortGrids(name,df,years):
    # temp maps each cycle year to a list of (grid ID, count) tuples
    temp = dict([(x,[]) for x in range(1,years + 1)])
    yr = 1
    col = 'SPECIALTY_COUNT_{}'.format(name)
    year_df = df.query(col + ' > 0')
    category_total = category_table(name,df)[col].sum()
    district_df = pd.DataFrame(index=i)
    district_df.index.name = 'MAJORITY_DISTRICT'
    for index,row in year_df.set_index('GRIDS_200_ID').iterrows():
        temp[yr].append((index,row[col]))
        # The next grid goes to the year with the smallest running total
        min_yr = dict([(x,sum(g[1] for g in temp[x])) for x in range(1,years + 1)])
        yr = min(min_yr, key=min_yr.get)
    for x in range(1,years + 1):
        year_col = name + '_YEAR_' + str(x)
        y = [g[0] for g in temp[x]]
        temp_df = year_df[year_df['GRIDS_200_ID'].isin(y)]
        temp_df = temp_df.groupby('MAJORITY_DISTRICT')[[col]].sum().reindex(i).rename(columns={col:year_col})
        # The percent column is labeled PERCENT_SIGNAL_<year> for every category
        temp_df['PERCENT_SIGNAL_' + str(x)] = 100 * temp_df[year_col] / category_total
        district_df = district_df.join(temp_df)
    return temp,district_df
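
The assignment loop in `sortGrids` amounts to a greedy split: each grid, taken in table order, goes to the year whose running total is currently smallest. The sketch below isolates that idea in plain Python. The function name `assign_to_years` and the `toy_counts` values are hypothetical and not part of the notebook.

```python
def assign_to_years(counts, years):
    """Greedy split: give each grid to the year with the smallest running total."""
    buckets = {yr: [] for yr in range(1, years + 1)}
    totals = {yr: 0 for yr in range(1, years + 1)}
    for grid_id, count in counts:
        # min() returns the first key on ties, so ties go to the earliest year
        yr = min(totals, key=totals.get)
        buckets[yr].append((grid_id, count))
        totals[yr] += count
    return buckets, totals

# Made-up (grid ID, count) pairs, processed in table order.
toy_counts = [(101, 50), (102, 20), (103, 30), (104, 10), (105, 40)]
buckets, totals = assign_to_years(toy_counts, years=2)
```

Because grids are processed in the order they appear rather than by size, the result depends on that order, and a category with few grids or one very large grid can end up with uneven yearly totals.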

#### CBD

In [120]:
cbd_ids, year_cbd = sortGrids('CBD',df,4)
display(year_cbd)

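| MAJORITY_DISTRICT | CBD_YEAR_1 | PERCENT_SIGNAL_1 | CBD_YEAR_2 | PERCENT_SIGNAL_2 | CBD_YEAR_3 | PERCENT_SIGNAL_3 | CBD_YEAR_4 | PERCENT_SIGNAL_4 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 2 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 3 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 4 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 5 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 6 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 7 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 8 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |
| 9 | 187.0 | 7.623318 | 622.0 | 25.356706 | 88.0 | 3.587444 | 1556.0 | 63.432532 |
| 10 | NaN | NaN | NaN | NaN | NaN | NaN | NaN | NaN |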


In [121]:
display(year_cbd.sum().to_frame().rename(columns={0:'SUM'}))

|  | SUM |
| --- | --- |
| CBD_YEAR_1 | 187.0 |
| PERCENT_SIGNAL_1 | 7.623318 |
| CBD_YEAR_2 | 622.0 |
| PERCENT_SIGNAL_2 | 25.356706 |
| CBD_YEAR_3 | 88.0 |
| PERCENT_SIGNAL_3 | 3.587444 |
| CBD_YEAR_4 | 1556.0 |
| PERCENT_SIGNAL_4 | 63.432532 |


#### Signals

In [122]:
sig_ids, year_sig = sortGrids('SIGNAL',df,4)
display(year_sig)

| MAJORITY_DISTRICT | SIGNAL_YEAR_1 | PERCENT_SIGNAL_1 | SIGNAL_YEAR_2 | PERCENT_SIGNAL_2 | SIGNAL_YEAR_3 | PERCENT_SIGNAL_3 | SIGNAL_YEAR_4 | PERCENT_SIGNAL_4 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 79 | 0.829396 | 330 | 3.464567 | 224 | 2.351706 | 161 | 1.690289 |
| 2 | 13 | 0.136483 | 134 | 1.406824 | 172 | 1.805774 | 174 | 1.826772 |
| 3 | 126 | 1.322835 | 99 | 1.03937 | 105 | 1.102362 | 234 | 2.456693 |
| 4 | 164 | 1.721785 | 28 | 0.293963 | 400 | 4.199475 | 142 | 1.490814 |
| 5 | 285 | 2.992126 | 199 | 2.089239 | 259 | 2.71916 | 91 | 0.955381 |
| 6 | 226 | 2.372703 | 350 | 3.674541 | 363 | 3.811024 | 317 | 3.328084 |
| 7 | 771 | 8.094488 | 513 | 5.385827 | 322 | 3.380577 | 498 | 5.228346 |
| 8 | 151 | 1.585302 | 132 | 1.385827 | 103 | 1.081365 | 127 | 1.333333 |
| 9 | 551 | 5.784777 | 244 | 2.56168 | 282 | 2.96063 | 547 | 5.742782 |
| 10 | 8 | 0.08399 | 338 | 3.548556 | 138 | 1.448819 | 125 | 1.312336 |


In [123]:
display(year_sig.sum().to_frame().rename(columns={0:'SUM'}))

|  | SUM |
| --- | --- |
| SIGNAL_YEAR_1 | 2374.0 |
| PERCENT_SIGNAL_1 | 24.923885 |
| SIGNAL_YEAR_2 | 2367.0 |
| PERCENT_SIGNAL_2 | 24.850394 |
| SIGNAL_YEAR_3 | 2368.0 |
| PERCENT_SIGNAL_3 | 24.860892 |
| SIGNAL_YEAR_4 | 2416.0 |
| PERCENT_SIGNAL_4 | 25.364829 |
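
As a quick check of how the percentage rows relate to the yearly totals, the short sketch below recomputes each year's share from the four signal totals shown above. The variable names are illustrative only.

```python
# Yearly signal totals copied from the summary above.
signal_year_totals = {1: 2374, 2: 2367, 3: 2368, 4: 2416}

# The category total is the sum over all cycle years.
category_total = sum(signal_year_totals.values())

# Each year's share of the category, in percent.
shares = {yr: 100 * n / category_total for yr, n in signal_year_totals.items()}

for yr, pct in shares.items():
    print("Year {}: {:.2f}%".format(yr, pct))
```

With four years, a perfectly even split would be 25% per year, so the signal schedule is close to balanced.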


#### Bike

In [124]:
bike_ids, year_bike = sortGrids('BIKE',df,6)
display(year_bike)

| MAJORITY_DISTRICT | BIKE_YEAR_1 | PERCENT_SIGNAL_1 | BIKE_YEAR_2 | PERCENT_SIGNAL_2 | BIKE_YEAR_3 | PERCENT_SIGNAL_3 | BIKE_YEAR_4 | PERCENT_SIGNAL_4 | BIKE_YEAR_5 | PERCENT_SIGNAL_5 | BIKE_YEAR_6 | PERCENT_SIGNAL_6 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 106 | 1.030026 | 338 | 3.284423 | 411.0 | 3.993781 | 115 | 1.117481 | 172 | 1.671363 | 227 | 2.205811 |
| 2 | 192 | 1.865708 | 20 | 0.194345 | 185.0 | 1.797687 | 17 | 0.165193 | 83 | 0.80653 | 100 | 0.971723 |
| 3 | 87 | 0.845399 | 73 | 0.709358 | 189.0 | 1.836556 | 131 | 1.272957 | 145 | 1.408998 | 168 | 1.632494 |
| 4 | 229 | 2.225245 | 129 | 1.253522 | 16.0 | 0.155476 | 67 | 0.651054 | 319 | 3.099796 | 21 | 0.204062 |
| 5 | 101 | 0.98144 | 171 | 1.661646 | 158.0 | 1.535322 | 149 | 1.447867 | 226 | 2.196094 | 107 | 1.039743 |
| 6 | 135 | 1.311826 | 275 | 2.672238 | 198.0 | 1.924011 | 151 | 1.467302 | 159 | 1.545039 | 170 | 1.651929 |
| 7 | 197 | 1.914294 | 315 | 3.060927 | 337.0 | 3.274706 | 384 | 3.731416 | 148 | 1.43815 | 215 | 2.089204 |
| 8 | 104 | 1.010592 | 215 | 2.089204 | NaN | NaN | 100 | 0.971723 | 254 | 2.468176 | 181 | 1.758818 |
| 9 | 398 | 3.867457 | 29 | 0.2818 | 60.0 | 0.583034 | 469 | 4.55738 | 155 | 1.50617 | 403 | 3.916043 |
| 10 | 152 | 1.477019 | 172 | 1.671363 | 177.0 | 1.719949 | 113 | 1.098047 | 53 | 0.515013 | 120 | 1.166067 |


In [125]:
display(year_bike.sum().to_frame().rename(columns={0:'SUM'}))

|  | SUM |
| --- | --- |
| BIKE_YEAR_1 | 1701.0 |
| PERCENT_SIGNAL_1 | 16.529006 |
| BIKE_YEAR_2 | 1737.0 |
| PERCENT_SIGNAL_2 | 16.878826 |
| BIKE_YEAR_3 | 1731.0 |
| PERCENT_SIGNAL_3 | 16.820523 |
| BIKE_YEAR_4 | 1696.0 |
| PERCENT_SIGNAL_4 | 16.48042 |
| BIKE_YEAR_5 | 1714.0 |
| PERCENT_SIGNAL_5 | 16.65533 |


#### Other

In [126]:
other_ids, year_other = sortGrids('OTHER',df,6)
display(year_other)

| MAJORITY_DISTRICT | OTHER_YEAR_1 | PERCENT_SIGNAL_1 | OTHER_YEAR_2 | PERCENT_SIGNAL_2 | OTHER_YEAR_3 | PERCENT_SIGNAL_3 | OTHER_YEAR_4 | PERCENT_SIGNAL_4 | OTHER_YEAR_5 | PERCENT_SIGNAL_5 | OTHER_YEAR_6 | PERCENT_SIGNAL_6 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 455.0 | 3.125859 | 74 | 0.508381 | 237 | 1.628195 | 284 | 1.951085 | 102 | 0.700742 | 166 | 1.140423 |
| 2 | 205.0 | 1.408354 | 122 | 0.838142 | 115 | 0.790052 | 105 | 0.721352 | 64 | 0.439681 | 180 | 1.236603 |
| 3 | NaN | NaN | 99 | 0.680132 | 71 | 0.487771 | 162 | 1.112943 | 481 | 3.304479 | 99 | 0.680132 |
| 4 | 226.0 | 1.552624 | 385 | 2.644957 | 174 | 1.195383 | 170 | 1.167903 | 158 | 1.085463 | 133 | 0.913713 |
| 5 | 186.0 | 1.277824 | 195 | 1.339654 | 274 | 1.882385 | 179 | 1.229733 | 146 | 1.003023 | 340 | 2.335807 |
| 6 | 45.0 | 0.309151 | 163 | 1.119813 | 260 | 1.786205 | 192 | 1.319044 | 189 | 1.298434 | 370 | 2.541907 |
| 7 | 370.0 | 2.541907 | 363 | 2.493817 | 561 | 3.854081 | 370 | 2.541907 | 541 | 3.71668 | 428 | 2.940368 |
| 8 | 247.0 | 1.696895 | 92 | 0.632042 | 146 | 1.003023 | 204 | 1.401484 | 249 | 1.710635 | 77 | 0.528991 |
| 9 | 490.0 | 3.366309 | 1109 | 7.618851 | 285 | 1.957955 | 293 | 2.012916 | 374 | 2.569387 | 394 | 2.706788 |
| 10 | 121.0 | 0.831272 | 110 | 0.755702 | 223 | 1.532014 | 385 | 2.644957 | 121 | 0.831272 | 197 | 1.353394 |


In [127]:
display(year_other.sum().to_frame().rename(columns={0:'SUM'}))

|  | SUM |
| --- | --- |
| OTHER_YEAR_1 | 2345.0 |
| PERCENT_SIGNAL_1 | 16.110195 |
| OTHER_YEAR_2 | 2712.0 |
| PERCENT_SIGNAL_2 | 18.631492 |
| OTHER_YEAR_3 | 2346.0 |
| PERCENT_SIGNAL_3 | 16.117065 |
| OTHER_YEAR_4 | 2344.0 |
| PERCENT_SIGNAL_4 | 16.103325 |
| OTHER_YEAR_5 | 2425.0 |
| PERCENT_SIGNAL_5 | 16.659797 |


## Assigning Grids a FY Maintenance Date by Specialty Type

In [141]:
from datetime import datetime
id_lists = [cbd_ids,sig_ids,bike_ids,other_ids]
# Cycle year v maps to the current calendar year plus v
base_year = datetime.now().year
final = df.copy().set_index('GRIDS_200_ID')
for n in range(len(id_lists)):
    # Build a fresh grid-to-year mapping for each category so that years
    # from one category do not carry over into the next category's column
    id_years = {}
    for v in range(1, len(id_lists[n]) + 1):
        id_years.update({g[0]:int(base_year + v) for g in id_lists[n][v]})
    temp_df = pd.DataFrame.from_dict(id_years,orient='index').rename(columns={0:'FY_SPECIALTY_' + t[n]})
    final = final.join(temp_df,sort=True).fillna('NONE')
display(final)
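
The mapping from cycle year to fiscal year can be sketched in plain Python as follows. The helper `fiscal_years`, the toy assignments, and the fixed `base_year` are hypothetical; the notebook itself derives the base year from `datetime.now().year`.

```python
def fiscal_years(buckets, base_year):
    """Map each grid ID to base_year + cycle year for one category."""
    return {grid_id: base_year + yr
            for yr, grids in buckets.items()
            for grid_id, _count in grids}

# Made-up assignments in the same shape that sortGrids returns:
# {cycle year: [(grid ID, count), ...]}
toy_cbd = {1: [(101, 50)], 2: [(102, 20)]}
toy_signal = {1: [(102, 5)], 2: [(103, 9)]}

base_year = 2020  # hypothetical fixed value for a reproducible example

# One independent mapping per category.
fy = {name: fiscal_years(b, base_year)
      for name, b in [('CBD', toy_cbd), ('SIGNAL', toy_signal)]}
```

Keeping one mapping per category means a grid absent from a category simply has no entry there, which is what later becomes `NONE` in the joined table. Here grid 101 has a CBD year but no signal year.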

| GRIDS_200_ID | MAJORITY_DISTRICT | SPECIALTY_COUNT_CBD | SPECIALTY_COUNT_SIGNAL | SPECIALTY_COUNT_BIKE | SPECIALTY_COUNT_OTHER | TOTAL | FY_SPECIALTY_CBD | FY_SPECIALTY_SIGNAL | FY_SPECIALTY_BIKE | FY_SPECIALTY_OTHER |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 579 | 6 | 0 | 0 | 0 | 2 | 2 | NONE | NONE | NONE | 2022 |
| 580 | 6 | 0 | 0 | 0 | 18 | 18 | NONE | NONE | NONE | 2022 |
| 629 | NOT IN DISTRICT | 0 | 0 | 0 | 0 | 0 | NONE | NONE | NONE | NONE |
| 633 | 6 | 0 | 0 | 0 | 1 | 1 | NONE | NONE | NONE | 2024 |
| 634 | 6 | 0 | 6 | 33 | 29 | 68 | NONE | 2021 | 2025 | 2024 |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 2166 | 5 | 0 | 0 | 0 | 0 | 0 | NONE | NONE | NONE | NONE |
| 2167 | 5 | 0 | 0 | 0 | 0 | 0 | NONE | NONE | NONE | NONE |
| 2191 | 5 | 0 | 0 | 0 | 0 | 0 | NONE | NONE | NONE | NONE |
| 2224 | 5 | 0 | 0 | 0 | 0 | 0 | NONE | NONE | NONE | NONE |


### Export table to CSV and Excel

In [156]:
file_path = r"G:\ATD\Signs_and_Markings\MISC_PROJECTS\Maintenance_Plan_Signs_and_Markings\SPECIALTY_MARKINGS_MAINTENANCE_PLAN"

# GRIDS_200_ID is already the index name, so it becomes the first column header
final.to_csv(file_path + r'\OMA_Maint_Specialty.csv',index=True)
writer = pd.ExcelWriter(file_path + r'\OMA_Maint_Specialty.xlsx', engine='xlsxwriter')

year_lists = [year_cbd,year_sig,year_bike,year_other]

for n in range(len(year_lists)):
    # One sheet per category plus a sheet with its column sums
    year_lists[n].to_excel(writer, sheet_name=t[n])
    year_lists[n].sum().to_frame().rename(columns={0:'SUM'}).to_excel(writer, sheet_name=t[n] + '_SUM')
    ws = writer.sheets[t[n]]
    ws1 = writer.sheets[t[n] + '_SUM']
    ws.set_column('A:O',19)
    ws1.set_column('A:B',20)
final.to_excel(writer, sheet_name='Final')
# close() writes the workbook to disk; ExcelWriter.save() was removed in pandas 2.0
writer.close()
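
To confirm that the grid ID survives the CSV export as a named column, a round trip can be tried on a small made-up table. The sketch below writes to a temporary folder rather than the project path; `toy_final` and `toy_export.csv` are hypothetical names.

```python
import os
import tempfile

import pandas as pd

# Made-up schedule table indexed by grid ID.
toy_final = pd.DataFrame(
    {'FY_SPECIALTY_SIGNAL': [2021, 'NONE']},
    index=pd.Index([634, 629], name='GRIDS_200_ID'),
)

# Write to a temporary folder (hypothetical location, not the project path).
out_path = os.path.join(tempfile.mkdtemp(), 'toy_export.csv')
toy_final.to_csv(out_path, index_label='GRIDS_200_ID')

# Read it back with the grid ID restored as the index.
round_trip = pd.read_csv(out_path, index_col='GRIDS_200_ID')
```

Because the column mixes years with the `NONE` placeholder, it is read back as text rather than as integers.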