# Maintenance Schedule for Specialty Markings
The purpose of this notebook is to create a proposed maintenance schedule for specialty markings.

#### Disclaimer
This product is for informational purposes and may not have been prepared for or be suitable for legal, engineering, or surveying purposes. It does not represent an on-the-ground survey and represents only the approximate relative location of property boundaries. This product has been produced by Austin Transportation Department for the sole purpose of geographic reference. No warranty is made by the City of Austin regarding specific accuracy or completeness.

## Imports

In [30]:
import pandas as pd
import xlrd

## Constants

In [31]:
FILE = r'\\coacd.org\dfs\TPSD\ATD\Signs_and_Markings\MISC_PROJECTS\Maintenance_Plan_Signs_and_Markings\SPECIALTY_MARKINGS_MAINTENANCE_PLAN\OMA_Maint_Specialty'

## Setup Table
The first step is to set up a table listing the counts of specialty markings separated into 4 categories:

| Type | Years | Method |
| --- | --- | --- |
| CBD | 4 | CBD Polygon Intersect Counts |
| Signal | 4 | Signalized intersection join based on Intersection ID |
| Bike | 6 | Categorize by specialty markings type and subtype |
| Other | 6 | When no type is applicable |

Let's create 4 columns that list the annual number of specialty markings to be maintained, based on each category's year cycle. A fifth column will list the annual number of assets maintained for that grid.

In [32]:
# List variables
field_list = ['GRIDS_200_ID','MAJORITY_DISTRICT',
              'SPECIALTY_COUNT_CBD','SPECIALTY_COUNT_SIGNAL','SPECIALTY_COUNT_BIKE','SPECIALTY_COUNT_OTHER']
t = ['CBD','SIGNAL','BIKE','OTHER']

# Create dataframe
df = pd.read_csv(FILE+ '.csv')
df = df.filter(field_list)

df['TOTAL'] = df[field_list[2:7]].sum(axis=1)
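The cell above only computes `TOTAL`. As a minimal sketch of the annual columns described in the text, assuming the annual figure is simply each category count divided by its cycle length from the table (4 years for CBD and Signal, 6 for Bike and Other), the calculation could look like the following. The `ANNUAL_*` column names, the `cycle_years` mapping, and the sample rows are illustrative assumptions, not part of the original notebook.

```python
import pandas as pd

# Hypothetical sample rows for illustration; the real table is read from the CSV
sample = pd.DataFrame({
    'GRIDS_200_ID': [634, 635, 636],
    'SPECIALTY_COUNT_CBD': [0, 0, 0],
    'SPECIALTY_COUNT_SIGNAL': [6, 8, 4],
    'SPECIALTY_COUNT_BIKE': [33, 60, 12],
    'SPECIALTY_COUNT_OTHER': [29, 51, 26],
})

# Assumed cycle lengths, taken from the category table above
cycle_years = {'CBD': 4, 'SIGNAL': 4, 'BIKE': 6, 'OTHER': 6}

# One annual column per category: count divided by the cycle length
for name, years in cycle_years.items():
    sample['ANNUAL_' + name] = sample['SPECIALTY_COUNT_' + name] / years

# Fifth column: annual number of assets maintained for the grid
annual_cols = ['ANNUAL_' + name for name in cycle_years]
sample['ANNUAL_TOTAL'] = sample[annual_cols].sum(axis=1)
```

Dividing by the cycle length gives an average workload per year; it does not say which year a grid is visited, which is handled later when grids are sorted into years.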

Let's display the first 10 rows to see the total number of specialty markings in each grid, then compute the average total of assets per grid.

In [33]:
display(df.head(10).filter(['GRIDS_200_ID','MAJORITY_DISTRICT','TOTAL','ANNUAL_TOTAL']).set_index('GRIDS_200_ID'))

GRIDS_200_ID,MAJORITY_DISTRICT,TOTAL
579,6,2
580,6,18
629,NOT IN DISTRICT,0
633,6,1
634,6,68
635,6,119
636,6,42
664,6,0
665,6,0
666,6,35


In [34]:
grid_avg = int(df['TOTAL'].mean())
print("The average total of assets for each grid is {}.".format(str(grid_avg)))

The average total of assets for each grid is 73.


Let's group this table by the <b>MAJORITY_DISTRICT</b> field while listing the total grids.

In [35]:
cols = ['SPECIALTY_COUNT_CBD','SPECIALTY_COUNT_SIGNAL','SPECIALTY_COUNT_BIKE','SPECIALTY_COUNT_OTHER']
i = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']

districts = df.groupby('MAJORITY_DISTRICT').sum()[cols].reindex(i)
districts['TOTAL_GRIDS'] = df.groupby('MAJORITY_DISTRICT')['GRIDS_200_ID'].nunique()
districts['TOTAL'] =  districts[cols[:4]].sum(axis=1)
#districts['ANNUAL_DISTRICT_TOTAL'] = districts[cols[4:8]].sum(axis=1)
districts['GRIDS_LIST'] = df.groupby('MAJORITY_DISTRICT')['GRIDS_200_ID'].unique()

display(districts.filter(['MAJORITY_DISTRICT','TOTAL_GRIDS','TOTAL',]))

MAJORITY_DISTRICT,TOTAL_GRIDS,TOTAL
1,74,3481
2,83,1881
3,18,2269
4,11,2761
5,38,3066
6,90,3563
7,39,6333
8,74,2382
9,15,8536
10,52,2553


In [36]:
district_avg = int(districts['TOTAL'].mean())
print("The average total of assets for each district is {}.".format(str(district_avg)))

The average total of assets for each district is 3682.


## Summarize Category per District
The next steps after setting up the table are to:
 - Determine the grids affected by each category
 - Categorize grids by majority district

In [37]:
## Method used to create tables
def category_table(name,df):
    # Summarize one category per district: asset count, grid IDs, and number of grids
    i = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
    col = 'SPECIALTY_COUNT_' + name
    # Keep only the grids that contain at least one marking of this category
    cat_df = df.query('{} > 0'.format(col))
    grouped = cat_df.groupby('MAJORITY_DISTRICT')
    district = grouped[[col]].sum()
    district['GRIDS_LIST'] = grouped['GRIDS_200_ID'].unique()
    district['TOTAL_GRIDS'] = grouped['GRIDS_200_ID'].nunique()
    # Reindex so all ten districts appear; districts without grids show NaN
    return district.reindex(i)
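To make the two steps concrete, here is a small self-contained sketch of the same filter, group, and reindex pattern on a toy table. The grid IDs, counts, and the three-district index are made up for illustration only.

```python
import pandas as pd

# Toy data: four grids in two districts (values are hypothetical)
toy = pd.DataFrame({
    'GRIDS_200_ID': [1, 2, 3, 4],
    'MAJORITY_DISTRICT': ['1', '1', '2', '2'],
    'SPECIALTY_COUNT_BIKE': [5, 0, 3, 7],
})

# Step 1: keep only the grids affected by the category
with_bike = toy[toy['SPECIALTY_COUNT_BIKE'] > 0]

# Step 2: categorize by majority district
grouped = with_bike.groupby('MAJORITY_DISTRICT')
summary = grouped[['SPECIALTY_COUNT_BIKE']].sum()
summary['GRIDS_LIST'] = grouped['GRIDS_200_ID'].unique()
summary['TOTAL_GRIDS'] = grouped['GRIDS_200_ID'].nunique()

# Reindexing adds a row of NaN for any district with no affected grids
summary = summary.reindex(['1', '2', '3'])
```

District `'3'` has no grids in the toy data, so its row is all NaN after the reindex. This is the same reason most districts are empty in the CBD table below.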

### CBD

In [38]:
cbd_district = category_table('CBD',df)
display(cbd_district)

MAJORITY_DISTRICT,SPECIALTY_COUNT_CBD,GRIDS_LIST,TOTAL_GRIDS
1,,,
2,,,
3,,,
4,,,
5,,,
6,,,
7,,,
8,,,
9,2453.0,"[1465, 1466, 1518, 1519, 1551]",5.0
10,,,


### Signals

In [39]:
sig_district = category_table('SIGNAL',df)
display(sig_district)

MAJORITY_DISTRICT,SPECIALTY_COUNT_SIGNAL,GRIDS_LIST,TOTAL_GRIDS
1,794,"[1062, 1116, 1117, 1118, 1119, 1278, 1279, 128...",21
2,493,"[1734, 1763, 1764, 1765, 1766, 1767, 1768, 178...",21
3,564,"[1516, 1549, 1550, 1605, 1639, 1640, 1641, 167...",13
4,734,"[1064, 1120, 1121, 1150, 1151, 1207, 1208, 124...",11
5,834,"[1552, 1553, 1608, 1642, 1680, 1681, 1692, 173...",20
6,1256,"[634, 635, 667, 668, 669, 727, 728, 729, 760, ...",34
7,2104,"[810, 850, 851, 852, 883, 899, 934, 935, 936, ...",31
8,513,"[1467, 1468, 1520, 1554, 1558, 1560, 1587, 161...",28
9,1624,"[1341, 1371, 1372, 1383, 1396, 1425, 1426, 146...",15
10,609,"[938, 1067, 1093, 1122, 1123, 1124, 1153, 1154...",20


### Bike

In [40]:
bike_district = category_table('BIKE',df)
display(bike_district)

MAJORITY_DISTRICT,SPECIALTY_COUNT_BIKE,GRIDS_LIST,TOTAL_GRIDS
1,1369,"[1026, 1060, 1061, 1062, 1115, 1116, 1117, 111...",30
2,597,"[1734, 1764, 1765, 1766, 1767, 1768, 1784, 178...",13
3,793,"[1516, 1548, 1549, 1550, 1605, 1639, 1640, 164...",13
4,781,"[1064, 1120, 1121, 1150, 1151, 1207, 1208, 124...",11
5,912,"[1552, 1553, 1608, 1642, 1680, 1681, 1692, 173...",23
6,1088,"[634, 635, 666, 667, 668, 669, 686, 727, 728, ...",33
7,1596,"[810, 812, 850, 851, 852, 883, 898, 899, 900, ...",34
8,854,"[1487, 1520, 1554, 1587, 1645, 1694, 1700, 170...",20
9,1514,"[1341, 1371, 1372, 1383, 1396, 1425, 1426, 146...",15
10,787,"[938, 939, 979, 1031, 1032, 1067, 1069, 1122, ...",23


### Other

In [41]:
other_district = category_table('OTHER',df)
display(other_district)

MAJORITY_DISTRICT,SPECIALTY_COUNT_OTHER,GRIDS_LIST,TOTAL_GRIDS
1,1318,"[1026, 1060, 1061, 1062, 1115, 1116, 1117, 111...",42
2,791,"[1674, 1675, 1730, 1732, 1733, 1734, 1761, 176...",29
3,912,"[1516, 1548, 1549, 1550, 1604, 1605, 1639, 164...",15
4,1246,"[1064, 1120, 1121, 1150, 1151, 1207, 1208, 124...",11
5,1320,"[1552, 1553, 1608, 1642, 1680, 1681, 1692, 173...",27
6,1219,"[579, 580, 633, 634, 635, 636, 666, 667, 668, ...",43
7,2633,"[756, 810, 812, 850, 851, 883, 898, 899, 900, ...",35
8,1015,"[1429, 1467, 1468, 1487, 1520, 1521, 1554, 155...",39
9,2945,"[1341, 1371, 1372, 1383, 1396, 1425, 1426, 146...",15
10,1157,"[938, 939, 978, 979, 1031, 1067, 1068, 1093, 1...",28


## Sort grids by average
After setting up the tables, assign grids to maintenance years so that each year carries an equitable share of the assets for each specialty marking type.

Note that the total number of assets varies by district.

In [42]:
# Method to apply for CBD, Signals, Bike, and Other
def sortGrids(name,df,years):
    temp = dict([(x,[]) for x in range(1,years + 1)])
    yr = 1
    col = 'SPECIALTY_COUNT_{}'.format(name)
    year_df = df.query(col + ' > 0')
    district_df = pd.DataFrame(index=i)
    district_df.index.name = 'MAJORITY_DISTRICT'
    for index,row in year_df.set_index('GRIDS_200_ID').iterrows():
        # Add the grid to the current year
        temp[yr].append((index,row[col]))
        # The next grid goes to the year with the smallest running total
        min_yr = dict([(x,sum(g[1] for g in temp[x])) for x in range(1,years + 1)])
        yr = min(min_yr, key=min_yr.get)
    for x in range(1,years + 1):
        new_col = [name + '_YEAR_' + str(x),'PERCENT_' + name + '_' + str(x)]
        y = [i[0] for i in temp[x]]
        temp_df = year_df[year_df['GRIDS_200_ID'].isin(y)]
        temp_df = temp_df.groupby('MAJORITY_DISTRICT').sum()[[col]].reindex(i).rename(columns={col:new_col[0]})
        temp_df['PERCENT_' + name + '_' + str(x)] = 100 * temp_df[new_col[0]] / category_table(name,df)[col].sum()
        district_df = district_df.join(temp_df)
    return temp,district_df
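The balancing step inside `sortGrids` amounts to a greedy rule: each grid is placed in the year that currently has the fewest assets. The following is a minimal standalone sketch of that rule in plain Python. The function name `assign_to_years` and the sample counts are hypothetical and only illustrate the idea.

```python
def assign_to_years(counts, years):
    """Greedy balancing: place each grid in the year with the smallest running total."""
    buckets = {y: [] for y in range(1, years + 1)}
    totals = {y: 0 for y in range(1, years + 1)}
    for grid_id, count in counts:
        # On a tie, min() returns the first key, i.e. the earliest year
        target = min(totals, key=totals.get)
        buckets[target].append(grid_id)
        totals[target] += count
    return buckets, totals

# Hypothetical (grid ID, asset count) pairs, processed in table order
grids = [(101, 40), (102, 10), (103, 30), (104, 20), (105, 25)]
buckets, totals = assign_to_years(grids, 2)
print(buckets)  # {1: [101, 104], 2: [102, 103, 105]}
print(totals)   # {1: 60, 2: 65}
```

Because grids are processed in table order and never split, a single very large grid can dominate one year. Sorting grids by count in descending order before assigning them is a common refinement, but that would be a change from the approach used in this notebook.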

#### CBD

In [43]:
cbd_ids, year_cbd = sortGrids('CBD',df,4)
display(year_cbd)

MAJORITY_DISTRICT,CBD_YEAR_1,PERCENT_CBD_1,CBD_YEAR_2,PERCENT_CBD_2,CBD_YEAR_3,PERCENT_CBD_3,CBD_YEAR_4,PERCENT_CBD_4
1,,,,,,,,
2,,,,,,,,
3,,,,,,,,
4,,,,,,,,
5,,,,,,,,
6,,,,,,,,
7,,,,,,,,
8,,,,,,,,
9,622.0,25.356706,170.0,6.930289,1556.0,63.432532,105.0,4.280473
10,,,,,,,,


In [44]:
display(year_cbd.sum().to_frame().rename(columns={0:'SUM'}))

Unnamed: 0,SUM
CBD_YEAR_1,622.0
PERCENT_CBD_1,25.356706
CBD_YEAR_2,170.0
PERCENT_CBD_2,6.930289
CBD_YEAR_3,1556.0
PERCENT_CBD_3,63.432532
CBD_YEAR_4,105.0
PERCENT_CBD_4,4.280473


#### Signals

In [45]:
sig_ids, year_sig = sortGrids('SIGNAL',df,4)
display(year_sig)

MAJORITY_DISTRICT,SIGNAL_YEAR_1,PERCENT_SIGNAL_1,SIGNAL_YEAR_2,PERCENT_SIGNAL_2,SIGNAL_YEAR_3,PERCENT_SIGNAL_3,SIGNAL_YEAR_4,PERCENT_SIGNAL_4
1,252,2.645669,150,1.574803,78,0.818898,314,3.296588
2,169,1.774278,170,1.784777,5,0.052493,149,1.564304
3,105,1.102362,195,2.047244,59,0.619423,205,2.152231
4,269,2.824147,156,1.637795,279,2.929134,30,0.314961
5,151,1.585302,171,1.795276,312,3.275591,200,2.099738
6,306,3.212598,340,3.569554,355,3.727034,255,2.677165
7,394,4.136483,471,4.944882,564,5.92126,675,7.086614
8,115,1.207349,169,1.774278,126,1.322835,103,1.081365
9,571,5.994751,312,3.275591,480,5.03937,261,2.740157
10,77,0.808399,231,2.425197,105,1.102362,196,2.057743


In [46]:
display(year_sig.sum().to_frame().rename(columns={0:'SUM'}))

Unnamed: 0,SUM
SIGNAL_YEAR_1,2409.0
PERCENT_SIGNAL_1,25.291339
SIGNAL_YEAR_2,2365.0
PERCENT_SIGNAL_2,24.829396
SIGNAL_YEAR_3,2363.0
PERCENT_SIGNAL_3,24.808399
SIGNAL_YEAR_4,2388.0
PERCENT_SIGNAL_4,25.070866


#### Bike

In [47]:
bike_ids, year_bike = sortGrids('BIKE',df,6)
display(year_bike)

MAJORITY_DISTRICT,BIKE_YEAR_1,PERCENT_BIKE_1,BIKE_YEAR_2,PERCENT_BIKE_2,BIKE_YEAR_3,PERCENT_BIKE_3,BIKE_YEAR_4,PERCENT_BIKE_4,BIKE_YEAR_5,PERCENT_BIKE_5,BIKE_YEAR_6,PERCENT_BIKE_6
1,193,1.875425,403.0,3.916043,176,1.710232,349,3.391313,215,2.089204,33.0,0.320669
2,78,0.757944,99.0,0.962006,145,1.408998,119,1.15635,92,0.893985,64.0,0.621903
3,70,0.680206,56.0,0.544165,216,2.098921,96,0.932854,190,1.846273,165.0,1.603343
4,282,2.740258,,,249,2.41959,125,1.214654,125,1.214654,,
5,147,1.428433,254.0,2.468176,157,1.525605,117,1.136916,124,1.204936,113.0,1.098047
6,168,1.632494,194.0,1.885142,220,2.13779,143,1.389564,176,1.710232,187.0,1.817122
7,225,2.186376,353.0,3.430182,154,1.496453,316,3.070644,98,0.952288,450.0,4.372753
8,188,1.826839,128.0,1.243805,71,0.689923,71,0.689923,178,1.729667,218.0,2.118356
9,115,1.117481,171.0,1.661646,188,1.826839,305,2.963755,311,3.022058,424.0,4.120105
10,234,2.273832,74.0,0.719075,151,1.467302,57,0.553882,208,2.021184,63.0,0.612185


In [48]:
display(year_bike.sum().to_frame().rename(columns={0:'SUM'}))

Unnamed: 0,SUM
BIKE_YEAR_1,1700.0
PERCENT_BIKE_1,16.519289
BIKE_YEAR_2,1732.0
PERCENT_BIKE_2,16.83024
BIKE_YEAR_3,1727.0
PERCENT_BIKE_3,16.781654
BIKE_YEAR_4,1698.0
PERCENT_BIKE_4,16.499854
BIKE_YEAR_5,1717.0
PERCENT_BIKE_5,16.684482


#### Other

In [49]:
other_ids, year_other = sortGrids('OTHER',df,6)
display(year_other)

MAJORITY_DISTRICT,OTHER_YEAR_1,PERCENT_OTHER_1,OTHER_YEAR_2,PERCENT_OTHER_2,OTHER_YEAR_3,PERCENT_OTHER_3,OTHER_YEAR_4,PERCENT_OTHER_4,OTHER_YEAR_5,PERCENT_OTHER_5,OTHER_YEAR_6,PERCENT_OTHER_6
1,286,1.964826,240.0,1.648805,201,1.380874,309,2.122836,122,0.838142,160,1.099203
2,242,1.662545,,,140,0.961803,41,0.281671,208,1.428964,160,1.099203
3,253,1.738115,98.0,0.673262,30,0.206101,306,2.102226,212,1.456444,13,0.08931
4,133,0.913713,59.0,0.405331,272,1.868645,158,1.085463,312,2.143446,312,2.143446
5,151,1.037373,210.0,1.442704,18,0.12366,382,2.624347,85,0.583952,474,3.256389
6,124,0.851882,332.0,2.280846,366,2.514427,135,0.927453,67,0.460291,195,1.339654
7,579,3.977741,505.0,3.46936,164,1.126683,534,3.66859,601,4.128882,250,1.717505
8,208,1.428964,288.0,1.978566,163,1.119813,87,0.597692,66,0.453421,203,1.394614
9,347,2.383897,375.0,2.576257,761,5.228085,388,2.665567,724,4.973894,350,2.404507
10,110,0.755702,304.0,2.088486,300,2.061006,110,0.755702,36,0.247321,297,2.040396


In [50]:
display(year_other.sum().to_frame().rename(columns={0:'SUM'}))

Unnamed: 0,SUM
OTHER_YEAR_1,2433.0
PERCENT_OTHER_1,16.714757
OTHER_YEAR_2,2411.0
PERCENT_OTHER_2,16.563616
OTHER_YEAR_3,2415.0
PERCENT_OTHER_3,16.591096
OTHER_YEAR_4,2450.0
PERCENT_OTHER_4,16.831547
OTHER_YEAR_5,2433.0
PERCENT_OTHER_5,16.714757


## Assigning Grids an FY Maintenance Date by Specialty Type

In [51]:
from datetime import datetime
id_lists = [cbd_ids,sig_ids,bike_ids,other_ids]
final = df.copy().set_index('GRIDS_200_ID')
for n in range(len(id_lists)):
    # Map each grid ID to its fiscal year for this category only,
    # so years from one category do not leak into another category's column
    fy_by_grid = {}
    for v in range(1, len(id_lists[n]) + 1):
        # Maintenance year v maps to the current calendar year + v
        year_keys = {g[0]:int(datetime.now().year + v) for g in id_lists[n][v]}
        fy_by_grid.update(year_keys)
    temp_df = pd.DataFrame.from_dict(fy_by_grid,orient='index').rename(columns={0:'FY_SPECIALTY_' + t[n]})
    final = final.join(temp_df,sort=True).fillna('NONE')
display(final)

GRIDS_200_ID,MAJORITY_DISTRICT,SPECIALTY_COUNT_CBD,SPECIALTY_COUNT_SIGNAL,SPECIALTY_COUNT_BIKE,SPECIALTY_COUNT_OTHER,TOTAL,FY_SPECIALTY_CBD,FY_SPECIALTY_SIGNAL,FY_SPECIALTY_BIKE,FY_SPECIALTY_OTHER
579,6,0,0,0,2,2,NONE,NONE,NONE,2020
580,6,0,0,0,18,18,NONE,NONE,NONE,2021
629,NOT IN DISTRICT,0,0,0,0,0,NONE,NONE,NONE,NONE
633,6,0,0,0,1,1,NONE,NONE,NONE,2022
634,6,0,6,33,29,68,NONE,2020,2020,2023
...,...,...,...,...,...,...,...,...,...,...
2166,5,0,0,0,0,0,NONE,NONE,NONE,NONE
2167,5,0,0,0,0,0,NONE,NONE,NONE,NONE
2191,5,0,0,0,0,0,NONE,NONE,NONE,NONE
2224,5,0,0,0,0,0,NONE,NONE,NONE,NONE
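The fiscal year assignment depends on `datetime.now()`, so re-running the notebook in a later year shifts every date. As a hedged sketch, the mapping can be made reproducible by passing the base year explicitly. The helper name `fiscal_year_column`, the base year of 2019 (inferred from year 1 appearing as 2020 in the output above), and the sample buckets are assumptions for illustration.

```python
import pandas as pd

def fiscal_year_column(buckets, base_year, column_name):
    """Turn {maintenance year: [grid IDs]} into a Series of fiscal years indexed by grid ID."""
    mapping = {grid_id: base_year + offset
               for offset, grid_ids in buckets.items()
               for grid_id in grid_ids}
    return pd.Series(mapping, name=column_name, dtype='int64')

# Hypothetical buckets for two categories
signal_buckets = {1: [634, 667], 2: [635]}
bike_buckets = {1: [634], 2: [666]}

grids = pd.DataFrame(index=pd.Index([634, 635, 666, 667], name='GRIDS_200_ID'))
for name, buckets in [('SIGNAL', signal_buckets), ('BIKE', bike_buckets)]:
    # Each category gets its own column, built only from its own buckets
    grids = grids.join(fiscal_year_column(buckets, 2019, 'FY_SPECIALTY_' + name))
```

Grids with no markings of a category are left as missing values here; they can be filled with `'NONE'` afterwards, as the notebook does.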


### Export table to CSV

In [52]:
file_path = r"G:\ATD\Signs_and_Markings\MISC_PROJECTS\Maintenance_Plan_Signs_and_Markings\SPECIALTY_MARKINGS_MAINTENANCE_PLAN"

final.to_csv(file_path + r'\OMA_Maint_Specialty.csv',index_label='GRIDS_200_ID')
writer = pd.ExcelWriter(file_path + r'\OMA_Maint_Specialty.xlsx', engine='xlsxwriter')

year_lists = [year_cbd,year_sig,year_bike,year_other]

for i in range(len(year_lists)):
    year_lists[i].to_excel(writer, sheet_name=t[i])
    year_lists[i].sum().to_frame().rename(columns={0:'SUM'}).to_excel(writer, sheet_name=t[i]+ '_SUM')
    ws = writer.sheets[t[i]]
    ws1 = writer.sheets[t[i] + '_SUM']
    ws.set_column('A:O',19)
    ws1.set_column('A:B',20)
final.to_excel(writer, sheet_name='Final')
# close() writes the workbook to disk; ExcelWriter.save() was removed in pandas 2.0
writer.close()