# 2023-07-07
## Meeting outcomes
* Clip Lui's dataset to the area that Daniel chose, share with Andres
* Wait for Daniel to send the network, for me to run a betweenness flow simulation
* Use the Brooklyn case as a test case for betweenness variation between Rhino and Python


## Priority for me after discussion with Orion:
* make sure the new node insertion code works and matches well

# Steps forward:

* SEND UPDATE ABOUT NYC WHEN TEST CASE IS DONE

## efficient node edge creation
* ~~Make sure the functionality to construct nodes-edges works for a safety buffer~~
* ~~Isolate the node insertion code into script~~
* ~~commit script to my branch~~
* ~~email Orion about it, mention how easy it is to integrate.~~
* look into issue with redundant edges

## Testing
* ~~structure tests so it's one csv: first n rows are settings, next m rows are edge flows, with the first column being the same segment ids from the network file~~
* ~~standard test case is four files: network.geojson - origins.geojson - destination.geojson - testflows.csv~~
* ~~function reads a csv into list of dicts: test settings and series of output.~~
* ~~build Harvard Square testflows.csv~~
* ~~make sure this config runs on Harvard Square~~
* build Manhattan testflows.csv
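
The single-CSV layout described in the struck-through items could be read roughly as follows. This is a minimal sketch, not the library's reader: the function name `read_test_csv`, the `n_settings` argument, and the assumption that every column after the first holds one test are all hypothetical.

```python
import csv
import io


def read_test_csv(text, n_settings):
    """Split a test CSV into one dict per test column.

    Assumed layout (hypothetical): a header row, then `n_settings` rows of
    settings, then one row per street segment. The first column holds the
    setting name or the segment id; every other column is one test.
    """
    rows = list(csv.reader(io.StringIO(text)))
    header, body = rows[0], rows[1:]
    setting_rows, flow_rows = body[:n_settings], body[n_settings:]

    tests = []
    for col, test_name in enumerate(header[1:], start=1):
        tests.append({
            "test_name": test_name,
            "settings": {row[0]: row[col] for row in setting_rows},
            "flows": {row[0]: float(row[col]) for row in flow_rows},
        })
    return tests


# Made-up sample data, only to show the shape of the file.
sample = """id,test_a,test_b
Radius,800,400
Detour,1.15,1.0
seg_1,10.5,3.0
seg_2,0.0,7.25
"""
tests = read_test_csv(sample, n_settings=2)
```

Settings are kept as strings here on purpose, since they mix numbers, booleans, and column names; converting them is left to whatever consumes the dicts.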

## Documentation
* set up a slideshow/diagram to show the relationship between different components of the library.
* Set up a notebook to go over estimating flows from one origin to one destination.
* Document all relevant settings
* show how this generalizes to the pairings.csv
* show how this generalizes to iterating over model settings
* show how this generalizes to iterating over scenarios
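
For the documentation notebook, iterating over model settings could be illustrated with a small parameter grid. This is only a sketch: the helper `iter_settings` and the grid values are hypothetical, and the keys simply mirror keyword arguments that appear in the betweenness calls in this notebook.

```python
from itertools import product

# Hypothetical grid; values are placeholders, not calibrated settings.
setting_grid = {
    "search_radius": [400, 800],
    "detour_ratio": [1.0, 1.15],
    "decay": [False, True],
}


def iter_settings(grid):
    """Yield one dict per combination of the values in `grid`."""
    keys = list(grid)
    for values in product(*(grid[key] for key in keys)):
        yield dict(zip(keys, values))


runs = list(iter_settings(setting_grid))
for settings in runs:
    # In a real run, each dict would be unpacked into the flow estimation call.
    print(settings)
```

The same loop shape would extend to scenarios by adding an outer loop over scenario inputs.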

## input structure
Maybe some parameters should be global to the Zonal object and updated only when needed, instead of being passed as inputs?
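
One way to explore this idea is a small settings object that is stored once and copied with changes only where needed. This is a sketch under assumptions: `FlowSettings` is a hypothetical class, not something the library defines, and the defaults are just the values used in the Manhattan run in this notebook.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class FlowSettings:
    """Hypothetical container for parameters shared across calls."""
    search_radius: float = 800
    detour_ratio: float = 1.15
    beta: float = 0.004
    turn_penalty: bool = False


defaults = FlowSettings()
# Update only what changes; the other fields carry over unchanged.
with_turns = replace(defaults, turn_penalty=True)
```

Freezing the dataclass makes accidental in-place changes an error, so every variation is an explicit new object.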

## output structure
* include Logger as a class, and Zonal would hold an instance of that class. Logger handles event documentation and captures output.
* think clearly about the above use cases and structure the output accordingly
* for each origin, have an origin record showing its knn weight, reach, and gravity towards each of its destinations, with header rows showing settings and parameters
* for the network, columns of OD flows, headed by settings and parameter rows that explicitly detail units, weight type, calibration status, and any other relevant data, settings, and parameters
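
The Logger idea could start from something like the sketch below. The class, its methods, and the event format are all hypothetical; the message format just mirrors the timing printouts used in this notebook.

```python
import time


class Logger:
    """Hypothetical event logger; not an existing class in the library."""

    def __init__(self):
        self.events = []
        self._last = time.perf_counter()

    def log(self, message):
        """Record a message with the milliseconds elapsed since the previous event."""
        now = time.perf_counter()
        self.events.append({"message": message, "elapsed_ms": (now - self._last) * 1000})
        self._last = now

    def report(self):
        """Return the captured events as printable lines."""
        return [f"{e['elapsed_ms']:6.2f}ms\t {e['message']}" for e in self.events]


logger = Logger()
logger.log("street data loaded")
logger.log("street network created")
lines = logger.report()
```

Keeping events as dicts rather than printing them directly means the same record could later be written out as header rows alongside the flow output.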



In [None]:
import os
import geopandas as gpd
import pandas as pd
import sys
import math


sys.path.append('../')
from madina.zonal.zonal import Zonal
from madina.una.betweenness import parallel_betweenness
from madina.una.elastic import get_elastic_weight

for test_case in os.listdir("Test Cases"):
    # os.path.join picks the right separator on Windows and Unix; the trailing ""
    # keeps a separator at the end so file names can be appended directly.
    test_case_folder = os.path.join("Test Cases", test_case, "")
    test_config = pd.read_csv(test_case_folder + "test_configs.csv")
    test_flows =  pd.read_csv(test_case_folder + "test_flows.csv")

    harvard_square = Zonal(projected_crs='EPSG:3857')

    harvard_square.load_layer(
        layer_name='streets',
        file_path=  test_case_folder + test_config.at[0, 'Network_File']
        )

    harvard_square.load_layer(
        layer_name=test_config.at[0, 'Origin_Name'],
        file_path= test_case_folder + test_config.at[0, 'Origin_File']
        )

    harvard_square.load_layer(
        layer_name=test_config.at[0, 'Destination_Name'],
        file_path= test_case_folder + test_config.at[0, 'Destination_File']
        )
    
    harvard_square.create_street_network(
        source_layer='streets', 
        discard_redundant_edges=True,
        node_snapping_tolerance=1.0
    )

    harvard_square.insert_node(
        layer_name=test_config.at[0, 'Origin_Name'], 
        label='origin', 
        weight_attribute=test_config.at[3, 'Origin_Weight']
    )

    harvard_square.insert_node(
        layer_name=test_config.at[0, 'Destination_Name'], 
        label='destination', 
        weight_attribute=test_config.at[3, 'Destination_Weight']
    )

    harvard_square.create_graph(light_graph=True, d_graph=True)

    node_gdf = harvard_square.network.nodes
    origin_gdf = node_gdf[node_gdf['type'] == 'origin']

    harvard_square.network.nodes["original_weight"] = harvard_square.network.nodes["weight"]


    # ["original_weight", "elastic_weight", "knn_weight"]

    for test_idx in test_config.index:
        harvard_square.network.turn_penalty_amount = test_config.at[test_idx, 'Turn penalty']
        harvard_square.network.turn_threshold_degree = test_config.at[test_idx, 'Turn threshold']

        if test_config.at[test_idx, 'Elastic_weights']:
            harvard_square.network.nodes["weight"] = harvard_square.network.nodes["original_weight"]
            get_elastic_weight(
                harvard_square.network,
                search_radius=test_config.at[test_idx, 'Radius'],
                detour_ratio=test_config.at[test_idx, 'Detour'],
                beta=test_config.at[test_idx, ' Beta '],
                decay=True, #test_config.at[test_idx, 'Decay'],
                #turn_penalty=test_config.at[test_idx, 'Turns'],
                turn_penalty=False,
            )
            for o_idx in origin_gdf.index:
                harvard_square.network.nodes.at[o_idx, 'weight'] =  harvard_square.network.nodes.at[o_idx, 'elastic_weight']


        return_dict = parallel_betweenness(
            harvard_square.network,
            search_radius=test_config.at[test_idx, 'Radius'],
            detour_ratio=test_config.at[test_idx, 'Detour'],
            decay=test_config.at[test_idx, 'Decay'], #if test['Elastic weights'] else True,
            decay_method=test_config.at[test_idx, 'Decay_Mode'],  # "power", "exponent"
            beta=test_config.at[test_idx, ' Beta '],
            path_detour_penalty="equal",  # "power", "exponent", "equal"
            origin_weights=isinstance(test_config.at[test_idx, 'Origin_Weight'], str),  # True only when a weight column name is given
            closest_destination=test_config.at[test_idx, 'Closest_destination'],
            destination_weights=isinstance(test_config.at[test_idx, 'Destination_Weight'], str),    #or (test['Elastic weights'])
            # perceived_distance=False,
            num_cores=2,
            light_graph=True,
            turn_penalty=test_config.at[test_idx, 'Turns'],
        )
        simulated_sum_of_flow = return_dict['edge_gdf']['betweenness'].sum()
        test_flow = test_flows[test_config.at[test_idx, 'test_name']].sum()

        print (test_config.loc[test_idx])
        print (f"{test_config.at[test_idx, 'test_name']}\t\t{simulated_sum_of_flow = }\t test flow = { test_flow }\t difference = {simulated_sum_of_flow - test_flow}\t similarity {1-(simulated_sum_of_flow - test_flow)/ test_flow:.2%}")
    print("Done case...")
    break  # stop after the first test case

In [None]:
test_config

In [None]:
import time
start = time.time()

buildings_file = r"C:\Users\abdul\Dropbox (MIT)\PhD Thesis\Madina\madina\unit_testing\Test Cases\Manhattan\Home_PT_6538.geojson"
subway_file = r"C:\Users\abdul\Dropbox (MIT)\PhD Thesis\Madina\madina\unit_testing\Test Cases\Manhattan\Metro_PT_6538.geojson"
network_file = r"C:\Users\abdul\Dropbox (MIT)\PhD Thesis\Madina\madina\unit_testing\Test Cases\Manhattan\network_clipped_dupremovedAS.geojson"

import sys
sys.path.append('../')
from madina.zonal.zonal import Zonal


harvard_square = Zonal(projected_crs='EPSG:6538')

print(f"{(time.time()-start)*1000:6.2f}ms\t imports done, object created")
start = time.time()

harvard_square.load_layer(
    layer_name='streets',
    file_path=network_file
    )

print(f"{(time.time()-start)*1000:6.2f}ms\t street data loaded")
start = time.time()

harvard_square.load_layer(
    layer_name='buildings',
    file_path=buildings_file
    )

print(f"{(time.time()-start)*1000:6.2f}ms\t building data loaded")
start = time.time()

harvard_square.load_layer(
    layer_name='subway',
    file_path=subway_file
    )

print(f"{(time.time()-start)*1000:6.2f}ms\t subway data loaded")
start = time.time()

harvard_square.create_street_network(
    source_layer='streets', 
    discard_redundant_edges=False, 
    node_snapping_tolerance=1.0
)

print(f"{(time.time()-start)*1000:6.2f}ms\t street network created")
start = time.time()

harvard_square.insert_node(
    layer_name='buildings', 
    label='origin', 
    weight_attribute='TotalPop'
)


print(f"{(time.time()-start)*1000:6.2f}ms\t origins inserted")
start = time.time()

harvard_square.insert_node(
    layer_name='subway', 
    label='destination', 
    weight_attribute='line_ent_st'
)

print(f"{(time.time()-start)*1000:6.2f}ms\t destinations inserted")
start = time.time()

harvard_square.create_graph(light_graph=True, d_graph=True)

print(f"{(time.time()-start)*1000:6.2f}ms\t graph created")
start = time.time()

node_gdf = harvard_square.network.nodes
origin_gdf = node_gdf[node_gdf['type'] == 'origin']

harvard_square.network.nodes["original_weight"] = harvard_square.network.nodes["weight"]
# ["original_weight", "elastic_weight", "knn_weight"]



In [None]:
start = time.time()

return_dict = parallel_betweenness(
    harvard_square.network,
    search_radius=800,
    detour_ratio=1.15,
    decay=False,
    decay_method='exponent',  # "power", "exponent"
    beta=0.004,
    path_detour_penalty="equal",  # "power", "exponent", "equal"
    origin_weights=True,
    closest_destination=False,
    destination_weights=True, 
    # perceived_distance=False,
    num_cores=8,
    light_graph=True,
    turn_penalty=False,
)
simulated_sum_of_flow = return_dict['edge_gdf']['betweenness'].sum()


print(f"{(time.time()-start)*1000:6.2f}ms\t Betweenness estimated")
start = time.time()

In [None]:
return_dict['edge_gdf']



joined_results = harvard_square.layers['streets'].gdf.join(
    return_dict['edge_gdf'][['parent_street_id', 'betweenness']].set_index('parent_street_id'))#.rename(
    #columns={
        #"betweenness": f"{network_weight}_{'with_turns' if turn_penalty else 'no_turns'}_{'elastic_weight' if elastic_weight else 'unadjusted_weight'}_{pairing['Between_Name']}"})



joined_results[['__GUID', 'betweenness', 'geometry']].to_csv('2023-07-07 manhattan betweenness flow test.csv')
joined_results[['__GUID', 'betweenness', 'geometry']].to_file('2023-07-07 manhattan betweenness flow test.geoJSON', driver="GeoJSON")

In [None]:
joined_results = harvard_square.layers['streets'].gdf.join(harvard_square.network.edges[['parent_street_id', 'betweenness']].set_index('parent_street_id'))
joined_results[['__GUID', 'betweenness', 'geometry']].to_csv('2023-07-07 manhattan betweenness flow test.csv')
joined_results[['__GUID', 'betweenness', 'geometry']].to_file('2023-07-07 manhattan betweenness flow test.geoJSON', driver="GeoJSON")