Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rjf/cost abstractions #87

Merged
merged 45 commits into from
Dec 22, 2023
Merged

Rjf/cost abstractions #87

merged 45 commits into from
Dec 22, 2023

Conversation

robfitzgerald
Copy link
Collaborator

@robfitzgerald robfitzgerald commented Dec 21, 2023

this PR splits out cost estimation from the TraversalModel into a new long running service that builds CostModel instances. cost defaults can be set at app configuration time, but also can be overridden at query time. specific state variables are assigned cost rates by name. each state variable name can also be assigned a cost coefficient value. this involved a pretty large refactor of the library. the notebook example has been updated to use the cost model, using some reasonable defaults documented in the default TOML files.

how to use costs

  • for each type of search state variable there may be an instance of a cost rate which maps the state attribute to a cost value
  • in order to consider a state variable as part of the cost function, it must appear amongst the state_variable_coefficients query field
  • the state variable coefficients are used to scale costs that are used during the search
  • the final search state is used to compute the total cost, using any state variable appearing in the state_variable_coefficients field
  • query overwrites function differently depending on the field
field type description overwrite behavior
state_variable_coefficients HashMap<String, f64> lists the state variables to use and a search objective coefficient for each replace
vehicle_state_variable_rates HashMap<String, VehicleCostRate> list vehicle state attributes and the rate we use to compute their cost merge
network_state_variable_rates HashMap<String, NetworkCostRate> list network state attributes and the lookup method used to compute their cost merge (not yet implemented)
cost_aggregation CostAggregation how &[(&String, Cost)] becomes the final Cost replace

example configuration

for example, the configuration:

[cost]
cost_aggregation = "sum"
[cost.state_variable_coefficients]
distance = 0
time = 1
[cost.vehicle_state_variable_rates.time]
type = "raw"

is used for the end-to-end CompassApp speeds test, which uses raw time costs with a coefficient of 1 (100% of time considered).

a default cost model has been added to the config.default.toml file to cover distance and time values.

new APIs

the different rate types are found in vehicle_cost_rate.rs. the default vehicle rate configuration is to use the raw values for any vehicle costs:

[cost.vehicle_state_variable_rates.distance]
type = "raw"
[cost.vehicle_state_variable_rates.time]
type = "raw"
[cost.vehicle_state_variable_rates.energy]
type = "raw"
[cost.vehicle_state_variable_rates.energy_liquid]
type = "raw"
[cost.vehicle_state_variable_rates.energy_electric]
type = "raw"

there is also a key for defining network-based cost rates:

[cost.network_state_variable_rates.parking]
type = "traversal_cost"
cost_input_file = "parking-data.csv.gz"

those variants are found in network_cost_rate.rs. this feature is un-tested.

Closes #83.

@nreinicke
Copy link
Collaborator

This looks awesome! I've been doing some simple testing this morning and noticed that the runtimes are quite a bit longer for the raw energy optimal routes. For example, I used the compass_tomtom_denver_energy_smartcore.toml config and the following query:

{
  "model_name": "2017_CHEVROLET_Bolt",
  "starting_soc_percent": 100,
  "destination_y": 39.62627481432341,
  "destination_x": -104.99460207519721,
  "origin_y": 39.798311884359094,
  "origin_x": -104.86796368632217,
  "state_variable_coefficients": {
    "time": 0,
    "energy": 1
  }
}

For the code from the main branch, this runs in 500ms or so but on this branch we time out after 2 minutes.

Does the query look right to you? I've confirmed that the model is using the default raw configuration based on the results from the shortest time route ("time": 1, "energy": 0) which gives me the cost summary:

{
  "cost": {
    "energy": 8.111696303262868,
    "time": 25.139387990744755,
    "total_cost": 33.251084294007626
  },
  "info": {
    "cost_aggregation": "sum",
    "network_state_variable_rates": {},
    "state_variable_coefficients": {
      "energy": 0,
      "time": 1
    },
    "state_variable_indices": [
      [
        "time",
        1
      ],
      [
        "energy",
        2
      ]
    ],
    "vehicle_state_variable_rates": {
      "distance": {
        "type": "raw"
      },
      "energy": {
        "type": "raw"
      },
      "energy_electric": {
        "type": "raw"
      },
      "energy_liquid": {
        "type": "raw"
      },
      "time": {
        "type": "raw"
      }
    }
  }
}

Maybe the cost estimate is off for raw energy?

@robfitzgerald
Copy link
Collaborator Author

Does the query look right to you?

interesting. wondering

  1. how many edges were explored (tree_size)
  2. what the traversal summary looks like

if the cost function isn't correct for the cost estimate, we may have lost our a* heuristic behavior that we want.

@nreinicke
Copy link
Collaborator

nreinicke commented Dec 22, 2023

Yeah let me bump up the time limit and run again to get the summary. I did notice that the final energy value of 8.11 from the shortest time search does match what we get from the version on the main branch, making me think the actual traversal cost for energy is still working the same.

@nreinicke
Copy link
Collaborator

Okay, this is a new one. I raised the time limit but ended up with this error instead after 3 minutes:

{
  "error": "loop in search result revisits edge 23718",
  "request": {
    "destination_edge": 84955,
    "destination_x": -104.9946020751972,
    "destination_y": 39.62627481432341,
    "model_name": "2017_CHEVROLET_Bolt",
    "origin_edge": 510159,
    "origin_x": -104.86796368632216,
    "origin_y": 39.798311884359094,
    "query_weight_estimate": 21.983725812182445,
    "road_classes": [
      "1",
      "2",
      "3",
      "4",
      "5",
      "6"
    ],
    "starting_soc_percent": 100,
    "state_variable_coefficients": {
      "energy": 1,
      "time": 0
    }
  }
}

Do we still have a catch for negative energies?

@robfitzgerald
Copy link
Collaborator Author

Do we still have a catch for negative energies?

hmm good point. was that happening in the VehicleType? if that's the case, i didn't touch it, but if it was in the energy traversal model, i may have wiped it out by mistake. also, i should throw in a minimum cost in the CostModel so we can't have costs of zero or less. i'll go digging, thanks for your help investigating!

@robfitzgerald
Copy link
Collaborator Author

ok, i'm also seeing big changes, though on my system, your query does complete (after my recent cost fix, just pushed up). here's an energy-optimal route, computed in 40 seconds:

Screen Shot 2023-12-22 at 10 19 59 AM

i ran a distance, time and energy-optimal route via the coefficients using your query:

name parameters runtime tree size route distance route time route energy total cost
distance-optimal distance = 1 2.5 sec 61044 17.60 miles 42.25 minutes 4.68kWh $64.54
time-optimal time = 1 5 sec 131980 21.56 miles 25.14 minutes 8.11kWh $54.81
energy-optimal energy = 1 40 sec 186194 17.97 miles 56 minutes 4.19 kWh $78.47

the trends look correct, and i find them fascinating. runtimes for distance and time look normal. i'm not using your cacheing, that seems it could help here. i want to build routee-compass from the main branch and run this energy-optimal route again, because i'm realizing i don't actually know what to expect for this route's runtimes and results. but, i'm actually quite satisfied from how this stuff looks at-a-glance!

back to your question

Do we still have a catch for negative energies?

we have these lines here in the BEV VehicleType, is this what you mean? or was there something else that needed to happen further down the line to ensure non-negativity there?

@robfitzgerald
Copy link
Collaborator Author

yes, it appears that the cost estimate is not doing what it should. here's the tree for the energy-optimal route:

Screen Shot 2023-12-22 at 10 47 26 AM

so, basically, Dijkstra's. to investigate why, i think i will need to wait until next week. again, from my comment above, if you recall what the steps were to properly involve VehicleTypes with the EnergyTraversalModel, i would appreciate any hints about where i could be mucking that up.

@nreinicke
Copy link
Collaborator

Oh interesting, yeah we had a simple catch in the traversal cost method to pin energy to zero but I think your recent commit to cap all costs to the positive domain should fix. I'm okay if we dig into this next week since everything else seems to be working.

Overall this is looking great and I love how much flexibility it gives us. I just finished looking through the changes and have a few comments (hopefully nothing big). But, if any comments are out of scope for finishing today, maybe we just merge and make new issues for anything we don't want to forget over the break?

@@ -0,0 +1,9 @@
use super::vehicle::vehicle_cost_mapping::VehicleCostMapping;

pub struct CostConfiguration {
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like this might be an artifact of a previous iteration as I can't find it linked in anywhere

/// to the state value.
#[derive(Serialize, Deserialize, Clone)]
#[serde(tag = "type", rename_all = "snake_case")]
pub enum VehicleCostRate {
Copy link
Collaborator

@nreinicke nreinicke Dec 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the word "rate" is a little confusing to me here since Raw and Offset are not actually a rate. I wonder if we could find a more general word that captures the idea that it transforms the cost in some way (maybe VehicleCostTransform or VehicleCostOperation?)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i struggled with this name, and like the idea of fixing it, but i'm gonna push this out for future work because the naming convention here also impacts network costs, the cost model, and the query-time argument names.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah that makes sense to me

@@ -46,12 +45,16 @@ impl VehicleType for BEV {
fn name(&self) -> String {
self.name.clone()
}
fn state_variable_names(&self) -> Vec<String> {
vec![String::from("energy"), String::from("battery_state")]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe we should be explicit here and call this energy_electric and remove the energy value from the default config?

@@ -36,6 +35,9 @@ impl VehicleType for ICE {
fn name(&self) -> String {
self.name.clone()
}
fn state_variable_names(&self) -> Vec<String> {
vec![String::from("energy")]
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Similar to the bev, this could probably become energy_liquid and we could pull out energy from the config.

@@ -32,6 +33,7 @@ impl SearchAlgorithm {
destination: Option<VertexId>,
graph: Arc<ExecutorReadOnlyLock<Graph>>,
traversal_model: Arc<dyn TraversalModel>,
utility_model: CostModel,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should probably switch this cost_model: CostModel

.fold(Cost::ZERO, |a, b| a + *b);
state_variable_costs.insert(String::from("total_cost"), total_cost);

let result = serde_json::json!(state_variable_costs);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would it be possible to add an additional field here, something like total_realized_cost that captures the sum of the cost that the model actually optimized for since total cost will be the same regardless of the coefficients that are used.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that appears as the "cost" field in the base level of the reponse, which gets placed there by the summary output plugin. does that cover your ask?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

to disambiguate, i just renamed that "route_cost".

also, looking at the summary output plugin, i'm thinking all of this functionality could probably be moved into the CompassApp.apply_output_processing method, and we could remove the JSON summary extensions and summary plugin.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

that appears as the "cost" field in the base level of the reponse, which gets placed there by the summary output plugin. does that cover your ask?

ohhh yeah I missed that

I'm thinking all of this functionality could probably be moved into the CompassApp.apply_output_processing method, and we could remove the JSON summary extensions and summary plugin.

yeah agreed, that would make sense


let best_case_result = self
.vehicle
.best_case_energy_state((distance, self.service.output_distance_unit), state)?;
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OH! I think this state just needs to be wrapped in get_vehicle_state_from_state(state) to extract the vehicle specific subset of the whole traversal state.

Suggested change
.best_case_energy_state((distance, self.service.output_distance_unit), state)?;
.best_case_energy_state((distance, self.service.output_distance_unit), get_vehicle_state_from_state(state))?;

Copy link
Collaborator Author

@robfitzgerald robfitzgerald Dec 22, 2023

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice catch! after fixing this, the energy-optimal search runtime is down for me from 40 seconds to 0 seconds. so, now, wondering why that went so fast, and why time and distance are taking so long in comparison. updated table:

name parameters runtime tree size miles minutes kWh total cost
distance-optimal distance = 1 2.5 sec 61044 17.60 42.25 4.68 64.54
time-optimal time = 1 5 sec 131980 21.56 25.14 8.11 54.81
energy-optimal energy = 1 0 sec 13653 19.53 69 2.92 92.21

the trends all look right here for the distance/time/energy tradeoffs. this is using raw costs so the cost column is kinda nonsense (hence removing the dollar sign).

the a* heuristic is probably still to blame here, given the size of the trees for those searches.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah it's also curious why the energy value is down to 2.9 (versus 4 something from the previous version).

something else I noticed for the energy optimal case is that the result["cost_summary"]["cost"]["energy_electric"] = 2.92 but the result["cost"] value is 3.19

perhaps we should just lump this into a new issue and do a deep dive after the break?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

result["cost_summary"]["cost"]["energy_electric"] = 2.92 but the result["cost"] value is 3.19

right. i think this would be explained if the state updates were allowed zero/negative-valued energies but the cost function was not.

perhaps we should just lump this into a new issue and do a deep dive after the break?

i at least have an issue for exploring the poor runtimes (#92) - but please feel free to drop another issue on for exploring the energy results as well.

Copy link
Collaborator

@nreinicke nreinicke left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work on adding this all in, it looks great!!

@robfitzgerald robfitzgerald merged commit 73ed0ad into main Dec 22, 2023
5 checks passed
@robfitzgerald robfitzgerald deleted the rjf/cost-abstractions branch December 22, 2023 19:20
@robfitzgerald
Copy link
Collaborator Author

robfitzgerald commented Dec 22, 2023

using our "real-ish" cost factors, i ran a test for distance, energy, time, and "all 3"-optimal routes:

Screen Shot 2023-12-22 at 12 50 48 PM

 	name 	 total_cost 	distance 	time 	       energy_electric
0 	distance $21.485037 	17.603244 	42.254257 	4.683044
1 	energy 	 $31.347543 	19.529403 	69.753055 	2.924176
2 	time 	 $17.560405 	21.555581 	25.139388 	8.111696
3 	all 	 $17.515325 	21.009211 	25.796709 	7.276318

it all makes sense to me. pretty psyched!

@robfitzgerald
Copy link
Collaborator Author

@nreinicke hey, fyi, i was able to get grid search to work with the new setup, here was my solution:

"grid_search": {
    "_ignore": [
      {
        "name": "distance",
        "state_variable_coefficients": {
          "distance": 1,
          "time": 0,
          "energy_electric": 0
        }
      },
      {
        "name": "time",
        "state_variable_coefficients": {
          "distance": 0,
          "time": 1,
          "energy_electric": 0
        }
      },
      {
        "name": "energy",
        "state_variable_coefficients": {
          "distance": 0,
          "time": 0,
          "energy_electric": 1
        }
      },
      {
        "name": "all",
        "state_variable_coefficients": {
          "distance": 1,
          "time": 1,
          "energy_electric": 1
        }
      }
    ]
  }

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

CostFunction: abstraction or convention
2 participants