Prediction Kalman testing

I am going to use a week's worth of arrival/departure data from HART and a single day's worth of AVL data for the test runs.

I will first look at the stoppathpredictions for a single stoppath on a single route to ensure we are getting stable, reasonable values.

Going to pick a single route to work on. Need to first check that arrivals and departures are generated consistently and that they are in the correct order for the route.

The arrivals and departures are not in the correct order for Route 1. This brings me back to the Matchers and having a setting for max speed, as per the notes page.

I have now committed the change for max speed.

Reduced maxDistance to segment from 300 to 100, and this improved things. Going to check Route 2.
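To make the two matcher checks above concrete, here is a minimal sketch of how a candidate match could be rejected on distance to segment or on implied speed. The class name, method name, parameters and units (metres, seconds) are all assumptions for illustration; this is not the transiTime matcher code.

```java
/** Hypothetical sketch of a plausibility filter for a candidate AVL match. */
final class MatchFilter {
    /**
     * Returns true when the candidate match is close enough to the segment
     * and does not imply an unrealistic speed since the previous match.
     * Units (metres, seconds, metres per second) are assumed.
     */
    static boolean isPlausible(double distanceToSegment,
                               double maxDistanceToSegment,
                               double distanceTravelledMeters,
                               double elapsedSeconds,
                               double maxSpeedMetersPerSecond) {
        // Reject reports that are too far from the path segment.
        if (distanceToSegment > maxDistanceToSegment) {
            return false;
        }
        // Without a positive time gap no speed can be computed, so reject.
        if (elapsedSeconds <= 0) {
            return false;
        }
        // Reject matches that would require the vehicle to exceed max speed.
        double impliedSpeed = distanceTravelledMeters / elapsedSeconds;
        return impliedSpeed <= maxSpeedMetersPerSecond;
    }
}
```

A tighter distance threshold and a speed cap both reduce the chance of matching a vehicle to the wrong part of the route, which is one way arrivals and departures can end up out of order.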

Some vehicles are becoming unpredictable because they are too early or too late. Changing allowableEarlySeconds and allowableLateSeconds to 1200 seconds each.
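As a rough illustration of what these two settings control, the sketch below treats a vehicle as predictable only while its schedule deviation stays inside the allowed window. The class, the method and the sign convention (positive means late) are assumptions; how transiTime applies the settings internally is not shown here.

```java
/** Hypothetical sketch of an early/late window check. */
final class ScheduleAdherenceCheck {
    // Values from the note above, in seconds.
    static final int ALLOWABLE_EARLY_SECONDS = 1200;
    static final int ALLOWABLE_LATE_SECONDS = 1200;

    /**
     * @param deviationSeconds schedule deviation; positive = late,
     *                         negative = early (assumed convention)
     * @return true if the vehicle is inside the allowed window
     */
    static boolean isPredictable(int deviationSeconds) {
        return deviationSeconds >= -ALLOWABLE_EARLY_SECONDS
                && deviationSeconds <= ALLOWABLE_LATE_SECONDS;
    }
}
```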

Direction needs to be included when looking for the previous vehicle on the same route in the Kalman implementation.

Kalman looks at the last vehicle, but should there be a cut-off point beyond which it is not taken into account, say if it is over an hour old?
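The two points above (match on direction, ignore stale vehicles) could be sketched as follows. The `Trip` class, its fields and the one-hour `MAX_AGE_MS` value are hypothetical and only illustrate the idea; they are not taken from the transiTime code.

```java
import java.util.List;

/** Hypothetical sketch: pick the most recent usable vehicle for a route and direction. */
final class LastVehicleSelector {
    /** Minimal stand-in for a completed traversal of a stoppath. */
    static final class Trip {
        final String vehicleId;
        final String routeId;
        final String directionId;
        final long arrivalEpochMs;

        Trip(String vehicleId, String routeId, String directionId, long arrivalEpochMs) {
            this.vehicleId = vehicleId;
            this.routeId = routeId;
            this.directionId = directionId;
            this.arrivalEpochMs = arrivalEpochMs;
        }
    }

    // Assumed cut-off: anything older than an hour is ignored.
    static final long MAX_AGE_MS = 60L * 60L * 1000L;

    /** Returns the newest trip on the same route and direction within the cut-off, or null. */
    static Trip findPrevious(List<Trip> history, String routeId, String directionId, long nowMs) {
        Trip best = null;
        for (Trip t : history) {
            boolean sameRouteAndDirection =
                    t.routeId.equals(routeId) && t.directionId.equals(directionId);
            long ageMs = nowMs - t.arrivalEpochMs;
            boolean recentEnough = ageMs >= 0 && ageMs <= MAX_AGE_MS;
            if (sameRouteAndDirection && recentEnough
                    && (best == null || t.arrivalEpochMs > best.arrivalEpochMs)) {
                best = t;
            }
        }
        return best;
    }
}
```

When nothing qualifies the method returns null, so the caller would need a fallback such as the historical average.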

I have added a new PlaybackPredictionAccuracyModule that records prediction accuracy when playback is run at full speed.

This branch https://github.com/scrudden/transitime-docker/tree/prediction_comparison_hart of transitime-docker uses this new playback module to run several days' worth of HART data for Route 2 through transiTime and records the prediction accuracy of the configured method. I intend to run this for several configurations and capture some graphs for a basic comparison of methods. The graph for each will be error versus horizon. I have set up a LibreOffice spreadsheet linked to the Postgres database to create the graphs.
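For reference, an error versus horizon curve can be built by grouping predictions into horizon buckets and averaging the absolute error in each. The sketch below is one way to do that in code rather than in the spreadsheet; the class, the sample layout and the choice of mean absolute error are assumptions, not what the playback module records.

```java
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch: mean absolute prediction error per horizon bucket. */
final class ErrorByHorizon {
    /**
     * @param samples       each row is {horizonSeconds, errorSeconds}, where
     *                      error = predicted minus actual arrival (assumed convention)
     * @param bucketSeconds width of each horizon bucket in seconds
     * @return bucket start (seconds) mapped to mean absolute error (seconds)
     */
    static Map<Integer, Double> meanAbsErrorByBucket(long[][] samples, int bucketSeconds) {
        // bucket start -> {sum of absolute errors, sample count}
        Map<Integer, long[]> totals = new TreeMap<>();
        for (long[] s : samples) {
            int bucket = (int) (s[0] / bucketSeconds) * bucketSeconds;
            long[] t = totals.computeIfAbsent(bucket, k -> new long[2]);
            t[0] += Math.abs(s[1]);
            t[1]++;
        }
        Map<Integer, Double> result = new TreeMap<>();
        for (Map.Entry<Integer, long[]> e : totals.entrySet()) {
            result.put(e.getKey(), (double) e.getValue()[0] / e.getValue()[1]);
        }
        return result;
    }
}
```

Using a TreeMap keeps the buckets in horizon order, which is the order needed for plotting.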

This could be done with the following configurations:

transiTime defaults based on schedule.
LastVehicle for travel times.
transiTime default with 1 day's UpdateTravelTimes run.
transiTime default with 2 days' UpdateTravelTimes run.
Historical Average for travel times and dwell times with 1 day's data.
Historical Average for travel times and dwell times with 2 days' data.
Kalman filter for travel times with 1 day's historical data. Dwell times using historical average.
Kalman filter for travel times with 2 days' historical data. Dwell times using historical average.

Kalman error values need to be persisted, at least in a cache, between runs. Ehcache will not persist to disk without using paid-for options, so I think I will swap the cache to use https://commons.apache.org/proper/commons-jcs/index.html instead.
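To show why the error value has to survive between runs, here is a textbook scalar Kalman update. It is a generic sketch, not necessarily the formulation used in the transiTime Kalman implementation, and the class name, constructor parameters and noise terms are assumptions. Each update depends on the error variance left by the previous one, so losing it resets the filter.

```java
/** Hypothetical sketch: one-dimensional Kalman filter for a travel time. */
final class ScalarKalman {
    private double estimate;             // current travel-time estimate (seconds)
    private double errorVar;             // variance of the estimate; must be kept between runs
    private final double processVar;     // assumed process noise variance
    private final double measurementVar; // assumed measurement noise variance

    ScalarKalman(double initialEstimate, double initialErrorVar,
                 double processVar, double measurementVar) {
        this.estimate = initialEstimate;
        this.errorVar = initialErrorVar;
        this.processVar = processVar;
        this.measurementVar = measurementVar;
    }

    /** Folds in a newly observed travel time and returns the updated estimate. */
    double update(double measured) {
        // Predict: uncertainty grows by the process noise.
        double predictedVar = errorVar + processVar;
        // Gain: how much weight the new measurement gets.
        double gain = predictedVar / (predictedVar + measurementVar);
        // Correct the estimate and shrink the uncertainty.
        estimate = estimate + gain * (measured - estimate);
        errorVar = (1.0 - gain) * predictedVar;
        return estimate;
    }

    double getEstimate() { return estimate; }

    /** The value that would need to be written to the cache. */
    double getErrorVar() { return errorVar; }
}
```

Storing `getErrorVar()` together with the estimate, and passing both back into the constructor on the next run, would let the filter continue from where it stopped.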

Going to use code from https://github.com/Transitime/core to produce a baseline set of results. I need to add the new playback module to this branch so this will work.

Have done this baseline and it can be run using this transiTime docker commit.

I also want to ensure that performance, in terms of processing speed, is similar between transiTime branches. I can turn off caching in the VIA branch using the -Dnet.sf.ehcache.disabled=true Java option and run it configured to use the normal transiTime prediction algorithm. Need to add some sort of timer to record the speed of the processing in the playback module.
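The timer could be as simple as the sketch below, which counts processed records and reports throughput from `System.nanoTime()`. The class and method names are hypothetical and nothing here is wired into the playback module.

```java
/** Hypothetical sketch: throughput timer for a playback run. */
final class ProcessingTimer {
    private long startNanos;
    private long records;

    /** Call once before playback begins. */
    void start() {
        startNanos = System.nanoTime();
        records = 0;
    }

    /** Call once for every AVL report processed. */
    void recordProcessed() {
        records++;
    }

    /** Throughput so far, in records per second. */
    double recordsPerSecond() {
        return recordsPerSecond(records, System.nanoTime() - startNanos);
    }

    /** Pure helper so the arithmetic can be checked without waiting on the clock. */
    static double recordsPerSecond(long records, long elapsedNanos) {
        if (elapsedNanos <= 0) {
            return 0.0;
        }
        return records * 1_000_000_000.0 / elapsedNanos;
    }
}
```

Running the same data through each branch and comparing the reported records per second would give a like-for-like measure of processing speed.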
