feat: filter to fill in stop times from schedules #243

Merged: 6 commits, Jul 20, 2022

Commits on Jun 13, 2022

  1. 37f5c23

Commits on Jun 14, 2022

  1. refactor: move GTFS out of Filter namespace

    Move `Concentrate.Filter.GTFS.*` to `Concentrate.GTFS.*`.
    
    We are introducing a `Concentrate.GroupFilter` that will use `GTFS`
    data, so organizing this part of the app under the `Filter` namespace
    doesn't make as much sense as it did before.
    digitalcora committed Jun 14, 2022
    b61c59f

Commits on Jun 27, 2022

  1. fix: potential periodic GTFS.PickupDropOff issue

    Building the records to be inserted into ETS can take a while (over a
    minute), and since all records are deleted from the table before this
    work begins, the module will return `:unknown` for all lookups during
    this time. This issue would occur once per hour, as per the static GTFS
    refresh interval, and would result in `RemoveUnneededTimes` temporarily
    not working.
    
    This refactor applies the same approach used in `StopIDs`: build all
    records up front, then clear the table and immediately insert all the
    records in one batch. This still takes a second or two, but the gap
    is much shorter, which reduces both the chance that an incorrect feed
    is produced and the time until a corrected feed replaces it.
    
    Eliminating the issue entirely will require a more complex approach,
    such as writing some "upsert" logic which also deletes records that
    shouldn't be in the table anymore, or using Mnesia for its transaction
    support.
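
    The build-then-swap approach described above can be sketched roughly
    as follows. This is a minimal illustration, not the code from this
    commit: the module name, table name, and row shape are all
    hypothetical, and only standard `:ets` calls are used.

    ```elixir
    defmodule PickupDropOffSketch do
      # Hypothetical module and table names, for illustration only.
      @table :pickup_drop_off_sketch

      def init_table do
        :ets.new(@table, [:named_table, :public, :set, read_concurrency: true])
      end

      # `rows` is assumed to be an enumerable of
      # {trip_id, stop_sequence, pickup?, drop_off?} tuples.
      def refresh(rows) do
        # Slow part: build every record before touching the table, so
        # lookups keep returning the previous data in the meantime.
        records =
          for {trip_id, stop_sequence, pickup?, drop_off?} <- rows do
            {{trip_id, stop_sequence}, pickup?, drop_off?}
          end

        # Fast part: clear and refill in one batch. Lookups can still miss
        # between these two calls, but only for a short window.
        true = :ets.delete_all_objects(@table)
        true = :ets.insert(@table, records)
        :ok
      end

      def pickup_drop_off(trip_id, stop_sequence) do
        case :ets.lookup(@table, {trip_id, stop_sequence}) do
          [{_key, pickup?, drop_off?}] -> {pickup?, drop_off?}
          [] -> :unknown
        end
      end
    end
    ```

    The two `:ets` calls in `refresh/1` are not atomic as a pair, which
    is why the commit message notes that the gap is shortened rather than
    eliminated.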
    digitalcora committed Jun 27, 2022
    666cc90
  2. c3ca297

Commits on Jun 29, 2022

  1. refactor: combine stop-time-related GTFS servers

    Having three different servers loading `stop_times.txt` and maintaining
    their own subset of it was resulting in very high CPU and memory usage.
    Combining these, similar to `GTFS.Trips`, saves on system resources and
    significantly reduces the "warm-up time" in which filters are not fully
    functional.
    
    Approximate numbers on a developer machine:
    
    Metric       | pre `StopTimes` | post `StopTimes` | consolidated
    ------------ | --------------- | ---------------- | ------------
    Update Time  | 1.5 min         | 4 min            | 1.5 min
    Peak Memory  | 2.2GB           | 3.3GB            | 2.0GB
    Steady-State | 1.3GB           | 1.7GB            | 880MB
    
    The first column is from when there were only two `stop_times.txt`
    servers, `PickupDropOff` and `StopIDs`. We can see that consolidation
    is also an improvement over these numbers.
    
    There are some incidental changes that should not have any effect on the
    behavior of the app:
    
    * `RemoveUnneededTimes` no longer works with stop time updates that are
      missing a `stop_sequence`. This should not have happened anyway since
      ac72186, which formalized the requirement that all stop time updates
      should have a stop sequence for merging to work correctly.
    
    * `RemoveUnneededTimes` and `IncludeStopID` no longer work with GTFS
      feeds that have _only_ a `stop_times.txt`. Like `ScheduledStopTimes`,
      they now require the full chain of `stop_times.txt`, `trips.txt`,
      `routes.txt`, and `agency.txt` to be present and valid, even though
      these filters use only `stop_times.txt`. Both the old and new
      behaviors are accidents of the implementation details, and the new
      behavior could be changed if we wanted to explicitly support invalid
      "partial" GTFS feeds, but currently we have no need for this.
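
    As a rough illustration of the consolidation this commit describes,
    the sketch below keeps one record per stop time in a single ETS
    table and lets several lookups read from it, instead of each server
    holding its own subset of `stop_times.txt`. The module name, table
    name, record layout, and row keys are assumptions made for the
    example and are not taken from the actual `GTFS.StopTimes` code.

    ```elixir
    defmodule StopTimesSketch do
      # Hypothetical module and table names, for illustration only.
      @table :stop_times_sketch

      def init_table, do: :ets.new(@table, [:named_table, :public, :set])

      # `rows` is assumed to be a list of maps with the keys used below;
      # times are assumed to be integer seconds past midnight.
      def load(rows) do
        records =
          Enum.map(rows, fn row ->
            {{row.trip_id, row.stop_sequence}, row.stop_id, row.pickup?,
             row.drop_off?, row.arrival, row.departure}
          end)

        true = :ets.delete_all_objects(@table)
        true = :ets.insert(@table, records)
        :ok
      end

      # Each lookup reads a different part of the same stored record.
      def stop_id(trip_id, stop_sequence) do
        case fetch(trip_id, stop_sequence) do
          {_key, stop_id, _pickup?, _drop_off?, _arr, _dep} -> stop_id
          nil -> :unknown
        end
      end

      def pickup_drop_off(trip_id, stop_sequence) do
        case fetch(trip_id, stop_sequence) do
          {_key, _stop_id, pickup?, drop_off?, _arr, _dep} -> {pickup?, drop_off?}
          nil -> :unknown
        end
      end

      def scheduled_times(trip_id, stop_sequence) do
        case fetch(trip_id, stop_sequence) do
          {_key, _stop_id, _pickup?, _drop_off?, arr, dep} -> {arr, dep}
          nil -> :unknown
        end
      end

      defp fetch(trip_id, stop_sequence) do
        case :ets.lookup(@table, {trip_id, stop_sequence}) do
          [record] -> record
          [] -> nil
        end
      end
    end
    ```

    Keying on `{trip_id, stop_sequence}` mirrors the note above that
    stop time updates without a `stop_sequence` are no longer handled.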
    digitalcora committed Jun 29, 2022
    edbc809

Commits on Jul 19, 2022

  1. 3b1f93e