This repository has been archived by the owner on May 27, 2024. It is now read-only.

StartDate and EndDate are coming back as Null #23

Open
alankessler opened this issue Aug 2, 2017 · 5 comments

@alankessler

alankessler commented Aug 2, 2017

I modified the tool to work with my postgres database:

alankessler@01d10c7

However, I think that's unrelated to the error I'm now getting:

[main] INFO org.onebusaway.gtfs.serialization.GtfsReader - reading entities: org.onebusaway.gtfs.model.Agency
[main] INFO org.onebusaway.gtfs.serialization.GtfsReader - reading entities: org.onebusaway.gtfs.model.ShapePoint
[main] INFO org.onebusaway.gtfs.serialization.GtfsReader - reading entities: org.onebusaway.gtfs.model.Route
[main] INFO org.onebusaway.gtfs.serialization.GtfsReader - reading entities: org.onebusaway.gtfs.model.Stop
[main] INFO org.onebusaway.gtfs.serialization.GtfsReader - reading entities: org.onebusaway.gtfs.model.Trip
[main] INFO org.onebusaway.gtfs.serialization.GtfsReader - reading entities: org.onebusaway.gtfs.model.StopTime
[main] INFO org.onebusaway.gtfs.serialization.GtfsReader - reading entities: org.onebusaway.gtfs.model.ServiceCalendar
[main] INFO org.onebusaway.gtfs.serialization.GtfsReader - reading entities: org.onebusaway.gtfs.model.ServiceCalendarDate
[main] INFO org.onebusaway.gtfs.serialization.GtfsReader - reading entities: org.onebusaway.gtfs.model.FareAttribute
[main] INFO org.onebusaway.gtfs.serialization.GtfsReader - reading entities: org.onebusaway.gtfs.model.FareRule
[main] INFO org.onebusaway.gtfs.serialization.GtfsReader - reading entities: org.onebusaway.gtfs.model.Frequency
[main] INFO org.onebusaway.gtfs.serialization.GtfsReader - reading entities: org.onebusaway.gtfs.model.Pathway
[main] INFO org.onebusaway.gtfs.serialization.GtfsReader - reading entities: org.onebusaway.gtfs.model.Transfer
[main] INFO org.onebusaway.gtfs.serialization.GtfsReader - reading entities: org.onebusaway.gtfs.model.FeedInfo

LoadStatus of GTFS Feed: SUCCESS

Connected to Database: trimet

GTFS Data Valid Start Date: null

GTFS Data Valid End Date: null
Finished updating table

I've tried with each of these trimet data sets with the same result:
https://developer.trimet.org/schedule/gtfs.zip
https://transitfeeds-data.s3-us-west-1.amazonaws.com/public/feeds/trimet/43/20170714/gtfs.zip
https://transitfeeds-data.s3-us-west-1.amazonaws.com/public/feeds/trimet/43/20170727/gtfs.zip

I'd love any suggestions on how to resolve this.

Thanks,
Alan

@barbeau barbeau added the bug label Aug 2, 2017
@barbeau
Member

barbeau commented Aug 2, 2017

From a quick look, I'm guessing this has to do with how TriMet is representing their calendar dates in their GTFS data.

Most agencies use calendar.txt (https://developers.google.com/transit/gtfs/reference/calendar-file) to represent their regularly scheduled service, and calendar_dates.txt (https://developers.google.com/transit/gtfs/reference/calendar_dates-file) to represent exceptions to that service.

TriMet (like some other agencies) represents ALL transit service in calendar_dates.txt. The reference at https://developers.google.com/transit/gtfs/reference/calendar_dates-file says:

The calendar_dates table allows you to explicitly activate or disable service IDs by date. You can use it in two ways.

  • Recommended: Use calendar_dates.txt in conjunction with calendar.txt, where calendar_dates.txt defines any exceptions to the default service categories defined in the calendar.txt file. If your service is generally regular, with a few changes on explicit dates (for example, to accommodate special event services, or a school schedule), this is a good approach.
  • Alternate: Omit calendar.txt, and include ALL dates of service in calendar_dates.txt. If your schedule varies most days of the month, or you want to programmatically output service dates without specifying a normal weekly schedule, this approach may be preferable.

So TriMet is using the "alternate" way.

I believe this tool currently relies on the existence of calendar.txt to calculate the GTFS start and end dates. We should also support pulling these values from calendar_dates.txt if calendar.txt does not exist.

We're not actively working on this tool, so if you want to take a shot at a pull request supporting this that would be awesome. Otherwise, we'll try to take a look when we can.

@barbeau
Member

barbeau commented Aug 2, 2017

It looks like the current calendar.txt start and end dates are pulled from the GtfsStatisticsService object at https://github.com/CUTR-at-USF/ontime-performance-calculator/blob/master/src/main/java/edu/usf/cutr/OPC/FeedProcessor.java#L123.

GtfsStatisticsService.getCalendarServiceRangeStart() and GtfsStatisticsService.getCalendarServiceRangeEnd() need to be modified to pull dates from gtfsDao.getAllCalendarDates() if gtfsDao.getAllCalendars() is empty (IIRC).
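
A rough, self-contained sketch of that fallback might look like the following. It is not the project's code: `CalendarEntry` is a hypothetical stand-in for the OneBusAway `ServiceCalendar` model, calendar_dates.txt rows are reduced to plain `java.time.LocalDate` values, and the dates in `main` are made up for illustration.

```java
import java.time.LocalDate;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CalendarFallbackSketch {

    // Hypothetical stand-in for a calendar.txt row (ServiceCalendar in the real model).
    static final class CalendarEntry {
        final LocalDate startDate;
        final LocalDate endDate;

        CalendarEntry(LocalDate startDate, LocalDate endDate) {
            this.startDate = startDate;
            this.endDate = endDate;
        }
    }

    // Earliest calendar.txt start date, or, when calendar.txt has no rows,
    // the earliest calendar_dates.txt date. Returns null if both inputs are empty.
    static LocalDate rangeStart(List<CalendarEntry> calendars, List<LocalDate> calendarDates) {
        LocalDate start = null;
        if (!calendars.isEmpty()) {
            for (CalendarEntry c : calendars) {
                if (start == null || c.startDate.isBefore(start)) start = c.startDate;
            }
        } else {
            for (LocalDate d : calendarDates) {
                if (start == null || d.isBefore(start)) start = d;
            }
        }
        return start;
    }

    // Latest calendar.txt end date, with the same fallback to calendar_dates.txt.
    static LocalDate rangeEnd(List<CalendarEntry> calendars, List<LocalDate> calendarDates) {
        LocalDate end = null;
        if (!calendars.isEmpty()) {
            for (CalendarEntry c : calendars) {
                if (end == null || c.endDate.isAfter(end)) end = c.endDate;
            }
        } else {
            for (LocalDate d : calendarDates) {
                if (end == null || d.isAfter(end)) end = d;
            }
        }
        return end;
    }

    public static void main(String[] args) {
        // Made-up dates; no calendar.txt rows, as in a feed that only ships calendar_dates.txt.
        List<CalendarEntry> noCalendars = Collections.emptyList();
        List<LocalDate> dates = Arrays.asList(
                LocalDate.of(2017, 8, 1), LocalDate.of(2017, 7, 15), LocalDate.of(2017, 9, 30));
        System.out.println(rangeStart(noCalendars, dates)); // 2017-07-15
        System.out.println(rangeEnd(noCalendars, dates));   // 2017-09-30
    }
}
```

Note that this sketch treats every calendar_dates.txt row as a service date and does not distinguish added from removed service.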

@alankessler
Author

Thank you! That seems to have done it. Just so I don't lose it, here's the relevant patch. I'll work on a PR with conditional logic when I get a chance.

diff --git i/src/main/java/edu/usf/cutr/OPC/gtfs/GtfsStatisticsService.java w/src/main/java/edu/usf/cutr/OPC/gtfs/GtfsStatisticsService.java
index 16de49a..95605ca 100644
--- i/src/main/java/edu/usf/cutr/OPC/gtfs/GtfsStatisticsService.java
+++ w/src/main/java/edu/usf/cutr/OPC/gtfs/GtfsStatisticsService.java
@@ -9,6 +9,7 @@ import java.util.Date;

 import org.onebusaway.gtfs.impl.GtfsRelationalDaoImpl;
 import org.onebusaway.gtfs.model.ServiceCalendar;
+import org.onebusaway.gtfs.model.ServiceCalendarDate;


 /**
@@ -26,11 +27,11 @@ public class GtfsStatisticsService {

                Date startDate = null;

-               for (ServiceCalendar serviceCalendar : gtfsDao.getAllCalendars()) {
+               for (ServiceCalendarDate serviceCalendarDate : gtfsDao.getAllCalendarDates()) {

                        if (startDate == null
-                                       || serviceCalendar.getStartDate().getAsDate().before(startDate))
-                               startDate = serviceCalendar.getStartDate().getAsDate();
+                                       || serviceCalendarDate.getDate().getAsDate().before(startDate))
+                               startDate = serviceCalendarDate.getDate().getAsDate();
                }

                return startDate;
@@ -41,10 +42,10 @@ public class GtfsStatisticsService {

                Date endDate = null;

-               for (ServiceCalendar serviceCalendar : gtfsDao.getAllCalendars()) {
+               for (ServiceCalendarDate serviceCalendarDate : gtfsDao.getAllCalendarDates()) {
                        if (endDate == null
-          || serviceCalendar.getEndDate().getAsDate().after(endDate))
-        endDate = serviceCalendar.getEndDate().getAsDate();
+          || serviceCalendarDate.getDate().getAsDate().after(endDate))
+        endDate = serviceCalendarDate.getDate().getAsDate();
                }

                return endDate;

@barbeau
Member

barbeau commented Aug 3, 2017

@alankessler thanks for the diff! Good to know that worked.

@barbeau
Member

barbeau commented Aug 3, 2017

Also, I think the proper fix for this is actually to loop through both getAllCalendars() and getAllCalendarDates(), and save the earliest and latest dates as the start/end dates. Either file could contain the min/max date.
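
As a minimal sketch of that idea (again self-contained rather than written against the real classes): `CalendarEntry` is a hypothetical stand-in for `ServiceCalendar`, calendar_dates.txt rows are passed as plain `java.time.LocalDate` values, and the dates in `main` are invented for illustration.

```java
import java.time.LocalDate;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

class CombinedServiceRangeSketch {

    // Hypothetical stand-in for a calendar.txt row (ServiceCalendar in the real model).
    static final class CalendarEntry {
        final LocalDate startDate;
        final LocalDate endDate;

        CalendarEntry(LocalDate startDate, LocalDate endDate) {
            this.startDate = startDate;
            this.endDate = endDate;
        }
    }

    // Gathers every candidate date from both files into one list.
    static List<LocalDate> candidateDates(List<CalendarEntry> calendars, List<LocalDate> calendarDates) {
        List<LocalDate> dates = new ArrayList<>(calendarDates);
        for (CalendarEntry c : calendars) {
            dates.add(c.startDate);
            dates.add(c.endDate);
        }
        return dates;
    }

    // Earliest date across both files, or null when neither file has rows.
    static LocalDate rangeStart(List<CalendarEntry> calendars, List<LocalDate> calendarDates) {
        List<LocalDate> dates = candidateDates(calendars, calendarDates);
        return dates.isEmpty() ? null : Collections.min(dates);
    }

    // Latest date across both files, or null when neither file has rows.
    static LocalDate rangeEnd(List<CalendarEntry> calendars, List<LocalDate> calendarDates) {
        List<LocalDate> dates = candidateDates(calendars, calendarDates);
        return dates.isEmpty() ? null : Collections.max(dates);
    }

    public static void main(String[] args) {
        // Invented dates: calendar.txt holds the earliest date, calendar_dates.txt the latest.
        List<CalendarEntry> calendars = Arrays.asList(
                new CalendarEntry(LocalDate.of(2017, 7, 1), LocalDate.of(2017, 9, 1)));
        List<LocalDate> calendarDates = Arrays.asList(
                LocalDate.of(2017, 8, 15), LocalDate.of(2017, 9, 4));
        System.out.println(rangeStart(calendars, calendarDates)); // 2017-07-01
        System.out.println(rangeEnd(calendars, calendarDates));   // 2017-09-04
    }
}
```

Pooling the dates before taking the minimum and maximum covers feeds that ship only one of the two files as well as feeds that ship both.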
