Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixed some functions in StandardTripsCreator #126

Merged
merged 1 commit into from
Mar 1, 2018
Merged
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
73 changes: 57 additions & 16 deletions osm2gtfs/creators/trips_creator.py
Original file line number Diff line number Diff line change
Expand Up @@ -87,7 +87,8 @@ def _prepare_trips(self, feed, schedule, itinerary):
if input_fr == itinerary.fr and input_to == itinerary.to:
trip_services = trip["services"]
for service in trip_services:
services.append(service)
if service not in services:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To prevent the case when the itineraries in the Schedule are split by some reason, do not double the services.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why should they be split. Sorry I lack of context here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

They are in some cases of the Nicaraguan timetable.json because of the generation by easy-timetable-generator. But anyway, why shouldn't it be supported?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I still don't understand it. Can you explain concretely why they should be split. Sorry for asking this stupid question, I just didn't get it, yet.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm referring to the case when there are two or more dict items inside the line's list for the same line, which present different times but using the same service period. See this example:

    "lines": {
        "E-MAN-EST": [
            {
                "exeptions": [],
                "from": "Mercado Mayoreo",
                "services": [
                    "Mo-Su"
                ],
                "stations": [
                    "Mercado Mayoreo",
                    "COTRAN Sur"
                ],
                "times": [
                    [
                        "05:45",
                        "08:15"
                    ],
                    [
                        "08:15",
                        "10:45"
                    ],
                    [
                        "09:15",
                        "11:45"
                    ],
                    [
                        "10:45",
                        "13:15"
                    ],
                    [
                        "11:45",
                        "14:15"
                    ]
                ],
                "to": "COTRAN Sur"
            },
            {
                "exeptions": [],
                "from": "Mercado Mayoreo",
                "services": [
                    "Mo-Su"
                ],
                "stations": [
                    "Mercado Mayoreo",
                    "COTRAN Sur"
                ],
                "times": [
                    [
                        "13:15",
                        "15:45"
                    ],
                    [
                        "15:15",
                        "17:45"
                    ],
                    [
                        "13:45",
                        "16:15"
                    ],
                    [
                        "15:45",
                        "18:15"
                    ]
                ],
                "to": "COTRAN Sur"
            },
            ...

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the explanation. I think this (edge) case is kind of an inconsistency in the standard schedule format and I personally would rather require a clean input format. But I guess if you and @Skippern are in favour of supporting it, there is a deeper reason I still can't see.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But shouldn't then the two dicts not been merged? As far as I see, now the second one (and all that follow) would be just discarded.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

now the second one (and all that follow) would be just discarded

To clear it up a bit, I'll explain what happened before and what should happen IMHO:

  • Before this PR, the services variable contained the following after iterating over the two (or more) dicts for one line: [Mo-Su, Mo-Su]. So in line 98 and more specifically in line 112, it would append the times from both dicts twice to the trips variable with the same service period Mo-Su.
  • With this change, the service variable would only contain the following: [Mo-Su] and the trips would be getting doubled.

This code isn't about the whole dict that would be discarded, but only preventing the doubling of a service period.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Now I got it. Thanks!

services.append(service)

if not services:
print(" Warning: From and to values didn't match with schedule.")
Expand Down Expand Up @@ -124,14 +125,32 @@ def _verify_data(self, schedule, line, itinerary):
str(line.route_id) + ")")
print(" " + itinerary.osm_url)
print(" " + line.osm_url)
return True
return False
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not 100% sure, but shouldn't it return False when there's an error?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, it looks like this should be False, indeed.


# Check if time information in schedule can be found for
# the itinerary
if itinerary.route_id not in schedule['lines']:
print(" Warning: Route not found in schedule.")
print("Warning: Route not found in schedule.")
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to keep it as in the other cases.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

return False

# Check if from and to tags are valid and correspond to
# the actual name of the first and last stop of the itinerary.
for trip in schedule['lines'][itinerary.route_id]:
if (trip["from"] == itinerary.fr and trip["to"] == itinerary.to):
trip_stations = trip["stations"]
if trip_stations[0] != itinerary.fr:
print("Warning: The first station in the stations list of trip (" +
str(itinerary.route_id) + ") doesn't match first station of itinerary.")
print(" " + itinerary.osm_url)
print(" Please review the schedule file.")
return False
elif trip_stations[-1] != itinerary.to:
print("Warning: The last station in the stations list of trip (" +
str(itinerary.route_id) + ") doesn't match last station of itinerary.")
print(" " + itinerary.osm_url)
print(" Please review the schedule file.")
return False

return True

def _add_shape_to_feed(self, feed, shape_id, itinerary):
Expand Down Expand Up @@ -165,16 +184,16 @@ def _add_itinerary_trips(self, feed, itinerary, line, trip_builder,
gtfs_trip = route.AddTrip(feed, headsign=itinerary.to,
service_period=trip_builder['service_period'])
trips_count += 1
search_idx = 0
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To prevent problems in itineraries where a stop (or a stop_name) is served twice, so that it doesn't pick the same time information twice (this would give "high-speed services" for the transitfeed validator), I've introduced the index from where it should search for the next time.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tested with Managua and it is the same result. Generally checked the code and looks good.


# Go through all stops of an itinerary
for itinerary_stop_id in itinerary.get_stops():
for itinerary_stop_idx, itinerary_stop_id in enumerate(itinerary.get_stops()):

# Load full stop object
try:
itinerary_stop = trip_builder[
'all_stops']['regular'][itinerary_stop_id]
except ValueError:

sys.stderr.write(
"Itinerary (" + itinerary.route_url + ") misses a stop: \n")
sys.stderr.write(
Expand All @@ -187,20 +206,40 @@ def _add_itinerary_trips(self, feed, itinerary, line, trip_builder,
except ValueError:
print("Warning: Stop in itinerary was not found in GTFS.")
print(" " + itinerary_stop.osm_url)
continue

# Make sure we compare same unicode encoding
if type(itinerary_stop.name) is str:
itinerary_stop.name = itinerary_stop.name.decode('utf-8')

time = "-"
schedule_stop_idx = -1
# Check if we have specific time information for this stop.
try:
time = trip[trip_builder['stops'].index(itinerary_stop.name)]
schedule_stop_idx = trip_builder['stops'].index(itinerary_stop.name, search_idx)
except ValueError:
pass

# Validate time information
if time != "-":
if itinerary_stop.get_parent_station() is not None:
# If stop name not found, check for the parent_station name, too.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here it is the explanation.

itinerary_station = trip_builder[
'all_stops']['stations'][str(itinerary_stop.get_parent_station())]
if type(itinerary_station.name) is str:
itinerary_station.name = itinerary_station.name.decode('utf-8')
try:
schedule_stop_idx = trip_builder[
'stops'].index(itinerary_station.name, search_idx)
except ValueError:
pass

# Make sure the last stop of itinerary will keep being the last stop in GTFS
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To solve same problem described above.

last_stop_schedule = schedule_stop_idx == len(trip_builder['stops']) - 1
last_stop_itinerary = itinerary_stop_idx == len(itinerary.get_stops()) - 1
if last_stop_schedule != last_stop_itinerary:
schedule_stop_idx = -1

if schedule_stop_idx != -1:
time = trip[schedule_stop_idx]
search_idx = schedule_stop_idx + 1

# Validate time information
try:
time_at_stop = str(
datetime.strptime(time, "%H:%M").time())
Expand All @@ -216,10 +255,10 @@ def _add_itinerary_trips(self, feed, itinerary, line, trip_builder,
else:
try:
gtfs_trip.AddStopTime(gtfs_stop)
except ValueError:
print("Warning: Could not add first a stop to trip.")
except transitfeed.problems.OtherProblem:
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It never throws ValueError, but OtherProblem.

print("Warning: Could not add first stop to trip without time information.")
print(" " + itinerary_stop.name +
" - " + itinerary_stop.osm_id)
" - " + itinerary_stop.osm_url)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just to give more and consistent information

break

# Add reference to shape
Expand Down Expand Up @@ -296,13 +335,14 @@ def _load_itinerary_schedule(self, schedule, itinerary, service):
:return times: List of strings
"""
times = None
times = []
for trip in schedule['lines'][itinerary.route_id]:
trip_services = trip["services"]
if (trip["from"] == itinerary.fr and
trip["to"] == itinerary.to and
service in trip_services):
times = trip["times"]
for time in trip["times"]:
times.append(time)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to not loose times (overriding them).

if times is None:
print("Warning: Couldn't load times from schedule for route")
return times
Expand All @@ -322,4 +362,5 @@ def _load_scheduled_stops(self, schedule, itinerary, service):
"to"] == itinerary.to and service in trip_services):
for stop in trip["stations"]:
stops.append(stop)
break
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In order to not duplicate stops.

return stops