
[Transferred from GitLab] Creating GTFS DB #5

Closed

daphshez opened this issue Jul 12, 2016 · 5 comments

@daphshez (Collaborator) commented Jul 12, 2016

[Originally posted by @nitzangur]

A basic schema is now available under gtfs/.

Next steps:

Schema level (a sketch follows this list):

  1. Validate index-creation syntax.
  2. Decide which indexes are wanted.
  3. Create enums for relevant fields (inline comments).
  4. Make sure type-sizes are fine.
  5. Consider using TinyInt instead of INT where relevant.
  6. Check the requested type-size for coordinates.
  7. Check the requested type-size for shape_dist_traveled.
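
As a hypothetical illustration of items 3-7, the DDL below shows one way these choices could look in PostgreSQL, driven from Python via psycopg2. All table, column, and enum names are made up for the example; note that PostgreSQL has no TINYINT, so SMALLINT is the nearest equivalent.

```python
# A minimal sketch of the type/index choices above, with illustrative names
# (not the actual schema under gtfs/).
import psycopg2

DDL = """
-- Enum for a coded field (item 3); the values are placeholders.
CREATE TYPE route_type_enum AS ENUM ('bus', 'rail', 'light_rail');

CREATE TABLE stops_example (
    stop_id       INTEGER PRIMARY KEY,
    location_type SMALLINT,        -- no TINYINT in PostgreSQL (item 5)
    stop_name     TEXT NOT NULL,
    stop_lat      NUMERIC(9, 6),   -- ~0.1 m precision for coordinates (item 6)
    stop_lon      NUMERIC(9, 6)
);

-- Index-creation syntax to validate (items 1-2).
CREATE INDEX stops_example_name_idx ON stops_example (stop_name);
"""

conn = psycopg2.connect(dbname="gtfs")  # connection parameters are assumptions
with conn, conn.cursor() as cur:
    cur.execute(DDL)
conn.close()
```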

IT level (a sketch follows this list):

  1. Create a PostgreSQL DB based on this schema.
  2. Make sure the DB is accessible for development and querying.
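
For item 1, a minimal sketch of loading the schema file into a freshly created database (e.g. after `createdb gtfs`); the file path and connection details are assumptions:

```python
# Apply gtfs/schema.sql to the new database; path and credentials are placeholders.
import psycopg2

with open("gtfs/schema.sql") as f:
    schema_sql = f.read()

conn = psycopg2.connect(dbname="gtfs", host="localhost")
with conn, conn.cursor() as cur:
    cur.execute(schema_sql)
conn.close()
```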

Code level (see the sketch after this list):

  1. Change the code under gtfs/parser/ so that it parses GTFS data into the new DB instead of into an SQLite DB.
  2. Migrate some other GTFS parsing code (e.g. route stories).
  3. Create a cron job to automatically parse the GTFS data every day. (Yehuda - is there any old cron to work with?)
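
For item 1, the change largely boils down to swapping the sqlite3 connection for a PostgreSQL one. A hedged sketch (file layout per the GTFS spec; paths and connection details made up):

```python
# Sketch: read a GTFS txt file with csv and insert into PostgreSQL instead
# of SQLite. utf-8-sig strips the BOM that GTFS feeds often carry.
import csv
import psycopg2

conn = psycopg2.connect(dbname="gtfs")
with conn, conn.cursor() as cur, \
        open("gtfs_data/routes.txt", newline="", encoding="utf-8-sig") as f:
    for row in csv.DictReader(f):
        cur.execute(
            "INSERT INTO routes (route_id, agency_id, route_short_name, route_type) "
            "VALUES (%s, %s, %s, %s)",
            (row["route_id"], row["agency_id"],
             row["route_short_name"], row["route_type"]),
        )
conn.close()
```

For item 3, the crontab entry could then be as simple as `0 4 * * * python /path/to/parse_gtfs.py` (path and schedule hypothetical).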

Feel free to add or change steps. Please respond to this issue if you intend to perform any of these steps. (I guess these remarks are relevant to all of the issues.)

@nitzangur (Collaborator) commented

A valid schema was uploaded in commit 25ed52c.
The PostgreSQL DB is available on the same server as the SIRI-related DB (see here). DB name: gtfs.

@daphshez (Collaborator, Author) commented

Are we planning to overwrite the GTFS data on every import, or do we want to accumulate it over time?

I think accumulation makes sense for analysis, but it means the import script has to be cleverer than the current one.

E.g. in the calendar table, every GTFS service only has future dates in its start_date field. The import script will have to recognise that such a record is the same old service and reuse the existing service record.
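
For example, a sketch of that check (columns follow GTFS calendar.txt; the table layout and the "match on everything except start_date" rule are my assumptions):

```python
# Sketch: treat a calendar row as the "same old service" when everything
# except start_date matches an existing record, and skip the insert then.
DAYS = ("sunday", "monday", "tuesday", "wednesday",
        "thursday", "friday", "saturday")

def import_service(cur, row):
    cur.execute(
        "SELECT 1 FROM calendar WHERE service_id = %s AND end_date = %s AND "
        + " AND ".join(f"{d} = %s" for d in DAYS),
        (row["service_id"], row["end_date"], *(row[d] for d in DAYS)),
    )
    if cur.fetchone() is None:  # genuinely new or changed service
        cols = ("service_id", "start_date", "end_date") + DAYS
        cur.execute(
            f"INSERT INTO calendar ({', '.join(cols)}) "
            f"VALUES ({', '.join(['%s'] * len(cols))})",
            [row[c] for c in cols],
        )
```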

What do you think?

@nitzangur (Collaborator) commented

There is an option to exclude fields from the history table. If I get it right, excluding a field (e.g. the start_date field) means it won't create separate history records. Bottom line: the current import script (once it works) should do the job.
See: http://ebean-orm.github.io/docs/features/history
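
For readers outside Ebean, the effect can be pictured as a history trigger that ignores the excluded column. Below is a hand-rolled PostgreSQL equivalent, purely illustrative and not Ebean's actual generated SQL; all names and the column list are hypothetical:

```python
# Illustrative only: keep a history row on UPDATE unless the only change
# was to the excluded start_date column.
import psycopg2

DDL = """
CREATE TABLE IF NOT EXISTS calendar_history (LIKE calendar);

CREATE OR REPLACE FUNCTION calendar_record_history() RETURNS trigger AS $$
BEGIN
    -- Compare every column except the excluded start_date
    -- (remaining day-of-week columns elided for brevity).
    IF (OLD.service_id, OLD.end_date, OLD.sunday)
       IS DISTINCT FROM
       (NEW.service_id, NEW.end_date, NEW.sunday) THEN
        INSERT INTO calendar_history SELECT OLD.*;
    END IF;
    RETURN NEW;
END;
$$ LANGUAGE plpgsql;

CREATE TRIGGER calendar_history_trg
BEFORE UPDATE ON calendar
FOR EACH ROW EXECUTE PROCEDURE calendar_record_history();
"""

conn = psycopg2.connect(dbname="gtfs")  # connection parameters are placeholders
with conn, conn.cursor() as cur:
    cur.execute(DDL)
conn.close()
```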

@efratoio (Collaborator) commented Jul 25, 2016

Hi,
There are some issues preventing the insertion into Postgres from completing.

  1. The CREATE TABLE statements don't end with ";", so they cannot be executed from the schema.sql file. Is this intended?
  2. "routes", "calendar", and some other tables work, but:
    a. shapes has a "shape_dist_traveled" column that does not appear in the file
    b. in stop_times, "trip_id" is an integer, but the actual values look like "20132304_260516"
    c. the same goes for trips and trip_id
    d. the "parent_station" field in stops is empty, so it cannot be converted to an integer

Note this project - they created a Postgres schema for GTFS and used a character datatype for trip_id:

https://github.com/jedhorne/py-gtfs-postgres/blob/master/schema/gtfs_schema.create.sql
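
Following that example, the fixes for 2b-2d might be as simple as the DDL below (a sketch; the exact table and column names in our schema are assumed):

```python
# Sketch: trip_id as text rather than integer, and parent_station nullable.
import psycopg2

FIXES = """
ALTER TABLE stop_times ALTER COLUMN trip_id TYPE TEXT USING trip_id::text;
ALTER TABLE trips      ALTER COLUMN trip_id TYPE TEXT USING trip_id::text;
ALTER TABLE stops      ALTER COLUMN parent_station DROP NOT NULL;
"""

conn = psycopg2.connect(dbname="gtfs")  # connection parameters are placeholders
with conn, conn.cursor() as cur:
    cur.execute(FIXES)
conn.close()
```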

@daphshez (Collaborator, Author) commented

I tested, cleaned up and documented the code written by @efratoio & @nitzangur.

I am happy to say that it works, and we now have the sample GTFS data in the obus database on our server! Thanks to everyone who worked on this issue!

There's a readme file that documents how to run it.

While this works, there is a performance issue. In my tests on the server it took about 1 second per 1,000 records. The stop_times table in recent GTFS files has ~20M records, so inserting it would take hours. I think the key is to do some kind of batch insert rather than inserting records one by one. I am going to open a separate issue for that (a sketch of the idea follows).
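
For reference, the batching idea with psycopg2's execute_values helper, which sends many rows per statement (paths, columns, and batch size are assumptions); COPY via copy_expert would likely be faster still:

```python
# Batch-insert stop_times in chunks instead of one INSERT per record.
import csv
import itertools
import psycopg2
from psycopg2.extras import execute_values

BATCH = 10000  # rows per round trip; tune to taste

conn = psycopg2.connect(dbname="gtfs")  # connection parameters are placeholders
with conn, conn.cursor() as cur, \
        open("gtfs_data/stop_times.txt", newline="", encoding="utf-8-sig") as f:
    reader = csv.DictReader(f)
    while True:
        chunk = list(itertools.islice(reader, BATCH))
        if not chunk:
            break
        execute_values(
            cur,
            "INSERT INTO stop_times "
            "(trip_id, arrival_time, departure_time, stop_id, stop_sequence) VALUES %s",
            [(r["trip_id"], r["arrival_time"], r["departure_time"],
              r["stop_id"], r["stop_sequence"]) for r in chunk],
            page_size=BATCH,
        )
conn.close()
```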
