Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

allow importing into a custom DB schema #9

Open
dancesWithCycles opened this issue Feb 12, 2024 · 7 comments
Open

allow importing into a custom DB schema #9

dancesWithCycles opened this issue Feb 12, 2024 · 7 comments
Labels
enhancement New feature or request help wanted Extra attention is needed

Comments

@dancesWithCycles
Copy link
Contributor

dancesWithCycles commented Feb 12, 2024

Edit @derhuerst:
The gist of this thread is here: #9 (comment)
gtfs-via-postgres's ability to import into a custom schema should be exposed, and match-gtfs-rt-to-gtfs should be adapted to support this too. Then, a schema switch can be added in here.

depends on derhuerst/match-gtfs-rt-to-gtfs#8


Hi folks,
I am wondering if someone has already successfully replicated this repository using the --schema switch.

I am failing with this feedback.

$ time npm run build

> berlin-gtfs-rt-server@4.0.1 build
> ./build.sh && npm run docs

+ wget --compression auto -r --no-parent --no-directories -R .csv.gz -P gtfs -N https://vbb-gtfs.jannisr.de/latest/
--2024-02-12 13:08:57--  https://vbb-gtfs.jannisr.de/latest/
Resolving vbb-gtfs.jannisr.de (vbb-gtfs.jannisr.de)... 2001:41d0:701:1100::26d, 54.37.75.136
Connecting to vbb-gtfs.jannisr.de (vbb-gtfs.jannisr.de)|2001:41d0:701:1100::26d|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: unspecified [text/html]
Saving to: ‘gtfs/index.html’

index.html                                 [ <=>                                                                         ]     624  --.-KB/s    in 0s      

Last-modified header missing -- time-stamps turned off.
2024-02-12 13:08:57 (6.35 MB/s) - ‘gtfs/index.html’ saved [3222]

Loading robots.txt; please ignore errors.
--2024-02-12 13:08:57--  https://vbb-gtfs.jannisr.de/robots.txt
...

--2024-02-12 13:08:57--  https://vbb-gtfs.jannisr.de/latest/stops.csv
Reusing existing connection to [vbb-gtfs.jannisr.de]:443.
HTTP request sent, awaiting response... 304 Not Modified
File ‘gtfs/stops.csv’ not modified on server. Omitting download.

...

FINISHED --2024-02-12 13:08:57--
Total wall clock time: 0.4s
Downloaded: 1 files, 624 in 0s (6.35 MB/s)
+ env
+ grep '^PG'
...
+ NODE_ENV=production
+ node_modules/.bin/gtfs-to-sql -d --schema vbb --trips-without-shape-id --routes-without-agency-id -- gtfs/agency.csv gtfs/calendar.csv gtfs/calendar_dates.csv gtfs/frequencies.csv gtfs/routes.csv gtfs/stop_times.csv gtfs/stops.csv gtfs/transfers.csv gtfs/trips.csv
+ psql -b
+ sponge
is_valid_lang_code
is_timezone
calendar
  processed 2892 rows
calendar_dates
  processed 97563 rows
service_days
routes
  processed 1271 rows
trips
  processed 241801 rows
frequencies
  processed 0 rows
agency
  processed 33 rows
stops
  processed 41413 rows
stop_times
  processed 5748473 rows
transfers
  processed 53855 rows
NOTICE:  extension "postgis" already exists, skipping
CREATE EXTENSION
CREATE SCHEMA
BEGIN
CREATE FUNCTION
DO
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE TYPE
CREATE TABLE
COPY 2892
CREATE TYPE
CREATE TABLE
COPY 97563
CREATE INDEX
CREATE INDEX
SELECT 269243
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE TYPE
CREATE TABLE
COPY 1271
CREATE INDEX
CREATE TYPE
CREATE TYPE
CREATE TABLE
COPY 241801
CREATE TYPE
CREATE TABLE
COPY 0
CREATE INDEX
CREATE INDEX
CREATE TABLE
COPY 33
CREATE TYPE
CREATE TYPE
CREATE TABLE
COPY 41413
ALTER TABLE
CREATE INDEX
CREATE TYPE
CREATE TYPE
CREATE TABLE
COPY 5748473
CREATE INDEX
CREATE INDEX
UPDATE 5748473
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE VIEW
CREATE FUNCTION
CREATE VIEW
CREATE FUNCTION
CREATE TYPE
CREATE TABLE
ALTER TABLE
COPY 53855
COMMIT
+++ realpath ./build.sh
++ dirname /home/pacs/sib00/users/gtfsr_srv_vbb/git/berlin-gtfs-rt-server/build.sh
+ lib=/home/pacs/sib00/users/gtfsr_srv_vbb/git/berlin-gtfs-rt-server/lib
+ NODE_ENV=production
+ node_modules/.bin/build-gtfs-match-index /home/pacs/sib00/users/gtfsr_srv_vbb/git/berlin-gtfs-rt-server/lib/hafas-info.js /home/pacs/sib00/users/gtfsr_srv_vbb/git/berlin-gtfs-rt-server/lib/gtfs-info.js
+ psql -b
+ sponge
stops
error: relation "stops" does not exist
    at Parser.parseErrorMessage (/home/pacs/sib00/users/gtfsr_srv_vbb/git/berlin-gtfs-rt-server/node_modules/pg-protocol/dist/parser.js:287:98)
    at Parser.handlePacket (/home/pacs/sib00/users/gtfsr_srv_vbb/git/berlin-gtfs-rt-server/node_modules/pg-protocol/dist/parser.js:126:29)
    at Parser.parse (/home/pacs/sib00/users/gtfsr_srv_vbb/git/berlin-gtfs-rt-server/node_modules/pg-protocol/dist/parser.js:39:38)
    at Socket.<anonymous> (/home/pacs/sib00/users/gtfsr_srv_vbb/git/berlin-gtfs-rt-server/node_modules/pg-protocol/dist/index.js:11:42)
    at Socket.emit (node:events:517:28)
    at addChunk (node:internal/streams/readable:368:12)
    at readableAddChunk (node:internal/streams/readable:341:9)
    at Readable.push (node:internal/streams/readable:278:10)
    at TCP.onStreamRead (node:internal/stream_base_commons:190:23) {
  length: 105,
  severity: 'ERROR',
  code: '42P01',
  detail: undefined,
  hint: undefined,
  position: '297',
  internalPosition: undefined,
  internalQuery: undefined,
  where: undefined,
  schema: undefined,
  table: undefined,
  column: undefined,
  dataType: undefined,
  constraint: undefined,
  file: 'parse_relation.c',
  line: '1392',
  routine: 'parserOpenTable'
}
unrecognized value "on;" for "ON_ERROR_STOP": Boolean expected
BEGIN
ERROR:  relation "stops" does not exist
STATEMENT:  CREATE TABLE stops_stable_ids (
	stop_id TEXT NOT NULL,
	FOREIGN KEY (stop_id) REFERENCES stops,
	stable_id TEXT NOT NULL,
	specificity INTEGER NOT NULL
);
ERROR:  current transaction is aborted, commands ignored until end of transaction block
STATEMENT:  COPY stops_stable_ids FROM STDIN csv;

I do not observe this behavior when I omit the --schema switch.

Cheers!

@dancesWithCycles
Copy link
Contributor Author

This is odd. When I am using the repository straight like this

npm i
./cli.js --schema <schema> --trips-without-shape-id --routes-without-agency-id -- gtfs/*.csv | sponge | psql -b -h <host> -p <port> -U <user> -d <database> > log.txt 2>&1

the schema switch works perfectly fine.

@dancesWithCycles
Copy link
Contributor Author

I might have found the cause of my struggle. The GTFS Schedule feed is successfully imported as you can see from this log file.

less log-gtfs-to-sql.txt:

NOTICE:  extension "postgis" already exists, skipping
CREATE EXTENSION
CREATE SCHEMA
BEGIN
CREATE FUNCTION
DO
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE TYPE
CREATE TABLE
COPY 2892
CREATE TYPE
CREATE TABLE
COPY 97563
CREATE INDEX
CREATE INDEX
SELECT 269243
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE TYPE
CREATE TABLE
COPY 1271
CREATE INDEX
CREATE TYPE
CREATE TYPE
CREATE TABLE
COPY 241801
CREATE TYPE
CREATE TABLE
COPY 0
CREATE INDEX
CREATE INDEX
CREATE TABLE
COPY 33
CREATE TABLE
COPY 201
CREATE TYPE
CREATE TYPE
CREATE TABLE
COPY 41413
ALTER TABLE
CREATE INDEX
CREATE TYPE
CREATE TABLE
COPY 119239
CREATE TABLE
COPY 5282696
CREATE INDEX
CREATE INDEX
CREATE VIEW
CREATE TYPE
CREATE TYPE
CREATE TABLE
COPY 5748473
CREATE INDEX
CREATE INDEX
UPDATE 5748473
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE INDEX
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE FUNCTION
CREATE VIEW
CREATE FUNCTION
CREATE VIEW
CREATE FUNCTION
CREATE TYPE
CREATE TABLE
ALTER TABLE
COPY 53855
COMMIT

However, the node_modules/.bin/build-gtfs-match-index does not find the imported feed as it is not looking in the respective schema.

less log-build-gtfs-match-index.txt:

unrecognized value "on;" for "ON_ERROR_STOP": Boolean expected
BEGIN
ERROR:  relation "stops" does not exist
STATEMENT:  CREATE TABLE stops_stable_ids (
        stop_id TEXT NOT NULL,
        FOREIGN KEY (stop_id) REFERENCES stops,
        stable_id TEXT NOT NULL,
        specificity INTEGER NOT NULL
);
ERROR:  current transaction is aborted, commands ignored until end of transaction block
STATEMENT:  COPY stops_stable_ids FROM STDIN csv;

Am I right or am I right? ;-)

@derhuerst
Copy link
Owner

derhuerst commented Feb 13, 2024

The GTFS Schedule feed is successfully imported […]. […] However, the node_modules/.bin/build-gtfs-match-index does not find the imported feed as it is not looking in the respective schema.

Yes, I think that's correct!

build-gtfs-match-index comes from hafas-gtfs-rt-feed, which just calls match-gtfs-rt-to-gtfs's build-index, which currently doesn't support configuring a schema.

@derhuerst
Copy link
Owner

I'm wondering though: What's your use case for using a schema other than public? berlin-gtfs-rt-server (or rather hafas-gtfs-rt-feed, because berlin-gtfs-rt-server is just a thin shell around it) currently assumes that the PostgreSQL DB is exclusively used by it.

@derhuerst derhuerst added the question Further information is requested label Feb 13, 2024
@dancesWithCycles
Copy link
Contributor Author

dancesWithCycles commented Feb 13, 2024

My hosting is my use case!

I am using a Managed Server where I do not want to drop and create a database every time I update the GTFS feed. I rather drop and create the respective schema.

An alternative approach would be a script that cleans up an existing database without dropping it so that the update happens on a clean database.

At the end of the day I need to make sure to prepare a fresh environment for the GTFS feed import into PostgreSQL without dropping the database. How would you do it?

@derhuerst
Copy link
Owner

At the end of the day I need to make sure to prepare a fresh environment for the GTFS feed import into PostgreSQL without dropping the database. How would you do it?

I have created public-transport/gtfs-via-postgres#57 to discuss this further, because this topic is not specific to berlin-gtfs-rt-server, and because there have been similar discussions in the gtfs-via-postgres Issues before.

@derhuerst
Copy link
Owner

FYI: I'll rename this Issue to be a feature request.

@derhuerst derhuerst changed the title Question: Has anyone successfully used the --schema switch? allow importing into a custom DB schema Feb 16, 2024
@derhuerst derhuerst added enhancement New feature or request help wanted Extra attention is needed and removed question Further information is requested labels Feb 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request help wanted Extra attention is needed
Projects
None yet
Development

No branches or pull requests

2 participants