Load CSV to Postgres very slow #87

Closed
xqin1 opened this issue May 7, 2022 · 5 comments
xqin1 commented May 7, 2022

When loading a CSV file into Postgres, the file stream seems to limit how many records are inserted into the PG table at a time.

The code:

    let {cmd, text, details} = await ogr2ogr(filePath, {
        format: 'PostgreSQL',
        options: ['-nln', 'test', '-gt', 2000000],
        destination: connection
    });

On the PostgreSQL side, the following statement is executed many times, and each execution inserts only a small number of records:

COPY "test" ("wkb_geometry", "filer_id", "filer_desc") FROM STDIN;

This happens with V3 only; I don't have the issue with V2. Are there any settings specific to PostgreSQL?

Author

xqin1 commented May 16, 2022

I think the issue is that the -skipfailures option is set by default. With this option, the -gt option no longer takes effect.
From the ogr2ogr documentation:

When writing into transactional DBMS (SQLite/PostgreSQL,MySQL, etc…), it might be beneficial to increase the number of INSERT statements executed between BEGIN TRANSACTION and COMMIT TRANSACTION statements. This number is specified with the -gt option. For example, for SQLite, explicitly defining -gt 65536 ensures optimal performance while populating some table containing many hundreds of thousands or millions of rows. However, note that -skipfailures overrides -gt and sets the size of transactions to 1.

Is it possible to remove -skipfailures as default option?
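
Until the default can be turned off, one possible workaround is to call the ogr2ogr binary directly so that -skipfailures is never added and -gt takes effect. The sketch below is only an illustration: `buildCsvToPgArgs` and `loadCsv` are hypothetical helper names, not part of the ogr2ogr Node package, and it assumes the `ogr2ogr` executable is on the PATH and that `connection` is a GDAL PostgreSQL connection string.

```javascript
// Sketch only: bypasses the Node wrapper and runs the ogr2ogr CLI directly.
// buildCsvToPgArgs and loadCsv are hypothetical names used for illustration.
const { execFile } = require('node:child_process');

// Build the CLI arguments without -skipfailures so that -gt is honored.
function buildCsvToPgArgs(filePath, connection, table, groupSize) {
  return [
    '-f', 'PostgreSQL',
    connection,               // destination datasource comes first
    filePath,                 // source CSV comes second
    '-nln', table,            // name of the target table
    '-gt', String(groupSize), // features per transaction
  ];
}

// Run ogr2ogr and resolve with its output; rejects if the process fails.
function loadCsv(filePath, connection, table, groupSize) {
  const args = buildCsvToPgArgs(filePath, connection, table, groupSize);
  return new Promise((resolve, reject) => {
    execFile('ogr2ogr', args, (err, stdout, stderr) => {
      if (err) reject(err);
      else resolve({ stdout, stderr });
    });
  });
}
```

Note that without -skipfailures a single bad row aborts the load, so this trades fault tolerance for larger transactions.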

Owner

wavded commented Jun 9, 2022

-skipfailures is very commonly used, so it is enabled by default; historically it has been the option most users need to get the results they want. That said, we don't have a way to turn it off, and we should for cases like this. I wonder if we should add a disableSkipFailures option.
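
To make the idea concrete, here is a rough sketch of how such a flag might gate the default argument. This is not the library's actual code: `buildArgs` is a hypothetical function, `disableSkipFailures` is only the name proposed above, and the argument order simply follows the ogr2ogr CLI convention of destination before source.

```javascript
// Hypothetical sketch of the proposed option; not the package's real implementation.
function buildArgs(input, opts = {}) {
  const {
    format = 'GeoJSON',          // assumed default, for illustration only
    destination,
    options = [],
    disableSkipFailures = false, // proposed flag: current behavior stays the default
  } = opts;

  const args = ['-f', format];
  // Only add -skipfailures when the caller has not opted out.
  if (!disableSkipFailures) args.push('-skipfailures');
  args.push(destination, input);
  // Stringify user options so numeric values such as 2000000 are passed as text.
  args.push(...options.map(String));
  return args;
}
```

With `disableSkipFailures: true`, a call using the options from the original report would yield arguments containing `-gt 2000000` and no `-skipfailures`, which is what lets the larger transaction size apply.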

@github-actions

This issue is stale because it has been open 30 days with no activity. Remove stale label or comment or this will be closed in 5 days.

@github-actions github-actions bot added the stale label Jul 10, 2022
@github-actions

This issue was closed because it has been stalled for 5 days with no activity.

Author

xqin1 commented Aug 16, 2022

Sorry for being late on this. A disableSkipFailures option would solve the issue. Is there any chance of implementing this feature?
