Bootstrap very slow when there is an irrelevant schema with many tables #150

Open
d2a-raudenaerde opened this issue Aug 4, 2021 · 2 comments

Comments

@d2a-raudenaerde

PGSync version: 2.1.1

Postgres version: 12.5

Python version: 3.8

Problem Description:

I am running bootstrap on a database with 3 schemas, one of which has more than 1000 tables. I only want to sync a different, custom schema, but the code inspects ALL tables in ALL schemas, which is very slow and, I think, unnecessary. When I modify the code to inspect only the specific schema, bootstrap speeds up dramatically.

Lines 168 and 205 of sync.py seem relevant.

@toluaina
Owner

toluaina commented Aug 9, 2021

pgsync is supposed to work across database schemas, so inspecting every schema supports existing functionality.
That said, there is probably a more efficient way of skipping non-relevant tables.
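
One possible way to skip non-relevant tables is to inspect only the schemas that the sync document actually references. The following is a minimal sketch, not pgsync's implementation. It assumes each node is a dict with an optional "schema" key (falling back to "public") and an optional "children" list; the helper name referenced_schemas and the sample data are hypothetical.

```python
from typing import Any, Dict, Set

# Assumption: a node without an explicit "schema" key lives in "public".
DEFAULT_SCHEMA = "public"


def referenced_schemas(node: Dict[str, Any]) -> Set[str]:
    """Collect every schema name used by a node and all of its descendants."""
    schemas = {node.get("schema", DEFAULT_SCHEMA)}
    for child in node.get("children", []):
        schemas |= referenced_schemas(child)
    return schemas


# Hypothetical node tree: it touches the "app" and "public" schemas only.
nodes = {
    "table": "book",
    "schema": "app",
    "children": [
        {"table": "author", "schema": "app"},
        {"table": "country"},  # no "schema" key, so the default applies
    ],
}

relevant = referenced_schemas(nodes)

# Hypothetical list of every schema in the database, including the large one.
all_schemas = ["app", "public", "legacy_with_1000_tables"]

# Only schemas that the node tree references would be inspected.
to_inspect = [name for name in all_schemas if name in relevant]
```

With this filter, the schema holding more than 1000 tables would never be reflected unless a node refers to it, while cross-schema documents would keep working.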

@loren
Contributor

loren commented Sep 15, 2022

I am running into a similar problem, as I have a database with many schemas but only one schema relevant to pgsync and accessible by a service account. When pgsync tries to create the table_notify() function, it errors with InsufficientPrivilege.

One solution would be an optional top-level field, schemas, that lets us specify which schemas pgsync will interact with. If the field is missing, it defaults to the current behavior: self.__schemas = sa.inspect(self.engine).get_schema_names()
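
The fallback described above could look roughly like this. It is a sketch under assumptions: the config is taken to be a plain dict, resolve_schemas is a hypothetical helper, and discovery is passed in as a callable so the database is only queried when no schemas are configured. In pgsync the callable would presumably wrap sa.inspect(self.engine).get_schema_names().

```python
from typing import Any, Callable, Dict, List


def resolve_schemas(
    config: Dict[str, Any],
    discover: Callable[[], List[str]],
) -> List[str]:
    """Return the configured schemas, or fall back to discovering all of them."""
    configured = config.get("schemas")
    # A missing or empty "schemas" field keeps the current behavior.
    if configured:
        return list(configured)
    return discover()


# Stand-in for sa.inspect(engine).get_schema_names(); records whether it ran.
calls = []


def fake_discover() -> List[str]:
    calls.append("discover")
    return ["app", "public", "restricted"]


explicit = resolve_schemas({"schemas": ["app"]}, fake_discover)
calls_after_explicit = len(calls)  # discovery is skipped when schemas are set

fallback = resolve_schemas({}, fake_discover)
```

Because discovery never runs when the field is set, a service account limited to one schema would not touch the others during this step. Whether that alone avoids the InsufficientPrivilege error depends on where else pgsync iterates over schemas.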

