Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Slow performance on selects with joins #171

Closed
MikaMikuMik opened this issue May 30, 2017 · 5 comments
Closed

Slow performance on selects with joins #171

MikaMikuMik opened this issue May 30, 2017 · 5 comments
Labels

Comments

@MikaMikuMik
Copy link

Hi, hope u can do me advice in my problem.

I use oracle fdw 1.5.0, Pg 9.6.3 server, and Oracle client 11.2.0.4.0 client and server.
I also build pg and oracle_fdw from sources on Linux Suse 13.
For doing queries I use pgAdmin III 1.22.1.

I do simple query:

select skv.*
  from myschema.fdw_tbl_01 skv
 inner join myschema.fdw_tbl_02 sp_pl
    on sp_pl.hskv = skv.hskv

It works well, around 12 seconds and about 70.000 records.
I add another table:

select skv.*
  from myschema.fdw_tbl_01 skv
 inner join myschema.fdw_tbl_02 sp_pl
    on sp_pl.hskv = skv.hskv
 inner join myschema.fdw_tbl_03 c
    on c.hskv = skv.hskv
   and c.pl = sp_pl.pl

And this one work very very slow, right now it 10+ minutes and this still works...
The result also about 75.000 records.

@laurenz
Copy link
Owner

laurenz commented May 30, 2017

Did you ANALYZE the foreign tables?

When you join several foreign tables, the tables are pulled into PostgreSQL and the join is performed there.

You can define a foreign table on a join by using a query in parentheses for the table option like this:

OPTIONS (table '(SELECT ... FROM tbl1 JOIN tbl2 ON ...)')

Then the join is performed on the Oracle side, and the query might be more efficient.

@MikaMikuMik
Copy link
Author

Nope. I did it right now, and now it works 15 secords for second query, good!
Tables pulled into pg for every join?
Now going to test another big join, interesting what is does...

@fernandorb10
Copy link

Did you ANALYZE the foreign tables?

When you join several foreign tables, the tables are pulled into PostgreSQL and the join is performed there.

You can define a foreign table on a join by using a query in parentheses for the table option like this:

OPTIONS (table '(SELECT ... FROM tbl1 JOIN tbl2 ON ...)')

Then the join is performed on the Oracle side, and the query might be more efficient.

Hi Laurez,

How can I join tables in the options you indicate in the comment I mention in this answer?

@darold
Copy link

darold commented May 17, 2021

@fernandorb10 you can also try to disable hash and merge join sometime I found that it can "help" PG planner to send the full query to the remote server. Not in this case with oracle_fdw but for example:

contrib_regression=# SET enable_hashjoin to off;
SET
contrib_regression=# SET enable_mergejoin to off;
SET

contrib_regression=# explain verbose select a.* from import.employees a join import.departments b on (a.department_id=b.department_id) join import.jobs c on (a.job_id = c.job_id);
                                                                                                                                                                           QUERY PLAN                                                                                                                                                                            
-----------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
 Nested Loop  (cost=20000.00..220355.17 rows=106 width=70)
   Output: a.employee_id, a.first_name, a.last_name, a.email, a.phone_number, a.hire_date, a.job_id, a.salary, a.commission_pct, a.manager_id, a.department_id
   Join Filter: ((a.job_id)::text = (c.job_id)::text)
   ->  Foreign Scan on import.jobs c  (cost=10000.00..10190.00 rows=19 width=7)
         Output: c.job_id, c.job_title, c.min_salary, c.max_salary
         Oracle query: SELECT /*dd836855365407a174f39898b61e9457*/ r4."JOB_ID" FROM "HR"."JOBS" r4
         Oracle plan: SELECT STATEMENT
         Oracle plan:   INDEX FULL SCAN JOB_ID_PK
   ->  Foreign Scan  (cost=10000.00..11060.00 rows=106 width=70)
         Output: a.employee_id, a.first_name, a.last_name, a.email, a.phone_number, a.hire_date, a.job_id, a.salary, a.commission_pct, a.manager_id, a.department_id
         Oracle query: SELECT /*caaab2c65cf19fe81be3f5f4d10dede2*/ r1."EMPLOYEE_ID", r1."FIRST_NAME", r1."LAST_NAME", r1."EMAIL", r1."PHONE_NUMBER", r1."HIRE_DATE", r1."JOB_ID", r1."SALARY", r1."COMMISSION_PCT", r1."MANAGER_ID", r1."DEPARTMENT_ID" FROM ("HR"."EMPLOYEES" r1 INNER JOIN "HR"."DEPARTMENTS" r2 ON (r1."DEPARTMENT_ID" = r2."DEPARTMENT_ID"))
         Oracle plan: SELECT STATEMENT
         Oracle plan:   TABLE ACCESS FULL EMPLOYEES  (filter "R1"."DEPARTMENT_ID" IS NOT NULL)
 JIT:
   Functions: 2
   Options: Inlining false, Optimization false, Expressions true, Deforming true
(16 rows)

At least one JOIN have been pushed down :-D I have found that on some FDW the entire statement is executed remotely. Of course if the original plan doesn't use hash or merge join this is useless.

Note that it doesn't mean that it will be faster but you could try. Otherwise the only valuable solution for you is what @laurenz mentioned, create a foreign table that call your query, for example:

CREATE FOREIGN TABLE myschema.fdw_tbl_01_bis (
[...]
) OPTIONS (table '(select skv.* from myschema.fdw_tbl_01 skv inner join myschema.fdw_tbl_02 sp_pl on sp_pl.hskv = skv.hskv
 inner join myschema.fdw_tbl_03 c on c.hskv = skv.hskv and c.pl = sp_pl.pl');

then use SELECT * FROM myschema.fdw_tbl_01_bis; instead.

@laurenz
Copy link
Owner

laurenz commented May 17, 2021

@fernandorb10, please start a new issue for your question.

You provide too little information for a meaningful answer: What is unclear about my comment? If you need it spelled out for your particular use case, please show the table definitions and your query.

Repository owner locked as resolved and limited conversation to collaborators May 17, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
Projects
None yet
Development

No branches or pull requests

4 participants