Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unify and optimize limit and offset serialization #602

Closed
timowest opened this issue Dec 19, 2013 · 9 comments
Closed

Unify and optimize limit and offset serialization #602

timowest opened this issue Dec 19, 2013 · 9 comments
Milestone

Comments

@timowest
Copy link
Member

@timowest timowest commented Dec 19, 2013

Unify and optimize limit and offset serialization

Related
https://groups.google.com/forum/#!topic/jooq-user/G9Op6cQwMkY
http://www.inf.unideb.hu/~gabora/pagination/results.html

timowest added a commit that referenced this issue Dec 19, 2013
@timowest
Copy link
Member Author

@timowest timowest commented Dec 19, 2013

Oracle, SQL Server and Teradata need special treatment of the pagination. Here are the current serializations

OracleTemplates

  • limit
select * from (
  select EMPLOYEE.ID
  from EMPLOYEE EMPLOYEE
  order by EMPLOYEE.FIRSTNAME asc
) where rownum <= ?
  • offset
select * from (
 select a.*, rownum rn from (
  select EMPLOYEE.ID
  from EMPLOYEE EMPLOYEE
  order by EMPLOYEE.FIRSTNAME asc
 ) a) where rn > ?
  • limit and offset
select * from (
 select a.*, rownum rn from (
  select EMPLOYEE.ID
  from EMPLOYEE EMPLOYEE
  order by EMPLOYEE.FIRSTNAME asc
 ) a) where rn > 10 and rownum <= 10

SQLServer2005Templates

  • limit
select top (?) EMPLOYEE.ID
from EMPLOYEE EMPLOYEE
order by EMPLOYEE.FIRSTNAME asc
  • offset
with inner_query as 
(
  select EMPLOYEE.ID, row_number() over (order by EMPLOYEE.FIRSTNAME) as row_number
  from EMPLOYEE EMPLOYEE
)
select * 
from inner_query
where row_number > ?
  • limit and offset
with inner_query as 
(
  select EMPLOYEE.ID, row_number() over (order by EMPLOYEE.FIRSTNAME) as row_number
  from EMPLOYEE EMPLOYEE
)
select * 
from inner_query
where row_number > ? and row_number <= ?

TeradataTemplates

  • limit
select EMPLOYEE.ID
from EMPLOYEE EMPLOYEE
order by EMPLOYEE.FIRSTNAME asc
qualify row_number() over (order by EMPLOYEE.FIRSTNAME asc) <= ?
  • offset
select EMPLOYEE.ID
from EMPLOYEE EMPLOYEE
order by EMPLOYEE.FIRSTNAME asc
qualify row_number() over (order by EMPLOYEE.FIRSTNAME asc) > ?
  • limit and offset
select EMPLOYEE.ID
from EMPLOYEE EMPLOYEE
order by EMPLOYEE.FIRSTNAME asc
qualify row_number() over (order by EMPLOYEE.FIRSTNAME asc) between ? and ?
timowest added a commit that referenced this issue Dec 19, 2013
@mangolas
Copy link
Contributor

@mangolas mangolas commented Dec 20, 2013

Seek option could be worthwile to check out, maybe optional pattern for certain use cases? http://stackoverflow.com/questions/109232/what-is-the-best-way-to-paginate-results-in-sql-server/19609938#19609938,

@mangolas
Copy link
Contributor

@mangolas mangolas commented Dec 20, 2013

SqlServer could also use TOP in the limit-offset variant. But it should be investigated whether it has any speed advantage or not.

Also testing if it makes any difference to sort inner query already in the SqlServer example.

@mangolas
Copy link
Contributor

@mangolas mangolas commented Dec 20, 2013

And then there is question about deterministic results. This below does not necessarily return same results every time:

    select * from (
select EMPLOYEE.ID
from EMPLOYEE EMPLOYEE
order by EMPLOYEE.FIRSTNAME asc
) where rownum <= ?

One example in Oracle to return results deterministically:

select * from ( 
  select ID, row_number() over (order by FIRSTNAME asc, ID) rn
  from EMPLOYEE
 ) where rn between ? and ? 
 order by rn

And in Oracle optimizer hint could be added, just in case

/*+ first_rows(25) */
@mangolas
Copy link
Contributor

@mangolas mangolas commented Dec 20, 2013

If there is already analytical function in use, it should not be much speed penalty to add the total number of results already to the paginating query. So optionally one could get rid of using two queries, one to count the total and one to get paginated results.

count(*) over () total_count
timowest added a commit that referenced this issue Jan 4, 2014
@timowest
Copy link
Member Author

@timowest timowest commented Jan 4, 2014

Improvements for SQLServer2005Templates

SQLServer2005Templates

  • limit
select top (?) EMPLOYEE.ID
from EMPLOYEE EMPLOYEE
order by EMPLOYEE.FIRSTNAME asc
  • offset
select * from (
  select EMPLOYEE.ID, row_number() over (order by EMPLOYEE.FIRSTNAME) as rn
from EMPLOYEE EMPLOYEE) a where rn > ? order by rn
  • limit and offset
select * from (
  select EMPLOYEE.ID, row_number() over (order by EMPLOYEE.FIRSTNAME) as rn
from EMPLOYEE EMPLOYEE) a where rn > ? and rn <= ? order by rn
@timowest
Copy link
Member Author

@timowest timowest commented Jan 4, 2014

Seek option could be worthwile to check out, maybe optional pattern for certain use cases? http://stackoverflow.com/questions/109232/what-is-the-best-way-to-paginate-results-in-sql-server/19609938#19609938,

Yes, maybe via an additional issue.

SqlServer could also use TOP in the limit-offset variant. But it should be investigated whether it has any speed advantage or not.

How would you do it?

Also testing if it makes any difference to sort inner query already in the SqlServer example.

That's not possible. SQL Server forbids using order by in subqueries.

If there is already analytical function in use, it should not be much speed penalty to add the total number of results already to the paginating query. So optionally one could get rid of using two queries, one to count the total and one to get paginated results.

Good point. That's already done #556 ;)

@timowest
Copy link
Member Author

@timowest timowest commented Feb 8, 2014

Released in 3.3.1

@timowest timowest closed this Feb 8, 2014
@timowest timowest added this to the 3.3.1 milestone Apr 13, 2014
@thekalinga
Copy link

@thekalinga thekalinga commented Nov 19, 2018

@timowest Is seek implemented in querydsl? If not, any plans on including this as this gives the best performance if we dont need offset based access (for 95%+ of cases, we dont need offset based access anyways)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
3 participants