Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Bug: Pagy 8.4.0 is broken with MS SQL server #704

Closed
7 tasks done
erlingur opened this issue May 20, 2024 · 18 comments
Closed
7 tasks done

Bug: Pagy 8.4.0 is broken with MS SQL server #704

erlingur opened this issue May 20, 2024 · 18 comments

Comments

@erlingur
Copy link

erlingur commented May 20, 2024

馃憖 Before submitting...

  • I upgraded to pagy version 8.4.0
  • I searched through the Documentation
  • I searched through the Issues
  • I searched through the Q&A

馃 REQUIREMENTS

  • I am providing a VALID code file that confirms the bug
  • I am NOT posting any of the USELESS THINGS listed above
  • I am aware that this issue will be automatically closed if the code file is missing or INVALID

馃挰 Description

Hi there, I noticed an issue that a very recent commit just introduced when using Pagy with a MS SQL Server. Passing a 0 into .limit() does not work with MS SQL server. It will give the following error: The number of rows provided for a FETCH clause must be greater then zero.

This is because MS SQL server does not support FETCH NEXT 0 ROWS ONLY.

Docs: https://learn.microsoft.com/en-us/sql/t-sql/queries/select-order-by-clause-transact-sql?view=sql-server-ver16

Quote:

FETCH { FIRST | NEXT } { integer_constant | fetch_row_count_expression } { ROW | ROWS } ONLY
Specifies the number of rows to return after the OFFSET clause has been processed.
The value can be an integer constant or expression that is greater than or equal to one.

Here is a Rackup file as requested that demonstrates the issue. I can't spin up a SQL server for you in this file, sorry. You can launch it with a Docker command: sudo docker run -e "ACCEPT_EULA=Y" -e "MSSQL_SA_PASSWORD=Password.1" -p 1433:1433 --name sql1 --hostname sql1 -d mcr.microsoft.com/mssql/server:2022-latest

rails.ru:

# frozen_string_literal: true

# Starting point to reproduce rails related pagy issues

# DEV USAGE
#    pagy clone rails
#    pagy ./rails.ru

# URL
#    http://0.0.0.0:8000

# HELP
#    pagy -h

# DOC
#    https://ddnexus.github.io/pagy/playground/#2-rails-app

VERSION = '8.4.0'

# Gemfile
require 'bundler/inline'
require 'bundler'
Bundler.configure
gemfile(ENV['PAGY_INSTALL_BUNDLE'] == 'true') do
  source 'https://rubygems.org'
  gem 'oj'
  gem 'puma'
  gem 'rails'
  gem 'tiny_tds'
  gem 'activerecord-sqlserver-adapter'
end

# require 'rails/all'     # too much stuff
require 'action_controller/railtie'
require 'active_record'

OUTPUT = Rails.env.showcase? ? IO::NULL : $stdout

# Rails config
class PagyRails < Rails::Application # :nodoc:
  config.root = __dir__
  config.session_store :cookie_store, key: 'cookie_store_key'
  Rails.application.credentials.secret_key_base = 'absolute_secret'

  config.logger = Logger.new(OUTPUT)
  Rails.logger  = config.logger

  routes.draw do
    root to: 'comments#index'
    get '/javascript' => 'pagy#javascript'
  end
end

# AR config
dir = Rails.env.development? ? '.' : Dir.pwd  # app dir in dev or pwd otherwise
unless File.writable?(dir)
  warn "ERROR: directory #{dir.inspect} is not writable (the pagy-rails-app needs to create DB files)"
  exit 1
end

# Pagy initializer
require 'pagy/extras/pagy'
require 'pagy/extras/items'
require 'pagy/extras/overflow'
Pagy::DEFAULT[:size]     = [1, 4, 4, 1]
Pagy::DEFAULT[:items]    = 10
Pagy::DEFAULT[:overflow] = :empty_page
Pagy::DEFAULT.freeze

connection_options = {adapter: 'sqlserver', username: "sa", password: "Password.1", dataserver: "localhost"}
ActiveRecord::Base.establish_connection(connection_options)
database = 'pagy_repro'
begin
  ActiveRecord::Base.connection.create_database(database)
rescue ActiveRecord::StatementInvalid
  # already exists ...
end
ActiveRecord::Base.establish_connection(connection_options.merge(database: database))

ActiveRecord::Schema.define do
  create_table :posts, force: true do |t|
    t.string :title
  end

  create_table :comments, force: true do |t|
    t.string :body
    t.integer :post_id
  end
end

# Models
class Post < ActiveRecord::Base # :nodoc:
  has_many :comments
end # :nodoc:

class Comment < ActiveRecord::Base # :nodoc:
  belongs_to :post
end # :nodoc:

# Don't seed, it needs to have 0 records to reproduce the issue
# 1.upto(11) do |pi|
#   Post.create(title: "Post #{pi + 1}")
# end
# Post.all.each_with_index do |post, pi|
#   2.times { |ci| Comment.create(post:, body: "Comment #{ci + 1} to Post #{pi + 1}") }
# end

# Down here to avoid logging the DB seed above at each restart
ActiveRecord::Base.logger = Logger.new(OUTPUT)

# Helpers
module CommentsHelper
  include Pagy::Frontend
end

# Controllers
class CommentsController < ActionController::Base # :nodoc:
  include Rails.application.routes.url_helpers
  include Pagy::Backend

  def index
    @pagy, @comments = pagy(Comment.all)
    render inline: TEMPLATE
  end
end

# You don't need this in real rails apps (see https://ddnexus.github.io/pagy/docs/api/javascript/setup/#2-configure)
class PagyController < ActionController::Base
  def javascript
    file = "pagy#{'-dev' if ENV['DEBUG']}.js"
    render js: Pagy.root.join('javascripts', file).read
  end
end

run PagyRails

TEMPLATE = <<~ERB
  <!DOCTYPE html>
  <html lang="en">
    <html>
    <head>
    <title>Pagy Rails App</title>
      <script src="/javascript"></script>
      <script>
        window.addEventListener("load", Pagy.init);
      </script>
      <meta name="viewport" content="width=device-width, initial-scale=1.0">
      <style type="text/css">
        @media screen { html, body {
          font-size: 1rem;
          line-heigth: 1.2s;
          padding: 0;
          margin: 0;
        } }
        body {
          background: white !important;
          margin: 0 !important;
          font-family: sans-serif !important;
        }
        .content {
          padding: 1rem 1.5rem 2rem !important;
        }

        /* Quick demo for overriding the element style attribute of certain pagy helpers
        .pagy input[style] {
          width: 5rem !important;
        }
        */

        /*
          If you want to customize the style,
          please replace the line below with the actual file content
        */
        <%== Pagy.root.join('stylesheets', 'pagy.css').read %>
      </style>
    </head>

    <body>

      <div class="content">
        <h1>Pagy Rails App</h1>
        <p> Self-contained, standalone Rails app usable to easily reproduce any rails related pagy issue.</p>
        <p>Please, report the following versions in any new issue.</p>
        <h2>Versions</h2>
        <ul>
          <li>Ruby:  <%== RUBY_VERSION %></li>
          <li>Rack:  <%== Rack::RELEASE %></li>
          <li>Rails: <%== Rails.version %></li>
          <li>Pagy:  <%== Pagy::VERSION %></li>
        </ul>

        <h3>Collection</h3>
        <div id="records" class="collection">
        <% @comments.each do |comment| %>
          <p style="margin: 0;"><%= comment.body %></p>
        <% end %>
        </div>
        <hr>

        <h4>pagy_nav</h4>
        <%== pagy_nav(@pagy, id: 'nav', aria_label: 'Pages nav') %>

        <h4>pagy_nav_js</h4>
        <%== pagy_nav_js(@pagy, id: 'nav-js', aria_label: 'Pages nav_js') %>

        <h4>pagy_combo_nav_js</h4>
        <%== pagy_combo_nav_js(@pagy, id: 'combo-nav-js', aria_label: 'Pages combo_nav_js') %>

        <h4>pagy_items_selector_js</h4>
        <%== pagy_items_selector_js(@pagy, id: 'items-selector-js') %>

        <h4>pagy_info</h4>
        <%== pagy_info(@pagy, id: 'pagy-info') %>
      </div>

    </body>
  </html>
ERB

I've tried my best to make this as easily reproducible but because of the nature of the issue this is the best I could come up with. Hope it works.

@erlingur erlingur added the bug label May 20, 2024
@benkoshy
Copy link
Collaborator

benkoshy commented May 20, 2024

Thanks for your report.

I was trying to reproduce it with your chosen database and with the command specified:

`Database 'pagy_repro' does not exist. Make sure that the name is entered correctly. (TinyTds::Error)```

if you have the code handy, adding it in the above lines would help inordinately?

@ddnexus
Copy link
Owner

ddnexus commented May 20, 2024

I've tried my best to make this as easily reproducible but because of the nature of the issue this is the best I could come up with.

You certainly did @erlingur ! Thank you for your report.

That commits should have improved the performance of a probably quite rare case, and it has been added thinking that it wouldn't hurt anything else.

Unluckily that is almost never the case 馃し.

At worse we will switch back to pass the items and add a troubleshooting entry explaining how to improve the performance by overriding, only when really needed.

BTW, I didn't try to run your command, so you may still want to help @benkoshy to reproduce it.

Thank you

@ddnexus
Copy link
Owner

ddnexus commented May 20, 2024

@jules-w2

@jules-w2
Copy link

Hey @ddnexus

I'm not sur it's a "rare" case.
the case occurs on the last page when there are millions of entries in the database and the "limit" is greater than the results which remain to be displayed

I think it would be easy to solve all cases with:

collection.offset(pagy.offset).limit(pagy.in >聽0 ? pagy.in :  pagy.items )

Bests

@ddnexus
Copy link
Owner

ddnexus commented May 20, 2024

@joules-w2 about the rarity of the case: in practice we have no idea one way or another.

But my educated guess would suggest that it could be your own setup, app, DB type or settings or query, in combination with who knows what... even a bug in the particular DB version/driver/ORM. Because one thing is certain: there have been no other complaint EVER in 7 years so far.

And when there are so many possible combinations, no other complaints ever, no objective reproducible culprit that we can pin down, I am comfortable to consider that it IS a rare case until proven common.

So adding a condition for every request, of every app, forever! and without any objective measurable reason nor proof, also considering that the benefit is only to improve the performance of the very last page in DBs with zillions of records and the right condition and star alignments... it's quite against my religion 馃し

Of course, if you can prove that the change is worth it, I will be happy to change my mind.

@erlingur
Copy link
Author

@benkoshy @ddnexus Thank you for taking a look! I've update the rails.ru file so it should now create the database for you if it doesn't exist. It works on my machine so hopefully it should work on yours :)

@jules-w2
Copy link

@ddnexus

as soon as I find free time, I will look in depth at the origin of the slowness to understand exactly what is happening

@benkoshy
Copy link
Collaborator

benkoshy commented May 23, 2024

Thank you for taking a look! I've update the rails.ru file so it should now create the database for you if it doesn't exist. It works on my machine so hopefully it should work on yours :)

Excellent: I can confirm reproduction of the issue.

DEBUG -- :    (1.5ms)  IF @@TRANCOUNT > 0 ROLLBACK TRANSACTION
INFO -- : Started GET "/" for 127.0.0.1 at 2024-05-24 06:06:23 +1000
DEBUG -- :   Comment Count (1.4ms)  SELECT COUNT(*) FROM [comments]
DEBUG -- :   Comment Load (2.5ms)  EXEC sp_executesql N'SELECT [comments].* FROM [comments] ORDER BY [comments].[id] ASC OFFSET @0 ROWS FETCH NEXT @1 ROWS ONLY', N'@0 int, @1 int', @0 = 0, @1 = 0  [["OFFSET", nil], ["LIMIT", nil]]
F, [2024-05-24T06:06:23.977388 #405730] FATAL -- :   
ActionView::Template::Error (TinyTds::Error: The number of rows provided for a FETCH clause must be greater then zero.):
    54: 
    55:       <h3>Collection</h3>
    56:       <div id="records" class="collection">
    57:       <% @comments.each do |comment| %>
    58:         <p style="margin: 0;"><%= comment.body %></p>
    59:       <% end %>
    60:       </div>

@ddnexus
Copy link
Owner

ddnexus commented May 23, 2024

Thank you @benkoshy

In order to avoid any surprise, pease could you check also the other way around in the same environment, i.e. changing pagy.in back to pagy.items and trying the exact same query?

@benkoshy
Copy link
Collaborator

benkoshy commented May 23, 2024

Thank you @benkoshy

In order to avoid any surprise, pease could you check also the other way around in the same environment, i.e. changing pagy.in back to pagy.items and trying the exact same query?

As correctly predicted: using items simply works: even though there are no items in the collection. (I tested via overridding in the script rather than referencing a fork of the gem with pagy.items)

# Controllers
class CommentsController < ActionController::Base # :nodoc:
  include Rails.application.routes.url_helpers
  include Pagy::Backend

  def index
    @pagy, @comments = pagy(Comment.all)
    render inline: TEMPLATE
  end

    def pagy_get_items(collection, pagy)
      collection.offset(pagy.offset).limit(pagy.items)      
    end
end

image

DEBUG -- :   Comment Count (2.2ms)  SELECT COUNT(*) FROM [comments]
DEBUG -- :   Comment Load (2.8ms)  EXEC sp_executesql N'SELECT [comments].* FROM [comments] ORDER BY [comments].[id] ASC OFFSET @0 ROWS FETCH NEXT @1 ROWS ONLY', N'@0 int, @1 int', @0 = 0, @1 = 10  [["OFFSET", nil], ["LIMIT", nil]]
127.0.0.1 - - [24/May/2024:08:51:54 +1000] "GET / HTTP/1.1" 200 - 6.4252
I, [2024-05-24T08:51:54.192835 #440730]  INFO -- : Started GET "/javascript" for 127.0.0.1 at 2024-05-24 08:51:54 +1000
127.0.0.1 - - [24/May/2024:08:51:54 +1000] "GET /javascript HTTP/1.1" 200 - 0.0550

@ddnexus
Copy link
Owner

ddnexus commented May 23, 2024

Thank you @benkoshy.

At this point, while waiting to know more about the actual cause of the last page delay in @jules-w2 case, we should go back to pagy.items.

We should also create a new entry in the troubleshooting page, referencing the case and the overriding.

benkoshy added a commit that referenced this issue May 23, 2024
* Please refer to: #704 (comment)

This reverts commit 9a0bffa.
benkoshy added a commit that referenced this issue May 23, 2024
benkoshy added a commit that referenced this issue May 24, 2024
@jules-w2
Copy link

Hey @ddnexus

After further investigation, I found that this issue does not originate from Rails, Ransack, Pagy, etc., but directly from MySQL. You should be able to reproduce it on your side easily if you have a table with a large amount of data.

I am working with MariaDB 11.2.3:

SELECT VERSION();
+----------------+
| VERSION()      |
+----------------+
| 11.2.3-MariaDB |
+----------------+
1 row in set (0.000 sec)

I have 14,553 results for my query:

SELECT COUNT(*) FROM `aws_emails` WHERE (`aws_emails`.`event_type` = 'Delivery' AND `aws_emails`.`created_at` >= '2024-05-01 04:00:00');
+----------+
| COUNT(*) |
+----------+
|    14553 |
+----------+
1 row in set (0.380 sec)

With limit = items.in (in this case 3)

SELECT DISTINCT * FROM `aws_emails` WHERE (`aws_emails`.`event_type` = 'Delivery' AND `aws_emails`.`created_at` >= '2024-05-01 04:00:00') ORDER BY `aws_emails`.`id` DESC LIMIT 3 OFFSET 14550;
3 rows in set (0.377 sec)

The same code with limit 4

SELECT DISTINCT * FROM `aws_emails` WHERE (`aws_emails`.`event_type` = 'Delivery' AND `aws_emails`.`created_at` >= '2024-05-01 04:00:00') ORDER BY `aws_emails`.`id` DESC LIMIT 4 OFFSET 14550;
3 rows in set (51.727 sec)

@ddnexus are you able to reproduce this case ?

@jules-w2
Copy link

jules-w2 commented May 24, 2024

I found the reason :

"MariaDB has to find all [OFFSET+LIMIT] rows, step over the first [OFFSET], then deliver the [LIMIT] for that distant page."

https://emmer.dev/blog/the-dangers-of-offset-with-mysql
https://mariadb.com/kb/en/pagination-optimization/#the-problem
https://planetscale.com/blog/mysql-pagination
https://github.com/planetscale/fast_page?tab=readme-ov-file#pagy

@ddnexus
Copy link
Owner

ddnexus commented May 24, 2024

@jules-w2 great references!

I preliminarily read it quickly and skipping around, and I may have missed something, but your specific case seems to be different (and more extreme) than the usual known progressive performance hit on large tables.

You have a huge difference only on the last page that doesn't seem to be related to the OFFSET as much as related to the LIMIT.

I see the charts in the first link you posted and that is very much a constant increase, related to the need to skip over a huge number of rows (OFFSET), not a huge spike on the very last page, which in your case can be totally removed with the right LIMIT.

Did I miss something?

BTW, could you try the performance by following https://ddnexus.github.io/pagy/docs/how-to/#consider-the-arel-extra-and-or-the-fast-page-gem ?

@ddnexus
Copy link
Owner

ddnexus commented May 25, 2024

@jules-w2 please could you try the query on the last page with the usual OFFSET but without the LIMIT?

And just a confirmation: are you running the query in the MySQL console?

@jules-w2
Copy link

@ddnexus

The problem when looking for [OFFSET+LIMIT] is that if [OFFSET + LIMIT] is greater than the total number of rows, MySQL will look for rows that do not exist.. (and we must enter internal timeouts of MySQL)

This is the typical case which arrives on the last page.

I run the query in MySQL console just to understand the problem. In practice I use ActiveRecord + Ransasck + Pagy.

I will test the 2 recommendations and I will come back to you with the results

@ddnexus
Copy link
Owner

ddnexus commented May 25, 2024

MySQL console just to understand the problem

and that is perfect because it excludes any possible problem with the rest of the environment

I will test the 2 recommendations and I will come back to you with the results

They are together but I think that fastpage is probably the one that could make some difference, but I am just guessing. It would be wise to try one, the other and both together (if possible).

@ddnexus ddnexus mentioned this issue May 28, 2024
7 tasks
@ddnexus
Copy link
Owner

ddnexus commented May 29, 2024

MySQL will look for rows that do not exist..

A couple of things to unpack here:

  1. I didn't find any description of any (OFFSET+LIMIT) > COUNT problem reaching "rows that do not exist" in any of your links. Did I miss something? They all describe just the usual performance hit with skipping over many rows, so I guess that explaining your case with "looking for rows that don't exist" is your own deduction (which seems reasonable, but yet unconfirmed).

  2. It would be very "out of character" for a DB to be so dumb and unaware of the count of a table to get stuck on the last few rows for 1 minute, searching for rows that don't exist. Usually a DB should be pretty resilient to what a query may ask for, especially about data that "don't exist". I am starting to believe that it might be a DB bug.

Whatever is the cause of the problem, at this point it seems quite established that it's external to ruby and pagy, so documenting any useful workaround should be more than enough for the pagy scope.

ddnexus pushed a commit that referenced this issue Jun 2, 2024
@ddnexus ddnexus closed this as completed in 8206aa6 Jun 2, 2024
@ddnexus ddnexus added the fixed label Jun 3, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants