Fix bug with long stream identifiers when using Postgres adapter #29297

Merged

Conversation

Contributor

@palkan palkan commented May 31, 2017

Fixes #28751.

PostgreSQL limits the length of identifiers (63 chars, docs).

The provided fix shortens identifiers longer than 63 chars by hashing them with SHA1.
Although this has some impact on performance, I think it's negligible.

It would be great to use non-cryptographic hash functions (such as Murmur or CityHash), but that would require adding new dependencies.
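
A minimal sketch of the shortening described above, written as a standalone helper rather than as the adapter code itself. The constant name and the sample stream names are assumptions made for illustration. A SHA1 hex digest is always 40 characters, so the hashed form fits within the 63-character limit.

```ruby
require "digest/sha1"

# Identifier limit quoted in this PR (illustrative constant name).
MAX_IDENTIFIER_LENGTH = 63

# Returns the channel unchanged when it fits, otherwise its SHA1 hex digest.
def channel_identifier(channel)
  channel.size > MAX_IDENTIFIER_LENGTH ? Digest::SHA1.hexdigest(channel) : channel
end

short = "chat:room:1"          # made-up stream name, well under the limit
long  = "chat:" + "x" * 80     # made-up stream name, 85 characters

channel_identifier(short)      # the same string comes back
channel_identifier(long).size  # 40, the length of a SHA1 hex digest
```

The same input always maps to the same digest, so broadcasters and subscribers that apply the helper independently still agree on the channel name.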

@rails-bot

Thanks for the pull request, and welcome! The Rails team is excited to review your changes, and you should hear from @matthewd (or someone else) soon.

If any changes to this PR are deemed necessary, please add them as extra commits. This ensures that the reviewer can see what has changed since they last reviewed the code. Due to the way GitHub handles out-of-date commits, this should also make it reasonably obvious what issues have or haven't been addressed. Large or tricky changes may require several passes of review and changes.

This repository is being automatically checked for code quality issues using Code Climate. You can see results for this analysis in the PR status below. Newly introduced issues should be fixed before a Pull Request is considered ready to review.

Please see the contribution instructions for more information.

@palkan palkan force-pushed the fix/action-cable-postgres-identifiers-limit branch from 77691d9 to 7094cd6 Compare May 31, 2017 16:25
Member

@matthewd matthewd left a comment

I'm fine with SHA1 -- it shouldn't be user-controlled data, so I see no risk of collision. And it's used for network communication, so the extra time spent hashing should indeed be negligible.

It does seem a bit unfortunate to leave it totally opaque, though... is it worth keeping the first 22 characters?

@@ -32,7 +33,7 @@ def subscribe_as_queue(channel, adapter = @rx_adapter)
       subscribed.wait(WAIT_WHEN_EXPECTING_EVENT)
       assert subscribed.set?

-      yield queue
+      Timeout.timeout(WAIT_WHEN_EXPECTING_EVENT) { yield queue }
Member

👎

Contributor Author

Could you please clarify?

That's the easiest way to handle blocking queue.pop calls in these tests (which cause the process to hang forever).

Of course, it would be better to refactor the tests to not use blocking calls though.

Another idea is to "enhance" Queue#pop with built-in timeout through refinements:

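module QueueWithTimeout
  # refine must be called inside a module
  refine Queue do
    # Polls until an item is available or timeout seconds have passed.
    def pop(timeout)
      while empty?
        sleep 0.1
        timeout -= 0.1
        return nil unless timeout > 0
      end
      super() # explicit parentheses so timeout is not passed on as non_block
    end
  end
end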

Or we can add a custom queue class.

But either way, I think we should be able to make the tests fail (which is impossible now).
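
The "custom queue class" option could look roughly like the sketch below. This is only an illustration of the idea, not code from this PR or from Rails: the class name `TimedQueue` and its choice to raise `ThreadError` on timeout are assumptions.

```ruby
# Hypothetical test helper: a queue whose pop gives up after a deadline
# instead of blocking forever.
class TimedQueue
  def initialize
    @items = []
    @mutex = Mutex.new
    @available = ConditionVariable.new
  end

  def push(item)
    @mutex.synchronize do
      @items << item
      @available.signal
    end
    self
  end
  alias_method :<<, :push

  # Returns the next item, or raises ThreadError if nothing arrives
  # within +timeout+ seconds.
  def pop(timeout)
    deadline = Process.clock_gettime(Process::CLOCK_MONOTONIC) + timeout
    @mutex.synchronize do
      while @items.empty?
        remaining = deadline - Process.clock_gettime(Process::CLOCK_MONOTONIC)
        raise ThreadError, "queue still empty after #{timeout}s" if remaining <= 0
        @available.wait(@mutex, remaining)
      end
      @items.shift
    end
  end
end
```

With a helper like this, a test that never receives its message fails after the timeout instead of hanging, and the time limit applies to each individual wait rather than to a whole block.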

@@ -12,7 +12,7 @@ def setup
       if Dir.exist?(ar_tests)
         require File.join(ar_tests, "config")
         require File.join(ar_tests, "support/config")
-        local_config = ARTest.config["arunit"]
+        local_config = ARTest.config["connections"]["postgresql"]["arunit"]
Member

Contributor Author

As I understand it, we want to use the same DB configuration as in the ActiveRecord tests (that's why we're using ARTest).

ARTest.config["arunit"] is always nil, because the database config has a nested structure (connections -> adapter -> id).

I'm using custom PostgreSQL credentials in config.yml, and it turned out that there is a bug here.
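
To illustrate the nesting described above, here is a small sketch using a plain hash in place of the real test config. The hash contents (database name, username) are made up for the example; only the key path connections -> postgresql -> arunit comes from the diff.

```ruby
# Hypothetical stand-in for the parsed test config; values are illustrative only.
EXAMPLE_CONFIG = {
  "connections" => {
    "postgresql" => {
      "arunit" => { "database" => "example_db", "username" => "example_user" }
    }
  }
}.freeze

# There is no top-level "arunit" key, so the flat lookup yields nil.
FLAT_LOOKUP = EXAMPLE_CONFIG["arunit"]

# Following the nesting reaches the connection settings.
NESTED_LOOKUP = EXAMPLE_CONFIG.dig("connections", "postgresql", "arunit")
```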

Contributor Author

It can be extracted into a separate PR, since it has nothing to do with the bug under consideration.

@palkan
Contributor Author

palkan commented Jun 2, 2017

> is it worth keeping the first 22 characters?

Keeping in mind that we're working with long GlobalIDs, I'm not sure that would help, 'cause in that case the prefix would contain only a part of a namespace, e.g. "gid://app/MySuperModule/SubModule/Person/5" => "gid://app/MySuperModul".
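
For reference, the prefix-keeping variant discussed here might be sketched as follows. This is not what was merged. The method name, the constants, and the "-" separator are assumptions: 22 prefix characters plus one separator plus a 40-character SHA1 hex digest add up to exactly 63.

```ruby
require "digest/sha1"

MAX_IDENTIFIER_LENGTH = 63
SHA1_HEX_LENGTH = 40
# 63 - 40 - 1 = 22 readable characters, leaving one character for a separator.
PREFIX_LENGTH = MAX_IDENTIFIER_LENGTH - SHA1_HEX_LENGTH - 1

# Hypothetical alternative: keep a readable prefix in front of the digest.
def prefixed_identifier(channel)
  return channel if channel.size <= MAX_IDENTIFIER_LENGTH

  "#{channel[0, PREFIX_LENGTH]}-#{Digest::SHA1.hexdigest(channel)}"
end

# The 22-character prefix from the example above cuts the namespace short.
"gid://app/MySuperModule/SubModule/Person/5"[0, PREFIX_LENGTH]
# => "gid://app/MySuperModul"
```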

@palkan palkan force-pushed the fix/action-cable-postgres-identifiers-limit branch 2 times, most recently from b2f6824 to 6efe6b5 Compare July 6, 2017 14:30
@palkan palkan force-pushed the fix/action-cable-postgres-identifiers-limit branch from 6efe6b5 to 2bce777 Compare July 6, 2017 14:34
@palkan
Contributor Author

palkan commented Jul 6, 2017

@matthewd Codebase was cleaned up. Anything else I can do to get it merged?

@matthewd matthewd merged commit 40a20de into rails:master Jul 8, 2017
@matthewd
Member

matthewd commented Jul 8, 2017

Sorry I didn't reply on this earlier.

Beyond general concern about Timeout, allocating a single WAIT_WHEN_EXPECTING_EVENT to an arbitrarily-long block of code (which could, itself, contain one or more waits) doesn't really honour the implied contract. Tests failing when things are broken is certainly a good thing, though, so finding some way to make that true seems like a good idea. I think I'd still be hesitant for any "set up a subscription" helper to apply a surprise time limit on its contained block, though.

As for the ARTest thing -- yeah, I just couldn't work out why that was necessary for this change. If it's Just Wrong, we should definitely fix that. 🙈

@palkan palkan deleted the fix/action-cable-postgres-identifiers-limit branch July 9, 2017 06:01
@eric1234
Contributor

If someone needs this backported to a released version of Rails, the following will work:

# Backport of https://github.com/rails/rails/commit/2bce7777b70efe81f45e4ae8dc61b25f1e18771e
require 'digest/sha1'
require 'action_cable/subscription_adapter/postgresql'

# Shortens channel names that exceed PostgreSQL's 63-character identifier limit.
module PostgresChannelShortener
  def broadcast(channel, payload)
    super(channel_identifier(channel), payload)
  end

  def subscribe(channel, callback, success_callback = nil)
    super(channel_identifier(channel), callback, success_callback)
  end

  def unsubscribe(channel, callback)
    super(channel_identifier(channel), callback)
  end

  private

  # Channels over 63 characters are replaced by their 40-character SHA1 hex digest.
  def channel_identifier(channel)
    channel.size > 63 ? Digest::SHA1.hexdigest(channel) : channel
  end
end
ActionCable::SubscriptionAdapter::PostgreSQL.prepend PostgresChannelShortener

I drop this in my lib directory then just require it from my config/application.rb file.

Successfully merging this pull request may close these issues.

ActionCable's stream_for creates channel identifiers are too long for postgres
4 participants