-
Notifications
You must be signed in to change notification settings - Fork 54
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix handling of table names with colon characters #227
Conversation
ColumnFamilyStore builds for its own uses a list of Scylla's tables using the REST API. It holds each table's name in a formant meant for MBean, in an ObjectName object. As the ObjectName documentation explains, when a value contains a comma, equals, colon or quote - it must be quoted, otherwise the ObjectName cannot be constructed. Scylla table names can't ever contain commas, equals or quotes - but they can contain colons, in the case of Alternator GSIs as discovered in issue scylladb#226. So in this patch we quote the columnfamily value (using the ObjectName.quote()) function, to allow it to contain a colon. Our code also wants to read back the ObjectName, so we unquote() it to restore the unquoted name. We could have quoted the keyspace name as well, not just the columnfamily name, but this isn't necessary because even in Alternator, keyspace names cannot contain colons (or commas or equals or quotes). After this patch, "notetool info" starts to work even if an Alternator GSI exists. "nodetool info" doesn't need to access any specific table, but it still causes our JMX implementation to build that list of tables which failed before this patch and now works. Fixes scylladb#226. Signed-off-by: Nadav Har'El <nyh@scylladb.com>
After the previous patch, our JMX implementation no longer fails when there is a table whose name contains the colon character (as in scylladb#226). However, the JMX client (like nodetool) still can't *address* this table - in order to be used in a MBean object name it must be quoted (as explained in the documentation of ObjectName), but our server doesn't understand the quoted name. So in this patch we check if the "scope" (i.e., columnfamily name) is quoted, and if it is, unquote it while searching for the table. We don't do the same treatment for the other components of the bean name (like keyspace) because those can't have a colon character. After this patch, "nodetool tablestats" almost begins to work when there are Alternator GSIs (with a colon in the table's name). It's "almost" because we also need a patch in nodetool, so it will quote the table name before sending it. The patch *here* is what will allow this quoted name to be understood by the JMX server. Refs scylladb#226. Signed-off-by: Nadav Har'El <nyh@scylladb.com>
I need help from the JMX or packaging gurus: How do we confirm that this patch doesn't break anything in the dtest suite? (it's supposed to fix a dtest which isn't yet in, but we need to confirm it doesn't break any existing test). |
As the ObjectName documentation explains, when an addressing an MBean in JMX, when a value contains a comma, equals, colon or quote - it must be quoted, otherwise the ObjectName cannot be constructed. CQL or Alternator table names cannot contain any of these characters, but in Alternator the names of secondary indexes *do* contain a colon, so before this patch nodetool cannot address them - it tries to construct an ObjectName with that name, and fails. Note that "nodetool tablestats" is also broken by this bug, because even if the user doesn't try to specify one specific table, nodetool iterates on all tables and fails when it reaches the view with the colon in its name. So this patch adds the missing quoting if the table name contains a colon. We don't quote table names that don't have a colon to preserve backward compatibility with old implmentation of JMX that don't know how to unquote these names. A separate patch to scylla-jmx added support for unquoting - see scylladb/scylla-jmx#227. After this patch "nodetool tablestats" can work even if there is an Alternator GSI or LSI. Signed-off-by: Nadav Har'El <nyh@scylladb.com>
See also pull requests in other repositories:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM. Was surprised that you need checks before 'unquote()' or it might throw an exception, but apparently that's the API.
Even after beeing knee-deep in this code for many hours, I'm not sure I fully understood what is really happening here or why it was designed this way. Why does unquote() even need to exist? After you used quote() to create a valid ObjectName, why doesn't the All I can say is that I'm happy we are planning to get of this Java code eventually... |
Submodule update: scylladb/scylladb@282dc0a. |
@denesb did you do any such confirmation before merging this PR? |
No, I just pushed it to next.
…On Wed, 2023-11-22 at 15:18 +0000, nyh wrote:
> I need help from the JMX or packaging gurus: How do we confirm that
> this patch doesn't break anything in the dtest suite? (it's
> supposed to fix a dtest which isn't yet in, but we need to confirm
> it doesn't break any existing test).
@denesb did you do any such confirmation before merging this PR?
I see now next promotion failing on nodetool tests
https://jenkins.scylladb.com/job/scylla-master/job/next/6846/#showFailuresLink
- might be related :-(
—
Reply to this email directly, view it on GitHub, or unsubscribe.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
I can reproduce a failure of
I'll try to debug it. The curious thing is that obviously my patch did not kill en-masse all the nodetool tests, it just broke two specific tests. I'll need to debug now why. Do you want to revert the Scylla patch that updates the submodule and leave the JMX patch in and I'll send a followup patch? Or do you want to revert the scylla-jmx patch as well? Also I repeat my question for the JMX and CI gurus - don't we have CI runs for JMX patches? How do we prevent these cases from happening in the future? |
The submodule update is is only in next, it can simply be dequeued.
There is no CI for tools/* submodules, we merge to master directly. Also, there is usually no CI for submodule updates, unless the maintainer thinks the submodule update is risky, and opens a PR for CI. As you can see, the maintainer will sometimes misjudge the situation. |
Sorry about the mess. Please dequeue the Scylla submodule update, and I'll send a followup PR to this PR to fix the bug (I hope I can do this quickly, I'm debugging it now). |
Submodule update dequeued. |
I suspect we'll get rid of JMX sooner than we'll have tests for it :( |
Sent a followup: #230 |
There is a way to run all of dtest tests, but you first build using the byo a unified package and then give to a dtest job to run all tests. Other option is to run just the gating do a PR to Scylla with submodule change and let it run, I do it multiple time for cqlsh submodule, and it's mostly enough |
My problem is that because I don't do those things often, I have no idea how to do what you just said, and no idea where there is documentation now how to do it.
Yes, this is a good thing to do before merging the submodule change. But in some sense it's a bit too late - it means we already merged the broken patch into scylla-jmx (for example), and yes, we won't have Scylla use the broken patch but it's still in the repository. But on second thought - maybe it's fine and we don't need to improve (and complicate) the process. We don't need to be fanatics on perfect code and bisectability on every small repository - maybe it's not a disaster that scylla-jmx got a buggy patch, and then a day later it got a second patch to fix it. Just as long as the buggy version isn't actually used in the Scylla parent module (and it isn't). |
In some case I'm opening a PR pointing to my fork of the submodule, before merging, and opening a PR just for testing, and point it back to the actual repo/branch, when I confirmed it's working. again maybe a bit convoluted and long, and can confuse the maintainers, but it get me a CI run ontop of my changes, with the least amount of effort. (but with not much control, since I can't change dtest/ccm in such run, or the exact test that would run) |
As the ObjectName documentation explains, when an addressing an MBean in JMX, when a value contains a comma, equals, colon or quote - it must be quoted, otherwise the ObjectName cannot be constructed. CQL or Alternator table names cannot contain any of these characters, but in Alternator the names of secondary indexes *do* contain a colon, so before this patch nodetool cannot address them - it tries to construct an ObjectName with that name, and fails. Note that "nodetool tablestats" is also broken by this bug, because even if the user doesn't try to specify one specific table, nodetool iterates on all tables and fails when it reaches the view with the colon in its name. So this patch adds the missing quoting if the table name contains a colon. We don't quote table names that don't have a colon to preserve backward compatibility with old implmentation of JMX that don't know how to unquote these names. A separate patch to scylla-jmx added support for unquoting - see scylladb/scylla-jmx#227. After this patch "nodetool tablestats" can work even if there is an Alternator GSI or LSI. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes: #355
Submodule update PR: scylladb/scylladb#16146 |
As the ObjectName documentation explains, when an addressing an MBean in JMX, when a value contains a comma, equals, colon or quote - it must be quoted, otherwise the ObjectName cannot be constructed. CQL or Alternator table names cannot contain any of these characters, but in Alternator the names of secondary indexes *do* contain a colon, so before this patch nodetool cannot address them - it tries to construct an ObjectName with that name, and fails. Note that "nodetool tablestats" is also broken by this bug, because even if the user doesn't try to specify one specific table, nodetool iterates on all tables and fails when it reaches the view with the colon in its name. So this patch adds the missing quoting if the table name contains a colon. We don't quote table names that don't have a colon to preserve backward compatibility with old implmentation of JMX that don't know how to unquote these names. A separate patch to scylla-jmx added support for unquoting - see scylladb/scylla-jmx#227. After this patch "nodetool tablestats" can work even if there is an Alternator GSI or LSI. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes: #355 (cherry picked from commit 26f5f71)
As the ObjectName documentation explains, when an addressing an MBean in JMX, when a value contains a comma, equals, colon or quote - it must be quoted, otherwise the ObjectName cannot be constructed. CQL or Alternator table names cannot contain any of these characters, but in Alternator the names of secondary indexes *do* contain a colon, so before this patch nodetool cannot address them - it tries to construct an ObjectName with that name, and fails. Note that "nodetool tablestats" is also broken by this bug, because even if the user doesn't try to specify one specific table, nodetool iterates on all tables and fails when it reaches the view with the colon in its name. So this patch adds the missing quoting if the table name contains a colon. We don't quote table names that don't have a colon to preserve backward compatibility with old implmentation of JMX that don't know how to unquote these names. A separate patch to scylla-jmx added support for unquoting - see scylladb/scylla-jmx#227. After this patch "nodetool tablestats" can work even if there is an Alternator GSI or LSI. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes: #355 (cherry picked from commit 26f5f71) (cherry picked from commit 3764ae9)
As noted in #226, nodetool breaks if a table name contains the colon character. This normally can't happen - CQL table names can only contain alphanumerics and underscores, and the DynamoDB API adds also dashes and dots - but neither allows a colon. But unfortunately, when Alternator has a secondary index (GSI or LSI), the name of the view does get a colon (it looks like "tablename:viewname"). And this confuses JMX, which addresses objects using MBean names (see https://docs.oracle.com/javase/8/docs/api/javax/management/ObjectName.html), where a name that contains a colon needs to be quoted.
This small series fixes two things: First, some internal list of ObjectName objects needs to quote the names it holds to allow the colon character. Second, we should allow the external client - namely, nodetool - to send us a quoted name.
To complete this series I already have patches to two other repositories which I'll send shortly: