Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

In PostgreSQL, allow spaces in table names without schema #26722

Closed
wants to merge 2 commits into from

Conversation

Jerph
Copy link

@Jerph Jerph commented Oct 6, 2016

Summary

Fix for #26721 by checking for a dot before parsing out schema. Test case added.

Other Information

If your table name has spaces (e.g. ‘work orders’) and you don’t
specify schema or use quotes, the postgresql adapter would use the
first word as schema. This fix interprets as table name unless it
contains a dot.

This function has previously been the source of performance issues. This change should cause a slight performance penalty when the schema is included, and improve performance when it's not.

I benchmarked this patch in rails 4.2.7 by running the test cases for 20s:

require 'test_helper'

class PgPerformanceTest < ActiveSupport::TestCase
  TIME    = (ENV['BENCHMARK_TIME'] || 20).to_i
  RECORDS = (ENV['BENCHMARK_RECORDS'] || TIME*1000).to_i

  test 'pg performance' do
    Benchmark.ips(TIME) do |x|
      test_cases = [
        %(table_name),
        %("table.name"),
        %(schema.table_name),
        %("schema".table_name),
        %(schema."table_name"),
        %("schema"."table_name"),
        %("even spaces".table),
        %(spaces table only),
        %("spaces table only"),
        %(schema."table.name"),
      ]

      x.report("without dot check") do
        test_cases.each do |given|
          ActiveRecord::ConnectionAdapters::PostgreSQL::Utils.extract_schema_qualified_name(given)
        end
      end

      x.report("with dot check") do
        test_cases.each do |given|
          ActiveRecord::ConnectionAdapters::PostgreSQL::Utils.my_extract_schema_qualified_name(given)
        end
      end
    end
  end
end
Warming up --------------------------------------
   without dot check     2.776k i/100ms
      with dot check     3.445k i/100ms
Calculating -------------------------------------
   without dot check     28.045k (± 5.0%) i/s -    560.752k in  20.055669s
      with dot check     35.182k (± 4.6%) i/s -    702.780k in  20.020912s
Finished in 44.31501s

First time contributor here! Hope I can get this to PR a usable state.

@matthewd
Copy link
Member

matthewd commented Oct 6, 2016

I don't see why we'd want to treat "x.foo bar" differently from "foo bar".

The current intention appears to be that the table name is treated as an unquoted identifier, meaning "\"foo bar\"" would already work correctly.

@Jerph
Copy link
Author

Jerph commented Oct 6, 2016

@@if so, it's inconsistent with the mysql/sqlite adapter behavior, which appear to handle table names with spaces as a quoted identifier and have no problem with just "foo bar". Then again, they are the outliers in not having separate schema/database concepts, so maybe there's an argument for different behavior.

Honestly, I don't understand why the white space is checked in that regex at all. Is it intentionally another option for specifying schema? White space isn't shown in any of the test cases, and the only way I know to specify schema is with a dot.

This fix doesn't preclude handling the parsing logic any way we want, so actual table names with dots in them (lord help us :-D) are still handled correctly. If there's no dot though, there simply can't be a schema, so it should be treated as a quoted identifier, in my opinion.

If your table name has spaces  (e.g. ‘work orders’) and you don’t
specify schema or use quotes, the postgresql adapter would use the
first word as schema. This fix interprets as table name unless it
contains a dot.
@matthewd
Copy link
Member

matthewd commented Oct 6, 2016

Looks like it's been an undocumented behaviour since #1632.

As is, this would change (read: break) handling of "\"foo\"", in an IMO undesirable way.

Let's go the safest route, and deprecate unquoted spaces -- leave it parsing as a schema for now (and users can quote for the table-name-with-spaces interpretation), then look at stricter options later.

@Jerph
Copy link
Author

Jerph commented Oct 6, 2016

Good point! I added a test for quoted strings without a schema, and I don't believe this changes any behavior with ""foo"" because strings are unquoted by the Name class.

The different behavior should only be with quotes and spaces when there's no dot in the string. Malformed double quotes for instance. Some examples I can think of:

"foo bar" -> [nil, "foo bar"] vs ["foo", "bar"] # the original problem
"foo\"bar" -> [nil, "foo\"bar"] vs ["foo", "bar"] # unquote ignores quotes in the middle
"\"foo\" bar" -> [nil, "foo\" ba"] vs ["foo", "bar"] # unquote strips off the last character, assuming a paired quote

All of these examples were erroneous before, though. I definitely won't say that there are no legitimate identifiers that would be broken, but I can't think of any.

Regarding #1632, I didn't see any reasoning for the use of \s in the regex. If it could be removed that possibly be a better solution. As it is, I can't imagine any code running successfully that relies on the split on white space.

@matthewd
Copy link
Member

matthewd commented Oct 6, 2016

"foo bar" -> [nil, "foo bar"] vs ["foo", "bar"] # the original problem

This doesn't seem sufficiently far-fetched to me that we can silently change it. (The others do seem inconsequential.)

So, I agree the \s should go away, but do believe we need to give warning.

@Jerph
Copy link
Author

Jerph commented Oct 6, 2016

That sounds good. Thanks for helping me on this.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants