New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds migration and type casting support for PostgreSQL Array datatype #7547

Merged
merged 1 commit into from Sep 17, 2012

Conversation

Projects
None yet
7 participants
@danmcclain
Copy link
Contributor

danmcclain commented Sep 6, 2012

PostgreSQL array type support. Any datatype can be used to create an array column, with full migration and schema dumper support.

To declare an array column, use the following syntax:

create_table :table_with_arrays do |t|
  t.integer :int_array, :array => true
  # integer[]
  t.integer :int_array, :array => true, :length => 2
  # smallint[]
  t.string :string_array, :array => true, :length => 30
  # char varying(30)[]
end 

This respects any other migraion detail (limits, defaults, etc). ActiveRecord will serialize and deserialize the array columns on their way to and from the database.

One thing to note: PostgreSQL does not enforce any limits on the number of elements, and any array can be multi-dimensional. Any array that is multi-dimensional must be rectangular (each sub array must have the same number of elements as its siblings).

I can squash this commit after review

@steveklabnik

This comment has been minimized.

Copy link
Member

steveklabnik commented Sep 6, 2012

This will need a rebase.

@bcardarella

This comment has been minimized.

Copy link
Contributor

bcardarella commented Sep 7, 2012

👍

@danmcclain

This comment has been minimized.

Copy link
Contributor

danmcclain commented Sep 7, 2012

I realized I didn't have test cases for strings that contain special characters (",,,{). I added those cases, and fixed the casting code

@danmcclain

This comment has been minimized.

Copy link
Contributor

danmcclain commented Sep 8, 2012

Updated tests and code again to cover the case you want to store a string that is "NULL" and not a nil value

@danmcclain

This comment has been minimized.

Copy link
Contributor

danmcclain commented Sep 9, 2012

@kurko

This comment has been minimized.

Copy link

kurko commented Sep 9, 2012

👍

@steveklabnik

This comment has been minimized.

Copy link
Member

steveklabnik commented Sep 9, 2012

This will need a rebase.

@danmcclain

This comment has been minimized.

Copy link
Contributor

danmcclain commented Sep 9, 2012

@steveklabnik: Rebased

@schneems

This comment has been minimized.

Copy link
Member

schneems commented Sep 9, 2012

👍 on the feature, I ❤️ postgresql datatypes. @steveklabnik can you please flag this as AR?

I did have a problem bootstrapping your fork:

Gemfile:

gem 'rails',     :git => 'git://github.com/danmcclain/rails.git', :branch => 'pg-arrays'

Then I added the pg gem, ran bundle install and changed my database over to postgresql, then I ran $ rake db:create but it blew up:

$ bundle exec rake db:create

Please install the postgresql adapter: `gem install activerecord-postgresql-adapter` (pg_array_parser is not part of the bundle. Add it to Gemfile.)

/Users/schneems/.rvm/gems/ruby-1.9.3-p194@heroku/gems/bundler-1.2.0/lib/bundler/rubygems_integration.rb:147:in `block in replace_gem'

After adding it to my Gemfile, everything worked like suspected and I was able to add an integer array column to an existing table:

  def change
    add_column :foos, :int_array, :integer, :array => true
  end

Worked great:

1.9.3p194 :006 > foo = Foo.last
  Foo Load (5.1ms)  SELECT "foos".* FROM "foos" ORDER BY "foos"."id" DESC LIMIT 1
 => #<Foo id: 1, name: nil, created_at: "2012-09-09 19:06:11", updated_at: "2012-09-09 19:06:11", int_array: nil> 
1.9.3p194 :007 > foo.int_array = [1,2,3,4,5,6]
 => [1, 2, 3, 4, 5, 6] 
1.9.3p194 :008 > foo.save
   (0.4ms)  BEGIN
   (4.0ms)  UPDATE "foos" SET "int_array" = '{1,2,3,4,5,6}', "updated_at" = '2012-09-09 19:13:48.280150' WHERE "foos"."id" = 1
   (0.5ms)  COMMIT
 => true 
1.9.3p194 :009 > Foo.last.int_array
  Foo Load (0.7ms)  SELECT "foos".* FROM "foos" ORDER BY "foos"."id" DESC LIMIT 1
 => [1, 2, 3, 4, 5, 6] 

It's possible I did something wrong in the setup, but otherwise - as it stands this won't work out of the box. Let me know if you have debugging steps. I would like to see maximum data type support for PG in Rails.

@steveklabnik

This comment has been minimized.

Copy link
Member

steveklabnik commented Sep 9, 2012

@steveklabnik can you please flag this as AR?

You can't tag pull requests, just normal Issues. :/

@carlosantoniodasilva

This comment has been minimized.

Copy link
Member

carlosantoniodasilva commented Sep 9, 2012

@steveklabnik actually you can, but only from the issues list, not from inside the issue/pr itself =(. Not sure if it's a feature or some bug, though :P

@steveklabnik

This comment has been minimized.

Copy link
Member

steveklabnik commented Sep 9, 2012

Ugh. I wondered how some of them had tags, I assumed it was a bug or that you could choose it when making the PR.

@carlosantoniodasilva

This comment has been minimized.

Copy link
Member

carlosantoniodasilva commented Sep 9, 2012

Yeah, looks like a bug, perhaps we should ping someone from @github to ask about it.

@bcardarella

This comment has been minimized.

Copy link
Contributor

bcardarella commented Sep 9, 2012

@schneems

Just so everybody is on the same page, the pg_array_parser gem is a C extension to speed up the parsing of the Postgres array strings. @danmcclain originally wrote it in Ruby and the performance was OK at best. Moving to C for this yielded a significant increase in speed. pg_array_parser also supports JRuby

So that being said, there are two options here with trade offs:

  1. the pg_array_parser components can be integrated into the pg and jdbc-postgres gems.
  2. put back the Ruby implementation of the array parsing but add to the Guide that pg_array_parser should be added in to over ride for better performance. It will only effect people that use the array syntax but it will be an extra step.

Am I missing any other potential options?

@schneems

This comment has been minimized.

Copy link
Member

schneems commented Sep 10, 2012

@bcardarella thanks for the info, and @danmcclain thanks for the work... i'm good with any of those options. If we need integration in another library/gem, we'll likely need those accepted before this gets merged in. Having support for PG arrays in Rails shouldn't rely on any other libraries besides ActiveRecord and the 'pg' gem (or jdbc-postgres for jruby). If adding a library after the fact speeds things up, that would be fine, but it would be nice to get this functionality without having to add another gem to my Gemfile on new projects.

@rafaelfranca

This comment has been minimized.

Copy link
Member

rafaelfranca commented Sep 11, 2012

I agree with @schneems about the pg_array_parser gem. If I want to speed up I can just put it in my Gemfile.

We can add documentation pointing to this gem.

@rafaelfranca

View changes

activerecord/CHANGELOG.md Outdated
@@ -661,4 +661,32 @@

* PostgreSQL hstore types are automatically deserialized from the database.

* PostgreSQL inet and cidr types are converted to `IPAddr` objects.

This comment has been minimized.

@rafaelfranca

rafaelfranca Sep 11, 2012

Member

Please put these CHANGELOG entries in the top of the file.

PS: I thought this one was added when merged.

This comment has been minimized.

@danmcclain

danmcclain Sep 11, 2012

Contributor

Forgot to add the inet/cidr one originally, I don't see it anywhere in the changelog now. Please correct me if I'm wrong

@rafaelfranca

View changes

activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb Outdated
attr_accessor :money_precision
end
# :startdoc:

# Extracts the value from a PostgreSQL column default definition.
# Extracts the value from a PostgreSQL column default definition.

This comment has been minimized.

@rafaelfranca

rafaelfranca Sep 11, 2012

Member

Wrong indentation

@rafaelfranca

View changes

activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb Outdated
spec[:scale] = column.scale.inspect if column.scale
spec[:null] = 'false' unless column.null
spec[:default] = default_string(column.default) if column.has_default?
spec[:array] = 'true' if column.respond_to?(:array) && column.array

This comment has been minimized.

@rafaelfranca

rafaelfranca Sep 11, 2012

Member

We should indent the =

@rafaelfranca

View changes

activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb Outdated
def initialize_type_map
result = execute('SELECT oid, typname, typelem, typdelim FROM pg_type', 'SCHEMA')
leaves, nodes = result.partition { |row| row['typelem'] == '0' }
def initialize_type_map

This comment has been minimized.

@rafaelfranca

rafaelfranca Sep 11, 2012

Member

Please don't change the indentation

@@ -494,6 +547,13 @@ def table_alias_length
@table_alias_length ||= query('SHOW max_identifier_length', 'SCHEMA')[0][0].to_i
end

def add_column_options!(sql, options)

This comment has been minimized.

@rafaelfranca

rafaelfranca Sep 11, 2012

Member

Where this is used?

This comment has been minimized.

@danmcclain

danmcclain Sep 11, 2012

Contributor

This is used when generating the column SQL statement. When getting the column type, this method is called, which correctly adds the [] to the column statement

@rafaelfranca

View changes

activerecord/test/cases/schema_dumper_test.rb Outdated
@@ -96,7 +96,7 @@ def test_schema_dump_includes_not_null_columns
ActiveRecord::SchemaDumper.ignore_tables = [/^[^r]/]
ActiveRecord::SchemaDumper.dump(ActiveRecord::Base.connection, stream)
output = stream.string
assert_match %r{null: false}, output
assert_match %r{null => false}, output

This comment has been minimized.

@rafaelfranca

rafaelfranca Sep 11, 2012

Member

Use the ruby 1.9 syntax

This comment has been minimized.

@danmcclain

danmcclain Sep 11, 2012

Contributor

@rafaelfranca This had the 1.9 syntax, but failed the tests when I ran it locally, I will convert it back

This comment has been minimized.

@danmcclain

danmcclain Sep 11, 2012

Contributor

Ah, I see why this happened now.

@rafaelfranca

View changes

activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb Outdated
spec[:null] = 'false' unless column.null
spec[:default] = default_string(column.default) if column.has_default?
spec[:array] = 'true' if column.respond_to?(:array) && column.array
(spec.keys - [:name, :type]).each{ |k| spec[k].insert(0, "#{k.inspect} => ")}

This comment has been minimized.

@rafaelfranca

rafaelfranca Sep 11, 2012

Member

Can we use the Ruby 1.9 hash syntax?

@rafaelfranca

View changes

activerecord/lib/active_record/connection_adapters/abstract/schema_dumper.rb Outdated
[:name, :limit, :precision, :scale, :default, :null]
end

def default_string(value)

This comment has been minimized.

@rafaelfranca

rafaelfranca Sep 11, 2012

Member

This can be private

@rafaelfranca

View changes

activerecord/lib/active_record/connection_adapters/postgresql/cast.rb Outdated
"{#{casted_values.join(',')}}"
end

def quote_and_escape(value)

This comment has been minimized.

@rafaelfranca

rafaelfranca Sep 11, 2012

Member

This can be private

@rafaelfranca

View changes

activerecord/lib/active_record/connection_adapters/postgresql_adapter.rb Outdated
@@ -314,6 +342,31 @@ def adapter_name
ADAPTER_NAME
end

def column_spec(column, types)

This comment has been minimized.

@rafaelfranca

rafaelfranca Sep 11, 2012

Member

I think we can remove a lot of duplication splitting this method in column_spec and prepare_options

@rafaelfranca

This comment has been minimized.

Copy link
Member

rafaelfranca commented Sep 11, 2012

I did some comments. I think the tests are passing in postgreSQL but not in others adapter because of the 1.8 hash syntax.

@danmcclain

This comment has been minimized.

Copy link
Contributor

danmcclain commented Sep 11, 2012

@rafaelfranca I think I have addressed all the issues you presented. I have also added a pure ruby implementation of pg_array_parser which will be used if pg_array_parser is not available, that way, it is not required by ActiveRecord. Thanks for the input, let me know if there are any other issues

@schneems

This comment has been minimized.

Copy link
Member

schneems commented Sep 11, 2012

Verified that the latest changes don't rely on an external gem. I'm happy with the feature. 👍 pending any implementation notes from @rafaelfranca. Once we get those cleared away, I would appreciate some more comments/docs specifically I would like some on ColumnDumper module and a comment above where we are rescuing the includes explaining the logic behind why someone might want to use the external library. Thanks for your work on this!

@danmcclain

This comment has been minimized.

Copy link
Contributor

danmcclain commented Sep 11, 2012

@schneems I will definitely add more documentation to my changes, thanks for reviewing this PR

@danmcclain

This comment has been minimized.

Copy link
Contributor

danmcclain commented Sep 13, 2012

@schneems I added documentation to the spots you mentioned. Are there any other parts you think should be documented? Let me know if I should expand further where I have already added comments. Thanks

@bcardarella

This comment has been minimized.

Copy link
Contributor

bcardarella commented Sep 13, 2012

I'm guessing the next step is going to be the guide

@rafaelfranca

This comment has been minimized.

Copy link
Member

rafaelfranca commented Sep 13, 2012

@bcardarella I think we don't have any kind of information about database specific datatypes in the guides.

@bcardarella

This comment has been minimized.

Copy link
Contributor

bcardarella commented Sep 13, 2012

@rafaelfranca oh ok :)

@rafaelfranca

View changes

activerecord/lib/active_record/connection_adapters/postgresql/cast.rb Outdated
parse_pg_array(string).map{|val| oid.type_cast val}
end

def parse_data(string, index)

This comment has been minimized.

@rafaelfranca

rafaelfranca Sep 13, 2012

Member

This should be private.

@rafaelfranca

View changes

activerecord/lib/active_record/connection_adapters/postgresql/cast.rb Outdated
array
end

def parse_array_contents(array, string, index)

This comment has been minimized.

@rafaelfranca

rafaelfranca Sep 13, 2012

Member

This should be private

@rafaelfranca

View changes

activerecord/lib/active_record/connection_adapters/postgresql/cast.rb Outdated
return local_index,array
end

def add_item_to_array(array, current_item, quoted)

This comment has been minimized.

@rafaelfranca

rafaelfranca Sep 13, 2012

Member

This should be private

@rafaelfranca

View changes

activerecord/lib/active_record/connection_adapters/postgresql/cast.rb Outdated
require 'pg_array_parser'
include PgArrayParser
rescue LoadError
def parse_pg_array(string)

This comment has been minimized.

@rafaelfranca

rafaelfranca Sep 13, 2012

Member

Maybe would be a good idead move all the methods related to add_item_to_array to a module to beter organize.

@rafaelfranca

This comment has been minimized.

Copy link
Member

rafaelfranca commented Sep 13, 2012

After these changes that I requested remember to rebase and squash the commits.

Dan McClain
Moves column dump specific code to a module included in AbstractAdapter
Having column related schema dumper code in the AbstractAdapter. The
code remains the same, but by placing it in the AbstractAdapter, we can
then overwrite it with Adapter specific methods that will help with
Adapter specific data types.

The goal of moving this code here is to create a new migration key for
PostgreSQL's array type. Since any datatype can be an array, the goal is
to have ':array => true' as a migration option, turning the datatype
into an array. I've implemented this in postgres_ext, the syntax is
shown here: https://github.com/dockyard/postgres_ext#arrays

Adds array migration support

Adds array_test.rb outlining the test cases for array data type
Adds pg_array_parser to Gemfile for testing
Adds pg_array_parser to postgresql_adapter (unused in this commit)

Adds schema dump support for arrays

Adds postgres array type casting support

Updates changelog, adds note for inet and cidr support, which I forgot to add before

Removing debugger, Adds pg_array_parser to JRuby platform

Removes pg_array_parser requirement, creates ArrayParser module used by
PostgreSQLAdapter
@danmcclain

This comment has been minimized.

Copy link
Contributor

danmcclain commented Sep 14, 2012

@rafaelfranca Made the requested changes, and rebased against the latest master. Thanks for the input

rafaelfranca added a commit that referenced this pull request Sep 17, 2012

Merge pull request #7547 from danmcclain/pg-arrays
Adds migration and type casting support for PostgreSQL Array datatype

@rafaelfranca rafaelfranca merged commit beaac33 into rails:master Sep 17, 2012

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment