Persistence implementation: ActiveRecord pattern #91

Closed
wants to merge 26 commits into from


karmi
Contributor

@karmi karmi commented Apr 28, 2014

Implement ActiveRecord-pattern persistence for models, oriented mainly towards Rails applications, similar to Tire::Persistence.

Using the Elasticsearch::Persistence::Repository from #71, this patch adds support for the ActiveRecord pattern of persistence for Ruby objects in Elasticsearch.

The goal is to mirror the ActiveRecord::Base interface 1:1,
so that it can be used as a drop-in replacement for similar OxMs in Rails applications,
with minimal changes to the model definition and application code.

The model implementation uses Virtus for handling the model attributes, and ActiveModel for validations, callbacks, and similar model-related features.

Example

require 'elasticsearch/persistence/model'

class Person
  include Elasticsearch::Persistence::Model

  settings index: { number_of_shards: 1 }

  attribute :name, String,
            mapping: { fields: {
              name: { type: 'string', analyzer: 'snowball' },
              raw:  { type: 'string', analyzer: 'keyword' }
            } }

  attribute :birthday,   Date
  attribute :department, String
  attribute :salary,     Integer
  attribute :admin,      Boolean, default: false

  validates :name, presence: true

  before_save do
    puts "About to save: #{self}"
  end
end

Person.gateway.create_index! force: true

person = Person.create name: 'John Smith', salary: 10_000
# About to save: #<Person:0x007f961e89f010>
# => #<Person:0x007f961e89f010 ...>

person.id
# => "zNf3yxZDQsOTZfNfTX4E5A"

person = Person.find(person.id)
# => #<Person:0x007f961cf1f478 ... >

person.salary
# => 10000

person.increment :salary
# => { ... "_version"=>2}
person.salary
# => 10001

person.update admin: true
# => { ... "_version"=>3}

Person.search('smith').to_a
# => [#<Person:0x007f961ebc5b90 ...>]

TODO

  • ActiveRecord (Model) pattern
  • Integration tests
  • README documentation
  • Code annotation (Rubydoc)
  • Example Rails application

@2xmc

2xmc commented May 21, 2014

Looks great!
Is this an experiment, or are you planning to merge it into master?

@karmi
Contributor Author

karmi commented May 21, 2014

LOL :) No, this is not an experiment, it will be merged into master once done.

@2xmc

2xmc commented May 21, 2014

Thanks, looking forward to it.

karmi added 17 commits May 27, 2014 15:44
… Rubygems

Otherwise, Bundler is stuck in endless "Resolving dependencies" loop.
…ashie::Mash for easier access

    response = Article.search query: { match: { title: { query: 'test' } } },
                              aggregations: { dates: { date_histogram: { field: 'created_at', interval: 'hour' } } }

    assert_equal 2, response.response.aggregations.dates.buckets.first.doc_count
    # => 2
…stence

Using the `Elasticsearch::Persistence::Repository` from previous commits,
this patch adds support for the ActiveRecord pattern of persistence
for Ruby objects in Elasticsearch.

The goal is to mirror the ActiveRecord::Base interface 1:1,
so that it can be used as a drop-in replacement for similar OxMs in Rails applications,
with minimal changes to the model definition and application code.

The model implementation uses [Virtus](https://github.com/solnic/virtus)
for handling the model attributes, and [ActiveModel](https://github.com/rails/rails/tree/master/activemodel)
for validations, callbacks, and similar model-related features.

…ls forms

  Started POST "/articles" for 127.0.0.1 at 2014-04-28 19:03:35 +0200
  Processing by ArticlesController#create as HTML
    Parameters: {"utf8"=>"✓", "authenticity_token"=>"RS4ZqcdL8SPbo0g9kNzPG24D+PpspIit4SyOXcLhYXk=", "article"=>{"title"=>"With Date", "content"=>"", "published_on(1i)"=>"2014", "published_on(2i)"=>"4", "published_on(3i)"=>"1", "published_on(4i)"=>"15", "published_on(5i)"=>"00"}, "commit"=>"Create Article"}

  POST http://localhost:9250/articles/article [status:201, request:0.155s, query:n/a]
  > {"created_at":"2014-04-28T17:03:35.190+00:00","updated_at":"2014-04-28T17:03:35.190+00:00","title":"With Date","content":"","published_on":"2014-04-01T15:00:00.000+00:00"}

This has to be refactored into a Railtie
    Person.all.to_a

    # 2014-05-05 15:02:24 +0200: GET http://localhost:9250/people/person/_search [status:200, request:0.047s, query:0.024s]
    # 2014-05-05 15:02:24 +0200: > {"query":{"match_all":{}},"size":10000}
    # 2014-05-05 15:02:24 +0200: < # {"took":24,"timed_out":false,"_shards":{"total":5,"successful":5,"failed":0},"hits":{"total":100,"max_score":1.0,"hits":[ .... ]}}
    # => [#<Person:0x007ff1d8fb04b0 ... ]
…block is not passed

Example:

    Person
      .find_in_batches(size: 100)
      .each { |batch| puts batch.results.map(&:name) }
    # Test 0
    # Test 1
    # Test 2
    # Test 3
    # Test 4

See:

* http://ruby-doc.org/core-2.1.2/Object.html#method-i-to_enum
* http://blog.arkency.com/2014/01/ruby-to-enum-for-enumerator/
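A minimal plain-Ruby sketch of the `to_enum` pattern this commit message refers to. `FakePerson`, its `RECORDS` constant, and the options-hash signature are hypothetical stand-ins for illustration only; they are not the gem's implementation and do not talk to Elasticsearch.

```ruby
# Hypothetical stand-in for a model class; NOT the gem's code.
class FakePerson
  RECORDS = (0...5).map { |i| "Test #{i}" }.freeze

  # Yields records in slices of options[:size].
  # When no block is passed, returns an Enumerator built with to_enum,
  # so callers can chain .each, .map, and friends.
  def self.find_in_batches(options = {})
    return to_enum(:find_in_batches, options) unless block_given?

    RECORDS.each_slice(options.fetch(:size, 2)) { |batch| yield batch }
  end
end

# Without a block we get an Enumerator back.
batches = FakePerson.find_in_batches(size: 2)

# Chaining works the same way as in the example above.
batches.each { |batch| puts batch.join(', ') }
```

The early `return to_enum(...) unless block_given?` line is the whole trick: the method re-enters itself with the same arguments once the enumerator is actually iterated.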
    Person.find_each { |person| puts person.name }

    # # GET http://localhost:9200/people/person/_search?scroll=5m&search_type=scan&size=20
    # # GET http://localhost:9200/_search/scroll?scroll=5m&scroll_id=c2Nhbj...
    # Test 0
    # Test 1
    # Test 2
    # ...
    # # GET http://localhost:9200/_search/scroll?scroll=5m&scroll_id=c2Nhbj...
    # Test 20
    # Test 21
    # Test 22

See: http://api.rubyonrails.org/classes/ActiveRecord/Batches.html#method-i-find_each
    person.inspect
    #<Person {id: "NkltJP5vRxqk9_RMP7SU8Q", ..., name: "Test 100", ...}>
…/_type/_version are set per model instance

Previously, it was not possible to set a custom ID for a model, when creating/saving it.
This patch fixes this ridiculous oversight.

Also, when the model is saved into a *different* index than the class-level `index_name`, the `_index`
method returns it correctly. Also applies to `_type` and `_version`.
    people = Person.search query: { match: { name: 'smith' } },
                           highlight: { fields: { name: {} } }

    people.first.hit.highlight['name'].first
    # => "John <em>Smith</em>"
@ianneub

ianneub commented Jun 5, 2014

Is it or will it be possible to set the elasticsearch client for each model? For example:

class Person
  include Elasticsearch::Persistence::Model

  client Elasticsearch::Client.new url: ENV['ELASTICSEARCH_PERSON_SERVER'], log: true

  settings index: { number_of_shards: 1 }

  attribute :name, String,
            mapping: { fields: {
              name: { type: 'string', analyzer: 'snowball' },
              raw:  { type: 'string', analyzer: 'keyword' }
            } }

  attribute :birthday,   Date
  attribute :department, String
  attribute :salary,     Integer
  attribute :admin,      Boolean, default: false

  validates :name, presence: true

  before_save do
    puts "About to save: #{self}"
  end
end

@karmi
Contributor Author

karmi commented Jun 5, 2014

@ianneub Yeah, that's possible via the gateway:

class MyModel
  # ...
  gateway do
    client Elasticsearch::Client.new url: 'foobar'
  end
end

MyModel.search('f').to_a.first
Faraday::ConnectionFailed: getaddrinfo: nodename nor servname provided, or not known

@ianneub

ianneub commented Jun 5, 2014

Nice! Thanks @karmi

@ianneub

ianneub commented Jun 5, 2014

Is it possible to use Kaminari with Elasticsearch::Persistence::Model?

@karmi
Contributor Author

karmi commented Jun 5, 2014

@ianneub Not yet, but it is planned.

@baronworks

I'm playing around with this branch and wondering whether it is possible to set the parent with an Elasticsearch::Persistence::Model and, if so, how?

@karmi
Contributor Author

karmi commented Jun 10, 2014

@baronworks Yeah, I've got an example application in the works, I'll push it, so you can have a look.

@karmi
Contributor Author

karmi commented Jun 11, 2014

I need more time to extract the application into a Rails template. In the meantime, here's how I define the mapping in the model:

class Artist
  include Elasticsearch::Persistence::Model
  # ...
end

class Album
  include Elasticsearch::Persistence::Model

  mapping _parent: { type: 'artist' } do
    indexes :suggest_title, type: 'completion', payloads: true
    indexes :suggest_track, type: 'completion', payloads: true
  end
end

I have an IndexManager class to create the index for both types:

class IndexManager
  def self.create_index(options={})
    client     = Artist.gateway.client
    index_name = Artist.index_name

    client.indices.delete index: index_name rescue nil if options[:force]

    settings = Artist.settings.to_hash.merge(Album.settings.to_hash)
    mappings = Artist.mappings.to_hash.merge(Album.mappings.to_hash)

    client.indices.create index: index_name,
                          body: {
                            settings: settings.to_hash,
                            mappings: mappings.to_hash }
  end

  # ...
end
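To make the merge step in `IndexManager` concrete, here is a plain-Ruby sketch using hand-written hashes. The hash contents are assumptions shaped like what `mappings.to_hash` and `settings.to_hash` might return; they were not produced by the gem.

```ruby
# Assumed shapes, written by hand for illustration.
artist_mappings = { artist: { properties: { name: { type: 'string' } } } }
album_mappings  = { album:  { _parent: { type: 'artist' },
                              properties: { suggest_title: { type: 'completion', payloads: true } } } }

# Each model contributes a different top-level key (its document type),
# so a plain Hash#merge keeps both mappings side by side.
mappings = artist_mappings.merge(album_mappings)

# Settings usually share the same top-level key. Hash#merge is shallow,
# so the second hash replaces the first one's value for that key.
artist_settings = { index: { number_of_shards: 1 } }
album_settings  = { index: { number_of_replicas: 0 } }
shallow = artist_settings.merge(album_settings)

# Passing a block merges one level deeper when both sides define the key.
deep = artist_settings.merge(album_settings) { |_key, a, b| a.merge(b) }
```

With these sample hashes, `shallow` keeps only `number_of_replicas`, while `deep` keeps both index settings. Whether that matters depends on whether the two models define different settings.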

When an Album is created, parent is passed simply as an argument (all the arguments are passed down the chain):

Album.create({ title: "Foo" }, id: 'repeater', parent: 'fugazi')

@baronworks

Thanks a lot @karmi, very much appreciated. I will be giving this a try later this afternoon.

@xinuc

xinuc commented Jun 12, 2014

👍 great job @karmi

@xinuc

xinuc commented Jun 12, 2014

Can we get the ability to set the document id please?

User.create id: "karmi", name: "Karel Minarik"

@karmi
Contributor Author

karmi commented Jun 12, 2014

@xinuc This was already added in 4d6aade, I've added an integration test in 783cd1a just now.

@baronworks

The IndexManager for the mappings was the clue I needed, and I managed to get my mappings all set up properly. I have a couple of mapping questions and/or possible issues.

Scenario 1:

document_type mapping

class Artist
    include Elasticsearch::Persistence::Model
    index_name :my_index
    document_type :my_artist
    ...
end

Artist.document_type
# => :my_artist
Artist.mappings.to_hash
# => {:artist=>{:properties=>{}}}

Shouldn't the mappings key be :my_artist rather than :artist when document_type is set?

Scenario 2:

attributes vs indexes and object type mappings

class Artist
    include Elasticsearch::Persistence::Model   

    attribute :some_map
    mapping dynamic: 'true' do
        indexes :some_map, type: 'object', default: {} 
    end 
end
Artist.mappings.to_hash
# => {:artist=>{:dynamic=>"true", :properties=>{:some_map=>{:type=>"object"}}}}

This is the mapping and behaviour I want, but doing:

class Artist
    include Elasticsearch::Persistence::Model      
    mapping dynamic: 'true' do
        indexes :some_map, type: 'object', default: {} 
    end 
    attribute :some_map
end
Artist.mappings.to_hash
# => {:artist=>{:dynamic=>"true", :properties=>{:some_map=>{:type=>"string"}}}}

Causes the mapping for :some_map to now be :type=>"string", based on the lookup_type in Elasticsearch::Persistence::Model::Utils.

So for this scenario I have two questions:

    1. Will attribute allow for type Object in the future?
    2. I notice that a field set with indexes :field is not an attribute of the model until it is also defined using attribute :field. Can defining a field with indexes also define it as a Virtus attribute?

I realize that this is a work in progress, and apologies if I'm jumping the gun on any functionality that is planned but not yet realized. Many thanks!
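The order dependence described in Scenario 2 is consistent with a "last writer wins" registry. The sketch below is a hypothetical miniature written only to illustrate that reading: `TinyMapping` and `attribute_keeping_explicit` are invented names, and nothing here is taken from the gem's source.

```ruby
# Hypothetical miniature of a mapping registry; NOT the gem's code.
class TinyMapping
  attr_reader :properties

  def initialize
    @properties = {}
  end

  # Explicit mapping, as in `indexes :some_map, type: 'object'`.
  def indexes(name, options = {})
    @properties[name] = options
  end

  # Attribute declaration that derives a default type.
  # It overwrites whatever was registered before (last writer wins).
  def attribute(name, type = 'string')
    @properties[name] = { type: type }
  end

  # Variant that keeps an explicit mapping if one already exists.
  def attribute_keeping_explicit(name, type = 'string')
    @properties[name] ||= { type: type }
  end
end

# Mapping first, attribute second: the explicit 'object' type is lost.
overwritten = TinyMapping.new
overwritten.indexes :some_map, type: 'object'
overwritten.attribute :some_map

# Same order, but the attribute only fills in a missing mapping.
preserved = TinyMapping.new
preserved.indexes :some_map, type: 'object'
preserved.attribute_keeping_explicit :some_map
```

Under this assumption, `overwritten.properties[:some_map][:type]` ends up as `'string'` and `preserved.properties[:some_map][:type]` stays `'object'`.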

@xinuc

xinuc commented Jun 13, 2014

@karmi ah, apparently I tried to set _id instead of id

Thanks.

@xinuc

xinuc commented Jun 16, 2014

waiting for association & scoping support 😁

@karmi
Contributor Author

karmi commented Jun 16, 2014

@xinuc There's very little chance there will be any DSL-ish support for associations, because that is handled quite elegantly by Virtus. Scopes can be added in the future, but it's not an immediate plan.

@xinuc

xinuc commented Jun 16, 2014

Sorry, I'm not familiar with Virtus. But do you mean Virtus can handle something like has_many or belongs_to?

@karmi
Contributor Author

karmi commented Jun 16, 2014

@xinuc Yes, exactly, please see the Virtus documentation. There is no belongs_to DSL, you just correctly configure the attribute -- which is the right way. There will be a full example application with Artist has many albums type of relationship published soon, as part of this repository.

@xinuc

xinuc commented Jun 17, 2014

Great!

but I really don't understand how this works.

I have

class Conversation
  attribute :messages, Array[Message]
end

class Message
  attribute :body, String
end

my conversation doc

{
  "_index" : "messages",
  "_type" : "conversation",
  "_id" : "wGYKa0uMTKCMWV9rgGq3cw",
  "found" : false
}

and my message doc

{
  "_index" : "messages",
  "_type" : "message",
  "_id" : "BufkHKqHReyzw6eBkl60Ow",
  "_version" : 3,
  "found" : true,
  "_source":{"created_at":"2014-06-17T09:09:18.713+00:00","updated_at":"2014-06-17T09:09:21.255+00:00","body":"hello"}
}

and conversation.messages works great.

where does virtus save the association? there's no foreign key there 😕

@xinuc

xinuc commented Jun 17, 2014

ugh, sorry,

apparently all messages are saved as embedded docs inside the conversation.

{
  "_index" : "messages-conversations",
  "_type" : "conversation",
  "_id" : "wGYKa0uMTKCMWV9rgGq3cw",
  "_version" : 1,
  "found" : true,
  "_source":{"created_at":"2014-06-17T09:08:55.322+00:00","updated_at":"2014-06-17T09:08:55.322+00:00","user_id":null,"partner_id":null,
  "messages":[
      {"created_at":"2014-06-17T09:09:18.713+00:00","updated_at":"2014-06-17T09:09:21.255+00:00","body":"halo","id":"BufkHKqHReyzw6eBkl60Ow"}]
      }
}

Is there any way to create a relational-like association without embedding?
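The embedding observed above can be reproduced in plain Ruby without Virtus or Elasticsearch. In this sketch the `Message` and `Conversation` structs and the `to_document` method are hypothetical illustrations of "serialize nested objects inline"; they are not how the gem or Virtus is implemented.

```ruby
require 'json'

# Hypothetical value objects; NOT Virtus or the gem's model classes.
Message = Struct.new(:id, :body) do
  def to_document
    { id: id, body: body }
  end
end

Conversation = Struct.new(:id, :messages) do
  # Nested objects are serialized inline, so the messages end up
  # embedded in the conversation document rather than referenced by key.
  def to_document
    { messages: messages.map(&:to_document) }
  end
end

conversation = Conversation.new('wGYKa0uMTKCMWV9rgGq3cw',
                                [Message.new('BufkHKqHReyzw6eBkl60Ow', 'halo')])

document = JSON.generate(conversation.to_document)
puts document
```

Because the whole object graph is turned into one hash before it is sent, there is no foreign key to look for: the "association" is simply a nested array inside the parent document.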

@karmi
Contributor Author

karmi commented Jun 17, 2014

@xinuc Have a look at Artist.rb and Album.rb, which use parent+child relationship in Elasticsearch.

…odel

Usage:

    $  bundle exec rails generate scaffold Person name:String email:String --orm=elasticsearch --force
…th persistence model

Usage:

    rails new music --force --skip --skip-bundle --skip-active-record --template /path/to/template.rb
    rails new music --force --skip --skip-bundle --skip-active-record --template https://raw.githubusercontent.com/elasticsearch/elasticsearch-rails/persistence-model/elasticsearch-persistence/examples/music/template.rb
@karmi karmi closed this in b21e959 Jun 18, 2014
@xinuc

xinuc commented Jun 18, 2014

w00t

@twmills

twmills commented Jun 30, 2014

@karmi Is support for the :parent option available in 0.1.4?

I followed the example code, but the parent doesn't save on the child document.

@picandocodigo picandocodigo deleted the persistence-model branch September 1, 2020 09:05