Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement deferred save/deletion (autosave/mark_for_destruction) functionality #1401

Open
brianewell opened this issue Aug 15, 2017 · 7 comments

Comments

@brianewell
Copy link

I'm curious if there's reason that deferred save/deletion functionality hasn't been implemented in neo4jrb (or if it has and I just couldn't find it). As I've been limping around this limitation in one of my projects, I'd be willing to attempt implementing it, I'm just doing some cursory reconnaissance before I dig deeper into the code.

@jorroll
Copy link
Contributor

jorroll commented Aug 15, 2017

@brianewell, is your goal to mark several things for deletion and then send them all to the database as part of one http request? Because you may be able to accomplish this using neo4j-core's queries method

results = neo4j_session.queries do
  append 'MATCH (n:Foo) RETURN n LIMIT 10'
  append 'MATCH (n:Bar) RETURN n LIMIT 5'
end

results[0] # results of first query
results[1] # results of second query

@cheerfulstoic
Copy link
Contributor

Good callout @thefliik ! One thing to note is that the queries method will (I think) provide the most speedup via HTTP (because Bolt is a constant back and forth for each query and I don't think they've implemented the ability to send many requests at once asynchronously yet)

But if that's not it, an attempt to add the feature would definitely be welcome! I would suggest laying out an example(s) in this issue so that we can discuss it.

At some point it would be great to be able to have the ability to batch create / update objects similar to the activerecord-import gem (possibly using the queries object could be used in the background).

Also be aware that there is the neo4apis gem which gives a DSL for importing many object in batches, as well as other options laid out here

@brianewell
Copy link
Author

@thefliik

@brianewell, is your goal to mark several things for deletion and then send them all to the database as part of one http request?

Ideally, I'd like to create, update or mark for destruction several nodes/relations associated with some primary node and have all of those changes included within the same neo4j transaction block (making them atomic, assuming I understand neo4j transactions).

My specific use case has a single form manipulating multiple nodes and relationships (I'm making heavy use of fields_for, and have been having trouble translating my experience with ActiveRecord to Neo4j.rb. I'd like to be able to reduce complexity in my model code, which right now is a mess of managing the creation and deletion of other nodes in such a way that the validation code will still work. I remember this being much easier to do in ActiveRecord (using autosave and accepts_nested_attributes_for in the model)

Because you may be able to accomplish this using neo4j-core's queries method

Thanks for the pointer, I'll definitely check that approach out first. This post was mainly intended as an inquiry into if the feature actually already existed (and I just couldn't find it) or if it had been attempted before and for some reason didn't persist.

@cheerfulstoic

But if that's not it, an attempt to add the feature would definitely be welcome! I would suggest laying out an example(s) in this issue so that we can discuss it.

Sure thing!

What I envision is something similar to ActiveRecord's autosave association functionality. Specifically "a module that takes care of automatically saving associated records when their parent is saved. In addition to saving, it also destroys any associated records that were marked for destruction."

The use case would be pretty much the same as the use case outlined in the ActiveRecord module. I'll provide an example based on the ActiveRecord version linked above:

One-to-one Example

class Post
    include Neo4j::ActiveNode

    property :title
    has_one :in, :author, type: :AUTHORED, autosave: true
end

Saving changes to the parent and its associated model can now be performed automatically and atomically:

post = Post.first
post.title          # => "The current global position of migrating ducks"
post.author.name    # => "alloy"

post.title = "On the migration of ducks"
post.author.name = "Eloy Duran"

post.save
post.reload
post.title          # => "On the migration of ducks"
post.author.name    # => "Eloy Duran"

Destroying an associated model, as part of the parent's save action, is as simple as marking it for destruction:

post.author.mark_for_destruction
post.author.marked_for_destruction? # => true

Note that the author is not yet removed from the database:

uuid = post.author.uuid
Author.find_by( uuid: uuid ).nil? # => false

Until the post is saved:

post.save
post.reload.author # => nil
Author.find_by( uuid: uuid ) # => nil

One-to-many Example

The default behavior of ActiveRecord when :autosave is not declared is that new objects of a has_many association are saved when their parent object is saved.

class Post < ActiveRecord::Base
    has_many :comments # :autosave option is not declared
end

post = Post.new( title: 'ruby rocks' )
post.comments.build( body: 'hello world' )
post.save # => saves both post and comment

post = Post.create( title: 'ruby rocks' )
post.comments.build( body: 'hello world' )
post.save # => saves both post and comment

post = Post.create( title: 'ruby rocks' )
post.comments.create( body: 'hello world' )
post.save # => saves both post and comment

While this is default behavior for ActiveRecord, it's not default neo4j.rb behavior, and I'm reluctant to change existing behavior. I also question the consistency of this decision on the part of ActiveRecord (why is the save operation for one object explicitly saving another object without my input?) Regardless we can certainly have a discussion about it.

What I would like to see here is an autosave option which will automatically create any new nodes and relationships, modify any updated nodes and relationships, and delete any marked nodes and relationships by following any has_many associations that have autosave: true:

class Post
    has_many :in, :comments, type: :ON, autosave: true
end

post = Post.new( title: "Neo4j rocks" )
post.comments = [ Comment.new( body: "Hello World" ) ]
post.save # => saves both post and comment

Additionally, the association proxy could be extended with build or a similar method name which acts much like create but without saving a relationship between two nodes. I'll use an example from Neo4j.rb screencast #3:

class Asset
    include Neo4j::ActiveNode

    property :title

    has_many :out, :categories, type: :HAS_CATEGORY, autosave: true
end

class Category
    include Neo4j::ActiveNode

    property :name
end

asset = Asset.create( title: 'Talking about Neo4j' ) # => This will create asset in neo4j
category = Category.create( name: 'Graphs' ) # => This will create category in neo4j

asset.categories.create( category, weight: 5 ) # => This will create relationship between asset and category in neo4j
asset.save # => This step is completely unnecessary.

Using build:

asset = Asset.new( title: 'Talking about Neo4j' ) # => This will create asset and not commit it
category = Category.new( name: 'Graphs' ) # => This will create category and not commit it

asset.categories.build( category, weight: 5 ) # => This will create relationship between asset and category in neo4j and not commit it
asset.save # => This will commit the asset, category, and relationship between them to neo4j.

Additionally autosave could be added as an option to Neo4j::ActiveRel under from_class and to_class This would allow you to issue a save from a relationship out to the source node and the target node:

asset = Asset.new( title: 'Talking about Neo4j' ) # => This will create asset and not commit it
category = Category.new( name: 'Graphs' ) # => This will create category and not commit it

asset.categories.create( category, weight: 5 ) # => This will create relationship between asset and category in neo4j and commit the asset, category, and relationship to neo4j
asset.save # => This step is completely unnecessary.

As of neo4j.rb 8.0.18 this fails to properly commit unless asset and category have been explicitly committed beforehand.

This should be enough to spark conversation. Please let me know your thoughts.

@subvertallchris
Copy link
Contributor

It's been quite a while but I seem to remember doing a little research and coming away thinking that AutosaveAssociation would depend on some features that we didn't have. I don't remember what all of those features were, but I am think that an association cache was one of them. We're in a much better place now, I'm sure it's much more doable now.

@jorroll
Copy link
Contributor

jorroll commented Aug 17, 2017

@brianewell

Ideally, I'd like to create, update or mark for destruction several nodes/relations associated with some primary node and have all of those changes included within the same neo4j transaction block (making them atomic, assuming I understand neo4j transactions).

You can already / easily wrap multiple queries inside one transaction block using

Neo4j::ActiveBase.run_transaction do |tx|
  # whatever you want

  Neo4j::ActiveBase.run_transaction do
     #whatever you want
  end
end

This will make everything wrapped inside those transaction blocks roll back on a failure. Of note though, if any transaction fails within a group of nested transactions, all of the transactions in that nesting will fail (basically, the failure bubbles up).

Neo4j supports a lite version of ActiveRecord's build command (though I do miss build). It's explained in the docs here. Basically, you can build an association for a parent that itself has not been persisted yet. When this is done, the association (both relation and node) will automatically be persisted when you callsave on the parent (like with build). However, if the parent has already been persisted, the association itself is immediately persisted. It's probably easier to understand if you just look at the docs. This is definitely a lite version of build though. Very limited.

@cheerfulstoic
Copy link
Contributor

As @subvertallchris we'd need an association cache, but I'm pretty sure that we have that now in the AssociationProxy class, so that might mean that it wouldn't be too difficult. Having matching functionality for ActiveRel is a good callout, but that would probably need to be implemented more or less from scratch regardless.

Regarding the default behavior of ActiveRecord vs. ActiveNode, one thing that I would point out is that I believe we have the ability to do object = ModelClass.new(association: unpersisted_association_object) and doing an object.save would save unpersisted_association_object to the database. So then it would probably be consistent that building via an association and then calling save would save the association as well.

In short, I think it would be wonderful to have build as well as autosaving to be more consistent with ActiveRecord. People definitely come on occasion with problems with fields_for and it's not something that I use and so I don't have much guidance for them.

@brianewell
Copy link
Author

brianewell commented Aug 22, 2017

@thefliik

You can already / easily wrap multiple queries inside one transaction block using

I'm sorry for not communicating this more clearly earlier. I would like behavior that automatically saves associated records when a parent record is saved, and allows records to be flagged for deletion and deleted when some a parent record is saved (or potentially when deleted). I was getting wrapped up in the details of wanting that save to also occur within a single transaction, and while that's still critical and true, it's a much smaller part of the bigger picture.

As of this morning, I was saving associated records using a custom after_save method. I am now using a custom around_save method that opens a Neo4j::ActiveBase.run_transaction block, yields to perform the save of the parent, and then iterates through the associated records.

class Post
    include Neo4j::ActiveNode

    property :title
    has_one :in, :author, type: :AUTHORED

    around_save :save_author

    private

    save_author
        # Do pre-save stuff
        Neo4j::ActiveBase.run_transaction do
            yield
            # Save other objects
        end
    end
end

While this definitely addresses my concerns about atomic transactions (thank you for that), I'm still finding that management logic for this ('do pre-save stuff' and 'save other objects') is taking up more space in my model than I'd like it to, especially compared to the business logic is. I'm also recognizing that I've been tending to repeat my object management code in different models, violating DRY.

I will look closer into build lite before I continue with this thread (sorry @cheerfulstoic, but I'll respond to your post soon)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants