v0.1 release
* Added batch support, for much faster initialization of the current DB or reindexing of the whole DB.
* Dropped per-model indexes; Neoid now uses `node_auto_index` and `relationship_auto_index`, letting Neo4j auto index objects.
* One `neo_save` method instead of `neo_create` and `neo_update`. It takes care of inserting or updating.
elado committed Jan 8, 2013
1 parent 2b1468b commit ead47a6
Showing 20 changed files with 1,033 additions and 218 deletions.
35 changes: 35 additions & 0 deletions CHANGELOG.md
@@ -1,3 +1,38 @@
## v0.1

* Added batch support, for much faster initialization of the current DB or reindexing of the whole DB.
* Dropped per-model indexes; Neoid now uses `node_auto_index` and `relationship_auto_index`, letting Neo4j auto index objects.
* One `neo_save` method instead of `neo_create` and `neo_update`. It takes care of inserting or updating.

### Breaking changes:

Model indexes (such as `users_index`) are now turned off by default. Instead, Neoid uses Neo4j's auto indexing feature.

In order to have the model indexes back, use this in your configuration:

```ruby
Neoid.configure do |c|
  c.enable_per_model_indexes = true
end
```

This turns model indexes on for all models.

You can turn them off for a specific model with:

```ruby
class User < ActiveRecord::Base
  include Neoid::Node

  neoidable enable_model_index: false do |c|
  end
end
```
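
One way to picture how a global setting and a per-model option could combine is the plain-Ruby sketch below. The helper `model_index_enabled?` is hypothetical and is not part of Neoid; it only assumes that a per-model option, when given, takes precedence over the global default.

```ruby
# Hypothetical helper, not part of Neoid: a per-model option, when given,
# wins over the global default.
def model_index_enabled?(global_default, model_option)
  model_option.nil? ? global_default : model_option
end

global_default = true # stands in for c.enable_per_model_indexes = true

user_setting  = model_index_enabled?(global_default, false) # like neoidable enable_model_index: false
movie_setting = model_index_enabled?(global_default, nil)   # no per-model option given

puts "User: #{user_setting}, Movie: #{movie_setting}"
```

How Neoid actually resolves the two settings is not shown in this changelog, so treat the precedence rule as an assumption.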

## v0.0.51

* Releasing Neoid as a gem.

## v0.0.41

* Fixed a really annoying bug caused by Rails design: Rails doesn't call `after_destroy` when assigning many-to-many relationships to a model, like `user.movies = [m1, m2, m3]` or `user.update_attributes(params[:user])` where it contains a `params[:user][:movie_ids]` list (say, from checkboxes), but it does call `after_create` for the new relationships. The fix adds an `after_remove` callback to the `has_many` relationships, ensuring Neo4j is up to date with all changes, no matter how they were committed.
195 changes: 148 additions & 47 deletions README.md
@@ -3,7 +3,6 @@
[![Build Status](https://secure.travis-ci.org/elado/neoid.png)](http://travis-ci.org/elado/neoid)





Make your ActiveRecords stored and searchable on Neo4j graph database, in order to make fast graph queries that MySQL would crawl while doing them.

Neoid to Neo4j is like Sunspot to Solr. You get the benefits of Neo4j speed while keeping your schema on your plain old RDBMS.
@@ -12,18 +11,21 @@ Neoid doesn't require JRuby. It's based on the great [Neography](https://github.


Neoid offers querying Neo4j for IDs of objects and then fetching them from your RDBMS, or storing all desired data on Neo4j.

**Important: Heroku support is not available because Heroku doesn't support Gremlin. Until further notice, the easiest way is to self-host Neo4j on EC2 in the same zone and connect to it from your dyno.**

## Changelog

[See Changelog](https://github.com/elado/neoid/blob/master/CHANGELOG.md)




## Installation

Add to your Gemfile and run the `bundle` command to install it.

```ruby
-gem 'neoid', '~> 0.0.51'
+gem 'neoid', '~> 0.1'
```


Future versions may have breaking changes but will arrive with migration code.

**Requires Ruby 1.9.2 or later.**

## Usage
@@ -51,6 +53,11 @@ Neography.configure do |c|
end

Neoid.db = $neo

Neoid.configure do |c|
  # should Neoid create sub-reference from the ref node (id#0) to every node-model? default: true
  c.enable_subrefs = true
end
```


`01_` in the file name is in order to get this file loaded first, before the models (initializers are loaded alphabetically).
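
As a rough illustration of why the `01_` prefix helps, here is a plain-Ruby sketch of alphabetical ordering. The file names are hypothetical, and the snippet only sorts strings; it does not load anything or touch Rails.

```ruby
# Hypothetical initializer file names, used only to illustrate sort order.
initializers = ["session_store.rb", "01_graph_db.rb", "inflections.rb"]

# Digits sort before letters, so the "01_" file comes first.
load_order = initializers.sort

puts load_order.first
```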
@@ -71,9 +78,9 @@ class User < ActiveRecord::Base
end
```


-This will help to create a corresponding node on Neo4j when a user is created, delete it when a user is destroyed, and update it if needed.
+This will help to create/update/destroy a corresponding node on Neo4j when changes are made to a User model.

-Then, you can customize what fields will be saved on the node in Neo4j, inside neoidable configuration:
+Then, you can customize what fields will be saved on the node in Neo4j, inside `neoidable` configuration, using `field`. You can also pass blocks to save content that's not a real column:


```ruby
class User < ActiveRecord::Base
@@ -89,7 +96,6 @@ class User < ActiveRecord::Base
end
```



#### Relationships

Let's assume that a `User` can `Like` `Movie`s:
@@ -151,7 +157,7 @@ class Like < ActiveRecord::Base
end
```


-Neoid adds `neo_node` and `neo_relationships` to nodes and relationships, respectively.
+Neoid adds the methods `neo_node` and `neo_relationships` to instances of nodes and relationships, respectively.

So you could do:

@@ -169,46 +175,60 @@ rel.end_node # user.movies.first.neo_node
rel.rel_type # 'likes'
```


-## Index for Full-Text Search
+#### Disabling auto saving to Neo4j:

-Using `search` block inside a `neoidable` block, you can store certain fields.
+If you'd like to save nodes manually rather than after_save, use `auto_index: false`:

```ruby
-# movie.rb
-
-class Movie < ActiveRecord::Base
+class User < ActiveRecord::Base
  include Neoid::Node

-  neoidable do |c|
-    c.field :slug
-    c.field :name
-
-    c.search do |s|
-      # full-text index fields
-      s.fulltext :name
-      s.fulltext :description
-
-      # just index for exact matches
-      s.index :year
-    end
+  neoidable auto_index: false do |c|
  end
end
+
+user = User.create!(name: "Elad") # no node is created in Neo4j!
+
+user.neo_save # now there is!
```
-Records will be automatically indexed when inserted or updated.


## Querying

You can query with all [Neography](https://github.com/maxdemarzi/neography)'s API: `traverse`, `execute_query` for Cypher, and `execute_script` for Gremlin.


### Basics:

#### Finding a node by ID

Nodes and relationships are auto indexed in the `node_auto_index` and `relationship_auto_index` indexes, where the key is `Neoid::UNIQUE_ID_KEY` (which is 'neoid_unique_id') and the value is a combination of the class name and model id, such as `Movie:43`. This value is accessible with `model.neo_unique_id`. Use the constant and this method; never rely on assembling those values on your own, because they might change in the future.

That means, you can query like this:

```ruby
Neoid.db.get_node_auto_index(Neoid::UNIQUE_ID_KEY, user.neo_unique_id)
# => returns a Neography hash

Neoid::Node.from_hash(Neoid.db.get_node_auto_index(Neoid::UNIQUE_ID_KEY, user.neo_unique_id))
# => returns a Neography::Node
```
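
To make the key/value lookup concrete without a running Neo4j, here is a toy in-memory stand-in for an auto index. `FakeUser` and the nested hash are hypothetical and exist only for this sketch; the real lookup goes through `Neoid.db.get_node_auto_index` as shown above, and the unique id format may change.

```ruby
# Toy stand-in for an auto index: { key => { value => node_hash } }.
UNIQUE_ID_KEY = "neoid_unique_id" # the value quoted in the text above

# Hypothetical model stand-in; the real neo_unique_id comes from Neoid.
FakeUser = Struct.new(:id) do
  def neo_unique_id
    "User:#{id}"
  end
end

auto_index = Hash.new { |hash, key| hash[key] = {} }

# "Index" a node under the unique id of its model.
user = FakeUser.new(43)
auto_index[UNIQUE_ID_KEY][user.neo_unique_id] = { "ar_id" => user.id }

# Look up by key and value, always going through neo_unique_id.
found   = auto_index[UNIQUE_ID_KEY][user.neo_unique_id]
missing = auto_index[UNIQUE_ID_KEY][FakeUser.new(99).neo_unique_id]

puts found.inspect
puts missing.inspect
```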

#### Finding all nodes of type

If Subreferences are enabled, you can get the subref node and then get all attached nodes:

```ruby
Neoid.ref_node.outgoing('users_subref').first.outgoing('users_subref').to_a
# => this, according to Neography, returns an array of Neography::Node so no conversion is needed
```

### Gremlin Example:

These examples query Neo4j using Gremlin for IDs of objects, and then fetch them from ActiveRecord with an `in` query.

Of course, using `neoidable do |c| c.field ... end`, you can store all the data you need in Neo4j and avoid querying ActiveRecord.

-**Most popular categories**
+**Most liked movies**

```ruby
gremlin_query = <<-GREMLIN
@@ -228,15 +248,18 @@ movie_ids = Neoid.db.execute_script(gremlin_query)
Movie.where(id: movie_ids)
```


-Assuming we have another `Friendship` model which is a relationship with start/end nodes of `user` and type of `friends`,
+*Side note: the resulting movies won't be sorted by like count, because the RDBMS won't necessarily preserve the order of the list of IDs we passed. You can sort them yourself with array manipulation, since you have the IDs.*
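
A minimal sketch of that array manipulation, assuming `movie_ids` is already ordered by like count and the records came back in arbitrary order. `FakeMovie` is a hypothetical stand-in for the ActiveRecord model, so the snippet runs without a database.

```ruby
# Hypothetical stand-in for ActiveRecord rows.
FakeMovie = Struct.new(:id, :title)

# IDs as returned by the graph query, most liked first.
movie_ids = [7, 3, 5]

# Rows as the RDBMS might return them (here: ordered by primary key).
records = [FakeMovie.new(3, "B"), FakeMovie.new(5, "C"), FakeMovie.new(7, "A")]

# Map each id to its rank, then sort the rows by that rank.
position = movie_ids.each_with_index.to_h
sorted = records.sort_by { |movie| position.fetch(movie.id) }

puts sorted.map(&:title).join(", ")
```

Building the `position` hash once avoids calling `movie_ids.index` for every record, which matters when the list is long.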



**Movies of user friends that the user doesn't have**

Let's assume we have another `Friendship` model which is a relationship with start/end nodes of `user` and type of `friends`:

```ruby
user = User.find(1)

gremlin_query = <<-GREMLIN
-  u = g.idx('users_index')[[ar_id:user_id]].next()
+  u = g.idx('node_auto_index').get(unique_id_key, user_unique_id).next()
  movies = []
  u
@@ -246,15 +269,42 @@ gremlin_query = <<-GREMLIN
    .except(movies).collect{it.ar_id}
GREMLIN

-movie_ids = Neoid.db.execute_script(gremlin_query, user_id: user.id)
+movie_ids = Neoid.db.execute_script(gremlin_query, unique_id_key: Neoid::UNIQUE_ID_KEY, user_unique_id: user.neo_unique_id)

Movie.where(id: movie_ids)
```
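
For intuition, the same question ("movies my friends like that I don't") can be written as plain-Ruby set logic. This is only a sketch over hypothetical in-memory hashes; it is not the Gremlin traversal, and it does not talk to Neo4j.

```ruby
# Hypothetical data: user id => liked movie ids, and user id => friend ids.
likes   = { 1 => [10, 11], 2 => [11, 12], 3 => [12, 13] }
friends = { 1 => [2, 3] }

user_id = 1
own_movies = likes.fetch(user_id)

# Collect every movie liked by a friend, drop duplicates,
# then remove the ones the user already has.
candidate_ids = friends.fetch(user_id)
                       .flat_map { |friend_id| likes.fetch(friend_id, []) }
                       .uniq - own_movies

puts candidate_ids.inspect
```

The graph query does this work inside Neo4j, so only the resulting IDs travel back to the application.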


-`.next()` is in order to get a vertex object which we can actually query on.
-
-### Full Text Search
+## Full Text Search
+
+### Index for Full-Text Search
+
+Using `search` block inside a `neoidable` block, you can store certain fields.

```ruby
# movie.rb

class Movie < ActiveRecord::Base
  include Neoid::Node

  neoidable do |c|
    c.field :slug
    c.field :name

    c.search do |s|
      # full-text index fields
      s.fulltext :name
      s.fulltext :description

      # just index for exact matches
      s.index :year
    end
  end
end
```

Records will be automatically indexed when inserted or updated.

### Querying a Full-Text Search index


```ruby
# will match all movies with full-text match for name/description. returns ActiveRecord instances
@@ -270,14 +320,63 @@ Neoid.neo_search([Movie, User], "hello")
Movie.neo_search(year: 2013).results
```


Full text search with Neoid is very limited and is likely not to develop more than this basic functionality. I strongly recommend using gems like Sunspot over Solr.

## Batches

Neoid has a batch ability that is good for mass updating/inserting of nodes/relationships. It sends batched requests to Neography, and takes care of type conversion (Neography batch returns hashes and other primitive types) and "after" actions (via promises).

A few examples, easy to complex:

```ruby
Neoid.batch(batch_size: 100) do
  User.all.each(&:neo_save)
end
```

With `then`:

```ruby
User.first.name # => "Elad"

Neoid.batch(batch_size: 100) do
  User.all.each(&:neo_save)
end.then do |results|
  # results is an array of the script results from neo4j REST.

  results[0].name # => "Elad"
end
```

*Nodes and relationships in the results are automatically converted to Neography::Node and Neography::Relationship, respectively.*

With individual `then` as well as `then` for the entire batch:

```ruby
Neoid.batch(batch_size: 30) do |batch|
  (1..90).each do |i|
    (batch << [:create_node, { name: "Hello #{i}" }]).then { |result| puts result.name }
  end
end.then do |results|
  puts results.collect(&:name)
end
```

When in a batch, `neo_save` adds Gremlin scripts to the batch instead of running them immediately. The batch flushes whenever the `batch_size` option is met.
So even if you have 20000 users, Neoid will insert/update in smaller batches. Default `batch_size` is 200.
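
The flushing behavior described above can be sketched with a toy accumulator. `ToyBatch` is hypothetical and is not Neoid's implementation; it only assumes that commands queue up and are sent as a group once `batch_size` is reached, with a final flush for the remainder.

```ruby
# Toy batch: queues commands and "sends" them in groups of batch_size.
class ToyBatch
  attr_reader :flushes

  def initialize(batch_size: 200)
    @batch_size = batch_size
    @queue = []
    @flushes = [] # each entry is one group that would be sent in one request
  end

  # Queue a command; flush automatically once the queue is full.
  def <<(command)
    @queue << command
    flush if @queue.size >= @batch_size
    self
  end

  # Send whatever is queued (here: just record it) and empty the queue.
  def flush
    return if @queue.empty?

    @flushes << @queue.dup
    @queue.clear
  end
end

batch = ToyBatch.new(batch_size: 30)
(1..100).each { |i| batch << [:create_node, { name: "Hello #{i}" }] }
batch.flush # send the remainder that never reached batch_size

flush_sizes = batch.flushes.map(&:size)
puts flush_sizes.inspect
```

Keeping each group small bounds the size of a single request, which is the point of the `batch_size` option.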


## Inserting records of existing app

-If you have an existing database and just want to integrate Neoid, configure the `neoidable`s and run in a rake task or console
+If you have an existing database and just want to integrate Neoid, configure the `neoidable`s and run in a rake task or console.

Use batches! It's free, and much faster. Also, you should use `includes` to include the relationship edges on relationship entities, so it doesn't query the DB on each relationship.

```ruby
-[ Like.includes(:user).includes(:movie), OtherRelationshipModel ].each { |model| model.all.each(&:neo_update) }
+Neoid.batch do
+  [ Like.includes(:user).includes(:movie), OtherRelationshipModel.includes(:from_model).includes(:to_model) ].each { |model| model.all.each(&:neo_save) }

-NodeModel.all.each(&:neo_update)
+  NodeModel.all.each(&:neo_save)
+end
```


This will loop through all of your relationship records and generate the two edge nodes along with a relationship (eager loading for better performance).
@@ -289,30 +388,32 @@ Better interface for that in the future.


## Behind The Scenes

-Whenever the `neo_node` on nodes or `neo_relationship` on relationships is called, Neoid checks if there's a corresponding node/relationship in Neo4j. If not, it does the following:
+Whenever the `neo_node` on nodes or `neo_relationship` on relationships is called, Neoid checks if there's a corresponding node/relationship in Neo4j (with the auto indexes). If not, it does the following:

### For Nodes:

-1. Ensures there's a sub reference node (read [here](http://docs.neo4j.org/chunked/stable/tutorials-java-embedded-index.html) about sub reference nodes)
+1. Ensures there's a sub reference node (read [here](http://docs.neo4j.org/chunked/stable/tutorials-java-embedded-index.html) about sub references), if that option is on.
2. Creates a node based on the ActiveRecord, with the `id` attribute and all other attributes from `neoidable`'s field list
3. Creates a relationship between the sub reference node and the newly created node
-4. Adds the ActiveRecord `id` to a node index, pointing to the Neo4j node id, for fast lookup in the future
+4. Auto indexes a node in the auto index, for fast lookup in the future


-Then, when it needs to find it again, it just seeks the node index with that ActiveRecord id for its neo node id.
+Then, when it needs to find it again, it just seeks the auto index with that ActiveRecord id.
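
The lookup-then-create flow for nodes can be sketched in plain Ruby. `ToyGraph` and its `node_for` method are hypothetical names for this illustration only; the sketch assumes an index keyed by a unique id and leaves out sub reference nodes and relationships.

```ruby
# Toy find-or-create: an index keyed by unique id stands in for the auto index.
class ToyGraph
  attr_reader :created_count

  def initialize
    @index = {}
    @created_count = 0
  end

  # Return the indexed node if present; otherwise create and index it.
  def node_for(unique_id, attributes)
    @index.fetch(unique_id) do
      @created_count += 1
      @index[unique_id] = attributes.merge("neo_id" => @created_count)
    end
  end
end

graph = ToyGraph.new
first  = graph.node_for("User:1", { "name" => "Elad" }) # creates the node
second = graph.node_for("User:1", { "name" => "Elad" }) # found via the index

puts graph.created_count
```

The second call returns the same object without creating anything, which mirrors why repeated `neo_node` calls stay cheap.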


### For Relationships:

-Like Nodes, it uses an index (relationship index) to look up a relationship by ActiveRecord id
+Like Nodes, it uses an auto index, to look up a relationship by ActiveRecord id

1. With the options passed in the `neoidable`, it fetches the `start_node` and `end_node`
2. Then, it calls `neo_node` on both, in order to create the Neo4j nodes if they're not created yet, and creates the relationship with the type from the options.
-3. Add the relationship to the relationship index.
+3. Adds the relationship to the relationship index.


## Testing

In order to test your app or this gem, you need a running Neo4j database, dedicated to tests.

-I use port 7574 for this. To run another database locally:
+I use port 7574 for testing.

To run another database locally (read [here](http://docs.neo4j.org/chunked/1.9.M03/server-installation.html#_multiple_server_instances_on_one_machine) too):

Copy the entire Neo4j database folder to a different location,

@@ -344,7 +445,7 @@ end


## Testing This Gem

-Just run `rake` from the gem folder.
+Run the Neo4j DB on port 7574, and run `rake` from the gem folder.

## Contributing

@@ -356,9 +457,9 @@ Please create a [new issue](https://github.com/elado/neoid/issues) if you run in
Unfortunately, as of now, the Neo4j add-on on Heroku doesn't support Gremlin. Therefore, this gem won't work with Heroku's add-on. You should self-host a Neo4j instance on EC2 or any other server.




-## To Do
+## TO DO

-[To Do](https://github.com/elado/neoid/blob/master/TODO.md)
+[TO DO](HTTPS://GITHUB.COM/ELADO/NEOID/BLOB/MASTER/TODO.MD)

---
2 changes: 0 additions & 2 deletions TODO.md
@@ -1,6 +1,4 @@
# Neoid - To Do

-* Allow to disable sub reference nodes through options
* Execute queries/scripts from model and not Neography (e.g. `Movie.neo_gremlin(gremlin_query)` with query that outputs IDs, returns a list of `Movie`s)
* Rake task to index all nodes and relationships in Neo4j
-* Test update node
