Implement an Elasticsearch river plugin for RethinkDB #1009

Closed
eshao opened this issue Jun 14, 2013 · 15 comments

@eshao

eshao commented Jun 14, 2013

A feature request.

Similar to https://github.com/richardwilly98/elasticsearch-river-mongodb/ or http://www.elasticsearch.org/guide/reference/river/couchdb/.

@coffeemug
Contributor

Hi @eshao, thanks for the suggestion -- I think it's a great idea. I'll do a little bit of research and see how soon we can get this done. This is a fairly low priority at the moment, but if you really need the plugin, please comment here (or shoot me an e-mail at slava@rethinkdb.com) about the use case and we'll see if we can get it in sooner.

@eshao
Author

eshao commented Jun 17, 2013

Ah, basically we're using Mongo + Elasticsearch right now, which I think is a pretty common configuration. (The majority of reads, i.e. searches, go through Elasticsearch. Mongo is great as the persistence layer.) We can't move off of it, or even consider trying RethinkDB, until RethinkDB supports Elasticsearch as well.

@coffeemug
Contributor

Thanks -- I'll try to bump this up. I have to finish 1.7 release planning, and then I'll do some research on this and see if we can get it in sooner.

@fuwaneko

We use ElasticSearch in our project. I'll see what I can do.

@fuwaneko

All right, I did a bit of research, and here's what I found out.

  1. To implement the “pull” approach to indexing content with ElasticSearch, the source must provide some way to monitor changes, such as the CouchDB _changes feed or the operation log in MongoDB.
  2. RethinkDB does not have such monitoring yet. I think there are some big plans for embedding Node.js and adding triggers (Proposal: triggers #997), but that will happen who knows when.

What that means is that, for now, it's better to stick with the “push” approach, where you tell ElasticSearch directly what to index. I know that requires your software to do the work instead of a river plugin that transparently pulls and indexes things. But that's better than a river plugin that fetches data periodically, which defeats the purpose of having a river plugin at all.
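
To make the “push” approach concrete, here is a minimal sketch, not a definitive implementation. It assumes the application owns both writes: it saves the document to its database and then hands Elasticsearch a bulk-API body (newline-delimited JSON: an action line followed by the document). The names `bulk_index_payload`, `save_and_index`, `save`, and `send_bulk` are hypothetical; `save` and `send_bulk` stand in for whatever database client and HTTP client you actually use. Older Elasticsearch versions also expect a `_type` in the action line, which is omitted here.

```python
import json
from typing import Callable, Iterable


def bulk_index_payload(index: str, docs: Iterable[dict]) -> str:
    """Build an Elasticsearch bulk-API body (newline-delimited JSON)."""
    lines = []
    for doc in docs:
        # Action line: which index and document id this entry targets.
        lines.append(json.dumps({"index": {"_index": index, "_id": doc["id"]}}))
        # Source line: the document itself.
        lines.append(json.dumps(doc))
    # The bulk API requires the body to end with a newline.
    return "\n".join(lines) + "\n"


def save_and_index(
    doc: dict,
    save: Callable[[dict], None],       # hypothetical: writes to the database
    send_bulk: Callable[[str], None],   # hypothetical: POSTs the body to /_bulk
    index: str,
) -> None:
    """Push approach: persist first, then tell the search engine what to index."""
    save(doc)
    send_bulk(bulk_index_payload(index, [doc]))
```

The trade-off is the one described above: every code path that writes to the database has to remember to call the indexing step, whereas a river would pick up changes on its own.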

@coffeemug
Contributor

I'll see what we can do to speed up #997. Hopefully it will become available faster than you think :)

@fuwaneko

@coffeemug haha, I hope that it will not make RDB bloatware with all that node.js stuff :)

@coffeemug
Contributor

@fuwaneko -- have no fear, I dare say we tend to have good taste when it comes to things like these :)

@coffeemug
Contributor

Related to #997.

@deontologician deontologician self-assigned this Aug 26, 2014
@deontologician
Contributor

Working on this now

@marshall007
Contributor

@deontologician once this is fully implemented, will it only support one-to-one mapping with the tables or will it be possible to do joins/arbitrary transformations?

@deontologician
Contributor

It just does one-to-one mapping, though you can specify that multiple RethinkDB tables go to the same index if you want. What are you looking for in terms of transformations?
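
As an illustration only, routing several tables into one index could be sketched like this. It is not the river's actual configuration format: `INDEX_OVERRIDES`, `target_index`, and `route` are hypothetical names, and falling back to the database name as the index name is an assumption made for the sketch.

```python
# Hypothetical routing table: (database, table) -> Elasticsearch index name.
INDEX_OVERRIDES = {
    ("shop", "products"): "catalog",
    ("shop", "categories"): "catalog",
}


def target_index(database, table, overrides=INDEX_OVERRIDES):
    """Return the index a table's documents go to.

    Assumption for this sketch: without an override, the database name is used.
    """
    return overrides.get((database, table), database)


def route(database, table, doc):
    """One-to-one mapping: each source document becomes exactly one indexed document."""
    return {
        "_index": target_index(database, table),
        "_id": doc["id"],
        "_source": doc,
    }
```

Note that when two tables share an index, their document ids share a namespace too, so ids would need to be unique across both tables.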

@marshall007
Contributor

Aggregations on child documents are really what I'm looking for, but even just support for a basic map would go a long way.

// categories
{
  "id": 1,
  "name": "Category 1"
}

// products
[{
  "category_id": 1,
  "list_price": 5.00
},
{
  "category_id": 1,
  "list_price": 15.00
}]

// indexed category
{
  "id": 1,
  "name": "Category 1",
  "list_price": { "min": 5.00, "max": 15.00 }
}

So basically we're watching for changes on products and reindexing their parent category.
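
A rough sketch of that transformation, done on the application side rather than inside a river: `build_indexed_categories` and `categories_to_reindex` are hypothetical helper names, and the document shapes are the ones from the example above.

```python
from collections import defaultdict


def categories_to_reindex(changed_products):
    """Given changed product documents, return the parent category ids to reindex."""
    return {product["category_id"] for product in changed_products}


def build_indexed_categories(categories, products):
    """Attach a min/max list_price aggregate to each category document."""
    prices = defaultdict(list)
    for product in products:
        prices[product["category_id"]].append(product["list_price"])

    indexed = []
    for category in categories:
        doc = dict(category)  # copy so the source document is not mutated
        values = prices.get(category["id"])
        if values:  # categories without products get no list_price field
            doc["list_price"] = {"min": min(values), "max": max(values)}
        indexed.append(doc)
    return indexed
```

With the sample data above, the single category comes out with `"list_price": {"min": 5.0, "max": 15.0}`, matching the indexed category shown.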

@deontologician
Contributor

This might be doable with scripts in ES. Some other rivers support scripts, so I'll see how hard that would be to do.

@deontologician
Contributor

OK, I looked into scripts and it seems a bit involved for the first release. I've created rethinkdb/elasticsearch-river-rethinkdb#2 so we don't forget about this feature.

Closing out this issue as the river is implemented and has documentation (https://github.com/rethinkdb/elasticsearch-river-rethinkdb). There's an upcoming article that will discuss it a bit more in depth.

@deontologician deontologician modified the milestones: ecosystem, backlog Oct 6, 2014
@AtnNn AtnNn modified the milestones: ecosystem, 2.0 Mar 17, 2015

6 participants