This repository has been archived by the owner on Oct 17, 2023. It is now read-only.

Joins, or a way to pull extra data from other namespaces #39

Closed
nstott opened this issue Jan 1, 2015 · 8 comments

Comments

@nstott
Contributor

nstott commented Jan 1, 2015

After fetching a document from a source, we need a way to resolve parts of the document whose data lives in other namespaces.
E.g., if we have a document from the 'posts' namespace that looks like this:

{
    title: "this is a title",
    author: ObjectId("54179ce06570544fb3892b69"),
    content: "post content"
}

then we need to be able to query another namespace on the source to turn ObjectId("54179ce06570544fb3892b69") into an appropriate object.

One way to solve this would be to add a JavaScript VM to the source and let the user run a JS function, with a JavaScript builtin or other mechanism that performs a lookup against the source.
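
A minimal sketch of that idea, assuming a hypothetical `source.lookup(namespace, id)` builtin is exposed to the user's function. Neither the name nor the signature comes from the project, the source is stubbed with an in-memory object so the example runs standalone, and the ObjectId is represented as a plain string.

```javascript
// Hypothetical in-memory stand-in for a source with several namespaces.
function makeFakeSource(namespaces) {
  return {
    // lookup() is an assumed builtin, not an existing API.
    lookup(namespace, id) {
      const collection = namespaces[namespace] || {};
      // Return null for unknown ids so the caller can decide what to do.
      return Object.prototype.hasOwnProperty.call(collection, id)
        ? collection[id]
        : null;
    },
  };
}

// Transformer: replace the author id with the author document, if found.
function resolveAuthor(doc, source) {
  const author = source.lookup("authors", doc.author);
  // Keep the original id when the lookup misses, rather than dropping data.
  return author === null ? doc : Object.assign({}, doc, { author });
}

const source = makeFakeSource({
  authors: { "54179ce06570544fb3892b69": { name: "Example Author" } },
});

const post = {
  title: "this is a title",
  author: "54179ce06570544fb3892b69",
  content: "post content",
};

console.log(resolveAuthor(post, source));
```

Returning a copy instead of mutating `doc` is a choice made for the sketch; the thread does not say which the VM would expect.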

@nstott
Contributor Author

nstott commented Jan 1, 2015

If we agree that merging the joined data into the existing document should be handled in a JavaScript VM in the source, then we face a bit of a dilemma about how to request lookups.
Consider the case of Redis and Mongo:
with Redis we might want to query with
GET <key> or HGET <key> <field>
whereas with Mongo we'd want to do a db.<collection>.findOne({_id: <id>}), or possibly a findOne(<bson query>).

We are forced to either provide a generic function that accepts a different query shape for each source type,
e.g.
Mongo

module.exports = function(doc, source) {
    doc["author"] = source.lookup({namespace: "boom.authors", query: {_id: doc.author_id}});
}

Redis

module.exports = function(doc, source) {
    doc["author"] = source.lookup({method: "HGET", key: "authors", value: doc.author_id});
}

or we provide specialized functions for each source type.
Redis

module.exports = function(doc, source) {
    doc["author"] = source.HGET("authors", doc.author_id);
}

Each of these options has drawbacks.

opinions?

@mrkurt
Contributor

mrkurt commented Jan 2, 2015

I think you can probably get a long way with a stupid simple lookup interface right now, even just k/v lookups (find by _id in Mongo, GET in Redis). Advanced queries (anything special on Redis probably counts) can come later.

And the less actual work that happens in JavaScript, the more chance there is to optimize / scale this stuff later. Letting people do arbitrary queries and then run logic against them in JS seems like it's going to create a really hard-to-optimize performance bottleneck.
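
The simple k/v interface suggested here could look roughly like the sketch below. The `get(namespace, key)` shape and both adapters are assumptions made for illustration: they are in-memory stand-ins, where a real adapter would do a find by `_id` in Mongo or a `GET` in Redis.

```javascript
// Assumed interface: every adapter exposes get(namespace, key) and nothing else.
// Both adapters below are in-memory stand-ins, not real database clients.

// Mongo-like: a namespace maps to a list of documents keyed by _id.
function mongoLikeAdapter(collections) {
  return {
    get(namespace, key) {
      const docs = collections[namespace] || [];
      const found = docs.find((d) => d._id === key);
      return found === undefined ? null : found;
    },
  };
}

// Redis-like: a flat keyspace, so the namespace becomes a key prefix.
function redisLikeAdapter(keyspace) {
  return {
    get(namespace, key) {
      const value = keyspace.get(`${namespace}:${key}`);
      return value === undefined ? null : value;
    },
  };
}

// The transformer only sees get(), so it works with either adapter.
function attachAuthor(doc, adapter) {
  return Object.assign({}, doc, { author: adapter.get("authors", doc.author_id) });
}

const mongo = mongoLikeAdapter({ authors: [{ _id: "a1", name: "Ann" }] });
const redis = redisLikeAdapter(new Map([["authors:a1", "Ann"]]));

console.log(attachAuthor({ title: "t", author_id: "a1" }, mongo));
console.log(attachAuthor({ title: "t", author_id: "a1" }, redis));
```

Because the user's function only calls `get()`, the lookup itself stays outside JavaScript, which leaves room to batch or cache lookups later without changing user code.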

@andrewreedy

👍

@tiengtinh

👍 This is also one feature that I'm looking forward to

@shividhar

+1 Definitely a sought-after feature.

@allanlundhansen

+1. Denormalizing data for Elasticsearch should be possible.

@vinodtolexo

+1 from me as well. I can also contribute to the code if required.

@jipperinbham
Contributor

dumping what the plan is here so I don't forget when I get to this soon...

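t.Source(mongodb({uri: "connection string"})).
  Join(postgres({uri: "connection string"}), {
    // id_map: document field -> column to match on the joined table
    id_map: {"account_id": "id"},
    // field_map: joined column -> field name added to the document
    field_map: {
      "name": "flegergle",
      "slug": "account_slug"
    },
    query_ref: "accounts"
  }).
  Save(elasticsearch({uri: "connection string"...}))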

NOTE: this is contingent on changes to the JavaScript DSL, which are currently in progress.

The general idea here is to have a new method Join(...) that takes two parameters: an adaptor and a configuration for performing the query.

in the above pipeline, the following scenario would take place:

original doc

{
  "_id": "somespecial_ID",
  "name": "fancypants",
  "type": "foo",
  "account_id": 1567
}

and when sent to the Join the following query would be executed:

SELECT name AS flegergle, slug AS account_slug FROM accounts WHERE id = 1567

which would then send the following document down the pipeline:

{
  "_id": "somespecial_ID",
  "name": "fancypants",
  "type": "foo",
  "account_id": 1567,
  "flegergle": "Super Duper",
  "account_slug": "super-duper"
}

The initial implementation of this will likely only support joins to a single table/collection.
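
To make the mapping concrete, here is a rough sketch of how a Join step might turn the configuration into a query and merge the result. `buildJoinQuery` and `mergeJoined` are hypothetical helper names, not part of the planned API, and the database call is replaced by a hard-coded row. Unlike the literal `id = 1567` shown above, the sketch emits a `$1` placeholder with a separate parameter list, which is an assumption about how the value would be bound.

```javascript
// Config copied from the pipeline example above.
const joinConfig = {
  id_map: { account_id: "id" },
  field_map: { name: "flegergle", slug: "account_slug" },
  query_ref: "accounts",
};

// Build the SELECT for one document (hypothetical helper).
// Returns the SQL text with $n placeholders plus the bound parameter values.
function buildJoinQuery(doc, config) {
  const columns = Object.entries(config.field_map)
    .map(([column, alias]) => `${column} AS ${alias}`)
    .join(", ");
  const conditions = [];
  const params = [];
  Object.entries(config.id_map).forEach(([docField, column], i) => {
    conditions.push(`${column} = $${i + 1}`);
    params.push(doc[docField]);
  });
  return {
    text: `SELECT ${columns} FROM ${config.query_ref} WHERE ${conditions.join(" AND ")}`,
    params,
  };
}

// Merge the joined row into a copy of the document; a missing row leaves it unchanged.
function mergeJoined(doc, row) {
  return Object.assign({}, doc, row || {});
}

const doc = { _id: "somespecial_ID", name: "fancypants", type: "foo", account_id: 1567 };
const query = buildJoinQuery(doc, joinConfig);
console.log(query.text, query.params);

// Stand-in for the row the database would return for this query.
const row = { flegergle: "Super Duper", account_slug: "super-duper" };
console.log(mergeJoined(doc, row));
```

In this sketch a joined field overwrites a document field of the same name, so the aliases in `field_map` are what keep the joined `name` column from replacing the document's own `name`.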

@jipperinbham jipperinbham added ready and removed next labels Mar 16, 2017
@jipperinbham jipperinbham removed this from the v0.3.0 milestone Mar 16, 2017
@nstott nstott closed this as completed Jun 24, 2020