Sync multiple namespaces at the same time #23

nstott · 2014-12-22T16:21:17Z

It should be possible to sync more than one namespace with the pipeline.
I can think of a few ways this can work, but in general I favour the idea of allowing regex / wildcard matches on a namespace, i.e. something like Source({name: "mongo", namespace: "database.*"})
This will cause problems on the sink, as it expects a constant namespace. Transformers will also need to be aware of the message's namespace.

codepope · 2014-12-22T16:25:05Z

Could we make it Source({name:"mongo", namespace:["database.cheese","database.bacon",...]}) ? They can then be passed with a constant namespace into the pipeline.

nstott · 2014-12-22T16:26:14Z

For transformers to be aware of the namespace, it might make sense to have the nodes pass message.Msg and store the namespace on the message, rather than have the nodes pass bare documents.

codepope · 2014-12-22T16:31:06Z

What happens to dropped messages?
Do we want people playing with the command type?
Feels like lots of power but could turn into a mis-held chainsaw unless we surround with sanity checks.

Maybe pass a metadata object of reference information?

mrkurt · 2014-12-22T17:59:53Z

This seems like it's not really possible until there's a higher level msg to operate on, and that concept of mapping data in the save method.

shividhar · 2015-07-02T04:48:43Z

This would be an amazing feature to have.

jipperinbham · 2015-07-13T15:29:58Z

With this change we can now start to address this issue and I think a regex is the best option here.

The initial change will be to add a Namespace field to message.Msg and update func NewMsg(op OpType, data interface{}) *Msg to also accept namespace string and this will be sent into transformer functions as well with the field name ns.
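
A minimal sketch of that first change, assuming a much simplified Msg. The OpType values, the field set, and the transformerDoc helper are illustrative assumptions and are not taken from the transporter source; only the extra namespace argument and the ns field name come from the proposal above.

```go
package main

import "fmt"

// OpType is a stand-in for the operation type carried by a message.
type OpType int

const (
	Insert OpType = iota
	Update
	Delete
)

// Msg is a simplified, hypothetical version of message.Msg.
type Msg struct {
	Op        OpType
	Data      interface{}
	Namespace string // e.g. "database.collection"
}

// NewMsg follows the proposed signature: the namespace is supplied
// when the message is created.
func NewMsg(op OpType, data interface{}, namespace string) *Msg {
	return &Msg{Op: op, Data: data, Namespace: namespace}
}

// transformerDoc is a hypothetical helper that builds the document a
// transformer function would receive, exposing the namespace as "ns".
func (m *Msg) transformerDoc() map[string]interface{} {
	return map[string]interface{}{"ns": m.Namespace, "data": m.Data}
}

func main() {
	m := NewMsg(Insert, map[string]interface{}{"name": "brie"}, "database.cheese")
	fmt.Println(m.Namespace, m.transformerDoc()["ns"])
}
```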

My initial thought is to add a string namespace parameter to the Listen function in Pipe such that it would look like so:

func (m *Pipe) Listen(namespace string, fn func(*message.Msg) (*message.Msg, error)) error

Then we can apply a matching pattern inside the Listen function before calling the fn passed in.

Thoughts?
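
To make the proposal concrete, here is a rough sketch of a Listen that filters on the namespace before calling fn. Pipe and Msg are toy stand-ins rather than the real transporter types, and treating the namespace argument as an anchored regular expression is an assumption; the thread had not settled on a pattern syntax at this point.

```go
package main

import (
	"fmt"
	"regexp"
)

// Msg is a toy message that only carries what the filter needs.
type Msg struct {
	Namespace string
	Data      interface{}
}

// Pipe is a toy stand-in with only an input channel.
type Pipe struct {
	In chan *Msg
}

// Listen reads from p.In until the channel is closed and calls fn only
// for messages whose namespace matches the pattern as a whole.
func (p *Pipe) Listen(namespace string, fn func(*Msg) (*Msg, error)) error {
	re, err := regexp.Compile("^(?:" + namespace + ")$")
	if err != nil {
		return fmt.Errorf("invalid namespace pattern %q: %v", namespace, err)
	}
	for msg := range p.In {
		if !re.MatchString(msg.Namespace) {
			continue // not addressed to this listener
		}
		if _, err := fn(msg); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	p := &Pipe{In: make(chan *Msg, 3)}
	p.In <- &Msg{Namespace: "database.cheese"}
	p.In <- &Msg{Namespace: "other.cheese"}
	p.In <- &Msg{Namespace: "database.bacon"}
	close(p.In)

	err := p.Listen(`database\..*`, func(m *Msg) (*Msg, error) {
		fmt.Println("handling", m.Namespace)
		return m, nil
	})
	if err != nil {
		fmt.Println("listen failed:", err)
	}
}
```

With this shape the function passed in never sees messages for other namespaces, so the matching logic lives in one place instead of in every adaptor.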

nstott · 2015-07-13T15:37:09Z

I like it all, apart from changing the signature of the Listen func, the adaptor should already have the namespace, see the mongo adaptor https://github.com/compose/transporter/blob/master/pkg/adaptor/mongodb.go#L453

the way we're doing this should work across all adaptors, but using mongo as a specific example
the tail func will need to understand the wild card, we query on it here
https://github.com/compose/transporter/blob/master/pkg/adaptor/mongodb.go#L341

and the cat func will need to iterate over the affected collections.
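
As an illustration of the cat side, the sketch below walks a list of collection names and reads only those whose full namespace matches a wildcard. catMatching and its read callback are hypothetical names, and using path.Match for glob-style wildcards is an assumption made for the example, not what the mongo adaptor does.

```go
package main

import (
	"fmt"
	"path"
)

// catMatching calls read once for every collection in db whose full
// namespace ("db.collection") matches the glob pattern.
func catMatching(pattern, db string, collections []string, read func(ns string) error) error {
	for _, c := range collections {
		ns := db + "." + c
		ok, err := path.Match(pattern, ns)
		if err != nil {
			return fmt.Errorf("bad namespace pattern %q: %v", pattern, err)
		}
		if !ok {
			continue // collection is not covered by the pattern
		}
		if err := read(ns); err != nil {
			return err
		}
	}
	return nil
}

func main() {
	collections := []string{"cheese", "bacon", "crackers"}
	err := catMatching("database.c*", "database", collections, func(ns string) error {
		fmt.Println("cat", ns)
		return nil
	})
	if err != nil {
		fmt.Println("cat failed:", err)
	}
}
```

A real adaptor would get the collection list from the database and would probably want to skip system collections as well.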

codepope · 2015-07-13T15:45:17Z

If the multiple collections are defined only as a regexp then it could lead to gnarly namespace sources... eg I want all foo's collections and from bar/stow... [foo/.*|bar.stow] which looks less configgy and more codey. Would it be simpler to expose as a comma-sep list? (with the option to add say a /regexp/ format later)

jipperinbham · 2015-07-13T15:53:42Z

A comma separated list seems very inefficient and inflexible because if the purpose is to be able to sync multiple namespaces easily, any additions of a collection would not get picked up until you change the transporter config which IMO is not the desired result.

codepope · 2015-07-13T16:03:59Z

How do you propose to change the regexp without changing the config or the pipeline defining js? As I read this, you specify the multiple collections on the Source node either in line or in config, incoming messages which match the specification get tagged with the canonical namespace they came from, this then can be altered by transformers where required, and passed to a destination adaptor for writing using the namespace to control where. Am I missing anything?

jipperinbham · 2015-07-14T01:32:02Z

@nstott I agree any adaptor acting as a source will need to know and act on the namespace it's configured for but it doesn't change the need for changing the Listen func. Here's a scenario, Source({name: "mongo", namespace: "database.*"}) so it will first cat every collection and then process all tail ops for any collection in database. If you have 2 save calls,

pipeline.save({name:"localmongo", namespace: "database.bas"})
pipeline.save({name:"localmongo", namespace: "database.baz"})

then when a message is received on the In channel in Pipe we could perform a check before calling the fn provided to Listen, effectively adding the logic between https://github.com/compose/transporter/blob/master/pkg/pipe/pipe.go#L86 and https://github.com/compose/transporter/blob/master/pkg/pipe/pipe.go#L88. Doing so would allow "Sink" adaptors to not contain any logic pertaining to whether they should process the message or not.

jipperinbham · 2015-07-14T01:36:26Z

@codepope I'll do my best to describe the specific use case I believe we should support.

A Source is set up as follows, Source({name: "mongo", namespace: "alphadbet.*", tail: true}), and a save as .save({name:"localmongo", namespace: "alphadbet.*"}). When transporter starts up, the source has the following collections:

collA
collB

The user adds another collection, collC, to alphadbet. This new collection should be automatically picked up by the Source while performing the tail operations, and collC will be synced without any changes to, or stopping of, transporter.

codepope · 2015-07-14T06:29:22Z

You've given a great example there... Regexps are always surprising when not explicit, and there you've got a regexp which would also match alphadbeta.collA and alphadbetonetwothree.collC. The regexp should of course be "alphadbet\..*".
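
The surprise is easy to reproduce. The sketch below compares the pattern from the example with a variant that escapes the dot between database and collection; anchoring both patterns with ^ and $ is an assumption about how a namespace would be matched.

```go
package main

import (
	"fmt"
	"regexp"
)

var (
	// loose is the pattern as written in the example: "." matches any character.
	loose = regexp.MustCompile(`^alphadbet.*$`)
	// strict escapes the dot so it only matches the database/collection separator.
	strict = regexp.MustCompile(`^alphadbet\..*$`)
)

func main() {
	namespaces := []string{
		"alphadbet.collA",
		"alphadbeta.collA",
		"alphadbetonetwothree.collC",
	}
	for _, ns := range namespaces {
		fmt.Printf("%s loose=%v strict=%v\n", ns, loose.MatchString(ns), strict.MatchString(ns))
	}
}
```

The loose pattern accepts all three namespaces, while the strict one accepts only alphadbet.collA.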

My suggestion was that the namespace is a comma list with regexps denoted by /..../, so your example would simply be namespace:"/alphadbet\..*/", while "alphadbet.collA,alphadbet.collB" is also valid, as would be "alphadbet.collA,/alphadbet.coll[B-Z]/,alphadbet.otherColl".

That way, the principle of least surprise works (plain text matching by default) while the powerful option (regexp matching) is explicitly available. It also dodges the potentially breaking change* where every currently defined namespace becomes ambiguously interpretable.

* dependent on adaptor implementation

As I wrote this I was also reminded that backslashes would need backslashing too...
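
A sketch of how such a comma list could be parsed, with plain entries compared literally and /.../ entries compiled as regexps. parseNamespaces and selected are hypothetical names, and the naive split on commas is a known limitation: a regexp that itself contains a comma, such as /coll[A-Z]{1,2}/, would be cut in two.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// matcher reports whether a namespace is selected by one list entry.
type matcher func(ns string) bool

// parseNamespaces turns a comma separated spec into matchers. Entries
// wrapped in slashes are anchored regexps; all others match literally.
func parseNamespaces(spec string) ([]matcher, error) {
	var ms []matcher
	for _, item := range strings.Split(spec, ",") {
		item = strings.TrimSpace(item)
		if item == "" {
			continue
		}
		if len(item) >= 2 && strings.HasPrefix(item, "/") && strings.HasSuffix(item, "/") {
			re, err := regexp.Compile("^(?:" + item[1:len(item)-1] + ")$")
			if err != nil {
				return nil, fmt.Errorf("invalid regexp entry %q: %v", item, err)
			}
			ms = append(ms, re.MatchString)
			continue
		}
		literal := item // copy so the closure does not share the loop variable
		ms = append(ms, func(ns string) bool { return ns == literal })
	}
	return ms, nil
}

// selected reports whether any matcher accepts the namespace.
func selected(ms []matcher, ns string) bool {
	for _, m := range ms {
		if m(ns) {
			return true
		}
	}
	return false
}

func main() {
	ms, err := parseNamespaces(`alphadbet.collA,/alphadbet\.coll[B-Z]/,alphadbet.otherColl`)
	if err != nil {
		fmt.Println("bad spec:", err)
		return
	}
	for _, ns := range []string{"alphadbet.collA", "alphadbet.collC", "alphadbetXcollA"} {
		fmt.Println(ns, selected(ms, ns))
	}
}
```

Because plain entries are compared with ==, a dot in alphadbet.collA is just a dot, which is the least-surprise behaviour described above.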

jipperinbham · 2015-07-14T14:46:38Z

For now, we're going to implement a single string with some restrictions, in that the regex portion only applies to the second half of the namespace. This limits an adaptor to only having to work with a single "database".

It's very likely we will need to expand on this in the future but to limit the scope of the initial implementation I'd like to just go with the single string.
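
One way to read that restriction, sketched under assumptions: split the namespace at the first dot, keep the database name literal, and compile the remainder as an anchored regexp over collection names. splitNamespace is a hypothetical helper and the exact syntax here is a guess; the implementation merged via #101 may differ.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// splitNamespace splits "database.collectionPattern" at the first dot.
// The database part is returned as-is; the rest is compiled as a regexp
// that must match a whole collection name.
func splitNamespace(namespace string) (string, *regexp.Regexp, error) {
	parts := strings.SplitN(namespace, ".", 2)
	if len(parts) != 2 || parts[0] == "" || parts[1] == "" {
		return "", nil, fmt.Errorf("malformed namespace %q, expected database.collection", namespace)
	}
	re, err := regexp.Compile("^(?:" + parts[1] + ")$")
	if err != nil {
		return "", nil, fmt.Errorf("invalid collection pattern %q: %v", parts[1], err)
	}
	return parts[0], re, nil
}

func main() {
	db, re, err := splitNamespace("alphadbet.coll.*")
	if err != nil {
		fmt.Println("bad namespace:", err)
		return
	}
	for _, c := range []string{"collA", "collB", "other"} {
		fmt.Println(db, c, re.MatchString(c))
	}
}
```

Under this reading "alphadbet.*" is rejected, because the collection part would be a bare "*", which is not a valid regexp; "every collection" would be spelled "alphadbet..*", where the first dot is the separator and ".*" is the pattern.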

jipperinbham · 2015-07-27T01:45:34Z

close via #101

nstott added the enhancement label Dec 22, 2014

jipperinbham added this to the v0.1.0 milestone Jul 14, 2015

jipperinbham self-assigned this Jul 14, 2015

jipperinbham added the status/in progress label Jul 14, 2015

jipperinbham mentioned this issue Jul 22, 2015

(Phase 1) Multi namespace support #101

Merged

jipperinbham removed the status/in progress label Jul 27, 2015

jipperinbham closed this as completed Jul 27, 2015
