This repository has been archived by the owner on Oct 17, 2023. It is now read-only.

Namespaces #78

Closed
Garito opened this issue Jun 18, 2015 · 17 comments

@Garito

Garito commented Jun 18, 2015

Hi!
I'm trying to use this library to sync MongoDB with Elasticsearch, but I get a malformed mongo namespace error.
This is my config.yaml:

nodes:
  mongodb:
    type: mongo
    uri: http://mongodb:27017
  es:
    type: elasticsearch
    uri: http://es:9200

and this is my application.js

pipeline = Source({name:"mongodb", namespace:"tf"}).save({name:"es", namespace:"tf"}); 

Clearly I don't understand namespaces (perhaps the lack of documentation plays a part in this).
What are namespaces?

How can I configure transporter to replicate a whole database?

Thanks!

@codepope
Contributor

@Garito
Author

Garito commented Jun 18, 2015

What about replicating all collections from a database?
Can I use tf.*?

@Garito
Author

Garito commented Jun 18, 2015

With the namespace set to tf.*, it exits 0 without replicating anything.
What am I missing? Isn't transporter supposed to run forever, watching mongo's changes?

@codepope
Contributor

The Transporter has been designed to synchronise data collections rather than entire databases. The Transporter currently uses singular collections as a source from MongoDB. You cannot give a wildcard namespace as the Transporter currently has no way of mapping the source collection to a destination namespace in another type of database.

There are Transporter based commands like "Seed" (see https://github.com/compose/transporter-examples/tree/master/go/cmd ) which query the available collections and copy them across.

If you are copying a single collection to an ES index, then setting tail:true in the mongodb part of the config.yaml will activate oplog tailing, provided you have configured your local MongoDB to act as a single-node replica set.
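As a sketch of that suggestion, the config.yaml from the first comment could be adjusted as follows. The tail flag placement follows the comment above; switching the URI to the mongodb:// scheme is an assumption, based on the working config later in this thread.

```yaml
nodes:
  mongodb:
    type: mongo
    uri: mongodb://mongodb:27017/   # mongodb:// scheme, as in the later working config
    tail: true                      # activate oplog tailing (needs a single-node replica set)
  es:
    type: elasticsearch
    uri: http://es:9200
```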

@Garito
Author

Garito commented Jun 18, 2015

So I need to create a pipeline for every collection I'm going to have, even if I don't know them in advance?

@codepope
Contributor

For something like copying data to Elasticsearch, most users tend to only want to copy particular collections into Elasticsearch after transforming the content to something more searchable. We have more extensive mapping on the roadmap for Transporter.

@Garito
Author

Garito commented Jun 18, 2015

So if I need to replicate all the collections to an index, is it possible to put them all in the same index but with different _type?

@Garito
Author

Garito commented Jun 18, 2015

Hi!
This is how I changed application.js:

var pipeline = Source({name:"mongodb", namespace:"tf.user"}).save({name:"es", namespace:"tf.user"});
var pipeline = Source({name:"mongodb", namespace:"tf.section"}).save({name:"es", namespace:"tf.section"});
var pipeline = Source({name:"mongodb", namespace:"tf.investment"}).save({name:"es", namespace:"tf.investment"});

Now the index is created but no documents are replicated.
Could you point out my error here?

@Garito
Author

Garito commented Jun 18, 2015

I tried keeping only the 1st line, to check whether it works that way, but it doesn't create any documents either.

@Garito
Author

Garito commented Jun 18, 2015

This line showed up in the logs:
transporter: CRITICAL: elasticsearch error (Bulk Insertion Error. Failed item count [1])

@Garito
Author

Garito commented Jun 18, 2015

I noticed I can wget the Elasticsearch server from the transporter container, so there are no connection issues here.

@nstott
Contributor

nstott commented Jun 18, 2015

@Garito, I would really like to get some sort of wildcard matching happening on namespaces, so a glob like "tf.*" would work, or perhaps a regex. That's not in place yet, though.

The bulk insertion error means that Elasticsearch rejected one of the documents it was trying to insert. The Elasticsearch logs might have more information.

@Garito
Author

Garito commented Jun 18, 2015

Same issue as here: #73

@lvidarte

Same here

org.elasticsearch.index.mapper.MapperParsingException: failed to parse [_id]

Caused by: org.elasticsearch.index.mapper.MapperParsingException: Provided id [AU4MdBUTcmqtbhPtTfE1] does not match the content one [5581d96504e7bd28308b46c2]

I'm not using a custom _id

My config

nodes:
  mongo:
    type: mongo
    uri: mongodb://localhost:27017/
    namespace: ecommerce.Items
  es:
    type: elasticsearch
    uri: http://localhost:9200/
    namespace: ecommerce.items

My app

Source({name:"mongo"}).save({name:"es"});

My env

ubuntu 14.04
elasticsearch 1.5.2
mongodb 2.4.9
transporter 0.0.3

@nstott
Contributor

nstott commented Jul 3, 2015

@lvidarte are you using a transformer?
Elasticsearch is rejecting that document because of a mapping issue.
If you're using a transformer, see: https://www.compose.io/articles/transporter-and-elasticsearch-mapping/
Otherwise it might be a problem with the current id methodology.

@lvidarte

@nstott I added the following transform.js file and now it works, thanks.

module.exports = function(doc) {
  doc._id = doc._id['$oid'];
  return doc;
}

Ref #81

@jipperinbham
Contributor

Closed via #101
