dadoonet
Contributor

In CouchDB, you can retrieve documents with a plain GET, through the _changes API, or through views.
The CouchDB river uses the _changes API to get documents.

I would like to be able to fetch the documents that changed (getting their IDs from the _changes API) through a view, queried with the parameter key="DOCID".

As a view returns a collection of results (aka rows), each row is indexed in ES with an ID of the form DOCID_seq, where seq is the sequence number of the row.
If you get back 3 rows for a single change to the document with ID=1234, the river will index 3 documents:

  • 1234_1
  • 1234_2
  • 1234_3
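The ID scheme above can be sketched in a few lines of JavaScript. This is only an illustration, not the river's actual code: `rowIds` is a hypothetical helper, and it assumes seq is simply the 1-based position of the row in the view result.

```javascript
// Hypothetical helper (not part of the river): builds the Elasticsearch IDs
// for the rows a view returns for one CouchDB document.
// Assumption: seq is the 1-based position of the row in the view result.
function rowIds(docId, rows) {
  return rows.map(function (_row, index) {
    return docId + '_' + (index + 1);
  });
}

// Three rows for document 1234 give three index IDs.
console.log(rowIds('1234', [{}, {}, {}]));
```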

To use it, you have to define a view in CouchDB. For instance, _design/vues/_view/test_dpi with the following map function:

function(doc) {
  // Skip documents that do not carry an article list.
  if (!doc.document || !doc.document.articles) {
    return;
  }
  var listArt = doc.document.articles;

  // Emit one row per article, keyed by the document ID so that the
  // view can be queried with key="DOCID".
  for (var i = 0; i < listArt.length; i++) {
    var art = listArt[i];
    emit(doc._id, {
      'numeroArticle': art.numeroArticle,
      'codeMarchandise': art.codeMarchandise,
      'designationCommerciale': art.designationCommerciale,
      'masseBrute': art.masseBrute,
      'nombreColis': art.nombreColis
    });
  }
}

You can use it in your CouchDB river as follows:

{
  "type":"couchdb",
  "couchdb": {
    "host":"localhost",
    "port":"5984",
    "db":"dau_test",
    "view":"vues/_view/test_dpi",
    "viewIgnoreRemove":false
  }
}

New options:

  • view: if not null, the CouchDB river will not fetch content from the _changes API, only IDs, and will then use the view to retrieve rows using each ID as the key. Default: null.
  • viewIgnoreRemove: asks the river to ignore the removal of rows when there are fewer rows after a document update. Default: false, so rows that no longer exist will be removed from Elasticsearch.
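To make the view lookup concrete, here is a rough sketch of the kind of URL such a lookup could use. `viewUrl` is a hypothetical helper, and the assumption that the configured view path is appended after `_design/` is inferred from the example view name, not taken from the river's code. CouchDB expects the key parameter to be JSON-encoded, so a string ID is sent with its quotes.

```javascript
// Hypothetical helper (not part of the river): builds a CouchDB view URL
// restricted to the rows emitted for one document ID.
// Assumption: "view" is the path configured in the river, e.g.
// "vues/_view/test_dpi", and lives under the database's _design/ prefix.
function viewUrl(host, port, db, view, docId) {
  // The key must be valid JSON, so a string ID keeps its double quotes.
  var key = encodeURIComponent(JSON.stringify(docId));
  return 'http://' + host + ':' + port + '/' + db + '/_design/' + view + '?key=' + key;
}

console.log(viewUrl('localhost', '5984', 'dau_test', 'vues/_view/test_dpi', '1234'));
```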

For example, with the 3 rows described earlier, suppose you push a new version of document 1234 to CouchDB for which the view returns only 2 rows.

If viewIgnoreRemove is false (default), then

  • 1234_1 will be updated
  • 1234_2 will be updated
  • 1234_3 will be removed

If viewIgnoreRemove is true, then

  • 1234_1 will be updated
  • 1234_2 will be updated
  • 1234_3 will be left as it is (neither updated nor removed)
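The two cases can be summarised with a small sketch. `planRowActions` is a hypothetical function written for this explanation only; in particular, how the river knows the previous number of rows is not described here, so `previousCount` is simply passed in as an assumption.

```javascript
// Hypothetical sketch (not the river's code): decides which row IDs to
// update and which to remove after a document update.
// Assumption: the previous row count is known to the caller.
function planRowActions(docId, previousCount, newCount, viewIgnoreRemove) {
  var actions = { update: [], remove: [] };
  var seq;

  // Every row returned by the view is (re)indexed.
  for (seq = 1; seq <= newCount; seq++) {
    actions.update.push(docId + '_' + seq);
  }

  // Rows that no longer exist are removed unless viewIgnoreRemove is set.
  if (!viewIgnoreRemove) {
    for (seq = newCount + 1; seq <= previousCount; seq++) {
      actions.remove.push(docId + '_' + seq);
    }
  }
  return actions;
}

console.log(planRowActions('1234', 3, 2, false));
console.log(planRowActions('1234', 3, 2, true));
```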

I hope I wrote it the right way. Any comments are welcome...

BTW, I will push an update once the ids_prefix filter is available, to make the code more efficient (see issue #1259).

Thanks

@dadoonet
Contributor Author

Ooouch. I'm not a Git expert, so I added commit 270d7e0 to this pull request instead of opening a new pull request for the attachment bypass option...

So, what can I do now? Is there any way to remove the last commit from this pull request?
Or should I complete my pull request by giving some details about the new option?

Thanks (and sorry ;-) )

@kimchy
Member

kimchy commented Aug 24, 2011

I think we should have two different pull requests for those. Lemme first also work on the ids prefix filter so we can have the better solution baked right in. Not sure I fully followed what it does though :)

@dadoonet
Contributor Author

OK. Pull request #1283 created for the new ignoreAttachements option. I will try to update this one (or will close it and open another one if I don't succeed).

kimchy and others added 27 commits September 9, 2011 13:09
… structure (resulting in wrong search responses), closes elastic#1323.
kimchy and others added 19 commits September 27, 2011 00:41
…e the list fo nodes to ping as well as the provided nodes, closes elastic#1217.
…e clean things up before we delete content if needed
@kimchy kimchy merged commit d69baa3 into elastic:master Oct 5, 2011
@jdzurik

jdzurik commented Oct 6, 2011

Do you think the functionality of pulling from a view will be added to the couchdb river?

@dadoonet
Contributor Author

dadoonet commented Oct 6, 2011

Hi there,

Not sure what happened with my pull request: fc0e03c

I think I did something stupid yesterday with git and my elasticsearch fork...
I need more training with git! Shame on me!

Do I have to create a new pull request for this issue?

@jdzurik

jdzurik commented Nov 28, 2011

I was looking through the release notes for 18 and I don't see the ability to create a couchdb river for a view. Is this something that's not getting implemented, or is it just being worked on from other angles?

@dadoonet
Contributor Author

No. It's not there.
I made a mess with GitHub (and raised an issue with GitHub support), so the pull request appears to be closed, but in fact the code was not merged.
I'm not sure I can find my code again :-( I will try to make a new pull request for it, although I'm waiting for issue #1259 to be solved.
Cheers
David.

@dadoonet
Contributor Author

@jdzurik : I worked on it again. You can give it a try and let me know.
Source code is here : https://github.com/dadoonet/elasticsearch/tree/couchdbriver_views
Documentation is updated here : https://github.com/dadoonet/elasticsearch.github.com/tree/couchdbriver_views
See dadoonet/elasticsearch.github.com@a7fac29

Please let me know if it meets your needs. If so, I will send a pull request for it.

David.

@benjelloun23

Hello David,

I installed ElasticSearch and it works well: I can index and search XML and JSON content using Dev HTTP Client.
I need your help to index binary files in Elasticsearch and then search them by content.
I added mapper-attachments to Elasticsearch, but I don't know how to specify the folder of PDF or DOCX files to index. Something like base64? I don't know.
Thanks for helping me.

sincerely,

@dadoonet
Contributor Author

I think you misunderstood my answer to your private email.

If you need to ask a public question, please use the mailing list. You can find more details on how to do that here: http://www.elasticsearch.org/help/

But please, don't hijack issues or pull requests.
Thanks

@benjelloun23

OK, I'm sorry for misunderstanding you, and thanks for your help. You are a good man (and a professional).
