doc_ids don't line up using the docs_bulk function #123
Comments
Thanks for the report. I remember there being a good reason to force zero-based document IDs, so I do that at https://github.com/ropensci/elastic/blob/master/R/docs_bulk.r#L226 when the ids are numeric. However, for user-supplied ids I guess we should not do that.
I wish I could remember what that reason was.
In every use case I currently have, the document id has significance as a relational key field that ties more than a single incoming dataset together. In the use case above, the productid is created by external systems that transactionally manage a product master. I have to provide search capacity for the underlying RDBMS because the search load on the system is overwhelming its capacity to conduct transactions. I'm sure there are use cases where simple sequential ids are a valid scheme, but for those one could use the default _id.
at 1 or at whatever the user supplies; updated docs for this fxn with more details about document ids, and noted possible change to doc_ids in the future to default to UUIDs; bumped dev version
@jhendric98 okay, reinstall the package. You should get the same doc ids you pass in to the function. See also #125.
Great. I had just forked the project myself. I'll test it out in just a few hours when I get to my computer and let you know.
Worked fine, thanks.
great!
Closing, pushing #125 soon.
Loading an Elasticsearch index from a data.frame of about 125,000 observations, using the first variable as the doc_ids argument, doesn't load the doc_ids correctly. It seems to load one row off.
docs_bulk(product, index = "model", type = "hd", doc_ids = product$productid)
product has 2 variables => productid and description
_id : 503
productid: 504
description: "4 drawer dresser, black"
My first productid is 101 and the first _id gets 100 assigned.
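The two data points above (101 stored as 100, 504 stored as 503) are consistent with every numeric id being shifted down by one. A minimal base R sketch of that symptom follows. It assumes the zero-based forcing amounts to subtracting 1 from numeric ids. `shift_to_zero_based` is a hypothetical helper written for illustration, not the package's code, and the toy rows other than productid 504 are made up.

```r
# Toy stand-in for the `product` data.frame from the report.
# Only productid 101 and the 504 row come from the report; the rest is made up.
product <- data.frame(
  productid = c(101, 102, 504),
  description = c("item a", "item b", "4 drawer dresser, black"),
  stringsAsFactors = FALSE
)

# Hypothetical helper: mimics the ASSUMED zero-based shift on numeric ids.
# Non-numeric ids are returned unchanged. This is not the package's code.
shift_to_zero_based <- function(ids) {
  if (is.numeric(ids)) ids - 1 else ids
}

stored_id <- shift_to_zero_based(product$productid)

# Put the id that would be stored next to the id that was supplied.
comparison <- data.frame(
  `_id` = stored_id,
  productid = product$productid,
  check.names = FALSE
)
print(comparison)

# TRUE when every stored id is exactly one less than the supplied id.
off_by_one <- all(comparison$productid - comparison$`_id` == 1)
print(off_by_one)
```

If this assumption holds, the mismatch is a constant offset in the id value and not a row misalignment, which would explain why each document still carries its own productid and description.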