Question regarding update vs insert #289

Open
mkhorton opened this issue Sep 22, 2020 · 2 comments

Comments

@mkhorton
Member

The MongoStore does not provide an insert option, only an update option. The update does a bulk_write, but as a list of ReplaceOne operations because of how maggma handles keys, which presumably(?) is a lot slower than a plain list of insert ops.

I'm building a large number of documents, but I've dropped down to using store.collection.insert_many() since the update seems prohibitively slow.

Is the lack of insert intentional? Has this been problematic for others?
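For reference, the two write paths being compared can be sketched roughly as below. This is only an illustration written against plain pymongo, not maggma's actual implementation; the helper names (`replace_specs`, `bulk_upsert`, `plain_insert`) and the `task_id` key used in the usage line are hypothetical.

```python
from typing import Any, Dict, Iterable, List, Tuple


def replace_specs(
    docs: Iterable[Dict[str, Any]], key: str
) -> List[Tuple[Dict[str, Any], Dict[str, Any]]]:
    """Pair each document with the filter that identifies it by `key`."""
    return [({key: doc[key]}, doc) for doc in docs]


def bulk_upsert(collection, docs, key: str):
    """Update path: one ReplaceOne per document, matched on `key`."""
    # Imported lazily so replace_specs can be used without the driver installed.
    from pymongo import ReplaceOne

    ops = [ReplaceOne(flt, doc, upsert=True) for flt, doc in replace_specs(docs, key)]
    # ordered=False: the server may apply the operations in any order and
    # keeps going after an individual failure.
    return collection.bulk_write(ops, ordered=False)


def plain_insert(collection, docs):
    """Insert path: no lookup on the key, so existing documents are never replaced."""
    return collection.insert_many(list(docs), ordered=False)
```

Something like `bulk_upsert(store.collection, docs, key="task_id")` would then stand in for the update, with the caveat that every ReplaceOne has to look up its document by the key first, which the insert path skips.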

@shyamd
Contributor

shyamd commented Sep 23, 2020

I've found bulk unordered ReplaceOne with upsert=True is almost as performant as insert_many. The only time this is an issue is when there isn't an index on the key field, which can happen, and then it's much slower. I've been hesitant to enforce indexes in the stores themselves, but I'm considering making it a default to call ensure_index on the key and last_updated fields in the connect method.
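A minimal sketch of what such a default could look like, written against a plain pymongo collection rather than maggma's own code. The function name `ensure_default_indexes` and the default field names are assumptions for illustration only.

```python
from typing import List


def ensure_default_indexes(
    collection,
    key: str = "task_id",  # hypothetical key field name
    last_updated_field: str = "last_updated",
) -> List[str]:
    """Create single-field indexes on the key and last-updated fields."""
    # create_index returns the index name and does nothing further when an
    # identical index already exists, so calling this on every connect is safe.
    return [
        collection.create_index(key),
        collection.create_index(last_updated_field),
    ]
```

Called once after connecting (for example `ensure_default_indexes(store.collection, key=store.key)`), this would let each ReplaceOne filter use an index lookup instead of a collection scan. Whether the key index should also be unique is left open here.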

@mkhorton
Member Author

> I've been hesitant to enforce indexes in the stores themselves, but I'm considering making it a default to call ensure_index on the key and last_updated fields in the connect method.

Ah, I thought this might be the case. I was building to a new collection (i.e. without indices) and had assumed that indices on the key would be generated automatically.

Given that the performance of maggma depends critically on indices being available for both the key field and the last_updated field, I would definitely be on board with ensuring indexes by default.
