Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Added option for writable JSONStores (for single JSON files only). #507

Merged
merged 3 commits into from Nov 25, 2021

Conversation

davidwaroquiers
Copy link
Contributor

@davidwaroquiers davidwaroquiers commented Nov 18, 2021

Added option to allow for a writable JSONStore. The default stays the same, i.e. the JSONStore is just a MemoryStore initialized from a json or a list of json files.

When a write-like operation is performed on the JSONStore (i.e. "update" or "remove_docs"), the json file is rewritten. This is of course not optimal if you have a large json file/database. Nonetheless, this is not meant to be used for large data but for small use cases.

Note that this option is not available when more than one json file is provided.

A unit test has been added.

@codecov
Copy link

codecov bot commented Nov 23, 2021

Codecov Report

Merging #507 (38eae63) into main (630c4fb) will increase coverage by 0.03%.
The diff coverage is 94.44%.

Impacted file tree graph

@@            Coverage Diff             @@
##             main     #507      +/-   ##
==========================================
+ Coverage   88.73%   88.76%   +0.03%     
==========================================
  Files          40       40              
  Lines        2672     2689      +17     
==========================================
+ Hits         2371     2387      +16     
- Misses        301      302       +1     
Impacted Files Coverage Δ
src/maggma/stores/mongolike.py 84.77% <94.44%> (+0.58%) ⬆️
src/maggma/api/query_operator/sorting.py 100.00% <0.00%> (ø)

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 242cdda...38eae63. Read the comment docs.

@munrojm
Copy link
Member

munrojm commented Nov 23, 2021

Thanks for this @davidwaroquiers! I think your solution is fine given the intended use cases for JSONStore as you have mentioned.

One thing I would vote to have is a tiny bit more text in the doc string description for writable. It would be nice to make it clear that if you call update() with writeable = False that it will still be "writable" to some degree as it behaves like a standard MemoryStore.

@davidwaroquiers
Copy link
Contributor Author

Hello @munrojm ,

Thanks for your message. Indeed I can describe this in more details. I could also change the parameter to e.g. "write_to_file" or "file_update" or "file_writable" (or something else) that would make it more clear as well. What do you think ? Or just describing the behavior in the docstring is sufficient ?

Best,

David

@munrojm
Copy link
Member

munrojm commented Nov 24, 2021

I think both of those things would be good! Once you update, I will merge and release.

Updated docstring to make it more clear it only applies to the writing
of the file (JSONStore behaves like a MemoryStore when file_writable is
False).
Updated tests accordingly.
Added test to make sure the file is NOT updated when file_writable is
False.
@davidwaroquiers
Copy link
Contributor Author

I think both of those things would be good! Once you update, I will merge and release.

Done. I've also added one more test to make sure the file is not written when file_writable is False.

@munrojm
Copy link
Member

munrojm commented Nov 24, 2021

Great, thanks for that. Once the new test passes I will merge.

@davidwaroquiers
Copy link
Contributor Author

Great, thanks for that. Once the new test passes I will merge.

Ah... Sorry, this was working on my laptop ... so I tried again and indeed the test is sometimes passing sometimes failing... I should have known something like that could happen. Probably a thread/system issue. Not exactly sure how to deal with this. Also makes me wonder if my implementation of "update_json_file" is indeed thread-safe and if I should not lock the file while writing or something like this ...

I will investigate this in more details. If you have any ideas/comments on how to deal with that issue, do not hesitate to share it with me :)

@davidwaroquiers
Copy link
Contributor Author

Hi @munrojm
I've updated the test it now always passes on my laptop. I've taken a slightly different approach, mocking the JSONStore.update_json_file method and checking that it is never called when the JSONStore is not file-writable.
Concerning the thread issue, I am not sure it would be a problem in the intended use cases for this file-writable JSONStore. Still, if you think a locking system would be needed, do you have any idea how this could work out ?
Best,
David

@munrojm
Copy link
Member

munrojm commented Nov 25, 2021

Hi @davidwaroquiers, the test change looks good to me. Thanks for your thoughts on file locking. Given the intended use of the store, I agree with your sentiment and think the current solution is appropriate. Happy to merge now and deal with future thread safety concerns if they come up.

Thanks again for this contribution!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants