Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Table.write_back(), replacing documents by ids #184

Merged
merged 12 commits into from Feb 18, 2018
Merged

Add Table.write_back(), replacing documents by ids #184

merged 12 commits into from Feb 18, 2018

Conversation

davidlatwe
Copy link
Contributor

@davidlatwe davidlatwe commented Jan 20, 2018

Compare to update(), replace() write_back() provide a more customizable workflow
to modify documents.
One can fistly search the documents and modify them by thier conditions,
then input new douments with thier doc_ids to replace() write_back().

Example Usage

from tinydb import TinyDB, Query
from tinydb.storages import MemoryStorage

db = TinyDB(storage=MemoryStorage)

db.purge()

dataset = [
    {"stock": ["cookies", "apple"]},
    {"stock": ["brownies"]},
    {"stock": ["cake"]},
    {"stock": ["cookies", "milk"]},
    {"stock": ["cake"]},
    {"stock": ["pie", "cake"]},
    {"stock": ["apple"]},
]

table = db.table('yumyum')
table.insert_multiple(dataset)
# search
yum = Query()
docs = table.search(yum.stock.any(["cake", "apple"]))
# modify docs
for doc in docs:
    for i, stock in enumerate(doc["stock"]):
        if stock == "cake":
            doc["stock"][i] = "water"
        if stock == "apple":
            doc["stock"][i] = "honey"
# write back
table.write_back(docs)

Or replace write_back with new documents

db.purge()

table = db.table('yumyum')
table.insert_multiple(dataset)

docs = table.all()
# save doc_ids
doc_ids = [doc.doc_id for doc in docs]
# modify docs
for i, doc in enumerate(docs):
    if len(doc["stock"]) == 1:
        docs[i] = {"best": doc["stock"][0]}
    elif len(doc["stock"]) > 1:
        docs[i] = {"good": doc["stock"]}

table.write_back(docs, doc_ids)

Could be a convenient feature :)

Compare to `update()`, `replace()` provide a more customizable workflow
to modify documents.
One can fistly search the documents and modify them by thier conditions,
then input new douments with thier `doc_ids` to `replace()`.
@davidlatwe
Copy link
Contributor Author

davidlatwe commented Jan 20, 2018

Seems like pytest-runner become an issue that can't be solved, it starts failing, even current master branch.
Due to pytest-runner not able to find the package setuptools_scm in Python 2.6.

reference 1
reference 2

In the last commit a2992e6 is the best workaround I can get for now.

Back to the topic, please let me know what you think about this replace feature !
Thank you.

@hugovk
Copy link
Contributor

hugovk commented Jan 21, 2018

Here's the pip installs for tinydb from PyPI for last month:

python_version percent download_count
2.7 44.5% 4,663
3.6 17.4% 1,826
3.5 15.2% 1,587
3.4 12.4% 1,299
3.2 3.2% 339
2.6 2.6% 269
3.3 2.4% 255
3.7 2.2% 235

Source: pypinfo --start-date -51 --end-date -21 --percent --pip --markdown tinydb pyversion

2.6 has been EOL for over 4 years. Can it be dropped?

@davidlatwe
Copy link
Contributor Author

Just moved this to issue #185 :)

@msiemens
Copy link
Owner

Looks good to me 🙂 If @eugene-eeo doesn't have any objections, I'll be glad to merge this!

@eugene-eeo
Copy link
Contributor

eugene-eeo commented Jan 27, 2018

Yeah, looks great! My only nitpick would be the name of the method... replace is nice but writeback would be more precise on what is being done.

@davidlatwe
Copy link
Contributor Author

replace is nice but writeback would be more precise on what is being done.

sure ! @eugene-eeo, I changed to write_back now, with an underscore.


Just found out that may cause doc_id duplicated error after this write_back (replace) action if one feeds with doc_id that is greater then _last_id, because this function write doc just like insert :

data[doc_id] = documents.pop()

So added a check in commit d0d0c8f

@davidlatwe davidlatwe changed the title Add Table.replace(), replacing doucments by ids Add Table.write_back(), replacing documents by ids Jan 28, 2018
@davidlatwe
Copy link
Contributor Author

OK, I also modified the example usage in top comment with the function name write_back, I am good to go :)

@davidlatwe
Copy link
Contributor Author

:shipit: Merged with latest master branch, setup.py cleaned.

@davidlatwe
Copy link
Contributor Author

Hi @msiemens and @eugene-eeo , can we merge this ?
Or Docs should be update as well ?
Thanks :D

@eugene-eeo
Copy link
Contributor

LGTM but it's up to @msiemens to merge. Thanks for contributing!

@msiemens
Copy link
Owner

Sorry for the silence, everyone. I'm having two exams this week and am a little short on time. All in all, this looks good to me too. But I'd bring up the question of naming again. From an API usability perspective, I'd really prefer replace() instead of write_back(). Most people immediately can recognize what replace() does but will struggle with write_back().

@eugene-eeo What do you think?

@eugene-eeo
Copy link
Contributor

eugene-eeo commented Jan 31, 2018

@msiemens yeah I see your point which is something I considered at first, but the way I saw this was that it is a "lower level" version of db.update because you're in charge of maintaining the IDs. Also the write_back name was inspired by the writeback option in shelve.open. However I don't mind if we use replace instead as it seems to be more intuitive.

@msiemens
Copy link
Owner

the way I saw this was that it is a "lower level" version of db.update because you're in charge of maintaining the IDs

That's a good point however. The question to me is whether this operation is something we would encourage everyday users or not. If we do encourage everyday use, replace is the better name. If this is meant to be a rather low level operation, we should rather go with write_back. I'm not sure what I prefer as one could argue for both cases (and we cannot change the behaviour significantly later on)…

@eugene-eeo
Copy link
Contributor

I think it could be regarded as a lower level operation now, we can always change the method name with a deprecation warning in the future.

@msiemens
Copy link
Owner

msiemens commented Feb 1, 2018

That's a good idea! The last remaining question is where to document this. In Advanced Usage? Only in the API docs?

@eugene-eeo
Copy link
Contributor

Both in the API docs and advanced usage, IMO, and in the API docs we'll put a warning saying that it can be dangerous to use the method if you don't know what you're doing.

@davidlatwe
Copy link
Contributor Author

davidlatwe commented Feb 6, 2018

If in Advanced Usage, maybe could place it right after the operations example in Updating data section, since this is like a "lower level" version of update that giving user the ability to perform custom operation wildly.

P.S. @msiemens sorry, no mean to rush 😄 , hope for the best !

@msiemens
Copy link
Owner

msiemens commented Feb 8, 2018

Just a quick note: I haven't forgotten about this PR and 'll probably revisit it at the end of next week 🙂

@msiemens msiemens merged commit 6b33fee into msiemens:master Feb 18, 2018
@msiemens
Copy link
Owner

I've finally merged this and added some documentation. I'll release a new version of TinyDB in the next week or two 🙂

@davidlatwe davidlatwe deleted the feature-Table.replace branch February 18, 2018 15:33
@davidlatwe
Copy link
Contributor Author

Thank you @msiemens !

@msiemens
Copy link
Owner

msiemens commented Mar 1, 2018

This is now released in v3.8.0 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants