New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add Table.write_back(), replacing documents by ids #184

Merged
merged 12 commits into from Feb 18, 2018

Conversation

4 participants
@davidlatwe
Contributor

davidlatwe commented Jan 20, 2018

Compare to update(), replace() write_back() provide a more customizable workflow
to modify documents.
One can fistly search the documents and modify them by thier conditions,
then input new douments with thier doc_ids to replace() write_back().

Example Usage

from tinydb import TinyDB, Query
from tinydb.storages import MemoryStorage

db = TinyDB(storage=MemoryStorage)

db.purge()

dataset = [
    {"stock": ["cookies", "apple"]},
    {"stock": ["brownies"]},
    {"stock": ["cake"]},
    {"stock": ["cookies", "milk"]},
    {"stock": ["cake"]},
    {"stock": ["pie", "cake"]},
    {"stock": ["apple"]},
]

table = db.table('yumyum')
table.insert_multiple(dataset)
# search
yum = Query()
docs = table.search(yum.stock.any(["cake", "apple"]))
# modify docs
for doc in docs:
    for i, stock in enumerate(doc["stock"]):
        if stock == "cake":
            doc["stock"][i] = "water"
        if stock == "apple":
            doc["stock"][i] = "honey"
# write back
table.write_back(docs)

Or replace write_back with new documents

db.purge()

table = db.table('yumyum')
table.insert_multiple(dataset)

docs = table.all()
# save doc_ids
doc_ids = [doc.doc_id for doc in docs]
# modify docs
for i, doc in enumerate(docs):
    if len(doc["stock"]) == 1:
        docs[i] = {"best": doc["stock"][0]}
    elif len(doc["stock"]) > 1:
        docs[i] = {"good": doc["stock"]}

table.write_back(docs, doc_ids)

Could be a convenient feature :)

davidlatwe added some commits Jan 20, 2018

Add Table.replace(), replacing doucments by ids
Compare to `update()`, `replace()` provide a more customizable workflow
to modify documents.
One can fistly search the documents and modify them by thier conditions,
then input new douments with thier `doc_ids` to `replace()`.
@davidlatwe

This comment has been minimized.

Contributor

davidlatwe commented Jan 20, 2018

Seems like pytest-runner become an issue that can't be solved, it starts failing, even current master branch.
Due to pytest-runner not able to find the package setuptools_scm in Python 2.6.

reference 1
reference 2

In the last commit a2992e6 is the best workaround I can get for now.

Back to the topic, please let me know what you think about this replace feature !
Thank you.

@hugovk

This comment has been minimized.

Contributor

hugovk commented Jan 21, 2018

Here's the pip installs for tinydb from PyPI for last month:

python_version percent download_count
2.7 44.5% 4,663
3.6 17.4% 1,826
3.5 15.2% 1,587
3.4 12.4% 1,299
3.2 3.2% 339
2.6 2.6% 269
3.3 2.4% 255
3.7 2.2% 235

Source: pypinfo --start-date -51 --end-date -21 --percent --pip --markdown tinydb pyversion

2.6 has been EOL for over 4 years. Can it be dropped?

@davidlatwe

This comment has been minimized.

Contributor

davidlatwe commented Jan 22, 2018

Just moved this to issue #185 :)

@msiemens

This comment has been minimized.

Owner

msiemens commented Jan 23, 2018

Looks good to me 🙂 If @eugene-eeo doesn't have any objections, I'll be glad to merge this!

@eugene-eeo

This comment has been minimized.

Contributor

eugene-eeo commented Jan 27, 2018

Yeah, looks great! My only nitpick would be the name of the method... replace is nice but writeback would be more precise on what is being done.

@davidlatwe

This comment has been minimized.

Contributor

davidlatwe commented Jan 27, 2018

replace is nice but writeback would be more precise on what is being done.

sure ! @eugene-eeo, I changed to write_back now, with an underscore.


Just found out that may cause doc_id duplicated error after this write_back (replace) action if one feeds with doc_id that is greater then _last_id, because this function write doc just like insert :

data[doc_id] = documents.pop()

So added a check in commit d0d0c8f

@davidlatwe davidlatwe changed the title from Add Table.replace(), replacing doucments by ids to Add Table.write_back(), replacing documents by ids Jan 28, 2018

@davidlatwe

This comment has been minimized.

Contributor

davidlatwe commented Jan 28, 2018

OK, I also modified the example usage in top comment with the function name write_back, I am good to go :)

davidlatwe added some commits Jan 29, 2018

Merge branch 'master' into feature-Table.replace
Merging master branch with Python 2.6 support dropped.
@davidlatwe

This comment has been minimized.

Contributor

davidlatwe commented Jan 29, 2018

:shipit: Merged with latest master branch, setup.py cleaned.

@davidlatwe

This comment has been minimized.

Contributor

davidlatwe commented Jan 30, 2018

Hi @msiemens and @eugene-eeo , can we merge this ?
Or Docs should be update as well ?
Thanks :D

@eugene-eeo

This comment has been minimized.

Contributor

eugene-eeo commented Jan 30, 2018

LGTM but it's up to @msiemens to merge. Thanks for contributing!

@msiemens

This comment has been minimized.

Owner

msiemens commented Jan 30, 2018

Sorry for the silence, everyone. I'm having two exams this week and am a little short on time. All in all, this looks good to me too. But I'd bring up the question of naming again. From an API usability perspective, I'd really prefer replace() instead of write_back(). Most people immediately can recognize what replace() does but will struggle with write_back().

@eugene-eeo What do you think?

@eugene-eeo

This comment has been minimized.

Contributor

eugene-eeo commented Jan 31, 2018

@msiemens yeah I see your point which is something I considered at first, but the way I saw this was that it is a "lower level" version of db.update because you're in charge of maintaining the IDs. Also the write_back name was inspired by the writeback option in shelve.open. However I don't mind if we use replace instead as it seems to be more intuitive.

@msiemens

This comment has been minimized.

Owner

msiemens commented Jan 31, 2018

the way I saw this was that it is a "lower level" version of db.update because you're in charge of maintaining the IDs

That's a good point however. The question to me is whether this operation is something we would encourage everyday users or not. If we do encourage everyday use, replace is the better name. If this is meant to be a rather low level operation, we should rather go with write_back. I'm not sure what I prefer as one could argue for both cases (and we cannot change the behaviour significantly later on)…

@eugene-eeo

This comment has been minimized.

Contributor

eugene-eeo commented Feb 1, 2018

I think it could be regarded as a lower level operation now, we can always change the method name with a deprecation warning in the future.

@msiemens

This comment has been minimized.

Owner

msiemens commented Feb 1, 2018

That's a good idea! The last remaining question is where to document this. In Advanced Usage? Only in the API docs?

@eugene-eeo

This comment has been minimized.

Contributor

eugene-eeo commented Feb 2, 2018

Both in the API docs and advanced usage, IMO, and in the API docs we'll put a warning saying that it can be dangerous to use the method if you don't know what you're doing.

@davidlatwe

This comment has been minimized.

Contributor

davidlatwe commented Feb 6, 2018

If in Advanced Usage, maybe could place it right after the operations example in Updating data section, since this is like a "lower level" version of update that giving user the ability to perform custom operation wildly.

P.S. @msiemens sorry, no mean to rush 😄 , hope for the best !

@msiemens

This comment has been minimized.

Owner

msiemens commented Feb 8, 2018

Just a quick note: I haven't forgotten about this PR and 'll probably revisit it at the end of next week 🙂

@msiemens msiemens merged commit 6b33fee into msiemens:master Feb 18, 2018

2 checks passed

continuous-integration/travis-ci/pr The Travis CI build passed
Details
coverage/coveralls Coverage remained the same at 100.0%
Details
@msiemens

This comment has been minimized.

Owner

msiemens commented Feb 18, 2018

I've finally merged this and added some documentation. I'll release a new version of TinyDB in the next week or two 🙂

@davidlatwe davidlatwe deleted the davidlatwe:feature-Table.replace branch Feb 18, 2018

@davidlatwe

This comment has been minimized.

Contributor

davidlatwe commented Feb 18, 2018

Thank you @msiemens !

@msiemens

This comment has been minimized.

Owner

msiemens commented Mar 1, 2018

This is now released in v3.8.0 🙂

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment