Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

optionally make shelve less surprising #36568

Closed
aleaxit opened this issue May 7, 2002 · 14 comments
Closed

optionally make shelve less surprising #36568

aleaxit opened this issue May 7, 2002 · 14 comments
Assignees
Labels
stdlib Python modules in the Lib dir

Comments

@aleaxit
Copy link
Contributor

aleaxit commented May 7, 2002

BPO 553171
Nosy @gvanrossum, @loewis, @aleaxit, @rhettinger
Files
  • shelve.patch: patch to Lib/shelve.py
  • shelve.patch: patch to shelve.py, test_shelve, AND docs
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/loewis'
    closed_at = <Date 2003-04-19.21:02:59.000>
    created_at = <Date 2002-05-07.08:13:04.000>
    labels = ['library']
    title = 'optionally make shelve less surprising'
    updated_at = <Date 2003-04-19.21:02:59.000>
    user = 'https://github.com/aleaxit'

    bugs.python.org fields:

    activity = <Date 2003-04-19.21:02:59.000>
    actor = 'loewis'
    assignee = 'loewis'
    closed = True
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2002-05-07.08:13:04.000>
    creator = 'aleax'
    dependencies = []
    files = ['4246', '4247']
    hgrepos = []
    issue_num = 553171
    keywords = ['patch']
    message_count = 14.0
    messages = ['39921', '39922', '39923', '39924', '39925', '39926', '39927', '39928', '39929', '39930', '39931', '39932', '39933', '39934']
    nosy_count = 5.0
    nosy_names = ['gvanrossum', 'loewis', 'aleax', 'rhettinger', 'dannu']
    pr_nums = []
    priority = 'high'
    resolution = 'accepted'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue553171'
    versions = ['Python 2.3']

    @aleaxit
    Copy link
    Contributor Author

    aleaxit commented May 7, 2002

    shelve has highly surprising behavior wrt modifiable
    values:
    s = shelve.open('she.dat','c')
    s['ciao'] = range(3)
    s['ciao'].append(4) # doesn't "TAKE"!

    Explaining to beginners that s['ciao'] is returning a
    temporary object and the modification is done on the
    temporary thus "silently ignored" is hard indeed. It
    also makes shelve far less convenient than it could
    be (whenever modifiable values must be shelved).

    Having s keep track of all values it has returned may
    perhaps break some existing program (due to extra
    memory consumption and/or to lack of "implicit
    copy"/"snapshot" behavior) so I've made the 'caching'
    change optional and by default off. However it's now
    at least possible to obtain nonsurprising behavior:
    s = shelve.open('she.dat','c',smart=1)
    s['ciao'] = range(3)
    s['ciao'].append(4) # no surprises any more

    I suspect the 'smart=1' should be made the default,
    but, if we at least put it in now, then perhaps we
    can migrate to having it as the default very slowly
    and gradually.

    Alex

    @aleaxit aleaxit closed this as completed May 7, 2002
    @aleaxit aleaxit added the stdlib Python modules in the Lib dir label May 7, 2002
    @aleaxit aleaxit closed this as completed May 7, 2002
    @aleaxit aleaxit added the stdlib Python modules in the Lib dir label May 7, 2002
    @loewis
    Copy link
    Mannequin

    loewis mannequin commented May 7, 2002

    Logged In: YES
    user_id=21627

    Even more important than the backwards compatibility might
    be the issue that it writes back all accessed objects on
    close, which might be expensive if there have been many
    read-only accesses.

    So I think the option name could be also 'slow'; although
    'writeback' might be more technical.

    Also, I wonder whether write-back should be attempted if the
    shelve was opened read-only.

    @rhettinger
    Copy link
    Contributor

    Logged In: YES
    user_id=80475

    Nicely done! The code is clean and runs in the smart mode
    without problems on my existing programs. I agree that the
    patch solves a real world problem. The solution is clean,
    but a little expensive.

    If there were a way to be able to tell if an entry had been
    altered, it would save the 100% writeback. Unfortunately,
    I can't think of a way.

    The docstring could read more smoothly and plainly. Also,
    it should be clear that the cost of setting smart=1 is that
    100% of the entries get rewritten on close.

    Two microscopically minor thoughts on the coding (feel free
    to disregard). Can some of the try/except blocks be
    replaced by something akin to 'if self.smart:'? For the
    writeback loop, consider 'for k,v in cache.iteritems()' as
    it takes less memory and saves a lookup.

    @rhettinger
    Copy link
    Contributor

    Logged In: YES
    user_id=80475

    A few more thoughts:

    Please change the "except:" lines to specify the exception
    being caught.

    Also, if GvR shows interest in the patch, we should update
    the library reference and add unittests.

    The docstring should also mention that the cache is kept in
    memory -- besides persistence, one of the forces for
    shelving is memory conservation.

    @dannu
    Copy link
    Mannequin

    dannu mannequin commented May 10, 2002

    Logged In: YES
    user_id=83092

    I'd suggest not changing shelve at all but providing
    a "cache-commit" dictionary (ccdict) which can wrap a
    shelf-instance (or any other simple dictish instance)
    and provides the 'non-surprising' behaviour.

    Some proof of concept code for the following
    properties is provided here

    http://home.trillke.net/~hpk/ccdict.py

    Current properties are:

    • ccdict wraps a dictionary-like object which
      in turn only needs to provide
      __getitem__, __setitem__, __delitem__,has_key

    • on first access of an element
      ccdict makes a lookup on the underlying
      dict and caches the item.

    • the next accesses work with the cached thing.
      Unsurprising dict-semantics are provided.

    • deleting an item is deferred and actually happens
      on commit() time. deleting an item and later on
      assigning to it works as expected (i.e. the assignment
      takes preference).

    • commit() transfers the items in the
      cache to the underlying dict and clears
      the cache.Prior to issuing commit
      no writeback to the underlying dict happens.

    • deleting an ccdict-instance does *not* commit any
      changes. You have to explicitely call commit().
      If you want to work readonly, don't call commit.

    • clear() only cleares the cache and not the underlying
      dict

    • you can explicitely prune the cache (via cache.keys()
      etc.) before calling commit(). This lets you
      avoid writing back unmodified objects if this
      is an issue.

    It seems quite impossible to figure out automagically
    which objects have been modified
    and so the solution is to do it explicitely
    (or don't commit for readonly).

    holger

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Mar 30, 2003

    Logged In: YES
    user_id=21627

    Alex, do you still think this should be implemented, in some
    form or other?

    @aleaxit
    Copy link
    Contributor Author

    aleaxit commented Mar 30, 2003

    Logged In: YES
    user_id=60314

    Yes, Martin, I'm still quite convinced shelve's behavior is
    generally surprising and often problematic, and even though
    the fixed suggested by both me and dannu are each imperfect
    (given the impossibility to find out, in general, whether an object
    has been modified), I think one or the other would still be better
    than the current situation.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Mar 30, 2003

    Logged In: YES
    user_id=21627

    Would you then be willing to provide a complete patch
    (documentation, NEWS entry, test case)?

    @rhettinger
    Copy link
    Contributor

    Logged In: YES
    user_id=80475

    The issue has arisen a couple of times
    of comp.lang.python.
    I think this patch would be helpful.

    @aleaxit
    Copy link
    Contributor Author

    aleaxit commented Mar 30, 2003

    Logged In: YES
    user_id=60314

    sure, but along what lines -- my previous patch's, or dannu's? let
    me know, and I'll get to work on it as soon as I'm back from
    Python-UK & short following trip (i..e around Apr 12)

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Mar 31, 2003

    Logged In: YES
    user_id=21627

    dannu's code is currently unavailable... I see no reason to
    add yet another layer of indirection, and no other
    application of such a wrapper within Python.

    The trickiest aspect of this educational: If the default
    behaviour does not change (as it shouldn't), how can
    unsuspecting users avoid running into the trap? So this is
    much more a documentation problem than a code problem.

    @aleaxit
    Copy link
    Contributor Author

    aleaxit commented Apr 19, 2003

    Logged In: YES
    user_id=60314

    done -- uploading the patch to code, test and docs, with the
    optional parameter renamed to writeback; docs specifying the
    issue, possible workaround w/o writeback, and writeback's
    performance-hit risks; and following raymond's suggestions.
    Not sure what "a NEWS entry" as martin requires should be a
    patch TO, though...? So I didn't do that yet -- lemme know!!!

    @gvanrossum
    Copy link
    Member

    Logged In: YES
    user_id=6380

    Looks good to me. Martin or Raymond, can you check it in?
    (Or should we just give Alex commit perms?)

    One unrelated note: the 'binary' parameter should really be
    upgraded to 'protocol' to match changes to the pickle module.

    @loewis
    Copy link
    Mannequin

    loewis mannequin commented Apr 19, 2003

    Logged In: YES
    user_id=21627

    I have committed the patch as

    libshelve.tex 1.20
    shelve.py 1.21
    test_shelve.py 1.4
    NEWS 1.737

    In these changes, the binary parameter got upgraded to
    protocol; binary was added after writeback for compatibility
    and pendingly deprecated. The test case usage of binary was
    mainly preserved, and protocol 2 test cases added.

    Alex: NEWS is Misc/NEWS; any non-bug fix change is listed there.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 9, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    stdlib Python modules in the Lib dir
    Projects
    None yet
    Development

    No branches or pull requests

    3 participants