Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Document Safe / Unsafe Usage Patterns of temp storage #12

Open
dwt opened this issue Jul 2, 2019 · 12 comments

Comments

@dwt
Copy link
Contributor

commented Jul 2, 2019

According to zopefoundation/Zope#637 (comment) there are some systematic shortcomings of tempstorage, that I haven't seen documented in this project.

Is this the case? Is the given advice solid? If yes, there should definitely be at least a short warning section in the readme, possibly some longer form documentation that guides users about safe and unsafe ways to use this package.

There are quite some bug-reports that point to problems with temp storage (and we are seeing problems here too) that seem to be much easier to trigger on python3, but are probably present since quite a long time. At the same time @d-maurer seems to indicate that this is common knowledge.

Quite possibly #8 is also an instance of this problem - we are seeing similar things in our internal testing, #5, zopefoundation/Zope#637, zopefoundation/Products.Sessions#4 all seem to paint that there is a general problem going on.

@mgedmin

This comment has been minimized.

Copy link
Member

commented Jul 2, 2019

Anecdote: once I imported a folder full of page templates and scripts (a new version of our website) into /temp_folder, tested it out, then cut and pasted it into root.

About 24 hours later the new website started failing and throwing POSKeyErrors. Turns out ZMI's Cut & Paste doesn't move the objects from one storage into another, so the Folder object pasted into the root folder was a cross-database reference into the temp storage. Which got erased for some reason.

I wouldn't call this a bug, exactly, merely an unfortunate clash between user expectations and software working as designed. I've learned to avoid /temp_folder.

@dataflake

This comment has been minimized.

Copy link
Member

commented Jul 2, 2019

@mgedmin Wow, I never ever even got the idea of storing anything at all in there. Not because I have some higher level of knowledge about tempstorage or Products.TemporaryFolder, but I never made that mental connection to think "this behaves just like a temporary folder on disk".

@dwt

This comment has been minimized.

Copy link
Contributor Author

commented Jul 3, 2019

@mgedmin I would have to say, that calling a function cut/copy/paste and then implementing it to merely reference some other storage without actually doing the copy is about as much a bug as the mysql utf8 encoding not actually encoding utf8 but instead only at max three bytes utf8, therefore being unable to encode smileys and also silently dropping those from the input.

That is to say: This is utterly surprising behaviour that goes completely counter to what the user expects. It is definitely a bug - albeit one that maybe nobody wants to fix because it is complicated.

@dwt

This comment has been minimized.

Copy link
Contributor Author

commented Jul 3, 2019

Just a thought: If tempstorage is actually so broken - wouldn't it be sensible at some point to just drop it, move session into the normal ZODB and just be done with it?

@d-maurer

This comment has been minimized.

Copy link

commented Jul 3, 2019

@dwt

This comment has been minimized.

Copy link
Contributor Author

commented Jul 3, 2019

@d-maurer: I know this is only tangential to the bug we're discussing here, but I would really not have a problem if cut'n'paste only 'moved' objects. The problem is more that it does not. I think that is exactly the problem: It only creates a reference to the old storage location and does not move the object. That is the exact reason why it the name 'cut' does not fit the mental model of users.

@dwt

This comment has been minimized.

Copy link
Contributor Author

commented Jul 3, 2019

I have tried experimenting a bit with the temp storage to understand it's failure states, so far I wasn't able to reproduce any of the failures that other bug reports have mentioned or that we saw.

Here's my try - can you maybe give some guidance what might be required to trigger temp storage barfing up?

#!/usr/bin/env python

from contextlib import contextmanager
import time
from random import random

import ZODB, ZODB.FileStorage
from tempstorage.TemporaryStorage import TemporaryStorage
from persistent.mapping import PersistentMapping
from BTrees.OOBTree import OOBTree
import transaction

from Testing.makerequest import makerequest


storage = TemporaryStorage()
db = ZODB.DB(storage)
original_connection = connection = db.open()
root = connection.root

from OFS.Application import Application
root.Application = Application()
transaction.savepoint(optimistic=True)


def _populate(app):
    from OFS.DTMLMethod import DTMLMethod
    from Products.Sessions.BrowserIdManager import BrowserIdManager
    from Products.Sessions.SessionDataManager import SessionDataManager
    from Products.TemporaryFolder.TemporaryFolder import MountedTemporaryFolder
    from Products.Transience.Transience import TransientObjectContainer
    import transaction
    bidmgr = BrowserIdManager('browser_id_manager')
    tf = MountedTemporaryFolder('temp_folder', title="Temporary Folder")
    toc = TransientObjectContainer(
        'temp_transient_container',
        title='Temporary '
        'Transient Object Container',
        timeout_mins=20
    )
    session_data_manager = SessionDataManager(
        id='session_data_manager',
        path='/' + 'temp_folder' + '/' + 'temp_transient_container',
        title='Session Data Manager',
        requestName='SESSION'
    )
    
    app._setObject('browser_id_manager', bidmgr)
    app._setObject('session_data_manager', session_data_manager)
    app._setObject('temp_folder', tf)
    
    transaction.commit()
    
    app.temp_folder._setObject('temp_transient_container', toc)
    transaction.commit()

    # index_html necessary for publishing emulation for testAutoReqPopulate
    app._setObject('index_html', DTMLMethod('', __name__='foo'))
    transaction.commit()

_populate(root.Application)

print(connection.root)

# is a dict here good enough to check this out, or do I need to tell zodb that stuff changed#
# May need to change this to a persistent.Mapping to get the effect I want
# connection.root.session = PersistentMapping()
# connection.root.session = OOBTree()
# connection.root.session['a'] = 'b'
# print(connection.root.session['a'])
#
# import transaction
# transaction.commit()
#
# print(root.session['a'])
container = makerequest(root.Application)
session = container.session_data_manager.getSessionData()
cookies = container.REQUEST.response.cookies
cookies = {'_ZopeId': cookies['_ZopeId']['value']}
transaction.savepoint(optimistic=True)

# import sys; sys.stdout = sys.__stdout__; from pdb import set_trace; set_trace()

@contextmanager
def level(level_number):
    container = makerequest(root.Application)
    container.REQUEST.cookies = cookies
    with db.transaction() as connection:
        print(level_number, '. level')
        # import sys; sys.stdout = sys.__stdout__; from pdb import set_trace; set_trace()
        session = container.session_data_manager.getSessionData(create=False)
        print(session._p_oid)
        print(session)
        session['foo'] = level_number
        session['foo%s' % level_number] = level_number
        print(session)
        
        # print(dict(connection.root.session))
        yield session
        session['bar'] = level_number
        connection.transaction_manager.get().commit()
    print(container.session_data_manager.getSessionData()._p_changed)
    print(level_number, '. level ended')
    transaction.commit()


# with level(1):
#     with level(2):
#         with level(3):
#             with level(4):
#                 with level(5):
#                     with level(6):
#                         with level(7):
#                             with level(8):
#                                 with level(9):
#                                     pass
#
from threading import Thread

class RequestThread(Thread):
    
    def run(self):
        time.sleep(random())
        with level(1):
            with level(2):
                time.sleep(random())

t = [RequestThread() for _ in range(10000)]
list(map(lambda t: t.start(), t))
list(map(lambda t: t.join(), t))

print(session)
print(dict(session))
# print(dict(original_connection.root.session))
@d-maurer

This comment has been minimized.

Copy link

commented Jul 3, 2019

@dwt

This comment has been minimized.

Copy link
Contributor Author

commented Jul 4, 2019

Out of interest: Why does this make you think that this fact makes cut'n'paste is a good name for this feature that will give users the right mental model when to use it?

But now I'd really like to get back to the overall stability and fitness of temp storage.

@dwt

This comment has been minimized.

Copy link
Contributor Author

commented Jul 9, 2019

I have not been able to replicate this issue yet, but I did find out, that this patch

diff --git a/src/tempstorage/TemporaryStorage.py b/src/tempstorage/TemporaryStorage.py
index 284c55d..d61fce3 100644
--- a/src/tempstorage/TemporaryStorage.py
+++ b/src/tempstorage/TemporaryStorage.py
@@ -29,10 +29,10 @@ from ZODB.serialize import referencesf
 from ZODB.utils import z64
 
 # keep old object revisions for CONFLICT_CACHE_MAXAGE seconds
-CONFLICT_CACHE_MAXAGE = 60
+CONFLICT_CACHE_MAXAGE = 5
 
 # garbage collect conflict cache every CONFLICT_CACHE_GCEVERY seconds
-CONFLICT_CACHE_GCEVERY = 60
+CONFLICT_CACHE_GCEVERY = 5
 
 # keep history of recently gc'ed oids of length RECENTLY_GC_OIDS_LEN
 RECENTLY_GC_OIDS_LEN = 200

seems to make it much easier to replicate this issue in our real app. Still, I'd like to understand better what the actual issue is to ensure we have actually solved it.

@dwt

This comment has been minimized.

Copy link
Contributor Author

commented Jul 10, 2019

I'll be continuing the discussion about this reproducer at #8 as that seems a better bug for the kind of systematic error I'm seeing. I'd love for this bug to return to a more principled discussion of how to use tempstorage the right / wrong way.

@dwt

This comment has been minimized.

Copy link
Contributor Author

commented Jul 22, 2019

To try and move this discussion forward: As there seems to be no developer investment in fixing / working on #8, maybe we should start thinking about deprecating this package? One first step could be to change the generator, so it doesn't always generate a tempstorage mount point for every new Zope deployment that is started. (At least until the problems outlined in #8 are fixed / under control).

What do you guys think? Is changing the generator to not create the tempstorage mountpoint a discussion that should happen in https://github.com/zopefoundation/Zope?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants
You can’t perform that action at this time.