Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

uuid.uuid1() should use uuid_generate_time_safe() if available #66996

Closed
warsaw opened this issue Nov 6, 2014 · 12 comments
Closed

uuid.uuid1() should use uuid_generate_time_safe() if available #66996

warsaw opened this issue Nov 6, 2014 · 12 comments
Assignees
Labels
3.7 stdlib

Comments

@warsaw
Copy link
Member

@warsaw warsaw commented Nov 6, 2014

BPO 22807
Nosy @warsaw, @vstinner, @taleinat, @alex, @serhiy-storchaka
PRs
  • #138
  • #703
  • #552
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/warsaw'
    closed_at = <Date 2017-02-20.15:15:25.790>
    created_at = <Date 2014-11-06.20:07:26.300>
    labels = ['3.7', 'library']
    title = 'uuid.uuid1() should use uuid_generate_time_safe() if available'
    updated_at = <Date 2018-09-10.12:17:40.604>
    user = 'https://github.com/warsaw'

    bugs.python.org fields:

    activity = <Date 2018-09-10.12:17:40.604>
    actor = 'taleinat'
    assignee = 'barry'
    closed = True
    closed_date = <Date 2017-02-20.15:15:25.790>
    closer = 'barry'
    components = ['Library (Lib)']
    creation = <Date 2014-11-06.20:07:26.300>
    creator = 'barry'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 22807
    keywords = []
    message_count = 12.0
    messages = ['230759', '230760', '230762', '287950', '288089', '288209', '288213', '288214', '288215', '288216', '324829', '324920']
    nosy_count = 6.0
    nosy_names = ['barry', 'vstinner', 'taleinat', 'vila', 'alex', 'serhiy.storchaka']
    pr_nums = ['138', '703', '552']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue22807'
    versions = ['Python 3.7']

    @warsaw
    Copy link
    Member Author

    @warsaw warsaw commented Nov 6, 2014

    I'm classifying this as a security issue, since using uuid_generate_time() -- i.e. the not _safe() variety -- does return collisions in real world cases that we've seen, and those could have security implications. However, I don't know that this can be exploited in any real world cases, so I'm not making it private or sending to security@.

    The basic problem is that uuid.uuid1() uses uuid_generate_time(3), but if the synchronization methods used in that C function's manpage are not used, then two concurrent processes can -- and do in our cases -- return the same UUID.

    I would propose that if uuid_generate_time_safe() is available, this should be used instead, and the return value should be checked to see if a safe method was used. If not, then uuid1() should fall back to the pure-Python approach.

    @warsaw warsaw added the stdlib label Nov 6, 2014
    @alex
    Copy link
    Member

    @alex alex commented Nov 6, 2014

    FWIW, I'm not convinced the pure python fallback code is sufficient either; time.time() doesn't have the necessary resolution AFAIK? Also clock_seq is generated using the random module's messerne twister, not SystemRandom().

    @warsaw
    Copy link
    Member Author

    @warsaw warsaw commented Nov 6, 2014

    On Nov 06, 2014, at 08:10 PM, Alex Gaynor wrote:

    FWIW, I'm not convinced the pure python fallback code is sufficient either;
    time.time() doesn't have the necessary resolution AFAIK? Also clock_seq is
    generated using the random module's messerne twister, not SystemRandom().

    Perhaps, but that's a different bug. ;)

    -----snip snip-----

    from uuid import UUID
    import ctypes
    import ctypes.util
    
    lib = ctypes.CDLL(ctypes.util.find_library('uuid'))
    _ugts = lib.uuid_generate_time_safe
    
    _buffer = ctypes.create_string_buffer(16)
    retval = _ugts(_buffer)
    
    # Remember, this is C!
    is_safe = (retval == 0)

    print('{} is safe? {}'.format(UUID(bytes=_buffer.raw), is_safe))
    -----snip snip-----

    On Ubuntu 14.10, gives me:

    xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx is safe? True

    @warsaw warsaw added the 3.7 label Feb 11, 2017
    @warsaw warsaw self-assigned this Feb 16, 2017
    @warsaw
    Copy link
    Member Author

    @warsaw warsaw commented Feb 16, 2017

    I changed my mind on whether this should affect older versions of Python. I have a branch which adds an UUID.is_safe attribute that relays the platform information about whether the UUID was generated safely or not, if available. It's an enum named SafeUUID with values .safe, .unsafe, .unknown (the default).

    @warsaw
    Copy link
    Member Author

    @warsaw warsaw commented Feb 18, 2017

    New changeset 8c130d7 by GitHub in branch 'master':
    bpo-22807: Expose platform UUID generation safety information. (#138)
    8c130d7

    @warsaw warsaw closed this as completed Feb 19, 2017
    @vstinner
    Copy link
    Member

    @vstinner vstinner commented Feb 20, 2017

    I don't understand well this change.

    What am I supposed to do with an UUID with safe=False? Should I loop on the function until I get safe==True?

    "safe for multiprocessing applications"

    Does it mean unique on the whole system?

    I looked at uuid_generate_time_safe(3) manual page which mention "synchronization mechanisms (see above)" but they are not documented.
    http://manpages.ubuntu.com/manpages/zesty/en/man3/uuid_generate.3.html

    I'm classifying this as a security issue, (...)

    This issue was only fixed in Python 3.7. Does it mean that it's no more considered as as security vulnerability?

    @vstinner vstinner reopened this Feb 20, 2017
    @warsaw
    Copy link
    Member Author

    @warsaw warsaw commented Feb 20, 2017

    On Feb 20, 2017, at 02:21 PM, STINNER Victor wrote:

    What am I supposed to do with an UUID with safe=False? Should I loop on the
    function until I get safe==True?

    It would be an application dependent response. It might be that you would
    check some other attributes of your platform (e.g. are the OS packages that
    should be installed to give you safe UUIDs?). Or your application may not
    care that much, or your application may refuse to continue to run on platforms
    without safe UUIDs, or you might use some application-level synchronization
    methods to guarantee safe UUIDs (e.g. store the unsafe or unknown ones in a
    database and check that new ones are not already used).

    The point of this change is that it provides information to the application
    creating UUIDs that wasn't previously available.

    "safe for multiprocessing applications"

    Does it mean unique on the whole system?

    I looked at uuid_generate_time_safe(3) manual page which mention
    "synchronization mechanisms (see above)" but they are not documented.
    http://manpages.ubuntu.com/manpages/zesty/en/man3/uuid_generate.3.html

    I believe some systems at least use interprocess communication with a daemon
    to provide the synchronization. Yes, it would be system-wide.

    > I'm classifying this as a security issue, (...)

    This issue was only fixed in Python 3.7. Does it mean that it's no more
    considered as as security vulnerability?

    I should remove that tag. While this could have an impact on application
    security, it's not a security issue *in Python* itself.

    @warsaw warsaw closed this as completed Feb 20, 2017
    @warsaw
    Copy link
    Member Author

    @warsaw warsaw commented Feb 20, 2017

    Oh, and because the fix is an API change, I don't believe it should be applied to earlier versions. So I think adding the API in 3.7 is all the fix needed here.

    @vstinner
    Copy link
    Member

    @vstinner vstinner commented Feb 20, 2017

    >>> import uuid
    >>> u=uuid.uuid4()
    >>> u.is_safe
    <SafeUUID.unknown: None>

    Can't we consider that UUID4 is always safe?

    @warsaw
    Copy link
    Member Author

    @warsaw warsaw commented Feb 20, 2017

    On Feb 20, 2017, at 03:45 PM, STINNER Victor wrote:

    Can't we consider that UUID4 is always safe?

    It's not a guarantee made by the underlying platform, so I chose to use the
    default SafeUUID.unknown value there.

    @serhiy-storchaka
    Copy link
    Member

    @serhiy-storchaka serhiy-storchaka commented Sep 8, 2018

    This breaks pickle compatibility. UUIDs pickled in 3.7 can't be unpickled in older Python versions because they do not have the SafeUUID class. See bpo-30977 for possible solution.

    @taleinat
    Copy link
    Contributor

    @taleinat taleinat commented Sep 10, 2018

    The fix for bpo-30977 did fix the unpickling in older versions. It was only applied to the master (i.e. 3.8) branch, though.

    I've created bpo-34621 to deal with this separately.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 stdlib
    Projects
    None yet
    Development

    No branches or pull requests

    5 participants