Item Count Error in Shelf #81422

Closed
jessembacon mannequin opened this issue Jun 12, 2019 · 7 comments
Labels
3.7 (EOL): end of life; stdlib: Python modules in the Lib dir; type-bug: An unexpected behavior, bug, or error

Comments

@jessembacon
Mannequin

jessembacon mannequin commented Jun 12, 2019

BPO 37241
Nosy @ericvsmith
Files
  • KeyCount.png: Screen shot of exercise
  • ShelfKeys.png: Data Missing from Shelf
  • Python Proof.ipynb: Jupyter Notebook with comments
  • pbr37241_Jesse_Bacon.py.txt: Test_Script
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = None
    closed_at = <Date 2019-06-13.16:29:52.711>
    created_at = <Date 2019-06-12.00:13:05.766>
    labels = ['3.7', 'type-bug', 'library']
    title = 'Item Count Error in Shelf'
    updated_at = <Date 2019-06-13.16:29:52.710>
    user = 'https://bugs.python.org/jessembacon'

    bugs.python.org fields:

    activity = <Date 2019-06-13.16:29:52.710>
    actor = 'SilentGhost'
    assignee = 'none'
    closed = True
    closed_date = <Date 2019-06-13.16:29:52.711>
    closer = 'SilentGhost'
    components = ['Library (Lib)']
    creation = <Date 2019-06-12.00:13:05.766>
    creator = 'jessembacon'
    dependencies = []
    files = ['48411', '48413', '48414', '48415']
    hgrepos = []
    issue_num = 37241
    keywords = []
    message_count = 7.0
    messages = ['345290', '345291', '345369', '345381', '345412', '345419', '345525']
    nosy_count = 2.0
    nosy_names = ['eric.smith', 'jessembacon']
    pr_nums = []
    priority = 'normal'
    resolution = 'third party'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue37241'
    versions = ['Python 3.6', 'Python 3.7']

    @jessembacon
    Mannequin Author

    jessembacon mannequin commented Jun 12, 2019

    I loaded the 2019 National Vulnerability Database feed from NIST, which contains 3989 JSON documents, and stored the data in a shelf. When I run len(db.keys()) I get 3658, even though len(set(cves)) == 3989 is True.

    When I extract the data from the shelf I get the right number of records, 3989. I tested on Python 3.7.3 and Python 3.6.5. I am concerned this will corrupt a metric in a security report. For example, a risk exposure report may use the number of keys in a yearly vulnerability db as the baseline for a risk calculation that compares it against the number of patched CVEs.

    nvdcve-1.0-2019.json
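
A minimal sketch of the consistency check described above, without the NVD download. It is not the attached script: the function name `check_shelf_counts`, the file name `cve_check.shelf`, and the synthetic CVE ids are assumptions made for illustration. It stores one record per unique key in a fresh shelf, reopens it, and compares the counts the shelf reports with the number of keys written.

```python
import os
import shelve
import tempfile


def check_shelf_counts(keys):
    """Store one record per key in a fresh shelf and report the counts seen.

    Hypothetical helper, not part of the attached script.
    """
    unique_keys = set(keys)
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "cve_check.shelf")
        # Write phase: one small record per unique key.
        with shelve.open(path) as db:
            for key in unique_keys:
                db[key] = {"id": key}
        # Read phase: reopen so the counts come from what was persisted.
        with shelve.open(path) as db:
            stored = set(db.keys())
            return {
                "expected": len(unique_keys),
                "len_db": len(db),
                "len_keys": len(stored),
                "missing": sorted(unique_keys - stored),
            }


if __name__ == "__main__":
    # Synthetic ids standing in for the real CVE identifiers.
    sample = ["CVE-2019-%04d" % i for i in range(4000)]
    print(check_shelf_counts(sample))
```

If `expected`, `len_db`, and `len_keys` disagree on a given interpreter, the `missing` list names the keys that were written but are not visible afterwards, which is the kind of copy-and-paste evidence requested later in the thread.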

    @jessembacon jessembacon mannequin added 3.7 (EOL) end of life stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Jun 12, 2019
    @ericvsmith
    Member

    Please do not post images: we can't copy and paste from them, and they're unfriendly to visually impaired users.

    Can you create code that reproduces this? A small example, with no external dependencies would be best. Please attach the reproducer as a text file.

    @jessembacon
    Mannequin Author

    jessembacon mannequin commented Jun 12, 2019

    I am missing keys when extracting the data back out with today's NVD pull.
    ---------------------------------------------------------------------------

    KeyError                                  Traceback (most recent call last)
    ~/anaconda3/lib/python3.6/shelve.py in __getitem__(self, key)
        110         try:
    --> 111             value = self.cache[key]
        112         except KeyError:

    KeyError: 'CVE-2019-1842'

    During handling of the above exception, another exception occurred:
    
    KeyError                                  Traceback (most recent call last)
    <ipython-input-62-aeb8a14b4774> in <module>
          1 results = []
          2 for x in raw_cves:
    ----> 3     results.append(db[x])

    ~/anaconda3/lib/python3.6/shelve.py in __getitem__(self, key)
        111             value = self.cache[key]
        112         except KeyError:
    --> 113             f = BytesIO(self.dict[key.encode(self.keyencoding)])
        114             value = Unpickler(f).load()
        115         if self.writeback:

    KeyError: b'CVE-2019-1842'
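
The loop in the traceback stops at the first absent key. A small sketch of a variant that keeps going and collects every missing key is shown here; the helper name `split_present_missing` is hypothetical, and it accepts any mapping, so a `shelve` shelf or a plain dict both work.

```python
def split_present_missing(db, keys):
    """Return (found, missing): records that could be read and keys that raised KeyError.

    Hypothetical helper; `db` is any mapping, such as an open shelf.
    """
    found = {}
    missing = []
    for key in keys:
        try:
            found[key] = db[key]
        except KeyError:
            # Record the key instead of aborting the whole extraction.
            missing.append(key)
    return found, missing


if __name__ == "__main__":
    # A plain dict stands in for the shelf in this illustration.
    fake_db = {"CVE-2019-0001": {"id": "CVE-2019-0001"}}
    found, missing = split_present_missing(fake_db, ["CVE-2019-0001", "CVE-2019-1842"])
    print(len(found), missing)
```

Printing the `missing` list as text gives a result that can be pasted into a report directly.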

    @ericvsmith
    Member

    This still isn't an example we can copy and paste to reproduce, so I'm going to be unable to help you. Sorry.

    Again: please don't post images, for the reasons I previously stated.

    @jessembacon
    Mannequin Author

    jessembacon mannequin commented Jun 12, 2019

    Eric,

    The interpreter said something about passing a negative value when I converted db.keys() to a list. I have attached a script in txt format and a Jupyter notebook for further analysis. I apologize for posting images; I just saw your note. I'll go ahead and look at the shelve source while you determine if this information is sufficient. Thank you for your time.

    @ericvsmith
    Member

    After fixing a missing import (import urllib.request), this is what I get:

    $ /usr/local/bin/python3.6 pbr37241_Jesse_Bacon.py 
    Fetching nvdcve-1.0-2019.json.gz
    Storing Gzipped File
    Loading JSON Content
    4275 records
    4275 unique records
    Creating Shelve: cve_2019.shelf
    Assembling Big Dictionary of 2019 Data in shelve
    shelve reports 4275 unique records
    Extracting data by keys from shelve
    4275 extracted records
    Number of missing records 0
    data match

    Are you seeing failures?

    This is on a python3.6 that I compiled from source on an old Fedora box.

    What OS are you using?

    @jessembacon
    Mannequin Author

    jessembacon mannequin commented Jun 13, 2019

    I was using the Anaconda distribution on OS X. It failed for 3.6 and 3.7. I moved off Anaconda and compiled Python from source, and the script executed correctly regardless of whether "--enable-optimizations" was set. Anaconda claims to be geared towards scientists, so this is alarming. Thank you for your time.
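
The thread does not establish why the Anaconda build behaved differently. One thing that can vary between Python builds is the `dbm` backend that `shelve` picks by default, so a hedged diagnostic is to record which backend a freshly created shelf uses under each interpreter. This is a sketch under that assumption, not a confirmed explanation; `describe_shelf_backend` and the file name `probe.shelf` are hypothetical.

```python
import dbm
import os
import shelve
import sys
import tempfile


def describe_shelf_backend():
    """Create a throwaway shelf and report which dbm backend produced it.

    Hypothetical diagnostic helper. dbm.whichdb returns a module name such as
    'dbm.gnu', 'dbm.ndbm' or 'dbm.dumb', an empty string if the format cannot
    be guessed, or None if the file cannot be read.
    """
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "probe.shelf")
        with shelve.open(path) as db:
            db["probe"] = 1
        backend = dbm.whichdb(path)
    return {
        "python": sys.version.split()[0],
        "executable": sys.executable,
        "backend": backend,
    }


if __name__ == "__main__":
    print(describe_shelf_backend())
```

Running this under both the distribution's interpreter and the source build would show whether the two are using the same storage backend, which narrows down where a third-party difference might sit.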

    @SilentGhost SilentGhost mannequin closed this as completed Jun 13, 2019
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022