Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor check_user to optionally hash key and user inputs #51

Merged
merged 31 commits into from
Jan 13, 2022

Conversation

machikoyasuda
Copy link
Member

@machikoyasuda machikoyasuda commented Dec 15, 2021

#34

What this PR does

  • Refactor Hash, Database and Verify classes
  • Add .env, .env.sample, .env.tests file with new env variable, HASH_TYPE and default sample value, sha256.
  • Add a check to the Database#check_user that checks to see if the Database has a Hash or not. If the Database was initialized with a Hash object, the check_user function will hash the inputs first. If not, the function will not hash the inputs.
  • Add hashed data (sha256 and sha512) to server.json as test data
  • Add test cases
  • Add docstring documentation and typing for methods

To Do

  • Refactor hash variables into its own config
  • Hash class should read hash vars from config
  • Refactor hash vars out of Database class method

How to run tests

coverage run -m pytest -v

@machikoyasuda
Copy link
Member Author

@thekaveman
I've tried several ways to mock the environment variables (HASH_INPUTS (Bool) and HASH_TYPE (String)), but they haven't led to what I expect would happen.

I'm running into several issues:

  1. Using monkeypatch to directly mock HASH_INPUTS and HASH_TYPE like this, works -- but the last 2 lines will fail.
    monkeypatch.setenv("HASH_INPUTS", "False")
    assert os.environ.get("HASH_INPUTS") == "False"
    assert settings.HASH_INPUT == False ## this fails
    assert hash._hash_inputs == False   ## this fails

It seems like monkeypatch really only makes os.environ.get() return the mocked value (False). This, however, isn't forcing the settings class to be re-run and the hash instantiation to then get this updated value of False. It appears that both the settings.py file and hash.py class are still using the actual .env var value, of True. The way I have the hash#__init__ method structured, the init method is reading from settings.

    def __init__(self):
        """
        Initialize variables (hash_inputs, hash_type) from settings.py variables
        """

        self._hash_inputs = settings.HASH_INPUTS or False
        self._hash_type = settings.HASH_TYPE or "sha256"

So, I tried a different way:

@patch("eligibility_server.settings.HASH_INPUTS", False)
@patch("eligibility_server.settings.HASH_TYPE", "sha512")

But I wasn't able to get this to pass:

assert settings.HASH_TYPE == "sha512" # This fails

I wasn't able to use @patch to monkeypatch the hash (self) directly either.

  1. Then I tried a new way all together, I made a new test directory, with a new conftest.py file and created a new fixture that monkeypatches:

image

The good news is that, with this way, I was able to get this test, which is testing the case in which a user doesn't want to hash anything, to pass:

image

However, when I tried to write another spec that tests this same condition (HASH_INPUTS is False) from the check_user Database class method, this test failed. When I print outputted self._hash_inputs, it logged as True.... which clued me into think that somehow, still, this variable is getting overridden, or not being mocked correctly:

image

Any suggestions???

I need to be able to:

  • Write tests to confirm that changing .env variables will instantiate a Hash class correctly. If the .env var file says that HASH_INPUTS is False, self._hash_inputs should return False.
  • Write tests to confirm that the hash_input() method correctly will not hash inputs if self._hash_inputs is False.
  • Write tests to where hash_type is set to sha512 or something else.

I'm not sure if monkeypatching the environment variables are better, or if using the fixture method is better.. (though I'd like to make it easier to turn on/off fixtures or use only certain fixtures with certain tests, so I can more easily change "hash_type" for just a few specs).

@machikoyasuda
Copy link
Member Author

@thekaveman Can you review the way I've added the .env vars, config default values and initialized the Hash class?

@machikoyasuda
Copy link
Member Author

@machikoyasuda
Copy link
Member Author

  • Settings: Test settings with monkey patching (test to see if defaults are working)
  • Hash: Test should instantiate a specific Hash
  • Database: Test should use a fixture Hash

@thekaveman
Copy link
Member

thekaveman commented Jan 10, 2022

I think you captured most of what we talked about in your notes. A few additional things:

  • Hash: __init__() should accept the hash function (SHA256 etc.) as a parameter, rather than reading from settings
  • Database: __init__() should accept a Hash object as a parameter so it is not creating its own object
  • Verify: create the Hash and Database objects using settings.py

And one additional idea as I was typing this up: I think the Hash object can be simplified to not care about HASH_INPUTS, instead assuming that if it is being used, you always want to hash the inputs. The settings check can move to Verify, which will or won't init a Database with a Hash object based on the setting. Here is pseudocode:

class Database:
    def __init__(self, hash=False):
        self._hash = hash

   def lookup(self, input):
        if self._hash:
             hashed_input = self._hash.hash_input(input)
             # lookup using hashed_input
        else:
             # look up using input directly


class Verify:
    def __init__(self):
        if settings.HASH_INPUTS:
            hash = Hash(settings.HASH_TYPE)
            self._db = Database(hash=hash)
        else:
            self._db = Database()

@machikoyasuda machikoyasuda marked this pull request as ready for review January 11, 2022 21:59
@machikoyasuda machikoyasuda added this to the Courtesy Cards milestone Jan 11, 2022
@machikoyasuda machikoyasuda self-assigned this Jan 11, 2022
@machikoyasuda machikoyasuda linked an issue Jan 11, 2022 that may be closed by this pull request
@machikoyasuda
Copy link
Member Author

@thekaveman @angela-tran Finally ready for review. Thank you both for all the help already!

Copy link
Member

@thekaveman thekaveman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is looking great! A few testing comments, a few comments on the comments 😉, a few more general/python comments.

eligibility_server/database.py Outdated Show resolved Hide resolved
eligibility_server/database.py Outdated Show resolved Hide resolved
eligibility_server/database.py Outdated Show resolved Hide resolved
eligibility_server/database.py Show resolved Hide resolved
eligibility_server/hash.py Outdated Show resolved Hide resolved
tests/test_database.py Outdated Show resolved Hide resolved
tests/test_database.py Show resolved Hide resolved
tests/test_database.py Show resolved Hide resolved
tests/test_database.py Outdated Show resolved Hide resolved
tests/test_hash.py Outdated Show resolved Hide resolved
@machikoyasuda
Copy link
Member Author

@thekaveman Ready for a re-review:

  • renamed hash_type to input_hash_algo; HASH_TYPE to INPUT_HASH_ALGO
  • added test case, when the wrong hash algorithm is specified, the database should return []
  • simplified check_user method
  • reworded some comments

Copy link
Member

@thekaveman thekaveman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a blocker in the least, but perhaps a quick git rebase -i main could allow for clean up of some of these "trying this" type commits.

@machikoyasuda machikoyasuda merged commit 65afd83 into main Jan 13, 2022
@machikoyasuda machikoyasuda deleted the feat/34/hashed-lookup branch January 13, 2022 23:18
@angela-tran angela-tran mentioned this pull request Nov 14, 2022
10 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Archived in project
Development

Successfully merging this pull request may close these issues.

Implement hashed lookup
2 participants