New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Convert profile attributes to JSON #599
Conversation
@@ -0,0 +1,4 @@ | |||
module Pii |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this class will be exercised once #498 is complete
@@ -0,0 +1,21 @@ | |||
module Pii |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, this is not encryption per se. Just Base64 encoding. The class name is a placeholder for the concept of encrypting. Actual encryption will happen in #498
expect(FeatureManagement.use_kms?).to eq(true) | ||
end | ||
|
||
xit 'throws exception when attempting to fetch PII signing key' do |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
KeyMaker class is not present in this PR but will be in #498
598 has been merged. Wanna rebase? |
851f92a
to
795e9d9
Compare
thanks @monfresh -- rebased |
795e9d9
to
7bd86ac
Compare
7bd86ac
to
b1d8ea1
Compare
self.user = user | ||
self.service_provider = service_provider | ||
self.authn_request = authn_request | ||
@decrypted_pii = decrypted_pii |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed in many other places we prefix instance variables with _
, so this would be @_decrypted_pii
-- should we be doing this here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have used the leading underscore for memoized private instance variables that should not be accessed directly anywhere else in the code except for via the memoized method. That doesn't apply here, since we do want to use the instance var elsewhere. I can make it a private accessor if that would be clearer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Private accessor sounds good to me, I didn't see the difference between memoized instance variables vs otherwise
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in 51dd5df
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we convert the other initialized variables into instance variables for consistency?
@@ -68,5 +76,13 @@ def submit_applicant | |||
idv_attempter.increment | |||
resolution | |||
end | |||
|
|||
def password | |||
params.require(:user)[:password] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be either .require(:password)[:password]
or .require(user: :password)[:user][:password]
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I feel like I tried a half dozen different combinations to get what I wanted, which is:
- do not raise an exception if the param is missing
- do return nil if the param is missing
We want the error to come from the valid_password?
method, not the controller param fetcher.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I think I understand.
I think we want:
params.fetch(:user, {})[:password].presence
The methods from strong parameters .require
and .permit
are only useful if we're passing entire hashes around (ex User.create(params)
), but in this case, we just want a single key, so I think we can avoid that approach and be explicit.
attr_name_sym = method_sym.to_s.gsub(/=\z/, '').to_sym | ||
if plain_pii.members.include?(attr_name_sym) | ||
return plain_pii[attr_name_sym] if arguments.empty? | ||
plain_pii[attr_name_sym] = arguments.first |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this is imitating writing an attribute, should we do anything to mark the record as dirty?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
good question.
PII must always be encrypted before saving, and because encryption requires the user passphrase, we can't make it a before_save
hook or equivalent. So the only attribute that really should be marked as dirty is encrypted_pii
and marking it dirty would be technically incorrect until after encrypt_pii
is explicitly called.
I'm starting to think that the "convenience magic" of the method_missing stuff maybe isn't worth the complexity it creates. These two do the same thing:
profile.plain_pii[:ssn] = '1234'
profile.ssn = '1234'
and while I personally love the terse-ness and error checking of the 2nd form, the non-magical-ness of the 1st form might be friendlier to the code maintainer. Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am very wary of method_missing
and believe we should minimize Ruby dark magic, big fan of profile.plain_pii[:ssn] = '1234'
!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How would you feel about this instead of method_missing
etc:
Pii::Attributes.new.members.each do |attr|
delegate attr, to: :plain_pii
setter = "#{attr}=".to_sym
delegate setter, to: :plain_pii
end
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, I'd be really happy with just writing profile.plain_pii.foobar
everywhere since it makes for a really clear relation, but it may be easier to refactor the code with this delegation layer? Would love to get other folk's opinions.
(Also teeny nit, structs have a class-level .members
so you could do Pii::Attributes.members
without the .new
)
@zachmargolis thanks for the review. updates in 51dd5df |
per the codeclimate warning, I tried changing this: if evaluator.pii
evaluator.pii.each { |key, val| profile.plain_pii[key] = val }
end to this: evaluator.pii&.each { |key, val| profile.plain_pii[key] = val } and got an error about "can't call I'm in favor of disabling that rubocop rule. I've never seen that safe navigation syntax before. UPDATE: I worked around the rule. |
3cb411a
to
cbb4edb
Compare
def verified? | ||
verified_at.present? | ||
def plain_pii | ||
@_plain_pii ||= Pii::Attributes.new |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's address this later, but wanted to comment before I forget:
I'm not sure we should memoize this method in this way -- I think that this would allow pretending to read PII attributes without actually decrypting them. IMO if a profile has PII but it has not been decrypted, this should be nil or something so we get errors when try to access fields.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's a good point. I am not happy with how the PII encryption is neither automatic in a hook, nor clearly a manual process.
We could get rid of plain_pii
altogether to remove confusion. E.g.
profile = Profile.new( user: user )
pii_attrs = Pii::Attributes.new_from_hash( ssn: '1234' )
profile.encrypt_pii(passphrase, pii_attrs)
profile.save!
pii_decrypted = profile.decrypt_pii(passphrase)
puts pii_decrypted == pii_attrs # -> true
You would just never handle un-encrypted PII aside from an explicit Pii::Attributes
instance. The Profile
only ever holds it encrypted.
That would make the tests a little more verbose but at least there would not be unexpected behaviour.
Thoughts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@zachmargolis I have removed plain_pii
method in d17b0a7
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yup, makes sense!
I would even be in favor of something like this (very verbose and separate but very clear)
profile = Profile.new( user: user )
pii_attrs = Pii::Attributes.new_from_hash( ssn: '1234' )
profile.encrypted_pii = pii_attrs.encrypt(passphrase, pii_attrs)
profile.save!
pii_decrypted = Pii::Attributes.decrypt(passphrase, profile.encrypted_pii)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Profile.encrypt_pii
also fingerprints the ssn, which must be done in sync with whatever PII is being encrypted. So having a method to handle all the encryption-related stuff seems ok to me.
@@ -3,5 +3,7 @@ class ProfileController < ApplicationController | |||
layout 'card_wide' | |||
|
|||
def index | |||
cacher = Pii::Cacher.new(current_user, user_session) | |||
@active_profile = cacher.fetch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Above, in saml_idp_auth_concern
, you are doing the same thing, but the method is called decrypted_pii
, whereas here you are saying the result represents an active profile. IIUC cacher.fetch
returns a hash of attributes, so it seems like decrypted_pii
is the correct term, and perhaps the instance variable here should be called that as well?
def build_profile(applicant, user, password) | ||
Profile.create_with_encrypted_pii(user, pii_from_applicant(applicant), password) | ||
end | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reverse the order of these 2 methods so that they read from top-down?
|
||
def build_profile(applicant, user, password) | ||
Profile.create_with_encrypted_pii(user, pii_from_applicant(applicant), password) | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
build_profile
implies calling new
and not saving, but then create_with_encrypted_pii
is called, which does save to the database. Perhaps this class should be renamed to ProfileFromApplicant
and the build_profile
method should be renamed to create
, with the contents of the create_with_encrypted_pii
from the Profile class? This will keep both the Profile class clean and keep the profile creation in one place.
d4a9f00
to
27fc889
Compare
pii_json = decrypted_pii.to_json | ||
encrypted_pii = encryptor.encrypt(pii_json) | ||
user_session[:encrypted_pii] = encrypted_pii | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't fully understand the purpose of this Cacher class. It seems like in order for the encrypted PII to be cached, the user must have an active profile. But if they already have an active profile, that means they already have encrypted PII in the DB, right? If so, why can't the encrypted PII be fetched directly from the DB without the session as a middleman?
From what I can tell, here are the sequence of events:
- User with an active profile signs in
- Their encrypted PII is fetched from the DB and stored in the session (after_action in users/sessions_controller)
- Since their profile is already active, meaning they've already proofed, their PII is not going to be modified in this session
- The encrypted PII is then fetched from the session and decrypted in 2 places: 1) the ProfileController in order to display to the user their verified attributes, 2) in SamlIdpController to assert attributes back to the SP
If this is correct, it looks like the caching is not necessary since the data is already available in the DB.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The PII may only be decrypted from the db with the user's passphrase. Since we only have the passphrase available to us at log in time, we must decrypt it then, and then store it re-encrypted w/o the passphrase for later use.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's hard to see that in this PR because there is no actual encryption happening. All the crypto was yanked out to simplify the review process.
In a subsequent PR, the passphrase will actually be used for encryption, and there will be different encryption methods used for storage in the db vs storage in the session.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes. I keep forgetting about that. I think some comments at the top of this class explaining why it's needed would be helpful.
@@ -86,6 +92,9 @@ production: | |||
participate_in_dap: 'false' | |||
password_pepper: 'rake secret' | |||
password_strength_enabled: 'true' | |||
pii_passphrase: 'trust-but-verify' | |||
pii_server_cek: 'rake secret 256 bits' | |||
pii_signing_passphrase: 'trust-but-verify' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These 3 keys don't seem to be used in this PR. Move to another PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup, that will go into a later PR. good catch. yanked in 61f6f79
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh shoot, reviewed this earlier but forgot to submit
ssn: applicant.ssn, | ||
phone: applicant.phone | ||
) | ||
def self.create_with_encrypted_pii(user, plain_pii, password) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
thoughts on using named args here? I find them very helpful when dealing with more than 1 arg
@@ -15,10 +15,11 @@ class AttributeAsserter | |||
:phone | |||
].freeze | |||
|
|||
def initialize(user, service_provider, authn_request) | |||
def initialize(user, service_provider, authn_request, decrypted_pii) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
named args?
@@ -52,6 +55,8 @@ def complete_profile | |||
profile.verified_at = Time.zone.now | |||
profile.vendor = vendor | |||
profile.activate | |||
# move from temp slot to 'permanent' slot | |||
user_session[:encrypted_pii] = session.delete(:encrypted_pii) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
put this in a method named move_from_temp_to_permanent
instead of having comment?
@@ -0,0 +1,32 @@ | |||
module Pii | |||
Attributes = Struct.new( | |||
:first_name, :middle_name, :last_name, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rubocop might freak out if we do this, but I find lists like this easier to read when each arg is on its own line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
agreed on both counts. I dislike fighting with rubocop more than I dislike grouped args.
decrypted_pii = profile.decrypt_pii(password) | ||
pii_json = decrypted_pii.to_json | ||
encrypted_pii = encryptor.encrypt(pii_json) | ||
user_session[:encrypted_pii] = encrypted_pii |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it seems like this method is doing more than save
-- would it make sense to put parts of this in other (mayve private) methods so it's clearer what is happening?
@@ -28,7 +28,7 @@ h2.pb0.heading = t('idv.titles.session.basic') | |||
.mt0.mb3.pb-tiny.border-bottom.border-teal = t('profile.index.dob') | |||
= f.input :dob, label: t('idv.form.dob'), required: true, | |||
pattern: '(0[1-9]|1[012])/(0[1-9]|1[0-9]|2[0-9]|3[01])/[0-9]{4}', | |||
input_html: { class: 'dob', value: idv_profile_form.dob.try(:strftime, '%m/%d/%Y'), | |||
input_html: { class: 'dob', value: idv_profile_form.dob, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why no more strftime
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
because dob
is no longer a DateTime object. It is a String
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Left some smaller comments but nothing that blocks PR approval. I will hold off on merging to give @monfresh a chance to re-look. 🌵
@@ -68,5 +76,13 @@ def submit_applicant | |||
idv_attempter.increment | |||
resolution | |||
end | |||
|
|||
def password |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
super minor detail, but I usually put methods in order of dependencies (so password
below where it is used, which here is valid_password?
). I find this makes diffs more readable because you read from top to bottom and see why each thing is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in f6fffdb
|
||
def initialize(params, user) | ||
@user = user | ||
profile.attributes = params.select { |key, _val| respond_to?(key.to_sym) } | ||
params.each do |key, value| |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I might consider putting this activity into a method rather than initialize
-- seems like a lot of data parsing for calling .new
All I usually expect to see in initialize
is ivar assignment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in f6fffdb
self.authn_request = authn_request | ||
URI_PATTERN = Saml::Idp::Constants::REQUESTED_ATTRIBUTES_CLASSREF | ||
|
||
def initialize(options) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ah so you'd rather use hash than ruby named args? I kind of prefer named args but not a big deal: https://robots.thoughtbot.com/ruby-2-keyword-arguments#keyword-arguments-vs-options-hash
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I agree, am a big fan of named arguments, then we get bonuses like errors when there are extra/typoed arguments
Since we know we're on a recent version of Ruby and AFAIK are not planning on switching platforms (ex JRuby) I think there's a huge upside to named arguments
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in f6fffdb
allow(subject).to receive(:confirm_idv_steps_complete).and_return(true) | ||
allow(subject).to receive(:confirm_idv_attempts_allowed).and_return(true) | ||
idv_session.params = user_attrs.merge(phone_confirmed_at: Time.zone.now) | ||
allow(subject).to receive(:idv_session).and_return(idv_session) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this is a good example of why before
and let
statements are tricky. When reading this diff, I have no idea what idv_session
is, am just trusting it's some thing (factory? mock? stub? string?) above in the spec file ☘
If the test setup were all in each test, there'd be more duplication but I could see what was going on much more easily
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
happy to see a refactor of this, maybe in a separate PR?
profile = Idv::ProfileFromApplicant.create(applicant, current_user, password) | ||
self.profile_id = profile.id | ||
cache_encrypted_pii(password) | ||
profile | ||
end |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
From what I can tell, the return value (the profile) is not used when this method is called. Perhaps the profile_id assignment should be in its own method: set_applicant_profile_id
, then instead of calling idv_session.profile_from_applicant(idv_session.applicant, password)
in init_questions_and_profile
, you would call both of these methods:
idv_session.set_applicant_profile_id(idv_session.applicant, password)
idv_session.cache_encrypted_pii(password)
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
done in f6fffdb
thanks for all the comments. I think I've hit everything. @monfresh comments added to Cacher in 652f01386bffb28bda061f560f7cc9dbd4bfd406 |
0anypRH3z3wSpO1R/5qG1YUZOQlr46CSCUxwOv0RuPMmyAxAvvEzlsVb2CwikCGD | ||
O2L4KQ+tOCBfjFf+OYcxV3Z3WrasR77xfgCxA3x2ePlpQSzeHKt3pXy229URMGl3 | ||
GCJlM8uk66/rwARsfIDVrDYy5QzYBBWYrBIqCKzyGQn023+sW8kejRSTqF525CBt | ||
-----END RSA PRIVATE KEY----- |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this key used in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no but it will be in the next PR
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I will remove it here
@@ -3,5 +3,7 @@ class ProfileController < ApplicationController | |||
layout 'card_wide' | |||
|
|||
def index | |||
cacher = Pii::Cacher.new(current_user, user_session) | |||
@decrypted_pii = cacher.fetch |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Every time the profile is accessed, and every time a SAML request is made, this will decrypt the PII and create a hash from them. Instead of repeating this action on every request, would it make sense to store the decrypted PII in the user_session so that it can be reused?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it's never ok to store PII decrypted.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
subsequent PRs will encrypt the entire session as well.
LGTM. Wanna squash? |
**Why**: All PII must be encrypted at rest. Storing attributes in a single JSON blob makes this much easier. **How**: This change does not implement any actual encryption. That will follow in a separate change. This change includes: * schema modifications * user flow for collecting passphrase at IdV review * internal flow for decrypt PII at login for use in session
e907a03
to
0c0e30b
Compare
@monfresh all squashed. |
This is part 2 of #498
Please review and merge #598 first to reduce the changes here.
Why: All PII must be encrypted at rest.
Storing attributes in a single JSON blob makes this much easier.
How:
This change does not implement any actual encryption. That will follow
in a separate change.
This change includes: