Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Sensitive Data: Complex Passwords #3150

Closed
pdurbin opened this issue Jun 1, 2016 · 45 comments
Closed

Sensitive Data: Complex Passwords #3150

pdurbin opened this issue Jun 1, 2016 · 45 comments

Comments

@pdurbin
Copy link
Member

pdurbin commented Jun 1, 2016

Enforcing complex passwords is a requirement for storing level 3 data as defined by http://policy.security.harvard.edu/level-3 and the following text comes from http://policy.security.harvard.edu/sa2-complex-passwords as of 2016-06-01:

SA2: Servers and applications that manage passwords must force the setting of a complex password. This must meet the following requirements:

No common names or dictionary words

Include at least one character from at least 3 of these categories:

  • Uppercase letter
  • Lowercase letter
  • Number
  • Special character

Any one of these three length and additional requirements:

10 characters minimum
8 characters minimum and annual password reset/expiration
8 characters minimum and a second authentication factor

On 2016-05-25, we were advised that "The upcoming changes to the password policy will only include a relaxation of complexity rules for passwords of 20 characters or more. Everything else will remain the same." We expect these changes to be published at the URL above by 2016-07-01. The spirit of this change is in line with XKCD cartoon from https://xkcd.com/936/ in which the password is "correct horse battery staple":

password_strength

Again, for passwords of 19 characters or fewer, the requirements above must be applied.

All of these requirements are optional if a particular Dataverse installation has not been put into some sort of "secure" mode such that we plan to put https://dataverse.harvard.edu in before we permit level 3 data to be uploaded. That is to say, the existing password complexity rules should remain (minus the bug reported in pull request #3095) out of the box. Ideally there would be a switch called "level 3" or something to enforce all of the rules above but it would be nice if each Dataverse installation had control over each of the rules separately.

We have control over password complexity for local accounts within Dataverse (stored using bcrypt per #1034 in the builtinuser table) but for Shibboleth accounts we must trust that each Identity Provider enforces decent passwords. Membership with a federation such as InCommon should help in this regard and in Harvard's case http://iam.harvard.edu/resources/incommon-bronze details how passwords are handed in Harvard Identity Provider (IdP):

Harvard’s common identity management system enforces the use of complex passwords of at least 10 characters in length. The password character set includes uppercase and lowercase alphabetic, numeric, and the suite of special characters. The complexity requirement is for at least one uppercase, one lowercase, and one non-alphabetic character in each password. Candidate passwords are also checked against a large dictionary and against directory information about the individual. Both CAS and Active Directory lock a user out for at least 30 minutes after 10 bad password guesses. The University does not require regular password changes, but the University IdP will ensure that passwords used to authenticate individuals for InCommon Bronze SPs have been changed within the last five years, in essence enforcing a five-year password reset requirement that impacts only users of InCommon Bronze SPs. With these conditions, Harvard meets Bronze requirements for basic resistance to guessing. See the appendix for calculations showing the level of guessing resistance.

The page for Harvard's Identity Provider (IdP) above links to https://www.incommon.org/assurance/ and from the Dataverse perspective it's our understanding that all InCommon federation members have sufficient password complexity rules to meet level 3 requirements. In short, there's nothing to do code-wise within Dataverse with regard to passwords that are not stored in the database used by a Dataverse installation. These passwords are stored in each institution's Identity Provider (IdP).

@pdurbin
Copy link
Member Author

pdurbin commented Jun 6, 2016

@lwo was recently looking at this part of the code in his pull request #3095 and because no good deed should go unpunished (kidding!) I pinged him to see if he'd like to take a swing at this issue. He agreed so I assigned it to him but I'll be keeping an eye on this issue as well. @lwo if you'd like me to start a requirements doc for this issue, please just say the word.

To be clear, implementing two factor authentication should be considered out of scope for this issue. To avoid that requirement, all we have to do is make sure passwords are 10 characters in length or greater, when the installation is flipped to "secure mode" or "sensitive data" mode. (Let's not call it "level 3 mode", which is way too Harvard-specific.)

@pdurbin
Copy link
Member Author

pdurbin commented Jul 21, 2016

Today @lwo met over Google Hangouts to discuss this issue. We took a few notes at https://docs.google.com/document/d/1Cq6ud-yydSPxRkTAlp_IS_5iCbBDNNk4goe7Yxp3T5A/edit?usp=sharing

I hadn't really considered that "No common names or dictionary words" necessitates having access to a dictionary! The plan is to ship one with Dataverse, to bundle it with the app, rather than relying on some operating system-specific file such as /usr/share/dict/words on Linux. @lwo has ideas on where to find one.

It would be nice to have some more clarity at http://policy.security.harvard.edu/sa2-complex-passwords or elsewhere about the upcoming changes having to do with 20 characters. We're still using this for guidance: "The upcoming changes to the password policy will only include a relaxation of complexity rules for passwords of 20 characters or more. Everything else will remain the same." Anyway, the plan is to have a flag/setting (one of many stored in the setting table) to turn this on or off.

@lwo will let me know if he has any more questions. Thanks for working on this @lwo!

@lwo
Copy link

lwo commented Jul 24, 2016

Hello @djbrooke,
I better ask you, as @pdurbin should be on his break. For this I like to use a well documented third party library
http://www.passay.org/
which is dual licensed under both the LGPL and Apache 2. Can you confirm I can use this maven dependency to tackle this issue? Without response, I will work under the assumption that this is ok.

@pdurbin
Copy link
Member Author

pdurbin commented Jul 25, 2016

Heh. @djbrooke and I are both back now. Passay looks fine to me, from a quick glance. The website says it's built on vt-password and you may have noticed that in code (PasswordValidator.java) I had already suggested switching to this. Yes, the license looks fine and you are welcome to allow Maven to resolve dependencies. Thanks @lwo!

lwo pushed a commit to IISH/dataverse.old that referenced this issue Aug 1, 2016
lwo pushed a commit to IISH/dataverse.old that referenced this issue Aug 1, 2016
@lwo
Copy link

lwo commented Aug 3, 2016

Hmm... those password lists... I am still unsure about the technical translation from the "No common names or dictionary words" requirement. For example: "Password" and "password123" are commonly used passwords, and we could take a dictionary list from: http://weakpass.com/
to block them. But then, those examples any the large majority of those dictionary words would not pass the validator anyway ( no caps, digits, etc ).

Now "P@$$w0rd" is not on that list and considered valid... but it is a weak password because the character substitution is an open door cipher. It makes more sense to me if we block that. If this is considered the problem, then I suppose we could add to #3227 a stripped down version of the rule based algorithm described here:
https://labs.mwrinfosecurity.com/blog/a-practical-guide-to-cracking-password-hashes
and use it to invalidate common names or dictionary words plus those variations of them that are the function of a simple character substitution cipher. But we should investigate how expensive that is.

@pdurbin
Copy link
Member Author

pdurbin commented Sep 2, 2016

Per @djbrooke the Sensitive Data project is being deferred until Dataverse 5.0 so I'm unassigning myself since I like having https://github.com/IQSS/dataverse/issues/assigned/pdurbin reflect what I'm actually working on. I'll also add the 5.0 milestone.

@lwo as we've discussed, next week we'll at least touch base about the excellent work you've done to address this issue in pull request #3227 but you don't seem to be in any hurry to see this issue addressed for own installation of Dataverse. I appreciate your flexibility on the timing!

@oscardssmith
Copy link
Contributor

Looking at the code, are we making users change passwords regularly? If so, we shouldn't be. Also with regards to dictionary words, The easy answer is to take the lowercase version of the password, turn symbols to their likely letter versions (@->a, 3->e), and check substrings against a dictionary with 1 and 2 letter words removed.

@pdurbin
Copy link
Member Author

pdurbin commented Jun 28, 2017

Sensitive data work isn't happening until Dataverse 5.0. We'll revisit this then. Closing. Thanks for your input, @oscardssmith . 😄

@pdurbin pdurbin closed this as completed Jun 28, 2017
@pdurbin
Copy link
Member Author

pdurbin commented Sep 12, 2017

Todo list as of 34d3beb

  • remove or reduce logging level for "invalid value" messages is server.log. Out of the box, sane defaults are set and it's ok for these settings to not be populated in the database.
  Invalid value for PwMaxLength: null
  Invalid value for PwMinLength: null
  Invalid value for PVNumberOfCharacteristics: null
  Invalid value for :PVNumberOfConsecutiveDigitsAllowed: null
  PwDictionaries not set and no default password file found: weak_passwords.txt
  Dictionary was set, but none was read in.
  Invalid value for PVNumberOfCharacteristics: null
  Invalid value for PVGoodStrength: null
  Invalid value for PwMinLength: null

@dlmurphy
Copy link
Contributor

I just poked around at a working demo of this, and from a design standpoint I think it's good to go. It's a massive improvement over what we started with.

While talking with @matthew-a-dunlap about this issue, we thought of something that would improve the design even further, but we're opting not to implement it right now in order to avoid scope creep on an issue that has already been more complex than expected. But I want to note it here for posterity, as it might be worth pursuing in the future.

When one of the password requirements is "No dictionary words" and a user's password fails on this rule, it would be useful for us to tell the user what word it failed on.

For example:

  • User tries the password"Baseball12A!" and it fails the dictionary rule
  • User alters it to "Boseboll12A!" and it STILL fails because "Bose" is in the dictionary as a proper noun
  • User is now unsure of why that password fails - "those aren't dictionary words!"

If we told the user that the password failed because it included the word "Bose" then they'd more easily be able to adjust their password to make it work.

HarvardKey does this; here's a screenshot Matthew snagged:
screen shot 2017-09-12 at 5 34 36 pm

Again, this is a "nice to have" and not essential, so let's move on for now and consider coming back to this some day in the future.

@pdurbin
Copy link
Member Author

pdurbin commented Sep 13, 2017

I'm not sure I love the idea of part of my password being shown to me in the the clear. There's a reason password input fields show asterisks when you type.

dlmurphy added a commit that referenced this issue Sep 13, 2017
Changed order of password-related sections of config doc to make more
narrative sense.
dlmurphy added a commit that referenced this issue Sep 13, 2017
Took out the references to JVM options from the docs, since we decided
previously that those options weren't needed.
@matthew-a-dunlap
Copy link
Contributor

Some testing notes:
Parts to test:

  • All the rules
  • 3 different pages (sign up, forced reset, manual reset)

Other notes:

  • There are a lot of different configuration options, its probably easiest to just turn them all on and see them all working. These should be well-captured in the documentation.
  • There are added config options to the Harvard config script which should bring the installation in line with the Harvard security needs.

@kcondon kcondon self-assigned this Sep 15, 2017
@kcondon
Copy link
Contributor

kcondon commented Sep 18, 2017

Issues found:

  • Was mentioned by Phil in a previous to do but lots of logging, when loading the page, on error, on success.

@kcondon
Copy link
Contributor

kcondon commented Sep 18, 2017

Tested all rules, 3 pages, config settings, script.

@pdurbin
Copy link
Member Author

pdurbin commented Sep 19, 2017

I reduced logging in e9fb520. I also merged the latest from develop (just the addition of ISSUE_TEMPLATE.md) and re-enabled a related API test having to do with password complexity.

@pdurbin pdurbin removed their assignment Sep 19, 2017
@pdurbin pdurbin modified the milestones: 5.0 - DataTags and Support for Sensitive Data, 4.8 - AWS S3 Support Sep 19, 2017
@kcondon kcondon closed this as completed Sep 19, 2017
@kcondon
Copy link
Contributor

kcondon commented Sep 26, 2017

Tania asked for screenshots of the final implementation. There are 3 pages: 1. sign up, 2. forced password change, 3. account info change password, and 1 view of sign up page where some fields were entered correctly and some incorrectly.
screen shot 2017-09-26 at 5 45 52 pm
screen shot 2017-09-26 at 5 44 19 pm
screen shot 2017-09-26 at 5 44 58 pm
screen shot 2017-09-26 at 5 45 36 pm

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests