Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Unify password generation across FreeIPA #317

Closed
wants to merge 2 commits into from

Conversation

stlaz
Copy link
Contributor

@stlaz stlaz commented Dec 7, 2016

When installing FreeIPA in FIPS mode I noticed that there were often different ways of generating passwords in different spots raising the same issue with password requirements. Handling password generation at one centralized spot should allow us handle any password requirements issues at this very spot.

@MartinBasti MartinBasti self-assigned this Dec 7, 2016
@MartinBasti
Copy link
Contributor

NACK

You replaced os.random() by ipa_generate_password, but ipa_generate password does not generate random bytes but random printable characters (entropy--) so you have to recalculate a new password length accordingly or edit ipa_generate_password function to generate random bytes.

Also I noticed you removed base64encoding, are you sure that places where it was used can handle all bytes characters (nonprintable, etc)? I would stay with base64 there.

@MartinBasti
Copy link
Contributor

Please replace this by something sane,

return sha1(ipautil.ipa_generate_password()).hexdigest()

security by obscurity worked well in Roman empire, but now please generate directly password with entropy 128bits without using sha1

@stlaz
Copy link
Contributor Author

stlaz commented Dec 8, 2016

The passwords should have around the same entropy now. SHA-1 actually produces 160bit outputs (hence 40-characters long hexadecimal digests), so I recounted it for 20-bytes entropy.

As ipa_generate_password creates passwords of only printable characters (and a space) by default, base64 should not be a requirement here. However, a space could be a problem somewhere I guess, should it be removed as well from the defaul behavior of the password generator?

@MartinBasti
Copy link
Contributor

The passwords should have around the same entropy now. SHA-1 actually produces 160bit outputs (hence 40-characters long hexadecimal digests), so I recounted it for 20-bytes entropy.

Sure, my bad

As we discussed offline, due NSS FIPS requirements, we should get rid off base64 encoding. I wouldn't remove space from there, can you check if NSS supports it?

@stlaz
Copy link
Contributor Author

stlaz commented Dec 8, 2016

NSS does support spaces in its passwords it seems. My hopes are that HTTP will be able to understand spaces in its password.conf file.

@stlaz
Copy link
Contributor Author

stlaz commented Dec 8, 2016

Apparently, spaces are ok even in HTTP password.conf so I guess we can leave it there.

@simo5
Copy link
Contributor

simo5 commented Dec 8, 2016

@stlaz SHA-1 DOES NOT add entropy at all, you need the right number of bits in INPUT for whatever trasformation you use.
@mbasti-rh in what way FIPS is incompatible with base64 encoding ?
@stlaz spaces may cause issues in some places where passwords are stored in files or passed (annoyingly) as shell arguments, soit is safer to avoid them in the final output, and given the way the code deal with space that would also simplify the random generator and avoid the bias on 1st and last charcter of the password.

@tiran
Copy link
Member

tiran commented Dec 8, 2016

Please don't use a hack like sha1() to turn a random byte sequence into a hex value. At best sha1 keeps the entropy of the input. I also don't like the fact that the function only cares about the length of the output. The actual length is irrelevant. We care about the entropy of the output.

Let's drop pwd_len and apply proper math instead:

import math
import random
import string

alnum = string.ascii_letters + string.digits
sysrandom = random.SystemRandom()  # uses os.urandom() as RNG

def mkpasswd(entropy_bits=128, symbols=alnum):
    length = int(math.ceil(entropy_bits / math.log(len(symbols), 2)))
    return ''.join(sysrandom.choice(symbols) for _ in range(length))

@stlaz
Copy link
Contributor Author

stlaz commented Dec 8, 2016

@simo5 I was actually trying to get rid of SHA-1 and I am aware that entropy will not be raised, that part of the code draw a smile on some of our faces here, really :)
As for the spaces, I did not encounter issues with them in password.conf files which is awesome but I agree they're potentially dangerous. However, removing them from default set of password chars would not make our life easier as the check would have to stay there in case someone passes them as a possible character as an argument to ipa_generate_password (although they should probably know what they're doing, right?).
We may be able to get rid off the characters argument should the cases where it's used are found invalid though (currently in host, user passwords and in dnskeysync).
@tiran Regarding sha1 - did you see the patch? ;) However I agree that the length is not a good argument for password-generating function, I will have a look at transforming it to entropy.

f = os.open(pwd_file, os.O_CREAT | os.O_RDWR)
os.write(f, hex_str)
os.write(f, password)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's with this ugly code? Why does it use a file descriptor instead of a proper file and context manager?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not really sure.

@simo5
Copy link
Contributor

simo5 commented Dec 8, 2016

We may need a max length argument if we are dealing with some stuff that has issues with more then max length caracters ... In that case we can warn (or raise, we'll have to decide) not enough entropy will be available is max length is not sufficient to hold the desired entropy.

@tiran
Copy link
Member

tiran commented Dec 8, 2016

@stlaz Your patch looks good. My comment regarding SHA1 was aimed at comment #317 (comment) . The suggestion of SHA1 is a Verschlimmbesserung (improvement for the worse) of the current code.

I studied the implementation ipa_generate_password. The special cases for white space makes it more complicated. If you combine @simo5 's suggestion and my function, you can write the function in like 6 to 7 lines of simple code. It might be good idea to use only alpha numeric chars, too. #!='"%${}.?* have special meaning in bash, C, ini files etc.

@MartinBasti
Copy link
Contributor

@tiran @simo5 If you read my comments properly I was happy with removing sha1() and I pointed out that ipa_generate_password() must generate entropy 160bits as was probably originally aimed by using sha1()

@simo5 I'm fine with removing space then

@simo5 Standa found out that when FIPS is enabled NSS is not willing to accept some password, it requires some special chars AFAIK @stlaz knows details

@tiran I'm afraid we need to keep special chracters there as I mentioned above ^

@tiran thank you for nice code snippet

@tiran AFAIK you misunderstood my comment, I wanted to "replace sha1 with something sane" or I don't understand what is wrong with my comment.

@tiran
Copy link
Member

tiran commented Dec 8, 2016

@mbasti-rh I probably misunderstood your intention. I read your comment as "Replace it with something sane, the sane thing is sha1".

By the way I'm currently tangled up in a twitter discussion about Python's new secrets module and entropy. The module doc has a nice recipe to generate passwords with special properties https://docs.python.org/3.6/library/secrets.html#recipes-and-best-practices . I asked a friend of mine and real (tm) cryptographer about entropy for black box tokens. He told me

128 if you don't care about quantum computing; 256 if you do

@tiran
Copy link
Member

tiran commented Dec 8, 2016

#!/usr/bin/python3
import math
import random
import string


class TokenGenerator(object):
    """Simple, tunable token generator

    TokenGenerator(uppercase=3, lowercase=3, digits=0, special=None)

    At least 3 upper and 3 lower case ASCII chars, may contain digits, no
    special chars.

    128 bits entropy: secure
    256 bits of entropy: secure enough if you care about quantum computers
    """
    uppercase = frozenset(string.ascii_uppercase)
    lowercase = frozenset(string.ascii_lowercase)
    digits = frozenset(string.digits)
    # without: = # ' " \ `
    special = frozenset('!$%&()*+,-./:;<>?@[]^_{|}~')

    def __init__(self, uppercase=1, lowercase=1, digits=1, special=1):
        self.rng = random.SystemRandom()
        self.requirements = dict(
            uppercase=uppercase,
            lowercase=lowercase,
            digits=digits,
            special=special
        )
        chars = set()
        for symclass, req in self.requirements.items():
            if req is not None:
                chars.update(getattr(self, symclass))
        self.chars = tuple(chars)

    def __call__(self, entropy_bits=128):
        length = int(math.ceil(entropy_bits / math.log(len(self.chars), 2)))
        while True:
            token = ''.join(self.rng.choice(self.chars) for _ in range(length))
            tokenset = set(token)
            token_ok = True
            for symclass, req in self.requirements.items():
                if req is None or req <= 0:
                    continue
                reqchars = getattr(self, symclass)
                if len(tokenset.intersection(reqchars)) < req:
                    token_ok = False
                    break
            if token_ok:
                return token


if __name__ == '__main__':
    pwgen = TokenGenerator(special=None)
    for i in range(100):
        print(pwgen())

@MartinBasti
Copy link
Contributor

@tiran IMO you need check length > uppercase + lowercase + num + special, otherwise infinity loop
but generally LGTM

@tiran
Copy link
Member

tiran commented Dec 9, 2016

@mbasti-rh uppercase + lowercase + num + special should be limited to a sensible value. A large value invalidates the formula that calculates the length of the token. As far as I remember my math, formula assumes a uniform distribution of distinct values. Additional restrictions reduce the sample space of the result. I'm not clever enough to come up with an algorithm to calculate the length with additional restrictions. My gut feeling tells me that less than 15% per character class (3 for upper/lower case and symbols, 1 for digit) should be ok.

@MartinBasti
Copy link
Contributor

MartinBasti commented Dec 9, 2016

@tiran True, I would like to have there at least assert witch prevents devs to use more than let say 15% per class and more than 50% together, to prevent silly mistakes. It will also reduces possibility of starvation when too many passwords don't match requirements

Do you want to send this as PR or are you fine by adding it to this PR by Standa?

@pspacek
Copy link
Contributor

pspacek commented Dec 9, 2016

Guys, I'm confused. What exactly is the purpose of TokenGenerator? The docstring does not explain to me what is relation between arguments in __init__ and __call__. Of course I can guess but this should be clearly defined first.

If we are clever enough we can get away with the while loop and make it deterministic.

@MartinBasti
Copy link
Contributor

It generates random tokens than can be used as:

  • passwords
  • anything else that should be random

It is written in class docstring

Yes we can randomly generate strings for each class with specified length, then generate randomly the rest, and finally make order of characters random. However I'm not sure if this is from crypto point of view equal to generate random string at once and then check validity, there can be something we don't see, but probably not.

@pspacek
Copy link
Contributor

pspacek commented Dec 9, 2016

@mbasti-rh You are missing the point and thus do not answer my question: The docstring does not tell anything about relation of 'entropy' and the output. What is the relation?

Does it assume that attacker knows init parameters of TokenGenerator? Or not? How can we do analysis without knowing threat model first? Does entropy mean that the output string simply codes xxx bits of entropy, or does it mean that attacker has to guess xxx bits of entropy? That should be spelled out.

I would argue that for any IPA-internal passwords we must assume that attacker knows the input parameters because he can easily read the source code.

@stlaz
Copy link
Contributor Author

stlaz commented Dec 12, 2016

Correct me if I'm wrong here but I believe we're going for the scenario where the attacker has to guess the xxx bits of entropy and they know that they have to do it. We're not actually coding xxx bits of entropy as we need more entropy bits to get a sufficient result (hence length = int(math.ceil(entropy_bits / math.log(len(self.chars), 2))).
However! To the very first question of yours - unfortunately, there is a very small relation between the arguments in __init__ and __call__ as @tiran says:

I'm not clever enough to come up with an algorithm to calculate the length with additional restrictions. My gut feeling tells me that less than 15% per character class (3 for upper/lower case and symbols, 1 for digit) should be ok.

From the code you can see that if a certain class of characters should not be used, it's not accounted for in the calculation of the final length of the password but that's about it - if a further restriction is made (>1 character of the give character class), this restriction is also not accounted for. But since we're the ones who'll be using this token generator, I think we could live with this. There should be a warning in a docstring somewhere, though.

edit: Just realized - the code is wrong, the restriction to a certain class == None should just mean that the characters from the given class could but don't have to appear in the password (thus still need to be accounted for), the restriction of a certain class == 0 should mean the character should not appear in the password and should not be accounted for in the length calculation.

@pspacek
Copy link
Contributor

pspacek commented Dec 12, 2016

The main problem here is that we are mixing two approaches together, i.e. entropy specification using bits + specification using character classes etc. which used to be means of expressing entropy requirements in a way understandable by ordinary users.

If I understand it correctly, the encoding here is just to please password-quality checkers because the real password strength should be provided by the entropy parameter.

So I propose to use character classes only for encoding but not during generation. That should simplify the code and make it easier to understand.

@pspacek
Copy link
Contributor

pspacek commented Dec 12, 2016

Talk is cheap so here is the code!

import math
import string
import random


class TokenGenerator(object):
    """Tunable token generator."""
    # without: = # ' " \ `
    _special = '!$%&()*+,-./:;<>?@[]^_{|}~'
    def_charsets = {
                     'uppercase':
                     {'chars': string.ascii_uppercase,
                      'entropy': math.log(len(string.ascii_uppercase), 2)},
                     'lowercase':
                     {'chars': string.ascii_lowercase,
                      'entropy': math.log(len(string.ascii_lowercase), 2)},
                     'digits':
                     {'chars': string.digits,
                      'entropy': math.log(len(string.digits), 2)},
                     'special':
                     {'chars': _special,
                      'entropy': math.log(len(_special), 2)},
            }

    def __init__(self, uppercase=0, lowercase=0, digits=0, special=0,
                 min_len=0):
        """Specify character contraints on generated tokens.

        Integer values specify minimal number of characters from given
        character class and length.
        Value False prevents given character from appearing in the token.

        Example:
        TokenGenerator(uppercase=3, lowercase=3, digits=0, special=False)

        At least 3 upper and 3 lower case ASCII chars, may contain digits,
        no special chars.
        """
        self.rng = random.SystemRandom()
        self.min_len = min_len
        self.req_classes = dict(
            uppercase=uppercase,
            lowercase=lowercase,
            digits=digits,
            special=special
        )

        self.todo_charsets = self.def_charsets.copy()
        # 'all' class is used when adding entropy to too-short tokens
        # it contains characters from all allowed classes
        self.todo_charsets['all'] = {'chars': ''.join(
                [charclass['chars']
                 for charclass_name, charclass
                 in self.todo_charsets.items()
                 if self.req_classes[charclass_name] is not False]
            )}
        self.todo_charsets['all']['entropy'] = math.log(
                len(self.todo_charsets['all']['chars']), 2)

    def __call__(self, req_entropy=128):
        """Generate token containing at least req_entropy bits.

        req_entropy is minimal number of entropy bits attacker has to guess:
        128 bits entropy: secure
        256 bits of entropy: secure enough if you care about quantum computers

        The generated token will fulfill containts specified in init.
        """
        todo_entropy = req_entropy
        password = ''
        # Generate required character classes:
        # The order of generated characters is fixed to comply with check in
        # NSS function sftk_newPinCheck() in nss/lib/softoken/fipstokn.c.
        for charclass_name in ['digits', 'uppercase', 'lowercase', 'special']:
            charclass = self.todo_charsets[charclass_name]
            todo_characters = self.req_classes[charclass_name]
            while todo_characters > 0:
                password += random.choice(charclass['chars'])
                todo_entropy -= charclass['entropy']
                todo_characters -= 1

        # required character classes do not provide sufficient entropy
        # or does not fulfill minimal length constraint
        allchars = self.todo_charsets['all']
        while todo_entropy > 0 or len(password) < self.min_len:
                password += random.choice(allchars['chars'])
                todo_entropy -= allchars['entropy']

        return password


if __name__ == '__main__':
    pwgen = TokenGenerator()
    for i in range(100):
        print(pwgen(256))

This code deterministically generates passwords. If character constraints are specified, the code might generate slightly longer passwords than the brute-force method. For example, 256 bit password with FIPS-compliant constrains (3 character classes) the difference is 41 vs. 40 characters. Given this different, I think that determinism trumphs shorter passwords.

Also, I think that it does not make sense to have req_entropy parameter in __call__. IMHO it makes sense to specify it along with other constrains in __init__.

@stlaz stlaz force-pushed the password_unify branch 2 times, most recently from 428a158 to 17b08d5 Compare December 21, 2016 14:39
@pspacek pspacek added the ack Pull Request approved, can be merged label Dec 21, 2016
@pspacek
Copy link
Contributor

pspacek commented Dec 21, 2016

Works for me, server installation including DNSSEC worked fine.

@MartinBasti MartinBasti removed the ack Pull Request approved, can be merged label Dec 21, 2016
@MartinBasti
Copy link
Contributor

I would like to review this as well, so removing ACK to prevent pushing this to master

@MartinBasti MartinBasti added the ack Pull Request approved, can be merged label Jan 5, 2017
@MartinBasti
Copy link
Contributor

PR needs rebase

stlaz and others added 2 commits January 6, 2017 09:12
Also had to recalculate entropy of the passwords as originally,
probability of generating each character was 1/256, however the
default probability of each character in the ipa_generate_password
is 1/95 (1/94 for first and last character).

https://fedorahosted.org/freeipa/ticket/5695
A change to the algorithm that generates random passwords
for multiple purposes throught IPA. This spells out the need
to assess password strength by the entropy it contains rather
than its length.

This new password generation should also be compatible with the
NSS implementation of password requirements in FIPS environment
so that newly created databases won't fail with wrong authentication.

https://fedorahosted.org/freeipa/ticket/5695
@stlaz
Copy link
Contributor Author

stlaz commented Jan 6, 2017

I don't see any merge conflicts and the rebase was automatic so I don't see why, but ok. Just note that ipatool may be confused with me commiting @pspacek's commit as he was the author of the main code and I put it to work with the rest of IPA.

@MartinBasti MartinBasti added the pushed Pull Request has already been pushed label Jan 6, 2017
@MartinBasti MartinBasti closed this Jan 6, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
ack Pull Request approved, can be merged pushed Pull Request has already been pushed
Projects
None yet
5 participants