Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Pbkdf2 #225

Merged
merged 12 commits into from
Mar 28, 2019
Merged

Pbkdf2 #225

merged 12 commits into from
Mar 28, 2019

Conversation

0golovatyi
Copy link
Contributor

  • Password hashing is done with PBKDF2
  • Added utility to add/update accounts
  • Updated documentation

@pep8speaks
Copy link

pep8speaks commented Mar 27, 2019

Hello @0golovatyi! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found:

There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻

Comment last updated at 2019-03-28 17:13:39 UTC

docs/server-config.md Outdated Show resolved Hide resolved
docs/server-config.md Outdated Show resolved Hide resolved
tabpy-server/tabpy_server/app/app.py Outdated Show resolved Hide resolved
tabpy-server/tabpy_server/app/app.py Outdated Show resolved Hide resolved
tabpy-server/tabpy_server/app/util.py Outdated Show resolved Hide resolved
tabpy-server/tabpy_server/app/util.py Show resolved Hide resolved
utils/user_management.py Outdated Show resolved Hide resolved
utils/user_management.py Outdated Show resolved Hide resolved
utils/user_management.py Outdated Show resolved Hide resolved
utils/user_management.py Show resolved Hide resolved
Copy link
Contributor

@WillAyd WillAyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Couple minor things on code.

Without knowing what discussions have been had internally are you sure you want to handle security in this way? By no means a security expert myself but seems kind of risky to have multiple users' credentials stored in a file - someone with relative ease could evaluate code in TabPy that gets access to all of those passwords and could I think brute force cracking them, especially since the salt is readily available

return True


def generate_password(pwd_len=16):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since your min Python version is 3.6 you probably want to use the secrets module instead of rolling this on your own

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I checked https://docs.python.org/3/library/secrets.html and it shows how to generate a password in the same way we do in this function

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea I was just thinking you could use token_safe unless you really cared to control the contents here. Simplifies code

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I read the documentation correctly token_safe returns base64 encoded token - there won't be punctuation characters in it.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is URL-safe punctuation. You can see the PEP for the module here:

https://www.python.org/dev/peps/pep-0506/#id42

Basically the whole point of the module is to prevent people from using random to generate passwords, which is what we are doing here. Random was just kept for backwards compatibility

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Got it (I think), pushing updated code.

# add tabpy-tools and tabpy-server folders to
# PYTHONPATH so code from there can be found when
# modules are imported
for dir_ in {'tabpy-tools', 'tabpy-server'}:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there a reason for this? Generally modifying the path like this from within code makes it generally more difficult to debug.

At the very least I don't think you need to add the tools package here as you aren't importing from it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main reason we want for users to be able to run the tool from repo root folder and for that we need to have at least tabpy-server in PYTHONPATH as we reuse some code from it. As for tabpy-tools it is not needed now but may be needed in future - server code still uses some types from tools package :(

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Not super important for now since this is just a util file but hoping we can effectively decouple the tools and server packages. Otherwise we should just bundle them together in code base

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, believe me, we are constantly discussing how we can setup process more robust and platform independent.
There is a list of options but neither of them is going to work smoothly for all of the scenarios people use TabPy with.

elif args.command == 'update':
return update_user(args, credentials)
else:
logger.error('Uknown command "%s"' % args.command)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably best to just use string formatting at this point instead of Py2 syntax

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. Somehow I keep forgetting about the recommended way for string formatting. Thanks for catching these!

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yep. By the way since you are on 3.6 you could simplify even further with formatted string literals, so just:

f'Unknown command {args.command}'

Don't need to do it here but there's an open issue about it

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great, will try to remember that.

credentials : dict
Credentials from the file. Empty if succeeded is False.
'''
logger.info('Parsing passwords file %s...' % pwd_file_name)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same thing on str formatting

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

password = 'password'
cls.pwd_file.write('{} {}\n'.format(
username,
hash_password(username, 'password')))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Guess it doesn't matter a lot here but you probably mean to reference the local variable rather than literal string in this expression

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

TY! fixed.

@0golovatyi
Copy link
Contributor Author

@WillAyd Answering your question about how secure is the password file. It is not, and we are aware of it. We are discussing some future improvements where we engage real authentication service instead of implementing our own. What we have here is more for auditing (so TabPy can log what account requested what operation).

However we considered security questions you raised and for now this is what our answer is:

  • Using PBKDF2 for hashing makes it significantly harder to recover a password comparing with SHA (times if not powers).
  • For the password file, we believe hosting OS mechanisms should be used to restrict access to files with sensitive information. I'll update the documentation with the recommendation.

# add tabpy-tools and tabpy-server folders to
# PYTHONPATH so code from there can be found when
# modules are imported
for dir_ in {'tabpy-tools', 'tabpy-server'}:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Gotcha. Not super important for now since this is just a util file but hoping we can effectively decouple the tools and server packages. Otherwise we should just bundle them together in code base

return True


def generate_password(pwd_len=16):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea I was just thinking you could use token_safe unless you really cared to control the contents here. Simplifies code

credentials[login] = row[1]
logger.debug('Found username {}'.format(login))

return True, credentials
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So back to previous conversation around security, I think the README updates you made are great! But the problem is that regardless of OS permissions a user could technically still deploy a function to import and execute this function to return back the credentials of all users so long as they can make a reasonable guess at the file name (which probably wouldn't be that hard).

I think it would really help security a lot if credentials were never returned here nor exposable through anything that exists in the global namespace of any module

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, it's an architectural problem caused by TabPy executing user scripts in the same context where the Web server runs.

It is not that bad as we are dealing with hashes of passwords and TabPy never knows actual passwords. Another limiting factor for an attacker is he/she needs to know at least one pair of login/password to be able to evaluate a script. And since all the users are the same to TabPy as it is implemented now (we don't have any authorization or resources restrictions) knowing additional login/password won't give an attacker any benefit.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea for sure. So my point is that it's just pretty easy for someone to get a hold of these hashed passwords, and that makes things difficult but obviously not impossible to target a particular user (say CEO of company) and try to match the hash with their own computing resources.

It would be a much better design if there was a database locked with a password available only to the deployed server which stored passwords and would be checked against per user on demand rather than having them persist in the memory space of Python

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, but it also could be configuration nightmare if we require people to configure a DB and access to it considering TabPy setup itself can be challenging for some platforms.

Again, even with using basic access authentication here we see it more like auditing mechanism at the moment.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yea makes sense. SQLite would be pretty easy to leverage here for future reference

Agreed it's an improvement over what's in place regardless

Copy link
Contributor

@johng42 johng42 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM modulo Will's comments

@@ -2,6 +2,9 @@

The following security issues should be kept in mind as you use TabPy with Tableau:

- REST server and Python execution context are the same meaning they share
Python session, e.g. HTTP requests are served in the same space where
user scripts are evaluated.
- TabPy currently does not use authentication.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to remove this line

-------
succeeded : bool
True if specified file was parsed successfully.
False if there were ane issues with parsing specified file.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this typo still looks broken - can you verify the change you made is being picked up?

@johng42 johng42 merged commit 3131d32 into dev Mar 28, 2019
@johng42 johng42 deleted the pbkdf2 branch March 28, 2019 18:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants