Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Login, take two #115

Merged
merged 17 commits into from
Mar 18, 2021
Merged

Login, take two #115

merged 17 commits into from
Mar 18, 2021

Conversation

tsibley
Copy link
Member

@tsibley tsibley commented Mar 8, 2021

Login and logout commands for authenticating with nextstrain.org. This PR lays the foundation for other commands to make authenticated requests to nextstrain.org's APIs, however, currently no commands make such requests.

All the details are in the commit messages. The best overview is 296a82a.

Supersedes #92, where a lot of design discussion took place.

Todo (but not blocking initial review):

  • Support initial login credentials from the environment (NEXTSTRAIN_USERNAME, NEXTSTRAIN_PASSWORD) instead of prompting, to support automated environments like CI builds.

@tsibley tsibley mentioned this pull request Mar 8, 2021
2 tasks
@tsibley tsibley requested a review from a team March 8, 2021 22:18
@trvrb
Copy link
Member

trvrb commented Mar 8, 2021

Thanks Tom! This is amazing. I like this from a user perspective and confirmed it was as intended with my account.

The general strategy seems smart, but I don't have have any expert insight into how to securely manage credentials.

Copy link
Contributor

@huddlej huddlej left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This "worked" for me (as someone who doesn't have a Nextstrain account) in that all the new commands behaved as expected. The code all generally looks good, too, although like Trevor I can't comment on the credential management aspects.

tsibley and others added 17 commits March 11, 2021 14:33
Even a single-level traversal like .parent can cause errors with
relative paths, so best practice is to resolve first.  Resolution isn't
strict since we a) are willing to create the special ".nextstrain"
parent and b) will catch other non-existent paths later on write.
Instead of defaulting to the config file at PATH, refactor the `get()`
and `set()` functions to accept a path argument that defaults to PATH.

See #98 and #99 for some ideas of larger refactorings that would improve
things here.
In future commits, we will be using config to refer to a new file in
~/.nextstrain/. Rename all instances of PATH to CONFIG to remove
ambiguity.
This prevents corruption if one process is reading from the config while
another is writing, if two processes try to write at the same time, etc.
While unlikely to be an issue given the minimal nature of the current
read/write access patterns, this makes the system more robust in
anticipation of more frequent reads and writes due to user
authentication.

The "fasteners" library seemed like the most competent cross-platform
locking library I could find, and it sure beats writing on our on top of
the messy world of platform-specific IPC.

These locks *do not* prevent conflicting access within process, e.g.
between threads, but our only code outside of the main thread doesn't
touch config.  If necessary in the future, we can use the locking
primitives in the core "threading" module to also protect inter-thread
accesses.
Useful for removing things like the credentials we're about to start
storing in a secrets config file.
…in advance of adding additional modules to it.
From commit 909dd491b8690df58e02d949238d6702e0be3f15 (tagged 2021.03.1).

I'll be using this along with the boto3 Cognito client to implement
login to nextstrain.org, instead of using pyCognito or Warrant, which do
a bit more than we need and feel not well-maintained.

Local modifications are forthcoming to integrate this into our project,
but I'm saving those for a separate commit to better track provenance.
We did not copy the pycognito.exceptions module.  Use a new name that
matches our minimal exception-naming conventions so far.
We will always pass this class an existing Cognito client instance.
Unnecessary since we only support Python 3.
…cise

Other AWS services may use other SRP variations.
…words

There are several of them, and it's good for security code to be
explicit.
Username and password credentials are prompted for and exchanged for
authentication tokens which are then stored locally.  These tokens can
be periodically refreshed for ~30 days without requiring another
password prompt, similar to our nextstrain.org web session durations.
A new config file, ~/.nextstrain/secrets, is used to store the tokens so
that it may be protected by restricted file permissions, excluded from
backups/version control, etc. as suitable for each user.

There are three main parts to this functionality:

 1. The login and logout commands, which provide the user interface.
    These are responsible for prompting for credentials and outputting
    nice messaging to the user.

 2. An internal authentication library (nextstrain.cli.authn) used by
    the login/logout commands to make authentication attempts with
    nextstrain.org and transparently manage locally-saved tokens.  This
    library's API is designed to be used by other commands in the near
    future (e.g. "remote upload") which need to assert a user is logged
    in and get tokens for them.

 3. An internal AWS Cognito library that lightly wraps the boto3
    cognito-idp client and integrates an open-source implementation of
    the Secure Remote Password protocol used by Cognito.  It is intended
    only for use by our internal authn library.

First versions of this work used pyCognito in place of our own (3).
Upon closer inspection and consideration, it became clear to me that
pyCognito's approach is a bit sloppy and the API it provides a bit
awkward for our use case and also too easy to use incorrectly when
saving tokens locally.  I started to address some of this by submitting
patches upstream but, after more thinking about how pyCognito did much
more than we really needed, decided it'd be a better fit to implement
what became (2) and (3).  The APIs they expose and their conceptual
split support our use cases better and allow us more control.  If we
later find a better library for interacting with Cognito, we'll be able
to replace (3) with it pretty cleanly.  I did compare Warrant
(pyCognito's progenitor) to see if it was substantially different; it is
not.¹

The initial draft of this work in PR #92 was done by Kairsten Fay with
guidance from myself, but this final version ended up sharing little
with those initial drafts.  Some of the commits from her PR still remain
in this PR as early commits.

¹ https://github.com/tsibley/warrant-vs-pycognito
To support automated environments like CI builds.
@tsibley
Copy link
Member Author

tsibley commented Mar 11, 2021

@trvrb @huddlej Thanks for taking a look! I'm mostly looking for feedback on the UI/UX and overall organization/approach of the code, so glad that seems good. The credentials management itself is comparable to other tools, and I think fine.

I did just repush to remove the group read permissions from the secrets file (leaving it u=rw,go=), as upon further consideration other tools (like ssh) do this as well. The rebase also resolves some conflicts subsequently introduced on master.

My thought is to merge this now (once CI passes again), even though nothing is making use of it yet. Some folks may notice it and wonder about it, but I think that's fine.

@jameshadfield
Copy link
Member

My thought is to merge this now (once CI passes again), even though nothing is making use of it yet. Some folks may notice it and wonder about it, but I think that's fine.

Seems good to me. I'll try out the commands (UI/UX) now if you can hold off for a bit longer...

@tsibley
Copy link
Member Author

tsibley commented Mar 11, 2021

@jameshadfield Yes! There's no rush (other than the nice feeling of merging a bunch of work :-).

@jameshadfield
Copy link
Member

jameshadfield commented Mar 12, 2021

The login/logout/whoami commands worked as expected, and the secrets file was cleared upon logout 💯

UI / UX comments

  • There's probably going to be some confusion from people who have been using nextstrain remote upload... for groups, which use IAM credentials from the environment not cognito usernames, who try to use these credentials as logins. There are few enough groups that this could be managed on a case-by-case basis, as I imagine checking if a username is actually an IAM access key is a bit too much.
  • I've gotten in the habit of running --help-all instead of --help (for the CLI, i'm not aware of other software which uses this split), so it'd be nice if that would work for commands like nextstrain login etc, even if the output was identical to --help.

Install troubles

Perhaps unrelated to this PR as I was upgrading from an older version of the CLI on this machine, but when re-installing the cli (to ensure deps up to date) I got the following warning:

$ pip install -e .[dev]
...
botocore 1.19.52 has requirement urllib3<1.27,>=1.25.4; python_version != "3.4", but you'll have urllib3 1.24 which is incompatible.

Which then resulted in a broken CLI:

$ nextstrain --help
Traceback (most recent call last):
  File "/Users/naboo/miniconda3/envs/nextstrain-cli/lib/python3.7/site-packages/pkg_resources/__init__.py", line 578, in _build_master
    ws.require(__requires__)
  File "/Users/naboo/miniconda3/envs/nextstrain-cli/lib/python3.7/site-packages/pkg_resources/__init__.py", line 895, in require
    needed = self.resolve(parse_requirements(requirements))
  File "/Users/naboo/miniconda3/envs/nextstrain-cli/lib/python3.7/site-packages/pkg_resources/__init__.py", line 786, in resolve
    raise VersionConflict(dist, req).with_context(dependent_req)
pkg_resources.ContextualVersionConflict: (urllib3 1.24 (/Users/naboo/miniconda3/envs/nextstrain-cli/lib/python3.7/site-packages), Requirement.parse('urllib3<1.27,>=1.25.4; python_version != "3.4"'), {'botocore'})

The piplock pins this to urllib3==1.26.3 but since I don't use pipenv I ended up added the following to setup.py to get things working

        "requests == 2.25.1", # previously unpinned which gave me v2.20 which demands urllib3 < 1.24
        "urllib3 == 1.26.3", # same as in pip.lock

@tsibley
Copy link
Member Author

tsibley commented Mar 13, 2021

There's probably going to be some confusion from people who have been using nextstrain remote upload... for groups, which use IAM credentials from the environment not cognito usernames, who try to use these credentials as logins. There are few enough groups that this could be managed on a case-by-case basis, as I imagine checking if a username is actually an IAM access key is a bit too much.

Agreed there may be confusion here for the current "early adopters" of Groups.

Agreed it's likely too much to call the AWS STS GetAccessKeyInfo method for each username given (and doing so would presume available AWS credentials), but it'd be plausible to add a heuristic based on the given username matching something like /^A[KS]IA[A-Za-z0-9]{12,}$/.

That said, I'm not sure it's worth the extra effort here, for what would probably be a one-time confusion for at max a dozen or two people. It'd be fairly easy to add, I think, so maybe even a little confusion-avoided is worth it. Thoughts?

I've gotten in the habit of running --help-all instead of --help (for the CLI, i'm not aware of other software which uses this split), so it'd be nice if that would work for commands like nextstrain login etc, even if the output was identical to --help.

Mmm, that's a reasonable request. Good idea and doable. I've opened #117.

(The --help / --help-all split is one I've encountered a few times over the years (sometimes with a naming variation), though I don't have examples on-hand to point to. I would like to keep it, as it simplifies the interface for most users.)

Perhaps unrelated to this PR as I was upgrading from an older version of the CLI on this machine, but when re-installing the cli (to ensure deps up to date) I got the following warning:

Can you provide:

  1. The older version of the CLI you were upgrading from?
  2. The version of pip in your Conda env?

The transitive dep resolution around botocore is more than a bit fraught because of some of the choices of our direct dependencies, but I haven't seen this particular issue before and can't reproduce locally with some basic attempts.

@tsibley
Copy link
Member Author

tsibley commented Mar 13, 2021

It seems as if pip didn't want to upgrade an existing package despite a req for a newer version, which is confusing. I wonder if you needed to pass the --upgrade flag to pip install to allow it to upgrade transitive deps? (While I know how --upgrade affects direct packages, I'm not sure of the exact impact on transitive deps.)

@jameshadfield
Copy link
Member

Re: confusion of early adopters. I think this can be overcome with clear documentation, rather than code changes. When we publicise this, presumably when the first functions start to take advantage (nextstrain remote?), we can make a special note of this. If some people make the mistake, then we can deal with it on a case-by-case basis.

Re: install issues. This was in a rather out-of-date conda environment, and I don't mean for it to block merging of this PR at all. However if we run into these things then I imagine others will too! I was using a conda environment with python 3.7, pip 18.1, nextstrain-cli v1.x (don't remember which exactly, perhaps 1.16).

@tsibley tsibley merged commit 1f7c233 into master Mar 18, 2021
@tsibley tsibley deleted the login-take-two branch March 18, 2021 20:45
@tsibley
Copy link
Member Author

tsibley commented Mar 18, 2021

I think this can be overcome with clear documentation, rather than code changes.

👍

However if we run into these things then I imagine others will too! I was using a conda environment with python 3.7, pip 18.1, nextstrain-cli v1.x (don't remember which exactly, perhaps 1.16).

Yeah, quite possibly! Thanks for the details; I think they may be helpful in troubleshooting the issue if we see it again. (I'm not going to try to reproduce right now, though.)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants