-
Notifications
You must be signed in to change notification settings - Fork 28
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add pycloudlib configuration file (SC-519) #164
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not tested, but I think this makes sense.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a fan of this idea and I'm a fan of the cli too!
Haven't actually tested yet, but looks good overall - just a couple questions/comments
@@ -0,0 +1,59 @@ | |||
############### pycloudlib.toml.template ##################################### |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
toml +1
pycloudlib.toml.template
Outdated
# Copy this file to /etc/pycloudlib.toml or ~/.config/pycloudlib.toml and | ||
# fill in the values appropriately. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should it also check in the current directory (wherever it was run) for a pycloudlib.toml?
That could make per-project configuration easier, but also brings accidental git commiting risk, as described below
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd like to be able to specify the config file path via a PYCLOUDLIB_CONFIG
environment variable. This plays well in Jenkins, where a credentials file can be added to the Jenkins creds storage, and then Jenkins can create it in a random location when a job runs, exporting the path via a custom env var.
# If a key is uncommented, it is required to launch an instance on that cloud. | ||
# Commented keys aren't required, but allow further customization for | ||
# settings in which the defaults don't work for you. If a key has a value, | ||
# that represents the default for that cloud. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Love it.
pycloudlib/cloud.py
Outdated
@@ -25,10 +28,17 @@ def __init__(self, tag, timestamp_suffix=True): | |||
timestamp_suffic: Append a timestamped suffix to the tag string. | |||
""" | |||
self._log = logging.getLogger(__name__) | |||
self.config = parse_config()[self._type] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should there also be a way to pass in a config as a parameter to this constructor?
elif 'GOOGLE_APPLICATION_CREDENTIALS' in os.environ: | ||
self.credentials_path = os.environ[ | ||
'GOOGLE_APPLICATION_CREDENTIALS'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My first thought is that I'd expect the envvar to have higher precedence than the config file, but not a particularly strong opinion.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Normally I'd also expect the env vars to have precedence over the config file.
I think this is great! It will really make it easier to run tests in the clouds, both manually and in Jenkins. I left a couple of inline comments, but overall this LGTM. The branch needs fixing to make CI pass, but I didn't spot anything fundamental there. |
1bb8de4
to
3909b6a
Compare
45 files changed!? Over half of those are docs changes and removal of a directory we don't need ( There were a lot of small required test changes, so a lot of files got touched, but the non-test code changes are almost identical to the first round of reviews. I tested using the config file on all of the pycloudlib supported clouds. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks really good, and I think a strict approach to defining a single conf entrypoint makes the most sense and avoids some of pitfalls we've seen with trying to be "too flexible" in supporting each cloud's means of obtaining credentials. One conf to rule them all, and proper pycloudlib docs to allow pycloudlib users to query the cloud to get those details where possible.
pycloudlib.toml.template
Outdated
clientId = "" | ||
clientSecret = "" | ||
subscriptionId = "" | ||
tenantId = "" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get that Azure naming is camelCase, but it feels a bit strange that we have our generlized snake_case params mixed with camelCase. Is this PR an opportunity to standardize on snake_case and just do the proper camelCase mapping internal to pycloudlib when invoking the SDK?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Now that we have a potential of people providing unaltered or incorrect toml config with empty string as values. I think we probably want to handle "empty" values and ignore them properly on each cloud.
Ec2 currently fails in an unfriendly manner because pycloudlib.ec2.util _get_session passes the empty strings for secet_access_key, access_key_id and region into boto3.Session, which attempts to use the empty string instead of sourcing your local .aws/config files.
self.client = session.client('ec2')
File "/home/csmith/src/pycloudlib/.tox/pylint/lib/python3.8/site-packages/boto3/session.py", line 258, in client
return self._session.create_client(
File "/home/csmith/src/pycloudlib/.tox/pylint/lib/python3.8/site-packages/botocore/session.py", line 847, in create_client
client = client_creator.create_client(
File "/home/csmith/src/pycloudlib/.tox/pylint/lib/python3.8/site-packages/botocore/client.py", line 86, in create_client
client_args = self._get_client_args(
File "/home/csmith/src/pycloudlib/.tox/pylint/lib/python3.8/site-packages/botocore/client.py", line 355, in _get_client_args
return args_creator.get_client_args(
File "/home/csmith/src/pycloudlib/.tox/pylint/lib/python3.8/site-packages/botocore/args.py", line 99, in get_client_args
endpoint = endpoint_creator.create_endpoint(
File "/home/csmith/src/pycloudlib/.tox/pylint/lib/python3.8/site-packages/botocore/endpoint.py", line 287, in create_endpoint
raise ValueError("Invalid endpoint: %s" % endpoint_url)
ValueError: Invalid endpoint: https://ec2..amazonaws.com
csmith@uptown:~/src/pycloudlib$
Should we treat toml empty string values as None in pycloudlib before invoking cloud-specific libraries? Or should we just ensure ec2, gce, openstack oci adopt the right fallback when provided with an empty string value.
For ec2 specifically, the following retains fallback behavior to config in .aws/config:
diff --git a/pycloudlib/ec2/util.py b/pycloudlib/ec2/util.py
index 4c1a9ca..81d3a4f 100644
--- a/pycloudlib/ec2/util.py
+++ b/pycloudlib/ec2/util.py
@@ -66,7 +66,7 @@ def _get_session(access_key_id, secret_access_key, region):
_decode_console_output_as_bytes)
return boto3.Session(
botocore_session=mysess,
- aws_access_key_id=access_key_id,
- aws_secret_access_key=secret_access_key,
- region_name=region
+ aws_access_key_id=access_key_id or None,
+ aws_secret_access_key=secret_access_key or None,
+ region_name=region or None
)
pycloudlib/config.py
Outdated
possible_configs.extend(CONFIG_PATHS) | ||
for path in possible_configs: | ||
try: | ||
return toml.load(path, _dict=Config) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we print or log the config file we successfully sourced?
Re: empty or bad value checks, I think I'd like to wait and iterate on that. Especially if we decide that configuration file will be the only valid way to pass credentials to pycloudlib, it will be a lot easier to do validation of required values. |
c70c957
to
3553e7c
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I finally managed to try test more things and what I tried worked smoothly. There is probably room for improvement in how some errors/failures are presented, but that only happens if the conf file is broken. We can have more iterations on this if needed, no need to block landing. Overall this is a very big improvement on the usability of the library!
credentials rather than embedding credential details in the API. This is currently backwards compatible, so credentials can still be passed via API, and those passed via the API will take precedence over credentials specified in the configuration file. This allows us to pull individual cloud configuration out of the API. There's a few reasons to do this: * There have often been "works on my machine" problems when it comes to credentials. One person might have a newer version of one of the cli tools that expects things in different locations. Moving credentials into a separate configuration file (ideally) means there's no more wondering what needs to be specified to launch an instance on particular cloud. * Pycloudlib currently needs to be configured per project. E.g., UA tests already works on certain jenkins machines, and re-setting things up so they work with cloud-init is a waste of time. * It's easier to follow the liskov substitution principle. E.g., openstack currently requires a 'network' parameter that other clouds don't require. This means we can't make a generic get_cloud().launch() kind of API, because different clouds require different arguments based on their auth. * If we eventually want a cli on top of pycloudlib, this makes it a whole lot easier.
It gets regenerated every doc build
3553e7c
to
49dbce2
Compare
This allows us to pull individual cloud configuration out of the API. There's a few reasons I want to do this:
get_cloud().launch()
kind of API, because different clouds require different arguments based on their auth.I haven't updated any tests or documentation as I would like to get buy-in on what's here first. I also realize that this config references other configs. Most of them can be pulled out into a single config, but we can iterate on that later, and I'd rather not make this PR even bigger.