-
Notifications
You must be signed in to change notification settings - Fork 484
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
4990 ec2 ansible scripting #5063
Conversation
- Does not take in a branch name or propigate that to the instance - Needs more thinking on pem creation - Needs another script or other ideas around tearing down instances
Script now fufills the need of creating an ec2 instance and scping in a file that has the branch name to be spun up later by ansible. This could probably be used as is, but I want to clean it up and create a tear-down script or think of some other ways we will ensure that a billion instances haven't been created.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great initial work. I put some details in my review for changes we should make. We need to decide what's in scope for phase 1 vs phase 2 or 3 or whatever.
#Initially Referred to this doc: https://docs.aws.amazon.com/cli/latest/userguide/tutorial-ec2-ubuntu.html | ||
|
||
#TODO: allow arbitrary repo, not just IQSS. Will require changing it on the ansible side as well | ||
DEPLOY_FILE=dataverse_deploy_info.txt |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where is DEPLOY_FILE used? Is it cruft?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
cruft!
echo "No branch name provided" | ||
exit 1 | ||
else | ||
BRANCH_NAME=$1 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we make the indentation consistent? https://google.github.io/styleguide/shell.xml#Indentation says, "Indent 2 spaces. No tabs."
@@ -0,0 +1,76 @@ | |||
#!/bin/bash -x |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
High level comments as of 19e2915:
Great work. We need some documentation, probably in the dev guide, maybe in a new section called "deployment" or "validation" or somewhere under the existing "testing" page. The audience is someone that we've instructed to spin up an arbitrary branch. This person may not be a Python user and may not have the aws
binary installed already. They will need help with all the config under ~/.aws
that's required. We need to explain that only region = us-east-1
is supported, etc.
One comment on the shebang line is that we could consider removing "-x" but it's probably useful for now, even if it's a little verbose.
#Using this image ID a 1-time requires subscription per root account, which was done through the UI | ||
#Also, change the instance size as your own peril. Previous attempts of setting it smaller than medium have caused solr and maven to crash weirdly during install | ||
echo "*Creating ec2 instance" | ||
INSTACE_ID=$(aws ec2 run-instances --image-id ami-9887c6e7 --security-groups devenv-sg --count 1 --instance-type t2.medium --key-name devenv-key --query 'Instances[0].InstanceId' --block-device-mappings '[ { "DeviceName": "/dev/sda1", "Ebs": { "DeleteOnTermination": true } } ]' | tr -d \") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: "INSTACE_ID"
We might want to make "t2.medium" a variable so we can more easily change it.
We might want some more error checking here.
echo "*Checking for existing key pair" | ||
if ! [ -f devenv-key.pem ]; then | ||
echo "*Creating key pair" | ||
PRIVATE_KEY=$(aws ec2 create-key-pair --key-name devenv-key --query 'KeyMaterial' --output text) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need a way to avoid this error: An error occurred (InvalidKeyPair.Duplicate) when calling the CreateKeyPair operation: The keypair 'devenv-key' already exists.
PUBLIC_DNS=$(aws ec2 describe-instances --instance-ids $INSTACE_ID --query "Reservations[*].Instances[*].[PublicDnsName]" --output text) | ||
PUBLIC_IP=$(aws ec2 describe-instances --instance-ids $INSTACE_ID --query "Reservations[*].Instances[*].[PublicIpAddress]" --output text) | ||
|
||
echo "Connecting to the instance. This may take a minute as it is being spun up" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This "may take a minute" message should probably be the last thing you see before the long wait and may need to say "may take 10 minutes" or so. Also, not only is the instance being spun up. Dataverse is being installed.
|
||
echo "Connecting to the instance. This may take a minute as it is being spun up" | ||
|
||
echo "New EC2 instance created at $PUBLIC_DNS" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps this note about the IP can be rolled into the main message above.
sudo yum -y install git nano ansible | ||
git clone https://github.com/IQSS/dataverse-ansible.git dataverse | ||
export ANSIBLE_ROLES_PATH=. | ||
sed -i "s/branch:/branch: $BRANCH_NAME/" dataverse/defaults/main.yml |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Over at http://irclog.iq.harvard.edu/dataverse/2018-09-18 @donsizemore suggests trying --extra-vars "dataverse.git.branch=9999-get-my-margarita"
He linked to https://docs.ansible.com/ansible/2.6/user_guide/playbooks_variables.html#variable-precedence-where-should-i-put-a-variable which shows "extra vars (always win precedence)".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried this and could not get the variable preference to work. It seemed like the blank variable in the config was taking preference. But I may just be a noob about this. command line is definitely the lowest preference.
ansible-playbook -i dataverse/inventory dataverse/dataverse.pb --connection=local | ||
EOF | ||
|
||
echo "New EC2 instance created at $PUBLIC_DNS (Public IP $PUBLIC_IP ). When you are done, please terminate your instance with: aws ec2 terminate-instances --instance-ids $INSTACE_ID" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Instead of just the IP we should give the user a link to click. And does it have to be an IP? Does it get a DNS entry automatically?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
DNS would take extra code and would have extra cost
@@ -0,0 +1,11 @@ | |||
#!/bin/bash | |||
|
|||
#This script gets all the instances from ec2 and sends terminate to them |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This "destroy" script is useful but a "list" or "read" script (I'm thinking CRUD) would be nice. "Give me a list of all the running instances and a command for each instance to destroy some or all of them"
Tabs vs. Spaces | ||
^^^^^^^^^^^^^^^ | ||
|
||
Don't use tabs. Use 2 spaces. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is not very consistent throughout the codebase.
Are you guys aware of Checkstyle Maven plugin for checking and maybe enforcing this kind of stuff throughout the dev team (and helping QA)?
Example config I used for Peerpub
(If you want, I can create a separate issue for talking about things like CheckStyle, PMD + CPD and FindBugs SpotBugs)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@poikilotherm I know! Help! Actually, can you please create an issue just about CheckStyle? To help us with consistent formatting like tabs vs. spaces? We try to work in small chunks and in the future you'd be welcome to create issues for FindBugs and other tools too. If you want, you can write the issue title in the form of a user story with something like "As a developer, I'd like some tooling to report on if the code I'm writing meets the coding style of the project such as tabs vs. spaces." Whatever wording makes sense to you. Then in the issue we can discuss CheckStyle vs. other tools. Again, the point is to have the smallest chunk possible. Maybe just the tabs vs. spaces thing. I hope this makes sense!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@poikilotherm thanks for opening #5075 about code style.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Cross-referencing: have a look at #5094 😄
The changes necessary at https://github.com/IQSS/dataverse-ansible - cfb8fdb de-nest git variables for extra-vars compatibility - 45f714c bump group_vars for vagrant testing - 7a5ef0f flatten branch and repo variables for extra-vars compatibility
fi | ||
|
||
RANDOM_STRING="$(uuidgen | cut -c-8)" | ||
KEY_NAME="key-$USER-$RANDOM_STRING" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if adding the branch name to this would provide value as well. Not a blocker for this version though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could, I guess. I was trying to keep the keyname somewhat short. My thought was that we could always check the running instance to see what branch is, if necessary.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some very minor changes. I may just do them now.
echo "New instance created with ID \"$INSTANCE_ID\". To ssh into it:" | ||
echo "ssh -i $PEM_FILE $USER_AT_HOST" | ||
|
||
echo "Please wait at least 15 minutes while the branch \"$BRANCH_NAME\" from $REPO_URL is being deployed." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We say 5-10 in the documentation (EDIT: UPDATED)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for updating this. More than 15 minutes is more realistic.
|
||
# TODO: Add some error checking for this ssh command. | ||
ssh -T -i $PEM_FILE -o 'StrictHostKeyChecking no' -o 'UserKnownHostsFile=/dev/null' -o 'ConnectTimeout=300' $USER_AT_HOST <<EOF | ||
sudo yum -y install epel-release |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I left epel-release in here from when I was trying to install older versions of ansible via pip. Pretty sure it can be removed.
I do wonder if maybe the default action should be for the script to deploy the latest release branch instead of develop. That'll be much quicker (as it won't have to build) and guaranteed to work. But I only feel strongly about it if folks outside of the Harvard team start using this. |
|
||
# TODO: Add some error checking for this ssh command. | ||
ssh -T -i $PEM_FILE -o 'StrictHostKeyChecking no' -o 'UserKnownHostsFile=/dev/null' -o 'ConnectTimeout=300' $USER_AT_HOST <<EOF | ||
sudo yum -y install git nano ansible epel-release |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added back in epel-release though I'm not quite sure why its in here. Maybe its like my addition of nano and its just helpful when you're in the box?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should put it back how it was and leave a comment saying that we install EPEL first so that when we install Ansible we get 2.6 instead of 2.4. For details, please see http://irclog.iq.harvard.edu/dataverse/2018-09-21#i_73096
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah ok! Thanks for the info :)
@matthew-a-dunlap unless passed a branch the dataverse-ansible role deploys the latest release warfile (from dvinstall.zip) so unless ec2-create specifies a branch you’ll get say 4.9.2 if this is desired you could simply remove the branch check from the script header? |
|
||
First, you need to have AWS Command Line Interface (AWS CLI) installed, which is called ``aws`` in your terminal. Launching your terminal and running the following command to print out the version of AWS CLI will tell you if it is installed or not: | ||
|
||
``aws --version`` |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It might be helpful to mention a known-working version of the AWS CLI (which may make things easier if a future update changes argument formatting used by the EC2 scripts).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mentioning a known good version isn't a bad idea but I believe the idea is that we'll be using this script fairly regularly so we should notice if it stops working.
@@ -0,0 +1,115 @@ | |||
#!/bin/bash |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It would be slightly more portable to use #!/usr/bin/env bash
instead of #!/bin/bash
, although for how the documentation says to use these scripts that won't matter (and in practice, for CentOS and OS X it is in /bin/
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not sure where I picked up /bin/bash as the path but it always seems to work for me. I'm not aware of when I'd need to use env instead. Maybe if someone installed a newer version of bash? But I'm targeting the system bash.
|
||
# epel-release is installed first to ensure the latest ansible is installed after | ||
# TODO: Add some error checking for this ssh command. | ||
ssh -T -i $PEM_FILE -o 'StrictHostKeyChecking no' -o 'UserKnownHostsFile=/dev/null' -o 'ConnectTimeout=300' $USER_AT_HOST <<EOF |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This feels like it might be a bad habit. Probably not worth investigating alternative ways to avoid the interactive prompts in this iteration, but might be something to keep in mind for the future.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I didn't question too much what's going on with this EOF HEREDOC. It works but I did leave a TODO saying that we should add some error checking at some point. I'm not sure of the best way to improve this and I'm definitely open to ideas.
Also put Ansible before Puppet since dataverse-puppet hasn't been updated since March 2016.
Related Issues
Pull Request Checklist