Setting up EC2 instance for EmrEtlRunner and StorageLoader

leonmaas edited this page Apr 11, 2018 · 4 revisions

This tutorial assumes this is your first installation and that you just want to try out the platform. Many steps therefore describe a low-performance, insecure setup. For real-world use you will want to harden it.

Prepare your system

Before getting started you need:

  • An account on Amazon Web Services.
  • The AWS CLI installed.
  • An IAM user. The first one needs to be created in the AWS Console.
  • The AdministratorAccess policy attached to that IAM user.
  • Credentials configured on your local machine (you can use aws configure for this).
  • For some steps you may want to install jq. It's optional, but handy.

Everything else can be done from the CLI.
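
The prerequisites above can be checked quickly before you start. The following is a minimal sketch, not part of the original guide; `check_tool` is a hypothetical helper name, and the loop only reports whether `aws` and `jq` are on your PATH.

```shell
#!/bin/sh
# Hypothetical helper: succeeds if the named command is found on PATH.
check_tool() {
    command -v "$1" >/dev/null 2>&1
}

# aws is required; jq is optional but used in several steps of this guide.
for tool in aws jq; do
    if check_tool "$tool"; then
        echo "found: $tool"
    else
        echo "missing: $tool"
    fi
done
```

To confirm that the configured credentials actually work, `aws sts get-caller-identity` should print the ARN of your IAM user.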

Setting up EC2 instance for EmrEtlRunner/StorageLoader

At the end of this step you'll have an AWS EC2 instance, SSH access to it, and the private key stored on your local machine.

1. Find your Default VPC ID

We will refer to it as {{ VPC_ID }}.

$ aws ec2 describe-vpcs --filters Name=is-default,Values=true | jq -r ".Vpcs[0].VpcId"
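
If jq is not installed, the AWS CLI's built-in `--query` and `--output text` options can extract the same value. The sketch below assumes your account still has a default VPC in the configured region; `default_vpc_id` is a hypothetical helper name.

```shell
# Hypothetical helper: print the ID of the default VPC in the configured region.
# Uses the CLI's built-in --query option instead of jq.
default_vpc_id() {
    aws ec2 describe-vpcs \
        --filters Name=is-default,Values=true \
        --query 'Vpcs[0].VpcId' \
        --output text
}

# Usage (requires configured credentials):
#   VPC_ID=$(default_vpc_id)
#   echo "$VPC_ID"
```

If the region has no default VPC, the text output is expected to be `None` rather than an ID, so it is worth checking the value before using it.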

2. Create Security Group for SSH access

The output is the new group's GroupId. We will refer to it as {{ SSH_SG }}.

$ aws ec2 create-security-group \
    --group-name "EC2 SSH full access" \
    --description "Unsafe. Use for demonstration only" \
    --vpc-id {{ VPC_ID }} \
    | jq -r '.GroupId'

3. Add rule allowing SSH access from anywhere

$ aws ec2 authorize-security-group-ingress \
    --group-id {{ SSH_SG }} \
    --protocol tcp \
    --port 22 \
    --cidr 0.0.0.0/0
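
Opening port 22 to every address is acceptable for a throwaway demo, as the security group description admits. A somewhat safer variant is to allow only your own address. The sketch below assumes you already know your public IPv4 address; `to_host_cidr` is a hypothetical helper, and 203.0.113.7 is a documentation-only placeholder address.

```shell
# Hypothetical helper: turn a dotted-quad IPv4 address into a single-host CIDR.
# Prints nothing and returns 1 if the input does not look like an IPv4 address.
to_host_cidr() {
    case "$1" in
        ''|*[!0-9.]*|.*|*.|*..*) return 1 ;;
    esac
    old_ifs=$IFS
    IFS=.
    set -- $1          # intentionally unquoted: split the address on dots
    IFS=$old_ifs
    [ "$#" -eq 4 ] || return 1
    for octet in "$@"; do
        [ "$octet" -le 255 ] 2>/dev/null || return 1
    done
    printf '%s.%s.%s.%s/32\n' "$1" "$2" "$3" "$4"
}

# Usage: replace 203.0.113.7 with your own public address.
#   aws ec2 authorize-security-group-ingress \
#       --group-id {{ SSH_SG }} --protocol tcp --port 22 \
#       --cidr "$(to_host_cidr 203.0.113.7)"
```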

4. Create an SSH key pair and save the private key on the local machine

We named it "snowplow-ec2" here.

$ aws ec2 create-key-pair --key-name snowplow-ec2 \
    | jq -r ".KeyMaterial" > ~/.ssh/snowplow-ec2.pem
$ chmod go-rwx ~/.ssh/snowplow-ec2.pem
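
The `chmod` matters because ssh refuses to use a private key file that other users can access. A minimal sketch for verifying the result follows; `key_is_private` is a hypothetical helper that reads the mode string printed by `ls -l`.

```shell
# Hypothetical helper: succeeds if the file has no group or other permissions.
# In the mode string from `ls -l` (e.g. "-rw-------"), characters 5-10 are
# the group and other permission bits.
key_is_private() {
    [ -f "$1" ] || return 1
    mode=$(ls -ld "$1" | cut -c5-10)
    [ "$mode" = "------" ]
}

# Usage:
#   key_is_private ~/.ssh/snowplow-ec2.pem && echo "key permissions OK"
```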

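5. Launch a t2.small instance from an Amazon Linux AMI with the previously created SSH key

The output is your instance ID. We will refer to it as {{ INSTANCE_ID }}. Note that AMI IDs are region-specific, so the image ID below is only valid in the region it was published for.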

$ aws ec2 run-instances \
    --image-id ami-60b6c60a \
    --count 1 \
    --instance-type t2.small \
    --key-name snowplow-ec2 \
    | jq -r '.Instances[0].InstanceId'

6. Attach the security group to the instance

$ aws ec2 modify-instance-attribute \
    --instance-id {{ INSTANCE_ID }} \
    --groups {{ SSH_SG }}
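
Steps 5 and 6 can also be combined, because `run-instances` accepts `--security-group-ids` and attaches the group at launch. The following is a sketch under the same assumptions as above (same AMI, instance type, and key name); `launch_instance` is a hypothetical helper name.

```shell
# Hypothetical helper: launch one instance with the security group attached
# at launch time, and print the new instance ID.
# $1 = security group ID ({{ SSH_SG }} in this guide)
launch_instance() {
    aws ec2 run-instances \
        --image-id ami-60b6c60a \
        --count 1 \
        --instance-type t2.small \
        --key-name snowplow-ec2 \
        --security-group-ids "$1" \
        --query 'Instances[0].InstanceId' \
        --output text
}

# Usage (requires configured credentials):
#   INSTANCE_ID=$(launch_instance {{ SSH_SG }})
```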

7. Check the public IP address of the newly created instance

We will refer to it as {{ PUBLIC_IP }}.

$ aws ec2 describe-instances \
    --instance-ids {{ INSTANCE_ID }} \
    | jq -r '.Reservations[0].Instances[0].PublicIpAddress'
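
A freshly launched instance may not be reachable right away. The sketch below uses `aws ec2 wait instance-running` to block until the instance is running and then prints its public IPv4 address with `--query` instead of jq; `instance_public_ip` is a hypothetical helper name.

```shell
# Hypothetical helper: wait until the instance is running, then print its
# public IPv4 address.
# $1 = instance ID ({{ INSTANCE_ID }} in this guide)
instance_public_ip() {
    aws ec2 wait instance-running --instance-ids "$1" || return 1
    aws ec2 describe-instances \
        --instance-ids "$1" \
        --query 'Reservations[0].Instances[0].PublicIpAddress' \
        --output text
}

# Usage (requires configured credentials):
#   PUBLIC_IP=$(instance_public_ip {{ INSTANCE_ID }})
```

Even once the instance is reported as running, the SSH daemon may need a little longer to start, so a first connection attempt can still be refused.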

8. Log in

Fill in {{ PUBLIC_IP }} from the previous step.

$ ssh -i ~/.ssh/snowplow-ec2.pem ec2-user@{{ PUBLIC_IP }}
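
To avoid retyping the key path and user name, you can add an entry to `~/.ssh/config`. This is an optional convenience, sketched under the assumption that you kept the key name used above; `ssh_config_entry` and the host alias `snowplow-ec2` are hypothetical names.

```shell
# Hypothetical helper: print an ssh_config stanza for the instance.
# $1 = host alias, $2 = public IP address or DNS name
ssh_config_entry() {
    printf 'Host %s\n' "$1"
    printf '    HostName %s\n' "$2"
    printf '    User ec2-user\n'
    printf '    IdentityFile ~/.ssh/snowplow-ec2.pem\n'
}

# Usage:
#   ssh_config_entry snowplow-ec2 {{ PUBLIC_IP }} >> ~/.ssh/config
#   ssh snowplow-ec2
```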