
The what, why, and how of HTTPS.

First, we answer the questions: what HTTPS is, why we need it, and how it works.

We then have more explicit instructions with best practices, timings, and details. We cover three ways: the AWS step-by-step, rolling your own server, and the GitHub.io alternative.

This will be migrated to the frontend wiki as soon as it is convenient to do so.

What

HTTPS is security. The internet, basically, is a telephone line with millions of wiretaps every foot. This means that, every foot, there's a different person who could listen to everything you say on the faulty phone of HTTP over TCP, your conventional internet connection. That's OK if you're not doing anything private: downloading some pictures, browsing Wikipedia. However, it breaks down the minute you need privacy. You then need two assurances that HTTP doesn't give:

  • That the site you're at is the site you think it is.
  • That nobody can listen in on your communications with that trusted site.

To see this written by experts, google it, because there are loads of people dying to tell you why you should use their handy service to secure your site.

Why

As suggested above: security. Most browsers will yell at a frontend that has the LCS login without HTTPS, with good reason. We should not allow our hackers to be hacked, and exposing plaintext passwords is a great way to do that.

So, HTTPS is a nice way to be secure.

How

Here's Shrek with some insight:

Shrek: For your information, there's a lot more to ogres than people think.

Donkey: Example?

Shrek: Example... uh... ogres are like onions!

[holds up an onion, which Donkey sniffs]

Donkey: They stink?

Shrek: Yes... No!

Donkey: Oh, they make you cry?

Shrek: No!

Donkey: Oh, you leave 'em out in the sun, they get all brown, start sproutin' little white hairs...

Shrek: [peels an onion] NO! Layers. Onions have layers. Ogres have layers... You get it? We both have layers.

[walks off]

Donkey: Oh, you both have LAYERS. Oh. You know, not everybody like onions. CAKE! Everybody loves cake! Cakes have layers!

Shrek: I don't care what everyone likes! Ogres are not like cakes.

Donkey: You know what ELSE everybody likes? Parfaits! Have you ever met a person, you say, "Let's get some parfait," they say, "Hell no, I don't like no parfait."? Parfaits are delicious!

Shrek: NO! You dense, irritating, miniature beast of burden! Ogres are like onions! End of story! Bye-bye! See ya later.

Donkey: Parfaits may be the most delicious thing on the whole damn planet!

:%s/ogres/web deployments/gi, basically, and here are the layers:

| Layer | Non-AWS examples | AWS thing used |
| --- | --- | --- |
| The web server | A static file system, node, flask | S3 static sites |
| SSL certifier | OpenSSL, Let's Encrypt (if you googled HTTPS, you probably stumbled into a few) | ACM |
| Reverse-proxy | Nginx, Apache | CloudFront |
| DNS | Google Domains, domains.com | Route 53 |

There are many paths through these layers since there are many services for each. Generally, switching services mid-way is harder, so we've just stuck to AWS. Our way, we think, is the cheapest: the site should be virtually free, but we'll update you on that. Now for each layer of our ogre:

The web server

The site is static. This means you don't need a fancy server for it, just anything that can send out files, because that's all you do: send the file that was asked for over HTTP. (Well, HTTPS, but that doesn't change what you send, just how.)

This means, in the build folder, you can use any of a multitude of one-liners.
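
For example, any of these serves the current directory over plain HTTP when run from the build folder (the port is arbitrary):

python3 -m http.server 8000        # Python's standard-library server
npx serve -l 8000                  # Node, via the serve package
php -S localhost:8000              # PHP's built-in server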

Technically, if you don't want to roll your own server, GitHub Pages would suffice.

This, however, won't let you control your domain and URLs. Hence, we use S3, a glorified online hard disk - like Dropbox but less intuitive. We recommend the AWS CLI and a quick script to do the uploading.

# build the production bundle into ./build
npm run-script build
# upload it to the bucket, granting public read access to every file
AWS_ACCESS_KEY_ID="$KEY_ID" AWS_SECRET_ACCESS_KEY="$SECRET" aws s3 cp --recursive build s3://hackru-frontend --grants read=uri=http://acs.amazonaws.com/groups/global/AllUsers

Note that the key ID and secret are credentials you get from AWS.

SSL Certificates

An SSL certificate is a key that anyone can test to be yours, but that nobody can replicate. There's some sick one-way-function math behind this, which is better explained elsewhere.

There are many free ways to get this, so if you're paying, know why.

The way we chose is AWS ACM. This requires DNS to be set up so you can prove domain ownership - so you're securing your own stuff, not stealing someone else's identity.

Following the site isn't difficult; there are only a few steps:

  1. Go to the place for the thing (ACM in the AWS console).
  2. Enter your domain. The simplest entries are *.your-domain.thing and your-domain.thing. This lets you secure any subdomain without certifying each one and is 100% legit.
  3. Pick DNS validation if you're ready for it. (The alternative, email validation, sends mail to a pre-determined list of addresses that you'd have to set up.)
  4. Wait. The validation can take up to an hour, depending on the DNS set-up.
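
Once the certificate is issued and in use, you can sanity-check it from any machine with OpenSSL (your-domain.thing is a placeholder for your real domain):

echo | openssl s_client -connect your-domain.thing:443 -servername your-domain.thing 2>/dev/null | openssl x509 -noout -subject -issuer -dates

This prints who the certificate is for, who issued it, and its validity window.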

With Let's Encrypt and a decent Linux box, you can probably get this down to a shell script, but that's too cool for me.

Reverse-proxy

This just cleans up the way your system handles its HTTPS serving. It adds buffering, caching, and a lot of goodies to ensure that your feeble static server isn't overrun. A good enough reverse-proxy, like nginx, can also be the static server. We defer to experts on the details.

The AWS way to do this is CloudFront. You go to the place and create a distribution. Then, you add a few details:

  • The SSL certificate (if you have it)
  • Redirect http to https (makes it way easier to use your site)
  • Add a CNAME that is your domain.
  • Make sure you use TLSv1 or higher.
  • Make the origin the S3 bucket website URL (the one you tested above)

You can create the distribution then. It's not hard to tweak, but changes take up to half a day to propagate.

Any time you change the site - that is, the contents of the S3 bucket - you might have to invalidate the cache to see the changes quickly. Note that past 1000 file invalidations a month, you'll be charged $0.015 per file (or something; check the pricing page).
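
If you have the AWS CLI set up, an invalidation is a one-liner (the distribution ID below is hypothetical; find yours in the CloudFront console):

# invalidate everything; "/*" counts as a single path for billing purposes
aws cloudfront create-invalidation --distribution-id E1A2B3C4D5E6F7 --paths "/*"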

An nginx example is provided later, for completeness.

DNS

DNS is the translation from names (your-hackathon.thing) to IP addresses (3.14.15.92). Every web service has this in some form. This is the only thing you can't just use your server for.

We use AWS's Route 53. This is because, with all the AWS stuff above, we can configure it quickly. Some important records to add to your domain (called a hosted zone in AWS):

  • A - this is the map from your domain to the CloudFront URL.
  • AAAA - same as A, but for IPv6.

That's all that's needed. If you bought the domain on another service, you just need to tell it that you're using your own name servers, then copy-paste the entire NS record from Route 53.
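
To check that the records took, something like dig works (your-domain.thing stands in for your domain):

dig +short A your-domain.thing        # should return the CloudFront IPs
dig +short AAAA your-domain.thing     # same, for IPv6
dig +short NS your-domain.thing       # should list the Route 53 name servers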

Notes on GSuite

If you have email services through Google and had Google Domains, you need the Google MX records. For Google, the following suffice as a single MX record set, newlines and all:

1 aspmx.l.google.com.
5 alt1.aspmx.l.google.com.
5 alt2.aspmx.l.google.com.
10 alt3.aspmx.l.google.com.
10 alt4.aspmx.l.google.com.

This would allow email@your-domain.org to work as expected through Gmail. Note that Drive and probably some other GSuite goodies do not abruptly die when changing name servers.
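
To confirm the records took (your-domain.org is a placeholder):

dig +short MX your-domain.org     # should list the five Google hosts above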

AWS step by step

Here are all the instructions. This supposes that you have to launch at 10am on day 2.

  1. (Day 1, 10 am) Get your domain. Use whatever you want to buy it, like Google Domains.
  2. (Day 1, 10:15 ish) Make an S3 bucket and upload a version of your site. We recommend this to be a splash page, in case attendees know your domain and may visit.
    • Make sure the bucket is configured to be a website. This means public readability for all files and the static-site option on S3. This option can be found in the Properties tab after clicking on your bucket on the S3 site.
    • Supposing that you called the bucket my-hack-deploy, something like http://my-hack-deploy.s3-website-us-east-1.amazonaws.com (the exact hostname depends on your region) should be a link to your splash site.
  3. Point your DNS at the bucket. Add this record:
    • A (or CNAME, depending on your registrar) with your domain name as the name and the S3 URL (that you just tested).
  4. (Day 1, 10:45 ish) Get an SSL certificate.
    1. Go to the place for the thing (ACM in the AWS console).
    2. Enter your domain. The simplest entries are *.your-domain.thing and your-domain.thing. This lets you secure any subdomain without certifying each one and is 100% legit.
    3. Pick DNS validation if you're ready for it. (The alternative, email validation, sends mail to a pre-determined list of addresses that you'd have to set up.) DNS validation means you'll have to add a CNAME record to your DNS. You can set the time to live (TTL) to something low to reduce your wait time.
    4. Wait. The validation can take up to an hour, depending on the DNS set-up and the TTL you set above.
  5. (Day 1, 12 pm latest) Set up CloudFront. Go to the CloudFront place and create a distribution. Set the following:
    • The SSL certificate (validated as above)
    • Redirect http to https (makes it way easier to use your site)
    • Add a CNAME that is your domain.
    • Make sure you use TLSv1 or higher.
    • Make the origin the S3 bucket website URL (the one you tested above)
  6. (Day 1, 9 pm or so, yeah, it takes about that long) Make sure that the distribution state on CloudFront is "Enabled" and the status "Deployed".
  7. (Day 1, 9 pm) Create a hosted zone on Route 53. Add the following records (you may need one each for www.your-domain.thing and your-domain.thing):
    • A - this is the map from your domain to the cloudfront URL.
    • AAAA - same as A but for IP v6.
    • CNAME - the thing you had for DNS validation when getting the certificate before.
    • If you use your old DNS for emailing, consider MX records; details about GSuite are above.
  8. (Day 1, 9:15 pm) Point the name servers at the NS record in Route 53. DNS propagation can take a while. Keep going to your domain and ensure that http://your-domain.thing redirects to https://your-domain.thing, that the certificate is validated by your browser, and that you get the site you expect. This means a green padlock in your browser's address bar (most likely); a scriptable version of these checks is sketched after this list.
  9. (Day 2, 9 am) Double-check that the DNS has propagated. Be sure to check whatsmydns.net for your domain. Basically, expect green check marks next to every location.
  10. (Day 2, 9:30 am) Upload the site you want live and invalidate the cloudfront cache.
  11. (Day 2, 10 am) LAUNCH!!! 🎉 Now kick back and let the registrations roll in.
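
As promised in step 8, a scriptable version of the launch checks (your-domain.thing is a placeholder):

curl -sI http://your-domain.thing | head -n 3      # expect a 301 with a Location: https://... header
curl -sSI https://your-domain.thing > /dev/null    # curl fails loudly here if the certificate doesn't validate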

Roll Your Own

OK, AWS is... eh. Here's how to roll your own. This is a bit more flexible, but more expensive. I assume you have a server online with ports 443 and 80 exposed to the whole world. These instructions are general Linux things, but you can find distro-specific instructions without too much pain. Ain't nobody got time for figuring it out on Windows or similar trash. There are more details elsewhere; this is just to give you an idea of the process.

I assume you can get your server on some user port like 8080, 5000, or 3000, or that you have static files. Whichever it is, get those up on your server. Also, expose 8080 for the time being so that you can access your site. Then you're ready for the following:

Step 1: The domain, part 1.

Just set up a domain to point at port 8080 on your server. If you're just serving static files, do the nginx steps first and skip the SSL parts for now. Once your URL points to your server at the temporary port, this is done.

Step 2: SSL cert

A script like this one should suffice. It may ask you to upload a file to your server (in a specific location) or add to your DNS configuration. Just follow those instructions.
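
The linked script is just one option. As a concrete sketch, certbot (Let's Encrypt's client) in standalone mode also works, assuming port 80 is free and the DNS from step 1 already resolves to this box (the domains here match the nginx config below):

# certbot briefly runs its own server on port 80 to answer the challenge
sudo certbot certonly --standalone -d hackru.org -d www.hackru.org

The certificates land under /etc/letsencrypt/live/; either copy them to the /etc/ssl/ paths used in the config below, or point the config at them directly.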

Step 3: Nginx

If you're serving static files, this nginx config will do the trick as /etc/nginx/nginx.conf:

# run the nginx worker processes as this user
user ec2-user;

events{
    worker_connections 4096;
}

# Nginx listens on the standard HTTP and HTTPS ports in the server blocks below
http{

    server {
            listen 80;
            listen [::]:80;
            server_name hackru.org www.hackru.org; 
            rewrite ^/(.*)/$ /$1 permanent;
            return 301 https://hackru.org$request_uri;
    }

    server {
            listen 443 ssl;
            server_name www.hackru.org; 
            ssl_certificate /etc/ssl/www.hackru.org.crt;
            ssl_certificate_key /etc/ssl/www.hackru.org.key;
            rewrite ^/(.*)/$ /$1 permanent;
            return 301 https://hackru.org$request_uri;
    }

    server {
            autoindex on;
            sendfile on;

            listen 443 ssl;
            server_name hackru.org;
            ssl_certificate /etc/ssl/hackru.org.crt;
            ssl_certificate_key /etc/ssl/hackru.org.key;
            ssl_protocols TLSv1.2 TLSv1.1 TLSv1;
            ssl_ciphers         HIGH:!aNULL:!MD5;

            rewrite ^/(.*)/$ /$1 permanent;


        location / {
                include /etc/nginx/mime.types;
                autoindex on;
                alias /home/ec2-user/frontend/build/;
        }

    }
}

So this redirects all the traffic to https://hackru.org, but you need the SSL certificates set up at the paths shown in the config.

If you have a server, change the alias line in the location block in the config to the following:

                proxy_pass http://localhost:8080;

After installing nginx, you have to start it as a service, something like sudo service nginx start. You can use sudo service nginx restart every time you edit the config.
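
Before restarting, it's worth validating your edits:

sudo nginx -t                   # syntax-checks the config without touching the running server
sudo service nginx restart      # then picks up the changes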

Step 4: Update your DNS

Now you can change the A record and the CNAME on your DNS to point at your server's URL without the port, and stop exposing the insecure user port.

The Github.io Alternative

Yeah, if you don't care about the domain name, you can just set up GitHub Pages. It's secure, proxied, and it works.

The Web: A Rant

So this process is complex and it sucks, so let's rant about it. (This is catharsis for the author and really not needed unless, after this whole ordeal, you too need some catharsis.)

The internet started with trust. It was trivial. You could hack telephone lines to make one essentially useless room-sized machine called a computer chat useless stuff to another. That was the 70s, and this ARPANET was great. Professors could reliably not scam each other, stuff was safe, and people were happy.

Fast-forward the entire life of a gen-x person and you get to today. And a lot has been learned about connecting a bunch of computers on hacked phone lines.

Firstly, much like with phones, routing sucks. They used to have humans do it! Thank God I don't have to reveal the kind of... reddit I browse as they awkwardly connect my useless chips to some other operator, and so on until I get to r/Aww or whatever. But this means computers do it. And computers are great when stuff works, but stuff doesn't, as told to us by some Murphy dude and his law. So now, it's not a human trying to connect you who can communicate their condolences that a storm took out the one wire to the one computer that has my dreamy r/Aww; it's a computer that quietly shrugs its shoulders and tells mine "sucks." Then all mine does is try again. Maybe it'll get a different route. Maybe it'll work. Maybe the old message was just a bit slow, and actually made it, and a reply is on the way back. But instead of an operator, you get computers trying their best, and they can't tell you all their problems. So you really don't know what's up with the internet. Serves us right for hacking our phone lines so that computers could chat.

And even the polite human voices that were meant to chat on phones suck sometimes, so they suck more on the internet. Just as somebody can call you and claim to be a loan firm in desperate need of your SSN, websites can pretend to be your bank. People can tap into phone lines too; so too can the web's routers. So you're trying to shout your bank information in a secret code over a room of convicts every time you bank online. And it works, because math. So we too suck, much like the patchwork we pride ourselves in called the internet.

This is why HTTPS is needed and hard to get (or takes 9 hours, which to a 20-year-old is forever). In those 9 hours, every robot operator has to get the memo that "Mr. Your Hackathon at domain" has moved house and only receives calls through the secure "TLS 1" system, with such-and-such as proof of identification. Any part of that message can vanish. Any operator can get that message any number of times. If there's a new operator, somebody must tell them, and hopefully that somebody will have gotten the right version.

To top it off, people know that this process is dumb. So everybody does their own thing to try to make it make sense, because understanding the whole mess would probably entail a PhD or two. But each party must not only be a trusted broker of trust, but also be able to tell everybody who they are. And each has their own jargon, preferences, and policies.

So the web sucks and deploying onto it securely is still a bit of an ogre.