This repository has been archived by the owner on Oct 12, 2023. It is now read-only.

MDN Cutover plan #43

Closed
3 tasks done
limed opened this issue Aug 22, 2018 · 7 comments

@limed
Contributor

limed commented Aug 22, 2018

Define a detailed, step-by-step plan for switching from the MozMEAO-based to the MozIT-based infrastructure.

Acceptance Criteria

  • Create a detailed, ordered list of steps required to perform the actual cut-over from the MozMEAO to the MozIT infrastructure
  • Create a detailed, ordered list of steps required after cut-over has been completed
  • Get approval from @jwhitlock, @limed, @metadave, and @jgmize
@limed
Contributor Author

limed commented Aug 22, 2018

After a conversation with our TAM (technical account manager), this is the solution that we have:

15:14:04 ChrisAWS | Okay, I’ve got a possible solution.
15:14:45 ChrisAWS | The TXT record mentioned in that article is just to prove that you control the domain. Once AWS can validate that you control it, they will point traffic at the new distribution.
15:15:09 ChrisAWS | That article is more for people who have had their CNAME associated with a distro by someone not in their org.
15:15:20 ChrisAWS | In your case, here’s what you can do:
15:16:27 ChrisAWS | Let’s call the old distro 1 and the new distro 2
15:16:45 ChrisAWS | You set up distro 2 in the new account, then point the CNAME at distro 2.
15:17:06 ChrisAWS | Cloudfront will continue to route traffic to distro 1, because it’s the distro associated with the CNAME in our system.
15:17:46 ChrisAWS | You then open a support case and we validate that you control the CNAME by noticing that it points to distro 2.
15:17:58 ChrisAWS | And we swap the CNAME association to distro 2.
15:18:35 ChrisAWS | It takes about 15-20 minutes for that change to propagate across our edge locations, during which time both distro 1 and distro 2 receive traffic. After that, all traffic will route to distro 2
15:18:40 ChrisAWS | Would that work?

Read this as well: https://aws.amazon.com/premiumsupport/knowledge-center/resolve-cnamealreadyexists-error/

@escattone
Contributor

Original comment from @limed:

Taken from mozilla-itcloud/mdn-migration-project#20

comment 1:

There are some unknowns about moving the CloudFront distro to our account.

developer.mozilla.org is managed in Infoblox. It is a CNAME to mdn-prod.moz.works, which is in turn a CNAME to a MozMEAO CloudFront distro.
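The chain described above can be followed with a small sketch. The record table is an in-memory stand-in for real DNS, not a live lookup, and `d111.cloudfront.net` is a placeholder for the MozMEAO distribution's domain name, which is not given in this thread.

```python
# Hypothetical in-memory CNAME table; this does not query real DNS.
CNAME_RECORDS = {
    "developer.mozilla.org": "mdn-prod.moz.works",
    "mdn-prod.moz.works": "d111.cloudfront.net",  # placeholder distro name
}


def follow_cname_chain(name, records, max_hops=10):
    """Return every name visited while following CNAMEs from `name`."""
    chain = [name]
    while chain[-1] in records:
        if len(chain) > max_hops:
            raise RuntimeError("CNAME chain too long or looping")
        chain.append(records[chain[-1]])
    return chain


print(" -> ".join(follow_cname_chain("developer.mozilla.org", CNAME_RECORDS)))
```

Only the last hop changes at cutover; the first two names stay the same.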

CloudFront needs to know that it fronts developer.mozilla.org, and this name can exist on only one CloudFront distro at a time. Our best understanding is that it can take up to a day for CloudFront to recognize that it no longer handles a specific fronting domain, which means we must detach the CDN at least a day before cutover. (This data could be outdated.)

25% of requests are handled by the CDN, 75% make it to the backend.

We can go live without a CDN, for improved debugging capability, and then add our own CloudFront distro a day or so later.

comment 2:

I thought about this some more, and I'm not as confident in my 25%/75% estimate.

We moved developer.mozilla.org to a CDN in April. Before this, static assets were served from a CDN, but the HTML requests went to Django. After the change, CloudFront had a chance to serve the initial page from cache (with a low 5-minute timeout) and the assets were served from developer.mozilla.org as well.

It is possible that the assets are a significant portion of the CDN traffic, and we'd need more than a 25% boost in nodes to handle the additional traffic.
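As a rough check on the capacity arithmetic, here is a minimal sketch. It assumes every request costs the backend the same amount, which this thread does not confirm, and the function name is made up for illustration. Under that assumption, a CDN that absorbs 25% of all requests corresponds to roughly a one-third increase over the backend's current load once the CDN is removed, since the backend goes from 75 to 100 requests out of every 100.

```python
def backend_increase(cdn_share):
    """Relative growth in backend requests if the CDN stops absorbing `cdn_share`.

    `cdn_share` is the fraction of all requests currently served by the CDN.
    """
    if not 0 <= cdn_share < 1:
        raise ValueError("cdn_share must be in [0, 1)")
    current_backend = 1.0 - cdn_share   # fraction reaching the backend today
    return cdn_share / current_backend  # extra load relative to today's load


# The 25% figure is from the thread; the other shares are illustrative only.
for share in (0.25, 0.40, 0.50):
    print(f"CDN serves {share:.0%} -> backend load grows {backend_increase(share):.0%}")
```

If assets turn out to be a larger share of CDN traffic, the required headroom grows faster than the share itself.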

If others think this traffic will be significant, we could try splitting the traffic again, sending asset traffic to a different domain name, handled by CloudFront with an origin in the new cluster. We'll then be back to the 25% increase in traffic when dropping the CDN. We can re-combine when the developer.mozilla.org CDN points to the new datacenter.

Another option is Fastly, which wants to give MDN free CDN service, and might be a better fit than CloudFront. This may help reduce the time where the host is not covered by a CDN.

@escattone escattone added this to the Sprint 1 Q4 2018 milestone Sep 28, 2018
@escattone escattone self-assigned this Sep 28, 2018
@escattone escattone mentioned this issue Sep 28, 2018
5 tasks
@escattone
Contributor

@limed
Contributor Author

limed commented Oct 4, 2018

We opened a support request to get more clarity on the CDN migration, and this is what they responded with:

Hello, Geoff here with AWS Support.
My understanding is in general you want to migrate traffic from one CloudFront distribution to another with no downtime. The key part to look at here is CloudFront serves traffic based on which distribution the CNAME is attached to. Since CloudFront shares IP addresses, the CNAME is what dictates where the traffic is served. So the migration would simply be a CNAME swap which support can do based on the document below [1].
Basically you create a TXT record to prove domain ownership and then myself or another engineer can swap this at your convenience. Assuming your destination distribution is set up correctly, this will mean no downtime as the CNAME will be migrated between distributions on our side and will take roughly 10-20 minutes to complete.

So this means our migration will involve AWS staff in one way or another.

@escattone
Contributor

escattone commented Oct 4, 2018

FYI. I posted this question to AWS today:

Hi Geoff,

Thanks for your responses. So this is my understanding of the process:

  1. We create a DNS TXT record for the CNAME that we'd like removed from the
    current CDN and added to the destination CDN. The value of the TXT record
    should be the destination CDN domain name.

  2. You (AWS) remove the CNAME from the list of CNAMEs of the current CDN
    and add it to the list of CNAMEs for the destination CDN.

  3. We update our DNS for the CNAME to point to the destination CDN.

It seems to me, however, that as soon as step #2 is completed, the traffic we're
currently sending to the current CDN will start failing (since the CNAME has
been removed) and keep failing until we complete step #3. So I don't see how
this process avoids downtime. Am I misunderstanding something?

Thanks,
Ryan

and their response was really encouraging:

Hi Ryan,

CloudFront shares IP addresses and uses the CNAME to determine which
distribution to route traffic towards. So let's say we have the following:


DNS:
d111.cloudfront.net CNAME example.com

CloudFront:
d111.cloudfront.net CNAME example.com

When I migrate the domain, this turns into:
DNS:
d111.cloudfront.net CNAME example.com

CloudFront:
d222.cloudfront.net CNAME example.com

At this point, as long as d111.cloudfront.net is still resolvable (DNS),
traffic will route to d222.cloudfront.net once it hits CloudFront. So
your DNS change is more for overall architecture rather than making
this work. I hope this clears things up a little more. The key thing to
keep in mind is DNS is completely separate from CloudFront CNAMEs.
Please let me know if you have any other questions or concerns.

Best regards,

Geoff G.
Amazon Web Services
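Geoff's point that DNS and CloudFront CNAMEs are separate can be illustrated with a toy model. This is a simplified sketch, not how CloudFront is implemented: the class and its methods are hypothetical, and the 10-20 minute propagation window mentioned earlier is not modeled.

```python
class CloudFrontModel:
    """Toy model: the distribution is chosen from the requested hostname
    (the alias), not from the DNS name the client resolved."""

    def __init__(self):
        self.alias_to_distribution = {}

    def attach_alias(self, alias, distribution):
        # An alias is attached to one distribution at a time, so attaching
        # it again simply moves it.
        self.alias_to_distribution[alias] = distribution

    def route(self, host_header):
        return self.alias_to_distribution.get(host_header)


# DNS still resolves example.com via the old distribution's domain name.
dns = {"example.com": "d111.cloudfront.net"}

cloudfront = CloudFrontModel()
cloudfront.attach_alias("example.com", "d111.cloudfront.net")
before = cloudfront.route("example.com")

# AWS support moves the alias; the DNS record is not touched.
cloudfront.attach_alias("example.com", "d222.cloudfront.net")
after = cloudfront.route("example.com")

print("DNS target:", dns["example.com"])
print("served by before swap:", before)
print("served by after swap:", after)
```

In this model the DNS entry never changes, yet requests for example.com move from d111 to d222 as soon as the alias is reattached, which matches the behavior described in the reply above.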

@limed
Contributor Author

limed commented Oct 4, 2018

We will also want to do a dry-run migration on the stage instance: roll forward to the new cluster, test what AWS is saying here, and then roll back to the old cluster.

@jwhitlock
Contributor

Follow-on work is tracked in #90
