cljdoc ops 2.0 #248
207f72b to f6a0c79
["traefik.tags=cljdoc"
 "traefik.frontends.blue.rule=Host:test.cljdoc.org"
 "traefik.frontends.blue.rule=Host:test.cljdoc.xyz"
 "traefik.frontends.blue.rule=PathPrefix:/"
Adjustments needed here for production.
The ADR explains the motivations well. I do worry a little bit about how much extra complexity this will add. The nginx + bash scripts route has the advantage of being easier up front, as it is made up of things people understand well, though you make a good point about bit-rot. Most of cljdoc is read-only. If there was a caching CDN (or even just nginx) in front that could serve URLs out of the cache, that would seem to take away most of the downsides of more frequent deployments, though it wouldn't be as complete a solution as what you're describing here.
workflows:
  version: 2
  build-and-deploy:
    jobs:
      - build
      - prettier
      - docker-deploy:
          requires:
            - build
This needs a branch filter like deploy I think?
The Docker tag used contains the branch name if it isn't master, but yeah, we probably just shouldn't push to Docker unless we're on master.
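For reference, the branch filter discussed in this thread could look roughly like the sketch below. It assumes CircleCI 2.x workflow syntax and that `master` is the only branch that should push images. The job names come from the diff above; the `filters` stanza is an illustration and is not part of the PR.

```yaml
workflows:
  version: 2
  build-and-deploy:
    jobs:
      - build
      - prettier
      - docker-deploy:
          requires:
            - build
          # Assumption: only run the Docker push for commits on master.
          filters:
            branches:
              only: master
```

With a filter like this, the job is skipped entirely on other branches, so the branch-name suffix in the Docker tag would only matter if non-master pushes are wanted later.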
ops/infrastructure/cljdoc.tf
Outdated
@@ -65,6 +78,15 @@ resource "aws_route53_record" "dokku" {
  records = ["167.99.133.5"]
}

resource "aws_route53_record" "test_xyz" {
We probably don't need to add a record for test.cljdoc.xyz, if it's never been used before?
That's right, I used this mostly for testing and will remove it again eventually.
I mostly just clicked the request changes button to see if they'd changed the icon, as the UI looked a little different :). This seems pretty reasonable after looking over the code and seeing the explanation. It's a little bit terrifying seeing all of the extra things coming along for the ride (Nomad, Consul, Traefik), but I think the end goal of automatically deploying to production is a worthy one.

One other thought I had is about the database. It has been a bit awkward to do this kind of work because we have a stateful sqlite db on the application server. If we had Postgres on a separate server then we could easily recycle app servers without losing any state. Though that adds its own set of complexity...

This seems pretty good to me.
I was equally terrified when I started going down this road, but Nomad and Consul especially have proven very predictable and are exceptionally well documented. I'm still a little bit worried about Traefik as it seems to have various minor bugs, but it's working for now. Various people mentioned fabio, but it doesn't handle SSL certs automatically, so we'd lose some of the niceties of Traefik in that regard.
This came to my mind a few times while working on this too. I still very much like the single instance setup that we have now (i.e. avoiding distributed systems problems), but if we later decide we want a separate instance with a database, the Nomad/Consul setup will only make this easier.
Besides bit-rot I also found this setup much easier to test. Basically you just add another

module "test_server" {
  source = "./server_instance"
  # ...
}

And then run

PS. I realize that a lot of what I'm writing is fairly argumentative despite not having to convince anybody. As you might guess, I'm fairly invested in this particular approach just by the sheer amount of time I spent on it. If you (not just Daniel) think this is going in the wrong direction, make yourself heard please. I'm also available on Slack as @martinklepsch.
1e122f3 to 1048ff3
Workaround for traefik/traefik#4247. Usually Traefik and other reverse proxies handle this.
This reverts commit 7dc3f08.
1048ff3 to d60a8a5
1b74a68 to 93bc066
Just saw this now; I missed your earlier comment. I didn't take your comments to be argumentative, just explaining clearly why you went this route. I think it's a good choice, and I'm excited to see it get into prod!
Initial groundwork for #184 and general revisit of cljdoc's devops tooling.
In short this PR:
Take a look at the new readme (ops/README.adoc), the code responsible for deploying to Nomad, and the decision record for this change.