Skip to content
This repository has been archived by the owner on Dec 26, 2022. It is now read-only.

Enable dnsmasq as a system job that runs everywhere #1976

Merged
2 commits merged into from
Sep 15, 2022

Conversation

ghost
Copy link

@ghost ghost commented Sep 13, 2022

Which issue does this PR correspond to?

https://github.com/grapl-security/issue-tracker/issues/1011

What changes does this PR make to Grapl? Why?

For AWS we need a way to change Nomad jobs so that they get dns from localhost:8600 (the Consul agent). Ideally we'd be able to do dns_server = localhost:8600, but that isn't supported. Looking at the docker docs, they kinda support it using published ports (ie mapping host ports to different ports in containers ie the reverse of normal port mapping where we publish container ports to a different host port). Nomad doesn't support published ports that I can find. As such, the easiest thing to do is to use dnsmasq as a dns forwarder.

Note: We're using containerized dnsmasq instead of host-based dnsmasq that way we only have one place we're setting up dns

How were these changes tested?

@ghost ghost requested a review from christophermaier as a code owner September 13, 2022 13:17
@ghost ghost self-assigned this Sep 13, 2022
For AWS we need a way to change Nomad jobs so that they get dns from
localhost:8600 (the Consul agent). Ideally we'd be able to do
dns_server = localhost:8600, but that isn't supported. Looking at the
docker docs, they kinda support it using published ports (ie mapping
host ports to different ports in containers ie the reverse of normal
port mapping where we publish container ports to a different host port).
Nomad doesn't support published ports that I can find. As such, the
easiest thing to do is to use dnsmasq as a dns forwarder.

Note: We're using containerized dnsmasq instead of host-based dnsmasq
that way we only have one place we're setting up dns
@ghost ghost force-pushed the twunderlich/dnsmasq-in-aws branch from 62d0b45 to a4893bc Compare September 13, 2022 13:19

config {
#This is an alpine-based dnsmasq container
image = "4km3/dnsmasq:2.85-r2"
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Long-term we'll want to switch this to either an official dnsmasq container or one that we build

ports = ["dns"]
args = [
# Send all queries for .consul to the NOMAD_IP
"--server", "/consul/${NOMAD_IP_dns}#8600",
Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note to self: doublecheck why we're doing it this way. It probably makes sense to switch to attr.unique.network.ip-address

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(how likely are we to actually follow up on this lol)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This part seems like it will work both ways, so I'm going to leave it

@ghost
Copy link
Author

ghost commented Sep 13, 2022

I still need to do some testing with this, but this is generally the approach I'd like to take to forward dns to the local Consul agent.

As an overview the dns call chain will look like
container -> dnsmasq on port 53 -> consul agent -> AWS dns.
Why?
This way we get access to all *.consul domains (used to forward observability requests to the otel collector) and then we get normal dns. Not either or.

Some risks that we need to take a look at:
What happens if dnsmasq fails to come up? Does it need to come up first?

@codecov
Copy link

codecov bot commented Sep 13, 2022

Codecov Report

Base: 44.17% // Head: 44.17% // Increases project coverage by +0.00% 🎉

Coverage data is based on head (b1a5b11) compared to base (5847298).
Patch coverage: 0.00% of modified lines in pull request are covered.

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1976   +/-   ##
=======================================
  Coverage   44.17%   44.17%           
=======================================
  Files         473      473           
  Lines       13381    13382    +1     
  Branches       23       23           
=======================================
+ Hits         5911     5912    +1     
  Misses       7455     7455           
  Partials       15       15           
Impacted Files Coverage Δ
pulumi/grapl/__main__.py 0.00% <0.00%> (ø)
src/rust/sysmon-parser/src/util.rs 59.72% <0.00%> (+0.46%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

ports = ["dns"]
args = [
# Send all queries for .consul to the NOMAD_IP
"--server", "/consul/${NOMAD_IP_dns}#8600",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(how likely are we to actually follow up on this lol)

Comment on lines 323 to 325
custom_timeouts=CustomTimeouts(
create=nomad_grapl_core_timeout, update=nomad_grapl_core_timeout
),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

honestly we prob don't need this for a single image

@ghost
Copy link
Author

ghost commented Sep 14, 2022

I've investigated alternative approaches:

  1. For production, it may make sense to set up the consul terraform sync code. This is the model that k8s typically uses.
    2 I've also looked into getting rid of all consul dns. Right now the only place this is being used is by Consul to figure out where to forward traces to. We could do this, but it would require a fair amount of hackery, especially since go-sockaddr isn't supported

@ghost ghost merged commit 5e4c88a into main Sep 15, 2022
@ghost ghost deleted the twunderlich/dnsmasq-in-aws branch September 15, 2022 17:56
This pull request was closed.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants