Pronsy is a DNS proxy written in Go that listens for UDP and TCP requests from clients and resolves the queries against a DNS/TLS server, like CloudFlare or Google.
## Build and run it using Docker

```sh
make docker-build && make docker-run
```
## Run it on your PC

```sh
make run
```
## Test it by resolving a domain

```sh
dig blog.charlei.xyz @127.0.0.1 -p 5353 +tcp
dig blog.charlei.xyz @127.0.0.1 -p 5353
```
All the configuration so far is done via environment variables. You can set your own values in the following files, depending on how the application is launched:
- In `docker-compose.yaml`, in case you are running it with `make docker-run`
- In `env.env`, if running with `make run`
The variables in the files are mostly self-explanatory, but they are also mentioned throughout this document.
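For reference, an `env.env` could look like the following; the values shown are illustrative, not the shipped defaults:

```
# Illustrative values only; check the real files for the actual defaults.
PRONSY_PROVIDERHOST=1.1.1.1
PRONSY_TCPMAXCONNPOOL=100
PRONSY_UDPMAXQUEUESIZE=1000
PRONSY_CACHEENABLED=true
PRONSY_CACHETTL=60
```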
A few words about my code rather than the project itself: I wrote it using a blend of Domain-Driven Design and Clean Architecture.
From my perspective this is very useful for two main reasons:
- It is way easier to test compared to other patterns, since interfaces and dependency injection allow you to create your own mocks and stubs (that said, this project has no tests due to time constraints).
- The order. The domain is the most important part and you shouldn't contaminate it with implementation details; that's the job of the gateways and the controllers. A minimal sketch of this layering follows the list.
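To make the layering concrete, here is a minimal sketch of the idea. All names here are hypothetical; the actual identifiers in the codebase differ:

```go
// Package proxy sketches the domain layer; names are illustrative.
package proxy

// Resolver is a domain-level port: the domain only knows this interface,
// never the DNS/TLS implementation that sits behind it.
type Resolver interface {
	Resolve(msg []byte) ([]byte, error)
}

// Service is the domain service. The concrete resolver (and likewise the
// cache or the logger) is injected, which makes it trivial to pass a stub
// or a mock in tests.
type Service struct {
	resolver Resolver
}

func NewService(r Resolver) *Service {
	return &Service{resolver: r}
}

// Solve resolves a raw DNS message through whatever resolver was injected.
func (s *Service) Solve(msg []byte) ([]byte, error) {
	return s.resolver.Resolve(msg)
}
```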
Pronsy handles UDP and TCP DNS queries. The TCP implementation uses an atomic counter to limit the number of active connections and goroutines to handle concurrent requests. The limit can be configured with the `PRONSY_TCPMAXCONNPOOL` environment variable.
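As a rough sketch of the mechanism (assuming a `maxConns` value parsed from `PRONSY_TCPMAXCONNPOOL`; the real code differs in detail):

```go
package server

import (
	"net"
	"sync/atomic"
)

var active int64 // number of in-flight TCP connections

// serveTCP accepts connections and rejects new ones once maxConns
// (read from PRONSY_TCPMAXCONNPOOL in the real code) is reached.
func serveTCP(ln net.Listener, maxConns int64) {
	for {
		conn, err := ln.Accept()
		if err != nil {
			continue
		}
		if atomic.LoadInt64(&active) >= maxConns {
			conn.Close() // pool exhausted: drop the connection
			continue
		}
		atomic.AddInt64(&active, 1)
		go func(c net.Conn) { // one goroutine per accepted connection
			defer atomic.AddInt64(&active, -1)
			defer c.Close()
			handle(c)
		}(conn)
	}
}

// handle would read the length-prefixed DNS message and resolve it
// through the proxy service; elided here.
func handle(c net.Conn) {}
```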
The UDP implementation has a more elaborate approach: it features a custom queue built on top of a channel, and the limit of 'handled messages' is set by the channel's buffer size. A goroutine reads each message from the socket and sends it to the queue. On the other side, a 'Dequeue' function also works concurrently, taking messages from the queue and sending them to the handler function, where the query is finally resolved by calling the Proxy Service. For the UDP implementation, the goroutines are limited by the number of CPUs the host machine has.
The buffer size of the message queue can be changed with `PRONSY_UDPMAXQUEUESIZE`.
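A simplified sketch of that flow, again with hypothetical names:

```go
package server

import (
	"net"
	"runtime"
)

type packet struct {
	addr *net.UDPAddr
	data []byte
}

// serveUDP reads datagrams from the socket and pushes them into a buffered
// channel (the queue); queueSize maps to PRONSY_UDPMAXQUEUESIZE.
func serveUDP(conn *net.UDPConn, queueSize int) {
	queue := make(chan packet, queueSize)

	// Dequeue side: one worker per CPU pulls messages off the queue,
	// resolves them through the Proxy Service and writes the answer back.
	for i := 0; i < runtime.NumCPU(); i++ {
		go func() {
			for p := range queue {
				resp := solve(p.data)
				conn.WriteToUDP(resp, p.addr)
			}
		}()
	}

	// Enqueue side: read each message from the socket and send it to the
	// queue. If the buffer is full, this blocks until a worker drains a slot.
	buf := make([]byte, 512)
	for {
		n, addr, err := conn.ReadFromUDP(buf)
		if err != nil {
			continue
		}
		data := make([]byte, n)
		copy(data, buf[:n]) // copy out: buf is reused on the next read
		queue <- packet{addr: addr, data: data}
	}
}

// solve would call the Proxy Service; elided here.
func solve(msg []byte) []byte { return msg }
```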
I ran some tests under different conditions to see how the UDP resolution behaves. All the tests were performed on my local machine, a laptop with an Intel i7-1165G7 (8 cores @ 4.70 GHz) and 16 GB of DDR4 RAM, running Arch Linux 5.16.1-arch1-1.
The tool used for the tests was DNSBlast, which sends randomly generated domains to a given DNS resolver:

```sh
$ time ./dnsblast 127.0.0.1 1000 100 5353
```
All runs used CloudFlare as the provider, sending 1000 requests at 100 requests per second, varying only the queue buffer size:

| Buffer size | Queries sent | Queries received | Elapsed time | Reply rate |
|------------:|-------------:|-----------------:|-------------:|-----------:|
| 1000        | 1000         | 1000             | 49.986s      | 20 pps     |
| 10000       | 1000         | 1000             | 35.370s      | 28 pps     |
| 100000      | 1000         | 1000             | 19.171s      | 52 pps     |
| 1000000     | 1000         | 1000             | 20.011s      | 49 pps     |
It seems I hit a limit here, since buffer sizes of 100000 and 1000000 perform almost the same.
I did the test directly against CloudFlare, no proxy in the middle, and this is what I got:

| Queries sent | Queries received | Elapsed time | Reply rate |
|-------------:|-----------------:|-------------:|-----------:|
| 1000         | 631              | 12.941s      | 61 pps     |
As expected, CloudFlare performs better than my local proxy, but it also seems to rate-limit incoming requests, which I guess is why I didn't receive responses to all the queries DNSBlast sent.
The resolver, at the software level, is the package that knows how to talk to a DNS/TLS provider to resolve domains. It hides the implementation details from the domain.
It automatically sets up the TLS connection and retrieves the root CAs needed, which enables Pronsy to talk to different providers.
By default it uses CloudFlare as the DNS provider. This can be changed at startup through the `PRONSY_PROVIDERHOST` environment variable.
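In essence, the resolver does something like the following sketch. It is a simplified version assuming the provider speaks DNS-over-TLS on port 853 as per RFC 7858; the function name and wiring are illustrative:

```go
package resolver

import (
	"crypto/tls"
	"crypto/x509"
	"encoding/binary"
	"io"
)

// Resolve forwards a raw DNS message to a DNS/TLS provider (for example
// "1.1.1.1:853" for CloudFlare, configurable through PRONSY_PROVIDERHOST
// in the real code) and returns the raw answer.
func Resolve(provider string, msg []byte) ([]byte, error) {
	roots, err := x509.SystemCertPool() // root CAs used to verify the provider
	if err != nil {
		return nil, err
	}
	conn, err := tls.Dial("tcp", provider, &tls.Config{RootCAs: roots})
	if err != nil {
		return nil, err
	}
	defer conn.Close()

	// RFC 7858 reuses the DNS-over-TCP framing: a two-byte length
	// prefix precedes the DNS message.
	framed := make([]byte, 2+len(msg))
	binary.BigEndian.PutUint16(framed, uint16(len(msg)))
	copy(framed[2:], msg)
	if _, err := conn.Write(framed); err != nil {
		return nil, err
	}

	// Read the response: length prefix first, then the message itself.
	var length [2]byte
	if _, err := io.ReadFull(conn, length[:]); err != nil {
		return nil, err
	}
	resp := make([]byte, binary.BigEndian.Uint16(length[:]))
	if _, err := io.ReadFull(conn, resp); err != nil {
		return nil, err
	}
	return resp, nil
}
```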
Pronsy features a really basic 'home-made' in-memory cache that saves recently resolved domains to avoid wasting time querying the DNS provider.
It's just a map protected with a `sync.Mutex` that is locked and unlocked by the goroutines accessing it.
This feature can be disabled by setting the `PRONSY_CACHEENABLED` environment variable to `false`. The data in the cache is flushed every N seconds, and that N can be set with the `PRONSY_CACHETTL` environment variable.
This cache implementation is not tied to the application and can be swapped out easily if desired. All that is needed is to write a new implementation compliant with the `proxy.Cache` interface.
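For illustration, here is a sketch of how such a cache can look. The method set of `proxy.Cache` shown here is an assumption, not the actual interface:

```go
package proxy

import (
	"sync"
	"time"
)

// Cache is a hypothetical shape for the interface mentioned above;
// the real method set in the codebase may differ.
type Cache interface {
	Get(domain string) ([]byte, bool)
	Set(domain string, answer []byte)
}

// memoryCache is the 'home-made' variant: a map guarded by a mutex,
// flushed in full every PRONSY_CACHETTL seconds.
type memoryCache struct {
	mu      sync.Mutex
	entries map[string][]byte
}

func NewMemoryCache(ttl time.Duration) Cache {
	c := &memoryCache{entries: map[string][]byte{}}
	go func() {
		for range time.Tick(ttl) {
			c.mu.Lock()
			c.entries = map[string][]byte{} // flush everything
			c.mu.Unlock()
		}
	}()
	return c
}

func (c *memoryCache) Get(domain string) ([]byte, bool) {
	c.mu.Lock()
	defer c.mu.Unlock()
	answer, ok := c.entries[domain]
	return answer, ok
}

func (c *memoryCache) Set(domain string, answer []byte) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.entries[domain] = answer
}
```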
If I get to deploy this solution in a more 'production-ready' environment, I would create an implementation that uses Redis. That way I could spin up multiple replicas of Pronsy while having a centralized cache server shared among the replicas.
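A Redis-backed implementation could satisfy the same interface. This sketch uses the go-redis client; the wiring and names are illustrative:

```go
package proxy

import (
	"context"
	"time"

	"github.com/redis/go-redis/v9"
)

// redisCache would implement the proxy.Cache interface sketched above,
// so several Pronsy replicas can share one cache.
type redisCache struct {
	client *redis.Client
	ttl    time.Duration
}

func NewRedisCache(addr string, ttl time.Duration) *redisCache {
	return &redisCache{
		client: redis.NewClient(&redis.Options{Addr: addr}),
		ttl:    ttl,
	}
}

func (c *redisCache) Get(domain string) ([]byte, bool) {
	answer, err := c.client.Get(context.Background(), domain).Bytes()
	if err != nil { // redis.Nil on a miss, or a real error
		return nil, false
	}
	return answer, true
}

func (c *redisCache) Set(domain string, answer []byte) {
	// Here the TTL is per entry instead of a full periodic flush.
	c.client.Set(context.Background(), domain, answer, c.ttl)
}
```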
The denylist feature was not fully developed, also because of time constraints. The domain package ships with the interfaces needed to start this service, and a Rest API to manage the denylist with methods like `AddDomain`, `GetDomain`, `DeleteDomain`, and `GetDomains` is in the codebase, but it's not being used: the API is initialized but only answers to `GET /ping`.
The use case for this feature was to tell Pronsy not to resolve domains blocked by an administrator, just like PiHole or Blocky do.
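A sketch of how that check could short-circuit resolution; every name here is hypothetical, since the feature isn't wired in:

```go
package denylist

// Denylist is a hypothetical port over the domain interfaces mentioned above.
type Denylist interface {
	Contains(domain string) bool
}

// Solve refuses blocked domains instead of forwarding them to the
// provider, in the spirit of PiHole or Blocky.
func Solve(dl Denylist, domain string, forward func() ([]byte, error)) ([]byte, error) {
	if dl.Contains(domain) {
		return blockedAnswer(domain), nil // e.g. an NXDOMAIN or REFUSED reply
	}
	return forward()
}

// blockedAnswer would build the refusal response; elided here.
func blockedAnswer(domain string) []byte { return nil }
```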
Most of Pronsy's packages can be injected with a Logger. Just like the cache, this can be swapped for a different implementation, as long as it implements the methods of the Logger interface.
The STD Output implementation shipped with this codebase allows Pronsy to log messages at three different levels:
- Debug
- Info
- Error
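The interface could look something like this sketch (the exact method set in the codebase may differ):

```go
package logger

import (
	"fmt"
	"os"
)

// Logger is the shape any implementation must satisfy to be injected.
type Logger interface {
	Debug(msg string)
	Info(msg string)
	Error(msg string)
}

// stdLogger mirrors the STD Output implementation shipped with the codebase.
type stdLogger struct{}

func (stdLogger) Debug(msg string) { fmt.Fprintln(os.Stdout, "DEBUG:", msg) }
func (stdLogger) Info(msg string)  { fmt.Fprintln(os.Stdout, "INFO:", msg) }
func (stdLogger) Error(msg string) { fmt.Fprintln(os.Stderr, "ERROR:", msg) }

// New returns the default stdout logger.
func New() Logger { return stdLogger{} }
```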
A useful implementation could be an integration with a third-party log service such as AWS CloudWatch or any other custom service. It is possible (and desirable, from my perspective) to write all that code and inject the dependency into the packages, so they start logging to CloudWatch without modifying the domain of our application.
If I were to design a solution for a production environment running multiple replicas of Pronsy, I would use a stdout scraper that ships the logs to a separate system where I, or a group of teams, can watch them. Solutions like Logstash/Kibana or Loki/Promtail can be really useful to accomplish this.
Imagine this proxy being deployed in an infrastructure. What would be the security concerns you would raise?
If somebody manages to sniff the incoming traffic to the proxy, they can learn which domains are being queried by the clients, since the traffic between the client and the proxy travels unencrypted.
For a correct use of this solution, it should be deployed in a private network where only the clients you want to use it can reach it. That way the unencrypted traffic only goes through your network and never reaches the internet. The other side of the proxy is more secure, since the traffic travels encrypted to the DNS/TLS provider.
How would you integrate that solution in a distributed, microservices-oriented and containerized architecture?
I have at least two approaches in mind, each with its trade-offs and concerns.
One of them is to deploy it as a service, with all the replicas needed behind an internal load balancer, and make it available to other services within a private network. In AWS it can be configured as the DNS server for a VPC, so that all the hosts of that network use it.
I have some concerns about the resolution of private domains, but I think it's possible to make that a feature in Pronsy.
On the other side, if you are using something like Kubernetes, you can just deploy Pronsy as a sidecar for every application in the cluster, the way Istio does.
I see two problems to solve with this approach:
- Observability. It's a complex task to collect the logs from all your sidecars in a big infrastructure. That said, there is also a pro to this approach regarding observability: it could be easier to identify which pod/application is making which request. (Also a feature opportunity for Pronsy.)
- Compute resources. One DNS proxy sidecar per pod can be overkill most of the time, and having that many replicas of the proxy may require a bigger infrastructure than necessary.
I would go deeper into the development of the bonus features I added to the project.
The logs feature can go further: it seems like a good idea to me to create a Logger implementation that pushes logs out to an external system.
For a production environment the cache could be a key feature. I would rely on an external cache system to enable scalability.
The deny/block list is a really nice-to-have here. This capability could be extended to block domains and also IPs.
Metrics exposure. This project already has logs, but we could also add some metrics endpoints, Prometheus style, to feed dashboards that quickly show how many queries are resolved successfully, how many fail and why, the most queried domains, cache metrics, blocked-domain metrics, and so on.
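As a sketch of that direction, using the official Prometheus Go client (the metric names are made up for the example):

```go
package metrics

import (
	"net/http"

	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
	"github.com/prometheus/client_golang/prometheus/promhttp"
)

// Illustrative counters; the proxy would increment them as it resolves.
var (
	queriesSolved = promauto.NewCounter(prometheus.CounterOpts{
		Name: "pronsy_queries_solved_total",
		Help: "DNS queries resolved successfully.",
	})
	queriesFailed = promauto.NewCounterVec(prometheus.CounterOpts{
		Name: "pronsy_queries_failed_total",
		Help: "DNS queries that failed, labeled by reason.",
	}, []string{"reason"})
)

// Serve exposes the metrics endpoint for Prometheus to scrape.
func Serve(addr string) error {
	http.Handle("/metrics", promhttp.Handler())
	return http.ListenAndServe(addr, nil)
}
```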
The ability to talk to more than one DNS provider. Maybe add a 'secondary' DNS as a fallback.
Regarding private domains, and in addition to the previous feature, it would be nice to be able to define rules for routing certain domains to certain providers. That way I could set up the DNS proxy to resolve all the *.mycompany.net internal domains against an internal DNS server and avoid sending them to a provider like CloudFlare or Google that won't be able to resolve them.
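A sketch of how such routing rules could look; the names and rule format are hypothetical:

```go
package routing

import "strings"

// Rule maps a domain suffix to a provider address.
type Rule struct {
	Suffix   string // e.g. ".mycompany.net"
	Provider string // e.g. "10.0.0.2:853", a hypothetical internal DNS
}

// Router picks a provider per query domain.
type Router struct {
	Rules           []Rule
	DefaultProvider string // e.g. the CloudFlare or Google endpoint
}

// ProviderFor returns the provider of the first matching rule,
// falling back to the default provider.
func (r *Router) ProviderFor(domain string) string {
	for _, rule := range r.Rules {
		if strings.HasSuffix(domain, rule.Suffix) {
			return rule.Provider
		}
	}
	return r.DefaultProvider
}
```

With a rule like `{Suffix: ".mycompany.net", Provider: "10.0.0.2:853"}`, internal queries would go to the internal DNS while everything else falls through to the default provider.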