Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Critical issue with response poisoning #132

Closed
DejanPelzel opened this issue Aug 10, 2021 · 8 comments · Fixed by #146
Closed

Critical issue with response poisoning #132

DejanPelzel opened this issue Aug 10, 2021 · 8 comments · Fixed by #146
Assignees
Milestone

Comments

@DejanPelzel
Copy link

DejanPelzel commented Aug 10, 2021

It seems there is an issue going on with the responses being poisoned by old packets due to UdpClient reuse. I'm not sure what exactly causes this to happen, but it seems that the DNS server sends multiple packets for whatever reason or if we receive a broken packet of sorts I assume, the next query might receive the response from the query before itself.

  • This scales beyond multiple instances since the s_clients is static.
  • This gets resolved by setting _enableClientQueue to false.
  • Due to the nature of UDP, I think packets on the response should likely need to be validated by the ID, such as: if (response.Header.Id != request.Header.Id). Without this, we're also potentially exposing the client to external attacks.

Additionally, s_random is a singleton used between multiple threads, which can also create cross-thread issues and result in identical numbers returned. See: https://stackoverflow.com/questions/3049467/is-c-sharp-random-number-generator-thread-safe

This is a critical flaw, making the client potentially dangerous given the scale on which it is used.

Digging deeper, by modifying the DnsUdpMessageHandler's QueryAsync, we are able to read multiple queries in a single QueryAsync call.

if (response.Header.Id != request.Header.Id)
{
    Console.WriteLine("Broken response!");

    received = await udpClient.Client.ReceiveAsync(new ArraySegment<byte>(memory.Buffer), SocketFlags.None).ConfigureAwait(false);
    response = GetResponseMessage(new ArraySegment<byte>(memory.Buffer, 0, received));

    Console.WriteLine("Received another response: " + response.Header.Id);
}

The sample code to reproduce is below. Semaphores are used just to simulate tight concurrency, especially with random generation, but I'm not sure that's even necessary here to replicate. Test environment was on Ubuntu 18, not able to reproduce on Windows locally however.

private static LookupClient _dnsClient1 = new LookupClient(new LookupClientOptions(IPAddress.Parse("8.8.8.8"), IPAddress.Parse("1.1.1.1"), IPAddress.Parse("8.8.4.4"))
        {
            CacheFailedResults = true,
            FailedResultsCacheDuration = TimeSpan.FromSeconds(10),
            ThrowDnsErrors = false,
            UseCache = false,
            UseTcpFallback = false,
            ContinueOnDnsError = true,
            MaximumCacheTimeout = TimeSpan.FromSeconds(10000),
            MinimumCacheTimeout = TimeSpan.FromSeconds(10000),
        });
        private static LookupClient _dnsClient2 = new LookupClient(new LookupClientOptions(IPAddress.Parse("8.8.8.8"), IPAddress.Parse("1.1.1.1"), IPAddress.Parse("8.8.4.4"))
        {
            CacheFailedResults = true,
            FailedResultsCacheDuration = TimeSpan.FromSeconds(10),
            ThrowDnsErrors = false,
            UseCache = false,
            UseTcpFallback = false,
            ContinueOnDnsError = true,
            MaximumCacheTimeout = TimeSpan.FromSeconds(10000),
            MinimumCacheTimeout = TimeSpan.FromSeconds(10000),
        });

        static void Main(string[] args)
        {
            var semaphore = new SemaphoreSlim(0, 2);
            int iterations = 0;
            string n1 = "1";
            string n2 = "2";
            Task.Run(async () =>
            {
                while (true)
                {
                    try
                    {
                        semaphore.Wait();
                        var r = (await _dnsClient1.QueryAsync("bing.com", QueryType.A)).Answers.FirstOrDefault() as ARecord;
                        Console.WriteLine(r.Address.ToString());
                        n1 = r.Address.ToString();
                    }
                    catch(Exception e) {
                        Console.WriteLine(e.Message);
                    }
                }
            });
            Task.Run(async () =>
            {
                while (true)
                {
                    try
                    {
                        semaphore.Wait();
                        var r = (await _dnsClient2.QueryAsync("yahoo.com", QueryType.A)).Answers.FirstOrDefault() as ARecord;
                        Console.WriteLine(r.Address.ToString());
                        n2 = r.Address.ToString();
                    }
                    catch (Exception e)
                    {
                        Console.WriteLine(e.Message);
                    }
                }
            });

            while (true)
            {
                try { semaphore.Release(2); } catch { }
                if(n1 == n2)
                {
                    Console.WriteLine("Found match after " + iterations + " iterations");
                    Console.ReadLine();
                }
                iterations++;
                Thread.Sleep(10);
            }
        }
@MichaCo
Copy link
Owner

MichaCo commented Aug 12, 2021

Hi @DejanPelzel
I'll look into it when I have a bit more time. Thanks for all the examples.

I never had any issues running things on Windows, so that at least lines up with your findings it seems.
When you test on Ubuntu 18, is that WSL or an actual Ubuntu server? WSL does some strange things to the network and I was already looking into that.

I might just remove that feature from on Linux based systems and always throw away the UdpClient. Would be pretty unfortsh though because that costs a lot of performance.

@DejanPelzel
Copy link
Author

Hi,

It was Ubuntu 18.04 LTS on a physical server.
I'm not sure if it's Linux-wide, but that's where we were able to replicate it. It could be the NIC/Kernel doing funky things. The issue is, even if it happens on a very small percentage of requests, it's possible to cause very bad things, unfortunately.

@dmitryzarubin
Copy link

dmitryzarubin commented Aug 24, 2021

Hi,

I suppose we've got the same issue in production on Windows Server. We are using DnsClient to get the addresses of all instances of a service, and then we choose a random instance and call it.
So, I found in our logs that one of our services (Main) tries to call service A, but sends requests to the hosts where service B is deployed. And vice versa when the service (Main) calls service B, it sends requests to the hosts where service A is deployed. We are using UDP protocol.

@lduran-smartertools
Copy link

We're seeing a similar issue on a Windows Server. We recently updated our software to .NET 4.8 and upgraded from DnsClient 1.3.2 to 1.5.0. We also use the DnsUdpMessageHandler to perform MXLookup queries and we started seeing results from previous request show up in the responses. After running for a few hours and making hundreds of requests we'll eventually run into the problem.

We pulled 1.5.0 locally and added the fix "if (request.Header.Id != response.Header.Id)" suggested by @DejanPelzel and it seems to have resolved the issue. We've been running this fix for a couple of days now and have not received results from previous DNS queries. It looks like that IF check used to be in 1.3.2 so maybe that's why we've never ran into this before.

@MichaCo
Copy link
Owner

MichaCo commented Sep 30, 2021

The if check does not solve anything. It just compares the headers.

In 1.5 the only change was to not throw a hard error If the Id doesnt match.

Regarding an actual fix for this. I'll probably disable any reuse of sockets until I can reliably fix and reproduce that problem, which I still cannot.

@lduran-smartertools
Copy link

Throwing the Exception causes the updClient to dispose. So I'm guessing it gets recreated on the next query and why it seems to fix for us.

@DejanPelzel
Copy link
Author

DejanPelzel commented Sep 30, 2021

The if check does not solve anything. It just compares the headers.

In 1.5 the only change was to not throw a hard error If the Id doesnt match.

Regarding an actual fix for this. I'll probably disable any reuse of sockets until I can reliably fix and reproduce that problem, which I still cannot.

It does not solve the source of the issue itself, but it solves the issue of getting the wrong response since it makes sure that the ID of the response matches the request. I guess I'd call it a workaround, not a fix, but it should be in regardless in some form. Otherwise someone can send fake packets back and just feed whatever data due to the ID not being checked.

@MichaCo
Copy link
Owner

MichaCo commented Jan 26, 2022

I will remove the pooling mechansim I had for UDP traffic in 1.6
This seems to do no good, and, after some testing, seems to resolve any of those strange behaviors (at least in my test environments)
Pooling for TCP needs to stay because of port usage, but for TCP it seems save.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants