Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

dnsdist: Fix ECS addition when the OPT record is not the last one #8115

Merged
merged 2 commits into from Feb 7, 2020

Conversation

rgacogne
Copy link
Member

@rgacogne rgacogne commented Jul 22, 2019

Short description

When the OPT record in the query is not the last one, for example if it is followed by a TSIG, dnsdist would not detect it correctly and would wrongly add a new OPT record if configured to add ECS information to the incoming query.

This issue went unnoticed for quite some time and the TSIG signature will be broken anyway so I don't think we should rush this fix for 1.4.0, therefore I'm scheduling it for 1.5.0 instead.

Fixes #8098.

Checklist

I have:

  • read the CONTRIBUTING.md document
  • compiled this code
  • tested this code
  • included documentation (including possible behaviour changes)
  • documented the code
  • added or modified regression test(s)
  • added or modified unit test(s)

Copy link
Contributor

@gibson042 gibson042 left a comment

This seems to be a detour from changes that are really needed—it's never necessary to copy the entire packet, only to move the part of it following the position for new content. Adopting inline suggestions would both eliminate slowRewriteQueryWithExistingEDNS and fix bugs like #8321, and also support a more comprehensible flow for handleEDNSClientSubnet:

int res = getEDNSOptionsStart(packet, consumed, *len, &optRDPosition, &remaining);
if (res != 0) {
  /* no preexisting OPT; construct and add one */
  return addEDNSWithECS(packet, packetSize, len, newECSOption, ednsAdded, ecsAdded, preserveTrailingData);
}
res = getEDNSOption(reinterpret_cast<char*>(optRDLen), remaining, EDNSOptionCode::ECS, &ecsOptionStart, &ecsOptionSize);
if (res != 0) {
  /* no preexisting ECS option; add one to the OPT */
  return addECSToExistingOPT(packet, packetSize, len, newECSOption, optRDLen, ecsAdded);
}
if (overrideExisting) {
  /* replace a preexisting ECS option */
  return replaceEDNSClientSubnetOption(packet, packetSize, len, ecsOptionStart, ecsOptionSize, optRDLen, newECSOption);
}
return true;

It would also be possible for a future refactoring to eliminate the various ECS-specific functions altogether by abstracting the general pattern (find prexisting content or the position for new content, generate new content, verify sufficient capacity, move everything following the position as necessary, copy new content into place).

@@ -348,12 +540,12 @@ static bool addECSToExistingOPT(char* const packet, size_t const packetSize, uin

memcpy(packet + *len, newECSOption.c_str(), newECSOptionSize);
Copy link
Contributor

@gibson042 gibson042 Sep 22, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
memcpy(packet + *len, newECSOption.c_str(), newECSOptionSize);
/* make room for the new option by moving all data after the OPT */
unsigned char* oldSuffix = optRDLen + 2 + newRDLen - newECSOptionSize;
assert(oldSuffix < packet);
memmove(oldSuffix + newECSOptionSize, oldSuffix, *len - (oldSuffix - packet));
memcpy(oldSuffix, newECSOption.c_str(), newECSOption.size());
/* TODO: update compression pointers in trailing records that reference offsets >= oldSuffix */
/* it's rare for any record to follow OPT and rarer still to reference a name appearing so deep, */
/* but we shouldn't leave potentially exploitable holes */

And similar changes should be made in replaceEDNSClientSubnetOption:

     if (dataBehindSize > 0) {
+      /* resize the old ECS option by moving all data after it to a position after the new end */
-      memmove(oldEcsOptionStart, oldEcsOptionStart + oldEcsOptionSize, dataBehindSize);
+      memmove(oldEcsOptionStart + newECSOption.size(), oldEcsOptionStart + oldEcsOptionSize, dataBehindSize);
+      /* TODO: update compression pointers in trailing records that reference offsets >= oldEcsOptionStart. */
     }
-    memcpy(oldEcsOptionStart + dataBehindSize, newECSOption.c_str(), newECSOption.size());
+    memcpy(oldEcsOptionStart, newECSOption.c_str(), newECSOption.size());

@rgacogne
Copy link
Member Author

@rgacogne rgacogne commented Sep 22, 2019

You are assuming no compression, which I don't think will hold.

@gibson042
Copy link
Contributor

@gibson042 gibson042 commented Sep 23, 2019

No, I explicitly mention fixing up compression pointers but have just not provided sample code for doing so ("TODO: update compression pointers in trailing records that reference offsets >= oldSuffix"). But even without that, adopting my suggestions would still be an improvement—the currently proposed changes of this PR corrupt the DNS message whenever anything follows the insertion point for new data; my suggestions unnecessarily corrupt the DNS message only when the inserted/altered OPT is not the last record and a compression pointer in a later record references an offset that falls on or after the insertion point. And as I also state in the TODO, "it's rare for any record to follow OPT and rarer still to reference a name appearing so deep, but we shouldn't leave potentially exploitable holes".

@rgacogne
Copy link
Member Author

@rgacogne rgacogne commented Sep 23, 2019

I do not understand what you mean, the goal of this PR is to deal with cases where the OPT record is not the last one. Why are you stating that "the currently proposed changes of this PR corrupt the DNS message whenever anything follows the insertion point for new data"?

Most of the code in this PR is here because we don't want to send garbage when compression pointers are in use after the OPT record, so yes, I'm a bit worried about a proposal to remove that code while stating that it does not handle that case.

@gibson042
Copy link
Contributor

@gibson042 gibson042 commented Sep 24, 2019

You misunderstand... I wasn't saying to skip the pointer rewriting, I was stubbing that work out in my short suggestions because the code is substantial. However, creating a copy of the packet with slowRewriteQueryWithExistingEDNS is still overkill when all that's needed is to shift and preserve the existing content as-is except for pointers that reference offsets at or after the insertion point (which necessarily follow the insertion point)—and such pointers should either get arithmetically adjusted by the amount that the content after the insertion point shifted OR expanded because that shift pushed the compression reference out of bounds. Consider:

size_t nextNameOffset = sizeof(dnsheader) + consumed + 4;
unsigned int nameLen = 0;
uint16_t remaining, rtype, rclass, rdlen;
try {
  /* skip over the question/answer/authority sections, then try to find an OPT RR */
  for(remaining = ntohs(dh->qdcount) - 1; remaining-- > 0; ) {
    DNSName q(packet, packetSize, nextNameOffset, true, &rtype, &rclass, &nameLen);
    nextNameOffset += (size_t)nameLen + 4
  }
  for(remaining = ntohs(dh->ancount) + ntohs(dh->nscount); remaining-- > 0; ) {
    DNSName rr(packet, packetSize, nextNameOffset, true, &rtype, &rclass, &nameLen);
    nextNameOffset += (size_t)nameLen + 8;
    rdlen = (packet[nextNameOffset] * 256) + packet[nextNameOffset + 1];
    nextNameOffset += 2 + (size_t)rdlen;
  }
  for(remaining = ntohs(dh->arcount); remaining-- > 0; ) {
    DNSName rr(packet, packetSize, nextNameOffset, true, &rtype, &rclass, &nameLen);
    nextNameOffset += (size_t)nameLen + 8;
    rdlen = (packet[nextNameOffset] * 256) + packet[nextNameOffset + 1];
    nextNameOffset += 2 + (size_t)rdlen;
    if (rtype == QType::OPT && rclass == QClass::IN) {
      size_t optRdlenOffset = nextNameOffset - (size_t)rdlen - 2;
      /* found an OPT; replace or insert ECS and update disrupted compression pointers */
      return replaceOrInsertEDNSOption(packet, packetSize, len,
          optRdlenOffset, remaining, nextNameOffset,
          EDNSOptionCode::ECS, newECSOption, overrideExisting, ecsAdded);
    }
  }
}
catch(const std::range_error& e) {
  return false;
}

/* no OPT; add one as the last RR and don't worry about compression pointers */
return addEDNSWithOption(packet, packetSize, len, newECSOption, ednsAdded, ecsAdded);
bool replaceOrInsertEDNSOption(char* const packet, const size_t packetSize, uint16_t* const len,
    size_t optRdlenOffset, uint16_t followingRRs, size_t nextRROffset,
    uint16_t optionCode, const string& newOption,
    bool overrideExisting, bool* const optionAdded)
{
  /* look for a preexisting option */
  unsigned char* optRdlenPos = reinterpret_cast<unsigned char*>(packet) + optRdlenOffset;
  uint16_t optRdlen = (packet[optRdlenPos] * 256) + packet[optRdlenPos + 1];
  char * optionStart = nullptr;
  size_t oldOptionSize = 0;
  int res = getEDNSOption(reinterpret_cast<char*>(optRdlenPos), len - optRdlenOffset,
      optionCode, &optionStart, &oldOptionSize);
  if (res != 0) {
    /* no preexisting option; start after other options and keep oldOptionSize 0 */
    optionStart = optRdlenPos + 2 + optRdlen;
  } else if (!overrideExisting) {
    /* not allowed to override */
    return true;
  }

  /* overwrite in place, if possible */
  int shift = newOption.size() - oldOptionSize;
  if (shift == 0) {
    memcpy(optionStart, newOption.c_str(), newOption.size());
    return true;
  }

  /* shift the old suffix and insert the new option */
  int remainingCapacity = packetSize - *len;
  remainingCapacity -= shift;
  if (remainingCapacity < 0) {
    return false;
  }
  unsigned char* oldSuffix = optionStart + oldOptionSize;
  assert(oldSuffix > packet);
  memmove(optionStart + newOption.size(), oldSuffix, *len - (oldSuffix - packet));
  memcpy(optionStart, newOption.c_str(), newOption.size());
  if (oldOptionSize != 0) {
    *ecsAdded = true;
    optRdlen++;
    packet[optRdlenPos] = optRdlen / 256;
    packet[optRdlenPos + 1] = optRdlen % 256;
  }

  /* update compression pointers as necessary */
  if (followingRRs > 0) {
    res = adjustCompressionPointersFrom(packet, packetSize, len,
        oldSuffix - packet, shift, followingRRs, nextRROffset + shift);
    if (res != 0) {
      return false;
    }
  }
}
bool adjustCompressionPointersFrom(char* const packet, const size_t packetSize, uint16_t* const len,
    size_t movedOffset, int adjustment, uint16_t followingRRs, size_t nextRROffset)
{
  try {
    unsigned int nameLen = 0;
    uint16_t rtype, rclass, rdlen;
    const unsigned char* pos;
    unsigned char labelLen;
    while(followingRRs-- > 0) {
      /* read the owner name to verify that it's good, then rescan it for compression pointers */
      DNSName rr(packet, packetSize, nextRROffset, true, &rtype, &rclass, &nameLen);
      pos = packet + nextRROffset;
      nextRROffset += (size_t)nameLen + 8;
      rdlen = (packet[nextRROffset] * 256) + packet[nextRROffset + 1];
      nextRROffset += 2 + (size_t)rdlen;
      for(labelLen = *pos; labelLen != 0; labelLen = *pos) {
        if (labelLen < 0xc0) {
          /* normal label; advance to the next */
          pos += labelLen;
          continue;
        }
        labelLen &= (~0xc0);
        int target = (labelLen << 8) + *(const unsigned char*)(pos + 1);
        if (target < movedOffset) {
          /* unaffected compression pointer; stop scanning labels */
          break;
        }
        target += shift;
        if ((target & 0xc000) != 0) {
          /* newly out-of-bounds compression pointer */
          /* TODO if possible, expand, move following contents, and adjust nextRROffset accordingly */
          return false;
        }
        /* in-place adjustable compression pointer */
        target |= 0xc000;
        *pos = (unsigned char)(target >> 8);
        *(pos + 1) = (unsigned char)(target & 0xff);
        break;
      }
      /* TODO also adjust compression pointers inside well-known RDATA per RFC 3597 */
    }
  }
  catch(const std::range_error& e) {
    return false;
  }
}

P.S. There are problems in both getEDNSOptionsStart (it returns ENOENT unless the only record in the entire message is an additional-section OPT) and DNSPacketWriter::xfrBlob (it ignores compression pointers in RDATA, contrary to https://tools.ietf.org/html/rfc3597#section-4 ). So that would seem to diminish your hard-line stance of fixing compression pointers in post-OPT RRs rather than just shifting them to make room for new contents. Regardless, should I open issues for those bugs?

@rgacogne
Copy link
Member Author

@rgacogne rgacogne commented Sep 24, 2019

It looks like you are misunderstanding our goals in dnsdist. We care about correctness first, always. Moving memory around and dealing with raw pointers is always error-prone, as experience as proven time and time again. In some cases we care enough about the performance of the fast path that we are willing to take some risks if we think that it is worth it.
That's true for the general case where there is no record in any section except the question, or if there is only an OPT RR. But I am not interested in trying to fix compression pointers manually for the few queries that are not covered by that case. XPF is much more suited to that case anyway.

As for the case of preserving trailing data, which I understand is your main concern, I couldn't really care less. We debated adding support for manipulating trailing data in dnsdist for a while, and I decided to merge the proposal against the opinion of others in the community because I preferred to see you use the mainline dnsdist instead of a patched version, but I can't say I have been thrilled with the results so far. I'm actually seriously considering removing it to simplify the code.

There are problems in both getEDNSOptionsStart (it returns ENOENT unless the only record in the entire message is an additional-section OPT) and DNSPacketWriter::xfrBlob (it ignores compression pointers in RDATA, contrary to https://tools.ietf.org/html/rfc3597#section-4 ).

Unless I'm mistaken, getEDNSOptionsStart() returning ENOENT unless the only record in the entire message is an additional-section OPT should be fine in this PR since other cases will be handled by slowRewriteQueryWithExistingEDNS() or do you see another possibility that I missed?

As for DNSPacketWriter::xfrBlob() ignoring compression pointers in RDATA this is a known issue, and there is no way for us to fix it in dnsdist since we don't want to parse records in there. That's another reason why we feel that XPF is much more suited than the (ab)use of EDNS Client Subnet to pass the initial source address to the backend.

So that would seem to diminish your hard-line stance of fixing compression pointers in post-OPT RRs rather than just shifting them to make room for new contents

No.

@gibson042
Copy link
Contributor

@gibson042 gibson042 commented Sep 25, 2019

There are problems in both getEDNSOptionsStart (it returns ENOENT unless the only record in the entire message is an additional-section OPT) and DNSPacketWriter::xfrBlob (it ignores compression pointers in RDATA, contrary to https://tools.ietf.org/html/rfc3597#section-4 ).

Unless I'm mistaken, getEDNSOptionsStart() returning ENOENT unless the only record in the entire message is an additional-section OPT should be fine in this PR since other cases will be handled by slowRewriteQueryWithExistingEDNS() or do you see another possibility that I missed?

A DNS query can have RRs in the answer and/or authority sections (e.g., for DNS UPDATE, or as attempted exploits, or even just from bugs). When processing such a message whose additional section is a single OPT (i.e., ARCOUNT 1), the handleEDNSClientSubnet of this PR will skip the slow path, get a nonzero ENOENT from getEDNSOptionsStart, and invoke addEDNSWithECS, which will add a second OPT. parseEDNSOptions behavior is even worse—it proceeds as if there were no OPT. And addEDNSToQueryTurnedResponse leaves it untouched rather than removing/replacing it as designed.

As for the rest, I understand. We're actually not that far apart, and the intentional DNSPacketWriter::xfrBlob limitation does make sense (and further, you really could ignore compression pointers in post-OPT owner names for the same reason—if they were actually pointing at or after the OPT, then the input was likely already bad anyway). I appreciate how you've worked with me and hope you'll continue to do so.

If you take a fresh look at #8115 (comment) , perhaps ignoring the gratuitous adjustCompressionPointersFrom, I think you'll find that the resulting code covers relevant post-OPT RR cases that are the prime focus of this PR plus trailing data cases like #8321 that it currently misses, and gets rid of some ECS-specific clutter in favor of a much more straightforward and #3688-generalization-ready replaceOrInsertEDNSOption, and does so by shifting the tail end of a packet rather than incurring the performance penalty of copying the whole thing. handleEDNSClientSubnet is currently confusing and reaches out to four distinct ECS-aware functions, but it could be as simple as

int res = scanPacketForOpt(packet, packetSize, len,
    &optRDLenPosition, &unscannedRRCount, &nextRRPosition);
if (res != 0) {
  /* invalid input detected */
  return false;
}
if (optRDLenPosition != 0) {
  /* valid OPT found */
  return replaceOrInsertEDNSOption(packet, packetSize, len,
          optRDLenPosition, unscannedRRCount,
          EDNSOptionCode::ECS, newECSOption, overrideExisting, &ecsAdded);
}
return addEDNSWithOption(packet, packetSize, len,
    nextRRPosition, newECSOption, &ednsAdded, &ecsAdded);

Anyway, I think I've done enough hijacking here. Thanks again, and please do consider the above. I truly do appreciate the software, and come here with a desire to help it succeed.

@rgacogne rgacogne force-pushed the dnsdist-ecs-before-tsig branch from e646df0 to f8a01c4 Compare Oct 7, 2019
@rgacogne rgacogne merged commit ebb46ba into PowerDNS:master Feb 7, 2020
28 checks passed
@rgacogne rgacogne deleted the dnsdist-ecs-before-tsig branch Feb 7, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants