Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

netty-resolver-dns "Got bad packet: bad label type" #8962

Closed
nzhenry opened this issue Mar 20, 2019 · 9 comments · Fixed by #9311
Closed

netty-resolver-dns "Got bad packet: bad label type" #8962

nzhenry opened this issue Mar 20, 2019 · 9 comments · Fixed by #9311
Milestone

Comments

@nzhenry
Copy link

nzhenry commented Mar 20, 2019

I have built a simple DNS proxy server using the Netty DNS resolver library. See the code here.

It works for most queries but not all. For example, when I try to query the A records for "www.apple.com" using nslookup or dig, it fails intermittently with this error:

305 bytes
ca 20 80 00 00 01 00 04 00 00 00 01 03 77 77 77          .............www
05 61 70 70 6c 65 03 63 6f 6d 00 00 01 00 01 03          .apple.com......
77 77 77 05 61 70 70 6c 65 03 63 6f 6d 00 00 05          www.apple.com...
00 01 00 00 00 37 00 1b 03 77 77 77 05 61 70 70          .....7...www.app
6c 65 03 63 6f 6d 07 65 64 67 65 6b 65 79 03 6e          le.com.edgekey.n
65 74 00 03 77 77 77 05 61 70 70 6c 65 03 63 6f          et..www.apple.co
6d 07 65 64 67 65 6b 65 79 03 6e 65 74 00 00 05          m.edgekey.net...
00 01 00 00 30 2b 00 2f 03 77 77 77 05 61 70 70          ....0+./.www.app
6c 65 03 63 6f 6d 07 65 64 67 65 6b 65 79 03 6e          le.com.edgekey.n
65 74 0b 67 6c 6f 62 61 6c 72 65 64 69 72 06 61          et.globalredir.a
6b 61 64 6e 73 c0 41 03 77 77 77 05 61 70 70 6c          kadns.A.www.appl
65 03 63 6f 6d 07 65 64 67 65 6b 65 79 03 6e 65          e.com.edgekey.ne
74 0b 67 6c 6f 62 61 6c 72 65 64 69 72 06 61 6b          t.globalredir.ak
61 64 6e 73 03 6e 65 74 00 00 05 00 01 00 00 0c          adns.net........
b7 00 19 05 65 36 38 35 38 05 64 73 63 65 39 0a          ....e6858.dsce9.
61 6b 61 6d 61 69 65 64 67 65 c0 41 05 65 36 38          akamaiedge.A.e68
35 38 05 64 73 63 65 39 0a 61 6b 61 6d 61 69 65          58.dsce9.akamaie
64 67 65 03 6e 65 74 00 00 01 00 01 00 00 00 11          dge.net.........
00 04 68 5f a0 7e 00 00 29 10 00 00 00 00 00 00          ..h_....).......
00                                                       .

When it fails the debug log looks like this:

DefaultDnsRawRecord(www.apple.com. 885 IN CNAME 27B)
DefaultDnsRawRecord(www.apple.com.edgekey.net. 10239 IN CNAME 47B)
DefaultDnsRawRecord(www.apple.com.edgekey.net.globalredir.akadns.net. 441 IN CNAME 25B)
DefaultDnsRawRecord(e6858.dsce9.akamaiedge.net. 9 IN A 4B)] RECEIVED: [[[id: 0xc9dafb10], 36804, /192.168.1.1:53, DatagramDnsResponse(from: /192.168.1.1:53, to: /0:0:0:0:0:0:0:0:51678, 36804, QUERY(0), NoError(0), RD RA)
DefaultDnsQuestion(www.apple.com. IN A)
DefaultDnsRawRecord(www.apple.com. 885 IN CNAME 27B)
DefaultDnsRawRecord(www.apple.com.edgekey.net. 10239 IN CNAME 47B)
DefaultDnsRawRecord(www.apple.com.edgekey.net.globalredir.akadns.net. 441 IN CNAME 25B)
DefaultDnsRawRecord(e6858.dsce9.akamaiedge.net. 9 IN A 4B)]: {}], {}

When it succeeds the log looks like this:

DefaultDnsRawRecord(www.apple.com. 881 IN CNAME 27B)
DefaultDnsRawRecord(www.apple.com.edgekey.net. 10235 IN CNAME 50B)
DefaultDnsRawRecord(www.apple.com.edgekey.net.globalredir.akadns.net. 437 IN CNAME 28B)
DefaultDnsRawRecord(e6858.dsce9.akamaiedge.net. 5 IN A 4B)
DefaultDnsRawRecord(OPT flags:0 udp:4096 0B)] RECEIVED: [[[id: 0xc9dafb10], 37086, /192.168.1.1:53, DatagramDnsResponse(from: /192.168.1.1:53, to: /0:0:0:0:0:0:0:0:51678, 37086, QUERY(0), NoError(0), RD RA)
DefaultDnsQuestion(www.apple.com. IN A)
DefaultDnsRawRecord(www.apple.com. 881 IN CNAME 27B)
DefaultDnsRawRecord(www.apple.com.edgekey.net. 10235 IN CNAME 50B)
DefaultDnsRawRecord(www.apple.com.edgekey.net.globalredir.akadns.net. 437 IN CNAME 28B)
DefaultDnsRawRecord(e6858.dsce9.akamaiedge.net. 5 IN A 4B)
DefaultDnsRawRecord(OPT flags:0 udp:4096 0B)]: {}], {}

Notice the additional OPT record when it succeeds. This appears to be the key. I just don't know how I can fix the problem when it fails.

I've discovered that if I remove the CNAME records from the response, then the query resolves with no errors. I would like to return a proper copy of the server response including the CNAME records. Any help would be greatly appreciated!

@normanmaurer
Copy link
Member

I suspect it has to do with the maximal datagram packet size. Try to use new FixedRecvByteBufAllocator(4096) when you bootstrap your server.

@nzhenry
Copy link
Author

nzhenry commented Apr 1, 2019

I tried this:

NioDatagramChannel channel = new NioDatagramChannel();
channel.config().setRecvByteBufAllocator(new FixedRecvByteBufAllocator(4096));
Bootstrap bootstrap = new Bootstrap();
bootstrap.group(eventLoopGroup)
    .channelFactory(() -> channel)
    ...

No luck. Tried 8192 as well.

@qeesung
Copy link
Contributor

qeesung commented Apr 12, 2019

I tried to reproduce the problem. This seems to be a bug in DatagramDnsResponseDecoder, if the server receives the compressed DNS response and decodes it, it will be found that if the compression pointer is included in the RADATA of ANSWER SECTION, this problem will occur.

The reason is that DatagramDnsResponseDecoder does not decompress the compression pointer contained in RADATA of ANSWER SECTION.

The ANSWER SECTION has a resource record format:

0  1  2  3  4  5  6  7  8  9  A  B  C  D  E  F 
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                                               |
/                                               /
/                      NAME                     /
|                                               |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                      TYPE                     |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                     CLASS                     |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                      TTL                      |
|                                               |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+
|                   RDLENGTH                    |
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--|
/                     RDATA                     /
/                                               /
+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+--+

For Example:

Send www.apple.com DNS query request.

Expect ANSWER SECTION:

www.apple.com.          1376    IN      CNAME   www.apple.com.edgekey.net.
www.apple.com.edgekey.net. 134  IN      CNAME   www.apple.com.edgekey.net.globalredir.akadns.net.
www.apple.com.edgekey.net.globalredir.akadns.net. 2663 IN CNAME e6858.e19.s.tl88.net.
e6858.e19.s.tl88.net.   5       IN      A       222.163.207.56

Actually decoded by Netty

www.apple.com.          1376    IN      CNAME   .www.apple.com.edgekey.net.
                                    +------------------------+
                                    |
         +--------------------------v----------------------+
         |  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f |
+---------------------------------------------------------------------------+
|00000000| 03 77 77 77 05 61 70 70 6c 65 03 63 6f 6d 07 65 |.www.apple.com.e|
|00000010| 64 67 65 6b 65 79 03 6e 65 74 00                |dgekey.net.     |
+---------------------------------------------------------------------------+

www.apple.com.edgekey.net. 134  IN      CNAME   www.apple.com.edgekey.net.globalredir.akadns.A
                                    +------------------------------+
                                    |
         +--------------------------v----------------------+
         |  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f |
+---------------------------------------------------------------------------+
|00000000| 03 77 77 77 05 61 70 70 6c 65 03 63 6f 6d 07 65 |.www.apple.com.e|
|00000010| 64 67 65 6b 65 79 03 6e 65 74 0b 67 6c 6f 62 61 |dgekey.net.globa|
|00000020| 6c 72 65 64 69 72 06 61 6b 61 64 6e 73 c0 41    |lredir.akadns.A |
+---------------------------------------------------------------------------+
                                                  +---+
                                                    |
                                                    v
                                       Undecompressed Compression Pointer
www.apple.com.edgekey.net.globalredir.akadns.net. 2663 IN CNAME     e6858.e19.s.tl88.AA
                                    +-----------------------------------------+
                                    |
         +--------------------------v----------------------+
         |  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f |
+---------------------------------------------------------------------------+
|00000000| 05 65 36 38 35 38 03 65 31 39 01 73 04 74 6c 38 |.e6858.e19.s.tl8|
|00000010| 38 c0 41                                        |8.A             |
+---------------------------------------------------------------------------+
              +---+
                |
                v
  Undecompressed Compression Pointer
e6858.e19.s.tl88.net.   5       IN      A  222.163.207.56
                                    +-------------+
                                    |
         +--------------------------v----------------------+
         |  0  1  2  3  4  5  6  7  8  9  a  b  c  d  e  f |
+---------------------------------------------------------------------------+
|00000000| da 3a 65 e5                                     |.:e.            |
+---------------------------------------------------------------------------+

When all ANSWER Records are combined in dns-proxy server and sent to the client, the original compression pointer will point to the illegal position, resulting in bad label type error.

I'll do a PR.

@normanmaurer
Copy link
Member

@qeesung can you provide a focused fix for the issue without changing the whole class hierarchy etc and breaking the API or is this not possible at all ?

@qeesung
Copy link
Contributor

qeesung commented Jun 26, 2019

@normanmaurer Okay, I'll give it a try. :-)

@normanmaurer
Copy link
Member

@qeesung thanks a lot

@qeesung
Copy link
Contributor

qeesung commented Jun 30, 2019

@normanmaurer The reason for this problem is that rdata of DNS record cannot be parsed correctly when it contains compressed pointers. Generally, compression pointers are included only when rdata type is text type, such as NS, CNAME type. In other words, rdata can be parsed correctly only if Dns record type is known in advance.

In the original implementation, DnsRecord has only three subclasses, DnsRawRecord, DnsPtrRecord and DnsOptPseudoRecord. The text type of DnsRecord is currently hold in DnsRawRecord. To solve the problem of bad label, there are two solutions:

The original implementation of method decodeRecord of class DefaultDnsRecordDecoder

protected DnsRecord decodeRecord(
String name, DnsRecordType type, int dnsClass, long timeToLive,
ByteBuf in, int offset, int length) throws Exception {
// DNS message compression means that domain names may contain "pointers" to other positions in the packet
// to build a full message. This means the indexes are meaningful and we need the ability to reference the
// indexes un-obstructed, and thus we cannot use a slice here.
// See https://www.ietf.org/rfc/rfc1035 [4.1.4. Message compression]
if (type == DnsRecordType.PTR) {
return new DefaultDnsPtrRecord(
name, dnsClass, timeToLive, decodeName0(in.duplicate().setIndex(offset, offset + length)));
}
return new DefaultDnsRawRecord(
name, type, dnsClass, timeToLive, in.retainedDuplicate().setIndex(offset, offset + length));
}

Solution1: In the decodeRecord method of class DefaultDnsRecordDecoder, a new if branch is added to decode the record of text type. This approach has minimal impact on current implementations, but it should not be a good design. If you need to decode other types of records later, you need to add a new if branch.

protected DnsRecord decodeRecord(
            String name, DnsRecordType type, int dnsClass, long timeToLive,
            ByteBuf in, int offset, int length) throws Exception {
	if (type == DnsRecordType.PTR) {
		return new DefaultDnsPtrRecord(
		                    name, dnsClass, timeToLive, decodeName0(in.duplicate().setIndex(offset, offset + length)));
	}
  	if (type == DnsRecordType.NS || type ==  DnsRecordType.CNAME /** || more types... */) {
        // decode normal text rdata or compressed pointer text rdata
        // return DnsRawRecord
    }
	return new DefaultDnsRawRecord(
	                name, type, dnsClass, timeToLive, in.retainedDuplicate().setIndex(offset, offset + length));
}

Solution2: Add an RData decoder for each type of DnsRecord, This method has been implemented in PR #9084 . If a new type of RData needs to be decoded, then only one decoder needs to be added and registered. But this will break some of the APIs and implementations

protected DnsRecord decodeRecord(
            String name, DnsRecordType type, int dnsClass, long timeToLive,
            ByteBuf in, int offset, int length) throws Exception {
	ByteBuf rData = in.duplicate().setIndex(offset, offset + length);
	// Get the corresponding RData decoder by type
	DnsRDataDecoder<? extends DnsRecord> rDataDecoder = DnsRDataCodecs.rDataDecoder(type);
	if (rDataDecoder != null) {
		return rDataDecoder.decodeRData(name, dnsClass, timeToLive, rData);
	}
	return new DefaultDnsRawRecord(name, type, dnsClass, timeToLive, rData.retain());
}

WDYT?Which way is better?

@normanmaurer
Copy link
Member

@qeesung for 4.1 I think we should do 1) for master we can check 2)

@qeesung
Copy link
Contributor

qeesung commented Jun 30, 2019

@normanmaurer Okay, I'll submit a new PR for 1)

normanmaurer pushed a commit that referenced this issue Jul 2, 2019
…rs (#9311)

Motivation:

When decoding DnsRecord, if the record contains compression pointers, and not all compression pointers are decompressed, but part of the pointers are decompressed. Then when encoding the record, the compressed pointer will point to the wrong location, resulting in bad label problem.

Modification:

Pre-decompressed record RData that may contain compression pointers.

Result:

Fixes #8962
normanmaurer pushed a commit that referenced this issue Jul 2, 2019
…rs (#9311)

Motivation:

When decoding DnsRecord, if the record contains compression pointers, and not all compression pointers are decompressed, but part of the pointers are decompressed. Then when encoding the record, the compressed pointer will point to the wrong location, resulting in bad label problem.

Modification:

Pre-decompressed record RData that may contain compression pointers.

Result:

Fixes #8962
@normanmaurer normanmaurer added this to the 4.1.38.Final milestone Jul 2, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment