Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

pcap-filter man page clarifications and improvements #326

Closed
cjmaynard opened this issue Oct 25, 2013 · 14 comments
Closed

pcap-filter man page clarifications and improvements #326

cjmaynard opened this issue Oct 25, 2013 · 14 comments

Comments

@cjmaynard
Copy link

The pcap-filter man page could benefit from some clarifications and improvements, such as:

  1. proto qualifiers: The man page reads, "proto qualifiers restrict the match to a particular protocol. Possible protos are: ether, fddi, tr, wlan, ip, ip6, arp, rarp, decnet, tcp and udp. " However, as noted in the discussion from http://ask.wireshark.org/questions/26350/how-to-filter-wlan-with-a-capture-filter, the semantics of those proto qualifiers differ, in particular ether, fddi, tr and wlan from the others. To quote Guy Harris, "The problem is that there are both proto qualifiers and protocol abbreviations, and some tokens, such as ip, ip6, tcp, and udp, serve both purposes". Some clarification might therefore be beneficial to users.

  2. Protocol Abbreviations: The man page reads, "tcp, udp, icmp [are] Abbreviations for: proto p where p is one of the above protocols. ". icmp6 isn't listed. Should it be? Are there others that should be listed too?

  3. sctp isn't documented anywhere.

  4. expr relop expr: The man page reads, "Proto is one of ether, fddi, tr, wlan, ppp, slip, link, ip, arp, rarp, tcp, udp, icmp, ip6 or radio, ...", but ppp, slip, link, radio [and icmp6] aren't listed previously when describing the proto qualifiers (see point 1 above), and decnet is missing from this list. It might also be nice if this was a sorted list of protos.

  5. Speaking of a sorted list, it might be nicer if all the primitives were sorted to make it easier to find/reference them.

@infrastation
Copy link
Member

Could you post a diff or open a pull request to show exact changes you suggest to make? Thank you.

@mcr mcr added this to the release-after-next milestone Apr 19, 2019
infrastation added a commit that referenced this issue Jan 4, 2022
Address point 3 of GH bug report #326: list SCTP in contexts where it
belongs, as far as the current implementation in gencode.c seems to go
(although it is not immediately clear if the implicit fragment exclusion
apllies to SCTP too).

Improve wording and formatting in the "proto" qualifier description.
Lose a stray bold decoration so it does not affect the next line.
Refer to pcap_compile(3PCAP) properly.  Update the timestamp.
@infrastation
Copy link
Member

  1. sctp isn't documented anywhere.

Now it is. @cjmaynard or anybody else, would you like to address any other items on the list?

guyharris pushed a commit that referenced this issue Feb 21, 2022
Address point 3 of GH bug report #326: list SCTP in contexts where it
belongs, as far as the current implementation in gencode.c seems to go
(although it is not immediately clear if the implicit fragment exclusion
apllies to SCTP too).

Improve wording and formatting in the "proto" qualifier description.
Lose a stray bold decoration so it does not affect the next line.
Refer to pcap_compile(3PCAP) properly.  Update the timestamp.

(cherry picked from commit 5f634cf)
dmiller-nmap pushed a commit to nmap/libpcap that referenced this issue May 10, 2022
Address point 3 of GH bug report the-tcpdump-group#326: list SCTP in contexts where it
belongs, as far as the current implementation in gencode.c seems to go
(although it is not immediately clear if the implicit fragment exclusion
apllies to SCTP too).

Improve wording and formatting in the "proto" qualifier description.
Lose a stray bold decoration so it does not affect the next line.
Refer to pcap_compile(3PCAP) properly.  Update the timestamp.

(cherry picked from commit 5f634cf)
@infrastation
Copy link
Member

infrastation commented Jun 12, 2022

After studying gen_proto_abbrev_internal() for some time I would like to suggest resolving point 2 of the original problem statement as follows:

  • pim is a shorthand for proto \pim, ah is a shorthand for proto \ah and esp is a shorthand for proto \esp.

    This way, each of these three items expands to "(IPv4 or IPv6) and protocol X" (which is the meaning of proto X) and should be added as a valid abbreviation to the man page.

  • icmp is a shorthand for ip proto 1. icmp6 is a shorthand for ip6 proto 58. Both proto 1 and proto 58 expand (by the definition of proto X) to expressions that get either IPv4 ICMP or IPv6 ICMP6 wrong, so it is quite right that there are no shorthands that expand to these two expressions.

    This way, the man page text should be fixed: it incorrectly says that icmp expands into proto \icmp, this is not (and should not be) the actual behaviour. Likewise, icmp6 is not on the list of abbreviations, so it should remain this way.

  • ip proto protocol lists icmp6 as a protocol name, which makes no sense for IPv4; ip6 proto protocol does not explain that in the list of protocols icmp makes no sense for IPv6. The list of protocols does not explain that all names are subject to a lookup using getprotobyname(), on my Linux host it means that ICMPv6 is known only as ipv6-icmp as that's how it appears in /etc/protocols.

    These two sections should be amended for clarity.

infrastation added a commit that referenced this issue Jun 12, 2022
Address point 2 and progress point 5 of GH bug report #326.

In the "ip proto protocol" section omit the "icmp6" protocol name,
explain the implicit involvement of getprotobyname() and give two
OS-specific names for IGRP.  In the "ip6 proto protocol" section give
"ipv6-icmp" as the proper protocol name (not a keyword).  In the index
operation description add "icmp6" to the list of supported protocol
layers.

Add two entries with the meaning of "icmp" and "icmp6" abbreviations.
In the list of abbreviations for "proto \protocol" remove "icmp" because
it never was and should not be the implemented behaviour; add "pim",
"ah" and "esp" because that has been the implemented behaviour since
commit 7fe3c11 in 1999.

Reformat and sort some lists alphabetically, update the timestamp.
@infrastation
Copy link
Member

Current progress: points 2 and 3 have been fully resolved, there is a bit of improvement towards points 4 and 5, point 1 still remains not reviewed. Would anybody like to progress any of the remaining work items?

@guyharris
Copy link
Member

The list of protocols does not explain that all names are subject to a lookup using getprotobyname(), on my Linux host it means that ICMPv6 is known only as ipv6-icmp as that's how it appears in /etc/protocols.

Not just Linux, probably universal for most if not all UN*Xes:

$ sw_vers
ProductName:	macOS
ProductVersion:	11.6.4
BuildVersion:	20G417
$ egrep icmp /etc/protocols
icmp	1	ICMP		# internet control message protocol
ipv6-icmp	58	IPV6-ICMP	icmp6	# ICMP for IPv6
solaris11$ uname -sr
SunOS 5.11    # That's the core OS component of Solaris 11
solaris11$ egrep icmp /etc/protocols
icmp		1	ICMP		# internet control message protocol
ipv6-icmp	58	IPv6-ICMP	# IPv6 internet control message protocol

As neither AIX nor HP-UX, to name the two surviving non-Linux/non-BSD UN*Xes, run on x86, I don't have a VM for them. I didn't bother with the *BSDs as macOS is a BSD cousin and they're probably all much the same.

And, as for our only non-UN*X platform:

C:\Users\guy>find "icmp" C:\WINDOWS\system32\drivers\etc\protocol

---------- C:\WINDOWS\SYSTEM32\DRIVERS\ETC\PROTOCOL
icmp       1     ICMP         # Internet control message protocol
ipv6-icmp  58    IPv6-ICMP    # ICMP for IPv6

(yeah, that's how you say "grep" - sort of, except that it's a string match, and you have to quote it, to let find know that it's a search string and not a path name - and /etc/protocols on Windows).

So it's going to be "ipv6-icmp" if you want "Internet Control Message Protocol (ICMPv6) for the Internet Protocol Version 6 (IPv6)", as RFC 4443 calls it. But since the RFC calls it "ICMPv6", maybe we should offer "icmpv6" as an alias for "ipv6-icmp".

@infrastation
Copy link
Member

Thank you for the additional information. "icmp6" is probably a legacy name since macOS lists it as an alias (FreeBSD does too, but NetBSD and OpenBSD don't). AIX 7.1 and 7.2 have the following:

icmp            1               ICMP                    #Internet Control Message              [RFC792]
ipv6-icmp       58              IPv6-ICMP               #ICMP for IPv6                   [RFC1883]

Re "icmpv6" or a similar addition, after looking into this part of the man page for a while it is difficult for me to tell right now whether adding it into the existing syntax will make it easier to comprehend.

@cjmaynard
Copy link
Author

Maybe it would be helpful to document "proto protocol" more similarly to how "ip proto protocol" is documented?

And then instead of explicitly listing ah, esp, pim, sctp, tcp udp as abbreviations for "proto \protocol", perhaps just list them as examples of the above syntax, because at least on my Windows system, ggp, egp, pup, hmp, xns-idp, rdp, ipv6, ipv6-route, ipv6-frag, ipv6-icmp, ipv6-nonxt, ipv6-opts and rvd would all work as well, but naturally that list will differ between platforms.
Of course if there are aliases, such as the proposed icmpv6, then those would need to be explicitly documented since they won't appear in the /etc/protocols (or C:\Windows\System32\drivers\etc\protocol) file.

@infrastation
Copy link
Member

This is the current text:

       proto protocol
              True if the packet is an IPv4 or IPv6 packet  of  protocol  type
              protocol.   Note that this primitive does not chase the protocol
              header chain.

Is your suggestion that this passage should say ip proto \protocol or ip6 proto \protocol in addition or instead of the text?

The reason for listing "ah, esp, pim, sctp, tcp, udp" is that these are not examples, but the only particular abbreviations that expand into proto \protocol. That's why the filter proto \egp is a valid syntax (if present in /etc/protocols, which is not the case on my Linux system), but egp is not. Do you observe this behaviour on your system?

@cjmaynard
Copy link
Author

No, I was suggesting that it read more like ip proto protocol does, so something like:

        proto protocol
                True if the packet is an IPv4 or IPv6 packet of protocol type 
                protocol.  Protocol can be a number or one of the names 
                recognized by getprotobyname(3) (as in e.g. `getent(1) protocols'), 
                typically from an entry in /etc/protocols, ...
                Note that this primitive does not chase the protocol header chain.

Here's the contents of the unmodified C:\Windows\System32\drivers\etc\protocol file on my Windows 10 system:

C:\>type C:\Windows\System32\drivers\etc\protocol
# Copyright (c) 1993-2006 Microsoft Corp.
#
# This file contains the Internet protocols as defined by various
# RFCs.  See http://www.iana.org/assignments/protocol-numbers
#
# Format:
#
#     [aliases...]   [#]

ip         0     IP           # Internet protocol
icmp       1     ICMP         # Internet control message protocol
ggp        3     GGP          # Gateway-gateway protocol
tcp        6     TCP          # Transmission control protocol
egp        8     EGP          # Exterior gateway protocol
pup        12    PUP          # PARC universal packet protocol
udp        17    UDP          # User datagram protocol
hmp        20    HMP          # Host monitoring protocol
xns-idp    22    XNS-IDP      # Xerox NS IDP
rdp        27    RDP          # "reliable datagram" protocol
ipv6       41    IPv6         # Internet protocol IPv6
ipv6-route 43    IPv6-Route   # Routing header for IPv6
ipv6-frag  44    IPv6-Frag    # Fragment header for IPv6
esp        50    ESP          # Encapsulating security payload
ah         51    AH           # Authentication header
ipv6-icmp  58    IPv6-ICMP    # ICMP for IPv6
ipv6-nonxt 59    IPv6-NoNxt   # No next header for IPv6
ipv6-opts  60    IPv6-Opts    # Destination options for IPv6
rvd        66    RVD          # MIT remote virtual disk

Any of those names work in place of foo as in proto \foo. (NOTE: I am using dumpcap.exe on Windows instead of tcpdump, but that shouldn't matter as the filtering is the same.)

And on my Linux Ubuntu VM, there are 55 such entries,:

grep -v "^#" /etc/protocols | grep -v "^$" | wc -l
55

... and all of them work, not just ah, esp, pim, sctp, tcp or udp. (Here I am using tcpdump to test.)

@infrastation
Copy link
Member

Thank you for the additional input. The man page includes 6 uses of "protocol" in the sense of (and involving a lookup in) /etc/protocols:

  • ip proto protocol
  • ip6 proto protocol
  • proto protocol
  • ip6 protochain protocol
  • ip protochain protocol
  • protochain protocol

The first one already explains what it means, and subsequent ones could say "protocol (as explained in "ip proto protocol" above)" or some such, would that clarify the meaning?

Considering the other point, please note that tcpdump 'sctp' works because it is a shorthand for tcpdump 'proto \sctp', which also works, but tcpdump 'rvd' or tcpdump 'hmp' should not work because there is no such abbreviation, even if the entries are in the file; however, tcpdump 'proto rvd' and tcpdump 'proto hmp' would work so long as the entries are in the file. Is this consistent with the behaviour you are observing?

@cjmaynard
Copy link
Author

Yes, that's the behaviour I'm seeing. But the documentation for proto protocol seems to imply that only ah, esp, pim, sctp, tcp or udp can be used when specifying the protocol in proto \protocol, does it not? And now that you mention it, I think ip6 proto protocol could also use some clarifications, akin to ip proto protocol.

And another observation: It seems that for ah, esp, pim, sctp, tcp and udp, a filter of "foo" is not the same as a filter of "proto \foo". The former tests for IPv6 first and IPv4 second, while the latter tests in the opposite order. Why is that? Shouldn't they be identical?

@infrastation
Copy link
Member

The text does not mean that, but it is possible to see how it can be understood this way. I guess the reference to "protocol" meaning will help to clarify that a bit more.

Commit c93c8ff very recently made foo and proto \foo identical bytecode-wise (they were not identical, but equivalent, as far as I managed to establish).

infrastation added a commit that referenced this issue Jun 19, 2022
This should progress GH #326 a little bit more.
guyharris pushed a commit that referenced this issue Jul 8, 2022
Address point 2 and progress point 5 of GH bug report #326.

In the "ip proto protocol" section omit the "icmp6" protocol name,
explain the implicit involvement of getprotobyname() and give two
OS-specific names for IGRP.  In the "ip6 proto protocol" section give
"ipv6-icmp" as the proper protocol name (not a keyword).  In the index
operation description add "icmp6" to the list of supported protocol
layers.

Add two entries with the meaning of "icmp" and "icmp6" abbreviations.
In the list of abbreviations for "proto \protocol" remove "icmp" because
it never was and should not be the implemented behaviour; add "pim",
"ah" and "esp" because that has been the implemented behaviour since
commit 7fe3c11 in 1999.

Reformat and sort some lists alphabetically, update the timestamp.

(cherry picked from commit db3b00e)
infrastation added a commit that referenced this issue Jul 8, 2022
Tell relop aliases outside of the relop set for clarity.  Add a few
missing entries to the list of protocols that support the index
operation to document the currently implemented behaviour and to
progress point 4 of GH issue #326.
@infrastation
Copy link
Member

Commit ffa313d progresses point 4. However, I do not currently see a nice way to coordinate it with point 1 (the list of proto qualifiers at the top) at this time. Perhaps saying something like indexproto [ expr : size ] for index operations would be better because not all protocol tokens support the index operation (there is a default case in gen_load_internal()).

guyharris pushed a commit that referenced this issue Jul 14, 2022
This should progress GH #326 a little bit more.

(cherry picked from commit a99eaca)
guyharris pushed a commit that referenced this issue Jul 14, 2022
Tell relop aliases outside of the relop set for clarity.  Add a few
missing entries to the list of protocols that support the index
operation to document the currently implemented behaviour and to
progress point 4 of GH issue #326.

(cherry picked from commit ffa313d)
@infrastation
Copy link
Member

Current status, in terms of the points made in the original problem statement, is as follows.

Point 1 looks addressed as far as is reasonably practicable: the man page has been updated several times to expand protocol abbreviations better and to mention more tokens the are also keywords/abbreviations and must be escaped.

Points 2-5 look fully resolved via the associated commits (often explained in the comments).

Hence closing this request as resolved, thank you everyone for your input and patience. If you spot any other space for improvement in future, please prepare the changes and open a pull request.

@infrastation infrastation self-assigned this Aug 20, 2022
tenarchits pushed a commit to tenarchits/libpcap that referenced this issue Mar 4, 2023
Address point 3 of GH bug report the-tcpdump-group#326: list SCTP in contexts where it
belongs, as far as the current implementation in gencode.c seems to go
(although it is not immediately clear if the implicit fragment exclusion
apllies to SCTP too).

Improve wording and formatting in the "proto" qualifier description.
Lose a stray bold decoration so it does not affect the next line.
Refer to pcap_compile(3PCAP) properly.  Update the timestamp.
tenarchits pushed a commit to tenarchits/libpcap that referenced this issue Mar 4, 2023
Address point 2 and progress point 5 of GH bug report the-tcpdump-group#326.

In the "ip proto protocol" section omit the "icmp6" protocol name,
explain the implicit involvement of getprotobyname() and give two
OS-specific names for IGRP.  In the "ip6 proto protocol" section give
"ipv6-icmp" as the proper protocol name (not a keyword).  In the index
operation description add "icmp6" to the list of supported protocol
layers.

Add two entries with the meaning of "icmp" and "icmp6" abbreviations.
In the list of abbreviations for "proto \protocol" remove "icmp" because
it never was and should not be the implemented behaviour; add "pim",
"ah" and "esp" because that has been the implemented behaviour since
commit 7fe3c11 in 1999.

Reformat and sort some lists alphabetically, update the timestamp.
tenarchits pushed a commit to tenarchits/libpcap that referenced this issue Mar 4, 2023
tenarchits pushed a commit to tenarchits/libpcap that referenced this issue Mar 4, 2023
Tell relop aliases outside of the relop set for clarity.  Add a few
missing entries to the list of protocols that support the index
operation to document the currently implemented behaviour and to
progress point 4 of GH issue the-tcpdump-group#326.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Development

No branches or pull requests

4 participants