-
Notifications
You must be signed in to change notification settings - Fork 829
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Packet sanitization and IP masking #615
base: master
Are you sure you want to change the base?
Conversation
As a general note, this problem is not new and similar tools already exist:
If you still find it better to suggest a new solution, the proposed changes need to be one clean commit, which explains why this specific solution is better. |
tcpdump.c
Outdated
@@ -87,6 +87,7 @@ The Regents of the University of California. All rights reserved.\n"; | |||
#include <sys/nv.h> | |||
#endif /* HAVE_CASPER */ | |||
#endif /* HAVE_CAPSICUM */ | |||
#include <arpa/inet.h> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this exist on all platforms? No.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gvanem which platforms are you concerned about?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Windows (and possibly DOS).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@guyharris DOS (i.e. Watt-32) do have <arpa/inet.h>
. But I'm more concerned about Windows off-course.
/* List of all the reserved IPv4 address spaces, per RFC5735, | ||
* ...PROVIDED IN NETWORK BYTE ORDER!! */ | ||
struct netblock specialblock[] = { | ||
{ .netip = 0x00000000, .netmask = 0x000000ff }, /* 0.0.0.0/8 */ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This C99 (?) syntax is not universally available.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For Wireshark, we currently require compilers that support these C99 features:
- flexible array members
- compound literals
- designated initializers
- "//" comments
- mixed declarations and code
- new block scopes for selection and iteration statements (that is, declaring the type in a for-loop like:
for (int i = 0; i < n; i++) ;)
- macros with a variable number of arguments (variadic macros)
- trailing comma in enum declarations
- inline functions (guaranteed only by use of
glib.h
- tcpdump handles that itself, where it may defineinline
as__inline
)
The syntax in question is the "designated initializer" syntax.
Those features are supported by all versions of GCC and Clang I've dealt with, and I think it's supported by sufficiently-recent versions of UN*X vendor C compilers. Before expanding the list of C99 features, I dug through Microsoft's documentation to see which compilers support which features; as I remember, Visual Studio 2013 (the compiler currently used on the Wireshark buildbots) support all of those. The Wireshark Developer's Guide also mentions VS 2010; I think it might support them as well.
So, are there any compilers that we still want to support and that don't support those features? If not, perhaps it's time to require compilers that support those features to build tcpdump.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@gvanem, could you clarify if it is not available because you use a compiler that does not support C99 in principle or because tcpdump runs the compiler with C99 disabled?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's designated initializers that Guy wrote, then MSVC-2013 does not have that. I'm not sure how I can disable C99. So no.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it's designated initializers that Guy wrote, then MSVC-2013 does not have that
"Does not have that" in the sense of "this page doesn't mention it" or in the sense of "Microsoft didn't implement it"? The feature appears to exist, with some bugs, in VS 2013.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the recommendation from this discussion?
Hi there! I'm currently working on the As a former network analyst, I'm certainly aware that tools exist that sort of do these anonymization/privatization and sanitization features, but there are (in my experience) some not-insignificant issues with their adoption. Our motivation for ultimately adding this to tcpdump are the following:
Finally, there's the utility of the proposed -00 option which shrinks the size of data collection. This reduction has significant impact for folks doing downstream ingest and analytics of really large network collections. (Sending 1GB to a GPU farm is way better than 10GB!) Having this functionality built into tcpdump opens it up as a really great option not just for traditional networking folks, but for people doing research in machine learning (e.g., our team) in computer networks and security. All that being said, thanks so much for your input and guidance! I'll work on better defining the description and the code issues pointed out. |
lilchurro <notifications@github.com> wrote:
2 Ability to scale. Most of our work entail large packet captures
3 Thorough sanitization. In the discussions referenced, one will see
So, if you are saying that you would typically want to run the sanitization
*as* you capture, then I get why you want it in tcpdump. That way there
is never a file that is non-sanitized.
|
Guy Harris <notifications@github.com> wrote:
which features; as I remember, Visual Studio 2013 (the compiler
currently used on the Wireshark buildbots) support all of those.
...
If this compiler is redily available, then I think that covers windows.
My only concern is whether or not someone is still trying to build for
some old HPUX or something like that. I'm happy to abandon them :-)
…--
] Never tell me the odds! | ipv6 mesh networks [
] Michael Richardson, Sandelman Software Works | network architect [
] mcr@sandelman.ca http://www.sandelman.ca/ | ruby on rails [
|
Hello again... I'm having some issues compiling on Windows. (Using Visual Studio 2017, and attempting to compile with headers and libraries as provided by WpcapSrc_4_1_3.zip.) I'm trying to compile just the master branch of tcpdump (i.e., without my changes), but am stuck with some dependency issues (among other things). It appears to me that windump version 4.10 doesn't compile on Windows as it refers to the file util.c, which doesn't exist. Also, I'm having issues using wpcapsrc 4.1.3, as the pcap.h included therein contains reference to sys/time.h -- so is there a specific version of winpcap that I should be using? The win32 readme seems vague about this. Is there some way I can get assistance in compiling the master branch on Windows? It seems like I should get that working before trying to verify that my changes can compile. |
FYI, I've added new tests to verify the anonymization and packet sanitization for TCP/UDP packet payloads, and have integrated them into TESTrun.sh. I noticed that there wasn't really a way to test for cases where If there's something else I can do to move this PR along, please let me know; otherwise, I will consider this about as done as it could be. 🙂 |
Ping... is there anything else I can do to move this PR along? |
Makefile.in
Outdated
@@ -298,7 +300,7 @@ HDR = \ | |||
tcp.h \ | |||
netdissect-stdinc.h \ | |||
timeval-operations.h \ | |||
udp.h | |||
udp.h \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is a continuation supposed to be on this line?
README.md
Outdated
@@ -1,7 +1,6 @@ | |||
# tcpdump | |||
|
|||
[![Build | |||
Status](https://travis-ci.org/the-tcpdump-group/tcpdump.png)](https://travis-ci.org/the-tcpdump-group/tcpdump) | |||
[![Build Status](https://travis-ci.org/lilchurro/tcpdump.svg?branch=send-upstream)](https://travis-ci.org/lilchurro/tcpdump) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should this reference the tcpdump travis ci instead of @lilchurro travis ci?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure about this one, actually, because it specifically tests the changes to my branch.
clean_cap_dump.c
Outdated
#include "tcp.h" | ||
#include "udp.h" | ||
|
||
#define MAXPACKET 65549 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
please add a comment about what MAXPACKET defines, (what parts of the packet does it cover?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Adding comment about it being largest IP packet size + eth header.
u_char *s; | ||
const struct ip *ip; | ||
const struct ether_header *ep = (const struct ether_header *)bp; | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can h,bp ever be Null?
} | ||
|
||
static uint32_t | ||
nd_ipv4_to_network_uint(nd_ipv4 a) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can a be Null
|
||
return 0; | ||
} | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/* comment of what function does */
|
||
void | ||
pcap_mod_and_dump(u_char *user, const struct pcap_pkthdr *h, const u_char *sp, | ||
int dlt, int no_payload_flag, int mask_ip_flag, const char *maskIP) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
can any of the incoming pointers be Null?
sf_hdr.caplen = h->caplen; | ||
sf_hdr.len = h->len; | ||
|
||
switch(dlt) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
will there be more things processed by this switch? yes -> ok
no -> probably should become an if
case DLT_EN10MB: | ||
if ((ip = get_iph_ptr(h, modp)) == NULL) | ||
break; | ||
p_len -= ETHER_HDRLEN; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
is p_len always >= ETHER_HDRLEN?
} | ||
} | ||
|
||
if (no_payload_flag > 1) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what are valid values for no_payload_flag?
c1f4c81
to
2763483
Compare
😱 Well, that rebase was a harrowing experience. 1 clean commit now tho. |
Added two new flags to tcpdump To incorporate privacy concerns in sharing network traffic among disparate organizations, we've added two (three?) new flags to tcpdump: - `--zero-tcpudp-payload` zeros out packet payload after TCP/UDP headers (in IPv4); - `--no-tcpudp-payload` truncates the packet payload after TCP/UDP headers (in IPv4); - `--mask-external-address mask_ip` masks all IPv4 addresses outside of the blocks noted in RFC5735. These currently assume DLT_EN10MB and will leave unchanged any other link layer traffic. Additionally, these flags only work when writing to a savefile.
4e02901
to
dbf8403
Compare
Is this something @lilchurro should consider spending the time to bring up to date? Would be great to get some guidance here! |
There were two discussions made in this pull request. One about the code itself, which indeed belongs here. I cannot help completing it before I finish other bits of work, other developers may be available. Another discussion, which would belong more to the tcpdump-workers mailing list, was whether it is right to incorporate the masking function into tcpdump as opposed to doing it in a separate binary. I guess the latter one could go into more detail to reach a more pronounced consensus. I, for instance, am not quite convinced yet, but I could be wrong. |
Thanks - really appreciate the reply. Understood on your concerns. I'll share that we looked at other methods (and all the tools you referenced last year) and still came to the conclusion that for anyone who wants this functionality, this is the right path. That said though, I'll start a thread in the tcpdump-workers and we'll see where it takes us. Thanks! |
Created a separate branch so I can keep upstream and workgroup README and version strings separate.