=============================
p0f v3: passive fingerprinter
=============================
http://lcamtuf.coredump.cx/p0f3.shtml
Copyright (C) 2012 by Michal Zalewski <lcamtuf@coredump.cx>
---------------
1. What's this?
---------------
P0f is a tool that utilizes an array of sophisticated, purely passive traffic
fingerprinting mechanisms to identify the players behind any incidental TCP/IP
communications (often as little as a single normal SYN) without interfering in
any way.
Some of its capabilities include:
- Highly scalable and extremely fast identification of the operating system
and software on both endpoints of a vanilla TCP connection - especially in
settings where NMap probes are blocked, too slow, unreliable, or would
simply set off alarms.
- Measurement of system uptime and network hookup, distance (including
topology behind NAT or packet filters), and so on.
- Automated detection of connection sharing / NAT, load balancing, and
application-level proxying setups.
- Detection of dishonest clients / servers that forge declarative statements
such as X-Mailer or User-Agent.
The tool can be operated in the foreground or as a daemon, and offers a simple
real-time API for third-party components that wish to obtain additional
information about the actors they are talking to.
Common uses for p0f include reconnaissance during penetration tests; routine
network monitoring; detection of unauthorized network interconnects in corporate
environments; providing signals for abuse-prevention tools; and miscellaneous
forensics.
A snippet of typical p0f output may look like this:
.-[ 1.2.3.4/1524 -> 4.3.2.1/80 (syn) ]-
|
| client = 1.2.3.4
| os = Windows XP
| dist = 8
| params = none
| raw_sig = 4:120+8:0:1452:65535,0:mss,nop,nop,sok:df,id+:0
|
`----
.-[ 1.2.3.4/1524 -> 4.3.2.1/80 (syn+ack) ]-
|
| server = 4.3.2.1
| os = Linux 3.x
| dist = 0
| params = none
| raw_sig = 4:64+0:0:1460:mss*10,0:mss,nop,nop,sok:df:0
|
`----
.-[ 1.2.3.4/1524 -> 4.3.2.1/80 (mtu) ]-
|
| client = 1.2.3.4
| link = DSL
| raw_mtu = 1492
|
`----
.-[ 1.2.3.4/1524 -> 4.3.2.1/80 (uptime) ]-
|
| client = 1.2.3.4
| uptime = 0 days 11 hrs 16 min (modulo 198 days)
| raw_freq = 250.00 Hz
|
`----
A live demonstration can be seen here:
http://lcamtuf.coredump.cx/p0f3/
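The "(modulo 198 days)" remark in the uptime sample above can be reproduced
with a bit of arithmetic. The sketch below is an illustration only, not p0f's
own code: it assumes the TCP timestamp is a 32-bit counter ticking at the
reported raw_freq, so any uptime estimate wraps once the counter overflows.
The function name is made up for this example.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def uptime_wrap_days(freq_hz: float) -> int:
    """Whole days before a 32-bit counter ticking at freq_hz wraps around."""
    # Assumption: the counter spans 0..0xFFFFFFFF and never resets early.
    return int(0xFFFFFFFF // (freq_hz * SECONDS_PER_DAY))


print(uptime_wrap_days(250.0))   # 198, matching the sample output
print(uptime_wrap_days(1000.0))  # 49
```

A faster timestamp clock therefore shortens the interval over which uptime
can be reported unambiguously.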
--------------------
2. How does it work?
--------------------
A vast majority of metrics used by p0f were invented specifically for this tool,
and include data extracted from IPv4 and IPv6 headers, TCP headers, the dynamics
of the TCP handshake, and the contents of application-level payloads.
For TCP/IP, the tool fingerprints the client-originating SYN packet and the
first SYN+ACK response from the server, paying attention to factors such as the
ordering of TCP options, the relation between maximum segment size and window
size, the progression of TCP timestamps, and the state of about a dozen possible
implementation quirks (e.g. non-zero values in "must be zero" fields).
The metrics used for application-level traffic vary from one module to another;
where possible, the tool relies on signals such as the ordering or syntax of
HTTP headers or SMTP commands, rather than any declarative statements such as
User-Agent. Application-level fingerprinting modules currently support HTTP.
Before the tool leaves "beta", I want to add SMTP and FTP. Other protocols,
such as POP3, IMAP, SSH, and SSL, may follow.
The list of all the measured parameters is reviewed in section 5 later on.
Some of the analysis also happens on a higher level: inconsistencies in the
data collected from various sources, or in the data from the same source
obtained over time, may be indicative of address translation, proxying, or
just plain trickery. For example, a system where TCP timestamps jump back
and forth, or where TTLs and MTUs change subtly, is probably a NAT device.
-------------------------------
3. How do I compile and use it?
-------------------------------
To compile p0f, try running './build.sh'; if that fails, you will probably be
given some hints about the likely cause. If the hints are useless, send me a
mean-spirited mail.
It is also possible to build a debug binary ('./build.sh debug'), in which case,
verbose packet parsing and signature matching information will be written to
stderr. This is useful when troubleshooting problems, but that's about it.
The tool should compile cleanly under any reasonably new version of Linux,
FreeBSD, OpenBSD, MacOS X, and so forth. You can also build it on Windows using
cygwin and winpcap. I have not tested it on all possible varieties of un*x, but
if there are issues, they should be fairly superficial.
Once you have the binary compiled, you should be aware of the following
command-line options:
-f fname - reads fingerprint database (p0f.fp) from the specified location.
See section 5 for more information about the contents of this
file.
The default location is ./p0f.fp. If you want to install p0f, you
may want to change FP_FILE in config.h to /etc/p0f.fp.
-i iface - asks p0f to listen on a specific network interface. On un*x, you
should reference the interface by name (e.g., eth0). On Windows,
you can use adapter index instead (0, 1, 2...).
Multiple -i parameters are not supported; you need to run
separate instances of p0f for that. On Linux, you can specify
'any' to access a pseudo-device that combines the traffic on
all other interfaces; the only limitation is that libpcap will
not recognize VLAN-tagged frames in this mode, which may be
an issue in some of the more exotic setups.
If you do not specify an interface, libpcap will probably pick
the first working interface in your system.
-L - lists all available network interfaces, then quits. Particularly
useful on Windows, where the system-generated interface names
are impossible to memorize.
-r fname - instead of listening for live traffic, reads pcap captures from
the specified file. The data can be collected with tcpdump or any
other compatible tool. Make sure that snapshot length (-s
option in tcpdump) is large enough not to truncate packets; the
default may be too small.
As with -i, only one -r option can be specified at any given
time.
-o fname - appends grep-friendly log data to the specified file. The log
contains all observations made by p0f about every matching
connection, and may grow large; plan accordingly.
Only one instance of p0f should be writing to a particular file
at any given time; where supported, advisory locking is used to
avoid problems.
-s fname - listens for API queries on the specified filesystem socket. This
allows other programs to ask p0f about its current thoughts about
a particular host. More information about the API protocol can be
found in section 4 below.
Only one instance of p0f can be listening on a particular socket
at any given time. The mode is also incompatible with -r.
-d - runs p0f in daemon mode: the program will fork into background
and continue writing to the specified log file or API socket. It
will continue running until killed, until the listening interface
is shut down, or until some other fatal error is encountered.
This mode requires either -o or -s to be specified.
To continue capturing p0f debug output and error messages (but
not signatures), redirect stderr to another non-TTY destination,
e.g.:
./p0f -o /var/log/p0f.log -d 2>>/var/log/p0f.error
Note that if -d is specified and stderr points to a TTY, error
messages will be lost.
-u user - causes p0f to drop privileges, switching to the specified user
and chroot()ing itself to said user's home directory.
This mode is *highly* advisable (but not required) on un*x
systems, especially in daemon mode. See section 7 for more info.
More arcane settings (you probably don't need to touch these):
-p - puts the interface specified with -i in promiscuous mode. If
supported by the firmware, the card will also process frames not
addressed to it.
-S num - sets the maximum number of simultaneous API connections. The
default is 20; the upper cap is 100.
-m c,h - sets the maximum number of connections (c) and hosts (h) to be
tracked at the same time (default: c = 1,000, h = 10,000). Once
the limit is reached, the oldest 10% of entries get pruned to make
room for new data.
This setting effectively controls the memory footprint of p0f.
The cost of tracking a single host is under 400 bytes; active
connections have a worst-case footprint of about 18 kB. High
limits have some CPU impact, too, by the virtue of complicating
data lookups in the cache.
NOTE: P0f tracks connections only until the handshake is done,
and if protocol-level fingerprinting is possible, until the first
few kilobytes of data have been exchanged. This means that
most connections are dropped from the cache in under 5 seconds;
consequently, the 'c' variable can be much lower than the real
number of parallel connections happening on the wire.
-t c,h - sets the timeout for collecting signatures for any connection
(c); and for purging idle hosts from in-memory cache (h). The
first parameter is given in seconds, and defaults to 30 s; the
second one is in minutes, and defaults to 120 min.
The first value must be just high enough to reliably capture
SYN, SYN+ACK, and the initial few kB of traffic. Low-performance
sites may want to increase it slightly.
The second value governs for how long API queries about a
previously seen host can be made; and what's the maximum interval
between signatures to still trigger NAT detection and so on.
Raising it is usually not advisable; lowering it to 5-10 minutes
may make sense for high-traffic servers, where it is possible to
see several unrelated visitors subsequently obtaining the same
dynamic IP from their ISP.
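The memory note under -m above can be turned into a rough upper bound. This
is a back-of-the-envelope sketch, not a measurement: it takes the "under 400
bytes" and "about 18 kB" figures at face value, assumes 1 kB = 1024 bytes,
and uses a made-up function name.

```python
HOST_BYTES = 400        # per tracked host ("under 400 bytes")
CONN_BYTES = 18 * 1024  # per active connection, worst case ("about 18 kB")


def worst_case_bytes(max_conn: int = 1000, max_hosts: int = 10000) -> int:
    """Rough ceiling on cache memory for a given '-m c,h' setting."""
    return max_conn * CONN_BYTES + max_hosts * HOST_BYTES


# Defaults (c = 1,000, h = 10,000): about 21.4 MiB.
print(round(worst_case_bytes() / 2**20, 1))

# A hypothetical busier site running with '-m 5000,100000'.
print(round(worst_case_bytes(5000, 100000) / 2**20, 1))
```

With the defaults, connections dominate the bound even though ten times more
hosts are tracked, which is why 'c' is the first value to reconsider when
memory is tight.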
Well, that's about it. You probably need to run the tool as root. Some of the
most common use cases:
# ./p0f -i eth0
# ./p0f -i eth0 -d -u p0f-user -o /var/log/p0f.log
# ./p0f -r some_capture.cap
The greppable log format (-o) uses pipe ('|') as a delimiter, with name=value
pairs describing the signature in a manner very similar to the pretty-printed
output generated on stdout:
[2012/01/04 10:26:14] mod=mtu|cli=1.2.3.4/1234|srv=4.3.2.1/80|subj=cli|link=DSL|raw_mtu=1492
The 'mod' parameter identifies the subsystem that generated the entry; the
'cli' and 'srv' parameters always describe the direction in which the TCP
session is established; and 'subj' describes which of these two parties is
actually being fingerprinted.
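A log line in this format is easy to take apart. The following is a minimal
sketch, assuming that values never contain a literal '|' and that the
timestamp always has the shape shown above; parse_log_line and split_endpoint
are hypothetical helper names, not part of p0f.

```python
from datetime import datetime


def parse_log_line(line: str) -> dict:
    """Split one -o log line into a dict of its name=value pairs."""
    stamp, _, rest = line.strip().partition("] ")
    fields = {"time": datetime.strptime(stamp.lstrip("["), "%Y/%m/%d %H:%M:%S")}
    for pair in rest.split("|"):
        # Split on the first '=' only, since values may contain '=' themselves.
        name, _, value = pair.partition("=")
        fields[name] = value
    return fields


def split_endpoint(endpoint: str) -> tuple:
    """Turn 'address/port' into (address, port)."""
    address, _, port = endpoint.rpartition("/")
    return address, int(port)


entry = parse_log_line(
    "[2012/01/04 10:26:14] mod=mtu|cli=1.2.3.4/1234|srv=4.3.2.1/80"
    "|subj=cli|link=DSL|raw_mtu=1492"
)
# 'subj' names the party being fingerprinted, so use it to pick the endpoint.
print(entry["mod"], split_endpoint(entry[entry["subj"]]), entry["raw_mtu"])
```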
Command-line options may be followed by a single parameter containing a
pcap-style traffic filtering rule. This allows you to reject some of the less
interesting packets for performance or privacy reasons. Simple examples include:
'dst net 10.0.0.0/8 and port 80'
'not src host 10.1.2.3'
'port 22 or port 443'
You can read more about the supported syntax by doing 'man pcap-filter'; if
that fails, try this URL:
http://www.manpagez.com/man/7/pcap-filter/
Filters work both for online capture (-i) and for previously collected data
produced by any other tool (-r).
-------------
4. API access
-------------
The API allows other applications running on the same system to get p0f's
current opinion about a particular host. This is useful for integrating it with
spam filters, web apps, and so on.
Clients are welcome to connect to the unix socket specified with -s using the
SOCK_STREAM protocol, and may issue any number of fixed-length queries. The
queries will be answered in the order they are received.
Note that there is no response caching, nor any software limits in place on
p0f's end, so it is your responsibility to write reasonably well-behaved clients.
Queries have exactly 21 bytes. The format is:
- Magic dword (0x50304601), in native endian of the platform.
- Address type byte: 4 for IPv4, 6 for IPv6.
- 16 bytes of address data, network endian. IPv4 addresses should be
aligned to the left.
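The query layout above maps directly onto a packed structure. Here is a
minimal sketch of building one; build_query is a hypothetical helper, and
the '=' prefix asks for native byte order without padding, which is what the
"native endian" wording suggests.

```python
import socket
import struct

P0F_QUERY_MAGIC = 0x50304601


def build_query(address: str) -> bytes:
    """Build the 21-byte API query for an IPv4 or IPv6 address string."""
    try:
        addr_type, raw = 4, socket.inet_pton(socket.AF_INET, address)
    except OSError:
        addr_type, raw = 6, socket.inet_pton(socket.AF_INET6, address)
    # '16s' pads short input with NUL bytes on the right, which leaves
    # IPv4 addresses aligned to the left as required.
    return struct.pack("=IB16s", P0F_QUERY_MAGIC, addr_type, raw)


query = build_query("1.2.3.4")
print(len(query), query.hex())
```

The resulting bytes would then be written to the socket given with -s.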
To such a query, p0f responds with:
- Another magic dword (0x50304602), native endian.
- Status dword: 0x00 for 'bad query', 0x10 for 'OK', and 0x20 for 'no match'.
- Host information, valid only if status is 'OK' (byte width in square
brackets):
[4] first_seen - unix time (seconds) of first observation of the host.
[4] last_seen - unix time (seconds) of most recent traffic.
[4] total_conn - total number of connections seen.
[4] uptime_min - calculated system uptime, in minutes. Zero if not known.
[4] up_mod_days - uptime wrap-around interval, in days.
[4] last_nat - time of the most recent detection of IP sharing (NAT,
load balancing, proxying). Zero if never detected.
[4] last_chg - time of the most recent individual OS mismatch (e.g.,
due to multiboot or IP reuse).
[2] distance - system distance (derived from TTL; -1 if no data).
[1] bad_sw - p0f thinks the User-Agent or Server strings aren't
accurate. The value of 1 means OS difference (possibly
due to proxying), while 2 means an outright mismatch.
NOTE: If User-Agent is not present at all, this value
stays at 0.
[1] os_match_q - OS match quality: 0 for a normal match; 1 for fuzzy
(e.g., TTL or DF difference); 2 for a generic signature;
and 3 for both.
[32] os_name - NUL-terminated name of the most recent positively matched
OS. If OS not known, os_name[0] is NUL.
NOTE: If the host is first seen using a known system and
then switches to an unknown one, this field is not
reset.
[32] os_flavor - OS version. May be empty if no data.
[32] http_name - most recent positively identified HTTP application
(e.g. 'Firefox').
[32] http_flavor - version of the HTTP application, if any.
[32] link_type - network link type, if recognized.
[32] language - system language, if recognized.
A simple reference implementation of an API client is provided in p0f-client.c.
Implementations in C / C++ may reuse api.h from p0f source code, too.
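For clients in other languages, the response can be decoded from the field
list above. The sketch below assumes that the fields are packed back to back
with no padding, in the order listed, and that 'distance' is a signed 16-bit
value (it can be -1); api.h remains the authoritative definition. All names
in the code are made up for this example.

```python
import struct
from typing import Optional

P0F_RESP_MAGIC = 0x50304602
STATUS_BADQUERY, STATUS_OK, STATUS_NOMATCH = 0x00, 0x10, 0x20

# magic, status, seven 4-byte counters, distance, bad_sw, os_match_q, 6 strings
RESP_FORMAT = "=II7IhBB32s32s32s32s32s32s"
RESP_SIZE = struct.calcsize(RESP_FORMAT)  # 232 bytes under these assumptions

FIELDS = ("first_seen", "last_seen", "total_conn", "uptime_min",
          "up_mod_days", "last_nat", "last_chg", "distance", "bad_sw",
          "os_match_q", "os_name", "os_flavor", "http_name", "http_flavor",
          "link_type", "language")


def parse_response(data: bytes) -> Optional[dict]:
    """Decode one API response; returns None when p0f reports 'no match'."""
    if len(data) != RESP_SIZE:
        raise ValueError("unexpected response length")
    magic, status, *values = struct.unpack(RESP_FORMAT, data)
    if magic != P0F_RESP_MAGIC:
        raise ValueError("bad magic")
    if status == STATUS_NOMATCH:
        return None
    if status != STATUS_OK:
        raise ValueError("p0f rejected the query")
    info = dict(zip(FIELDS, values))
    for key, value in info.items():
        if isinstance(value, bytes):
            # Strings are NUL-terminated; keep only the part before the NUL.
            info[key] = value.split(b"\0", 1)[0].decode("ascii", "replace")
    return info


# Synthetic response, for demonstration only.
sample = struct.pack(RESP_FORMAT, P0F_RESP_MAGIC, STATUS_OK, 100, 200, 3, 676,
                     198, 0, 0, 8, 0, 0, b"Windows", b"XP", b"", b"",
                     b"DSL", b"")
print(parse_response(sample)["os_name"])
```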
Developers using the API should be aware of several important constraints:
- The maximum number of simultaneous API connections is capped to 20. The
limit may be adjusted with the -S parameter, but rampant parallelism may
lead to poorly controlled latency; consider a single query pipeline,
possibly with prioritization and caching.
- The maximum number of hosts and connections tracked at any given time is
subject to configurable limits. You should look at your traffic stats and
see if the defaults are suitable.
You should also keep in mind that whenever you are subject to an ongoing
DDoS or SYN spoofing DoS attack, p0f may end up dropping entries faster
than you could query for them. It's that or running out of memory, so
don't fret.
- Cache entries with no activity for more than 120 minutes will be dropped
even if the cache is nearly empty. The timeout is adjustable with -t, but
you should not use the API to obtain ancient data; if you routinely need to
go back hours or days, parse the logs instead of wasting RAM.
-----------------------
5. Fingerprint database
-----------------------
Whenever p0f obtains a fingerprint from the observed traffic, it defers to
the data read from p0f.fp to identify the operating system and obtain some
ancillary data needed for other analysis tasks. The fingerprint database is a
simple text file where lines starting with ; are ignored.
== Module specification ==
The file is split into sections based on the type of traffic the fingerprints
apply to. Section identifiers are enclosed in square brackets, like so:
[module:direction]
module - the name of the fingerprinting module (e.g. 'tcp' or 'http').
direction - the direction of fingerprinted traffic: 'request' (from client to
server) or 'response' (from server to client).
For the TCP module, 'request' matches the initial SYN; and
'response' matches SYN+ACK.
The 'direction' part is omitted for MTU signatures, as they work equally well
both ways.
== Signature groups ==
The actual signatures must be preceded by a 'label' line, describing the
fingerprinted software:
label = type:class:name:flavor
type - some signatures in p0f.fp offer broad, last-resort matching for
less researched corner cases. The goal there is to give an
answer slightly better than "unknown", but less precise than
what the user may be expecting.
Normal, reasonably specific signatures that can't be radically
improved should have their type specified as 's'; while generic,
last-resort ones should be tagged with 'g'.
Note that generic signatures are considered only if no specific
matches are found in the database.
class - the tool needs to distinguish between OS-identifying signatures
(only one of which should be matched for any given host) and
signatures that just identify user applications (many of which
may be seen concurrently).
To assist with this, OS-specific signatures should specify the
OS architecture family here (e.g., 'win', 'unix', 'cisco'); while
application-related sigs (NMap, MSIE, Apache) should use a
special value of '!'.
Most TCP signatures are OS-specific, and should have OS family
defined. Other signatures, such as HTTP, should use '!' unless
the fingerprinted component is deeply intertwined with the
platform (e.g., Windows Update).
NOTE: To avoid variations (e.g. 'win' and 'windows' or 'unix'
and 'linux'), all classes need to be pre-registered using a
'classes' directive, seen near the beginning of p0f.fp.
name - a human-readable short name for what the fingerprint actually
helps identify - say, 'Linux', 'Sendmail', or 'NMap'. The tool
doesn't care about the exact value, but requires consistency - so
don't switch between 'Internet Explorer' and 'MSIE', or 'MacOS'
and 'Mac OS'.
flavor - anything you want to say to further qualify the observation. Can
be the version of the identified software, or a description of
what the application seems to be doing (e.g. 'SYN scan' for NMap).
NOTE: Don't be too specific: if you have a signature for Apache
2.2.16, but have no reason to suspect that other recent versions
behave in a radically different way, just say '2.x'.
P0f uses labels to group similar signatures that may plausibly be generated by
the same system or application; a change between signatures within one such
group should not be considered a strong signal for NAT detection.
To further assist the tool in deciding which OS and application combinations are
reasonable, and which ones are indicative of foul play, any 'label' line for
applications (class '!') should be followed by a comma-delimited list of OS
names or @-prefixed OS architecture classes on which this software is known to
be used. For example:
label = s:!:Uncle John's Networked ls Utility:2.3.0.1
sys = Linux,FreeBSD,OpenBSD
...or:
label = s:!:Mom's Homestyle Browser:1.x
sys = @unix,@win
The label can be followed by any number of module-specific signatures; all of
them will be linked to the most recent label, and will be reported the same
way.
All sections except for 'name' are omitted for [mtu] signatures, which do not
convey any OS-specific information, and just describe link types.
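To make the label syntax above concrete, here is a small sketch of how the
'label' and 'sys' values might be split into their parts. It is not p0f's
parser: parse_label and parse_sys are hypothetical names, and the code
assumes that only the flavor (the last part) may itself contain a colon.

```python
def parse_label(value: str) -> dict:
    """Split 'type:class:name:flavor'; a bare name is treated as an [mtu] label."""
    parts = value.split(":", 3)
    if len(parts) == 1:
        return {"name": parts[0]}
    if len(parts) != 4:
        raise ValueError("expected type:class:name:flavor")
    sig_type, os_class, name, flavor = parts
    if sig_type not in ("s", "g"):
        raise ValueError("type must be 's' (specific) or 'g' (generic)")
    return {
        "generic": sig_type == "g",
        "is_app": os_class == "!",  # '!' marks application signatures
        "os_class": None if os_class == "!" else os_class,
        "name": name,
        "flavor": flavor,
    }


def parse_sys(value: str) -> dict:
    """Split a 'sys' list into OS names and @-prefixed OS classes."""
    items = [item.strip() for item in value.split(",") if item.strip()]
    return {
        "classes": [item[1:] for item in items if item.startswith("@")],
        "names": [item for item in items if not item.startswith("@")],
    }


print(parse_label("s:!:Mom's Homestyle Browser:1.x"))
print(parse_sys("@unix,@win"))
print(parse_label("Ethernet"))
```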
== MTU signatures ==
Many operating systems derive the maximum segment size specified in TCP options
from the MTU of their network interface; that value, in turn, normally depends
on the design of the link-layer protocol. A different MTU is associated with
PPPoE, a different one with IPSec, and a different one with Juniper VPN.
The format of the signatures in the [mtu] section is exceedingly simple,
consisting just of a description and a list of values:
label = Ethernet
sig = 1500
These will be matched for any wildcard MSS TCP packets (see below) not generated
by userspace TCP tools.
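As a worked example of the MSS-to-MTU relationship described above: a TCP
segment carries at least 20 bytes of TCP header on top of a 20-byte IPv4 or
40-byte IPv6 header, so the MTU can be estimated by adding that overhead back
to the advertised MSS. The sketch below assumes headers without options; the
two-entry table is a made-up excerpt that uses only the Ethernet value shown
above and the DSL value from the sample output in section 1.

```python
IPV4_TCP_OVERHEAD = 20 + 20  # minimal IPv4 header + minimal TCP header
IPV6_TCP_OVERHEAD = 40 + 20  # fixed IPv6 header + minimal TCP header

# Hypothetical excerpt of an [mtu] table: MTU value -> label.
MTU_LABELS = {1500: "Ethernet", 1492: "DSL"}


def guess_link(mss: int, ip_version: int = 4) -> tuple:
    """Estimate the MTU from an advertised MSS and look up a link label."""
    overhead = IPV4_TCP_OVERHEAD if ip_version == 4 else IPV6_TCP_OVERHEAD
    mtu = mss + overhead
    return mtu, MTU_LABELS.get(mtu)  # label is None if the MTU is not listed


print(guess_link(1452))  # (1492, 'DSL'), as in the section 1 sample
print(guess_link(1460))  # (1500, 'Ethernet')
```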
== TCP signatures ==
For TCP traffic, signature layout is as follows:
sig = ver:ittl:olen:mss:wsize,scale:olayout:quirks:pclass
ver - signature for IPv4 ('4'), IPv6 ('6'), or both ('*').
NEW SIGNATURES: P0f documents the protocol observed on the wire,
but you should replace it with '*' unless you have observed some
actual differences between IPv4 and IPv6 traffic, or unless the
software supports only one of these versions to begin with.
ittl - initial TTL used by the OS. Almost all operating systems use
64, 128, or 255; ancient versions of Windows sometimes used
32, and several obscure systems sometimes resort to odd values
such as 60.
NEW SIGNATURES: P0f will usually suggest something, using the
format of 'observed_ttl+distance' (e.g. 54+10). Consider using
traceroute to check that the distance is accurate, then sum up
the values. If initial TTL can't be guessed, p0f will output
'nnn+?', and you need to use traceroute to estimate the '?'.
A handful of userspace tools will generate random TTLs. In these
cases, determine maximum initial TTL and then add a - suffix to
the value to avoid confusion.
olen - length of IPv4 options or IPv6 extension headers. Usually zero
for normal IPv4 traffic; always zero for IPv6 due to the
limitations of libpcap.
NEW SIGNATURES: Copy p0f output literally.
mss - maximum segment size, if specified in TCP options. Special value
of '*' can be used to denote that MSS varies depending on the
parameters of sender's network link, and should not be a part of
the signature. In this case, MSS will be used to guess the
type of network hookup according to the [mtu] rules.
NEW SIGNATURES: Use '*' for any commodity OSes where MSS is
around 1300 - 1500, unless you know for sure that it's fixed.
If the value is outside that range, you can probably copy it
literally.
wsize - window size. Can be expressed as a fixed value, but many
operating systems set it to a multiple of MSS or MTU, or a
multiple of some random integer. P0f automatically detects these
cases, and allows notation such as 'mss*4', 'mtu*4', or '%8192'
to be used. Wildcard ('*') is possible too.
NEW SIGNATURES: Copy p0f output literally. If frequent variations
are seen, look for obvious patterns. If there are no patterns,
'*' is a possible alternative.
scale - window scaling factor, if specified in TCP options. Fixed value
or '*'.
NEW SIGNATURES: Copy literally, unless the value varies randomly.
Many systems alternate between 2 or 3 scaling factors, in which case,
it's better to have several 'sig' lines, rather than a wildcard.
olayout - comma-delimited layout and ordering of TCP options, if any. This
is one of the most valuable TCP fingerprinting signals. Supported
values:
eol+n - explicit end of options, followed by n bytes of padding
nop - no-op option
mss - maximum segment size
ws - window scaling
sok - selective ACK permitted
sack - selective ACK (should not be seen)
ts - timestamp
?n - unknown option ID n
NEW SIGNATURES: Copy this string literally.
quirks - comma-delimited properties and quirks observed in IP or TCP
headers:
df - "don't fragment" set (probably PMTUD); ignored for IPv6
id+ - DF set but IPID non-zero; ignored for IPv6
id- - DF not set but IPID is zero; ignored for IPv6
ecn - explicit congestion notification support
0+ - "must be zero" field not zero; ignored for IPv6
flow - non-zero IPv6 flow ID; ignored for IPv4
seq- - sequence number is zero
ack+ - ACK number is non-zero, but ACK flag not set
ack- - ACK number is zero, but ACK flag set
uptr+ - URG pointer is non-zero, but URG flag not set
urgf+ - URG flag used
pushf+ - PUSH flag used
ts1- - own timestamp specified as zero
ts2+ - non-zero peer timestamp on initial SYN
opt+ - trailing non-zero data in options segment
exws - excessive window scaling factor (> 14)
bad - malformed TCP options
If a signature scoped to both IPv4 and IPv6 contains quirks valid
for just one of these protocols, such quirks will be ignored for
packets using the other protocol. For example, any combination
of 'df', 'id+', and 'id-' is always matched by any IPv6 packet.
NEW SIGNATURES: Copy literally.
pclass - payload size classification: '0' for zero, '+' for non-zero,
'*' for any. The packets we fingerprint right now normally have
no payloads, but some corner cases exist.
NEW SIGNATURES: Copy literally.
NOTE: The TCP module allows some fuzziness when an exact match can't be found:
'df' and 'id+' quirks are allowed to disappear; 'id-' or 'ecn' may appear; and
TTLs can change.
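The signature layout above can be explored with a few lines of code. This is
a minimal sketch rather than p0f's matcher: it splits a 'sig' or 'raw_sig'
string into its eight fields, sums the 'observed_ttl+distance' notation, and
checks a window size against the 'mss*N', 'mtu*N', and '%N' forms. All
function names are hypothetical.

```python
from typing import Optional


def parse_tcp_sig(sig: str) -> dict:
    """Split ver:ittl:olen:mss:wsize,scale:olayout:quirks:pclass."""
    parts = sig.split(":")
    if len(parts) != 8:
        raise ValueError("expected 8 colon-separated fields")
    ver, ittl, olen, mss, win, olayout, quirks, pclass = parts
    wsize, _, scale = win.partition(",")
    return {
        "ver": ver, "ittl": ittl, "olen": olen, "mss": mss,
        "wsize": wsize, "scale": scale,
        "olayout": olayout.split(",") if olayout else [],
        "quirks": quirks.split(",") if quirks else [],
        "pclass": pclass,
    }


def initial_ttl(ittl: str) -> Optional[int]:
    """Sum 'observed+distance' (e.g. '54+10' gives 64); None for 'nnn+?'."""
    observed, plus, dist = ittl.rstrip("-").partition("+")
    if dist == "?":
        return None
    return int(observed) + (int(dist) if plus else 0)


def window_matches(spec: str, wsize: int, mss: int, mtu: int) -> bool:
    """Check an observed window size against a wsize specifier."""
    if spec == "*":
        return True
    if spec.startswith("mss*"):
        return wsize == mss * int(spec[4:])
    if spec.startswith("mtu*"):
        return wsize == mtu * int(spec[4:])
    if spec.startswith("%"):
        return wsize % int(spec[1:]) == 0
    return wsize == int(spec)


sig = parse_tcp_sig("4:120+8:0:1452:65535,0:mss,nop,nop,sok:df,id+:0")
print(sig["olayout"], sig["quirks"], initial_ttl(sig["ittl"]))
print(window_matches("mss*10", 14600, mss=1460, mtu=1500))
```

The first string is the raw_sig from the Windows XP sample in section 1; its
TTL sums to 128, one of the common initial values listed above.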
To gather new SYN ('request') signatures, simply connect to the fingerprinted
system, and p0f will provide you with the necessary data. To gather SYN+ACK
('response') signatures, you should use the bundled p0f-sendsyn utility while p0f
is running in the background; creating them manually is not advisable.
== HTTP signatures ==
A special directive should appear at the beginning of the [http:request]
section, structured the following way:
ua_os = Linux,Windows,iOS=[iPad],iOS=[iPhone],Mac OS X,...
This list should specify OS names that should be looked for within the
User-Agent string if the string is otherwise deemed to be honest. This input
is not used for fingerprinting, but aids NAT detection in some useful ways.
The names have to match the names used in 'sig' specifiers across p0f.fp. If a
particular name used by p0f differs from what typically appears in User-Agent,
the name=[string] syntax may be used to define any number of aliases.
Other than that, HTTP signatures for GET and HEAD requests have the following
layout:
sig = ver:horder:habsent:expsw
ver - 0 for HTTP/1.0, 1 for HTTP/1.1, or '*' for any.
NEW SIGNATURES: Copy the value literally, unless you have a
specific reason to do otherwise.
horder - comma-separated, ordered list of headers that should appear in
matching traffic. Substrings to match within each of these
headers may be specified using a name=[value] notation.
The signature will be matched even if other headers appear in
between, as long as the list itself is matched in the specified
sequence.
Headers that usually do appear in the traffic, but may go away
(e.g. Accept-Language if the user has no languages defined, or
Referer if no referring site exists) should be prefixed with '?',
e.g. "?Referer". P0f will accept their disappearance, but will
not allow them to appear at any other location.
NEW SIGNATURES: Review the list and remove any headers that
appear to be irrelevant to the fingerprinted software, and mark
transient ones with '?'. Remove header values that do not add
anything to the signature, or are request- or user-specific.
In particular, pay attention to Accept, Accept-Language, and
Accept-Charset, as they are highly specific to request type
and user settings.
P0f automatically removes some headers, prefixes others with '?',
and inhibits the value of fields such as 'Referer' or 'Cookie' -
but this is not a substitute for manual review.
NOTE: Server signatures may differ depending on the request
(HTTP/1.1 versus 1.0, keep-alive versus one-shot, etc) and on the
returned resource (e.g., CGI versus static content). Play around,
browse to several URLs, also try curl and wget.
habsent - comma-separated list of headers that must *not* appear in
matching traffic. This is particularly useful for noting the
absence of standard headers (e.g. 'Host'), or for differentiating
between otherwise very similar signatures.
NEW SIGNATURES: P0f will automatically highlight the absence of
any normally present headers; other entries may be added where
necessary.
expsw - expected substring in 'User-Agent' or 'Server'. This is not
used to match traffic, and merely serves to detect dishonest
software. If you want to explicitly match User-Agent, you need
to do this in the 'horder' section, e.g.:
User-Agent=[Firefox]
Any of these sections except for 'ver' may be blank.
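The 'horder' and 'habsent' rules above amount to an ordered-subsequence check
with a few twists. The sketch below is a simplified illustration, not p0f's
implementation: it compares header names only (no name=[value] substrings),
is case-sensitive, and takes already-split lists as input. headers_match is a
made-up name.

```python
def headers_match(horder: list, habsent: list, observed: list) -> bool:
    """Check observed header names against horder / habsent lists."""
    position = 0
    for entry in horder:
        optional = entry.startswith("?")
        name = entry.lstrip("?")
        if name in observed[position:]:
            # Found in sequence; unrelated headers in between are allowed.
            position = observed.index(name, position) + 1
        elif optional and name not in observed:
            # '?' headers may be missing altogether...
            continue
        else:
            # ...but a required header is missing or out of order, or an
            # optional one appeared at some other location.
            return False
    return not any(name in observed for name in habsent)


horder = ["Host", "User-Agent", "?Referer", "Connection"]
habsent = ["Via"]

print(headers_match(horder, habsent,
                    ["Host", "User-Agent", "Accept", "Referer", "Connection"]))
print(headers_match(horder, habsent,
                    ["Referer", "Host", "User-Agent", "Connection"]))
```

The first call succeeds because the extra Accept header is simply skipped;
the second fails because the optional Referer shows up out of place.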
There are many protocol-level quirks that p0f could be detecting - for example,
the use of non-standard newlines, or missing or extra spacing between header
field names and values. There is also some information to be gathered from
responses to OPTIONS or POST. That said, it does not seem to be worth the
effort: the protocol is so verbose, and implemented so arbitrarily, that we are
getting more than enough information just with a simple GET / HEAD fingerprint.
== SMTP signatures ==
*** NOT IMPLEMENTED YET ***
== FTP signatures ==
*** NOT IMPLEMENTED YET ***
----------------
6. NAT detection
----------------
In addition to fairly straightforward measurements of intrinsic properties of
a single TCP session, p0f also tries to compare signatures across sessions to
detect client-side connection sharing (NAT, HTTP proxies) or server-side load
balancing.
This is done in two steps: the first significant deviation usually prompts a
"host change" entry (which may be also indicative of multi-boot, address reuse,
or other one-off events); and a persistent pattern of changes prompts an
"ip sharing" notification later on.
All of these messages are accompanied by a set of reason codes:
os_sig - the OS detected right now doesn't match the OS detected earlier
on.
sig_diff - no definite OS detection data available, but protocol-level
characteristics have changed drastically (e.g., different
TCP option layout).
app_vs_os - the application detected running on the host is not supposed
to work on the host's operating system.
x_known - the signature progressed from known to unknown, or vice versa.
The following additional codes are specific to TCP:
tstamp - TCP timestamps went back or jumped forward.
ttl - TTL values have changed.
port - source port number has decreased.
mtu - system MTU has changed.
fuzzy - the precision with which a TCP signature is matched has
changed.
The following codes are also issued by the HTTP module:
via - data explicitly includes Via / X-Forwarded-For.
us_vs_os - OS fingerprint doesn't match User-Agent data, and the
User-Agent value otherwise looks honest.
app_srv_lb - server application signatures change, suggesting load
balancing.
date - server-advertised date changes inconsistently.
Different reasons have different weights, balanced to keep p0f very sensitive
even to very homogeneous environments behind NAT. If you end up seeing false
positives or other detection problems in your environment, please let me know!
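If you post-process p0f output, a small lookup table can turn the reason codes
listed above into readable text. The sketch below is in Python and assumes the
codes arrive as a single whitespace-separated string; that input format and the
function name are assumptions for illustration, so check them against your
actual logs.

```python
# Descriptions are condensed from the list of reason codes above.
REASONS = {
    "os_sig": "OS detected now doesn't match the OS detected earlier",
    "sig_diff": "protocol-level characteristics changed drastically",
    "app_vs_os": "application is not supposed to work on the host's OS",
    "x_known": "signature progressed from known to unknown, or vice versa",
    "tstamp": "TCP timestamps went back or jumped forward",
    "ttl": "TTL values have changed",
    "port": "source port number has decreased",
    "mtu": "system MTU has changed",
    "fuzzy": "TCP signature match precision has changed",
    "via": "data explicitly includes Via / X-Forwarded-For",
    "us_vs_os": "OS fingerprint doesn't match honest-looking User-Agent",
    "app_srv_lb": "server application signatures change (load balancing)",
    "date": "server-advertised date changes inconsistently",
}


def explain_reasons(raw):
    """Return (code, description) pairs for a whitespace-separated string."""
    return [(code, REASONS.get(code, "unrecognized reason code"))
            for code in raw.split()]
```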
-----------
7. Security
-----------
You should treat the output from this tool as advisory; the fingerprinting can
be gamed with some minor effort, and it's also possible to evade it altogether
(e.g. with excessive IP fragmentation or bad TCP checksums). Plan accordingly.
P0f should be reasonably secure to operate as a daemon. That said, un*x
users should employ the -u option to drop privileges and chroot() when running
the tool continuously. This greatly minimizes the consequences of any mishaps -
and mishaps in C just tend to happen.
To make this step meaningful, the user you are running p0f as should be
completely unprivileged, and should have an empty, read-only home directory. For
example, you can do:
# useradd -d /var/empty/p0f -M -r -s /bin/nologin p0f-user
# mkdir -p -m 755 /var/empty/p0f
Please don't put the p0f binary itself, or any other valuable assets, inside
that user's home directory; and certainly do not use any generic locations such
as / or /bin/ in lieu of a proper home.
P0f running in the background should be fairly difficult to DoS, especially
compared to any real TCP services it will be watching. Nevertheless, there are
so many deployment-specific factors at play that you should always preemptively
stress-test your setup, and see how it behaves.
Other than that, let's talk filesystem security. When using the tool in the
API mode (-s), the listening socket is always re-created with 666
permissions, so that applications running as other uids can query it at will.
If you want to preserve the privacy of captured traffic in a multi-user system,
please ensure that the socket is created in a directory with finer-grained
permissions; or change API_MODE in config.h.
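One way to follow this advice is to create a dedicated directory for the socket
and restrict it to its owner and a single group. A minimal sketch in Python,
for un*x systems; the function name and the 750 mode are illustrative choices,
not something p0f provides or requires:

```python
import os
import stat


def make_private_socket_dir(path, mode=0o750):
    """Create 'path' if needed and restrict it to owner and group.

    Returns the resulting permission bits so the caller can verify them.
    """
    os.makedirs(path, exist_ok=True)
    # chmod explicitly: the mode passed to makedirs() is filtered by umask,
    # and an already existing directory would keep its old permissions.
    os.chmod(path, mode)
    return stat.S_IMODE(os.stat(path).st_mode)
```

P0f could then be started with -s pointing at a socket path inside that
directory. Users who cannot traverse the directory should be unable to connect,
regardless of the 666 mode on the socket itself.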
The default file mode for binary log data (-o) is 600, on the assumption that
others probably don't need access to historical data; if you need to share logs,
you can pre-create the file or change LOG_MODE in config.h.
Don't build p0f, and do not store its source, binary, configuration files, logs,
or query sockets in world-writable locations such as /tmp (or any
subdirectories created therein).
Last but not least, please do not attempt to make p0f setuid, or otherwise
grant it privileges higher than those of the calling user. Neither the tool
itself, nor the third-party components it depends on, are designed to keep rogue
less-privileged callers at bay. If you use /etc/sudoers to list p0f as the only
program that user X should be able to run as root, that user will probably be
able to compromise your system. The same goes for many other uses of sudo, by
the way.
--------------
8. Limitations
--------------
Here are some of the known issues you may run into:
== General ==
1) RST, ACK, and other experimental fingerprinting modes offered in p0f v2 are
no longer supported in v3. This is because they proved to have very low
specificity. The consequence is that you can no longer fingerprint
"connection refused" responses.
2) API queries or daemon execution are not supported when reading offline pcaps.
While there may be some fringe use cases for that, offline pcaps use a
much simpler event loop, and so supporting these features would require some
extra effort.
3) P0f needs to observe at least about 25 milliseconds worth of qualifying
traffic to estimate system uptime. This means that if you're testing it over
loopback or LAN, you may need to let it see more than one connection.
Systems with extremely slow timestamp clocks may need longer acquisition
periods (up to several seconds); very fast clocks (over 1.5 kHz) are rejected
completely on account of being prohibited by the RFC. Almost all OSes are
between 100 Hz and 1 kHz, which should work fine.
4) Some systems vary SYN+ACK responses based on the contents of the initial SYN,
sometimes removing TCP options not supported by the other endpoint.
Unfortunately, there is no easy way to account for this, so several SYN+ACK
signatures may be required per system. The bundled p0f-sendsyn utility helps
with collecting them.
Another consequence of this is that you will sometimes see server uptime only
if your own system has RFC1323 timestamps enabled. Linux has done that since
version 2.2; on Windows, you need version 7 or newer. Client uptimes are not
affected.
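The uptime constraints in item 3 above can be made concrete with a little
arithmetic. The Python sketch below is a simplified illustration, not p0f's
actual algorithm: it assumes two (capture time, TCP timestamp value) samples
from the same host, and it assumes the timestamp clock started at zero at boot.
The constants mirror the figures quoted above; the function name is
hypothetical.

```python
MIN_INTERVAL_S = 0.025   # roughly 25 ms of qualifying traffic
MAX_FREQ_HZ = 1500.0     # clocks faster than 1.5 kHz are rejected


def estimate_uptime(t1, ts1, t2, ts2):
    """Estimate (frequency_hz, uptime_seconds) from two samples.

    t1, t2   - capture times in seconds
    ts1, ts2 - TCP timestamp values seen at those times
    Returns None if the samples are too close together, or if the implied
    clock rate is zero or implausibly fast.
    """
    elapsed = t2 - t1
    if elapsed < MIN_INTERVAL_S:
        return None
    ticks = (ts2 - ts1) % 2**32      # the field is 32 bits wide and may wrap
    freq = ticks / elapsed
    if freq <= 0 or freq > MAX_FREQ_HZ:
        return None
    # Assumption: the counter was zero at boot, so value / rate = uptime.
    return freq, ts2 / freq
```

With samples one second apart and timestamp values of 1000 and 2000, the
estimated clock rate is 1000 Hz and the uptime is 2 seconds. Samples only 10 ms
apart are rejected, which is why a single loopback or LAN connection may not be
enough.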
== Windows port ==
1) API sockets do not work on Windows. This is due to a limitation of winpcap;
see live_event_loop(...) in p0f.c for more info.
2) The chroot() jail (-u) on Windows doesn't offer any real security. This is
due to the limitations of cygwin.
3) The p0f-sendsyn utility doesn't work because of the limited capabilities of
Windows raw sockets (this should be relatively easy to fix if there are any
users who care).
---------------------------
9. Acknowledgments and more
---------------------------
P0f is made possible thanks to the contributions of several good souls,
including:
Phil Ames
Jannich Brendle
Matthew Dempsky
Jason DePriest
Dalibor Dukic
Mark Martinec
Damien Miller
Josh Newton
Nibbler
Bernhard Rabe
Chris John Riley
Sebastian Roschke
Peter Valchev
Jeff Weisberg
Anthony Howe
Tomoyuki Murakami
Michael Petch
If you wish to help, the most immediate way to do so is to simply gather new
signatures, especially from less popular or older platforms (servers, networking
equipment, portable / embedded / specialty OSes, etc).
Problems? Suggestions? Complaints? Compliments? You can reach the author at
<lcamtuf@coredump.cx>. The author is very lonely and appreciates your mail.