Skip to content

Commit

Permalink
Merge pull request #817 from tlsfuzzer/more-docs
Browse files Browse the repository at this point in the history
More documentation
  • Loading branch information
tomato42 committed May 22, 2023
2 parents d0d14c2 + 8766c9e commit 5f2069b
Show file tree
Hide file tree
Showing 3 changed files with 155 additions and 11 deletions.
114 changes: 114 additions & 0 deletions docs/source/failure-analysis.rst
Original file line number Diff line number Diff line change
@@ -0,0 +1,114 @@
================
Failure analysis
================

As the scripts in ``tlsfuzzer`` expect very specific behaviour from the server
not all failures and errors reported by the script necessarily mean that the
server is buggy.

Note that those are just some of the tips, and this article is far from
complete.
To understand the failures and analyse them you will likely need to have
a deep understanding of the TLS protocol.
Remember to first understand what is happening and why it's happening
before dismissing test failure as "irrelevant" or a "false positive".
Also, when investigating, don't be afraid to edit the test scripts:
they're intentionally written with little code reuse to make that process
easy and contained.

Execution summary
=================

All scripts print the following summary after execution:

.. code::
Test end
====================
version: 8
====================
TOTAL: 2
SKIP: 0
PASS: 2
XFAIL: 0
FAIL: 0
XPASS: 0
====================
The ``version`` field is self-explanatory, it's the version number of the
script.
Every time the script is modified in a way that its execution *may*
change, that version number is incremented.

The ``TOTAL`` is the number of conversations executed by the script.
While typically "conversations" is equivalent to "connections", in case the
script is testing session resumption, the actual number of connections
can be larger.

The ``SKIP`` is the number of conversations that were excluded from execution
through the use of the ``-e`` command line option.

The ``PASS`` is the number of conversations in which the server behaved
in expected manner (either the connection was successful when we expected
success or the server correctly detected a malformed message when we
sent one).

The ``XFAIL`` is the number of conversations that *eXpectedly FAILed*.
That is, the number of conversations that were specified with the ``-x`` option
and subsequently failed (without the specific error message, or
with the exact error message specified with the ``-X`` option).

The ``FAIL`` is the number of conversations in which the server behaved
in unexpected manner (refused the connection when we expected a
successful handshake, didn't error out when we sent malformed messaged, etc.).

The ``XPASS`` is the number of conversations that *unXepectedly PASSed*.
That is, the conversations that were specified with the ``-x`` option but
didn't fail.

Those names are taken from the ``pytest`` project and have the same
meaning there.

The overall script will return a non-zero exit code if either the ``FAIL``
or the ``XPASS`` count is more than zero.

Failure in sanity
=================

For the overall test to be valid in the first place, we need to establish
a baseline: verify that we can connect to the server and that we can
perform a basic handshake, exchange some data, and perform an orderly
connection shutdow.

This is done by the ``sanity`` conversations inside the scripts.

For example, if the script is testing that the server is rejecting
malformed ClientHello messages, we first need to verify that it will
accept a non-malformed message first: that happens in the the ``sanity``
case.
If that one doesn't pass, we may not be testing the ClientHello parser at
all, as the server may be rejecting the connection because of the advertised
ciphersuite by the client, not anything else.

Now, since we want to detect wrong behaviour from the server,
the scripts are fairly strict with the expected behaviour in the ``sanity``
case.
In particular, the *number* of ApplicationData packets expected as the
reply to the HTTP GET request is exactly one.
This is to detect situations in which the server sends each line of
reply in single ApplicationData record (very bad),
or a situation in which an HTTP server sends headers and body of the
response in separate records (bad, as that leaks the exact size of both,
which can have privacy consequences).

That of course means that you generally can't execute the scripts against
servers that reply with a large web page that doesn't fit into a single
TLS ApplicationData record (at least, not without modifying the scripts).
It's also like that, as some servers prefer to fragment the response
to smaller records, ones that fit into MTU (to improve latency), while
others will happily send records of maximum size and as few of them as
possible.
What is the expected behaviour for a particular server is not something
we should specify for each and every test script, so they expect
one and only one record as the response (though there are scripts that
test large responses specifically, like the ``test-record-size-limit.py``).
1 change: 1 addition & 0 deletions docs/source/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -37,6 +37,7 @@ to see wanted, but not yet implemented features.
quickstart
installation
theory
failure-analysis
writing-tests
advanced-decision-graph
modifying-messages
Expand Down
51 changes: 40 additions & 11 deletions docs/source/quickstart.rst
Original file line number Diff line number Diff line change
Expand Up @@ -168,11 +168,17 @@ This command should provide the following output if everything went fine:
cipher, TLS 1.2 or earlier and RSA key exchange (or (EC)DHE if
-d option is used)
version: 4
Test end
successful: 2
failed: 0
====================
version: 8
====================
TOTAL: 2
SKIP: 0
PASS: 2
XFAIL: 0
FAIL: 0
XPASS: 0
====================
All the test scripts support at least ``--help`` option. For this script it
will provide the following information:
Expand All @@ -187,9 +193,18 @@ will provide the following information:
names and not all of them, e.g "sanity"
-e probe-name exclude the probe from the list of the ones run
may be specified multiple times
-n num only run `num` random tests instead of a full set
-x probe-name expect the probe to fail. When such probe passes despite being marked like this
it will be reported in the test summary and the whole script will fail.
May be specified multiple times.
-X message expect the `message` substring in exception raised during
execution of preceding expected failure probe
usage: [-x probe-name] [-X exception], order is compulsory!
-n num run 'num' or all(if 0) tests instead of default(all)
("sanity" tests are always executed)
-d negotiate (EC)DHE instead of RSA key exchange
-d negotiate (EC)DHE instead of RSA key exchange, send
additional extensions, usually used for (EC)DHE ciphers
-C ciph Use specified ciphersuite. Either numerical value or
IETF name.
--help this message
Almost all scripts support this set of command line options.
Expand All @@ -214,11 +229,17 @@ This produces similar output:
Check if communication with typical group and cipher works with
the TLS 1.3 server.
version: 2
Test end
successful: 2
failed: 0
====================
version: 7
====================
TOTAL: 2
SKIP: 0
PASS: 2
XFAIL: 0
FAIL: 0
XPASS: 0
====================
Similarly to the :term:`TLS` 1.2 script, this one supports a set of options:

Expand All @@ -232,8 +253,16 @@ Similarly to the :term:`TLS` 1.2 script, this one supports a set of options:
names and not all of them, e.g "sanity"
-e probe-name exclude the probe from the list of the ones run
may be specified multiple times
-n num only run `num` random tests instead of a full set
-x probe-name expect the probe to fail. When such probe passes despite being marked like this
it will be reported in the test summary and the whole script will fail.
May be specified multiple times.
-X message expect the `message` substring in exception raised during
execution of preceding expected failure probe
usage: [-x probe-name] [-X exception], order is compulsory!
-n num run 'num' or all(if 0) tests instead of default(all)
("sanity" tests are always executed)
-C ciph Use specified ciphersuite. Either numerical value or
IETF name.
--help this message
As cryptographic parameter negotiation happens differently in :term:`TLS` 1.3
Expand Down

0 comments on commit 5f2069b

Please sign in to comment.