Unique session ids #64

RalfJung · 2017-11-09T10:56:50Z

Quoting the main commit message:

    L2TPv3 session IDs have to be unique on the entire system, not just per tunnel.
    The only reason that tunneldigger got away with using 1 for all sessions is that
    older Linux kernels failed to properly check for duplicate session IDs.  That
    got fixed by kernel commit dbdbc73b44782e22b3b4b6e8b51e7a3d245f3086.
    
    This patch adds unique session IDs to tunneldigger in a backwards-compatible
    way.  If both ends of the tunnel agree to use a unique session ID, they both
    will use the tunnel ID as the session ID.  To manage this mutual agreement, we
    introduce space for up to 32 feature flags.  Three messages are affected:
    
    CONTROL_TYPE_USAGE optionally takes 4 additional bytes after the padding, used
    to send the client's feature flags.  Brokers may skew the usage they report back
    depending on client features.  Old clients will not send these bytes, which is
    interpreted as all flags being 0.
    
    CONTROL_TYPE_PREPARE is extended the same way, also sending the client's feature
    flags, so that the server does not have to remember them.  These are the
    features that the client is offering to be using for this session.  Old clients
    will not send any flags, which tells the server that no features are supported.
    Old servers will ignore flags if they received them.
    
    CONTROL_TYPE_TUNNEL is extended the same way, reporting the features that are
    actually going to be used for this session.  Typically, this will be the
    client's features masked by what the server supports.  If a feature-aware client
    talks to an old server, the flags are going to be missing, telling the client
    that no features are going to be used in this session.
    
    The first feature flag is 'unique session IDs'.  If enabled, the session ID of
    each side of the connection is going to be the same as the tunnel ID.
    Furthermore, the broker gains an option to report full usage for clients not
    supporting unique session IDs, making sure they connect to other servers (if any
    are available).
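
The negotiation described in the commit message can be sketched roughly as follows. This is a minimal illustration, not the code from this PR: `FEATURE_UNIQUE_SESSION_ID` matches the constant in the diff, while `SERVER_FEATURES`, `parse_features`, `negotiate` and `session_id` are hypothetical names introduced here.

```python
import struct

# Flag value as in the PR's diff; SERVER_FEATURES is a hypothetical name.
FEATURE_UNIQUE_SESSION_ID = 1 << 0
SERVER_FEATURES = FEATURE_UNIQUE_SESSION_ID


def parse_features(payload: bytes, offset: int) -> int:
    """Read the optional 4-byte flag field that starts at `offset`.

    Old peers do not send the field; that is treated as all flags being 0.
    """
    field = payload[offset:offset + 4]
    if len(field) < 4:
        return 0
    return struct.unpack('!I', field)[0]


def negotiate(client_features: int, server_features: int = SERVER_FEATURES) -> int:
    """Features used for the session: the client's offer masked by server support."""
    return client_features & server_features


def session_id(tunnel_id: int, session_features: int) -> int:
    """With unique session IDs the tunnel ID is reused; otherwise the legacy ID 1."""
    if session_features & FEATURE_UNIQUE_SESSION_ID:
        return tunnel_id
    return 1
```

An old client sends no flag bytes, so `parse_features` yields 0 and `session_id` falls back to 1, which is what keeps the change backwards compatible.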

I think this is ready to be reviewed. However, before merging, I'd like to roll out an experimental firmware showing that the client stuff really works, and at least try to figure out why we still have two clients connecting to the Gateway that now should report usage 0xFFFF.

Fixes #57

RalfJung · 2017-11-09T11:01:39Z

Oh, another thing I'd like to do: Have the client fail in context_setup_tunnel if the server returns a flag we do not know. That's clearly a non-conforming server which we do not want to have anything to do with.
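
A rough sketch of that check, written in Python for consistency with the broker code quoted in this thread (the real client is C, and `context_setup_tunnel` would carry the equivalent logic). `KNOWN_FEATURES` and `check_server_features` are hypothetical names.

```python
FEATURE_UNIQUE_SESSION_ID = 1 << 0
# Every flag this (hypothetical) client understands.
KNOWN_FEATURES = FEATURE_UNIQUE_SESSION_ID


def check_server_features(server_flags: int) -> int:
    """Return the flags unchanged, or raise if the server set a bit we do not know."""
    unknown = server_flags & ~KNOWN_FEATURES & 0xFFFFFFFF
    if unknown:
        raise ValueError('server set unknown feature flags: 0x%08x' % unknown)
    return server_flags
```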

mitar · 2017-11-09T11:09:37Z

A side note: were there any changes in port usage? Can multiple tunnels now share the same port? Maybe now that they fixed IDs, they can also support that? Or maybe we could send a patch upstream. Currently we have to use NAT ugliness to map different internal ports to the same outside port, but there is nothing in the specs which would really require this. The kernel should differentiate between packets based on the ID, not based on the port.

If they fixed this, we could remove NAT and make everything much simpler and also support IPv6.

RalfJung · 2017-11-09T11:10:38Z

I did not look into this at all. The old NAT hacks certainly still work.

kostko

Looks pretty good! I've added some minor comments.

kostko · 2017-11-09T14:49:50Z

broker/l2tp_broker.cfg.example

@@ -21,6 +21,9 @@ namespace=default
 connection_rate_limit=10
 ; Set PMTU to a fixed value.  Use 0 for automatic PMTU discovery.
 pmtu=0
+; Whether this server runs a new kernel (4.13 and various stable series; in particular, 4.9.36) and


Could we perform some autodetection to determine this?

I guess we could -- but isn't that too fragile? I don't even have complete data on which other kernels the "bad" patch was backported to.

However, one thing we could do is detect the EEXIST on l2tpv3 session creation, and treat that as a hint that we need unique session IDs. That would mean one client has a failed connection attempt, and old clients take 30s to retry, but the next time around they would see usage 0xFFFF.

Would you prefer that? I think I would.
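
A minimal sketch of that idea, assuming session creation surfaces the kernel error as an `OSError`. The class `UsagePolicy` and its method names are hypothetical and are not taken from the broker's code.

```python
import errno

FEATURE_UNIQUE_SESSION_ID = 1 << 0
USAGE_FULL = 0xFFFF  # usage value that tells a client this broker is full


class UsagePolicy:
    """Starts permissive; requires unique session IDs after the first EEXIST."""

    def __init__(self):
        self.require_unique_session_id = False

    def session_create_failed(self, error: OSError) -> None:
        # EEXIST on session creation hints that the kernel rejects duplicate IDs.
        if error.errno == errno.EEXIST:
            self.require_unique_session_id = True

    def reported_usage(self, real_usage: int, client_features: int) -> int:
        # Clients that cannot use unique session IDs are told we are full.
        if self.require_unique_session_id and not client_features & FEATURE_UNIQUE_SESSION_ID:
            return USAGE_FULL
        return real_usage
```

The cost is the one described above: the first old client to hit the duplicate ID fails once, and only later usage queries from old clients see 0xFFFF.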

kostko · 2017-11-09T14:51:49Z

broker/src/tunneldigger_broker/protocol.py

            # Verify cookie value.
-            timestamp = msg_data[:2]
+            timestamp = msg_data[offset:offset+2]


Spaces around + operator.

kostko · 2017-11-09T14:55:49Z

broker/src/tunneldigger_broker/protocol.py

            signed_value = '%s%s%s' % (address[0], address[1], timestamp)
            signature = hmac.HMAC(SECRET_KEY, signed_value, hashlib.sha1).digest()[:6]
            timestamp = struct.unpack('!H', timestamp)[0]

-            if signature != msg_data[2:8] or abs(protocol_time() - timestamp) > 2:
+            # Reject message if more than 2 protocol ticks old.  One tick is 1 << 6 = 64 seconds.
+            if signature != msg_data[offset:offset+6] or abs(protocol_time() - timestamp) > 2:


Spaces around + operator.
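
For readers unfamiliar with the cookie check in this hunk, here is a self-contained Python 3 sketch of the same idea: a 2-byte tick timestamp followed by a 6-byte truncated HMAC-SHA1 over address, port and timestamp. The definition of `protocol_time` (Unix time shifted right by 6 and truncated to 16 bits), the placeholder `SECRET_KEY`, and the helper names `make_cookie` and `verify_cookie` are assumptions for illustration; the broker's actual code may differ.

```python
import hashlib
import hmac
import struct
import time

SECRET_KEY = b'example-secret'  # placeholder, not the broker's key


def protocol_time(now=None) -> int:
    """Coarse clock in 64-second ticks (assumed definition, truncated to 16 bits)."""
    if now is None:
        now = time.time()
    return (int(now) >> 6) & 0xFFFF


def _sign(address, timestamp: bytes) -> bytes:
    signed_value = ('%s%s' % (address[0], address[1])).encode() + timestamp
    return hmac.new(SECRET_KEY, signed_value, hashlib.sha1).digest()[:6]


def make_cookie(address, now=None) -> bytes:
    """Build the 8-byte cookie: 2-byte timestamp plus 6-byte signature."""
    timestamp = struct.pack('!H', protocol_time(now))
    return timestamp + _sign(address, timestamp)


def verify_cookie(msg_data: bytes, offset: int, address, now=None) -> bool:
    """Check the cookie found at `offset`; reject if more than 2 ticks old."""
    timestamp = msg_data[offset:offset + 2]
    signature = msg_data[offset + 2:offset + 8]
    if len(timestamp) < 2 or len(signature) < 6:
        return False
    age = abs(protocol_time(now) - struct.unpack('!H', timestamp)[0])
    return hmac.compare_digest(signature, _sign(address, timestamp)) and age <= 2
```

With 64-second ticks, a tolerance of 2 ticks accepts cookies up to roughly two minutes old.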

kostko · 2017-11-09T14:57:08Z

broker/src/tunneldigger_broker/protocol.py

-            tunnel_manager = self.get_tunnel_manager()
+            client_features = 0
+            try:
+                client_features = struct.unpack('!I', msg_data[8:8+4])[0]


Spaces around + operator.

kostko · 2017-11-09T14:57:33Z

broker/src/tunneldigger_broker/tunnel.py

@@ -39,6 +39,10 @@
 PMTU_PROBE_REPEATS = 4
 PMTU_PROBE_COMBINATIONS = PMTU_PROBE_SIZE_COUNT * PMTU_PROBE_REPEATS

+# Session feature flags
+FEATURE_UNIQUE_SESSION_ID = 1 << 0


I think these should be moved into protocol.py.

RalfJung · 2017-11-10T08:39:37Z

> try to figure out why we still have two clients connecting to the Gateway that now should report usage 0xFFFF.

I think I know what is happening. The problem is that these clients do not yet have b367a33. If they fail to get any reply from any broker, they will, by default, connect to the first one, which happens to be the one that already has the newer kernel.

RalfJung · 2017-11-12T11:01:34Z

I have an updated version of this that automatically requires unique session IDs upon the first EEXIST on session creation. However, I also rebased on top of #67 (as I want them both on our servers).

I will update this PR once #67 is merged.

RalfJung · 2017-11-15T11:35:06Z

We now have 10 nodes running with the new client, and 3 of our 4 gateways running with the new broker, for 4 days. Everything seems to run smoothly. So as far as I am concerned, this is ready to be merged.

RalfJung · 2017-11-20T12:13:47Z

Updated PR to contain the auto-detection, and rebased on top of master.

kostko · 2017-11-20T13:57:59Z

Thanks!

kostko reviewed Nov 9, 2017

RalfJung added 3 commits November 12, 2017 11:13

address review concerns

811dc6b

once we get a duplicate session ID failure, pretend to clients not supporting unique session IDs that we are full

0a4b330

kostko mentioned this pull request Nov 20, 2017

tunneldigger dies somtimes freifunk-gluon/gluon#1188

Closed

kostko merged commit 0a4b330 into wlanslovenija:master Nov 20, 2017

RalfJung deleted the unique-session-ids branch November 25, 2017 09:45

RalfJung mentioned this pull request Jan 2, 2020

Remove NAT logic #115

Merged
