Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[#8912] Safer, better UNIX socket ancillary data decoding #652

Merged
merged 32 commits into from
Feb 16, 2017
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
32 commits
Select commit Hold shift + click to select a range
7d0b3b6
Safer, more complete decoding UNIX socket recvmsg ancillary data.
exvito Dec 30, 2016
321d4f8
Refactored multi fileDescriptorReceived test - now exercising the loop.
exvito Dec 30, 2016
784aa37
Test bad ancillary data from recvmsg.
exvito Dec 30, 2016
f0a1962
Wrap a long line to comply with twistedchecker rules.
exvito Dec 30, 2016
d62614e
Merge branch 'trunk' into 8912-exvito-safer-unix-do-read-tk2
exvito Jan 5, 2017
b39b048
Merge branch 'trunk' into 8912-exvito-safer-unix-do-read-tk2
exvito Jan 10, 2017
9c3e589
Rename method to adhere to coding standard.
exvito Jan 14, 2017
f4d60c2
Better method doc string, now with @param entry.
exvito Jan 14, 2017
eede87c
Fix test skip for incoming UNIX socket ancillary data: now outside of…
exvito Jan 14, 2017
37ac2bc
Really fix test skip for incoming UNIX socket ancillary data on Mac.
exvito Jan 14, 2017
fb53001
Removed unused SkipTest import.
exvito Jan 14, 2017
f79830a
Nice test skip message for incoming UNIX socket ancillary data on Mac.
exvito Jan 14, 2017
f43b6ce
Cleaner test skip logic for incoming UNIX socket ancillary data.
exvito Jan 14, 2017
b52a18a
Merge branch 'trunk' into 8912-exvito-safer-unix-do-read-tk2
exvito Jan 14, 2017
ba30b53
Merge branch 'trunk' into 8912-exvito-safer-unix-do-read-tk2
exvito Jan 14, 2017
12e0314
Use t.t.u.TestCase.patch instead of monkey patching by hand.
exvito Jan 19, 2017
df71bda
Typo correction in UNIX ancillary data test comment.
exvito Jan 19, 2017
047bd99
Improve bad UNIX ancillary data test - verify message is logged.
exvito Jan 19, 2017
38f50ee
Merge remote-tracking branch 'exvito/8912-exvito-safer-unix-do-read-t…
exvito Jan 19, 2017
fbfb339
Factor out SOL_SOCKET/SCM_RIGHTS UNIX ancillary data handling.
exvito Jan 23, 2017
c4d2774
Rename factored out method to comply with twistedchecker.
exvito Jan 23, 2017
4888d4c
Better unsupported UNIX ancillary data test.
exvito Jan 23, 2017
1824a36
Ensure full coverage of test code.
exvito Jan 23, 2017
718a313
Verify multiple UNIX FD receiving respects order.
exvito Jan 23, 2017
993e344
Merge branch 'trunk' into 8912-exvito-safer-unix-do-read-tk2
exvito Jan 23, 2017
0d8fc93
Merge branch 'trunk' into 8912-exvito-safer-unix-do-read-tk2
glyph Jan 26, 2017
0e5b7c5
File instead of socket FDs now used in multi fileDescriptorReceived t…
exvito Jan 30, 2017
1de0b1c
Remove unused imports.
exvito Jan 30, 2017
0873b20
Merge branch 'trunk' into 8912-exvito-safer-unix-do-read-tk2
glyph Jan 31, 2017
e1578ef
Minor code style adjustments.
exvito Jan 31, 2017
77f7cc7
Reword topfile: better for users while still technically detailed.
exvito Jan 31, 2017
f4dbdbb
Merge branch 'trunk' into 8912-exvito-safer-unix-do-read-tk2
glyph Feb 15, 2017
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
139 changes: 127 additions & 12 deletions src/twisted/internet/test/test_unix.py
Original file line number Diff line number Diff line change
Expand Up @@ -8,8 +8,8 @@
from __future__ import division, absolute_import

from stat import S_IMODE
from os import stat, close, urandom
from tempfile import mktemp
from os import stat, close, urandom, unlink, fstat
from tempfile import mktemp, mkstemp
from socket import AF_INET, SOCK_STREAM, SOL_SOCKET, socket
from pprint import pformat
from hashlib import md5
Expand Down Expand Up @@ -383,33 +383,51 @@ def test_fileDescriptorOverrun(self):
test_fileDescriptorOverrun.skip = sendmsgSkip


def test_multiFileDescriptorReceivedPerRecvmsg(self):
def _sendmsgMixinFileDescriptorReceivedDriver(self, ancillaryPacker):
"""
_SendmsgMixin handles multiple file descriptors per recvmsg, calling
L{IFileDescriptorReceiver.fileDescriptorReceived} once per received
file descriptor.
Drive _SendmsgMixin via sendmsg socket calls to check that
L{IFileDescriptorReceiver.fileDescriptorReceived} is called once
for each file descriptor received in the ancillary messages.

@param ancillaryPacker: A callable that will be given a list of
two file descriptors and should return a two-tuple where:
The first item is an iterable of zero or more (cmsg_level,
cmsg_type, cmsg_data) tuples in the same order as the given
list for actual sending via sendmsg; the second item is an
integer indicating the expected number of FDs to be received.
"""
# Strategy:
# - Create a UNIX socketpair.
# - Associate one end to a FakeReceiver and FakeProtocol.
# - Call sendmsg on the other end with two FDs as ancillary data.
# - Call sendmsg on the other end to send FDs as ancillary data.
# Ancillary data is obtained calling ancillaryPacker with
# the two FDs associated to two temp files (using the socket
# FDs for this fails the device/inode verification tests on
# Mac OS X 10.10, so temp files are used instead).
# - Call doRead in the FakeReceiver.
# - Verify results on FakeProtocol.
# Using known device/inodes to verify correct order.

# TODO: replace FakeReceiver test approach with one based in
# IReactorSocket.adoptStreamConnection once AF_UNIX support is
# implemented; see https://twistedmatrix.com/trac/ticket/5573.

from socket import socketpair
from twisted.internet.unix import _SendmsgMixin
from twisted.python.sendmsg import sendmsg, SCM_RIGHTS
from twisted.python.sendmsg import sendmsg

def deviceInodeTuple(fd):
fs = fstat(fd)
return (fs.st_dev, fs.st_ino)

@implementer(IFileDescriptorReceiver)
class FakeProtocol(ConnectableProtocol):
def __init__(self):
self.fds = []
self.deviceInodesReceived = []
def fileDescriptorReceived(self, fd):
self.fds.append(fd)
self.deviceInodesReceived.append(deviceInodeTuple(fd))
close(fd)

class FakeReceiver(_SendmsgMixin):
Expand All @@ -419,6 +437,12 @@ def __init__(self, skt, proto):
self.protocol = proto
def _dataReceived(self, data):
pass
def getHost(self):
pass
def getPeer(self):
pass
def _getLogPrefix(self, o):
pass

sendSocket, recvSocket = socketpair(AF_UNIX, SOCK_STREAM)
self.addCleanup(sendSocket.close)
Expand All @@ -427,20 +451,111 @@ def _dataReceived(self, data):
proto = FakeProtocol()
receiver = FakeReceiver(recvSocket, proto)

# Temp files give us two FDs to send/receive/verify.
fileOneFD, fileOneName = mkstemp()
fileTwoFD, fileTwoName = mkstemp()
self.addCleanup(unlink, fileOneName)
self.addCleanup(unlink, fileTwoName)

dataToSend = b'some data needs to be sent'
fdsToSend = [sendSocket.fileno(), recvSocket.fileno()]
ancillary = [(SOL_SOCKET, SCM_RIGHTS, pack('ii', *fdsToSend))]
fdsToSend = [fileOneFD, fileTwoFD]
ancillary, expectedCount = ancillaryPacker(fdsToSend)
sendmsg(sendSocket, dataToSend, ancillary)

receiver.doRead()

# Verify that fileDescriptorReceived was called twice.
self.assertEqual(len(proto.fds), 2)
self.assertEqual(len(proto.fds), expectedCount)

# Verify that received FDs are different from the sent ones.
self.assertFalse(set(fdsToSend).intersection(set(proto.fds)))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(required)

Hmm. The existing assertion doesn't verify that the order of file descriptors received is the same as the order of file descriptors sent? So if the implementation mixed them up we wouldn't notice... Is that right?

If so, that seems moderately unfortunate. It's a pre-existing defect so it could be fixed in a follow-up ticket. Please either fix it here or file that follow-up ticket.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are correct. This requires the non-trivial generic challenge of asserting two file descriptors actually "point" to the same underlying "kernel managed object".

The current test implementation only makes this more difficult, I guess, given that it is using the same UNIX socket FDs obtained via socketpair for two purposes:

  • To call sendmsg / doRead() on them.
  • As ancillary data FD payloads (because we need FDs to do that, and those are at hand).

Verifying that two UNIX socket FDs actually "point" to the same underlying socket is something I couldn't figure out how to do. In particular given the fact that these sockets were obtained via socketpair.

Two ideas come to my mind:

  • Trying to bind each of the UNIX sockets to a known address (filename, thus inode) and using os.fstat later to verify which is which. Not sure if socketpair connected sockets can be bound when obtained + how different platforms handle that.
  • Using known temporary files instead of the socket FDs as ancillary data payloads. These can be more simply verified via (filename + inode) and/or file contents.

Maybe there is some better approach at this. Which are your thoughts on this?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just pushed a commit with the first approach I commented above:

  • Bind sockets to known addresses.
  • Have FakeProtocol track not only FDs but also fstat's (st_dev, st_ino) tuples.
  • Do sendmsg / doRead.
  • Verify the sent (st_dev, st_ino) tuples against the ones tracked by FakeProtocol.

Feedback welcome.


# Verify that FDs were received in the same order, if any.
if proto.fds:
deviceInodesSent = [deviceInodeTuple(fd) for fd in fdsToSend]
self.assertEqual(deviceInodesSent, proto.deviceInodesReceived)


def test_multiFileDescriptorReceivedPerRecvmsgOneCMSG(self):
"""
_SendmsgMixin handles multiple file descriptors per recvmsg, calling
L{IFileDescriptorReceiver.fileDescriptorReceived} once per received
file descriptor. Scenario: single CMSG with two FDs.
"""
from twisted.python.sendmsg import SCM_RIGHTS

def ancillaryPacker(fdsToSend):
ancillary = [(SOL_SOCKET, SCM_RIGHTS, pack('ii', *fdsToSend))]
expectedCount = 2
return ancillary, expectedCount

self._sendmsgMixinFileDescriptorReceivedDriver(ancillaryPacker)
if sendmsgSkip is not None:
test_multiFileDescriptorReceivedPerRecvmsgOneCMSG.skip = sendmsgSkip


def test_multiFileDescriptorReceivedPerRecvmsgTwoCMSGs(self):
"""
_SendmsgMixin handles multiple file descriptors per recvmsg, calling
L{IFileDescriptorReceiver.fileDescriptorReceived} once per received
file descriptor. Scenario: two CMSGs with one FD each.
"""
from twisted.python.sendmsg import SCM_RIGHTS

def ancillaryPacker(fdsToSend):
ancillary = [
(SOL_SOCKET, SCM_RIGHTS, pack('i', fd))
for fd in fdsToSend
]
expectedCount = 2
return ancillary, expectedCount

self._sendmsgMixinFileDescriptorReceivedDriver(ancillaryPacker)
if platform.isMacOSX():
test_multiFileDescriptorReceivedPerRecvmsgTwoCMSGs.skip = (
"Multi control message ancillary sendmsg not supported on Mac.")
elif sendmsgSkip is not None:
test_multiFileDescriptorReceivedPerRecvmsgTwoCMSGs.skip = sendmsgSkip


def test_multiFileDescriptorReceivedPerRecvmsgBadCMSG(self):
"""
_SendmsgMixin handles multiple file descriptors per recvmsg, calling
L{IFileDescriptorReceiver.fileDescriptorReceived} once per received
file descriptor. Scenario: unsupported CMSGs.
"""
# Given that we can't just send random/invalid ancillary data via the
# packer for it to be sent via sendmsg -- the kernel would not accept
# it -- we'll temporarily replace recvmsg with a fake one that produces
# a non-supported ancillary message level/type. This being said, from
# the perspective of the ancillaryPacker, all that is required is to
# let the test driver know that 0 file descriptors are expected.
from twisted.python import sendmsg

def ancillaryPacker(fdsToSend):
ancillary = []
expectedCount = 0
return ancillary, expectedCount

def fakeRecvmsgUnsupportedAncillary(skt, *args, **kwargs):
data = b'some data'
ancillary = [(None, None, b'')]
flags = 0
return sendmsg.RecievedMessage(data, ancillary, flags)

events = []
addObserver(events.append)
self.addCleanup(removeObserver, events.append)

self.patch(sendmsg, "recvmsg", fakeRecvmsgUnsupportedAncillary)
self._sendmsgMixinFileDescriptorReceivedDriver(ancillaryPacker)

# Verify the expected message was logged.
expectedMessage = 'received unsupported ancillary data'
found = any(expectedMessage in e['format'] for e in events)
self.assertTrue(found, 'Expected message not found in logged events')
if sendmsgSkip is not None:
test_multiFileDescriptorReceivedPerRecvmsg.skip = sendmsgSkip
test_multiFileDescriptorReceivedPerRecvmsgBadCMSG.skip = sendmsgSkip


def test_avoidLeakingFileDescriptors(self):
Expand Down
58 changes: 43 additions & 15 deletions src/twisted/internet/unix.py
Original file line number Diff line number Diff line change
Expand Up @@ -154,8 +154,9 @@ def writeSomeData(self, data):

def doRead(self):
"""
Calls L{IFileDescriptorReceiver.fileDescriptorReceived} and
L{IProtocol.dataReceived} with all available data.
Calls {IProtocol.dataReceived} with all available data and
L{IFileDescriptorReceiver.fileDescriptorReceived} once for each
received file descriptor in ancillary data.

This reads up to C{self.bufferSize} bytes of data from its socket, then
dispatches the data to protocol callbacks to be handled. If the
Expand All @@ -171,28 +172,55 @@ def doRead(self):
else:
return main.CONNECTION_LOST

if ancillary:
ancillaryData = ancillary[0][2]
fdCount = len(ancillaryData) // 4
fds = struct.unpack('i'*fdCount, ancillaryData)
if interfaces.IFileDescriptorReceiver.providedBy(self.protocol):
for fd in fds:
self.protocol.fileDescriptorReceived(fd)
for cmsgLevel, cmsgType, cmsgData in ancillary:
if (cmsgLevel == socket.SOL_SOCKET and
cmsgType == sendmsg.SCM_RIGHTS):
self._ancillaryLevelSOLSOCKETTypeSCMRIGHTS(cmsgData)
else:
log.msg(
format=(
"%(protocolName)s (on %(hostAddress)r) does not "
"provide IFileDescriptorReceiver; closing file "
"descriptor received (from %(peerAddress)r)."),
"%(protocolName)s (on %(hostAddress)r) "
"received unsupported ancillary data "
"(level=%(cmsgLevel)r, type=%(cmsgType)r) "
"from %(peerAddress)r."),
hostAddress=self.getHost(), peerAddress=self.getPeer(),
protocolName=self._getLogPrefix(self.protocol),
)
for fd in fds:
os.close(fd)
cmsgLevel=cmsgLevel, cmsgType=cmsgType,
)

return self._dataReceived(data)


def _ancillaryLevelSOLSOCKETTypeSCMRIGHTS(self, cmsgData):
"""
Processes ancillary data with level SOL_SOCKET and type SCM_RIGHTS,
indicating that the ancillary data payload holds file descriptors.

Calls L{IFileDescriptorReceiver.fileDescriptorReceived} once for each
received file descriptor or logs a message if the protocol does not
implement L{IFileDescriptorReceiver}.

@param cmsgData: Ancillary data payload.
@type cmsgData: L{bytes}
"""

fdCount = len(cmsgData) // 4
fds = struct.unpack('i'*fdCount, cmsgData)
if interfaces.IFileDescriptorReceiver.providedBy(self.protocol):
for fd in fds:
self.protocol.fileDescriptorReceived(fd)
else:
log.msg(
format=(
"%(protocolName)s (on %(hostAddress)r) does not "
"provide IFileDescriptorReceiver; closing file "
"descriptor received (from %(peerAddress)r)."),
hostAddress=self.getHost(), peerAddress=self.getPeer(),
protocolName=self._getLogPrefix(self.protocol),
)
for fd in fds:
os.close(fd)


class _UnsupportedSendmsgMixin(object):
"""
Expand Down
1 change: 1 addition & 0 deletions src/twisted/topfiles/8912.bugfix
Original file line number Diff line number Diff line change
@@ -0,0 +1 @@
UNIX socket endpoints now process all messages from recvmsg's ancillary data via twisted.internet.unix.Server.doRead/twisted.internet.unix.Client.doRead, while discarding and logging ones that don't contain file descriptors.