New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

__call__ always accessible, even when not exposed by _rpyc_getattr #239

Closed
seveirein opened this Issue Nov 9, 2017 · 12 comments

Comments

Projects
None yet
2 participants
@seveirein
Contributor

seveirein commented Nov 9, 2017

RPYC has no mechanism to preclude access to the python __call__ magic method.

Fetching a __call__ via __getattribute__ on a netref fetches the local netref __call__ function automatically:
https://github.com/tomerfiliba/rpyc/blame/master/rpyc/core/netref.py#L148

Then the __call__ method does a syncreq(_self, consts.HANDLE_CALL, args, kwargs) directly.
https://github.com/tomerfiliba/rpyc/blame/master/rpyc/core/netref.py#L197

This is sent on through the protocol, and there is no checking on either side if the remote __call__ attribute was exposed by _rpyc_getattr.

Simple test case attached:
rpyc__call__issue.zip

Users of restricted() may get a nasty surprise. As for my own purposes, I am using rpyc and I have created a lot of tools and magic python decorators (which I may want to contribute back to rpyc) to describe security restrictions.

I have RPYC configured so it doesn't allow ANYTHING to be accessed except via _rpyc_getattr. I was hoping that would include the __call__'s on class instances, classes themselves, and even function/method references. If there is no way enforce this, I will need to create my own patched fork of rpyc to meet my security model constraints.

Since this 'hole" is longstanding, I think the best way to address it is with a new protocol_config flag that locks down all calling access (unless the __call__ attribute is exposed by _rpyc_getattr).

I can create a patch myself, but want some feedback from maintainers before I go off and do that.

@coldfix

This comment has been minimized.

Collaborator

coldfix commented Nov 9, 2017

Hi,

I see your concern.

Since this 'hole" is longstanding, I think the best way to address it is with a new protocol_config flag that locks down all calling access (unless the call attribute is exposed by _rpyc_getattr).

I can create a patch myself, but want some feedback from maintainers before I go off and do that.

Please go ahead! Thanks.

Simple test case attached:
rpyc__call__issue.zip

For the future: it's better to post in plain text to reduce the trouble for others to go through downloading and extracting before they can see a few lines of code... which will drastically reduce the likelihood of others reading your issue or finding related problems via search engines (and is also harder to backup/migrate to an external system and unsafer in case the files will be deleted at some point).

For reference, the extracted code is:

import unittest
from rpyc.utils.server import ThreadedServer
import rpyc
import threading

class TestService(rpyc.Service):
    def _rpyc_getattr(self, name):
        raise AttributeError()

    def __call__(self):
        print("OH NO")
       
    def hidden(self):
        print("Can't get here")

def test_simple_client_server(): #tests family_exclude and preference order too
        port=18861

        #THIS SHOULD BLOCK EVERYTHING NOT EXPOSED BY _rpyc_getattr
        config={ "allow_safe_attrs":False,
                 "allow_exposed_attrs":False,
                 "allow_getattr":False }

        server=ThreadedServer( TestService, hostname="localhost", port=port, protocol_config=config )
        serverThread = threading.Thread( target=server.start, args=())        
        serverThread.start()

        connection=rpyc.connect("localhost" , port)

        root=connection.root

        root.__call__() #This should not work
        root() #in an ideal world this shouldn't work either.

        connection.close()

        server.close()
        serverThread.join()

if __name__=="__main__":
    test_simple_client_server()
@seveirein

This comment has been minimized.

Contributor

seveirein commented Nov 9, 2017

Fixed call access

I have made a fork with the necessary changes:

https://github.com/seveirein/rpyc

I have written some unit test cases and tested under Python 2.7.13 and 3.5.3.

Problems

This isn't easy to fix. Every callable used over a rpyc connection comes from a netref. Let's say you want to call myNetref.foo() -- this typically is broken down into two protocol steps---getting foo, and calling it.

Even though there is a call attribute protocol message, it is rarely used, because of how the underlying Python works.

That means at the time of a call, the protocol handler doesn't know whether you are holding an item that was legitimately gotten from a netref in a way conducive with the security policy set, or some random callable object that has been gotten another way (say via being returned as an object).

This means that verifying callables were reached appropriately will by necessity be a bit hackish.

Accessing calls

I've implemented an allow_unprotected_calls flag to protocol configuration (True by default to maintain compatability). I only allow calls on a callable if one or more of the following accessibility cases works.

Accessibility Case 1

If __call__ is exposed via the protocol security method directly for a a callable, then you can call that callable. This works if you are calling an instance that implements __call__ and you have used _rpyc_getattr to make that visible.

It will critically not work however for all methods of the class. They are callables, that have their own __call__ methods--and unless you go in and add an _rpyc_getattr in the namespace of the method itself (or wrap it in an object that uses it for __call__), you will not be able to call anything but a normal objects __call__ method this way. All other callable types will break, unless they pass Accessibility Case 2.

Accessibility Case 2

If the method is wrapped/bound, you can backwards verify that its parent self/im_self has a reference to it by callable.name. This won't work on functions and static methods at all, and will not work if you do something like:

def foo(self):
    pass

class a: 
    x=foo

a().x() won't work because the a().x will have the __name__ "foo". If class a has a "foo" method, it won't be called either, because we do validate that the a().x is a.foo before calling.

Other Necessary Changes

Two other changes were necessary of some importance.

_rpyc_class_getattr/_rpyc_class_setattr/_rpyc_class_delattr

If a variable being handled by rpyc is a class definition (not an instance), _rpyc_getattr doesn't properly work. Before this fork, rpyc will call variable.__class__._rypc_getattr(variable, name) but
variable.__class__ will be type for all class definitions. This is wrong.

Moreover it needs to be fixed to check classmethods via Accessibility Case 2.

To that end I've added hooking for _rpyc_class_getattr, _rpyc_class_setattr, and _rpyc_class_delattr. These should be class methods. If actually working with an instance, and the old _rpyc_???attr throws an AttributeError, there will be a second chance to get the object via _rpyc_class_???attr method called on the class.

rpyc.util.helpers.restricted changes

Python will not call an object that is not considered callable. It does not consider something that returns __call__ via __getattr__ or __getattribute__ only callable, unless it also actually has a __call__ method. Therefore the old restricted() call never exposed an objects callability.

This is probably a good thing before these changes.

However, now if we use restricted(callable, ["__call__"]) we want the callable to still be callable.
Furthermore, we can use this to export static methods and functions and essentially make them callable via Accessibility Case 1.

Therefore restricted() has been modified.

@coldfix

This comment has been minimized.

Collaborator

coldfix commented Nov 9, 2017

Can you submit a PR? (doesn't have to be final) I find it easier to reason about code than words.

Best, Thomas

@seveirein

This comment has been minimized.

Contributor

seveirein commented Nov 10, 2017

PR #241 It passes the unit tests and adds its own.

@seveirein

This comment has been minimized.

Contributor

seveirein commented Nov 11, 2017

So, I was thinking about this some more, and it might be possible to do this much more cleanly by exposing the needed binding information when a netref to the function is made. It would require a slight modification to the magic in netref code.

@coldfix

This comment has been minimized.

Collaborator

coldfix commented Nov 11, 2017

Any reason not to simply check for accessibility of __call__ in the _handle_call handler, (even if we don't actually use it)?

@seveirein

This comment has been minimized.

Contributor

seveirein commented Nov 11, 2017

it's not easy to see why it doesn't work till you try it. To reiterate from my previous post:

"Let's say you want to call myNetref.foo() -- this typically is broken down into two protocol steps---getting foo, and calling it."

The bound "foo" has its own oid. So it is impossible to check simply at __call__ time if accessed properly, because all callables are their own object with their own oid, and you can't determine whether you are calling on a naked callable, or a valid dereferenced callable.

If you check at call time, you suddenly find that you can't use any valid dereferenced methods whatsoever, because you'll end up checking every method for rpyc_getattr or exposed__call_ or whatever..and it is just a method.

@seveirein

This comment has been minimized.

Contributor

seveirein commented Nov 12, 2017

So, my feeling on this for the moment is to not integrate my pull request till further notice, but leave the issue open. I'm working on some stuff that may make it a moot point anyways. That said, I may not get back to it for a few weeks.

@seveirein

This comment has been minimized.

Contributor

seveirein commented Dec 6, 2017

So PR #243 is my current solution to this. With the old system there's no good way to protect this. In the old system, once you have an OID you have to be able to call it without checking anything, or otherwise you can't call most things. With the RPyC Exposed objects of #243 if you grab a method, and it is exposed, it will also be a RPyC Exposed object, and it can be checked.

@seveirein

This comment has been minimized.

Contributor

seveirein commented Dec 13, 2017

Well, PR #243 is dead, and I still want to address the bug discussed her in the context of RPyC by itself. Is it okay if I open up discussion on the Google Group to discuss several potential fixes?

In the meantime--here is a recap of the problem:

You can invoke __call__ remotely on any object that you have a netref to currently, and there is no security checking whatsoever. This means you can instantiate any class you have a netref to, and that you can call instances of classes that have a __call__ method defined.

The difficulty in fixing the problem cleanly lies in that methods and functions themselves are objects. When you get a netref to a method or a function, you want to be able to call that netref directly. In order to secure the system you need to make some distinction about what objects should be callable, and which should not. At the time you need to make that distinction, the origin of where that object came from is not available.

I've come up with some potential methods for fixing it.

Method A

The strategy for method A is really simple. If you have a netref to a method or a function, you probably should be able to call it. Things only really get muddy with classes and callable instances. Therefore, we can add new protocol_config flags:

  • allow_constructor_calls -- allow calling of values that inspect.isclass(value) evaluates to True
  • allow_routine_calls -- allow calling of values that inspect.isroutine(value) evaluates to True
  • allow_nonroutine_calls -- allow calling of everything else--which should generally be instances with the __call__ defined in their class.

This mostly works, but there is one really ugly issue::

>>> x=[].__eq__
>>> print(x)
<method-wrapper '__eq__' of list object at 0x7f1190083948>
>>> inspect.isroutine(x)
False
>>>

The cpython type 'method-wrapper' is a largely undocumented type, that does not exist in the types module, and cannot be determined by any routine in the inspect module. It is considered an implementation detail of cpython. It exists for certain builtin methods on objects. However, it needs to be treated as a normal routine. We can detect it, but to do so would be very hacky (as it is considered an implementation detail).

The other major drawback of this method is that it is very fixed--it doesn't allow a per object policy.

Method B

The strategy for method B is to keep things even simpler. Every time there is an attempt to __call__ something, the root level service is checked for a on_call method. If implemented, that method is called every time a call occurs with the available information (object and parameters).
The implementor of the on_call method makes the policy decisions on what can be called based on type.

This sorta punts the problem into the user's problem space, rather than ours. That's good from a standpoint of them having control of the problem. They do however, still have very little information to make their decisions on.

Method C

Require all function/method objects to somehow be marked, modified, or registered explicitly to be callable. How remains to be discussed. Currently the use of determining accessibility by name is not
sufficient. At the time we need to know the original name, all that we have is the "name" attribute, which does not have to match (and frequently doesn't match) the name of the variable storing the object::

>>> def foo(bar):
...     pass
... 
>>> class A:
...     x=foo
... 
>>> A().x()
>>> A().x.__name__
'foo'

Method C is what I used for the monster patch, as everything exposed was essentially a specially wrapped type. The main drawbacks of Method C are that even a much simpler implementation than I did would be complicated, and would require the use of a different way of exposing stuff than currently used by RPyC.

@coldfix

This comment has been minimized.

Collaborator

coldfix commented Dec 13, 2017

Hi,

I will summarize my thoughts on the topic.

My first thoughts were similar to what you described in A: In Connection._handle_call and _handle_callattr do something like return self._access_attr(oid, '__call__', ...)(*args, **kwargs). Add __call__ to safe_attrs for backwards compatibility, and possibly add extra flags to always allow function/method calls (as you said, if you have a netref to such an object, you are probably allowed to call it, even if __call__ is not in safe_attrs).

However, I don't really think this is a clean solution, since it acts as sort of a global behaviour that is not easy to configure on individual methods etc. It is too restricted in what it does allow. After more thinking, I've come to the conclusion that this responsibility should be left to the user entirely if they desire (so your extensive patches could be implemented easily as extension).

I think the main problem is that it is right now hard to hook into the getattr mechanism from the outside.

Theoretically, you can do it by monkey patching like so:

class MyConnection(rpyc.Connection):
  def _handle_call(self, oid, args, kwargs=()):
        # ... can check here if this is allowed however you want..
        return self._local_objects[oid](*args, **dict(kwargs))
  # also need to update _HANDLERS here..

# monkey-patch rpyc.server in order to make ThreadedServer use our subclass:
rpyc.server.Connection = MyConnection

Of course, that's hackish and introduces a global change, so I can't recommend it to anyone.

However, the same can be achieved with just a couple of lines changed in the code by adding a create_connection() method to the Server class that can be overriden by the user. It will by default create an rpyc.protocol.Connection.

Next step: you probably don't want to subclass Connection each time you want to modify how getattr works, so we should move Connection._check_attr to Service as well, so the user can freely override and e.g. check for special attributes that could be set via decorators and whatnot.

This should allow the user to have full control with only a couple of lines changed.

What do you think?

seveirein added a commit to seveirein/rpyc that referenced this issue Dec 14, 2017

Implements changes to RPyC to support 3rd party extension.
There are four basic changes that are needed to support
the enhanced security module I have under development. All
of the changes should be considered improvements to RPyC.

The changes are:
    * Implement a way to intercept object referencing/boxing.
      This is useful for users of RPyC for debugging, logging,
      automatic restriction of objects, and validation that
      objects are exposed erroneously. With this patch you can
      now hook all "box" attempts by adding an ``on_box(self, obj)``
      method to your root service. You can change the obj
      referenced by returning a different obj. Documentation TBD.
      See ``test_on_box.py``
    * Implement a way to intercept object calls to address issue tomerfiliba#239
      You can now hook all ``__call__`` attempts by adding an
      ``on_call(self, obj, args, kwargs)`` method to your root
      service. If you hook it, the method is responsible for
      doing the calls. Documentation TBD. See ``test_on_call.py``
    * You can implement an exception filter function and pass it to
      ``protocol_config`` to have much finer control of what
      exceptions are instantiated or not. It is documented
      in the API documentation. See ``test_exception_filter.py``
    * Lastly, there is very subtle blemish fixed involving
      the use `_rpyc_???attr` routines with metaclasses.
      (Bug report incoming). See ``test_class_metagetattr.py``
      to see how the fix allows the use of `_rpyc_???attr`
      methods on classes with metaclass definitions. The test will
      fail on  the unpatched RPyC.

All 4 of these are necessary for me to make my third party module
interact with RPyC in a seamless fashion.

seveirein added a commit to seveirein/rpyc that referenced this issue Dec 14, 2017

@seveirein

This comment has been minimized.

Contributor

seveirein commented Dec 14, 2017

Integrating PR #245 should close this issue.

@coldfix coldfix closed this Dec 18, 2017

coldfix added a commit that referenced this issue Dec 21, 2017

Remove special handling for _dispatch_call (revert #245)
This reverts PR #245 (on issue #239).

I finally decided it's much cleaner to go for a more generic approach
where the service can easily install itself into

    conn._HANDLERS[HANDLE_CALL]

Getting a special treatment for function calls feels kind of awkward and
is unnecessary…

coldfix added a commit that referenced this issue Dec 21, 2017

Remove special handling for _dispatch_call (revert #245)
This reverts PR #245 (on issue #239).

I finally decided it's much cleaner to go for a more generic approach
where the service can easily install itself into

    conn._HANDLERS[HANDLE_CALL]

Getting a special treatment for function calls feels kind of awkward and
is unnecessary…

coldfix added a commit that referenced this issue Dec 21, 2017

Remove special handling for _dispatch_call (revert #245)
This reverts PR #245 (on issue #239).

I finally decided it's much cleaner to go for a more generic approach
where the service can easily install itself into

    conn._HANDLERS[HANDLE_CALL]

Getting a special treatment for function calls feels kind of awkward and
is unnecessary…

coldfix added a commit that referenced this issue Jun 11, 2018

Release rpyc 4.0.0
This release brings a few minor backward incompatibilities, so be sure to read
on before upgrading. However, fear not: the ones that are most likely relevant
to you have a relatively simple migration path.

Backward Incompatibilities
^^^^^^^^^^^^^^^^^^^^^^^^^^

* ``classic.teleport_function`` now executes the function in the connection's
  namespace by default. To get the old behaviour, use
  ``teleport_function(conn, func, conn.modules[func.__module__].__dict__)``
  instead.

* Changed signature of ``Service.on_connect`` and ``on_disconnect``, adding
  the connection as argument.

* Changed signature of ``Service.__init__``, removing the connection argument

* no longer store connection as ``self._conn``. (allows services that serve
  multiple clients using the same service object, see `#198`_).

* ``SlaveService`` is now split into two asymetric classes: ``SlaveService``
  and ``MasterService``. The slave exposes functionality to the master but can
  not anymore access remote objects on the master (`#232`_, `#248`_).
  If you were previously using ``SlaveService``, you may experience problems
  when feeding the slave with netrefs to objects on the master. In this case, do
  any of the following:

  * use ``ClassicService`` (acts exactly like the old ``SlaveService``)
  * use ``SlaveService`` with a ``config`` that allows attribute access etc
  * use ``rpyc.utils.deliver`` to feed copies rather than netrefs to
    the slave

* ``RegistryServer.on_service_removed`` is once again called whenever a service
  instance is removed, making it symmetric to ``on_service_added`` (`#238`_)
  This reverts PR `#173`_ on issue `#172`_.

* Removed module ``rpyc.experimental.splitbrain``. It's too confusing and
  undocumented for me and I won't be developing it, so better remove it
  altogether. (It's still available in the ``splitbrain`` branch)

* Removed module ``rpyc.experimental.retunnel``. Seemingly unused anywhere, no
  documentation, no clue what this is about.

* ``bin/rpyc_classic.py`` will bind to ``127.0.0.1`` instead of ``0.0.0.0`` by
  default

* ``SlaveService`` no longer serves exposed attributes (i.e., it now uses
  ``allow_exposed_attrs=False``)

* Exposed attributes no longer hide plain attributes if one otherwise has the
  required permissions to access the plain attribute. (`#165`_)

.. _#165: #165
.. _#172: #172
.. _#173: #173
.. _#198: #198
.. _#232: #232
.. _#238: #238
.. _#248: #248

What else is new
^^^^^^^^^^^^^^^^

* teleported functions will now be defined by default in the globals dict

* Can now explicitly specify globals for teleported functions

* Can now use streams as context manager

* keep a hard reference to connection in netrefs, may fix some ``EOFError``
  issues, in particular on Jython related (`#237`_)

* handle synchronous and asynchronous requests uniformly

* fix deadlock with connections talking to each other multithreadedly (`#270`_)

* handle timeouts cumulatively

* fix possible performance bug in ``Win32PipeStream.poll`` (oversleeping)

* use readthedocs theme for documentation (`#269`_)

* actually time out sync requests (`#264`_)

* clarify documentation concerning exceptions in ``Connection.ping`` (`#265`_)

* fix ``__hash__`` for netrefs (`#267`_, `#268`_)

* rename ``async`` module to ``async_`` for py37 compatibility (`#253`_)

* fix ``deliver()`` from IronPython to CPython2 (`#251`_)

* fix brine string handling in py2 IronPython (`#251`_)

* add gevent_ Server. For now, this requires using ``gevent.monkey.patch_all()``
  before importing for rpyc. Client connections can already be made without
  further changes to rpyc, just using gevent's monkey patching. (`#146`_)

* add function ``rpyc.lib.spawn`` to spawn daemon threads

* fix several bugs in ``bin/rpycd.py`` that crashed this script on startup
  (`#231`_)

* fix problem with MongoDB, or more generally any remote objects that have a
  *catch-all* ``__getattr__`` (`#165`_)

* fix bug when copying remote numpy arrays (`#236`_)

* added ``rpyc.utils.helpers.classpartial`` to bind arguments to services (`#244`_)

* can now pass services optionally as instance or class (could only pass as
  class, `#244`_)

* The service is now charged with setting up the connection, doing so in
  ``Service._connect``. This allows using custom protocols by e.g. subclassing
  ``Connection``.  More discussions and related features in `#239`_-`#247`_.

* service can now easily override protocol handlers, by updating
  ``conn._HANDLERS`` in ``_connect`` or ``on_connect``. For example:
  ``conn._HANDLERS[HANDLE_GETATTR] = self._handle_getattr``.

* most protocol handlers (``Connection._handle_XXX``) now directly get the
  object rather than its ID as first argument. This makes overriding
  individual handlers feel much more high-level. And by the way it turns out
  that this fixes two long-standing issues (`#137`_, `#153`_)

* fix bug with proxying context managers (`#228`_)

* expose server classes from ``rpyc`` top level module

* fix logger issue on jython

.. _#137: #137
.. _#146: #146
.. _#153: #153
.. _#165: #165
.. _#228: #228
.. _#231: #231
.. _#236: #236
.. _#237: #237
.. _#239: #239
.. _#244: #244
.. _#247: #247
.. _#251: #251
.. _#253: #253
.. _#264: #264
.. _#265: #265
.. _#267: #267
.. _#268: #268
.. _#269: #269
.. _#270: #270

.. _gevent: http://www.gevent.org/
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment