Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Class netrefs return wrong __class__ value #355

Open
jtobiska opened this issue Oct 18, 2019 · 16 comments
Open

Class netrefs return wrong __class__ value #355

jtobiska opened this issue Oct 18, 2019 · 16 comments
Assignees
Labels
Bug Confirmed bug Triage Investigation by a maintainer has started

Comments

@jtobiska
Copy link

jtobiska commented Oct 18, 2019

As of release 4.1, it seems that netrefs that represent classes are returning incorrect values for the __class__ attribute. type() calls are also returning similarly incorrect results.

# rpyc v4.1.2 (surprising/incorrect behavior):
>>> import rpyc
>>> conn = rpyc.classic.connect("localhost")
>>> conn.modules.__builtin__.float.__class__
<type 'float'>
>>> type(conn.modules.__builtin__.float)
<netref class 'rpyc.core.netref.__builtin__.float'>


# rpyc v3.3.0 (expected behavior):
>>> import rpyc
>>> conn = rpyc.classic.connect("localhost")
>>> conn.modules.__builtin__.float.__class__
<type 'type'>
>>> type(conn.modules.__builtin__.float)
<netref class '__builtin__.type'>


# just python, no rpyc (consistent with rpyc v3.3.0 behavior)
>>> float.__class__
<type 'type'>
>>> type(float)
<type 'type'>

We encountered this change in behavior when we attempted to upgrade from 3.3.0 to 4.1.2 and it caused a number of problems in code that was depending on the type handling to work as expected.

To complicate matters, it looks like the fix for #339 included a change to the implementation of BaseNetref.__instancecheck__ that is now relying on this incorrect behavior. The last line of __instancecheck__ is now calling isinstance(other, self.__class__) rather than isinstance(other, self). self represents the correct class to check, but self.__class__ is working because self.__class__ is incorrectly returning self (or the local value of self, depending on whether the class is a builtin).

Environment
  • rpyc version: 4.1.2
  • python version: 2.7.13
  • operating system: linux
@jtobiska
Copy link
Author

jtobiska commented Oct 20, 2019

It looks like the source of the problem is in rpyc.core.protocol.Connection._netref_factory. In that function, id_pack[2] isn't being checked to see if the value is 0 to know if id_pack is referring to a type or an instance. Instead the function seems to be always assuming it's an instance, so when id_pack[0] finds a match in netref.builtin_classes_cache it constructs the netref for an instance of that type rather than the type itself.

I was going to attempt to patch this myself, but the solution isn't immediately clear because when id_pack is representing a class, it is missing the class's type (metaclass) info. Without knowing the metaclass we can't construct a proper netref for the class itself. Back in v3.3.0, the _netref_factory method had direct access to the metaclass's name (clsname) and module name (modname) so building the correct netref was trivial.

@comrumino
Copy link
Collaborator

For brevity, I'm going to focus on ideal behavior compared to current behavior.

{•̃_•̃}

Definitions:

  • Transferred by-value, serialization and transmission of an immutable object
  • Referencing a remote object, using the unboxed reference to a mutable object
  • Object proxying, the process of using a local object to forward behavior over a network connection to an object in another address space
  • Netref, shorthand for network reference and a proxy object
  • Data descriptor, settle for a descriptor example

Of course, these definitions are specific to the context of RPyC

Steps to reproduce with Python3:

pkgver=<desired-version>
myvenv=rpyc${pkgver}
virtualenv /tmp/$myvenv
source /tmp/$myvenv/bin/activate
pip install rpyc==$pkgver
curl https://raw.githubusercontent.com/tomerfiliba/rpyc/$pkgver/bin/rpyc_classic.py -o /tmp/$myvenv/bin/rpyc_classic.py
python /tmp/$myvenv/bin/rpyc_classic.py &> /tmp/${myvenv}-rpyc_classic.log & disown
sleep 1
python -c 'import rpyc; conn = rpyc.classic.connect("127.0.0.1"); print("repr(conn.modules.builtins.float) prints as ", repr(conn.modules.builtins.float), "  #"); print("type(conn.modules.builtins.float) prints as ", type(conn.modules.builtins.float), "  #"); print("repr(conn.modules.builtins.float.__class__) prints as ", repr(conn.modules.builtins.float.__class__), "  #"); print("type(conn.modules.builtins.float.__class__) prints as ", type(conn.modules.builtins.float.__class__), "  #");'
deactivate

Running with pkgver=4.1.2 and filling in the comments:

repr(conn.modules.builtins.float) prints as  <class 'float'>   # looks good
type(conn.modules.builtins.float) prints as  <netref class 'rpyc.core.netref.builtins.float'>   # looks good
repr(conn.modules.builtins.float.__class__) prints as  <class 'float'>   # looks good
type(conn.modules.builtins.float.__class__) prints as  <class 'type'>   # faint smell of a dumpster

The most intuitive behavior for the last line:

type(conn.modules.builtins.float.__class__) prints as  <netref class 'rpyc.core.netref.builtins.type'>

This is where transparency and secure default behavior collide. The special attribute __class__ is not always safe. In our case, creating a netref to builtins.type provides access to a dangerous object (built-ins could create arbitrary code).

So, an ideal solution might be:

  • add exception for UnsafeAttributeUnavailable
  • raise UnsafeAttributeUnavailable when applicable
  • create a data descriptor for unsafe special attributes (i.e. __class__)

tl;dr versions <=4.0.2 should have provided a proxy to the float class. The current behavior is still not ideal, but I would like to drop Python2 support before fixing to reduce complexity.

@comrumino comrumino self-assigned this Nov 7, 2019
@comrumino comrumino added the Triage Investigation by a maintainer has started label Nov 7, 2019
@jtobiska
Copy link
Author

jtobiska commented Nov 8, 2019

Thanks for the detailed response to this issue! However, the results from the examples you listed as "# looks good" are still problematic, regardless of whether it is netrefs or local objects that are returned. The fundamental problem is that in Python type(float) is not float, and float.__class__ is not float. type(float) is type and float.__class__ is type. It is critical that rpyc reflects those same results, regardless of whether it returns those results as netrefs or local objects.

Also, for what it's worth, in our code base we are relying heavily on the builtins being represented as local classes rather than netrefs. For us the safety concerns you listed don't apply, so if that workflow stops being supported I'm afraid we'll be forced to fork and diverge from the officially supported rpyc going forward.

@comrumino
Copy link
Collaborator

comrumino commented Nov 13, 2019

No worries, the next release should support your use (the details that determine backwards compatibility are TBD). If a CVE is okay then upgrading to version 4.0.2 will work meanwhile (the previous maintainer left that CVE as a gift in master for me).

I actually have nothing the behavior of

If conn.modules.builtins.float is a netref to a float, then conn.modules.builtins.float is a proxy object to a float. So, type(conn.modules.builtins.float) prints as <netref class 'rpyc.core.netref.builtins.type'> is correct.

Since the implementation type is in C, the details of a fix for you that doesn't break #339 and #316 make this nontrivial.

@comrumino comrumino added the Bug Confirmed bug label Nov 14, 2019
@comrumino
Copy link
Collaborator

comrumino commented Nov 14, 2019

Note, this is my top priority issue. I'll see if I can get a fix within the week and thanks for reporting it 🥇

I apologize if my writing is ever brusque---I appreciate your feedback. Still working on the finer details for my mental model of RPyC (as a new-ish maintainer). I have a better understanding of your use case, a better understanding of my mistakes, and should have a fix for the bug that does not break other use cases.

I'll keep you updated.

@jtobiska
Copy link
Author

Thanks so much! I appreciate all your help and your detailed responses. Will the fix work for Python 2.7 or will it be Python 3 only?

comrumino added a commit that referenced this issue Nov 19, 2019
@comrumino
Copy link
Collaborator

comrumino commented Nov 19, 2019

I think master should have a fix for this issue for Python 2.7, 3.6, 3.7, and 3.8-dev. I did add a unit test under tests/test_netref_hierachy based on the issue as described. I'll wait to close the issue in hopes that you can confirm it is fixed 💯 (the Travis failure for 3.8 is unrelated to this issue)

If you're curious about the exact changes (since I failed to mention this issue in all of the commits):

diff --git a/rpyc/core/netref.py b/rpyc/core/netref.py
index 39fb905..5fdfae3 100644
--- a/rpyc/core/netref.py
+++ b/rpyc/core/netref.py
@@ -229,9 +229,15 @@ class BaseNetref(with_metaclass(NetrefMetaclass, object)):
                 elif other.____id_pack__[2] != 0:
                     return True
             else:
+                # seems dubious if each netref proxies to a different address spaces
                 return syncreq(self, consts.HANDLE_INSTANCECHECK, other.____id_pack__)
         else:
-            return isinstance(other, self.__class__)
+            if self.____id_pack__[2] == 0:
+                # outside the context of `__instancecheck__`, `__class__` is expected to be type(self)
+                # within the context of `__instancecheck__`, `other` should be compared to the proxied class
+                return isinstance(other, type(self).__dict__['__class__'].instance)
+            else:
+                raise TypeError("isinstance() arg 2 must be a class, type, or tuple of classes and types")
 
 
 def _make_method(name, doc):
@@ -271,6 +277,32 @@ def _make_method(name, doc):
         return method
 
 
+class NetrefClass(object):
+    """a descriptor of the class being proxied
+
+    Future considerations:
+     + there may be a cleaner alternative but lib.compat.with_metaclass prevented using __new__
+     + consider using __slot__ for this class
+     + revisit the design choice to use properties here
+    """
+    def __init__(self, class_obj):
+        self._class_obj = class_obj
+
+    @property
+    def instance(self):
+        """accessor to class object for the instance being proxied"""
+        return self._class_obj
+
+    @property
+    def owner(self):
+        """accessor to the class object for the instance owner being proxied"""
+        return self._class_obj.__class__
+
+    def __get__(self, netref_instance, netref_owner):
+        """the value returned when accessing the netref class is dictated by whether or not an instance is proxied"""
+        return self.owner if netref_instance.____id_pack__[2] == 0 else self.instance
+
+
 def class_factory(id_pack, methods):
     """Creates a netref class proxying the given class
 
@@ -281,30 +313,30 @@ def class_factory(id_pack, methods):
     """
     ns = {"__slots__": (), "__class__": None}
     name_pack = id_pack[0]
-    if name_pack is not None:  # attempt to resolve against builtins and sys.modules
-        ns["__class__"] = _normalized_builtin_types.get(name_pack)
-        if ns["__class__"] is None:
-            _module = None
-            didx = name_pack.rfind('.')
-            if didx != -1:
-                _module = sys.modules.get(name_pack[:didx])
-                if _module is not None:
-                    _module = getattr(_module, name_pack[didx + 1:], None)
-                else:
-                    _module = sys.modules.get(name_pack)
-            else:
-                _module = sys.modules.get(name_pack)
-            if _module:
-                if id_pack[2] == 0:
-                    ns["__class__"] = _module
-                else:
-                    ns["__class__"] = getattr(_module, "__class__", None)
-
+    class_descriptor = None
+    if name_pack is not None:
+        # attempt to resolve __class__ using sys.modules (i.e. builtins and imported modules)
+        _module = None
+        cursor = len(name_pack)
+        while cursor != -1:
+            _module = sys.modules.get(name_pack[:cursor])
+            if _module is None:
+                cursor = name_pack.rfind('.')
+                continue
+            _class_name = name_pack[cursor + 1:]
+            _class = getattr(_module, _class_name, None)
+            if _class is not None and hasattr(_class, '__class__'):
+                class_descriptor = NetrefClass(_class)
+            break
+    ns['__class__'] = class_descriptor
+    netref_name = class_descriptor.owner.__name__ if class_descriptor is not None and id_pack[2] == 0 else name_pack
+    # create methods that must perform a syncreq
     for name, doc in methods:
         name = str(name)  # IronPython issue #10
+        # only create methods that wont shadow BaseNetref during merge for mro
         if name not in LOCAL_ATTRS:  # i.e. `name != __class__`
             ns[name] = _make_method(name, doc)
-    return type(name_pack, (BaseNetref,), ns)
+    return type(netref_name, (BaseNetref,), ns)
 
 
 for _builtin in _builtin_types:
diff --git a/rpyc/core/protocol.py b/rpyc/core/protocol.py
index 3c2a974..e4ed651 100644
--- a/rpyc/core/protocol.py
+++ b/rpyc/core/protocol.py
@@ -305,15 +305,19 @@ class Connection(object):
 
     def _netref_factory(self, id_pack):  # boxing
         """id_pack is for remote, so when class id fails to directly match """
-        if id_pack[0] in netref.builtin_classes_cache:
+        cls = None
+        if id_pack[2] == 0 and id_pack in self._netref_classes_cache:
+            cls = self._netref_classes_cache[id_pack]
+        elif id_pack[0] in netref.builtin_classes_cache:
             cls = netref.builtin_classes_cache[id_pack[0]]
-        elif id_pack[1] in self._netref_classes_cache:
-            cls = self._netref_classes_cache[id_pack[1]]
-        else:
+        if cls is None:
             # in the future, it could see if a sys.module cache/lookup hits first
             cls_methods = self.sync_request(consts.HANDLE_INSPECT, id_pack)
             cls = netref.class_factory(id_pack, cls_methods)
-            self._netref_classes_cache[id_pack[1]] = cls
+            if id_pack[2] == 0:
+                # only use cached netrefs for classes
+                # ... instance caching after gc of a proxy will take some mental gymnastics
+                self._netref_classes_cache[id_pack] = cls
         return cls(self, id_pack)
 
     def _dispatch_request(self, seq, raw_args):  # dispatch
diff --git a/rpyc/lib/__init__.py b/rpyc/lib/__init__.py
index 33513ac..6d7e967 100644
--- a/rpyc/lib/__init__.py
+++ b/rpyc/lib/__init__.py
@@ -151,12 +151,21 @@ def exp_backoff(collision):
 
 
 def get_id_pack(obj):
-    """introspects the given (local) object, returns id_pack as expected by BaseNetref"""
-    if not inspect.isclass(obj):
+    """introspects the given "local" object, returns id_pack as expected by BaseNetref
+
+    The given object is "local" in the sense that it is from the local cache. Any object in the local cache exists
+    in the current address space or is a netref. A netref in the local cache could be from a chained-connection.
+    To handle type related behavior properly, the attribute `__class__` is a descriptor for netrefs.
+
+    So, check thy assumptions regarding the given object when creating `id_pack`.
+    """
+    if hasattr(obj, '____id_pack__'):
+        # netrefs are handled first since __class__ is a descriptor
+        return obj.____id_pack__
+    elif not inspect.isclass(obj):
+        # TODO: this is wrong for modules
         name_pack = '{0}.{1}'.format(obj.__class__.__module__, obj.__class__.__name__)
         return (name_pack, id(type(obj)), id(obj))
-    elif hasattr(obj, '____id_pack__'):
-        return obj.____id_pack__
     else:
         name_pack = '{0}.{1}'.format(obj.__module__, obj.__name__)
         return (name_pack, id(obj), 0)

@jtobiska
Copy link
Author

Thanks for making the changes so quickly! I'm hoping to be able to spend some time tomorrow to confirm that it all works for our use cases.

@jtobiska
Copy link
Author

I've done the first bit of testing, and it looks like it's mostly working. The example that I provided in my original post is working as expected, which is fantastic! Many thanks!!!

However I'm now seeing a couple cases that seem to be behaving strangely and I don't know if they're related:

Case 1:

>>> isinstance(conn.modules.__builtin__.StandardError(), conn.modules.__builtin__.StandardError)
True  # looks good
>>> isinstance(conn.modules.__builtin__.StandardError(), conn.modules.__builtin__.Exception)
False  # should be True since StandardError is a subclass of Exception
>>> isinstance(conn.modules.__builtin__.StandardError(), Exception)
True  # looks good
>>> isinstance(conn.modules.__builtin__.StandardError(), StandardError)
True  # looks good

Case 2:

>>> type(sys)
<type 'module'>  # base case
>>> type(conn.modules.sys)
<netref class 'rpyc.core.netref.__builtin__.module'>  # matches base case
>>> sys.__class__
<type 'module'>  # base case
>>> conn.modules.sys.__class__
<type 'module'>  # matches base case
>>> type(sys.__class__)
<type 'type'>  # base case
>>> type(conn.modules.sys.__class__)
<netref class 'rpyc.core.netref.__builtin__.module'>  # doesn't match.  Should be a netref class of "type" (or maybe just <type 'type'> itself?)

@jtobiska
Copy link
Author

Just checking in on this since it's been a couple months since the last update. Will it be possible to address the two cases I mentioned in my last comment?

@comrumino
Copy link
Collaborator

comrumino commented Jan 26, 2020

So, 4.1.3 was released just before writing this. Case 1 is fixed. Case 2, from test_netref_hierachy.py (just noticed the typo...) the lines can be un-commented to show that the base cases switched in terms of passing.

(venv2) [archie@swanstation tests]$ python -m unittest test_netref_hierachy
....F
======================================================================
FAIL: test_modules (test_netref_hierachy.Test_Netref_Hierarchy)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "test_netref_hierachy.py", line 121, in test_modules
    self.assertEqual(repr(type(conn.modules.sys)), "<netref class 'rpyc.core.netref.{}.module'>".format(_builtin))
AssertionError: "<netref class 'rpyc.core.netref.sys'>" != "<netref class 'rpyc.core.netref.__builtin__.module'>"

----------------------------------------------------------------------
Ran 5 tests in 0.795s

FAILED (failures=1)

Iirc, the test failures would keep switching. Dropping Python 2 would make it a bit easier to suss out the root cause of the issue and hopefully fix case 2 completely. The plan is to take another crack at it eventually.

@jtobiska
Copy link
Author

jtobiska commented Jul 11, 2020

Sorry I've gone so long without giving a response since your last update! Unfortunately I've been stuck on other priorities and haven't been able to devote any time until now.

I just tested out Case 1 above with v4.1.5 and I'm still seeing the same results as before:

>>> isinstance(conn.modules.__builtin__.StandardError(), conn.modules.__builtin__.Exception)
False  # should be True since StandardError is a subclass of Exception

It's really strange though that this case from the test suite works correctly:

>>> isinstance(conn.modules.__builtin__.Exception(), conn.modules.__builtin__.BaseException)
True

@Jongy
Copy link
Contributor

Jongy commented Jan 17, 2021

Found this issue while investigating a (possibly related) problem. All tests are performed on RPyC 5.0.1.

First of all, about the issue @jtobiska described in the last comment - I believe that's observed because the Exception & BaseException case goes through the if other_id_pack[0] in netref.builtin_classes_cache: check in _handle_instancecheck, while the StandardError case does not. I'm running Py3 (so not StandardError) but I can reproduce the same weird behavior with e.g RuntimeError - it incorrectly yields False.

After I add RuntimeError to netref.py:_builtin_types, this finally yields:

>>> isinstance(conn.modules.builtins.RuntimeError(), conn.modules.builtins.Exception)
True

My second issue is with this discrepancy:

>>> conn = rpyc.classic.connect("127.0.0.1")
>>> c = conn.modules.collections.Counter()
>>> c.__class__
<class 'collections.Counter'>
>>> type(c)
<netref class 'rpyc.core.netref.type'>
>>> 

I'd like type(c) to return rpyc.core.netref.Counter.

I hope I'll have time further on this week, I'll try to solve these 2 issues (re @jtobiska 's issue, I'm not sure what's the correct approach, I don't think that adding all missing types to _builtin_types is the right approach...)

@comrumino
Copy link
Collaborator

isinstance(conn.modules.builtin.Exception(), conn.modules.builtin.BaseException)

I'll have to take a look into this. I think sometime ago I was fixing issues related to isinstance.

Some more examples #390

@comrumino
Copy link
Collaborator

I hope I'll have time further on this week, I'll try to solve these 2 issues (re @jtobiska 's issue, I'm not sure what's the correct approach, I don't think that adding all missing types to _builtin_types is the right approach...)

This seems like it could be an issue in how RPyC handles caching and iirc it does not cache the entire inheritance hierarchy. More over, if it fails to find it already cache it should try to locate it by other means which probably fails as well.

Last time I worked on this it would fix some test and break others. There is probably logic errors in more than one place. I'll try to take another crack at this in the near future.

comrumino added a commit that referenced this issue Jan 10, 2024
… any a fix for #355 and #547 will be a break-change/major-release.
comrumino added a commit that referenced this issue Jan 27, 2024
…ce logic for Foo/Bar. Still a lot of cleanup and thought required for changes being made to NetrefMetaclass and BaseNetref that impact type checking. Even so, the direction looks promising
comrumino added a commit that referenced this issue Jan 27, 2024
…ce logic for Foo/Bar. Still a lot of cleanup and thought required for changes being made to NetrefMetaclass and BaseNetref that impact type checking. Even so, the direction looks promising

    - Run actions for develop branch too.
@comrumino
Copy link
Collaborator

@Jongy, any chance you want to take a look at my develop branch. Despite

----------------------------------------------------------------------
Ran 95 tests in 124.747s

FAILED (failures=2, errors=2, skipped=12)

The tests don't fail in the same way locally. Or anyone else, a second set of eyes would be make the fix move faster.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Bug Confirmed bug Triage Investigation by a maintainer has started
Projects
None yet
Development

No branches or pull requests

3 participants