**Python Productivity: Inspecting and Understanding New Objects**

As a machine learning engineer … Working with a new Python object can be challenging because it's hard to find the attributes and methods you need, and the documentation can be unavailable, incorrect, daunting, and time-consuming to look up. This post discusses some quick ways to better understand new objects with Python's built-in functions, PyCharm's tools, and peep dis, a CLI tool for inspecting objects.
We'll start with a simple toy Rectangle class, then move on to more complex data science related objects like numpy arrays, pandas dataframes, sklearn models, and keras models. If you'd prefer to jump straight to this portion, skip to the "CLI Tool: peep dis" section.

In [65]:
class Rectangle:
    def __init__(self, a: float, b: float):
        self.a = a
        self.b = b

    def area(self) -> float:
        return self.a * self.b

    def scale(self, factor: float, ratio=1.0):
        """ scale the side lengths by `factor` """
        self.a = factor * self.a
        self.b = factor * self.b * ratio

    def take_half(self):
        """ cut in half and return the "other half" """
        self.a /= 2
        return Rectangle(self.a, self.b)

    def __str__(self):
        return self.__class__.__name__ + str({'a': self.a, 'b': self.b})

**Built-in Functions**

The dir function is a simple built-in that lists the names of all attributes and methods of an object, unless __dir__ has been overridden to return something else. Interactive shells and some editors use it for autocompletion.

In [66]:
>>> rect = Rectangle(3., 4.)
>>> dir(rect)

['__class__',
 '__delattr__',
 '__dict__',
 '__dir__',
 '__doc__',
 '__eq__',
 '__format__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__init_subclass__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__new__',
 '__reduce__',
 '__reduce_ex__',
 '__repr__',
 '__setattr__',
 '__sizeof__',
 '__str__',
 '__subclasshook__',
 '__weakref__',
 'a',
 'area',
 'b',
 'scale',
 'take_half']

The output is a list of strings representing the attributes and methods of the object, mostly consisting of builtins. For most users, these aren't particularly useful and just add clutter.

**Filtering Out Builtins**

Depending on our definition of builtins, we can use either string filtering or type filtering to remove these.

String Filtering:

In [67]:
def magic_filter(obj):
    is_magic = lambda x: (x.startswith('__') and x.endswith('__'))
    return [x for x in dir(obj) if not is_magic(x)]

In [68]:
>>> magic_filter(rect)

['a', 'area', 'b', 'scale', 'take_half']

Type Filtering:

In [69]:
from types import BuiltinMethodType

def builtin_type_filter(obj):
    is_builtin = lambda x: isinstance(getattr(obj, x), BuiltinMethodType)
    return [x for x in dir(obj) if not is_builtin(x)]

In [70]:
>>> builtin_type_filter(rect)

['__class__',
 '__delattr__',
 '__dict__',
 '__doc__',
 '__eq__',
 '__ge__',
 '__getattribute__',
 '__gt__',
 '__hash__',
 '__init__',
 '__le__',
 '__lt__',
 '__module__',
 '__ne__',
 '__repr__',
 '__setattr__',
 '__str__',
 '__weakref__',
 'a',
 'area',
 'b',
 'scale',
 'take_half']

In [71]:
>>> dir_filtered = magic_filter(rect)

**Separating Methods from Attributes**

Of the items returned after filtering, we still don't know which are attributes and which are methods. We can use the built-in callable function to filter them.

Attributes:

In [72]:
def filter_attrs(obj, name_list):
    return [x for x in name_list if not callable(getattr(obj, x))]

In [73]:
>>> attrs = filter_attrs(rect, dir_filtered)
>>> attrs

['a', 'b']

Methods:

In [74]:
def filter_methods(obj, name_list):
    return [x for x in name_list if callable(getattr(obj, x))]

In [75]:
>>> methods = filter_methods(rect, dir_filtered)
>>> methods

['area', 'scale', 'take_half']

To see the values of the attributes:

In [76]:
>>> attr_outputs = {x: getattr(rect, x) for x in attrs}
>>> attr_outputs

{'a': 3.0, 'b': 4.0}

**Calling Methods**

For the methods, it's a bit more complicated. One risk with indiscriminately calling methods is that they could modify the state of the object, like Rectangle.take_half. This can be avoided in most cases by making a copy with copy.deepcopy before each method call, although this can be computationally intensive depending on the object. Note that methods which modify class variables or global variables, or which interact with their external environment, may still have some effect.

In [77]:
from copy import deepcopy

def get_callable(obj, name: str):
    return getattr(deepcopy(obj), name)
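As a quick check of this idea, the sketch below calls a state-modifying method through a deep copy and confirms that the original object is untouched. The Rectangle here is a trimmed-down copy of the toy class, repeated only so that the snippet runs on its own.

```python
from copy import deepcopy


class Rectangle:
    """Trimmed-down copy of the toy class, repeated so the snippet is standalone."""

    def __init__(self, a: float, b: float):
        self.a = a
        self.b = b

    def take_half(self):
        self.a /= 2
        return Rectangle(self.a, self.b)


def get_callable(obj, name: str):
    # Bind the method to a deep copy so the call cannot mutate `obj` itself.
    return getattr(deepcopy(obj), name)


rect = Rectangle(3.0, 4.0)
other_half = get_callable(rect, "take_half")()

print(rect.a)        # 3.0, the original is unchanged
print(other_half.a)  # 1.5, only the copy was halved
```

The cost is one full copy per call, which is the trade-off mentioned above for large objects.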

Methods that require positional arguments, like Rectangle.scale, provide an additional challenge. We can get the outputs of the methods that don't require positional arguments either by using a "leap before you look" policy, or by using getfullargspec from the built-in inspect module to determine which methods don't require positional arguments and evaluating only those.

**Calling Methods Technique 1: Leap Before You Look**

In [78]:
def attempt_call(func):
    try:
        return str(func())
    except Exception:
        return '(failed to evaluate method)'

In [79]:
>>> outputs = {x: attempt_call(get_callable(rect, x)) for x in methods}
>>> outputs

{'area': '12.0',
 'scale': '(failed to evaluate method)',
 'take_half': "Rectangle{'a': 1.5, 'b': 4.0}"}

As expected, area and take_half, which don't require positional arguments, returned values, whereas scale, which requires a positional argument, did not.

**Calling Methods Technique 2: Check for Positionals**

First, let's introduce getfullargspec:

In [80]:
from inspect import getfullargspec

In [81]:
>>> getfullargspec(rect.scale)

FullArgSpec(args=['self', 'factor', 'ratio'], varargs=None, varkw=None, defaults=(1.0,), kwonlyargs=[], kwonlydefaults=None, annotations={'factor': <class 'float'>})

It returns a FullArgSpec object. args contains the positional argument names; varargs and varkw contain the names of the variable-length positional and keyword arguments, as specified by the * and ** operators, respectively; defaults contains the default values for the last n entries of args; kwonlyargs lists names of keyword-only args; kwonlydefaults is a dictionary with keyword-only arg default values; and annotations is a dictionary specifying any type annotations.
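To make the fields concrete, here is a small sketch using a made-up function, `example`, whose signature exercises every field of FullArgSpec. The function and its parameter names are hypothetical and exist only for illustration.

```python
from inspect import getfullargspec


def example(a, b=2, *args, c, d=4, **kwargs) -> None:
    """Hypothetical function that exercises every FullArgSpec field."""


spec = getfullargspec(example)

print(spec.args)            # ['a', 'b']
print(spec.varargs)         # args
print(spec.varkw)           # kwargs
print(spec.defaults)        # (2,)
print(spec.kwonlyargs)      # ['c', 'd']
print(spec.kwonlydefaults)  # {'d': 4}
print(spec.annotations)     # {'return': None}

# `defaults` lines up with the tail of `args`, so the two can be paired like this:
args_with_defaults = dict(zip(spec.args[-len(spec.defaults):], spec.defaults))
print(args_with_defaults)   # {'b': 2}
```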

We can use this information to check if a method has positional arguments and evaluate it only if it doesn't. To start, we will attempt to get the FullArgSpec of the method, although not all callables are supported. Then we'll extract the args and use a utility function, _remove_self, to remove the self argument, which is implicit to standard methods. Although it's not done here, we could also avoid calling class methods by checking for the cls argument. Finally, if all args have defaults, then there are no positionals and the method can be called.

In [82]:
def _remove_self(arg_list):
    """ remove implicit `self` argument from list of arg names """
    if 'self' in arg_list:
        arg_list.remove('self')

def call_if_no_positionals(func):
    try:
        spec = getfullargspec(func)
    except TypeError:
        return '(unsupported callable)'
    args = list(spec.args)  # copy so the spec itself is not modified
    _remove_self(args)
    n_defaults = len(spec.defaults) if spec.defaults else 0
    # every remaining arg has a default, so the call needs no positionals
    if len(args) == n_defaults:
        return str(func())
    else:
        return '(requires positional args)'

In [83]:
>>> outputs = {x: call_if_no_positionals(get_callable(rect, x)) for x in methods}
>>> outputs

{'area': '12.0',
 'scale': '(requires positional args)',
 'take_half': "Rectangle{'a': 1.5, 'b': 4.0}"}
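As an alternative to counting defaults, the same check could be sketched with inspect.signature, which omits the bound self automatically and also reports keyword-only parameters that lack a default. This is a different technique from the getfullargspec approach above, and `required_params` is a hypothetical helper name. Note that signature can raise ValueError or TypeError for callables it does not support.

```python
from inspect import Parameter, signature


def required_params(func):
    """Return the names of parameters the caller must supply."""
    required = []
    for name, param in signature(func).parameters.items():
        # *args and **kwargs can always be left empty
        if param.kind in (Parameter.VAR_POSITIONAL, Parameter.VAR_KEYWORD):
            continue
        if param.default is Parameter.empty:
            required.append(name)
    return required


class Rectangle:
    """Trimmed-down copy of the toy class, repeated so the snippet is standalone."""

    def __init__(self, a: float, b: float):
        self.a = a
        self.b = b

    def area(self) -> float:
        return self.a * self.b

    def scale(self, factor: float, ratio=1.0):
        self.a = factor * self.a
        self.b = factor * self.b * ratio


rect = Rectangle(3.0, 4.0)
print(required_params(rect.area))   # []
print(required_params(rect.scale))  # ['factor']
```

A method is safe to call without arguments exactly when this list is empty.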

**Inferring Argument Types**

We can use the information we get from getfullargspec to infer the types of a method's arguments. We can do this by defining infer_arg_types, which starts out similarly to the last function, then populates an OrderedDict with types inferred from any type annotations and default arguments.

In [84]:
from inspect import getfullargspec
from collections import OrderedDict

def infer_arg_types(func):
    try:
        spec = getfullargspec(func)
    except TypeError:
        return '(unsupported callable)'
    arg_types = OrderedDict()
    args = spec.args
    _remove_self(args)
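    # infer types from type hints
    for arg in args:
        type_ = spec.annotations.get(arg, None)
        if isinstance(type_, type):
            arg_types[arg] = type_.__name__
        else:
            # e.g. typing.List[int] becomes the key 'typing.List[int]'
            arg_types[arg] = str(type_) if type_ is not None else None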
    # infer types from default args
    if spec.defaults:
        for i, v in enumerate(spec.defaults):
            arg_i = - len(spec.defaults) + i
            arg = args[arg_i]
            arg_types[arg] = type(v).__name__
    if not arg_types:
        return None
    return arg_types

In [85]:
>>> method_arg_types = {x: infer_arg_types(getattr(rect, x)) for x in methods}
>>> method_arg_types

{'area': None,
 'scale': OrderedDict([('factor', 'float'), ('ratio', 'float')]),
 'take_half': None}

**Forging Arguments**

If we want to see example outputs for methods that require positional arguments, we can attempt to use these inferred argument types to forge them by looking up sample values for each type. We can even attempt to forge collections if the content type is in the annotation (e.g. List[int]).

In [86]:
_sample_args = {
    'float': 1.5,
    'int': 2,
    'str': 'abc',
    'typing.List[int]': [1, 2, 3],
}
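The string keys in the sample dictionary have to match however an annotation is rendered as text. The sketch below shows one way this lookup might work: plain classes are keyed by `__name__`, while typing constructs such as List[int] are keyed by their str() form. Both `annotation_key` and `total` are hypothetical names introduced only for this illustration.

```python
from typing import List

_sample_args = {
    'float': 1.5,
    'int': 2,
    'str': 'abc',
    'typing.List[int]': [1, 2, 3],
}


def annotation_key(annotation):
    """Hypothetical helper: map an annotation object to a string lookup key."""
    if isinstance(annotation, type):
        return annotation.__name__   # plain classes such as float or int
    return str(annotation)           # typing constructs such as List[int]


def total(values: List[int], scale: float) -> float:
    """Hypothetical function used only to demonstrate the lookup."""
    return scale * sum(values)


keys = {name: annotation_key(ann)
        for name, ann in total.__annotations__.items() if name != 'return'}
forged = {name: _sample_args[key] for name, key in keys.items()}

print(keys)             # {'values': 'typing.List[int]', 'scale': 'float'}
print(total(**forged))  # 9.0
```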

We will define a ForgeError so that any errors caused by attempting to forge arguments can be handled specifically. This will allow us to attempt to forge arguments for a collection of methods even if some don't work.

In [87]:
class ForgeError(ValueError):
    pass

The forging function will take a method and look up sample arguments from _sample_args by type from the infer_arg_types output, raising errors if any arguments lacked defaults and types couldn't be inferred, or if any types are presented that aren't in _sample_args.

In [88]:
def forge_args(func, sample_dict=_sample_args):
    arg_types = infer_arg_types(func)
    # If the callable is unsupported, its signature is unknown
    if isinstance(arg_types, str):
        raise ForgeError('Could not inspect the callable')
    # If no positional arguments
    if not arg_types:
        return {}
    # If not all types could be inferred
    if not all(arg_types.values()):
        raise ForgeError('Some arguments have unknown types')

    # `defaults` is None when no argument has a default value
    defaults = getfullargspec(func).defaults or ()
    arg_dict = OrderedDict()
    for i, (arg, type_) in enumerate(arg_types.items()):
        # check for default values if keyword arg
        n_args_remaining = len(arg_types) - i
        if len(defaults) >= n_args_remaining:
            arg_dict[arg] = defaults[-n_args_remaining]
        # if no default, attempt to forge from `sample_dict`
        elif type_ in sample_dict:
            arg_dict[arg] = sample_dict[type_]
        else:
            raise ForgeError(
                f'Unsupported argument type ({type_}) for argument: {arg}')
    return arg_dict

Since this is a somewhat complex function, we can set up a few test cases to make sure it works properly. The stubs below outline the cases worth covering.

In [89]:
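def test_forge_unknown_types():
    """ an argument with no annotation and no default should raise ForgeError """
def test_forge_annotated():
    """ annotated arguments should be filled in from the sample dictionary """
def test_forge_no_args():
    """ a method that takes no arguments should yield an empty dict """
def test_forge_kwargs():
    """ arguments with default values should be forged as those defaults """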

Next, we can define a function that takes an object and iterates over all of its methods and attempts to forge the arguments for each using the "leap before you look" approach and noting the reason for any failures.

In [90]:
def forge_call_all(obj, sample_dict=_sample_args):
    dir_filtered = magic_filter(obj)
    method_names = filter_methods(obj, dir_filtered)
    output_dict = {}
    for name in method_names:
        method = get_callable(obj, name)
        try:
            arg_dict = forge_args(method, sample_dict)
            output_dict[name] = str(method(**arg_dict))
        except ForgeError:
            output_dict[name] = "(Failed to forge args)"
        except Exception:
            output_dict[name] = "(Failed to run method with forged args)"
    return output_dict

Let's give this a try on our Rectangle instance:

In [91]:
>>> forged_outputs = forge_call_all(rect)
>>> forged_outputs

{'area': '12.0', 'scale': 'None', 'take_half': "Rectangle{'a': 1.5, 'b': 4.0}"}

The difference between this result and the last result is subtle, but notice that scale now outputs 'None' rather than '(requires positional args)'. That's because the method was called successfully with the forged arguments, but rather than returning anything, it modifies the state of rect by changing attributes a and b. It would be nice to track these modifications so that we can understand what methods do even when they don't return anything.

**Tracking State Modification: Comparison Technique**

In this toy example, take_half modifies the dimension a of the Rectangle and conveniently returns a new Rectangle holding the updated values. In the real world, state-modifying methods often do not do this, so it would be helpful to have another way to check whether calling a method changed the object's state. We can do this by saving a copy of all the object's attributes before the method call, then comparing them to the attributes after. We can define a StateComparator object that saves the current attributes using the `__dict__` attribute, then checks for additions, deletions, and modifications of attributes after the method call.

In [92]:
class StateComparator:
    def __init__(self, obj):
        self.state = deepcopy(obj.__dict__)

    def compare(self, other):
        state_1 = self.state
        state_2 = deepcopy(other.__dict__)
        new_attrs = {k: v for k, v in state_2.items() if k not in state_1}
        del_attrs = {k: v for k, v in state_1.items() if k not in state_2}
        # skip deleted keys so the lookup in state_2 cannot raise KeyError
        mod_attrs = {k: (v, state_2[k]) for k, v in state_1.items()
                     if k in state_2 and v != state_2[k]}
        change_dict = {}
        if new_attrs:
            change_dict['new'] = new_attrs
        if del_attrs:
            change_dict['deleted'] = del_attrs
        if mod_attrs:
            change_dict['modified'] = mod_attrs
        return change_dict
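Since Rectangle only ever modifies existing attributes, it may help to see the same comparison applied to an object whose method adds, deletes, and modifies attributes in one call. The sketch below is a standalone functional version of the idea; `diff_state` and `Box` are hypothetical names introduced only for this illustration.

```python
from copy import deepcopy


def diff_state(before: dict, after: dict) -> dict:
    """Hypothetical functional version of the before/after comparison."""
    changes = {
        'new': {k: v for k, v in after.items() if k not in before},
        'deleted': {k: v for k, v in before.items() if k not in after},
        'modified': {k: (v, after[k]) for k, v in before.items()
                     if k in after and v != after[k]},
    }
    # drop categories with nothing to report
    return {kind: attrs for kind, attrs in changes.items() if attrs}


class Box:
    """Hypothetical object whose method adds, removes, and edits attributes."""

    def __init__(self):
        self.size = 1
        self.label = 'old'

    def mutate(self):
        self.size = 2        # modified
        del self.label       # deleted
        self.color = 'red'   # new


box = Box()
before = deepcopy(vars(box))  # vars(box) is the same mapping as box.__dict__
box.mutate()
changes = diff_state(before, vars(box))
print(changes)
# {'new': {'color': 'red'}, 'deleted': {'label': 'old'}, 'modified': {'size': (1, 2)}}
```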

Using the forge_call_all function as a template, we can define a new function that includes state tracking and an option to turn argument forging on or off. The important changes are marked with comments.

In [93]:
def call_all_tracked(obj, sample_dict=_sample_args, forge=True):
    dir_filtered = magic_filter(obj)
    method_names = filter_methods(obj, dir_filtered)
    output_dict = {}
    for name in method_names:
        obj_2 = deepcopy(obj)
        # store initial state
        state = StateComparator(obj_2)
        method = getattr(obj_2, name)
        try:
            if forge:
                arg_dict = forge_args(method, sample_dict)
            else:
                arg_dict = {}
            output_dict[name] = str(method(**arg_dict))
        except ForgeError:
            output_dict[name] = "(Failed to forge args)"
        except Exception:
            output_dict[name] = "(Failed to run method with forged args)"
        # check for state changes
        change_dict = state.compare(obj_2)
        if change_dict:
            output_dict[name] = {
                'output': output_dict[name],
                'state changes': change_dict,
            }
        # remove 'output' entry in `output_dict` if no output
        if isinstance(output_dict[name], dict):
            if output_dict[name]['output'] == 'None':
                del output_dict[name]['output']
    return output_dict

Testing this on our rect:

In [94]:
>>> call_all_tracked(rect)

{'area': '12.0',
 'scale': {'state changes': {'modified': {'a': (3.0, 4.5), 'b': (4.0, 6.0)}}},
 'take_half': {'output': "Rectangle{'a': 1.5, 'b': 4.0}",
  'state changes': {'modified': {'a': (3.0, 1.5)}}}}

**Tracking State Modification: Metaclass Wrapper Technique**

A more elegant approach would be to create a wrapper class for the object that automatically tracks any state changes. This may require a metaclass to dynamically inherit all methods and attributes of the parent class, with `__getattr__` and `__setattr__` modified to log the state.
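One way this idea might be sketched, without writing a full metaclass, is to build a subclass at runtime with type(...) and override only `__setattr__` so that every assignment is logged. This is an assumption about how such a wrapper could look, not the approach used by peep dis, and `track_state` is a hypothetical name. The sketch only sees attribute assignments: in-place mutation of a mutable attribute (such as appending to a list) and attribute deletion would go unnoticed.

```python
from copy import deepcopy

_MISSING = object()  # sentinel meaning "attribute did not exist before"


def track_state(obj):
    """Return (clone, log); each attribute assignment on clone is appended to log."""
    base = type(obj)
    log = []

    def logging_setattr(self, name, value):
        old = self.__dict__.get(name, _MISSING)
        log.append((name, old, value))
        base.__setattr__(self, name, value)

    # type(...) builds the subclass at runtime, so it inherits every method of `base`
    tracked_cls = type('Tracked' + base.__name__, (base,),
                       {'__setattr__': logging_setattr})
    clone = deepcopy(obj)
    clone.__class__ = tracked_cls  # assumes an ordinary class without __slots__
    return clone, log


class Rectangle:
    """Trimmed-down copy of the toy class, repeated so the snippet is standalone."""

    def __init__(self, a: float, b: float):
        self.a = a
        self.b = b

    def scale(self, factor: float, ratio=1.0):
        self.a = factor * self.a
        self.b = factor * self.b * ratio


rect = Rectangle(3.0, 4.0)
tracked, log = track_state(rect)
tracked.scale(1.5)
print(log)  # [('a', 3.0, 4.5), ('b', 4.0, 6.0)]
```

Because the log is filled in as the method runs, no before-and-after comparison is needed, and the original rect is left untouched.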

Unfortunately, most Python code is not type hinted, and much of it is unsupported by getfullargspec. In these cases, argument forgery could also be attempted by brute force or by extraction from docstrings, which are planned features for peep dis.

Simply printing out docstrings might be an easier way to understand methods that require arguments in most cases. They can be systematically printed out from the `__doc__` attribute.

In [31]:
>>> for x in dir_filtered:
...     doc = getattr(getattr(rect, x), '__doc__', None) or 'No docstring'
...     print(f'{x}: {doc}')


The output is too long to include here, and it's difficult to decipher since it isn't color coded. The output can easily be colorized with termcolor, which is what was used for peep dis.
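For methods specifically, the standard library's inspect.getdoc is a convenient alternative to reading `__doc__` directly, since it also cleans up docstring indentation and returns None when nothing is documented. The sketch below restricts itself to two methods of a trimmed-down Rectangle, repeated here only so that the snippet runs on its own.

```python
from inspect import getdoc


class Rectangle:
    """Trimmed-down stand-in for the toy class."""

    def __init__(self, a: float, b: float):
        self.a = a
        self.b = b

    def area(self) -> float:
        return self.a * self.b

    def scale(self, factor: float, ratio=1.0):
        """Scale the side lengths by `factor`."""
        self.a = factor * self.a
        self.b = factor * self.b * ratio


rect = Rectangle(3.0, 4.0)
docs = {}
for name in ('area', 'scale'):
    # getdoc returns None when there is no docstring to report
    docs[name] = getdoc(getattr(rect, name)) or 'No docstring'
    print(f'{name}: {docs[name]}')
```

Limiting the loop to methods keeps the output short, because data attributes such as a and b would otherwise report the docstring of their type.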

**IDE Tools (PyCharm)**

Most IDEs have tools to inspect objects while editing and debugging. PyCharm has some of the best object inspection and debugging tools, so I'll use it as an example for this tutorial.

**CLI Tool: peep dis**