<center><img src=img/MScAI_brand.png width=70%></center>

# Introspection

Introspection means looking inward to the self. In programming, it refers to a few ways in which a program can inspect its own properties **while running** (as distinct from **at compile-time**).

* Finding out the type of an object;
* Accessing docstrings, function names, and other function properties;
* Accessing filenames and line numbers of the source file;
* Accessing the call stack (what functions called what functions to lead us to *here*);
* Accessing the configuration of the interpreter.

A lot of people think that introspection is a key ingredient in consciousness and intelligence! So it's relevant to AI in this sense. But we will think of introspection as a handy tool for programming in general.

In `C` and other older languages, even finding the length of an array at runtime is not generally possible: an array passed to a function arrives as a bare pointer, with no length attached. So to a C programmer, Python's `len(a)` would seem like introspection.

Given an arbitrary object, there are several easy ways to find out about it -- some we've seen already -- and these are more like true introspection:

* `type`: what type is it?
* `dir`: what attributes (including methods) does it have?
* `repr`: see a useful string representation, may be valid Python code.
* `help`: access the docstring.
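
As a quick reminder of how these four behave, here is a minimal sketch on a small list. The variable names are chosen only for illustration.

```python
x = [3, 1, 2]

print(type(x))            # the class of x
print("sort" in dir(x))   # dir returns a list of attribute names (strings)
print(repr(x))            # for a list, this string is valid Python code
print(repr("hi"))         # the repr of a string includes its quotes

# help(x.sort) prints the docstring interactively; the raw text is
# also available through the __doc__ attribute.
doc_first_line = x.sort.__doc__.strip().splitlines()[0]
print(doc_first_line)
```

Note that `dir` returns names as strings, so membership tests like `"sort" in dir(x)` work directly.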

Some new ones, quite self-explanatory:

* `callable(f)`: is `f` callable (i.e. is `f(...)` allowed)?
* `issubclass(D, C)`: is `D` a subclass of `C`? (Works on classes, not objects.)
* `isinstance(c, C)`: check if `c` is an instance of `C` (allowing for subclasses).

In [2]:
def f(): pass
x = 17
class C:
    def __call__(self):
        return "Objects can quack like functions"
c = C()
print(callable(f))
print(callable(x))
print(callable(C))
print(callable(c))
c()

True
False
True
True


'Objects can quack like functions'

In [2]:
class C: pass
class D(C): pass
class E(D): pass
print(issubclass(D, C))
print(issubclass(E, C))
print(issubclass(C, D))

True
True
False


In [68]:
c = C()
d = D()
print(isinstance(c, C))
print(isinstance(d, D))
print(isinstance(c, D))
print(isinstance(d, C))

True
True
False
True


Classes and functions have official names. The official name is the one written in the source following  `class` or `def`. This is distinct from the object's variable name!

* `x.__name__`: get the "official" name of `x` which must be a function or a class

In [61]:
def f(a, b):
    return a + b
g = f # g is f, and has the same official name:
print(f.__name__)
print(g.__name__)

f
f


This could be useful e.g. when testing some Scikit-Learn models:

In [62]:
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier

for clf in [LogisticRegression(solver="lbfgs"), 
            KNeighborsClassifier(n_neighbors=1)]:
    s = clf.fit([[3, 4], [3, 5]], [0, 1]).score([[4, 4]], [0])
    print(clf.__class__.__name__, s)

LogisticRegression 1.0
KNeighborsClassifier 1.0


### Accessing variables and values in a namespace

Each returns a `dict`:

* `globals()`: variables in the module namespace
* `locals()`: variables in the current namespace
* `vars()`: same as `locals()`
* `vars(d)`: variables in namespace `d` where e.g. `d` is some module.
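
A small sketch of these functions in action. The names `answer`, `demo` and `Point` are made up for this example, and it assumes the code runs at module level (as in a notebook cell or a script).

```python
import math

answer = 42  # a module-level (global) variable

def demo(a, b=2):
    c = a + b
    # Inside a function, locals() holds the arguments and local variables.
    return dict(locals())

local_names = demo(1)
print(sorted(local_names))          # names visible inside demo
print(globals()["answer"])          # look up a global by its name
print(vars(math)["pi"] == math.pi)  # a module's namespace is a dict too

class Point:
    def __init__(self, x, y):
        self.x = x
        self.y = y

# vars also works on ordinary objects: it returns their __dict__.
print(vars(Point(1, 2)))
```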


### `hasattr`

"Frameworks use introspection frequently, to discover the capabilities of objects the user has passed; for example, "does this object's class have a `do_something()` method? If so, call the object's `do_something()`; otherwise call the `do_something_similar()` framework function with the object as an argument."" 

-- http://archive.oreilly.com/oreillyschool/courses/Python4/Python4-09.html.

`hasattr(obj, attr)` tells us whether an object  has a particular attribute.

Using introspection and `hasattr`, we check that our code will work before running it. This style is sometimes called "look before you leap" (LBYL).



In [13]:
def check_initial(x):
    if hasattr(x, "startswith"):
        print(x.startswith("C"))
    else:
        print(x.__class__.__name__.startswith("C"))
check_initial("CDEFG")

True


In the **duck typing** style, we instead assume `x` has the attribute we need (and maybe catch a possible exception). We say "it is easier to ask forgiveness than permission" (EAFP):



In [14]:
def check_initial(x):
    try:
        print(x.startswith("C"))
    except AttributeError:
        print(x.__class__.__name__.startswith("C"))
check_initial(C())

True
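
Returning to the framework scenario in the quotation above, here is a minimal sketch of that dispatch using `hasattr`. The names `run`, `Plugin`, `do_something` and `do_something_similar` are hypothetical, taken from or invented to match the quotation, not from any real framework.

```python
def do_something_similar(obj):
    # Hypothetical framework fallback for objects without do_something().
    return f"framework handled {obj!r}"

def run(obj):
    # Check first: use the object's own method if it has a callable one.
    if hasattr(obj, "do_something") and callable(obj.do_something):
        return obj.do_something()
    return do_something_similar(obj)

class Plugin:
    def do_something(self):
        return "plugin handled itself"

print(run(Plugin()))  # the object's own method is used
print(run(17))        # an int has no do_something, so the fallback runs
```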


### Version numbers for Python and libraries

These are useful:
1. To catch possible incompatibilities between your code and an old version of a library;
2. When reporting bugs on GitHub and asking questions on Stack Overflow.

In [34]:
import sys
print(sys.version)
import numpy as np
print(np.version)  # np.version is a submodule, hence the output below; np.__version__ gives the version string
import sklearn
print(sklearn.__version__)

3.7.3 (default, Mar 27 2019, 22:11:17) 
[GCC 7.3.0]
<module 'numpy.version' from '/home/jmmcd/anaconda3/lib/python3.7/site-packages/numpy/version.py'>
0.21.2
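
For the first use case, a version check might look like the following sketch. The minimum version `REQUIRED` and the helper `version_tuple` are assumptions made for illustration. The helper only handles plain numeric versions such as `0.21.2`, not pre-release tags.

```python
import sys

# sys.version_info behaves like a tuple, so it compares element by element.
REQUIRED = (3, 6)  # hypothetical minimum version, chosen for illustration
if sys.version_info < REQUIRED:
    raise RuntimeError("This code needs Python %d.%d or later" % REQUIRED)
print("Running Python %d.%d" % sys.version_info[:2])

def version_tuple(version_string):
    """Turn a simple 'major.minor.patch' string into a tuple of ints."""
    parts = version_string.split(".")[:3]
    return tuple(int(part) for part in parts if part.isdigit())

print(version_tuple("0.21.2") >= (0, 20))

# Comparing the strings directly would be wrong: "0.9" sorts after "0.21".
print("0.9" > "0.21", version_tuple("0.9") > version_tuple("0.21"))
```

Comparing tuples of integers avoids the trap in the last line, where plain string comparison puts version 0.9 after 0.21.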


### Some more possibilities

* Find out what source file an object was defined in, and at what line number (`inspect.getsourcefile()` and `inspect.getsourcelines()`);
* Find out which function called the function we are currently in (`inspect.stack()`);
* Given a function, find out what arguments (and type annotations) it expects (`inspect.signature()`, which returns an `inspect.Signature` object).
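
A minimal sketch of these possibilities with the standard `inspect` module. The functions `add`, `who_called_me` and `outer` are made up for the example.

```python
import inspect
import json

def add(a: int, b: int = 1) -> int:
    return a + b

# What arguments (and annotations) does a function expect?
sig = inspect.signature(add)
print(sig)
for name, param in sig.parameters.items():
    print(name, param.annotation, param.default)

# In what source file was an object defined? (Here, the json package.)
print(inspect.getsourcefile(json))

# Which function called the function we are currently in?
def who_called_me():
    # stack()[0] is this frame; stack()[1] is the caller's frame.
    return inspect.stack()[1].function

def outer():
    return who_called_me()

print(outer())
```

Parameters without a default report the marker `inspect.Parameter.empty` as their default, which distinguishes "no default" from a default of `None`.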

### Example: getting at the data in a JSON file

Suppose we have a JSON file and we don't know anything about its structure. We can use the `json` module to read it in to native Python datatypes. We'll use introspection to understand the structure.

In [8]:
import json
# Open the file in a with-block so that it is closed after loading.
with open("data/students.json") as f:
    d = json.load(f)

Notice that (just like `pickle.load` and `pickle.dump` which we saw before), `json.load` expects an open file, not a filename. Same for `json.dump`.

In [9]:
d

[{'name': 'Bruce Wayne',
  'age': 34,
  'ID': '1234',
  'modules': {'CT5123': {'grades': [55, 68],
    'attendance': [False, True, True, True, True]},
   'CT5234': {'grades': [45, 90],
    'attendance': [True, False, False, False, True]}}},
 {'name': 'Peter Parker',
  'age': 21,
  'ID': '0126',
  'modules': {'CT5123': {'grades': [90, 90, 90],
    'attendance': [False, True, True, True, True]},
   'CT5234': {'grades': [60, 74],
    'attendance': [False, True, True, True, True]}}}]

In [10]:
type(d)

list

In [12]:
type(d[0])

dict

In [13]:
d[0].keys()

dict_keys(['name', 'age', 'ID', 'modules'])

In [14]:
d[0]["name"]

'Bruce Wayne'

In [15]:
d[0]["age"]

34

In [17]:
d[0]["ID"]

'1234'

In [18]:
d[0]["modules"]

{'CT5123': {'grades': [55, 68], 'attendance': [False, True, True, True, True]},
 'CT5234': {'grades': [45, 90],
  'attendance': [True, False, False, False, True]}}

In [19]:
type(d[0]["modules"])

dict

... and so on.

Downey provides a nice function `structshape` for doing this automatically and summarising the result. See example below and see Think Python, p. 120. The code is in `code/structshape.py`.

In [23]:
from code.structshape import structshape
structshape(d)

'list of (dict of 4 str->(int, dict of 2 str->dict of 2 str->(list of 2 int, list of 5 bool), str), dict of 4 str->(int, dict of 2 str->(dict of 2 str->(list of 3 int, list of 5 bool), dict of 2 str->(list of 2 int, list of 5 bool)), str))'
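
To give a flavour of how such a summary can be built with `isinstance` and `type`, here is a minimal recursive sketch. It is not Downey's implementation. The function name `shape`, its output format and the `sample` data are all made up for illustration.

```python
def shape(obj):
    """Return a short string describing the nested types inside obj."""
    if isinstance(obj, dict):
        inner = ", ".join(f"{k!r}: {shape(v)}" for k, v in obj.items())
        return "{" + inner + "}"
    if isinstance(obj, list):
        # Collapse identical item shapes so long lists stay readable.
        kinds = sorted({shape(item) for item in obj})
        return f"list[{len(obj)}] of " + ("/".join(kinds) if kinds else "nothing")
    # Anything else is treated as a leaf: report its type name.
    return type(obj).__name__

sample = [{"name": "A", "age": 1, "grades": [55, 68]}]  # made-up data
print(shape(sample))
```

Only lists and dicts are handled here because those are the containers `json.load` produces.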