# Inspect live objects

This notebook illustrates the powerful [`inspect`][1] package from the Python Standard Library, which allows users to inspect live objects. This package can help developers to:

 - better understand internal Python mechanics;
 - explore modules developed by others, whether the source is available or not;
 - be very creative and hack a lot.

[1]: https://docs.python.org/3/library/inspect.html#module-inspect

In [1]:
import inspect

## Trial Package

We will explore the pandas package to illustrate the power of the `inspect` package. First we import it:

In [2]:
import pandas as pd
pd.__version__

'0.23.0'

Now the variable `pd` refers to a module object:

In [3]:
pd

<module 'pandas' from 'C:\\ProgramData\\Anaconda3\\lib\\site-packages\\pandas\\__init__.py'>

And this is confirmed by inspection:

In [4]:
inspect.ismodule(pd)

True

Even more useful, we can find out in which file the module resides. Unsurprisingly, it is the package's `__init__.py`:

In [5]:
inspect.getfile(pd)

'C:\\ProgramData\\Anaconda3\\lib\\site-packages\\pandas\\__init__.py'

This function is quite handy when you need to find out in which file a piece of source code resides.
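One caveat is worth a quick illustration: `getfile` only works for objects that are backed by a file. The sketch below uses two standard library modules (chosen here for illustration, they are not part of the pandas example) to show that a pure-Python package reports its `__init__.py`, while a module compiled into the interpreter, such as `sys`, raises `TypeError`.

```python
import inspect
import json
import sys

# A pure-Python package has a file on disk: its __init__.py.
json_path = inspect.getfile(json)
print(json_path)

# `sys` is compiled into the interpreter, so there is no file to report
# and getfile raises TypeError instead of returning a path.
sys_error = None
try:
    inspect.getfile(sys)
except TypeError as exc:
    sys_error = exc
    print("No file for sys:", exc)
```

Wrapping the call in `try`/`except TypeError` is therefore a reasonable precaution when looping over arbitrary objects.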

## List objects

The function [`getmembers`][1] returns a list of `(name, value)` 2-tuples for the members held by the inspected object, so the list can be converted directly into a dict using the `dict` constructor. Calling it on a module returns a lot of objects; the list can be narrowed with a predicate function that keeps only specific kinds of objects.

We select available classes and functions within the pandas root module:

[1]: https://docs.python.org/3/library/inspect.html#inspect.getmembers

In [6]:
objects = dict(inspect.getmembers(pd, inspect.isclass))
functions = dict(inspect.getmembers(pd, inspect.isfunction))

We can see that the pandas package initialization exposes a lot of objects defined across many submodules:

In [7]:
objects

{'Categorical': pandas.core.arrays.categorical.Categorical,
 'CategoricalIndex': pandas.core.indexes.category.CategoricalIndex,
 'DataFrame': pandas.core.frame.DataFrame,
 'DateOffset': pandas.tseries.offsets.DateOffset,
 'DatetimeIndex': pandas.core.indexes.datetimes.DatetimeIndex,
 'ExcelFile': pandas.io.excel.ExcelFile,
 'ExcelWriter': pandas.io.excel.ExcelWriter,
 'Float64Index': pandas.core.indexes.numeric.Float64Index,
 'Grouper': pandas.core.groupby.groupby.Grouper,
 'HDFStore': pandas.io.pytables.HDFStore,
 'Index': pandas.core.indexes.base.Index,
 'Int64Index': pandas.core.indexes.numeric.Int64Index,
 'Interval': pandas._libs.interval.Interval,
 'IntervalIndex': pandas.core.indexes.interval.IntervalIndex,
 'MultiIndex': pandas.core.indexes.multi.MultiIndex,
 'Panel': pandas.core.panel.Panel,
 'Period': pandas._libs.tslibs.period.Period,
 'PeriodIndex': pandas.core.indexes.period.PeriodIndex,
 'RangeIndex': pandas.core.indexes.range.RangeIndex,
 'Series': pandas.core.series.Ser

There are a lot of useful functions directly available:

In [8]:
functions.keys()

dict_keys(['Expr', 'Term', 'bdate_range', 'concat', 'crosstab', 'cut', 'date_range', 'eval', 'factorize', 'get_dummies', 'get_store', 'groupby', 'infer_freq', 'interval_range', 'isna', 'isnull', 'lreshape', 'match', 'melt', 'merge', 'merge_asof', 'merge_ordered', 'notna', 'notnull', 'period_range', 'pivot', 'pivot_table', 'pnow', 'qcut', 'read_clipboard', 'read_csv', 'read_excel', 'read_feather', 'read_fwf', 'read_gbq', 'read_hdf', 'read_html', 'read_json', 'read_msgpack', 'read_parquet', 'read_pickle', 'read_sas', 'read_sql', 'read_sql_query', 'read_sql_table', 'read_stata', 'read_table', 'scatter_matrix', 'set_eng_float_format', 'show_versions', 'test', 'timedelta_range', 'to_datetime', 'to_msgpack', 'to_numeric', 'to_pickle', 'to_timedelta', 'unique', 'value_counts', 'wide_to_long'])
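The predicate does not have to be one of the `inspect.is*` helpers: any callable that takes an object and returns a truthy value will do. As a minimal sketch, the example below filters the standard library `json` module (used here only so the example runs without pandas) with a hypothetical predicate `defined_in_json` that keeps the functions defined in the package's own `__init__.py`.

```python
import inspect
import json


def defined_in_json(obj):
    """Hypothetical predicate: keep Python functions whose home module is `json`."""
    return inspect.isfunction(obj) and obj.__module__ == "json"


# Custom predicate: only functions defined in json/__init__.py itself.
local_functions = dict(inspect.getmembers(json, defined_in_json))

# Built-in predicate: every class reachable from the json namespace.
classes = dict(inspect.getmembers(json, inspect.isclass))

print(sorted(local_functions))
print(sorted(classes))

# isfunction only matches functions written in Python; callables implemented
# in C, such as len, are matched by isbuiltin instead.
print(inspect.isfunction(len), inspect.isbuiltin(len))
```

The last line hints at why a listing built with `inspect.isfunction` may not contain every callable of a package: callables implemented in C are not reported as functions.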

## Explore object

Now we will explore a specific object. We pick the famous `DataFrame`:

In [9]:
inspect.isclass(pd.DataFrame)

True

### Find the source code

In [10]:
inspect.getfile(pd.DataFrame)

'C:\\ProgramData\\Anaconda3\\lib\\site-packages\\pandas\\core\\frame.py'

In [11]:
inspect.getsourcefile(pd.DataFrame)

'C:\\ProgramData\\Anaconda3\\lib\\site-packages\\pandas\\core\\frame.py'

In [12]:
inspect.getmodule(pd.DataFrame)

<module 'pandas.core.frame' from 'C:\\ProgramData\\Anaconda3\\lib\\site-packages\\pandas\\core\\frame.py'>

In [13]:
inspect.getmodulename(inspect.getsourcefile(pd.DataFrame))

'frame'

In [14]:
inspect.getsourcelines(pd.DataFrame)[-1]

248

This information is really helpful because it tells us where the actual code resides. For instance, the definition of the `DataFrame` class in version `0.23.0` is located in `pandas/core/frame.py` at line `248`.

So it can be retrieved easily from GitHub, which spares us the time of browsing hundreds of source files:

 - https://github.com/pandas-dev/pandas/blob/0.23.x/pandas/core/frame.py#L248
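The lookups above can be bundled into a small helper. The function `locate` below is a hypothetical convenience wrapper, not part of `inspect`, and it is demonstrated on a standard library class so that it runs without pandas. It relies on `getsourcelines` returning a pair made of the list of source lines and the line number where they start, which is why indexing the result with `[-1]` yields the line number.

```python
import inspect
import json


def locate(obj):
    """Return (module name, source file, first line number) for `obj`."""
    module = inspect.getmodule(obj)
    source_file = inspect.getsourcefile(obj)
    # getsourcelines returns (list of source lines, starting line number).
    _, first_line = inspect.getsourcelines(obj)
    return module.__name__, source_file, first_line


module_name, source_file, first_line = locate(json.JSONDecoder)
print(f"{module_name}: {source_file}, line {first_line}")
```

The same call should work with `pd.DataFrame`, provided the Python source of the package is installed alongside it.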

In [15]:
print(inspect.getsource(pd.DataFrame)[:248])

class DataFrame(NDFrame):
    """ Two-dimensional size-mutable, potentially heterogeneous tabular data
    structure with labeled axes (rows and columns). Arithmetic operations
    align on both row and column labels. Can be thought of as a dict-li


In [16]:
print(inspect.getdoc(pd.DataFrame)[:248])

Two-dimensional size-mutable, potentially heterogeneous tabular data
structure with labeled axes (rows and columns). Arithmetic operations
align on both row and column labels. Can be thought of as a dict-like
container for Series objects. The prima


In [17]:
print(inspect.getcomments(pd.DataFrame))

None
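The three functions read different parts of the source: `getsource` returns the code itself, `getdoc` returns the docstring with its indentation cleaned up, and `getcomments` returns the `#` comment lines located immediately above the definition, or `None` when there are none. The sketch below makes this visible on a throwaway module; the module name `demo_mod` and its `add` function are invented for the example and written to a temporary file so that `inspect` can read the source back.

```python
import importlib.util
import inspect
import pathlib
import tempfile
import textwrap

# Hypothetical module source, written to disk so that inspect can read it.
SOURCE = textwrap.dedent('''\
    # Add two numbers.
    # (These comment lines sit directly above the definition.)
    def add(a, b):
        """
        Return the sum of a and b.
        """
        return a + b
''')

with tempfile.TemporaryDirectory() as tmp:
    path = pathlib.Path(tmp) / "demo_mod.py"
    path.write_text(SOURCE)

    # Import the file as a module without touching sys.path.
    spec = importlib.util.spec_from_file_location("demo_mod", path)
    demo_mod = importlib.util.module_from_spec(spec)
    spec.loader.exec_module(demo_mod)

    # Query the source while the file still exists.
    comments = inspect.getcomments(demo_mod.add)
    doc = inspect.getdoc(demo_mod.add)
    source = inspect.getsource(demo_mod.add)

print(comments)
print(doc)
print(source)
```

Seen this way, the `None` returned for `DataFrame` simply indicates that no comment lines directly precede its class definition.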
