# Chapter 3: IPython
# An Interactive Computing and Development Environment
---------


## IPython Basics

In [36]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

In [37]:
data = {i : np.random.randn() for i in range(7)}
data

{0: 1.1223853116662184,
 1: -0.3692035913922635,
 2: -0.5813181853474677,
 3: -0.4871844182477777,
 4: 0.9541142634919332,
 5: 0.39061161711148945,
 6: -1.7483533484319194}

## Tab completion

While entering expressions in the shell, pressing **Tab** will search the namespace for any variables (objects, functions, etc.) matching the characters you have typed so far:

In [38]:
an_apple = 15
an_example = 30
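
The prefix search described above can be approximated in plain Python. This is only a minimal sketch, not IPython's actual completer; the function name `complete` and the sample namespace are assumptions made for illustration.

```python
def complete(prefix, namespace):
    """Return the sorted names in `namespace` that start with `prefix`."""
    return sorted(name for name in namespace if name.startswith(prefix))


# A hypothetical namespace standing in for the interactive session.
namespace = {"an_apple": 15, "an_example": 30, "any": any, "b": [1, 2, 3]}

matches = complete("an", namespace)
print(matches)  # ['an_apple', 'an_example', 'any']
```

Typing `an` and pressing **Tab** in the shell would offer a comparable list of candidates.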

Type a dot after an object's name and press **Tab** to list its methods and attributes.

In [39]:
b = [1, 2, 3]
b.append(3)
b.count(3)

2
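
Outside IPython, the built-in `dir()` gives a comparable listing of an object's attributes. The sketch below, with illustrative variable names, splits that listing into the names without a leading underscore and the names that start with one, assuming the default behaviour in which underscore names are hidden until an underscore is typed.

```python
b = [1, 2, 3]

# Names without a leading underscore, i.e. the ordinary methods.
public = [name for name in dir(b) if not name.startswith("_")]

# Names that start with an underscore, such as __doc__ or __len__.
private = [name for name in dir(b) if name.startswith("_")]

print(public)
print(private[:5])
```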

**“Private” methods**: type an underscore after the dot and press **Tab** to list the methods and attributes whose names start with an underscore.

In [40]:
b.__doc__

"list() -> new empty list\nlist(iterable) -> new list initialized from iterable's items"

**Tab** also completes directory and file paths while you type them inside a string:

In [41]:
path = '/home/khanhlq/PycharmProjects/python_data_build/chapter3/'

## Introspection

In [42]:
b = pd.DataFrame({
        "name": ['john', 'mary', 'anna'],
        "age": [25, 23, 19]
    })
b?

```python
Type:        DataFrame
String form:
   age  name
0   25  john
1   23  mary
2   19  anna
Length:      3
File:        ~/anaconda2/lib/python2.7/site-packages/pandas/core/frame.py
Docstring:  
Two-dimensional size-mutable, potentially heterogeneous tabular data
structure with labeled axes (rows and columns). Arithmetic operations
align on both row and column labels. Can be thought of as a dict-like
container for Series objects. The primary pandas data structure

Parameters
----------
data : numpy ndarray (structured or homogeneous), dict, or DataFrame
    Dict can contain Series, arrays, constants, or list-like objects
index : Index or array-like
    Index to use for resulting frame. Will default to np.arange(n) if
    no indexing information part of input data and no index provided
columns : Index or array-like
    Column labels to use for resulting frame. Will default to
    np.arange(n) if no column labels are provided
dtype : dtype, default None
    Data type to force, otherwise infer
copy : boolean, default False
    Copy data from inputs. Only affects DataFrame / 2d ndarray input

Examples
--------
>>> d = {'col1': ts1, 'col2': ts2}
>>> df = DataFrame(data=d, index=index)
>>> df2 = DataFrame(np.random.randn(10, 5))
>>> df3 = DataFrame(np.random.randn(10, 5),
...                 columns=['a', 'b', 'c', 'd', 'e'])

See also
--------
DataFrame.from_records : constructor from tuples, also record arrays
DataFrame.from_dict : from dicts of Series, arrays, or dicts
DataFrame.from_items : from sequence of (key, value) pairs
pandas.read_csv, pandas.read_table, pandas.read_clipboard
```

In [43]:
a = 1
a?

```python
Type:        int
String form: 1
Docstring:  
int(x=0) -> int or long
int(x, base=10) -> int or long

Convert a number or string to an integer, or return 0 if no arguments
are given.  If x is floating point, the conversion truncates towards zero.
If x is outside the integer range, the function returns a long instead.

If x is not a number or if base is given, then x must be a string or
Unicode object representing an integer literal in the given base.  The
literal can be preceded by '+' or '-' and be surrounded by whitespace.
The base defaults to 10.  Valid bases are 0 and 2-36.  Base 0 means to
interpret the base from the string as an integer literal.
>>> int('0b100', base=0)
4
```

In [44]:
def add_numbers(a, b):
    """
    Add two numbers
    
    Returns: the sum of two numbers
    """
    return a + b
add_numbers?

```python
Signature: add_numbers(a, b)
Docstring:
Add two numbers

Returns: the sum of two numbers
File:      ~/PycharmProjects/python_data_build/chapter3/<ipython-input-21-932640ba9f34>
Type:      function
```
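
Much of what `?` reports can also be collected by hand with the standard `inspect` module. The following is a sketch under the assumption of Python 3 (`inspect.signature` is not available in Python 2.7); it imitates the layout of the output above and is not IPython's implementation.

```python
import inspect


def add_numbers(a, b):
    """
    Add two numbers

    Returns: the sum of two numbers
    """
    return a + b


signature = str(inspect.signature(add_numbers))  # parameter list as text
doc = inspect.getdoc(add_numbers)                # docstring with indentation removed
kind = type(add_numbers).__name__                # name of the object's type

print("Signature: add_numbers" + signature)
print("Docstring:\n" + doc)
print("Type:      " + kind)
```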

To view an object's source code, use a double question mark (`??`):

In [45]:
pd.pivot_table??

```python
Signature: pd.pivot_table(data, values=None, index=None, columns=None, aggfunc='mean', fill_value=None, margins=False, dropna=True, margins_name='All')
Source:   
def pivot_table(data, values=None, index=None, columns=None, aggfunc='mean',
                fill_value=None, margins=False, dropna=True,
                margins_name='All'):
    """
    Create a spreadsheet-style pivot table as a DataFrame. The levels in the
    pivot table will be stored in MultiIndex objects (hierarchical indexes) on
    the index and columns of the result DataFrame

    Parameters
    ----------
    data : DataFrame
    values : column to aggregate, optional
    index : column, Grouper, array, or list of the previous
        If an array is passed, it must be the same length as the data. The list
        can contain any of the other types (except list).
        Keys to group by on the pivot table index.  If an array is passed, it
        is being used as the same manner as column values.
    columns : column, Grouper, array, or list of the previous
        If an array is passed, it must be the same length as the data. The list
        can contain any of the other types (except list).
        Keys to group by on the pivot table column.  If an array is passed, it
        is being used as the same manner as column values.
    aggfunc : function or list of functions, default numpy.mean
        If list of functions passed, the resulting pivot table will have
        hierarchical columns whose top level are the function names (inferred
        from the function objects themselves)
    fill_value : scalar, default None
        Value to replace missing values with
    margins : boolean, default False
        Add all row / columns (e.g. for subtotal / grand totals)
    dropna : boolean, default True
        Do not include columns whose entries are all NaN
    margins_name : string, default 'All'
        Name of the row / column that will contain the totals
        when margins is True.

    Examples
    --------
    >>> df
       A   B   C      D
    0  foo one small  1
    1  foo one large  2
    2  foo one large  2
    3  foo two small  3
    4  foo two small  3
    5  bar one large  4
    6  bar one small  5
    7  bar two small  6
    8  bar two large  7

    >>> table = pivot_table(df, values='D', index=['A', 'B'],
    ...                     columns=['C'], aggfunc=np.sum)
    >>> table
              small  large
    foo  one  1      4
         two  6      NaN
    bar  one  5      4
         two  6      7

    Returns
    -------
    table : DataFrame
    """
    index = _convert_by(index)
    columns = _convert_by(columns)

    if isinstance(aggfunc, list):
        pieces = []
        keys = []
        for func in aggfunc:
            table = pivot_table(data, values=values, index=index,
                                columns=columns,
                                fill_value=fill_value, aggfunc=func,
                                margins=margins)
            pieces.append(table)
            keys.append(func.__name__)
        return concat(pieces, keys=keys, axis=1)

    keys = index + columns

    values_passed = values is not None
    if values_passed:
        if com.is_list_like(values):
            values_multi = True
            values = list(values)
        else:
            values_multi = False
            values = [values]
    else:
        values = list(data.columns.drop(keys))

    if values_passed:
        to_filter = []
        for x in keys + values:
            if isinstance(x, Grouper):
                x = x.key
            try:
                if x in data:
                    to_filter.append(x)
            except TypeError:
                pass
        if len(to_filter) < len(data.columns):
            data = data[to_filter]

    grouped = data.groupby(keys)
    agged = grouped.agg(aggfunc)

    table = agged
    if table.index.nlevels > 1:
        to_unstack = [agged.index.names[i] or i
                      for i in range(len(index), len(keys))]
        table = agged.unstack(to_unstack)

    if not dropna:
        try:
            m = MultiIndex.from_arrays(cartesian_product(table.index.levels),
                                       names=table.index.names)
            table = table.reindex_axis(m, axis=0)
        except AttributeError:
            pass  # it's a single level

        try:
            m = MultiIndex.from_arrays(cartesian_product(table.columns.levels),
                                       names=table.columns.names)
            table = table.reindex_axis(m, axis=1)
        except AttributeError:
            pass  # it's a single level or a series

    if isinstance(table, DataFrame):
        if isinstance(table.columns, MultiIndex):
            table = table.sortlevel(axis=1)
        else:
            table = table.sort_index(axis=1)

    if fill_value is not None:
        table = table.fillna(value=fill_value, downcast='infer')

    if margins:
        if dropna:
            data = data[data.notnull().all(axis=1)]
        table = _add_margins(table, data, values, rows=index,
                             cols=columns, aggfunc=aggfunc,
                             margins_name=margins_name)

    # discard the top level
    if values_passed and not values_multi and not table.empty:
        table = table[values[0]]

    if len(index) == 0 and len(columns) > 0:
        table = table.T

    return table
File:      ~/anaconda2/lib/python2.7/site-packages/pandas/tools/pivot.py
Type:      function
```
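
In plain Python, `inspect.getsource` retrieves the same kind of information for functions written in Python. The sketch below uses `json.dumps` from the standard library as a stand-in so that it does not depend on pandas being installed.

```python
import inspect
import json

source = inspect.getsource(json.dumps)        # full text of the function
location = inspect.getsourcefile(json.dumps)  # path of the file that defines it

print(location)
print("\n".join(source.splitlines()[:3]))     # show only the first few lines
```

Like `??`, this works only when the source file is available; it fails for functions implemented in C.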

## The %run Command
------

In [46]:
%run script.py

The script is run in an empty namespace (with no imports or other variables defined) so that the behavior should be identical to running the program on the command line using `python script.py`. All of the variables (imports, functions, and globals) defined in the file (up until an exception, if any, is raised) will then be accessible in the IPython shell:

In [47]:
result

77
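
A rough standard-library analogue of this behaviour is `runpy.run_path`, which executes a file in a fresh namespace and returns the names it defined. The sketch below writes its own small script, because the contents of the original `script.py` are not shown; the one-line script body is an assumption chosen to reproduce the value above.

```python
import os
import runpy
import tempfile

# Stand-in script contents (an assumption; the real script.py is not shown).
script_source = "result = 7 * 11\n"

with tempfile.TemporaryDirectory() as tmp_dir:
    script_path = os.path.join(tmp_dir, "script.py")
    with open(script_path, "w") as handle:
        handle.write(script_source)

    # Run the file in a fresh namespace, as if it were the main program,
    # and get back a dictionary of the globals it defined.
    namespace = runpy.run_path(script_path, run_name="__main__")

result = namespace["result"]
print(result)  # 77
```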

### Python command line arguments
Suppose we have a Python file (`sys_lab.py`) that uses the **sys** library to print its command line arguments:
```python
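import sys

# sys.argv holds the script name followed by the arguments, all as strings
print(sys.argv)
# Number of entries, including the script name itself
print(len(sys.argv))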
```

In [48]:
%run sys_lab.py 12 89 'abc'

['sys_lab.py', '12', '89', "'abc'"]
4
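
As the output shows, every entry of `sys.argv` is a string, so numeric arguments have to be converted explicitly. The sketch below uses a hard-coded list in place of `sys.argv`, and the helper `parse_numbers` is hypothetical.

```python
def parse_numbers(argv):
    """Split argv into the script name, integer arguments, and the rest."""
    script, *args = argv
    numbers, others = [], []
    for arg in args:
        try:
            numbers.append(int(arg))
        except ValueError:
            others.append(arg)
    return script, numbers, others


# Stand-in for sys.argv after running: sys_lab.py 12 89 abc
argv = ["sys_lab.py", "12", "89", "abc"]
script, numbers, others = parse_numbers(argv)
print(script, numbers, others)  # sys_lab.py [12, 89] ['abc']
```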


## Keyboard shortcuts in IPython (shell)

| Command | Description |
|---------|-------------|
|Ctrl-P or up-arrow| Search backward in command history for commands starting with currently-entered text |
|Ctrl-N or down-arrow | Search forward in command history for commands starting with currently-entered text |
|Ctrl-R | Readline-style reverse history search (partial matching)|
|Ctrl-Shift-V | Paste text from clipboard|
|Ctrl-C | Interrupt currently-executing code|
|Ctrl-A | Move cursor to beginning of line|
|Ctrl-E | Move cursor to end of line|
|Ctrl-K | Delete text from cursor until end of line|
|Ctrl-U | Discard all text on current line|
|Ctrl-F | Move cursor forward one character|
|Ctrl-B | Move cursor back one character|
|Ctrl-L |Clear screen|

## Exceptions and Tracebacks
If an exception is raised while %run-ing a script or executing any statement, IPython will by default print a full call stack trace (traceback) with a few lines of context around the position at each point in the stack.

In [49]:
pront

NameError: name 'pront' is not defined
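
For comparison, the standard `traceback` module can capture the basic stack trace as text in any Python program. This sketch only shows the plain traceback; it does not reproduce the extra lines of context that IPython adds.

```python
import traceback

try:
    pront  # undefined name, as in the cell above
except NameError:
    # Format the exception currently being handled as a string.
    report = traceback.format_exc()

print(report)
```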

## Magic Commands
----
IPython has many special commands, known as “magic” commands, which are designed to facilitate common tasks and enable you to easily control the behavior of the IPython system. A magic command is any command prefixed by the percent symbol `%`.

For example:
* **%timeit**: Run a statement multiple times to compute an ensemble average execution time. Useful for timing code with very short execution time.

In [None]:
a = np.random.randn(100, 100)
a

In [None]:
%timeit np.dot(a, a)
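
Outside IPython, the standard `timeit` module offers the same kind of repeated measurement. The sketch below uses a small pure-Python workload (the function `work` is a stand-in) and reports the fastest run, which is one common convention; it is not meant to reproduce the exact output of `%timeit`.

```python
import timeit


def work():
    """Small stand-in workload."""
    return sum(i * i for i in range(1000))


# Five measurements of 100 calls each.
timings = timeit.repeat(work, number=100, repeat=5)
best_per_call = min(timings) / 100

print("best of 5: %.2f microseconds per call" % (best_per_call * 1e6))
```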

* **%quickref**: Display the IPython Quick Reference Card

In [None]:
%quickref

* **%alias**: Define an alias (shortcut) for a system command

In [None]:
%alias d dir

In [None]:
!dir

* **%hist**: Print the command input history

In [None]:
%hist

* **%who**: Display the variables defined in the interactive namespace

In [None]:
%who

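* **%reset**: Delete all the variables / names defined in the interactive namespace

When working with large data sets, keep in mind that IPython's input and output history keeps references to the objects it records, so they are not freed and memory use can grow until the machine slows down or freezes. **%xdel** deletes a single variable and tries to clear IPython's references to it, and **%reset** clears the whole interactive namespace; both help release that memory.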

In [None]:
%reset
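
The memory concern comes from references: an object stays alive as long as anything still refers to it, including a history cache. The sketch below demonstrates this with a plain dictionary standing in for the output history; the names `BigObject` and `output_history` are hypothetical, and the exact timing of reclamation assumes CPython.

```python
import gc
import weakref


class BigObject:
    """Stand-in for a large array or DataFrame."""


output_history = {}        # hypothetical stand-in for the output cache

obj = BigObject()
probe = weakref.ref(obj)   # lets us check whether the object still exists
output_history[1] = obj    # the cache now holds a second reference

del obj
gc.collect()
alive_after_del = probe() is not None    # the cache still holds the object

output_history.clear()     # dropping the cached reference
gc.collect()
alive_after_clear = probe() is not None  # nothing refers to the object now

print(alive_after_del, alive_after_clear)  # True False
```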

## Qt-based Rich GUI Console

### Running:
```shell
ipython qtconsole --pylab=inline
```

### Description:
The Qt console can launch multiple IPython processes in tabs, enabling you to switch between tasks. It can also share a process with the IPython HTML Notebook application.

## Examples

* Show an image
```python
import matplotlib.pyplot as plt
img = plt.imread('/home/khanhlq/Pictures/hello_world.png')
plt.imshow(img)
plt.show()
```

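* Show a plot
```python
import matplotlib.pyplot as plt
import numpy as np

# Plot the cumulative sum of 1000 normally distributed random values
plt.plot(np.random.randn(1000).cumsum())
plt.show()
```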

## Matplotlib Integration and Pylab Mode
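IPython integrates seamlessly with GUI toolkits such as those used by matplotlib: plot windows can stay open while you keep typing commands. In the regular Python shell, creating a plot window hands control to the GUI event loop until the window is closed.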

## Using the Command History
IPython maintains a small on-disk database containing the text of each command that you execute. This serves various purposes:
* Searching, completing, and executing previously executed commands with minimal typing.

    Use Ctrl-R to search through earlier commands.
    

* Persisting the command history between sessions.
    
    * Use **_iX** to get the input entered on line X (as a string)
    * Use **_X** to get the output of line X




In [51]:
_i3

NameError: name '_i3' is not defined

In [None]:
_4

* Logging the input/output history to a file

In [52]:
%logstart

Activating auto-logging. Current session state plus future input saved.
Filename       : ipython_log.py
Mode           : rotate
Output logging : False
Raw input log  : False
Timestamping   : False
State          : active
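
To make the idea of a searchable command database concrete, here is a toy version built on the standard `sqlite3` module. It uses an in-memory database, and the table layout and the `search` helper are hypothetical; this is not IPython's real schema or search logic.

```python
import sqlite3

# In-memory database for the sketch; the table layout is hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE history (line INTEGER PRIMARY KEY, source TEXT)")

commands = ["import numpy as np", "a = np.arange(10)", "a.sum()", "%logstart"]
conn.executemany(
    "INSERT INTO history (source) VALUES (?)",
    [(command,) for command in commands],
)


def search(fragment):
    """Return (line, source) pairs containing `fragment`, newest first."""
    rows = conn.execute(
        "SELECT line, source FROM history "
        "WHERE instr(source, ?) > 0 ORDER BY line DESC",
        (fragment,),
    )
    return rows.fetchall()


hits = search("np.")
print(hits)  # [(2, 'a = np.arange(10)')]
```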


## Interacting with the Operating System

In [None]:
ls

In [None]:
pwd

In [53]:
ip_info = !ifconfig | grep "inet "
ip_info

["'ifconfig' is not recognized as an internal or external command,",
 'operable program or batch file.']

Prefix a command with `!` to run it in the system shell:

In [None]:
!touch a.txt

In [None]:
!rm -rf a.txt
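
In an ordinary Python script, capturing a command's output as a list of lines (as `ip_info = !...` does above) can be done with `subprocess`. This sketch assumes Python 3.7 or later for `capture_output`, and it runs the current Python interpreter as the external command so that it does not depend on a particular operating system.

```python
import subprocess
import sys

# The external command: the current interpreter printing two lines.
command = [sys.executable, "-c", "print('hello'); print('world')"]

completed = subprocess.run(command, capture_output=True, text=True, check=True)
lines = completed.stdout.splitlines()  # one list entry per output line

print(lines)  # ['hello', 'world']
```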