## Introduction
#### Here, we will look into the deeper functionality of the IPython system that can be used from the console or within Jupyter.
#### NOTE - Most of the commands here relate to the IPython console rather than the Jupyter notebook. Wherever applicable, we will provide code examples; otherwise, most of the material is theoretical.

## Using the Command History
#### IPython maintains a small, on-disk database containing the text of each command that you executed.
#### This serves various purposes:
####     1. Searching, completing and executing previously executed commands with minimal typing
####     2. Persisting the command history between sessions
####     3. Logging input/output history to file
#### These features are more useful in the shell than in the notebook, as the notebook by design keeps a log of the input and output of each code cell.

### Searching and Reusing the Command History
#### The IPython shell lets you search and execute previous code or commands. This is useful for commands that you repeat often (eg - %run).
#### Let's assume that you ran a script as shown below and found an error. After modifying the script, you can start typing the first few letters of the %run command and press either 'Ctrl-P' or the 'up' arrow key.
#### This will search the command history for the first prior command matching the typed letters.
#### By pressing the same key repeatedly, you can continue to search back through the history.
#### You can move 'ahead' in the history by pressing 'Ctrl-N' or the 'down' arrow key.
#### 'Ctrl-R' gives you the same partial incremental searching that 'readline' provides in Unix-style shells.
#### On Windows, you can do the same by pressing 'Ctrl-R' and then typing a few characters contained in the input line you want to search for.
#### Pressing 'Ctrl-R' again will cycle through the history for each line matching the characters typed.

In [4]:
# # Example of %run command that may need to be repeated
# %run first/second/third/data_script.py

### Input and Output Variables
#### Sometimes we forget to assign the result of a function to a variable. IPython stores references to both input commands and output objects in special variables.
#### The previous 2 outputs are stored in _ (1 underscore) and __ (2 underscores) respectively.

In [6]:
2 ** 27

134217728

In [7]:
_

134217728

#### Input variables are stored in variables named like '_iX' where 'X' is the input line number.
#### For each input variable, we have a corresponding output variable '_X'. eg - For line 27, we have _27 (output) and _i27 (input).
#### Since the input variables are strings, they can be executed again with the built-in 'exec' function (eg - exec(_i27)).

In [9]:
foo = 'bar'

In [10]:
foo

'bar'

In [11]:
_i10

'foo'

In [12]:
_10

'bar'
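#### Because each input variable is an ordinary string, re-running it is just a matter of handing that string to 'exec'. The sketch below mimics this outside IPython; 'stored_input' and 'namespace' are hypothetical stand-ins for an '_iX' variable and the interactive namespace.

```python
# stored_input plays the role of an IPython input variable such as _i9
stored_input = "foo = 'bar'"

# Execute the stored source text in an explicit namespace (an assumed
# stand-in for the interactive namespace) so the effect is easy to inspect
namespace = {}
exec(stored_input, namespace)

print(namespace['foo'])
```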

#### There are several 'magic functions' which allow you to work with the input and output history. '%hist' can print all or part of the input history, with or without line numbers.
#### '%reset' clears the interactive namespace and optionally the input and output caches.
#### '%xdel' removes all references to a particular object from the IPython machinery.
#### WARNING - When working with large datasets, IPython's input and output history can keep any object referenced there from being 'garbage-collected' (i.e. having its memory freed), even if you delete the variable from the namespace using 'del'. In such cases, use '%xdel' or '%reset' to release the memory.
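#### The effect described in the warning can be reproduced in plain Python. In this minimal sketch, 'output_cache' is a hypothetical stand-in for IPython's output history and 'BigData' stands in for a large object; a weak reference is used only to observe whether the object is still alive.

```python
import gc
import weakref


class BigData:
    """Stand-in for a large object such as a big array or DataFrame."""


output_cache = {}            # hypothetical stand-in for IPython's output cache
data = BigData()
probe = weakref.ref(data)    # lets us check whether the object still exists
output_cache[1] = data       # the cache now holds a second reference

del data                     # removes only the name from the namespace
gc.collect()
alive_after_del = probe() is not None    # the cache still keeps it alive

output_cache.clear()         # loosely analogous to clearing the caches
gc.collect()
alive_after_clear = probe() is not None  # no references remain

print(alive_after_del, alive_after_clear)
```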

## Interacting with Operating System
#### IPython also allows seamless access to the filesystem and the operating system shell. You can perform standard command-line actions as you would in the Windows or Linux shell without exiting IPython.
#### This includes running shell commands, changing directories and storing the results of a command in a Python object. It also includes simple command aliasing and directory bookmarking.

### Shell Commands and Aliases
#### Starting a line with ! (exclamation or bang) tells IPython that what follows is a shell command.
#### You can store console output in a variable by assigning the expression escaped with ! to a variable.
#### The returned Python object is a custom list type having various versions of console output.

In [18]:
ip_info = !ipconfig

ip_info[23].strip()

'IPv6 Address. . . . . . . . . . . : 2601:c0:8180:378a:e4b6:e239:2a6:2930'
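#### Outside IPython, a rough plain-Python analogue of 'output = !command' is to capture the command's standard output with 'subprocess' and split it into lines. This is only a sketch of the idea, not how IPython implements it; the child command is an assumption chosen so that the example runs on any platform.

```python
import subprocess
import sys

# Run a small child process; sys.executable avoids depending on ipconfig or ls
completed = subprocess.run(
    [sys.executable, '-c', "print('first line'); print('second line')"],
    capture_output=True,
    text=True,
    check=True,
)

# One list entry per line of console output, similar in spirit to the
# list-like object that '!' returns
lines = completed.stdout.splitlines()
print(lines[1].strip())
```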

#### IPython can also substitute Python values defined in the current environment into a command escaped with '!'. To do this, preface the variable name with a dollar sign.
#### The '%alias' magic function can define custom shortcuts for shell commands.
#### You can execute multiple commands, just as on the command line, by separating them with semicolons.
#### IPython 'forgets' aliases you define interactively as soon as the session closes. To create permanent aliases, use the configuration system.

In [21]:
foo = 'Appendix*'

!dir $foo

 Volume in drive C has no label.
 Volume Serial Number is 2605-B644

 Directory of C:\Users\adity\PythonForDataAnalysis

09/07/2018  03:57 PM           574,196 Appendix A - Advanced NumPy.ipynb
09/08/2018  12:05 PM             8,213 Appendix B - More on IPython System.ipynb
               2 File(s)        582,409 bytes
               0 Dir(s)  435,794,505,728 bytes free


In [26]:
# alias does not work the same for Windows
# %alias 'test_alias' (cd examples; dir)
# test_alias

### Directory Bookmark System
#### IPython has a simple directory bookmarking system that lets you save aliases for common directories so that you can jump around easily.
#### After creating a bookmark using '%bookmark', you can pass its name to the '%cd' magic to jump to that directory.
#### If a bookmark name conflicts with a directory name in the current working directory, you can use the '-b' flag of '%cd' to override it and use the bookmark location.
#### Using the '-l' option with '%bookmark' lists all your bookmarks.
#### Bookmarks, unlike aliases, are automatically persisted between IPython sessions.

In [28]:
%bookmark datasets C:\Users\adity\PythonForDataAnalysis\datasets

In [29]:
cd datasets

C:\Users\adity\PythonForDataAnalysis\datasets


In [30]:
!dir

 Volume in drive C has no label.
 Volume Serial Number is 2605-B644

 Directory of C:\Users\adity\PythonForDataAnalysis\datasets

07/10/2018  02:24 PM    <DIR>          .
07/10/2018  02:24 PM    <DIR>          ..
07/10/2018  02:24 PM    <DIR>          babynames
07/10/2018  02:24 PM    <DIR>          bitly_usagov
07/10/2018  02:24 PM    <DIR>          fec
07/10/2018  02:24 PM    <DIR>          haiti
07/10/2018  02:24 PM    <DIR>          movielens
07/10/2018  02:24 PM    <DIR>          mta_perf
07/10/2018  02:24 PM    <DIR>          titanic
07/10/2018  02:24 PM    <DIR>          usda_food
               0 File(s)              0 bytes
              10 Dir(s)  435,658,518,528 bytes free


In [33]:
cd ..

C:\Users\adity\PythonForDataAnalysis


In [31]:
%bookmark -l

Current bookmarks:
datasets -> C:\Users\adity\PythonForDataAnalysis\datasets


## Software Development Tools
#### In addition to being a comfortable environment for data exploration, IPython can also be useful for general software development.
#### In data analysis, it's important first to have correct code. IPython closely integrates and enhances Python's built-in 'pdb' debugger.
#### To make your software fast, IPython has easy-to-use timing and profiling tools.

### Interactive Debugger
#### IPython's debugger enhances 'pdb' with tab completion, syntax highlighting and context for each line in exception tracebacks.
#### The best time to debug is right after the error has occurred. The '%debug' command, when entered immediately after an exception, invokes the 'post-mortem' debugger and drops you into the stack frame where the exception was raised.
#### Once inside the debugger, you can execute arbitrary Python code and explore all the objects and data inside each stack frame.
#### By default, you start at the lowest level, where the error occurred. By pressing 'u' (up) or 'd' (down), you can switch between the levels of the stack trace.
#### To exit the debugger, enter 'q' (quit) or 'exit()', or press 'Ctrl-d'.
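#### The 'levels' that 'u' and 'd' move between are the frames recorded in the exception's traceback. As a minimal sketch, assuming a script shaped like the 'ipython_bug.py' example used in this section, the standard 'traceback' module can list those frames and the local variables of the lowest one without entering the debugger.

```python
import traceback


def throws_an_exception():
    a = 5
    b = 6
    assert a + b == 10


def calling_things():
    throws_an_exception()


try:
    calling_things()
except AssertionError as exc:
    tb = exc.__traceback__   # the traceback a post-mortem debugger works from

# walk_tb yields (frame, line number) pairs from the outermost call inward
frames = [frame for frame, _ in traceback.walk_tb(tb)]
call_chain = [frame.f_code.co_name for frame in frames]

# The last frame is the lowest level, where the exception was raised
innermost = frames[-1]
print(call_chain[-2:])
print(dict(innermost.f_locals))
```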

In [35]:
run examples/ipython_bug.py

AssertionError: 

In [36]:
%debug

> c:\users\adity\pythonfordataanalysis\examples\ipython_bug.py(9)throws_an_exception()
      7     a = 5
      8     b = 6
----> 9     assert(a + b == 10)
     10 
     11 def calling_things():

ipdb> u
> c:\users\adity\pythonfordataanalysis\examples\ipython_bug.py(13)calling_things()
     11 def calling_things():
     12     works_fine()
---> 13     throws_an_exception()


#### Executing '%pdb' makes IPython automatically invoke the debugger after any exception, a mode that many users find useful.
#### You can also use the debugger to develop code, especially when you want to step through the execution of a function or script and examine the state at each stage. There are several ways to accomplish this.
#### The first is to use '%run' with the '-d' flag, which invokes the debugger before executing any code in the passed script. You must press 's' (step) to enter the script.
#### After this, you have to decide how to work through the file. eg - We can set a 'breakpoint' right before the call to 'works_fine' and run the script until reaching the breakpoint by pressing 'c' (continue).
#### Then we can step into 'works_fine()' by pressing 's' (step), or execute it without stepping in by pressing 'n' (next) to advance to the next line.
#### Then we could step into 'throws_an_exception()', advance to the line where the error occurs and look at the variables in scope. Note that debugger commands take precedence over variable names; in such cases, you need to precede the variable name with ! to examine its contents.
#### Developing proficiency with the interactive debugger requires practice and experience. It will take some time to master.

In [40]:
run -d examples/ipython_bug.py

Breakpoint 1 at c:\users\adity\pythonfordataanalysis\examples\ipython_bug.py:1
NOTE: Enter 'c' at the ipdb>  prompt to continue execution.
> c:\users\adity\pythonfordataanalysis\examples\ipython_bug.py(1)<module>()
1---> 1 def works_fine():
      2     a = 5
      3     b = 6
      4     assert(a + b == 11)
      5 

ipdb> s
> c:\users\adity\pythonfordataanalysis\examples\ipython_bug.py(6)<module>()
      4     assert(a + b == 11)
      5 

### Other Ways to make use of Debugger
#### There are a couple of other ways to invoke the debugger.
#### The first is by using a special 'set_trace' function (named after 'pdb.set_trace'), which is basically a "poor man's breakpoint".
#### The following 2 recipes are useful to keep around for general use.

In [42]:
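import sys

from IPython.core.debugger import Pdb

def set_trace():
    # Break in the caller's frame (f_back), not inside this helper
    Pdb(color_scheme='Linux').set_trace(sys._getframe().f_back)

def debug(f, *args, **kwargs):
    # Run f(*args, **kwargs) under the debugger and return its result
    pdb = Pdb(color_scheme='Linux')
    return pdb.runcall(f, *args, **kwargs)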

#### The 'set_trace' function can be placed anywhere in your code where you want to stop temporarily and examine the state more closely (eg - right before an exception occurs).
#### Pressing 'c' (continue) will cause the code to resume normally.
#### The 'debug' function enables you to invoke the interactive debugger easily on an arbitrary function call. To debug a function f, pass it as the first argument to debug, followed by the positional and keyword arguments required by f.
#### The debugger can also be used with '%run'. By running a script with '%run -d', you will be dropped directly into the debugger, ready to set breakpoints and start the script.
#### Adding '-b' with a line number starts the debugger with a breakpoint already set.

In [47]:
# ipython_set_trace.py already has set_trace() function with dependent modules
run examples/ipython_set_trace.py

  


> c:\users\adity\pythonfordataanalysis\examples\ipython_set_trace.py(19)calling_things()
     17 def calling_things():
     18     set_trace()
---> 19     throws_an_exception()
     20 
     21 calling_things()

ipdb> c


AssertionError: 

In [48]:
def f(x, y, z=1):
    tmp = x + y
    return tmp / z

In [49]:
# Running function f() with 'debug'
debug(f, 1, 2, z=3)



> <ipython-input-48-359ec13d6433>(2)f()
      1 def f(x, y, z=1):
----> 2     tmp = x + y
      3     return tmp / z

ipdb> c


1.0

In [50]:
# Running code in debugger mode
%run -d examples/ipython_bug.py

Breakpoint 1 at c:\users\adity\pythonfordataanalysis\examples\ipython_bug.py:1
NOTE: Enter 'c' at the ipdb>  prompt to continue execution.
> c:\users\adity\pythonfordataanalysis\examples\ipython_bug.py(1)<module>()
1---> 1 def works_fine():
      2     a = 5
      3     b = 6
      4     assert(a + b == 11)
      5 

ipdb> c
---------------------------------------------------------------------------
AssertionError                            Traceback (most recent call last)
~\Anaconda3\lib\site-packages\IPython\core\interactiveshell.py in safe_execfile

In [51]:
# Running debugger with line number sets breakpoint
%run -d -b2 examples/ipython_bug.py

Breakpoint 1 at c:\users\adity\pythonfordataanalysis\examples\ipython_bug.py:2
NOTE: Enter 'c' at the ipdb>  prompt to continue execution.
> c:\users\adity\pythonfordataanalysis\examples\ipython_bug.py(1)<module>()
----> 1 def works_fine():
1     2     a = 5
      3     b = 6
      4     assert(a + b == 11)
      5 

ipdb> c
> c:\users\adity\pythonfordataanalysis\examples\ipython_bug.py(2)works_fine()
      1 def works_fine():
1---> 2     a =

### Timing Code: %time and %timeit
#### For longer-running applications, you may want to measure the execution time of various components, individual statements or function calls.
#### You may also want a report of which functions are taking up the most time in a complex process.
#### IPython enables you to get this information easily while developing or testing your code.
#### Timing code manually using the built-in 'time' module and functions such as 'time.time' or 'time.perf_counter' ('time.clock', used in older code, was removed in Python 3.8) is tedious and repetitive, as the boilerplate code below shows.

In [54]:
'''
import time

start = time.time()
for i in range(iterations):
    #Some code
elapsed_per = (time.time() - start) / iterations
'''

'\nimport time\n\nstart = time.time()\nfor i in range(iterations):\n    #Some code\nelapsed_per = (time.time() - start) / iterations\n'
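#### The boilerplate above can be made runnable with a small helper. This is a minimal sketch: 'average_elapsed' is a hypothetical name, and 'time.perf_counter' is used here instead of 'time.time' because it is intended for measuring short intervals.

```python
import time


def average_elapsed(func, iterations=1000):
    """Return the average wall-clock seconds per call of func()."""
    start = time.perf_counter()
    for _ in range(iterations):
        func()
    return (time.perf_counter() - start) / iterations


# Time an arbitrary small piece of work
per_call = average_elapsed(lambda: sum(range(100)))
print(f'{per_call * 1e6:.2f} microseconds per call')
```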

#### This is so common that IPython has 2 magic functions, '%time' and '%timeit', to automate this process for you.
#### '%time' runs a statement once, reporting the total execution time. The 'Wall time' is the main number of interest.
#### But '%time' does not provide a very precise measurement, because it runs the statement only once. As the example below shows, the timing of the 2 statements varies with each execution.

In [56]:
strings = ['foo', 'foobar', 'baz', 'qux',
           'python', 'Guido Van Rossum'] * 1000000

method1 = [x for x in strings if x.startswith('foo')]
method2 = [x for x in strings if x[:3] == 'foo']

In [57]:
# First run
%time method1 = [x for x in strings if x.startswith('foo')]

Wall time: 823 ms


In [58]:
%time method2 = [x for x in strings if x[:3] == 'foo']

Wall time: 562 ms


In [59]:
# 2nd run - Lot of variation from 1st run
%time method1 = [x for x in strings if x.startswith('foo')]

Wall time: 843 ms


In [60]:
%time method2 = [x for x in strings if x[:3] == 'foo']

Wall time: 464 ms


#### To get a more precise measurement, use the '%timeit' magic function.
#### Given an arbitrary statement, it uses a heuristic to run the statement multiple times to produce a more accurate average runtime.

In [62]:
%timeit method1 = [x for x in strings if x.startswith('foo')]

835 ms ± 14.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [63]:
%timeit method2 = [x for x in strings if x[:3] == 'foo']

467 ms ± 20.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
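#### Outside IPython, a similar repeated measurement can be made with the standard 'timeit' module. The sketch below is not what '%timeit' does internally; the helper names, the smaller list and the 'repeat' and 'number' values are assumptions chosen to keep the run short.

```python
import timeit

# A smaller list than in the cells above, so the example finishes quickly
strings = ['foo', 'foobar', 'baz', 'qux',
           'python', 'Guido Van Rossum'] * 1000


def with_startswith():
    return [x for x in strings if x.startswith('foo')]


def with_slice():
    return [x for x in strings if x[:3] == 'foo']


# Each entry is the total time for 20 calls; keep the best of 5 repeats
best_startswith = min(timeit.repeat(with_startswith, repeat=5, number=20))
best_slice = min(timeit.repeat(with_slice, repeat=5, number=20))

print(f'startswith: {best_startswith / 20 * 1e3:.3f} ms per call')
print(f'slice:      {best_slice / 20 * 1e3:.3f} ms per call')
```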


#### The above example illustrates that it is worth understanding the performance characteristics of the Python standard library, NumPy, pandas, etc. In large-scale data analysis applications, every millisecond is worth saving.
#### '%timeit' is especially useful for analyzing statements and functions with very short execution times, down to the level of microseconds or nanoseconds.
#### eg - A 20 microsecond function invoked 1 million times takes 15 seconds longer than a 5 microsecond function.
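#### The 15 second figure follows directly from the difference per call multiplied by the number of calls, as this short check shows (the variable names are illustrative only).

```python
calls = 1_000_000
slow_us = 20   # microseconds per call for the slower function
fast_us = 5    # microseconds per call for the faster function

# Total extra time in seconds: calls * difference per call, converted from microseconds
extra_seconds = calls * (slow_us - fast_us) / 1_000_000
print(extra_seconds)
```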

In [65]:
x = 'foobar'
y = 'foo'

%timeit x.startswith(y)

141 ns ± 2.26 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


In [66]:
%timeit x[:3] == y

95.2 ns ± 1.01 ns per loop (mean ± std. dev. of 7 runs, 10000000 loops each)


### Basic Profiling: %prun and %run -p
#### Profiling code is closely related to timing code. The only difference is that profiling is concerned with determining 'where' the time is spent.
#### The main Python profiling tool is 'cProfile', which is not specific to IPython. It executes a program or an arbitrary block of code while keeping track of how much time is spent in each function.
#### A common way to use it is on the command line, running an entire program and outputting the aggregate time per function.
#### It's easiest to scan down the 'cumtime' column and see how much total time was spent inside each function (pass '-s cumulative' to 'python -m cProfile' to sort the report by that column).
#### Note that if a function calls another function, its clock 'does not stop running': cProfile records the start and end time of each function call and uses that to produce the timing.

In [71]:
import numpy as np
from numpy.linalg import eigvals

def run_experiment(niter=100):
    K = 100
    results = []
    for _ in range(niter):
        mat = np.random.randn(K, K)
        max_eigenvalue = np.abs(eigvals(mat)).max()
        results.append(max_eigenvalue)
    return results

some_results = run_experiment()
print('Largest one we saw: ',np.max(some_results))

Largest one we saw:  11.51654142464131


In [73]:
!python -m cProfile examples/cprof_example.py

Largest one we saw:  11.622255056021956
         57874 function calls (55609 primitive calls) in 0.490 seconds

   Ordered by: standard name

   ncalls  tottime  percall  cumtime  percall filename:lineno(function)
      354    0.001    0.000    0.001    0.000 <frozen importlib._bootstrap>:103(release)
      162    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:143(__init__)
      162    0.000    0.000    0.002    0.000 <frozen importlib._bootstrap>:147(__enter__)
      162    0.000    0.000    0.001    0.000 <frozen importlib._bootstrap>:151(__exit__)
      354    0.001    0.000    0.002    0.000 <frozen importlib._bootstrap>:157(_get_module_lock)
      161    0.000    0.000    0.000    0.000 <frozen importlib._bootstrap>:176(cb)
      192    0.000    0.000    0.001    0.000 <frozen importlib._bootstrap>:194(_lock_unlock_module)
    220/1    0.000    0.000    0.117    0.117 <frozen importlib._bootstrap>:211(_call_with_frames_removed)
     1546    0.001    0.000    0.001

#### Besides the command-line usage, cProfile can also be used programmatically to profile arbitrary blocks of code without having to run a new process.
#### IPython has a convenient interface to this capability: the '%prun' command and the '-p' option to '%run'.
#### '%prun' takes the same "command-line options" as cProfile but profiles an arbitrary Python statement instead of a whole '.py' file.
#### Calling '%run -p -s cumulative cprof_example.py' has the same effect as the command-line approach, except that you never have to leave IPython.
#### In the Jupyter notebook, you can use '%%prun' to profile an entire code cell. It pops up a separate window with the profile output. This is useful for getting quick answers on code performance.
#### There are other tools, like SnakeViz, that make these profiles easier to understand when using IPython or Jupyter. SnakeViz creates an interactive visualization of the profile using d3.js.
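#### The programmatic use of cProfile mentioned above can be sketched with the standard 'cProfile' and 'pstats' modules. The workload here is a hypothetical pure-Python stand-in for 'run_experiment' so that the example needs no third-party packages.

```python
import cProfile
import io
import pstats


def slow_sum(n):
    return sum(i * i for i in range(n))


def run_experiment(niter=50):
    # Hypothetical workload standing in for the eigenvalue experiment
    return [slow_sum(2000) for _ in range(niter)]


profiler = cProfile.Profile()
profiler.enable()
results = run_experiment()
profiler.disable()

# Sort by cumulative time and print only the first 7 entries,
# comparable to '%prun -l 7 -s cumulative run_experiment()'
buffer = io.StringIO()
stats = pstats.Stats(profiler, stream=buffer)
stats.sort_stats('cumulative').print_stats(7)
report = buffer.getvalue()
print(report)
```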

In [75]:
%prun -l 7 -s cumulative run_experiment()

 

In [76]:
# Output shown in separate window.
# 3804 function calls in 0.465 seconds

#    Ordered by: cumulative time
#    List reduced from 31 to 7 due to restriction <7>

#    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
#         1    0.000    0.000    0.465    0.465 {built-in method builtins.exec}
#         1    0.000    0.000    0.465    0.465 <string>:1(<module>)
#         1    0.001    0.001    0.465    0.465 <ipython-input-71-963ae22297ec>:4(run_experiment)
#       100    0.421    0.004    0.429    0.004 linalg.py:834(eigvals)
#       100    0.034    0.000    0.034    0.000 {method 'randn' of 'mtrand.RandomState' objects}
#       300    0.003    0.000    0.003    0.000 {method 'reduce' of 'numpy.ufunc' objects}
#       200    0.000    0.000    0.002    0.000 {method 'all' of 'numpy.ndarray' objects}

### Profiling a Function Line by Line
#### In some cases, the information obtained from '%prun' may not tell the entire story about a function's execution time, or it may be so complex that the results are hard to interpret.
#### For such cases, there is a library called 'line_profiler' (obtainable via PyPI or other package management tools), which contains an IPython extension enabling a new magic function '%lprun'. This function computes a line-by-line profile of one or more functions.
#### You can enable this extension by adding the following line to your IPython configuration.

In [79]:
# c.TerminalIPythonApp.extensions = ['line_profiler']

# Or run the following command
%load_ext line_profiler

#### 'line_profiler' can be used programmatically but is most powerful when used interactively in IPython.
#### Let's assume you have a module with the following code doing NumPy operations. If we want to understand its performance, '%prun' provides output that is not very illuminating.
#### But with the 'line_profiler' IPython extension activated, a new command '%lprun' is available. The only difference in usage is that we must tell '%lprun' which function or functions we wish to profile. eg - %lprun -f func1 -f func2 statement_to_profile

In [81]:
# Contents of 'prof_mod'
from numpy.random import randn

def add_and_sum(x, y):
    added = x + y
    summed = added.sum(axis=1)
    return summed

def call_function():
    x = randn(1000, 1000)
    y = randn(1000, 1000)
    return add_and_sum(x, y)

In [82]:
%run examples/prof_mod.py

In [83]:
x = randn(3000, 3000)
y = randn(3000, 3000)

%prun add_and_sum(x, y)

 

In [84]:
# Output of %prun:
# 7 function calls in 0.080 seconds

#    Ordered by: internal time

#    ncalls  tottime  percall  cumtime  percall filename:lineno(function)
#         1    0.062    0.062    0.073    0.073 prof_mod.py:3(add_and_sum)
#         1    0.011    0.011    0.011    0.011 {method 'reduce' of 'numpy.ufunc' objects}
#         1    0.006    0.006    0.079    0.079 <string>:1(<module>)
#         1    0.000    0.000    0.011    0.011 {method 'sum' of 'numpy.ndarray' objects}
#         1    0.000    0.000    0.080    0.080 {built-in method builtins.exec}
#         1    0.000    0.000    0.011    0.011 _methods.py:31(_sum)
#         1    0.000    0.000    0.000    0.000 {method 'disable' of '_lsprof.Profiler' objects}

In [85]:
# Now using line_profiler
%lprun -f add_and_sum add_and_sum(x, y)

In [86]:
# %lprun output:
# Timer unit: 3.64672e-07 s

# Total time: 0.0351734 s
# File: C:\Users\adity\PythonForDataAnalysis\examples\prof_mod.py
# Function: add_and_sum at line 3

# Line #      Hits         Time  Per Hit   % Time  Line Contents
# ==============================================================
#      3                                           def add_and_sum(x, y):
#      4         1      67057.0  67057.0     69.5      added = x + y
#      5         1      29391.0  29391.0     30.5      summed = added.sum(axis=1)
#      6         1          4.0      4.0      0.0      return summed

#### The output of '%lprun' is much easier to interpret than that of '%prun'.
#### We can also profile the calling code in the module (eg - 'call_function') together with the function being called (eg - 'add_and_sum') to get a full picture of the performance.
#### It is a good rule of thumb to use '%prun' for "macro" profiling and '%lprun' for "micro" profiling.
#### NOTE - The reason we must explicitly specify the names of the functions that we want to profile with '%lprun' is that the overhead of 'tracing' the execution time of each line is substantial. Tracing functions that are not of interest has the potential to significantly alter the profile results.

In [88]:
%lprun -f add_and_sum -f call_function call_function()

In [89]:
# Output of %lprun:
# Timer unit: 3.64672e-07 s

# Total time: 0.00391694 s
# File: C:\Users\adity\PythonForDataAnalysis\examples\prof_mod.py
# Function: add_and_sum at line 3

# Line #      Hits         Time  Per Hit   % Time  Line Contents
# ==============================================================
#      3                                           def add_and_sum(x, y):
#      4         1       7101.0   7101.0     66.1      added = x + y
#      5         1       3636.0   3636.0     33.9      summed = added.sum(axis=1)
#      6         1          4.0      4.0      0.0      return summed

# Total time: 0.0620351 s
# File: C:\Users\adity\PythonForDataAnalysis\examples\prof_mod.py
# Function: call_function at line 8

# Line #      Hits         Time  Per Hit   % Time  Line Contents
# ==============================================================
#      8                                           def call_function():
#      9         1      84061.0  84061.0     49.4      x = randn(1000, 1000)
#     10         1      72992.0  72992.0     42.9      y = randn(1000, 1000)
#     11         1      13059.0  13059.0      7.7      return add_and_sum(x, y)

## Tips for Productive Code Development Using IPython
#### Writing code in a way that makes it easy to develop, debug and ultimately use interactively may be a paradigm shift for many users.
#### Procedural details like code reloading may require some adjustment, as may coding style concerns.
#### Implementing most of these strategies is more of an art than a science and will require some experimentation before you find what is effective for you.
#### Usually, software designed with IPython in mind is easier to work with than code intended only to be run as a standalone command-line application, especially when diagnosing an error or debugging.

### Reloading Module Dependencies
#### When we import a module, the code in that module is executed and its variables, functions and imports are stored in the newly created module namespace.
#### The potential difficulty in interactive IPython code comes when you run a script that depends on a module you have changed since it was first imported: the next 'import' does not pick up your changes, so the old version of the module is still the one being used.
#### This is Python's 'load once' module system. To cope with this, you have 2 options.
#### First, use the 'reload' function in the 'importlib' module of the standard library.
#### But if the dependencies go deeper, it might be a bit tricky to insert usages of 'reload' all over the place.
#### For this, IPython has a 'dreload' function for 'deep' (recursive) reloading. It will attempt to reload the library as well as all of its dependencies. eg - 'dreload(some_lib)'.

In [92]:
'''
import some_lib

x = 5
y = [1, 2, 3, 4]

result = some_lib.get_answer(x, y)
'''
# After changing 'some_lib'
'''
import some_lib
import importlib

importlib.reload(some_lib)
'''

'\nimport some_lib\nimport importlib\n\nimportlib.reload(some_lib)\n'
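#### The 'load once' behavior and the effect of 'importlib.reload' can be observed with a throwaway module. In this sketch, 'some_lib_demo' is a hypothetical module written to a temporary directory purely for illustration.

```python
import importlib
import pathlib
import sys
import tempfile

# Avoid cached bytecode so that the edited source is always re-read
sys.dont_write_bytecode = True

workdir = tempfile.mkdtemp()
module_path = pathlib.Path(workdir) / 'some_lib_demo.py'
module_path.write_text('def get_answer():\n    return 1\n')
sys.path.insert(0, workdir)
importlib.invalidate_caches()

some_lib = importlib.import_module('some_lib_demo')
before = some_lib.get_answer()

# Simulate editing the module on disk after it has been imported
module_path.write_text('def get_answer():\n    return 42\n')

# A second import returns the module already loaded, so nothing changes
stale = importlib.import_module('some_lib_demo').get_answer()

# reload re-executes the module's source in the existing module object
importlib.reload(some_lib)
after = some_lib.get_answer()

print(before, stale, after)
```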

### Code Design Tips
#### There are no simple recipes for good design, but there are some high-level principles that are effective in practice.

#### Keep Relevant Objects and Data Alive
#### It is quite common to see a program written for the command line with the structure shown below.
#### But if you run the same code in IPython, the results or objects defined inside the 'main' function will not be accessible in the IPython shell after it finishes.
#### A better way is to have whatever code is in 'main' execute directly in the module's global namespace, or inside the "if __name__ == '__main__':" block if you want the module to remain importable.
#### This is equivalent to defining top-level variables in cells of a Jupyter notebook.
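#### The difference between the two layouts can be checked without IPython. The sketch below uses 'runpy.run_path' as a rough stand-in for '%run' (both execute a script as '__main__' and expose its global variables); the two script texts and file names are hypothetical.

```python
import pathlib
import runpy
import tempfile

# Layout 1: everything happens inside main(), so 'result' is a local variable
wrapped_source = '''
def main():
    x = 6
    y = 7.5
    result = x + y

if __name__ == '__main__':
    main()
'''

# Layout 2: the same work runs at module level inside the __main__ guard
flat_source = '''
if __name__ == '__main__':
    x = 6
    y = 7.5
    result = x + y
'''

workdir = pathlib.Path(tempfile.mkdtemp())
(workdir / 'wrapped.py').write_text(wrapped_source)
(workdir / 'flat.py').write_text(flat_source)

# run_path returns the globals left behind by each script
wrapped_ns = runpy.run_path(str(workdir / 'wrapped.py'), run_name='__main__')
flat_ns = runpy.run_path(str(workdir / 'flat.py'), run_name='__main__')

print('result' in wrapped_ns)   # the value was lost when main() returned
print(flat_ns['result'])        # the value is still available for inspection
```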

In [95]:
'''
from my_function import g

def f(x, y):
    return g(x + y)

def main():
    x = 6
    y = 7.5
    result = x + y
    
if __name__ == '__main__':
    main()
'''

"\nfrom my_function import g\n\ndef f(x, y):\n    return g(x + y)\n\ndef main():\n    x = 6\n    y = 7.5\n    result = x + y\n    \nif __name__ == '__main__':\n    main()\n"

#### Flat is Better than Nested
#### Deeply nested code makes it difficult to reach the code of interest while testing or debugging a function.
#### The idea that 'flat is better than nested' (part of the Zen of Python) applies generally to developing code for interactive use as well.
#### Making functions as decoupled and modular as possible makes them easier to test, debug and use interactively.

#### Overcome Fear of Longer Files
#### In other languages (eg - Java), we are advised to keep files short. A long file is usually a bad 'code smell', indicating that refactoring or reorganizing may be necessary.
#### But while developing code in IPython, working with 10 small but interconnected files is likely to cause you more headaches than 2 or 3 longer files.
#### Fewer files means fewer modules to reload and less jumping between files while editing.
#### Obviously, putting all your code in a single monstrous file does not make sense either and should be avoided.
#### Each module should be internally cohesive, and it should be as obvious as possible where to find the functions and classes responsible for each area of functionality.

## Advanced IPython Features
#### Making full use of the IPython system may lead you to write your code in a slightly different way or to dig into the configuration.

### Making your Classes IPython-Friendly
#### IPython makes every effort to display a console-friendly string representation of any object that you inspect.
#### For many objects (eg - dicts, lists, tuples), the built-in 'pprint' module is used to do the nice formatting.
#### But for user-defined classes, you have to generate the desired string output yourself.
#### For a simple class like the one shown below, the default output is not very informative.
#### IPython takes the string returned by the '__repr__' magic method (by doing 'output = repr(obj)') and prints that to the console.
#### So, we can simply add a '__repr__' method to the class to get more helpful output.

In [100]:
# General class
class Message:
    def __init__(self, msg):
        self.msg = msg
        

In [101]:
# Print of the class object
x = Message('I have a secret')
x

<__main__.Message at 0x1f080134278>

In [104]:
# Class with defined output
class Message:
    def __init__(self, msg):
        self.msg = msg
        
    def __repr__(self):
        return 'Message:%s' % self.msg

In [105]:
x = Message('I have a Secret')
x

Message:I have a Secret

### Profiles and Configuration
#### Most aspects of the appearance and behavior of the IPython and Jupyter environments are configurable through their configuration systems.
#### Some things you can do via configuration are:
####     1. Change the color scheme.
####     2. Change how the input and output prompts look.
####     3. Execute an arbitrary list of Python statements (eg - imports that you use all the time or anything else you want to happen each time you launch IPython).
####     4. Enable always-on IPython extensions (eg - the '%lprun' magic in line_profiler).
####     5. Enable Jupyter extensions.
####     6. Define your own magics or system aliases.
#### Configurations for the IPython shell are specified in special 'ipython_config.py' files, which are found in the '.ipython/' directory in your user home directory.
#### Configuration is performed based on a particular profile. When you start IPython normally, you load up, by default, the 'default profile', stored in the 'profile_default' directory.
#### On my system, the configuration file is located in: "C:\Users\adity\.ipython\profile_default"
#### To initialize this file on your system, run the command below in the terminal.

In [107]:
'''
ipython profile create
'''

'\nipython profile create\n'

#### We will skip the details of the file contents. The file contains comments describing what each configuration option is for, so you can tinker with it and customize it.
#### Another useful feature is the possibility of having multiple profiles. You can create a separate profile with an IPython configuration tailored to a specific application or project.
#### We can create another profile by using the command 'ipython profile create profile_name'.
#### Once done, you can edit the config files in the newly created 'profile_new_project' directory and launch IPython as shown below.

In [109]:
# Creating new profile
'''
ipython profile create new_project
'''
# Launching IPython from console
'''
ipython --profile=new_project
'''

'\nipython --profile=new_project\n'
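#### As an illustration of what a profile's 'ipython_config.py' might contain, here is a minimal sketch that sets two of the options listed earlier (startup statements and always-on extensions). The specific values are assumptions for illustration; 'get_config' is supplied by IPython when it loads the file, so this fragment is not meant to be run as a standalone script.

```python
# ipython_config.py (inside the profile directory)
c = get_config()  # provided by IPython when it loads this file

# Statements to execute each time IPython starts (illustrative choice)
c.InteractiveShellApp.exec_lines = [
    'import numpy as np',
]

# Extensions to load at startup; assumes line_profiler is installed
c.InteractiveShellApp.extensions = ['line_profiler']
```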

#### Configuration for Jupyter works a bit differently, because you can use its notebooks with languages other than Python.
#### To create a config file for Jupyter, run the command below. This writes a default config file to '.jupyter/jupyter_notebook_config.py' in your home directory. You can rename it as shown below.
#### When launching Jupyter, you can then add the '--config' argument.

In [111]:
# Create config file for Jupyter in console
'''
jupyter notebook --generate-config
'''
# Renaming the config file
'''
mv ~/.jupyter/jupyter_notebook_config.py ~/.jupyter/my_custom_config.py
'''
# Launching Jupyter with newly created config file
'''
jupyter notebook --config=~/.jupyter/my_custom_config.py
'''

'\njupyter notebook --config=~/.jupyter/my_custom_config.py\n'