# `vprof`

[vprof](https://github.com/nvdv/vprof) is a visual profiler for Python: it runs your code under one or more profilers and serves the resulting runtime profile as charts from a local web server. It's awesome.

In [1]:
from vprof import profiler

## Example: apriori

I'm using the apriori frequent itemset generation algorithm I wrote in `web-data-mining` as the guinea pig for this one.

In [2]:
import apriori

In [3]:
apriori.init_pass("75000-out1.csv", 0.1)[0]

75000it [00:00, 190828.87it/s]


[[7], [28], [45]]

In [4]:
help(profiler.run)

Help on function run in module vprof.profiler:

run(func, options, args=(), kwargs={}, host='localhost', port=8000)
    Runs profilers specified by options against func.
    Args:
        func: Python function object.
        options: A string with profilers configuration (i.e. 'cmh').
        args: Arguments to pass to func.
        kwargs: Keyword arguments to pass to func.
        host: Host to send profilers data.
        port: Port to send profilers.data.



In [5]:
# def print_hello(foo=""):
#     print("Hello World! And also {0}".format(foo))

In [6]:
# profiler.run(print_hello, 'ch')

Note that `vprof` runs the algorithm once for each type of graph generated, so every extra letter in the options string costs another full run. Note also that before calling `profiler.run` here you need to start the HTTP server by running `vprof -r` at the command prompt.

In [7]:
profiler.run(apriori.init_pass,
             'ch',  # Everything except 'm' (memory), which fills up RAM.
             args=("75000-out1.csv", 0.1))

75000it [00:00, 171223.10it/s]
75000it [00:10, 7044.50it/s]


In [9]:
apriori.apriori("75000-out1.csv", 0.01)

75000it [00:00, 184718.48it/s]
75000it [00:24, 3097.72it/s]
75000it [00:00, 88334.16it/s]
75000it [00:00, 171614.96it/s]
75000it [00:00, 255958.13it/s]
75000it [00:00, 301187.72it/s]


[[0],
 [1],
 [2],
 [3],
 [4],
 [5],
 [6],
 [7],
 [8],
 [9],
 [10],
 [11],
 [12],
 [13],
 [14],
 [15],
 [16],
 [17],
 [18],
 [19],
 [20],
 [21],
 [22],
 [23],
 [24],
 [25],
 [26],
 [27],
 [28],
 [29],
 [30],
 [31],
 [32],
 [33],
 [34],
 [35],
 [36],
 [37],
 [38],
 [39],
 [40],
 [41],
 [42],
 [43],
 [44],
 [45],
 [46],
 [47],
 [48],
 [49],
 [3, 35],
 [7, 15],
 [23, 40],
 [41, 43],
 [24, 40],
 [16, 45],
 [7, 49],
 [12, 31],
 [29, 47],
 [11, 45],
 [0, 2],
 [7, 11],
 [31, 36],
 [33, 42],
 [17, 47],
 [2, 46],
 [7, 45],
 [31, 48],
 [37, 45],
 [23, 41],
 [40, 41],
 [7, 37],
 [27, 28],
 [15, 49],
 [32, 45],
 [16, 32],
 [1, 19],
 [0, 46],
 [36, 48],
 [23, 43],
 [40, 43],
 [24, 41],
 [18, 35],
 [3, 18],
 [17, 29],
 [12, 36],
 [24, 43],
 [23, 24],
 [12, 48],
 [14, 44],
 [4, 9],
 [5, 22],
 [11, 37],
 [12, 31, 48],
 [7, 15, 49],
 [24, 40, 41],
 [23, 24, 40],
 [40, 41, 43],
 [17, 29, 47],
 [23, 40, 43],
 [23, 24, 41],
 [16, 32, 45],
 [31, 36, 48],
 [24, 41, 43],
 [7, 37, 45],
 [23, 24, 43],
 [11, 37,

In [8]:
profiler.run(apriori.apriori,
             'ch',
             args=("75000-out1.csv", 0.01))

75000it [00:00, 167028.33it/s]
75000it [00:30, 2471.52it/s]
75000it [00:01, 71972.85it/s]
75000it [00:00, 140441.37it/s]
75000it [00:00, 216125.93it/s]
75000it [00:00, 254222.63it/s]
75000it [00:01, 38797.58it/s]
22534it [08:32, 44.28it/s]


KeyboardInterrupt: 

The `m` option (measuring memory) fills up RAM, which is why the runs above use `'ch'` only, but this will be fixed.

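The computational overhead introduced is extraordinary: the unprofiled `apriori.apriori` run above shows roughly 25 seconds of progress bars, while the profiled run was interrupted with its last pass at 08:32 and still crawling along at about 44 it/s. This can be countered by profiling a partial run instead, for example by dramatically cutting down the size of the input.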
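
One way to cut the input down is to profile against only the first few thousand rows of the file. The helper below is a minimal sketch, not part of `vprof` or of my `apriori` module: `write_head` and the output filename are hypothetical names, and it assumes the CSV has one transaction per line with no header row.

```python
import itertools


def write_head(src_path, dst_path, n_lines):
    """Copy the first n_lines lines of src_path into dst_path.

    Returns the number of lines actually written, which is smaller than
    n_lines when the source file is shorter than that.
    """
    written = 0
    with open(src_path) as src, open(dst_path, "w") as dst:
        # islice stops after n_lines without reading the rest of the file.
        for line in itertools.islice(src, n_lines):
            dst.write(line)
            written += 1
    return written


# Hypothetical usage (the output filename is made up for illustration):
# write_head("75000-out1.csv", "75000-out1-head.csv", 5000)
# profiler.run(apriori.apriori, 'ch', args=("75000-out1-head.csv", 0.01))
```

Taking the head of the file is the cheapest option, but the first rows may not be representative of the whole dataset, so the frequent itemsets found on the sample can differ from those found on the full input. For profiling purposes that usually matters less than getting a run that finishes.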