In [1]:
using Revise


[FProfile.jl](https://github.com/cstjean/FProfile.jl) provides an alternative interface for [Julia's sampling profiler](https://docs.julialang.org/en/latest/manual/profile/). Please read the introduction of that document before proceeding if you've never used `@profile`.

# Profiling

You can build a profile by calling `@fprofile(code, delay=0.001, n_samples=1000000)`:

In [2]:
using FProfile, Calculus

pd = @fprofile second_derivative(sin, 1.0)

[1m[36mINFO: [39m[22m[36mRecompiling stale cache file /Users/cedric/.julia/lib/v0.6/FProfile.ji for module FProfile.
[39m

ProfileData(48 backtraces)

`@fprofile(N, ...)` is shorthand for `@fprofile(for _ in 1:N ... end)`:

In [3]:
pd = @fprofile 1000000 second_derivative(sin, 1.0)

ProfileData(1016 backtraces)

`ProfileData` merely wraps the internal data in `Base.Profile`. Do not forget that Julia compiles code the first time a function is run; if you do not want to measure compilation time, execute your code once before profiling.
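To see why the warm-up call matters, here is a minimal sketch that uses only Base Julia (no FProfile). The function `sum_of_squares` is a hypothetical workload, not part of FProfile or Calculus, and the timings vary from machine to machine, so no specific numbers are shown.

```julia
# Hypothetical workload; any function shows the same effect.
sum_of_squares(n) = sum(k^2 for k in 1:n)

t_first  = @elapsed sum_of_squares(1000)   # first call: includes compilation
t_second = @elapsed sum_of_squares(1000)   # second call: runs already-compiled code

println("first call:  ", t_first, " s")
println("second call: ", t_second, " s")
```

The first timing is usually much larger than the second. Calling your code once before `@fprofile` keeps that compilation work out of the profile.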

# Flat view

FProfile's `flat` report is a [dataframe](http://juliadata.github.io/DataFrames.jl/stable/man/getting_started/#Getting-Started-1), but no particular knowledge of dataframes is necessary. A few common operations are shown below.

In [4]:
using DataFrames

df = flat(pd)
head(df, 5)   # show only the first 5 rows (the 5 rows with the highest count)

LoadError: MethodError: no method matching haskey(::FProfile.#symbol2accessor, ::Symbol)
Closest candidates are:
  haskey(::DataFrames.Index, ::Symbol) at /Users/cedric/.julia/v0.6/DataFrames/src/other/index.jl:57
  haskey(::Dict, ::Any) at dict.jl:505
  haskey(::Base.ImmutableDict, ::Any) at dict.jl:639
  ...

The first column shows what fraction of backtraces (in %) go through the `method at file:line_number` in the `stackframe` column. It's the same quantity as in `Base.Profile.print()`, except for recursive calls: if `f(1)` calls `f(0)`, that's 2 counts in Base's report, but only 1 count in FProfile.
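The difference for recursive calls can be illustrated with a small self-contained sketch. The backtraces below are made-up data (plain symbols instead of real stack frames), and the two helper functions are hypothetical: they mimic the counting rules described above rather than reproduce either package's implementation.

```julia
# Made-up backtraces, outermost frame first. In the first one, `f` calls itself.
recursive_samples = [[:main, :f, :f], [:main, :f], [:main, :g]]

# Count every occurrence of a frame, so a recursive call is counted repeatedly
# (the behaviour the text attributes to Base's report).
count_every(frame) = sum(count(x -> x == frame, bt) for bt in recursive_samples)

# Count each backtrace at most once per frame
# (the behaviour the text attributes to FProfile).
count_once(frame) = count(bt -> frame in bt, recursive_samples)

println(count_every(:f))   # 3
println(count_once(:f))    # 2
```

With the second rule, a count can never exceed the number of backtraces, so the percentage stays at or below 100 even for deeply recursive code.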

The other columns unpack the `stackframe`; they are useful for selecting subsets of the table. For instance, if I only care about the `derivative` function, I might use

```julia
    df[df[:function].===derivative, :]
```

It is common to focus optimization efforts on one or more modules at a time (typically the ones you're developing). `flat(pd, MyModule)` filters out other modules and adds a useful column: `end_count_percent` measures how much `MyModule`-specific work is done on that line.

For instance, in the code below, while the `do_computation()` call takes a long time (it has a high `count_percent`), it merely calls another `Main` function, so it has a low `end_count_percent`. `sum_of_sin` has `end_count_percent = 100%` because while it calls `sum` and `sin`, those are defined in another module (`Base`), and counted as external to `Main`.

`flat(pd, (Module1, Module2, ...))` is also accepted.

In [5]:
@noinline do_computation(n) = sum_of_sin(n)
@noinline sum_of_sin(n) = sum(sin, 1:n)
pd2 = @fprofile do_computation(10000000)
flat(pd2, Main)

LoadError: MethodError: no method matching haskey(::FProfile.#symbol2accessor, ::Symbol)
Closest candidates are:
  haskey(::DataFrames.Index, ::Symbol) at /Users/cedric/.julia/v0.6/DataFrames/src/other/index.jl:57
  haskey(::Dict, ::Any) at dict.jl:505
  haskey(::Base.ImmutableDict, ::Any) at dict.jl:639
  ...
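One way to picture `end_count_percent` is sketched below in plain Julia. It assumes that each backtrace credits the innermost frame belonging to the module of interest; that is an interpretation of the description above, not FProfile's actual code. The data and the helper `end_count_percent_sketch` are hypothetical.

```julia
# Made-up backtraces, outermost frame first; each frame is (module, function).
module_samples = [
    [(:Main, :do_computation), (:Main, :sum_of_sin), (:Base, :sum), (:Base, :sin)],
    [(:Main, :do_computation), (:Main, :sum_of_sin), (:Base, :sum)],
]

# For each backtrace, credit the innermost frame that belongs to `mod`,
# then express the credits as a percentage of all backtraces.
function end_count_percent_sketch(mod, samples)
    counts = Dict{Symbol,Int}()
    for bt in samples
        for fr in reverse(bt)              # walk from the innermost frame outwards
            if fr[1] == mod
                counts[fr[2]] = get(counts, fr[2], 0) + 1
                break                      # only the innermost match is credited
            end
        end
    end
    return Dict(f => 100 * c / length(samples) for (f, c) in counts)
end

pct = end_count_percent_sketch(:Main, module_samples)
println(pct)
```

Under this assumption, `sum_of_sin` receives all of the credit and `do_computation` receives none, which matches the behaviour described for the two functions.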

It pays to make sure that functions with a high `end_count_percent` are [well optimized](https://docs.julialang.org/en/latest/manual/performance-tips/).

Another way to reduce the level of detail is to aggregate by `:specialization`, `:method`, `:file`, `:function`, or `:module`.

In [6]:
df_by_method = flat(pd, combineby=:method)

LoadError: MethodError: no method matching haskey(::FProfile.#symbol2accessor, ::Symbol)
Closest candidates are:
  haskey(::DataFrames.Index, ::Symbol) at /Users/cedric/.julia/v0.6/DataFrames/src/other/index.jl:57
  haskey(::Dict, ::Any) at dict.jl:505
  haskey(::Base.ImmutableDict, ::Any) at dict.jl:639
  ...

You can see the context (caller/called functions) around each of these rows by passing it to `tree`:

In [7]:
tree(pd, df_by_method, 9)   # show the context of the 9th row of `df_by_method`

LoadError: UndefVarError: df_by_method not defined

Other useful dataframe commands:

```julia
df[[:count_percent, :method]]   # select only those two columns
sort(df, cols=:end_count_percent, rev=true)  # sort by end_count_percent
showall(df)   # show the whole dataframe
```

See `?flat` for more options.

# Tree view

FProfile's tree view looks the same as `Base.Profile.print(format=:tree)`. The numbers represent raw counts. (If some branches seem out of place, see [this issue](https://github.com/JuliaLang/julia/issues/9689).)

In [8]:
tr = tree(pd)

 1012 ./task.jl:335; (::IJulia.##14#17)()
  1012 ...ulia/src/eventloop.jl:8; eventloop(::ZMQ.Socket)
   1012 ...c/execute_request.jl:154; execute_request(::ZMQ.Socket, :...
    1012 ...Compat/src/Compat.jl:464; include_string(::Module, ::Str...
     1012 ./loading.jl:515; include_string(::String, ::String)
      1012 ./<missing>:?; anonymous
       1012 ...ile/src/FProfile.jl:40; macro expansion
        1012 ./profile.jl:23; macro expansion
         1011 ...le/src/FProfile.jl:55; macro expansion
          2    .../src/derivative.jl:0; second_derivative(::Functio...
          1006 .../src/derivative.jl:71; second_derivative(::Functio...
           5   ...src/derivative.jl:0; derivative(::Function, ::Sy...
           1   ...src/derivative.jl:2; derivative(::Function, ::Sy...
           559 ...src/derivative.jl:3; derivative(::Function, ::Sy...
           2   ...ite_difference.jl:0; finite_difference_hessian(:...
           188 ...ite_difference.jl:224; finite_difference_hessian(:...
    

If you're interested in a particular module/file/method/function, you can pass it to `tree`, along with an optional _neighborhood range_.

In [9]:
tr_deriv = tree(pd, second_derivative, -1:1)    # -1:1 = show one level of callers and one level of called functions

LoadError: MethodError: no method matching haskey(::FProfile.#symbol2accessor, ::Symbol)
Closest candidates are:
  haskey(::DataFrames.Index, ::Symbol) at /Users/cedric/.julia/v0.6/DataFrames/src/other/index.jl:57
  haskey(::Dict, ::Any) at dict.jl:505
  haskey(::Base.ImmutableDict, ::Any) at dict.jl:639
  ...

Trees are an indexable and filterable data structure. Use `get_specialization`, `get_method`, `get_file`, `get_function`, `get_module`, `is_C_call` and `is_inlined` in your `filter` predicate.

In [10]:
tr_deriv[1,1]

LoadError: UndefVarError: tr_deriv not defined

# Backtraces

(if you want to build your own analysis)

The raw Profile data is available either through `Base.Profile.retrieve()`, or through `pd.data` and `pd.lidict`. However, you might find `FProfile.backtraces(::ProfileData)` more immediately useful. 

In [11]:
count, trace = backtraces(pd)[1]  # get the first unique backtrace
@show count                       # the number of times that trace occurs in the raw data
trace

count = 1


10-element Array{StackFrame,1}:
 (::IJulia.##14#17)() at task.jl:335                                  
 eventloop(::ZMQ.Socket) at eventloop.jl:8                            
 execute_request(::ZMQ.Socket, ::IJulia.Msg) at execute_request.jl:154
 include_string(::Module, ::String, ::String) at Compat.jl:464        
 include_string(::String, ::String) at loading.jl:515                 
 anonymous at <missing>:?                                             
 macro expansion at FProfile.jl:40 [inlined]                          
 macro expansion at profile.jl:23 [inlined]                           
 macro expansion at FProfile.jl:55 [inlined]                          
 second_derivative(::Function, ::Float64) at derivative.jl:71         

Use the `get_method, get_file, ...` functions on `StackFrame` objects (see above). `tree(pd::ProfileData)` is defined as `tree(backtraces(pd))`, and similarly for `flat`, so you can modify the backtraces and get a tree/flat view of the results.
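As a rough illustration of modifying backtraces before building a report, the sketch below filters a list of `(count, trace)` pairs in plain Julia. The data is a hypothetical stand-in for the result of `backtraces(pd)`: real traces hold `StackFrame` objects, whereas plain symbols are used here so the example runs without FProfile.

```julia
# Stand-in for `backtraces(pd)`: (count, trace) pairs, outermost frame first.
bts = [(3, [:main, :second_derivative, :derivative]),
       (1, [:main, :second_derivative, :finite_difference_hessian]),
       (2, [:main, :other_work])]

# Keep only the backtraces that pass through `second_derivative`.
kept = filter(p -> :second_derivative in p[2], bts)

# Total number of samples represented by the kept backtraces.
total = sum(p[1] for p in kept)

println(length(kept), " unique backtraces, ", total, " samples")
```

With real data, the predicate would inspect `StackFrame` objects through the accessor functions mentioned above, and the filtered list would then be handed to `tree` or `flat`.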