# Test Results Review Report

This notebook is meant to help create the milestone report for now and, eventually, to generate it automatically. It is an interactive Jupyter document (or a static view of such a document) containing Julia code segments and their output. It is part of the JDP project, which aims to create an *easily accessible* system for exploring test results and automating *arbitrary* workflows. Notebooks such as this one are intended to provide an easy starting point for engineers and other technical users to create their own reports, possibly just by tweaking the existing ones.

Eventually this workbook should be easy to install locally, or to access from some remote location, with zero knowledge of Julia or Jupyter. Right now it is not, but you can still try it by visiting: https://github.com/richiejp/jdp.

Obviously you can also access the library from a REPL or use it in a traditional script or application, but Jupyter provides a nice, persistent, graphical environment. I won't discuss how to use Jupyter in this notebook (just click on help at the top), but will heavily annotate the code.

## Setup

First we need to load the JDP library, which does the heavy lifting: importing and transforming the test result data from OpenQA into something usable. Note that this assumes you started this notebook by running `julia src/notebook.jl`.

> NOTE: You must run this cell before the code cells that follow it. However, the remaining cells do not all need to be executed in order.

You may see a bunch of horrible angry red text when running this. Unfortunately it could be either info messages or error messages from the logging system, because Jupyter gives both a red background.

In [1]:
# Run the install script which will set up the JDP project if necessary
include("install.jl")

# Bring the DataFrames package's _members_ into our namespace, so we can call them directly
using DataFrames

# Import some libraries from the JDP project
using JDP.OpenQA    # Contains functions for dealing with the OpenQA web API
using JDP.TableDB   # Functions for accessing test data in table like formats (currently DataFrames)
using JDP.Bugzilla  # Functions for accessing the Bugzilla API(s)

Activating JDP package at /home/rich/julia/jdp/
Installing project deps if necessary...
[32m[1m  Updating[22m[39m registry at `~/.julia/registries/General`
[32m[1m  Updating[22m[39m git-repo `https://github.com/JuliaRegistries/General.git`

┌ Info: Recompiling stale cache file /home/rich/.julia/compiled/v1.0/DataFrames/AR9oZ.ji for DataFrames [a93c6f00-e57d-5684-b7b6-d8193f3e46c0]
└ @ Base loading.jl:1184
┌ Info: Recompiling stale cache file /home/rich/.julia/compiled/v1.0/JDP/uw0DL.ji for JDP [6d7e372e-c0bd-11e8-2b07-216b8359d694]
└ @ Base loading.jl:1184


Next we set some variables which are used in later code cells. Cache type can be set to `:json` or `:binary` (which are [symbols](https://docs.julialang.org/en/v1/manual/metaprogramming/#Symbols-1)). We always save data as JSON first, but afterwards it can be copied into a binary format as well to increase loading speed.

> NOTE: Julia has a _very_ strong type system, but we can still assign variables without declaring their types, as in any dynamic language. For library code it is generally a good idea to explicitly state what types you are expecting, but in notebook code we can just let the compiler infer the type.

In [2]:
datadir = "/home/rich/qa/data/osd" # The cache dir for the OpenQA test result data
cache_type = :json # Set to :json to use the raw JSON data from OpenQA, or :binary to use the binary cache

:json
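
The note above about symbols and types can be illustrated with a small sketch. The function `describe_cache` and the path `/tmp/example` are hypothetical and are not part of JDP; they only contrast annotated library-style code with plain notebook-style assignment.

```julia
# Symbols are interned identifiers, written with a leading colon
cache_type = :json
is_json = cache_type == :json      # comparing symbols is cheap and exact
from_string = Symbol("binary")     # a symbol can also be built from a string

# In library-style code we might state the expected types explicitly.
# `describe_cache` is a hypothetical helper, not a JDP function.
function describe_cache(dir::String, kind::Symbol)::String
    "Using the $kind cache in $dir"
end

# In a notebook we can simply assign and let Julia work out the type
datadir = "/tmp/example"
msg = describe_cache(datadir, cache_type)
```

Calling `describe_cache(datadir, "json")` would fail with a `MethodError`, because a `String` is not a `Symbol`; that is the kind of mistake explicit annotations catch early.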

Next we may download some new results for a given build or builds to our local cache. This usually takes a long time, which is why there is a local cache.

In [None]:
# Loop over an array of strings... yeah I bet you really needed to be told that didn't you?
for jid in ["426", "429", "431", "432", "435"]
    # Get some job results from the openqa.suse.de (osd) OpenQA instance.
    # Optional arguments (after the ';') like 'build' and 'groupid' are passed to the OpenQA API
    OpenQA.save_job_results_json(OpenQA.osd, datadir; build="0$jid", groupid="155")
end
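
The loop above relies on two Julia features: string interpolation (`"0$jid"`) and keyword arguments after a `;`. The sketch below shows both in isolation. `fetch_results` is a hypothetical stand-in that only builds a string; it is not the OpenQA API and makes no network requests.

```julia
# `fetch_results` is a hypothetical stand-in: positional arguments come first,
# optional keyword arguments follow the ';' in the signature
function fetch_results(host, dir; build="", groupid="")
    "GET $host?build=$build&groupid=$groupid -> $dir"
end

requests = String[]
for jid in ["426", "429"]
    # "0$jid" interpolates the loop variable into a new string, e.g. "0426"
    push!(requests, fetch_results("osd", "/tmp/cache"; build="0$jid", groupid="155"))
end
```

Keyword arguments can be given in any order and omitted when they have defaults, which is convenient for passing optional query parameters through to a web API.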

Now we load the data into memory, which can also take a while. If we don't have any new data to load, we can use the binary format.

In [4]:
# Ensure the variables are defined in the global scope
json = nothing
df = nothing

if cache_type == :binary
    # The raw data from OpenQA is absurdly huge, so to save on start up time, we can use a binary format
    df = TableDB.load_module_results(joinpath(datadir, "cache.jld2"))
else
    json = OpenQA.load_job_results_json(datadir) # This is the raw OpenQA data in the form of a Dict
    df = TableDB.get_module_results(json)        # This is a more refined form of the data as a DataFrame
end

"Loaded $(nrow(df)) results"

"Loaded 500985 results"

If we have some new data then we can update the binary cache.

In [None]:
TableDB.save_module_results(joinpath(datadir, "cache.jld2"), df)
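
The idea of keeping a fast binary copy next to the slow source data can be sketched with Julia's standard `Serialization` library. This is only an illustration under assumptions: JDP uses a `.jld2` file and an explicit `cache_type` switch, whereas the hypothetical `load_cached` helper below checks whether the cache file exists.

```julia
using Serialization  # standard library; a stand-in for the JLD2 cache used above

# Hypothetical helper: read the binary cache when it exists, otherwise build
# the data with `slow_load` and write the cache for next time
function load_cached(slow_load, cache_path)
    if isfile(cache_path)
        return open(deserialize, cache_path)
    end
    data = slow_load()
    open(io -> serialize(io, data), cache_path, "w")
    data
end

cache_path = joinpath(mktempdir(), "cache.bin")
calls = Ref(0)  # counts how often the slow path runs
slow_load() = (calls[] += 1; [Dict("name" => "zram03", "result" => "passed")])

first_load = load_cached(slow_load, cache_path)   # runs slow_load and writes the cache
second_load = load_cached(slow_load, cache_path)  # reads the cache file instead
```

Files written by `Serialization` are not guaranteed to be readable across Julia versions, so the original JSON should always be kept as the source of truth.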

The function `describe` from the DataFrames package gives us some stats and information about the structure of the loaded data. For the raw JSON value we can just use `summary`.

In [5]:
summary(json)

"10929-element Array{Dict{String,Any},1}"

In [6]:
describe(df, stats = [:nunique, :min, :max, :eltype])

| | variable | nunique | min | max | eltype |
|---|---|---|---|---|---|
| 1 | build | 48 | 0250 | 0419 | String |
| 2 | name | 5810 | 1_autotest | zram03 | String |
| 3 | result | 5 | canceled | softfailed | String |
| 4 | arch | 4 | aarch64 | x86_64 | String |
| 5 | suit | 71 | ("LTP", "can") | ("fstests", "xfs") | Tuple{String,Union{Missing, String}} |
| 6 | bugrefs | 161 | [] | ["t#2094962"] | Array{SubString,N} where N |


Look at the pretty table! We can also display graphs which could be even more delightful. Unfortunately it is slightly less pretty if you are viewing this as a static page.

## Failed tests for build

Let's look at which tests failed for a given build. First we need to filter out passed test results and results from other builds. Then we can group the results by test name and suite (the `suit` column), amalgamating some of the columns to make the table easier to view. The filter is fairly simple, but the grouping is a bit more complex and involves a bit of Julia magic; see [Split-Apply-Combine](http://juliadata.github.io/DataFrames.jl/stable/man/split_apply_combine.html) for help.

> NOTE: Packages such as Query.jl allow one to use an SQL-like syntax, which is probably a lot easier for most people to understand.

In [6]:
build = "0435"

# The syntax "var -> expr" is an anonymous function, strings starting with 'r' are regexes.
# In Julia you don't need to write 'return' (unless you want to return early); a
# function returns the value of its final expression
fails = filter(r -> r.build == build && occursin(r"failed", r[:result]), df)

# group by name then apply the function defined by `do r ...` to each group
# Putting `do r` after `by` is like writing `by(r -> ...`. i.e. `do r` defines a function
# and passes it as the first argument to `by`.
fails_by_name = by(fails, [:name, :suit]) do r
    # 'by' first groups the results by name and suit then passes each group to us in the variable 'r'.
    # We then use 'r' to produce a new DataFrame containing a single row. We return the new DataFrame
    # and `by` then combines them all... at least I think that is what happens.
    DataFrame(
        # We have to write Tuple otherwise DataFrame creates a multi-row result (because r.result is an array)
        result = Tuple(unique(r.result)),
        arch = Tuple(unique(r.arch)),
        # Three dots `...` 'splats' an array (or tuple) into multiple function arguments
        # and `vcat` concatenates its arguments together
        bugrefs = Tuple(unique(vcat(r.bugrefs...)))
        # also, and don't panic if this is a little more difficult to understand, 
        # 'unique' removes duplicate elements from a collection
    )
end

"$(nrow(fails_by_name)) tests failed this build"

"73 tests failed this build"
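
The split-apply-combine steps above can also be written out by hand in plain Julia, which may help to show what `by` is doing. This is a sketch under assumptions: the rows are made-up NamedTuples rather than a DataFrame, and the grouping is on the name only.

```julia
# Hypothetical rows standing in for the DataFrame; each row is a NamedTuple
rows = [
    (name = "statx05", arch = "s390x",   result = "failed",     bugrefs = String[]),
    (name = "statx05", arch = "x86_64",  result = "failed",     bugrefs = ["bsc#1"]),
    (name = "zram03",  arch = "aarch64", result = "passed",     bugrefs = String[]),
    (name = "zram03",  arch = "aarch64", result = "softfailed", bugrefs = ["bsc#2", "bsc#1"]),
]
Row = eltype(rows)

# Split: keep the failures, then collect the rows belonging to each name
fails = filter(r -> occursin(r"failed", r.result), rows)
groups = Dict{String, Vector{Row}}()
for r in fails
    push!(get!(groups, r.name, Row[]), r)
end

# Apply and combine: reduce each group to a single summary row
summary_rows = map(sort(collect(keys(groups)))) do name
    g = groups[name]
    (name = name,
     arch = Tuple(unique(r.arch for r in g)),
     # the splat passes each group's bugrefs array to vcat as a separate argument
     bugrefs = Tuple(unique(vcat((r.bugrefs for r in g)...))))
end
```

Note that the regex `r"failed"` also matches `softfailed`, so soft failures are counted as failures here just as they are in the cell above.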

We probably have too many failures to display in Jupyter, so let's just try displaying failures for a subset of tests. We can focus on a particular test suite and remove tests which already appear to be tagged.

In [7]:
missing_bugrefs = filter(fails_by_name) do r
    length(r.bugrefs) < 1 && # Remove tests which already have bug refs
    r.suit[1] == "LTP" &&    # Only include LTP results
    r.name != "boot_ltp" &&  # Don't include boot_ltp and shutdown_ltp modules
    r.name != "shutdown_ltp"
end

| | name | suit | result | arch | bugrefs |
|---|---|---|---|---|---|
| | String | Tuple… | Tuple… | Tuple… | Tuple… |
| 1 | epoll_wait02 | ("LTP", "syscalls") | ("failed",) | ("s390x",) | () |
| 2 | statx05 | ("LTP", "syscalls") | ("failed",) | ("s390x",) | () |


In [13]:
missing_bugrefs = filter(fails_by_name) do r
    length(r.bugrefs) < 1 &&
    r.suit == ("fstests", "xfs")
end

| | name | suit | result | arch | bugrefs |
|---|---|---|---|---|---|
| | String | Tuple… | Tuple… | Tuple… | Tuple… |
| 1 | xfs-083 | ("fstests", "xfs") | ("failed",) | ("aarch64",) | () |
| 2 | xfs-491 | ("fstests", "xfs") | ("failed",) | ("aarch64", "ppc64le", "x86_64") | () |
| 3 | xfs-492 | ("fstests", "xfs") | ("failed",) | ("aarch64", "ppc64le", "x86_64") | () |
| 4 | xfs-493 | ("fstests", "xfs") | ("failed",) | ("aarch64", "ppc64le", "x86_64") | () |
| 5 | generic-445 | ("fstests", "xfs") | ("failed",) | ("ppc64le",) | () |
| 6 | xfs-173 | ("fstests", "xfs") | ("failed",) | ("ppc64le",) | () |
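
One detail worth keeping in mind when filtering on `suit`: its element type is `Tuple{String,Union{Missing, String}}`, so the second element may be `missing`. The sketch below, using made-up suit values, shows how `==` and `isequal` differ in that case. Whether this matters for the real data depends on which suits contain `missing`, which is an assumption here.

```julia
# Hypothetical suit values; the second element may be missing
suits = [("LTP", "syscalls"), ("fstests", "xfs"), ("fstests", missing), ("LTP", missing)]

# `==` follows three-valued logic, so a missing element can make the answer missing
eq_results = [s == ("fstests", "xfs") for s in suits]

# `isequal` always returns a Bool, which is what `filter` requires
xfs_only = filter(s -> isequal(s, ("fstests", "xfs")), suits)

# Indexing the first element is safe here because it is always a String
ltp_only = filter(s -> s[1] == "LTP", suits)
```

A predicate that returns `missing` makes `filter` throw a `TypeError`, so `isequal` is the safer choice whenever a compared tuple might share its first element with the target and have a `missing` second element.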


Let's try to find out whether any of these tests had bug refs in past builds.

In [14]:
names_index = Set(missing_bugrefs.name) # Convert the name column into a hash set

past_results = by(filter(r -> r.name in names_index && length(r.bugrefs) > 0, df), [:name, :suit]) do r
    DataFrame(bugrefs = Tuple(
        # Remove OpenQA's self references with !startswith
        filter(br -> !startswith(br, "t#"), unique(vcat(r.bugrefs...)))
    ))
end

| | name | suit | bugrefs |
|---|---|---|---|
| | String | Tuple… | Tuple… |
| 1 | generic-445 | ("fstests", "btrfs") | ("bsc#1103543",) |
| 2 | xfs-083 | ("fstests", "xfs") | ("bsc#1105017",) |
| 3 | generic-445 | ("fstests", "xfs") | ("bsc#1073390", "bsc#1105025") |
| 4 | xfs-173 | ("fstests", "xfs") | ("bsc#1073390", "bsc#1105025") |
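
The two building blocks of the lookup above, set membership and dropping OpenQA's `t#` self references, can be tried on their own. The test names and bug references in this sketch are invented for illustration.

```julia
# A Set gives hashed membership tests, like the name index above
names_index = Set(["test-a", "test-c"])

# Hypothetical (name, bugrefs) pairs from earlier builds
history = [
    ("test-a", ["t#100", "bsc#1"]),
    ("test-b", ["bsc#2"]),
    ("test-c", ["t#100"]),
]

# Keep only the tests we care about, then drop the "t#" self references
known = Dict(
    name => filter(br -> !startswith(br, "t#"), refs)
    for (name, refs) in history if name in names_index
)
```

Here `test-c` ends up with an empty list because its only reference pointed back at OpenQA, which is why a check on the number of remaining references is useful after this kind of filtering.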


You may find that there are still too many results to view here. It is left as an exercise for the reader to filter out even more (you may just want to blacklist tests like `boot_ltp` and `partition`, which create a lot of noise).

It is difficult to judge from a bug reference alone what it is about and whether it is relevant to a particular test failure. The `BugRefs.ipynb` notebook provides in-depth tools for dealing with trackers and bug references, but we can also display a summary of the bug info here. First we need to log in to Bugzilla; a user name and password prompt is displayed if necessary.

In [31]:
bsc_ses = Bugzilla.login("bsc");

User Name: rpalethorpe
Password: ········


We have the amazing ability to produce Markdown programmatically and display it.

In [32]:
import Markdown: MD

refs = foldl(past_results.bugrefs; init=[]) do acc, brefs
    vcat([brefs...], acc)
end
# The pipe operator '|>' pipes. This is the same as writing unique(filter(...)), but has the advantage
# of confusing some people who have not seen it before
refs = filter(ref -> startswith(ref, "bsc"), refs) |> unique
bugs = map(ref -> Bugzilla.get_bug(bsc_ses, parse(Int, ref[5:end])), refs)
map(Bugzilla.to_md, bugs) |> MD

**P3 - Medium**(*Normal*) NEW: fstests with xfs on generic/042 fails with difference on golden output on 4.12.14-4.7-default

**P2 - High**(*Normal*) NEW: xfstests generic/486 fail in xfs

**P2 - High**(*Normal*) NEW: xfstests xfs/013 fails in ppc64le occasionally

**P2 - High**(*Normal*) RESOLVED: xftests generic/502 fails for btrfs
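
The flattening, filtering and parsing steps in the cell above can be followed on a small, made-up column without needing a Bugzilla session. Unlike the cell above, this sketch appends to the accumulator rather than prepending, so the original order is preserved; the bug references are invented.

```julia
# Hypothetical column of bug reference tuples, one tuple per test
bugref_column = [("bsc#100", "t#7"), ("bsc#200", "bsc#100"), ()]

# foldl threads an accumulator through the column, flattening the tuples
refs = foldl(bugref_column; init=String[]) do acc, brefs
    vcat(acc, collect(String, brefs))
end

# `|>` passes the value on the left to the function on the right
bsc_refs = filter(ref -> startswith(ref, "bsc#"), refs) |> unique

# "bsc#" is four characters long, so the bug number starts at index 5
bug_ids = map(ref -> parse(Int, ref[5:end]), bsc_refs)
```

Slicing with `ref[5:end]` assumes a four-character ASCII prefix; a reference with a different tracker prefix would need a different offset, or a split on `#`.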



Now let's get the completely tagless tests on their own.

In [19]:
names_index = Set(missing_bugrefs.name)
by(filter(r -> r.name in names_index && length(r.bugrefs) < 1 && occursin(r"failed", r.result), df), [:name, :suit]) do r
    DataFrame(
        arch = Tuple(unique(r.arch)),
        frequency = length(unique(r.build))
    )
end

| | name | suit | arch | frequency |
|---|---|---|---|---|
| | String | Tuple… | Tuple… | Int64 |
| 1 | generic-445 | ("fstests", "xfs") | ("ppc64le",) | 6 |
| 2 | xfs-173 | ("fstests", "xfs") | ("ppc64le",) | 2 |
| 3 | xfs-083 | ("fstests", "xfs") | ("aarch64", "x86_64") | 4 |
| 4 | xfs-491 | ("fstests", "xfs") | ("ppc64le", "x86_64", "aarch64") | 5 |
| 5 | xfs-492 | ("fstests", "xfs") | ("ppc64le", "x86_64", "aarch64") | 5 |
| 6 | xfs-493 | ("fstests", "xfs") | ("ppc64le", "x86_64", "aarch64") | 5 |
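
The `frequency` column above is the number of distinct builds in which a test failed without a tag, not the number of failing results. A minimal sketch of that distinction, using invented (name, build) pairs:

```julia
# Hypothetical (name, build) pairs for failed, untagged results
failures = [
    ("test-a", "0431"), ("test-a", "0432"), ("test-a", "0432"),
    ("test-b", "0435"),
]

# Collect the distinct builds seen for each test name
builds_by_name = Dict{String, Set{String}}()
for (name, build) in failures
    push!(get!(builds_by_name, name, Set{String}()), build)
end

# Frequency is the count of distinct builds, so repeats within a build count once
frequency = Dict(name => length(builds) for (name, builds) in builds_by_name)
```

In this example `test-a` has three failing results but a frequency of 2, because two of them come from the same build (for instance on different architectures).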
