# Test Results Review Report

This notebook is meant to help create the milestone report for now, and eventually to generate it automatically. It is an interactive document containing code segments and their output. It is part of the JDP project, which aims to create an *easily accessible* system for exploring test results and automating *arbitrary* workflows.

Eventually this workbook should be easy to install locally, or to access from some remote location, with zero knowledge of Julia or Jupyter. Right now it is not, but you can still try by visiting: https://github.com/richiejp/jdp

## Setup

First we need to load the JDP library, which does the heavy lifting: importing and transforming the test result data from OpenQA into something usable. Note that this assumes you started this notebook by running `julia src/notebook.jl`.

In [65]:
# Bring Pkg (the package manager library) into our namespace so that we can access its members with Pkg.member
import Pkg

# Bring DataFrame's _members_ into our namespace, so we can call them directly
using DataFrames

# Assumes you started the notebook with jdp/src/notebook.jl
# Switch our 'environment' to the JDP project
Pkg.activate("../")
# Make sure the project dependencies are downloaded and precompiled (this can take a while)
Pkg.instantiate()

using JDP
using JDP.OpenQA
using JDP.TableDB

[32m[1m  Updating[22m[39m registry at `~/.julia/registries/General`
[32m[1m  Updating[22m[39m git-repo `https://github.com/JuliaRegistries/General.git`

Next we set some variables so that we get the data from the correct location. You can download the JSON data from OpenQA with
```julia
OpenQA.save_job_json(OpenQA.osd, datadir, group_id)
``` 
if you are feeling brave (probably you should wait until I find another way of providing the data).

In [14]:
datadir = "/home/rich/qa/data/osd" # The cache dir for the OpenQA test result data
cache_type = :binary # Set to :json to use the raw JSON data from OpenQA

:binary

Now we load the data into memory; this can take a while.

In [66]:
json = nothing
df = nothing

if cache_type == :binary
    # The raw data from OpenQA is absurdly huge, so to save on startup time we can use a binary format
    df = TableDB.load_module_results(joinpath(datadir, "cache.jld2"))
else
    json = OpenQA.load_job_results_json(datadir)
    df = TableDB.get_module_results(json)
end

"Loaded $(nrow(df)) results"

"Loaded 300224 results"

The function `describe` from the DataFrames package gives us some stats and information about the structure of the loaded data

In [23]:
describe(df, stats = [:nunique, :min, :max, :eltype])

| Row | variable | nunique | min | max | eltype |
|---|---|---|---|---|---|
| 1 | build | 41 | 0250 | 0393 | String |
| 2 | name | 2400 | 1_fs_stress | xfs-490 | String |
| 3 | result | 5 | canceled | softfailed | String |
| 4 | arch | 4 | aarch64 | x86_64 | String |
| 5 | suit | 22 | ("LTP", "cve") | ("fstests", "xfs") | Tuple{String,Union{Missing, String}} |
| 6 | bugrefs | 151 | [] | ["t#2009216"] | Array{SubString,N} where N |


## Failed tests for build

Let's look at which tests failed for a given build. First we need to filter out passed test results and results from other builds. Then we can group the results by test name and suite (the `suit` column), amalgamating some of the columns to make the table easier to view. The filter is fairly simple, but the grouping is a bit more complex and involves a bit of Julia magic; see [Split-Apply-Combine](http://juliadata.github.io/DataFrames.jl/stable/man/split_apply_combine.html) for help.

In [64]:
build = "0393"

# The syntax "var -> expr" is an anonymous function, strings starting with 'r' are regexs
fails = filter(r -> r.build == build && occursin(r"failed", r[:result]), df)

# group by name then apply the function defined by `do r ...` to each group
# Putting 'do' after `by` is like writing `by(r -> ...`
fails_by_name = by(fails, [:name, :suit]) do r
    DataFrame(
        # We have to write Tuple otherwise DataFrame creates a multi-row result
        result = Tuple(unique(r.result)), 
        arch = Tuple(unique(r.arch)),
        # Three dots `...` 'splats' an array into multiple function arguments
        # and `vcat` concatenates its arguments together
        bugrefs = Tuple(unique(vcat(r.bugrefs...)))
    )
end

"$(nrow(fails_by_name)) tests failed this build"

"68 tests failed this build"
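
If you are on a newer DataFrames release, note that `by` has (as far as I know) been deprecated and later removed in favour of `combine(groupby(...))`. The following is a minimal sketch of the same split-apply-combine step in that style. The `toy` table and all of its values are made up for illustration and are not OpenQA data.

```julia
using DataFrames

# Made-up stand-in for the real results table (illustrative values only)
toy = DataFrame(
    name   = ["t1", "t1", "t2"],
    suit   = ["a", "a", "b"],
    result = ["failed", "failed", "softfailed"],
    arch   = ["x86_64", "aarch64", "x86_64"],
)

# Split by (name, suit), then collapse each group into a single row.
# Returning a NamedTuple of non-vector values yields one row per group.
grouped = combine(groupby(toy, [:name, :suit])) do g
    (result = Tuple(unique(g.result)), arch = Tuple(unique(g.arch)))
end
```

Each group collapses to one row whose `result` and `arch` cells are tuples of the distinct values seen in that group, which mirrors what the `by(...) do r` cell above does.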

We probably have too many failures to display in Jupyter, so let's just try displaying the failures for which there are no bug tags.

In [51]:
missing_bugrefs = filter(r -> length(r.bugrefs) < 1, fails_by_name)

| Row | name | suit | result | arch | bugrefs |
|---|---|---|---|---|---|
| 1 | btrfs-152 | ("fstests", "btrfs") | ("failed",) | ("aarch64",) | () |
| 2 | generic-427 | ("fstests", "btrfs") | ("failed",) | ("aarch64",) | () |
| 3 | generate_report | ("fstests", "xfs") | ("failed",) | ("aarch64",) | () |
| 4 | xfs-113 | ("fstests", "xfs") | ("failed",) | ("aarch64",) | () |
| 5 | xfs-278 | ("fstests", "xfs") | ("failed",) | ("aarch64", "ppc64le", "x86_64") | () |
| 6 | boot_ltp | ("LTP", "syscalls") | ("failed",) | ("aarch64", "x86_64") | () |
| 7 | Numa-testcases | ("LTP", "numa") | ("failed",) | ("aarch64",) | () |
| 8 | boot_ltp | ("LTP", "net.multicast") | ("failed",) | ("aarch64",) | () |
| 9 | fallocate05 | ("LTP", "syscalls") | ("failed",) | ("ppc64le",) | () |
| 10 | generic-119 | ("fstests", "btrfs") | ("failed",) | ("ppc64le",) | () |


Let's try to find if any of these tests had bug refs in past builds

In [62]:
names_index = Set(missing_bugrefs.name)

past_results = by(filter(r -> r.name in names_index, df), [:name, :suit]) do r
    DataFrame(bugrefs = Tuple(
        # Remove OpenQA's self references with !startswith
        filter(br -> !startswith(br, "t#"), unique(vcat(r.bugrefs...)))
    ))
end

| Row | name | suit | bugrefs |
|---|---|---|---|
| 1 | boot_ltp | ("LTP", "cve") | ("bsc#1083900", "bsc#1074293", "poo#38084", "poo#37069") |
| 2 | install_ltp | ("OpenQA", "kernel") | ("poo#38141", "bsc#1093797", "bsc#1106178") |
| 3 | boot_ltp | ("LTP", "syscalls") | ("bsc#1099134", "bsc#1074293", "poo#35347", "bsc#1099173", "bsc#1102358", "bsc#1102250", "bsc#1108010", "bsc#1108028", "poo#40424") |
| 4 | fallocate05 | ("LTP", "syscalls") | ("bsc#1099134", "bsc#1074293", "poo#35347", "bsc#1099173", "bsc#1102358", "bsc#1102250", "bsc#1108010", "bsc#1108028", "poo#40424") |
| 5 | boot_ltp | ("LTP", "numa") | ("bsc#1099878", "bsc#1102250") |
| 6 | Numa-testcases | ("LTP", "numa") | ("bsc#1099878", "bsc#1102250") |
| 7 | boot_ltp | ("LTP", "syscalls-ipc") | () |
| 8 | boot_ltp | ("LTP", "ima") | ("poo#37480", "poo#37838") |
| 9 | boot_ltp | ("LTP", "net.ipv6") | () |
| 10 | boot_ltp | ("LTP", "net.multicast") | ("bsc#1102250",) |
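
The flatten-and-filter step applied to `bugrefs` in the cell above can be tried in isolation. This is a small sketch on invented data: the reference strings and the variable names `bugrefs_per_run`, `all_refs` and `external_refs` are made up for illustration.

```julia
# Made-up bug reference lists, one inner array per test run
bugrefs_per_run = [["bsc#1", "t#2"], String[], ["bsc#1", "poo#3"]]

# Splat the inner arrays into vcat to flatten them, then drop duplicates
all_refs = unique(vcat(bugrefs_per_run...))

# Keep only references that are not OpenQA self references (prefix "t#")
external_refs = filter(br -> !startswith(br, "t#"), all_refs)
```

Here `all_refs` keeps the first occurrence of each reference in order, and `external_refs` is what would end up in the `bugrefs` column for one group.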


You may find that there are still too many results to view here. It is left as an exercise to the reader to filter out even more (you may just want to blacklist tests like `boot_ltp` and `partition`, which create a lot of noise).
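
One possible way to do such blacklisting is sketched below. It assumes the table has a `name` column like `past_results` does; the `toy_results` table and the `noisy` set are hypothetical and only there to make the snippet self-contained.

```julia
using DataFrames

# Hypothetical stand-in for `past_results`; the rows are invented
toy_results = DataFrame(
    name    = ["boot_ltp", "partition", "xfs-113"],
    bugrefs = [(), (), ()],
)

# Test names we consider noise; adjust to taste
noisy = Set(["boot_ltp", "partition"])

# Keep only rows whose test name is not in the blacklist
quiet = filter(r -> !(r.name in noisy), toy_results)
```

Using a `Set` keeps the membership check cheap, in the same way `names_index` is used earlier in this notebook.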

And finally (for now), let's get the completely tagless tests on their own

In [63]:
filter(r -> length(r.bugrefs) < 1, past_results)

| Row | name | suit | bugrefs |
|---|---|---|---|
| 1 | boot_ltp | ("LTP", "syscalls-ipc") | () |
| 2 | boot_ltp | ("LTP", "net.ipv6") | () |
| 3 | generic-119 | ("fstests", "btrfs") | () |
| 4 | btrfs-007 | ("fstests", "btrfs") | () |
| 5 | generic-091 | ("fstests", "xfs") | () |
| 6 | generic-119 | ("fstests", "xfs") | () |
| 7 | xfs-113 | ("fstests", "xfs") | () |
| 8 | xfs-278 | ("fstests", "xfs") | () |
| 9 | boot_ltp | ("LTP", "net_stress.interface") | () |
