# Result Status Differences

This script compares test results across builds to find interesting changes in status. When it finds something that may be relevant, it can notify interested parties. It uses the [JDP framework](https://rpalethorpe.io.suse.de/jdp/).

First we build the data structures that make up the test matrix. This part prints some statistics which may be useful, but most of the time you can safely skip it.

## Contents

- [Setup](#setup)
- Results
  + [LTP](#ltp)
  + [HPC](#hpc)
  + [Networking](#networking)
  + [Public Cloud](#publiccloud)
  + [Other](#other)
  + [File Systems](#fstests)
    * [BTRFS](#btrfs)
    * [XFS](#xfs)
- [Notifications](#notifications)

## Setup

In [None]:
# Monitors library source files and recompiles them after most changes
import Revise

# Run the init script, which will set up the JDP project if necessary
include("../src/init.jl")

# Bring the names exported by DataFrames into our namespace, so we can call them directly
using DataFrames
import DataStructures: SortedDict, SortedSet, SDSemiToken
import Dates: Day
import TOML

# import the markdown string literal/macro
import Markdown
import Markdown: @md_str, MD

# Import some libraries from the JDP project
using JDP.Conf
using JDP.Trackers.OpenQA    # Contains functions for dealing with the OpenQA web API
using JDP.Trackers.Bugzilla  # Functions for accessing the Bugzilla API(s)
using JDP.Repository
using JDP.Spammer

In [None]:
html"<h2 id='setup'>Setup</h2>"

First we load a large chunk of the results in our database into memory where we can play with them.

In [None]:
allres = Repository.fetch(OpenQA.TestResult, Vector, "osd", OpenQA.RecentOrInterestingJobsDef)

md"We have **$(length(allres))** results in total"

We only show results for a single product, which can be set here.

In [None]:
product = "sle-12-SP5-Server-DVD"
cloudproduct = "sle-12-SP5"

prodres = filter(allres) do res
    res.product == product
end

cloudprodres = filter(allres) do res
    startswith(res.product, cloudproduct) && ("Public Cloud" in res.flags)
end

md"""
We have $(length(prodres)) test results for $(product) and $(length(cloudprodres)) for $(cloudproduct) cloud
"""

Now we create a 'build matrix', which has one result per test for each product build. The field subset ordering determines which test result fields are used to decide whether two test results are equal and how they are ordered.

The function `OpenQA.describe` returns a summary of the result matrix; without it this report would be rather verbose. You can safely remove the `describe` call to see the full output.

In [None]:
fullm = OpenQA.build_matrix(prodres, 
    OpenQA.FieldSubsetOrdering(:suit, :machine, :name, :flags))
OpenQA.describe(fullm)

In [None]:
fullcm = OpenQA.build_matrix(cloudprodres, 
    OpenQA.FieldSubsetOrdering(:machine, :suit, :name, :flags))
OpenQA.describe(fullcm)
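
To illustrate what a field subset ordering does, here is a minimal sketch that compares toy results using only a chosen list of fields. `ToyResult`, `subset_key`, `subset_isequal` and `subset_isless` are hypothetical names invented for this example; they are not part of JDP, and `OpenQA.FieldSubsetOrdering` may well be implemented differently.

```julia
# Hypothetical stand-in for a test result; not the JDP type.
struct ToyResult
    suit::Vector{String}
    machine::String
    name::String
    build::String
    result::String
end

# Project a result onto the chosen fields, in the chosen order.
subset_key(r, fields) = Tuple(getfield(r, f) for f in fields)

# Two results describe "the same test" if they agree on every chosen field.
subset_isequal(a, b, fields) = subset_key(a, fields) == subset_key(b, fields)

# Results sort by the chosen fields, earlier fields taking priority.
subset_isless(a, b, fields) = isless(subset_key(a, fields), subset_key(b, fields))

fields = (:suit, :machine, :name)

a = ToyResult(["LTP", "syscalls"], "64bit", "abort01", "0100", "passed")
b = ToyResult(["LTP", "syscalls"], "64bit", "abort01", "0101", "failed")
c = ToyResult(["LTP", "syscalls"], "aarch64", "abort01", "0101", "passed")

subset_isequal(a, b, fields)  # true: build and result are not in the subset
subset_isless(a, c, fields)   # true: "64bit" sorts before "aarch64"
```

Because `build` and `result` are left out of the subset, results for the same test from different builds compare as equal, which is what lets them share one row of the matrix.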

Remove older builds, along with tests only present in those builds. This speeds things up and avoids counting permanently disabled tests in the missing-result statistics.

In [None]:
m = OpenQA.truncate_builds(fullm, 7)
OpenQA.describe(m)

In [None]:
cm = OpenQA.truncate_builds(fullcm, 7)
OpenQA.describe(cm)
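
The idea behind truncating builds can be sketched on a toy matrix stored as nested dictionaries. This is only an illustration under stated assumptions: `truncate_toy` and the data are hypothetical, build identifiers are assumed to sort with the newest last, and `OpenQA.truncate_builds` may work differently.

```julia
# Hypothetical toy matrix: test name => (build => result). Not a JDP type.
matrix = Dict(
    "abort01"   => Dict("0098" => "passed", "0099" => "passed", "0100" => "failed"),
    "retired01" => Dict("0097" => "failed"),
    "accept01"  => Dict("0099" => "passed"),
)

# Keep only the `n` most recent builds, then drop tests left with no results.
# Assumes build identifiers sort so that the newest comes last.
function truncate_toy(matrix, n)
    builds = sort!(unique(b for seq in values(matrix) for b in keys(seq)))
    keep = Set(builds[max(1, end - n + 1):end])
    out = Dict{String, Dict{String, String}}()
    for (test, seq) in matrix
        kept = filter(p -> first(p) in keep, seq)
        isempty(kept) || (out[test] = kept)
    end
    out
end

recent = truncate_toy(matrix, 2)
sort(collect(keys(recent)))  # ["abort01", "accept01"]
```

Here `retired01` only ran in a build that was truncated away, so it no longer appears and cannot inflate the count of missing results.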

Some helper functions used to filter each test suite. The functions `OpenQA.filter_builds`, `OpenQA.filter_seqs` and `OpenQA.group_matrix` are fairly generic, although not as generic as `DataFrames` methods.

In [None]:
# removes builds where the fraction of tests with no result is `tolerance` or more
function filter_bad_builds(mat, tolerance::Float64)
    tcount = length(mat.seqs) # seqs is short for test sequences
    
    OpenQA.filter_builds(mat) do builds
        nons = 0
        for testres in builds
            if testres === nothing
                nons += 1
            end
        end
        nons / tcount < tolerance
    end
end

# removes tests which returned the same result for all builds
function filter_consistant_tests(mat)
    OpenQA.filter_seqs(mat) do ex, seq # ex is short for exemplar test
        ftest = first(seq)
        fres = ftest === nothing ? "none" : ftest.result
        !all(seq) do test
            res = test === nothing ? "none" : test.result
            res == fres
        end
    end
end

function group_by_machine(mat)
    # Note that tests are implicitly grouped by their result status sequence
    # as well as by the function passed here. Despite this helper's name, that
    # function compares the `suit` field.
    OpenQA.group_matrix(mat) do test1, test2
        test1.suit == test2.suit
    end
end

# Output is usually truncated to roughly the height of your display, so raise the limit
ENV["LINES"] = 500

function filter_and_group(fn, mat, tolerance)
    mat = OpenQA.filter_seqs(fn, mat)
    display(md"After test filter: $(OpenQA.describe(mat))")
    mat = filter_bad_builds(mat, tolerance)
    display(md"After bad build filter: $(OpenQA.describe(mat))")
    mat = filter_consistant_tests(mat)
    display(md"After consistent test filter: $(OpenQA.describe(mat))")
    group_by_machine(mat)
end
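
The two filters above can be illustrated on a plain Julia matrix. This is a rough sketch rather than the JDP implementation: `keep_good_builds`, `keep_changing_tests` and the data are hypothetical, and the real helpers operate on the OpenQA matrix types instead.

```julia
# Hypothetical toy matrix: rows are tests, columns are builds, and `nothing`
# marks a test with no result in that build. Not a JDP type.
results = Union{String, Nothing}[
    "passed"  "passed"  nothing
    "passed"  "failed"  nothing
    "failed"  "failed"  nothing
    "passed"  nothing   "passed"
]

# Keep the columns (builds) where the fraction of missing results is
# below `tolerance`.
function keep_good_builds(results, tolerance)
    ntests = size(results, 1)
    good = [count(isnothing, col) / ntests < tolerance for col in eachcol(results)]
    results[:, good]
end

# Keep the rows (tests) whose result is not the same in every build;
# a missing result counts as its own status.
function keep_changing_tests(results)
    changing = [!all(isequal(first(row)), row) for row in eachrow(results)]
    results[changing, :]
end

good = keep_good_builds(results, 0.5)  # drops the third build (75% missing)
changed = keep_changing_tests(good)    # keeps rows 2 and 4
```

Filtering bad builds first matters: had the third build been kept, every test would appear to change status there, simply because most results are missing.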

## Results

The results of a number of different test suites or environments follow.

In [None]:
html"<h3 id='ltp'>LTP</h3>"

In [None]:
ltpmg = filter_and_group(m, 0.25) do ex, seq
    ex.suit[1] == "LTP"
end

In [None]:
html"<h3 id='hpc'>HPC</h3>"

In [None]:
hpcmg = filter_and_group(m, 0.25) do ex, seq
    length(ex.suit) > 1 && ex.suit[1:2] == ["OpenQA", "HPC"]
end

In [None]:
html"<h3 id='networking'>Networking</h3>"

In [None]:
netmg = filter_and_group(m, 0.25) do ex, seq
    occursin("wicked", ex.job.name)
end

In [None]:
html"<h3 id='publiccloud'>Public Cloud</h3>"

In [None]:
cmg = filter_and_group(cm, 0.25) do ex, seq
    true
end

In [None]:
html"<h3 id='other'>Other</h3>"

Some of the tests listed here are simply OpenQA helper modules or tests which have not been properly categorised yet.

In [None]:
othmg = filter_and_group(m, 0.25) do ex, seq
    suit = ex.suit[1]
    
    suit ≠ "LTP" && suit ≠ "fstests" && 
    !("Public Cloud" in ex.flags) && 
    get(ex.suit, 2, nothing) ≠ "HPC" &&
    !occursin("wicked", ex.job.name)
end

In [None]:
html"<h3 id='fstests'>File Systems</h3><h4 id='btrfs'>BTRFS</h4>"

In [None]:
btrfsmg = filter_and_group(m, 0.15) do ex, seq
    ex.suit[1] == "fstests" && get(ex.suit, 2, nothing) == "btrfs"
end

In [None]:
html"<h4 id='xfs'>XFS</h4>"

In [None]:
xfsmg = filter_and_group(m, 0.15) do ex, seq
    ex.suit[1] == "fstests" && get(ex.suit, 2, nothing) == "xfs"
end

In [None]:
html"<h2 id='notifications'>Notifications</h2>"

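Next we notify interested people of changes in test results. To limit noise, a test is included in a notification to a given set of users only when its status differs from the one last reported to them. That record expires after seven days (`Day(7)` in the code below), after which the test may be reported again.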

In [None]:
function maybe_notify(gm, report_id, notifyprefs)
    mentions = Set()
    changed_tests = 0
    if isempty(gm.m.builds) 
        return changed_tests
    end
    build = first(gm.m.builds)
    
    # Notifications are not effective if there are too many of them. Also setting the
    # notified flags for each users-test pair can be expensive.
    if length(gm.groups) > 100
        @warn "No notifications will be sent for $report_id due to the excessive number of changes"
        return 0
    end

    for g in gm.groups
        test = first(g.tests)
        test_name = join(test.suit, ":") * ":$(test.name)"
        test_id = "$test_name@$(test.arch)[" * join(test.flags, ",") * "]"
        users = vcat((users for (pattern, users) in notifyprefs if occursin(pattern, test_id))...)
        users_key = join(users, "&")
        flag_key = "diff-notified-$test_id$users_key"
        latest = if haskey(g.seq.builds, build)
            g.seq.builds[build]
        else
            nothing
        end
                            
        oldres = Repository.get_temp_flag(flag_key)
        newres = latest ≠ nothing ? latest.result : "none"
        @debug test_id repr(oldres) newres
        if oldres ≠ newres
            changed_tests += 1
            union!(mentions, users) # also safe when no users matched
            Repository.set_temp_flag(flag_key, newres, Day(7))
        end
    end

    if changed_tests > 0
        io = IOBuffer()
        print(io, """
At least $changed_tests tests appear to have changed status recently in the $report_id category.\n
See the [Status Difference Report](https://rpalethorpe.io.suse.de/jdp/reports/Report-Status-Diff.html#$report_id) for details""")

        Spammer.post_message(Spammer.Message(String(take!(io)), collect(mentions)))
    end
    
    changed_tests
end
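
The rate limiting in `maybe_notify` relies on flags that expire. Below is a minimal sketch of that idea using an in-memory dictionary. The names `set_flag!`, `get_flag` and `should_notify!` are hypothetical, and the behaviour of `Repository.get_temp_flag` and `Repository.set_temp_flag` is assumed here, not confirmed.

```julia
using Dates

# Hypothetical in-memory flag store: key => (value, expiry time).
# JDP's Repository.get_temp_flag/set_temp_flag may behave differently.
flags = Dict{String, Tuple{String, DateTime}}()

set_flag!(flags, key, value, ttl::Period, at::DateTime) =
    (flags[key] = (value, at + ttl))

function get_flag(flags, key, at::DateTime)
    haskey(flags, key) || return nothing
    value, expiry = flags[key]
    at < expiry ? value : nothing  # an expired flag reads as absent
end

# Returns true when a notification should be sent for this test status.
function should_notify!(flags, key, status, at::DateTime; ttl = Day(7))
    get_flag(flags, key, at) == status && return false
    set_flag!(flags, key, status, ttl, at)
    true
end

t0 = DateTime(2019, 6, 1)
sent = [
    should_notify!(flags, "ltp:abort01", "failed", t0),           # first sighting
    should_notify!(flags, "ltp:abort01", "failed", t0 + Day(1)),  # already notified
    should_notify!(flags, "ltp:abort01", "passed", t0 + Day(2)),  # status changed
    should_notify!(flags, "ltp:abort01", "passed", t0 + Day(10)), # flag expired
]
# sent == [true, false, true, true]
```

Storing the last reported status, and not just a "notified" marker, is what allows a genuine status change to get through before the flag expires.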

The targets of the notifications are taken from the OpenQA job group descriptions.

In [None]:
testprefs = OpenQA.load_notify_preferences("osd")
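
As a rough sketch of how such preferences are matched against a test, the example below pairs patterns with users and collects everyone whose pattern occurs in the test identifier. The patterns, user names, `make_test_id` and `users_for` are invented for illustration. The actual structure returned by `OpenQA.load_notify_preferences` is only assumed to iterate as pattern and users pairs, as `maybe_notify` does.

```julia
# Hypothetical preferences: pattern => users to mention. The real ones come
# from OpenQA.load_notify_preferences; names and patterns here are invented.
prefs = [
    r"^LTP:"         => ["alice"],
    r"fstests:btrfs" => ["bob", "carol"],
    r"@aarch64"      => ["dave"],
]

# Build an identifier in the same shape as `test_id` in `maybe_notify`.
make_test_id(suit, name, arch, flags) =
    join(suit, ":") * ":$(name)@$(arch)[" * join(flags, ",") * "]"

# Collect every user whose pattern occurs in the identifier.
# De-duplicating is a choice made for this sketch.
function users_for(test_id, prefs)
    users = String[]
    for (pattern, names) in prefs
        occursin(pattern, test_id) && append!(users, names)
    end
    unique!(users)
end

id = make_test_id(["LTP", "syscalls"], "abort01", "aarch64", ["Public Cloud"])
# id == "LTP:syscalls:abort01@aarch64[Public Cloud]"
users_for(id, prefs)  # ["alice", "dave"]
```

Because the identifier contains the suite, test name, architecture and flags, a single pattern can target anything from one test to a whole architecture.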

In [None]:
changes = maybe_notify(ltpmg, "ltp", testprefs)
md"Sent **$changes** change notifications"

In [None]:
changes = maybe_notify(hpcmg, "hpc", testprefs)
md"Sent **$changes** change notifications"

In [None]:
changes = maybe_notify(netmg, "networking", testprefs) # id must match the 'networking' anchor linked from the message
md"Sent **$changes** change notifications"

In [None]:
changes = maybe_notify(cmg, "publiccloud", testprefs)
md"Sent **$changes** change notifications"

In [None]:
changes = maybe_notify(othmg, "other", testprefs)
md"Sent **$changes** change notifications"

In [None]:
changes = maybe_notify(btrfsmg, "btrfs", testprefs)
md"Sent **$changes** change notifications"

In [None]:
changes = maybe_notify(xfsmg, "xfs", testprefs)
md"Sent **$changes** change notifications"