In [2]:
library(tidyverse)

# Long format: one row per (failureID, test, category) whose flag is set.
categorized <- read_csv("categorized-failure-ids.csv") %>%
    select(-failureMessage) %>%
    pivot_longer(-c('failureID','test'), names_to="category", values_to="present") %>%
    filter(present == 1)

# Per-failure counts from each experiment; missing values are replaced with 0.
matched_failures <- read_csv("matched-failures.csv", col_types = "ccccccdddddd") %>%
    select(-FailConfigs.flakerake, -`FailConfigs.flakerake-obo`) %>%
    replace(is.na(.), 0)

categorized_matched_failures <- matched_failures %>%
    inner_join(categorized, by=c("failureID","test"))

── Attaching packages ─────────────────────────────────────── tidyverse 1.3.1 ──

✔ ggplot2 3.3.5     ✔ purrr   0.3.4
✔ tibble  3.1.6     ✔ dplyr   1.0.7
✔ tidyr   1.1.4     ✔ stringr 1.4.0
✔ readr   2.0.2     ✔ forcats 0.5.1

── Conflicts ────────────────────────────────────────── tidyverse_conflicts() ──
✖ dplyr::filter() masks stats::filter()
✖ dplyr::lag()    masks stats::lag()

Rows: 1622 Columns: 35

── Column specification ────────────────────────────────────────────────────────
Delimiter: ","
chr  (3): test, failureID, failureMessage
dbl (32): AssertionError, IOError, SleepyTimeOut(Proba

In [3]:
# How many Timeout failures appeared in the rerun (rerun > 0)?
nrow(categorized_matched_failures %>% filter(category=='Timeout' & rerun>0))

In [4]:
categorized_matched_failures %>%
    mutate(status = case_when(
        rerun > 0 & flakerake == 0 & isolatedRerun <= 2 & flakeFlaggerRepl <= 2 ~ "Only FlakeFlagger",
        flakerake > 0 & rerun == 0 & isolatedRerun <= 2 & flakeFlaggerRepl <= 2 ~ "Only FlakeRake",
        flakerake == 0 & (isolatedRerun > 2 | flakeFlaggerRepl > 2) ~ "RerunRepl or IsolatedRerun, NOT FlakeRake",
        # flakerake > 0 & rerun > 0 ~ "FlakeRake and Rerun",
        TRUE ~ "Other"
    )) %>%
    group_by(status, category) %>%
    summarise(nFailures = n()) %>%
    bind_rows(summarise(
        .,
        across(where(is.numeric), sum),
        across(where(is.character), ~"**Total**")
    )) %>%
    ungroup() %>%
    arrange(status, desc(nFailures))

`summarise()` has grouped output by 'status'. You can override using the `.groups` argument.



status,category,nFailures
<chr>,<chr>,<int>
Only FlakeFlagger,**Total**,787
Only FlakeFlagger,java.net.UnknownHostException,224
Only FlakeFlagger,AssertionError,148
Only FlakeFlagger,IOException,144
Only FlakeFlagger,ArtifactResolutionException: Could not transfer artifact,104
Only FlakeFlagger,NullPointerException,46
Only FlakeFlagger,Timeout,33
Only FlakeFlagger,was updated by another transaction concurrently,21
Only FlakeFlagger,SocketException,19
Only FlakeFlagger,Address already in use,13
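
To make the `status` rules in the cell above concrete, here is a minimal sketch that applies the same `case_when` conditions to a small made-up data frame. The rows, the `toy` name, and the `classify_status` helper are hypothetical and exist only for illustration; they are not part of the dataset or the original analysis.

```r
library(dplyr)

# Hypothetical rows; all counts are invented for illustration only.
toy <- data.frame(
    failureID        = c("a", "b", "c", "d"),
    rerun            = c(3, 0, 1, 2),
    flakerake        = c(0, 5, 0, 4),
    isolatedRerun    = c(0, 1, 7, 0),
    flakeFlaggerRepl = c(2, 0, 0, 0)
)

# Hypothetical helper wrapping the same conditions used in the notebook cell.
# case_when() evaluates the conditions in order and keeps the first match.
classify_status <- function(df) {
    df %>% mutate(status = case_when(
        rerun > 0 & flakerake == 0 & isolatedRerun <= 2 & flakeFlaggerRepl <= 2 ~ "Only FlakeFlagger",
        flakerake > 0 & rerun == 0 & isolatedRerun <= 2 & flakeFlaggerRepl <= 2 ~ "Only FlakeRake",
        flakerake == 0 & (isolatedRerun > 2 | flakeFlaggerRepl > 2) ~ "RerunRepl or IsolatedRerun, NOT FlakeRake",
        TRUE ~ "Other"
    ))
}

toy_classified <- classify_status(toy)
print(toy_classified[, c("failureID", "status")])
```

Row `d` fails in both `rerun` and `flakerake`, so it matches none of the three explicit rules and falls through to `"Other"`, which is the bucket the commented-out "FlakeRake and Rerun" rule would otherwise have covered.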


## Facts stated inline in the paper

IOExceptions found only by FlakeFlagger's rerun:

In [7]:
categorized_matched_failures %>%
    filter(category == "IOException" & rerun > 0 & flakerake == 0 &
        isolatedRerun == 0 & flakeFlaggerRepl == 0) %>%
    select(slug, test, failureID, category)

test,failureID,category
<chr>,<chr>,<chr>
tachyon.client.LocalBlockInStreamTest#skipTest,33a796f3e535132339ad2875ddb8f2b2,IOException
org.apache.hadoop.hbase.stargate.client.TestRemoteAdmin#testDeleteTable,915423355e31f66413907728c3095756,IOException
org.apache.hadoop.hbase.stargate.client.TestRemoteAdmin#testCreateTable,915423355e31f66413907728c3095756,IOException
org.apache.hadoop.hbase.stargate.TestStatusResource#testGetClusterStatusXML,d4155255635e96648c351f3bc983a9a5,IOException
org.apache.hadoop.hbase.stargate.TestStatusResource#testGetClusterStatusPB,d4155255635e96648c351f3bc983a9a5,IOException
org.apache.hadoop.hbase.stargate.Test00MiniCluster#testDFSMiniCluster,33818538dfe2d28ad9b6011377f84b0f,IOException
org.apache.hadoop.hbase.stargate.Test00MiniCluster#testZooKeeperMiniCluster,33818538dfe2d28ad9b6011377f84b0f,IOException
org.apache.hadoop.hbase.stargate.Test00MiniCluster#testHBaseMiniCluster,33818538dfe2d28ad9b6011377f84b0f,IOException
org.apache.hadoop.hbase.stargate.Test00MiniCluster#testStargateServlet,33818538dfe2d28ad9b6011377f84b0f,IOException
org.apache.hadoop.hbase.stargate.client.TestRemoteTable#testGetTableDescriptor,68ec2a005d74685d9ce7c1579bd791dd,IOException


## Flaky tests not reproduced in any of our experiments

In [15]:
categorized_matched_failures %>%
    filter(rerun > 0 & flakerake == 0 &
        isolatedRerun == 0 & flakeFlaggerRepl == 0) %>%
    group_by(slug, category) %>%
    summarise(nFailures = n()) %>%
    ungroup() %>%
    pivot_wider(names_from = category, values_from = nFailures) %>%
    replace(is.na(.), 0) %>%
    bind_rows(summarise(
        .,
        across(where(is.numeric), sum),
        across(where(is.character), ~"**Total**")
    ))

`summarise()` has grouped output by 'slug'. You can override using the `.groups` argument.



slug,NullPointerException,Timeout,AssertionError,IOException,java.net.UnknownHostException,java.net.ConnectException,Unexpected exception,Address already in use,IllegalArgumentException,⋯,Wanted but not invoked,Bind failed,java.lang.IllegalStateException,ArtifactResolutionException: Could not transfer artifact,java.lang.ExceptionInInitializerError,java.lang.NoClassDefFoundError:,EOFException,java.lang.NoSuchMethodError,SocketException,java.lang.IllegalAccessException
<chr>,<int>,<int>,<int>,<int>,<int>,<int>,<int>,<int>,<int>,⋯,<int>,<int>,<int>,<int>,<int>,<int>,<int>,<int>,<int>,<int>
activiti-activiti,1,1,0,0,0,0,0,0,0,⋯,0,0,0,0,0,0,0,0,0,0
Alluxio-alluxio,14,0,1,1,116,0,0,0,0,⋯,0,0,0,0,0,0,0,0,0,0
apache-ambari,0,0,0,0,9,1,1,0,0,⋯,0,0,0,0,0,0,0,0,0,0
apache-commons-exec,0,0,1,0,0,0,0,0,0,⋯,0,0,0,0,0,0,0,0,0,0
apache-hbase,0,23,3,140,27,2,0,3,1,⋯,0,0,0,0,0,0,0,0,0,0
apache-httpcore,0,1,4,0,0,0,0,0,0,⋯,2,0,0,0,0,0,0,0,0,0
apache-incubator-dubbo,1,1,3,0,0,0,0,10,0,⋯,0,4,1,0,0,0,0,0,0,0
doanduyhai-Achilles,0,0,0,0,1,0,0,0,0,⋯,0,0,0,0,0,0,0,0,0,0
elasticjob-elastic-job-lite,0,0,4,0,0,0,0,0,0,⋯,0,0,0,0,0,0,0,0,0,0
qos-ch-logback,0,2,2,0,0,0,0,0,0,⋯,1,0,0,0,0,0,0,0,0,0
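
The `bind_rows(summarise(., ...))` idiom used in the cell above appends a totals row: numeric columns are summed, and character columns are filled with the label `**Total**`. A minimal sketch on made-up data, assuming an ungrouped data frame (the `toy_counts` name, project names, and counts are hypothetical):

```r
library(dplyr)

# Hypothetical per-project counts, invented for illustration only.
toy_counts <- data.frame(
    slug        = c("project-a", "project-b"),
    Timeout     = c(1L, 2L),
    IOException = c(0L, 3L)
)

# The pipe passes toy_counts both as the first argument of bind_rows()
# and as the `.` placeholder inside the nested summarise() call.
toy_with_total <- toy_counts %>%
    bind_rows(summarise(
        .,
        across(where(is.numeric), sum),
        across(where(is.character), ~"**Total**")
    ))

print(toy_with_total)
```

When the input is still grouped, as in the `status` breakdown earlier in the notebook, the same idiom instead produces one totals row per group.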
