# 6 - Analyses of which errors were fixed

## Table of contents:
* [Notebook setup](#notebook-setup)
* [Listing most common violations](#most-common)
* [Most common fixed and unfixed errors](#most-common-fixed-unfixed)
* [Errors most likely to be fixed](#most-likely-to-be-fixed)

## Notebook setup <a class="anchor" id="notebook-setup"></a>

Import dependencies

In [5]:
import warnings
import pandas as pd
import numpy as np
import sqlite3
import psycopg2
import sys
from sqlalchemy import create_engine
from scipy import stats

warnings.simplefilter(action="ignore", category=FutureWarning)
pd.options.mode.chained_assignment = None  # default='warn'

Connect to the database

In [6]:
url = "postgresql+psycopg2://admin:secret@localhost:5432/accessibility_monitoring_app"
engine = create_engine(url)
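
The connection string above embeds the credentials directly. As a minimal sketch, the same URL could be assembled from environment variables instead. The variable names (`DB_USER`, `DB_PASSWORD`, `DB_HOST`, `DB_PORT`, `DB_NAME`) and the helper `build_db_url` are hypothetical and not part of the original notebook; the defaults mirror the values used above.

```python
import os
from urllib.parse import quote_plus


def build_db_url(env=os.environ):
    """Assemble a SQLAlchemy PostgreSQL URL from (hypothetical) environment variables."""
    user = env.get("DB_USER", "admin")
    password = env.get("DB_PASSWORD", "secret")
    host = env.get("DB_HOST", "localhost")
    port = env.get("DB_PORT", "5432")
    name = env.get("DB_NAME", "accessibility_monitoring_app")
    # Percent-encode user and password so characters such as "@" or "/" do not break the URL
    return (
        f"postgresql+psycopg2://{quote_plus(user)}:{quote_plus(password)}"
        f"@{host}:{port}/{name}"
    )


url = build_db_url()
```

The resulting `url` can be passed to `create_engine` exactly as in the cell above.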

Import data from public.audits_checkresult

In [7]:
pd.set_option("display.max_columns", None)

df = pd.read_sql("SELECT * FROM public.audits_checkresult;", engine)
df.head()

| | id | is_deleted | type | check_result_state | notes | audit_id | page_id | wcag_definition_id | retest_notes | retest_state |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 15 | False | axe | error | Refers to the blue 'i' button in the 'Contact ... | 1 | 1 | 69 | | fixed |
| 1 | 34 | False | axe | error | ° Turquoise and white\r\n° Red text for the cu... | 1 | 2 | 23 | | fixed |
| 2 | 2 | False | manual | no-error | | 1 | 1 | 10 | | not-retested |
| 3 | 3 | False | manual | no-error | | 1 | 1 | 11 | | not-retested |
| 4 | 4 | False | manual | no-error | | 1 | 1 | 12 | | not-retested |


Importing WCAG definitions

In [8]:
wcag_definitions_df = pd.read_sql("SELECT * FROM public.audits_wcagdefinition;", engine)
wcag_definitions_df.head()

| | id | type | name | description | url_on_w3 | report_boilerplate | date_start | date_end |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 1 | pdf | WCAG 1.4.3 Contrast (Minimum) | | https://www.w3.org/WAI/WCAG21/Understanding/co... | Poor colour contrast makes it difficult for so... | | |
| 1 | 2 | pdf | WCAG 2.4.2 Page titled | | https://www.w3.org/WAI/WCAG21/Understanding/pa... | PDF documents should have titles that describe... | | |
| 2 | 3 | pdf | WCAG 3.1.1 Language of Page | | https://www.w3.org/WAI/WCAG21/Understanding/la... | Assistive technologies are more accurate when ... | | |
| 3 | 4 | pdf | WCAG 1.3.1 Info and Relationships | In tables | https://www.w3.org/WAI/WCAG21/Understanding/in... | Information in tables must be shown in a way t... | | |
| 4 | 6 | pdf | WCAG 1.1.1 Non-text content | | https://www.w3.org/WAI/WCAG21/Understanding/no... | People with sight loss may not see an image cl... | | |

Add the WCAG definition name to each check result

In [9]:
id_to_definition_dict = dict(zip(wcag_definitions_df.id, wcag_definitions_df.name))
df["wcag_definition"] = df.replace({"wcag_definition_id": id_to_definition_dict})["wcag_definition_id"]
df.head()

| | id | is_deleted | type | check_result_state | notes | audit_id | page_id | wcag_definition_id | retest_notes | retest_state | wcag_definition |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| 0 | 15 | False | axe | error | Refers to the blue 'i' button in the 'Contact ... | 1 | 1 | 69 | | fixed | WCAG 4.1.2 Name, Role, Value |
| 1 | 34 | False | axe | error | ° Turquoise and white\r\n° Red text for the cu... | 1 | 2 | 23 | | fixed | WCAG 1.4.3 Contrast (minimum) |
| 2 | 2 | False | manual | no-error | | 1 | 1 | 10 | | not-retested | WCAG 1.4.4. Resize Text |
| 3 | 3 | False | manual | no-error | | 1 | 1 | 11 | | not-retested | WCAG 1.4.10 Reflow |
| 4 | 4 | False | manual | no-error | | 1 | 1 | 12 | | not-retested | WCAG 1.2.1 Audio-only and video-only (prerecor... |
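
The id-to-name lookup above could also be written with `Series.map`, sketched below on small stand-in frames (the rows and the id `99` are invented for illustration). One behavioural difference is worth knowing: `map` yields `NaN` for an id that has no definition, whereas the `replace` call used above would leave the raw id in place.

```python
import pandas as pd

# Stand-in data for illustration only; the real frames come from the database
definitions = pd.DataFrame(
    {"id": [10, 11], "name": ["WCAG 1.4.4. Resize Text", "WCAG 1.4.10 Reflow"]}
)
results = pd.DataFrame({"wcag_definition_id": [10, 11, 99]})  # 99 has no definition

# Build the lookup dictionary and map each id to its definition name
id_to_name = dict(zip(definitions["id"], definitions["name"]))
results["wcag_definition"] = results["wcag_definition_id"].map(id_to_name)

# Ids without a matching definition come back as NaN and are easy to spot
unmapped = results[results["wcag_definition"].isna()]
```

Checking `unmapped` after the lookup is a cheap way to confirm that every check result points at a known definition.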


## Listing most common violations  <a class="anchor" id="most-common"></a>

In [15]:
most_common_violations = df[df["retest_state"] != "not-retested"]["wcag_definition"].value_counts()
print(f"Total violations: {most_common_violations.sum()}")
most_common_violations

Total violations: 1330


WCAG 2.4.7 Focus Visible                                                 231
WCAG 4.1.2 Name, Role, Value                                             197
WCAG 2.1.1 Keyboard                                                      172
WCAG 1.4.3 Contrast (minimum)                                            161
WCAG 1.3.1 Info and Relationships                                        142
WCAG 2.4.4 Link Purpose (In Context) and WCAG 4.1.2 Name, Role, Value    104
WCAG 1.1.1 Non-text Content                                               47
WCAG 2.4.3 Focus Order                                                    36
WCAG 1.4.3 Contrast (Minimum)                                             32
WCAG 1.4.10 Reflow                                                        31
WCAG 2.4.2 Page titled                                                    31
WCAG 2.4.1 Bypass Blocks and WCAG 4.1.2 Name, Role, Value                 30
WCAG 2.2.2 Pause, Stop, Hide                                              24
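
Note that "WCAG 1.4.3 Contrast (minimum)" and "WCAG 1.4.3 Contrast (Minimum)" are counted separately in the list above because the names differ only in letter case. They may well be distinct definitions on purpose (the definitions table lists the capitalised one with type `pdf`), so the rest of this notebook keeps them apart. If a combined count per success criterion were wanted, a sketch along these lines could fold the case variants together; it uses three counts copied from the output above and is an illustration, not part of the original analysis.

```python
from collections import Counter

# Counts copied from the value_counts output above (subset for illustration)
counts = {
    "WCAG 2.4.7 Focus Visible": 231,
    "WCAG 1.4.3 Contrast (minimum)": 161,
    "WCAG 1.4.3 Contrast (Minimum)": 32,
}

# Fold names that differ only in letter case into a single key
merged = Counter()
for name, n in counts.items():
    merged[name.casefold()] += n

contrast_total = merged["wcag 1.4.3 contrast (minimum)"]
```

Here `contrast_total` is 193, the sum of the two contrast rows.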

Separating the fixed and unfixed errors

In [17]:
fixed_errors = df[
    (df["check_result_state"] == "error")
    & (df["retest_state"] == "fixed")
]

unfixed_errors = df[
    (df["check_result_state"] == "error")
    & (df["retest_state"] == "not-fixed")
]

fixed_errors = fixed_errors[fixed_errors['wcag_definition'].isin(most_common_violations.index)]
unfixed_errors = unfixed_errors[unfixed_errors['wcag_definition'].isin(most_common_violations.index)]

## Most common fixed and unfixed errors <a class="anchor" id="most-common-fixed-unfixed"></a>

In [19]:
print(fixed_errors["wcag_definition"].value_counts().sum())
fixed_errors["wcag_definition"].value_counts()

1051


WCAG 2.4.7 Focus Visible                                                 174
WCAG 4.1.2 Name, Role, Value                                             157
WCAG 2.1.1 Keyboard                                                      132
WCAG 1.4.3 Contrast (minimum)                                            125
WCAG 1.3.1 Info and Relationships                                        119
WCAG 2.4.4 Link Purpose (In Context) and WCAG 4.1.2 Name, Role, Value     90
WCAG 1.1.1 Non-text Content                                               37
WCAG 2.4.3 Focus Order                                                    33
WCAG 1.4.3 Contrast (Minimum)                                             26
WCAG 1.4.10 Reflow                                                        25
WCAG 2.4.1 Bypass Blocks and WCAG 4.1.2 Name, Role, Value                 21
WCAG 2.2.2 Pause, Stop, Hide                                              20
WCAG 2.4.2 Page titled                                                    20

In [20]:
print(unfixed_errors["wcag_definition"].value_counts().sum())
unfixed_errors["wcag_definition"].value_counts()

279


WCAG 2.4.7 Focus Visible                                                 57
WCAG 4.1.2 Name, Role, Value                                             40
WCAG 2.1.1 Keyboard                                                      40
WCAG 1.4.3 Contrast (minimum)                                            36
WCAG 1.3.1 Info and Relationships                                        23
WCAG 2.4.4 Link Purpose (In Context) and WCAG 4.1.2 Name, Role, Value    14
WCAG 2.4.2 Page titled                                                   11
WCAG 1.1.1 Non-text Content                                              10
WCAG 2.4.1 Bypass Blocks and WCAG 4.1.2 Name, Role, Value                 9
WCAG 1.3.1 Info and Relationships and WCAG 4.1.2 Name, Role, Value        8
WCAG 1.4.3 Contrast (Minimum)                                             6
WCAG 1.4.10 Reflow                                                        6
WCAG 2.2.2 Pause, Stop, Hide                                              4
WCAG 3.1.1 L

## Errors most likely to be fixed  <a class="anchor" id="most-likely-to-be-fixed"></a>

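The lists above show the most commonly fixed and unfixed issues by count, from highest to lowest. Raw counts favour the most frequent violations, so the table below compares ratios instead: for each WCAG definition with more than 20 retested errors, it gives the share of errors that were fixed and the share left unfixed, sorted so that the definitions most likely to be fixed come first.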

In [26]:
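# Count fixed and unfixed errors per WCAG definition, aligned on the definition name
temp = pd.DataFrame(
    {
        "unfixed_errors": unfixed_errors["wcag_definition"].value_counts(),
        "fixed_errors": fixed_errors["wcag_definition"].value_counts(),
    }
)
temp = temp.fillna(0)  # a definition missing from one side has a count of 0
temp["total"] = temp["unfixed_errors"] + temp["fixed_errors"]
temp["unfixed_errors_ratio"] = temp["unfixed_errors"] / temp["total"]
temp["fixed_errors_ratio"] = temp["fixed_errors"] / temp["total"]

# Keep definitions with more than 20 retested errors, lowest unfixed ratio first
temp[temp["total"] > 20].sort_values("unfixed_errors_ratio")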

| | unfixed_errors | fixed_errors | total | unfixed_errors_ratio | fixed_errors_ratio |
| --- | --- | --- | --- | --- | --- |
| WCAG 2.4.3 Focus Order | 3.0 | 33 | 36.0 | 0.083333 | 0.916667 |
| WCAG 2.4.4 Link Purpose (In Context) and WCAG 4.1.2 Name, Role, Value | 14.0 | 90 | 104.0 | 0.134615 | 0.865385 |
| WCAG 1.3.1 Info and Relationships | 23.0 | 119 | 142.0 | 0.161972 | 0.838028 |
| WCAG 2.2.2 Pause, Stop, Hide | 4.0 | 20 | 24.0 | 0.166667 | 0.833333 |
| WCAG 1.4.3 Contrast (Minimum) | 6.0 | 26 | 32.0 | 0.1875 | 0.8125 |
| WCAG 1.4.10 Reflow | 6.0 | 25 | 31.0 | 0.193548 | 0.806452 |
| WCAG 4.1.2 Name, Role, Value | 40.0 | 157 | 197.0 | 0.203046 | 0.796954 |
| WCAG 1.1.1 Non-text Content | 10.0 | 37 | 47.0 | 0.212766 | 0.787234 |
| WCAG 1.4.3 Contrast (minimum) | 36.0 | 125 | 161.0 | 0.223602 | 0.776398 |
| WCAG 2.1.1 Keyboard | 40.0 | 132 | 172.0 | 0.232558 | 0.767442 |
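
As a worked check of the ratios in the table above, the sketch below recomputes the fixed ratio for three rows in plain Python. The counts are copied from the table; the helper `fixed_ratio` is introduced here for illustration and does not appear in the notebook.

```python
# (unfixed, fixed) counts copied from three rows of the table above
counts = {
    "WCAG 2.4.3 Focus Order": (3, 33),
    "WCAG 1.3.1 Info and Relationships": (23, 119),
    "WCAG 2.1.1 Keyboard": (40, 132),
}


def fixed_ratio(unfixed, fixed):
    """Share of retested errors that were fixed."""
    return fixed / (unfixed + fixed)


# Rank definitions from most to least likely to be fixed
ranked = sorted(counts, key=lambda name: fixed_ratio(*counts[name]), reverse=True)

for name in ranked:
    print(f"{name}: {fixed_ratio(*counts[name]):.6f}")
# WCAG 2.4.3 Focus Order: 0.916667
# WCAG 1.3.1 Info and Relationships: 0.838028
# WCAG 2.1.1 Keyboard: 0.767442
```

For example, Focus Order has 33 fixed errors out of 36 retested, giving 33 / 36 ≈ 0.917, which matches the first row of the table.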
