The API of Google's PageSpeed Insights service is used to check the loading speed and accessibility of a website.

Details: https://developers.google.com/speed/docs/insights/rest/v5/pagespeedapi/runpagespeed

To use any Google API you need to:

1. have a project in the developer console,
2. enable the required API for that project,
3. (not always) obtain an API key.
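
Once a key exists, it has to reach the notebook somehow. The snippet below is a minimal sketch of one way to do that without writing the key into the code. It assumes the key is stored in an environment variable; the name `PAGESPEED_API_KEY` and the helper `load_api_key` are hypothetical and not part of the original notebook.

```python
import os


def load_api_key(env_var="PAGESPEED_API_KEY"):
    """Return the API key stored in the environment, or None if it is not set.

    The variable name is an assumption made for this sketch.
    """
    key = os.environ.get(env_var, "").strip()
    return key or None


# None here means that no key was configured in the environment.
api_key = load_api_key()
```

Keeping the key out of the notebook also keeps it out of saved cell outputs.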

I chose three metrics as indicators of page loading speed:
* First Contentful Paint (FCP) - the time it takes to load the first visible element (how long after opening the page anything at all becomes visible on it). "A fast FCP helps reassure the user that something is happening."
* Largest Contentful Paint (LCP) - the time it takes to load the largest visible element (how long after opening the page its main content becomes visible). "A fast LCP helps reassure the user that the page is useful."
* Time to Interactive (TTI) - the time after which the page can be fully interacted with.

A specific metric can be requested through the function arguments: pass its abbreviation (case-insensitive) or the value `all` if every metric is needed. By default, all metrics are returned. The result is given in milliseconds.
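
The selection rule described above can be sketched as follows. This is an illustration only: the function name `resolve_metrics` is hypothetical, and the mapping from abbreviations to Lighthouse audit ids is the one shown in the notebook (`fcp`, `lcp`, `tti`).

```python
# Abbreviation -> Lighthouse audit id, as used in the notebook.
METRIC_IDS = {
    "fcp": "first-contentful-paint",
    "lcp": "largest-contentful-paint",
    "tti": "interactive",
}


def resolve_metrics(metric="all"):
    """Map an abbreviation (any case) or 'all' to a list of audit ids."""
    name = metric.strip().lower()
    if name == "all":
        return list(METRIC_IDS.values())
    if name not in METRIC_IDS:
        raise ValueError(
            f"Unknown metric {metric!r}; expected 'all' or one of {sorted(METRIC_IDS)}"
        )
    return [METRIC_IDS[name]]


print(resolve_metrics("LCP"))  # a single metric, case does not matter
print(resolve_metrics())       # default: all three metrics
```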

More about the available metrics: https://web.dev/metrics/

As indicators of site accessibility, all available metrics are currently exported. Each result is a binary variable (1/0 for the given criterion, or None if the criterion does not apply). The metrics are described here: https://web.dev/lighthouse-accessibility/. They are combined into a single index with specific weights, which are described here: https://web.dev/accessibility-scoring/
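
To make the idea of a weighted index concrete, here is a minimal sketch. It assumes the index is a weighted average of the binary scores in which criteria that do not apply (None) are left out. The function name and the weights in the example are illustrative only and are not the actual Lighthouse weights; those are listed on the scoring page linked above.

```python
def weighted_accessibility_score(scores, weights):
    """Weighted mean of binary audit scores; audits scored None are skipped.

    Returns None when no applicable audit has a positive weight.
    """
    total = 0.0
    weight_sum = 0.0
    for audit_id, score in scores.items():
        if score is None:  # criterion not applicable to this page
            continue
        weight = weights.get(audit_id, 0)
        total += weight * score
        weight_sum += weight
    return total / weight_sum if weight_sum else None


# Illustrative inputs: the weights below are made up for the example.
example_scores = {
    "color-contrast": 0,
    "document-title": 1,
    "heading-order": 1,
    "aria-roles": None,
}
example_weights = {
    "color-contrast": 3,
    "document-title": 3,
    "heading-order": 2,
    "aria-roles": 10,
}
print(weighted_accessibility_score(example_scores, example_weights))
```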

`first-contentful-paint` - First Contentful Paint marks the time at which the first text or image is painted. [Learn more](https://web.dev/first-contentful-paint/)

`largest-contentful-paint` - Largest Contentful Paint marks the time at which the largest text or image is painted. [Learn more](https://web.dev/lighthouse-largest-contentful-paint/)

`interactive` - Time to interactive is the amount of time it takes for the page to become fully interactive. [Learn more](https://web.dev/interactive/)

`color-contrast` - Background and foreground colors do not have a sufficient contrast ratio (0) / Background and foreground colors have a sufficient contrast ratio (1)

`document-title` - Search engine users rely on the title to determine whether a page is relevant to their search

`duplicate-id-active` - `[id]` attributes on active, focusable elements are unique

`aria-hidden-body` - `[aria-hidden="true"]` is present on the document `<body>` (0) / is not present on the document `<body>` (1). Using the `aria-hidden="true"` attribute on an element hides the element and all its children from screen readers and other assistive technologies. If the hidden element contains a focusable element, assistive technologies won't read the focusable element, but keyboard users will still be able to navigate to it, which can cause confusion.

`aria-roles` - `[role]` values are not valid. A page fails this audit when it contains an element with an invalid role value, for example when `button` is misspelled as `buton`.

`aria-command-name` - ARIA items do not have accessible names `(button, link, menuitem)`. Users of screen readers and other assistive technologies need information about the behavior and purpose of controls on your web page. Built-in HTML controls like buttons and radio groups come with that information built in. For custom controls you create, however, you must provide the information with ARIA roles and attributes.

`aria-input-field-name` - ARIA items do not have accessible names `(combobox, listbox, searchbox, slider, spinbutton, textbox)`. Users of screen readers and other assistive technologies need information about the behavior and purpose of controls on your web page. Built-in HTML controls like buttons and radio groups come with that information built in. For custom controls you create, however, you must provide the information with ARIA roles and attributes.

`button-name` - Buttons do not have an accessible name

`input-image-alt` - `<input type="image">` elements have `[alt]` text. Informative elements should aim for short, descriptive alternate text. Decorative elements can be ignored with an empty alt attribute.

`form-field-multiple-labels` - No form fields have multiple labels. Labels ensure that form controls are announced properly by assistive technologies like screen readers. Assistive technology users rely on these labels to navigate forms. Mouse and touchscreen users also benefit from labels because the label text makes a larger click target.

`viewport` - `[user-scalable="no"]` is used in the `<meta name="viewport">` element or the `[maximum-scale]` attribute is less than 5. The `user-scalable="no"` parameter for the `<meta name="viewport">` element disables browser zoom on a web page. The `maximum-scale` parameter limits the amount the user can zoom. Both are problematic for users with low vision who rely on browser zoom to see the contents of a web page.

`frame-title` - Users of screen readers and other assistive technologies rely on frame titles to describe the contents of frames. Navigating through frames and inline frames can quickly become difficult and confusing for assistive technology users if the frames are not marked with a title attribute.

`heading-order` - Heading elements are not in a sequentially-descending order. Screen readers have commands to quickly jump between headings or to specific landmark regions. In fact, a survey of screen reader users found that they most often navigate an unfamiliar page by exploring the headings. For example, using an `<h1>` element for your page title and then using `<h3>` elements for the page's main sections will cause the audit to fail because the `<h2>` level is skipped.

In [1]:
import pandas as pd
import requests

file = '2021_lab_websites_checked_v3.csv'  # list of websites selected for analysis
websites_checked_df = pd.read_csv(file)
api_key = 'YOUR_API_KEY'  # placeholder: replace with your own PageSpeed Insights API key

In [3]:
websites_checked_df.info()

<class 'pandas.core.frame.DataFrame'>
RangeIndex: 15386 entries, 0 to 15385
Data columns (total 2 columns):
 #   Column   Non-Null Count  Dtype 
---  ------   --------------  ----- 
 0   ogrn     15386 non-null  int64 
 1   website  15386 non-null  object
dtypes: int64(1), object(1)
memory usage: 240.5+ KB


In [3]:
page = 'https://web.dev/'
api_url = f'https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url={page}&key={api_key}&category=accessibility&category=performance'
api_url

'https://www.googleapis.com/pagespeedonline/v5/runPagespeed?url=https://web.dev/&key=<redacted>&category=accessibility&category=performance'
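
The request URL above is assembled with an f-string, so a page address that has its own query string (`?a=1&b=2`) would be mixed up with the API parameters. A hedged sketch of a safer way to build the same URL with the standard library follows. The helper name `build_api_url` is hypothetical, and omitting the `key` parameter when no key is given is an assumption of this sketch.

```python
from urllib.parse import urlencode

API_ENDPOINT = "https://www.googleapis.com/pagespeedonline/v5/runPagespeed"


def build_api_url(page, api_key=None, categories=("accessibility", "performance")):
    """Build the runPagespeed request URL with percent-encoded parameters."""
    params = [("url", page)]
    if api_key:
        params.append(("key", api_key))
    # The category parameter is repeated once per requested category.
    params.extend(("category", category) for category in categories)
    return f"{API_ENDPOINT}?{urlencode(params)}"


print(build_api_url("https://example.com/?a=1&b=2"))
```

The page address is encoded as a single value, so its `?` and `&` characters no longer interfere with the API's own parameters.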

In [5]:
data = requests.get(api_url).json()
data['lighthouseResult'].keys()



In [36]:
{
    'fcp': 'first-contentful-paint',
    'lcp': 'largest-contentful-paint',
    'tti': 'interactive',
}

{'performance': {'id': 'performance',
  'title': 'Performance',
  'score': 1,
  'auditRefs': [{'id': 'first-contentful-paint',
    'weight': 10,
    'group': 'metrics',
    'acronym': 'FCP',
    'relevantAudits': ['server-response-time',
     'render-blocking-resources',
     'redirects',
     'critical-request-chains',
     'uses-text-compression',
     'uses-rel-preconnect',
     'uses-rel-preload',
     'font-display',
     'unminified-javascript',
     'unminified-css',
     'unused-css-rules']},
   {'id': 'interactive', 'weight': 10, 'group': 'metrics', 'acronym': 'TTI'},
   {'id': 'speed-index', 'weight': 10, 'group': 'metrics', 'acronym': 'SI'},
   {'id': 'total-blocking-time',
    'weight': 30,
    'group': 'metrics',
    'acronym': 'TBT',
    'relevantAudits': ['long-tasks',
     'third-party-summary',
     'third-party-facades',
     'bootup-time',
     'mainthread-work-breakdown',
     'dom-size',
     'duplicated-javascript',
     'legacy-javascript',
     'viewport']},
   

In [55]:
data['lighthouseResult']['audits']['interactive']

{'id': 'interactive',
 'title': 'Time to Interactive',
 'description': 'Time to interactive is the amount of time it takes for the page to become fully interactive. [Learn more](https://web.dev/interactive/).',
 'score': 1,
 'scoreDisplayMode': 'numeric',
 'displayValue': '0.3\xa0s',
 'numericValue': 310,
 'numericUnit': 'millisecond'}

In [None]:
{'id': 'color-contrast',
 'title': 'Background and foreground colors have a sufficient contrast ratio',
 'description': 'Low-contrast text is difficult or impossible for many users to read. [Learn more](https://web.dev/color-contrast/).',
 'score': 1}

In [2]:
def get_key(data, key):
    """Return the selected fields of one Lighthouse audit from the API response."""
    fields = ['id', 'title', 'score', 'numericValue', 'numericUnit']
    d = data['lighthouseResult']['audits'][key]
    return {x: y for x, y in d.items() if x in fields}


def get_lighthouse(page, api_key):
    """Run PageSpeed Insights for one page and return a DataFrame with one row per audit."""
    frames = []
    api_url = 'https://www.googleapis.com/pagespeedonline/v5/runPagespeed'
    # Passing the query as params lets requests percent-encode the page URL
    params = {'url': page, 'key': api_key, 'category': ['accessibility', 'performance']}
    data = requests.get(api_url, params=params).json()
    try:
        for key in ['interactive', 'largest-contentful-paint', 'first-contentful-paint',
                    'color-contrast', 'document-title', 'duplicate-id-active',
                    'aria-hidden-body', 'aria-roles', 'aria-input-field-name', 'aria-command-name',
                    'button-name', 'input-image-alt', 'frame-title', 'heading-order',
                    'form-field-multiple-labels', 'viewport']:
            d = get_key(data, key)
            d['website'] = page
            frames.append(d)
    except KeyError:
        # The response has no Lighthouse result or lacks one of the audits:
        # record a marker row for this website instead of failing
        d = {'id': 'lighthouseError', 'title': 'lighthouseError', 'score': 'lighthouseError',
             'numericValue': 'lighthouseError', 'numericUnit': 'lighthouseError'}
        d['website'] = page
        frames.append(d)
    return pd.DataFrame(frames)

In [65]:
frames = []
for key in ['interactive', 'largest-contentful-paint', 'first-contentful-paint',
            'color-contrast', 'document-title', 'duplicate-id-active', 
            'aria-hidden-body', 'aria-roles', 'aria-input-field-name', 'aria-command-name',
            'button-name', 'input-image-alt', 'frame-title', 'heading-order',
            'form-field-multiple-labels', 'viewport']:
    d = get_key(data, key)
    d['web'] = 'https://web.dev/'
    frames.append(d)

In [66]:
pd.DataFrame(frames)

| | id | title | score | numericValue | numericUnit | web |
|---|---|---|---|---|---|---|
| 0 | interactive | Time to Interactive | 1.0 | 310.0 | millisecond | https://web.dev/ |
| 1 | largest-contentful-paint | Largest Contentful Paint | 0.99 | 620.0 | millisecond | https://web.dev/ |
| 2 | first-contentful-paint | First Contentful Paint | 1.0 | 310.0 | millisecond | https://web.dev/ |
| 3 | color-contrast | Background and foreground colors have a suffic... | 1.0 | | | https://web.dev/ |
| 4 | document-title | Document has a `<title>` element | 1.0 | | | https://web.dev/ |
| 5 | duplicate-id-active | `[id]` attributes on active, focusable element... | | | | https://web.dev/ |
| 6 | aria-hidden-body | `[aria-hidden="true"]` is not present on the d... | 1.0 | | | https://web.dev/ |
| 7 | aria-roles | `[role]` values are valid | 1.0 | | | https://web.dev/ |
| 8 | aria-input-field-name | ARIA input fields have accessible names | | | | https://web.dev/ |
| 9 | aria-command-name | `button`, `link`, and `menuitem` elements have... | | | | https://web.dev/ |


In [67]:
get_lighthouse('http://pomogidobru.ru/', api_key)

| | id | title | score | numericValue | numericUnit | website |
|---|---|---|---|---|---|---|
| 0 | interactive | Time to Interactive | 0.66 | 3697.0 | millisecond | http://pomogidobru.ru/ |
| 1 | largest-contentful-paint | Largest Contentful Paint | 0.21 | 3714.0 | millisecond | http://pomogidobru.ru/ |
| 2 | first-contentful-paint | First Contentful Paint | 0.33 | 1935.0 | millisecond | http://pomogidobru.ru/ |
| 3 | color-contrast | Background and foreground colors do not have a... | 0.0 | | | http://pomogidobru.ru/ |
| 4 | document-title | Document has a `<title>` element | 1.0 | | | http://pomogidobru.ru/ |
| 5 | duplicate-id-active | `[id]` attributes on active, focusable element... | | | | http://pomogidobru.ru/ |
| 6 | aria-hidden-body | `[aria-hidden="true"]` is not present on the d... | 1.0 | | | http://pomogidobru.ru/ |
| 7 | aria-roles | `[role]` values are valid | | | | http://pomogidobru.ru/ |
| 8 | aria-input-field-name | ARIA input fields have accessible names | | | | http://pomogidobru.ru/ |
| 9 | aria-command-name | `button`, `link`, and `menuitem` elements have... | | | | http://pomogidobru.ru/ |


In [27]:
frames = []

for u in websites_checked_df.website.tolist():
    frames.append(get_lighthouse(u, api_key))
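
The loop above stops at the first exception, for example a network error on one of the many websites in the list. A minimal sketch of a more forgiving loop is shown below. The helper `collect_results` is hypothetical: it takes any function that processes one URL and keeps failures in a separate list instead of aborting the run.

```python
def collect_results(urls, fetch):
    """Call fetch(url) for every URL, keeping successes and failures apart."""
    results, failures = [], []
    for url in urls:
        try:
            results.append(fetch(url))
        except Exception as err:  # broad on purpose: record any failure and go on
            failures.append((url, repr(err)))
    return results, failures
```

In the notebook it could be called as `frames, failed = collect_results(websites_checked_df.website.tolist(), lambda u: get_lighthouse(u, api_key))`, after which `failed` lists the websites that need another attempt.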

In [25]:
pd.concat(frames)

| | id | title | score | numericValue | numericUnit | website |
|---|---|---|---|---|---|---|
| 0 | interactive | Time to Interactive | 0.81 | 2985.5 | millisecond | http://ul-lyceum.ru |
| 1 | largest-contentful-paint | Largest Contentful Paint | 0.21 | 3743 | millisecond | http://ul-lyceum.ru |
| 2 | first-contentful-paint | First Contentful Paint | 0.67 | 1333.86 | millisecond | http://ul-lyceum.ru |
| 3 | color-contrast | Background and foreground colors do not have a... | 0 | | | http://ul-lyceum.ru |
| 4 | document-title | Document has a `<title>` element | 1 | | | http://ul-lyceum.ru |
| ... | ... | ... | ... | ... | ... | ... |
| 11 | input-image-alt | `<input type="image">` elements have `[alt]` text | | | | http://aschar.ru |
| 12 | frame-title | `<frame>` or `<iframe>` elements have a title | | | | http://aschar.ru |
| 13 | heading-order | Heading elements appear in a sequentially-desc... | 1 | | | http://aschar.ru |
| 14 | form-field-multiple-labels | No form fields have multiple labels | | | | http://aschar.ru |


In [30]:
pd.concat(frames)#.to_csv('Gryadka_Speed_WCAG.csv', index = False)

| | id | title | score | numericValue | numericUnit | website |
|---|---|---|---|---|---|---|
| 0 | interactive | Time to Interactive | 0.81 | 2985.5 | millisecond | http://ul-lyceum.ru |
| 1 | largest-contentful-paint | Largest Contentful Paint | 0.21 | 3743 | millisecond | http://ul-lyceum.ru |
| 2 | first-contentful-paint | First Contentful Paint | 0.67 | 1333.86 | millisecond | http://ul-lyceum.ru |
| 3 | color-contrast | Background and foreground colors do not have a... | 0 | | | http://ul-lyceum.ru |
| 4 | document-title | Document has a `<title>` element | 1 | | | http://ul-lyceum.ru |
| ... | ... | ... | ... | ... | ... | ... |
| 11 | input-image-alt | `<input type="image">` elements have `[alt]` text | | | | http://dvimo.ru |
| 12 | frame-title | `<frame>` or `<iframe>` elements have a title | | | | http://dvimo.ru |
| 13 | heading-order | Heading elements appear in a sequentially-desc... | 1 | | | http://dvimo.ru |
| 14 | form-field-multiple-labels | No form fields have multiple labels | | | | http://dvimo.ru |
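
The combined result is in long format: one row per audit per website. For comparing websites, one row per website is often easier to work with. The sketch below shows the reshaping on plain dictionaries. The helper `to_wide` is hypothetical, and the sample rows reuse a few values from the outputs above.

```python
def to_wide(rows, value="score"):
    """Turn long-format rows into {website: {audit id: value}}."""
    wide = {}
    for row in rows:
        wide.setdefault(row["website"], {})[row["id"]] = row.get(value)
    return wide


# A few rows in the same shape as the notebook's output.
sample_rows = [
    {"id": "interactive", "score": 0.81, "website": "http://ul-lyceum.ru"},
    {"id": "color-contrast", "score": 0, "website": "http://ul-lyceum.ru"},
    {"id": "interactive", "score": 0.66, "website": "http://pomogidobru.ru/"},
    {"id": "color-contrast", "score": 0.0, "website": "http://pomogidobru.ru/"},
]
print(to_wide(sample_rows))
```

With the real data, the rows can be obtained with `pd.concat(frames).to_dict('records')`, and the resulting dictionary can be turned back into a table with `pd.DataFrame.from_dict(wide, orient='index')`.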
