# Speed Metrics

Loads and saves the PageSpeed score and page loading times for a list of media websites.

Currently uses a local PHP script that sends requests to the GTmetrix API.

A complete test may last up to 30 minutes.
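
Because a run can take this long, whatever drives the tests has to wait for each report to finish. The sketch below shows one way to write such a waiting loop in Python. It is only an illustration: `wait_for_report`, the `fetch_state` callable and the state names `completed` and `error` are assumptions made for this example, not the PHP script's actual code. The default `max_wait` of 1800 seconds mirrors the 30 minutes mentioned above.

```python
import time


def wait_for_report(fetch_state, interval=5.0, max_wait=1800.0, sleep=time.sleep):
    """Poll fetch_state() until it reports a final state or max_wait elapses.

    fetch_state is a hypothetical callable returning a dict such as
    {'state': 'completed', 'results': {...}}; the state names are assumptions.
    """
    waited = 0.0
    while True:
        status = fetch_state()
        if status.get('state') == 'completed':
            return status
        if status.get('state') == 'error':
            raise RuntimeError(status.get('error', 'test failed'))
        if waited >= max_wait:
            raise TimeoutError('no report after {} seconds'.format(max_wait))
        # wait before asking again, and keep track of the total waiting time
        sleep(interval)
        waited += interval
```

Passing `sleep` as a parameter makes the loop easy to try out without actually waiting.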

Mobile device performance is tested separately with “mobile_metrics.ipynb”, because GTmetrix doesn't offer testing on mobile devices for now.

Feel free to contact me for help: https://www.quel-media.com/about.html#contact

© Paul Ronga under Apache-2 Licence (see LICENCE.txt).

In [1]:
import pandas as pd
import requests
from IPython.display import HTML
import json
import datetime

In [2]:
# change this for your local tester / an external tool
TESTER_URL = 'http://rospo.local/~paul/gtmetrix/medias.php'

In [3]:
# dataframe containing media id, name and URLs
medias = pd.read_csv('df/media_list.csv')

medias.head(2)

Unnamed: 0,media_id,Name,URL_short,URL,URL_mobile
0,19,La Tribune de Genève,tdg.ch,https://www.tdg.ch/,https://m.tdg.ch
1,20,24 heures,24heures.ch,https://www.24heures.ch,https://m.24heures.ch


In [15]:
# remove Konbini (cast to int so the cell can be re-run after the string conversion below)
medias = medias[medias['media_id'].astype(int) < 34].copy()

# media id as string
medias['media_id'] = medias['media_id'].astype(str)

missing_medias = None

In [5]:
# this new dataframe will contain our stats
df_speed = pd.DataFrame(columns=['Name', 'media_id', 'pagespeed_score', 'page_load_time', 'fully_loaded_time', 'report_url'])

In [28]:
# Run again if failed
target_medias = medias
if missing_medias is not None:
    target_medias = missing_medias

for i, row in target_medias.iterrows():
    print('Testing', row['Name'], '...')

    # send name, id and URL of the current media to the tester
    payload = {'media': row[['Name', 'media_id', 'URL']].to_dict()}
    r = requests.post(TESTER_URL, json=payload)

    print(r.text, end='\n\n')

    # the JSON result is the last line of the response
    result = json.loads(r.text.strip().split('\n')[-1])

    # times are divided by 1000 before being stored
    new_row = pd.DataFrame([[
        result['media']['Name'],
        result['media']['media_id'],
        result['results']['pagespeed_score'],
        result['results']['page_load_time'] / 1000,
        result['results']['fully_loaded_time'] / 1000,
        result['results']['report_url']
    ]], columns=['Name', 'media_id', 'pagespeed_score', 'page_load_time', 'fully_loaded_time', 'report_url'])

    # DataFrame.append was removed in pandas 2.0; pd.concat replaces it
    df_speed = pd.concat([df_speed, new_row])

Testing Libération ...
Test started with pDJkTZy1
{"media":{"Name":"Lib\u00e9ration","URL":"https:\/\/www.liberation.fr\/","media_id":"28"},"results":{"onload_time":38384,"first_contentful_paint_time":1078,"page_elements":1631,"report_url":"https:\/\/gtmetrix.com\/reports\/www.liberation.fr\/GH9Me0mr","redirect_duration":50,"first_paint_time":674,"dom_content_loaded_duration":null,"dom_content_loaded_time":2470,"dom_interactive_time":2457,"page_bytes":7833828,"page_load_time":38384,"html_bytes":54929,"fully_loaded_time":38685,"html_load_time":62,"rum_speed_index":1044,"yslow_score":39,"pagespeed_score":0,"backend_duration":6,"onload_duration":2,"connect_duration":6}}

Testing La Côte ...
Test started with 9dEmkuVX
{"media":{"Name":"La C\u00f4te","URL":"https:\/\/www.lacote.ch\/","media_id":"29"},"results":{"onload_time":12102,"first_contentful_paint_time":2154,"page_elements":254,"report_url":"https:\/\/gtmetrix.com\/reports\/www.lacote.ch\/QALDXAu2","redirect_duration":0,"first_paint_
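
The parsing step of the loop can be isolated in a small helper, which makes it easier to check against a saved response. This is a sketch under assumptions: the function name `parse_tester_response` is made up for this example, the sample string is a shortened version of the Libération output above, and the times are assumed to be in milliseconds, as the division by 1000 suggests.

```python
import json


def parse_tester_response(text):
    """Return one flat record from the tester's raw response text."""
    # The JSON document is the last non-empty line; earlier lines are log messages.
    last_line = text.strip().split('\n')[-1]
    result = json.loads(last_line)
    results = result['results']
    return {
        'Name': result['media']['Name'],
        'media_id': result['media']['media_id'],
        'pagespeed_score': results['pagespeed_score'],
        # assumed to be milliseconds in the response; stored as seconds
        'page_load_time': results['page_load_time'] / 1000,
        'fully_loaded_time': results['fully_loaded_time'] / 1000,
        'report_url': results['report_url'],
    }


# shortened copy of the response printed above (only the fields used here)
sample = (
    'Test started with pDJkTZy1\n'
    '{"media":{"Name":"Lib\\u00e9ration","media_id":"28"},'
    '"results":{"pagespeed_score":0,"page_load_time":38384,'
    '"fully_loaded_time":38685,'
    '"report_url":"https:\\/\\/gtmetrix.com\\/reports\\/www.liberation.fr\\/GH9Me0mr"}}\n'
)
record = parse_tester_response(sample)
print(record['Name'], record['page_load_time'], record['report_url'])
```

`json.loads` already decodes the `\/` and `\u00e9` escapes, so the report URL and the name come back in readable form.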

In [30]:
# Use this in case you get e.g. a “The page took too long to load” or “Unable to analyze your site” error.
# It will contain missing medias. You can loop through it in the previous cell.
missing_medias = medias[~medias['media_id'].isin(df_speed['media_id'])]

In [31]:
missing_medias.head(3)

Unnamed: 0,media_id,Name,URL_short,URL,URL_mobile
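
An empty result here means every media already has a row in the results. To see what the filter returns when a test did fail, here is a small sketch on toy data (the two dataframes below are invented for the example and only reuse the notebook's column names):

```python
import pandas as pd

# toy data standing in for the notebook's dataframes
medias = pd.DataFrame({
    'media_id': ['19', '20', '21'],
    'Name': ['La Tribune de Genève', '24 heures', 'Le Temps'],
})
df_speed = pd.DataFrame({'media_id': ['19', '21']})

# ~ inverts the boolean mask: keep rows whose media_id has no result yet
missing_medias = medias[~medias['media_id'].isin(df_speed['media_id'])]
print(missing_medias['Name'].tolist())
```

The comparison only matches when both `media_id` columns hold the same type (strings here). The filtered rows keep their original index labels.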


In [32]:
# To check for a report after an error
# raw string, so that "\/" is not treated as an (invalid) escape sequence
print(r"https:\/\/gtmetrix.com\/reports\/www.lacote.ch\/ZrEyp4s4".replace('\\', ''))

https://gtmetrix.com/reports/www.lacote.ch/ZrEyp4s4


In [34]:
# add current timestamp
df_speed['timestamp'] = datetime.datetime.now().strftime('%Y-%m-%d %H:%M:%S')
df_speed

Unnamed: 0,Name,media_id,pagespeed_score,page_load_time,fully_loaded_time,report_url,timestamp
0,La Tribune de Genève,19,33,12.388,12.67,https://gtmetrix.com/reports/www.tdg.ch/yoxcl2hi,2018-08-03 14:21:16
0,24 heures,20,32,12.228,12.701,https://gtmetrix.com/reports/www.24heures.ch/E...,2018-08-03 14:21:16
0,Le Temps,21,48,9.794,12.257,https://gtmetrix.com/reports/www.letemps.ch/hk...,2018-08-03 14:21:16
0,Le Monde,22,41,8.764,9.306,https://gtmetrix.com/reports/www.lemonde.fr/gz...,2018-08-03 14:21:16
0,RTS info,23,56,5.405,5.847,https://gtmetrix.com/reports/www.rts.ch/ZY8w3pqZ,2018-08-03 14:21:16
0,20 minutes (ch),24,21,12.562,13.875,https://gtmetrix.com/reports/www.20min.ch/J93Q...,2018-08-03 14:21:16
0,Le Matin,25,0,12.875,14.031,https://gtmetrix.com/reports/www.lematin.ch/NE...,2018-08-03 14:21:16
0,Mediapart,26,27,5.343,5.919,https://gtmetrix.com/reports/www.mediapart.fr/...,2018-08-03 14:21:16
0,Le Figaro,27,29,10.482,30.289,https://gtmetrix.com/reports/www.lefigaro.fr/9...,2018-08-03 14:21:16
0,Libération,28,0,38.384,38.685,https://gtmetrix.com/reports/www.liberation.fr...,2018-08-03 14:21:16


In [35]:
outputfile = 'df/archive/speed_metrics_{}.csv'.format( datetime.datetime.now().strftime('%Y-%m-%d') )
print('Saving to {}...'.format(outputfile))

Saving to df/archive/speed_metrics_2018-08-03.csv...


In [36]:
df_speed.to_csv(outputfile) # archive
df_speed.to_csv('df/speed_metrics.csv') # temp file
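
Both calls write the dataframe index as a first, unnamed column. When such a file is read back with default settings, that column shows up as `Unnamed: 0`. A minimal sketch of the round trip, using an in-memory buffer and a one-row toy dataframe rather than the real files:

```python
import io

import pandas as pd

df = pd.DataFrame({'Name': ['RTS info'], 'pagespeed_score': [56]})

buffer = io.StringIO()
df.to_csv(buffer)  # the index is written as a first, unnamed column

buffer.seek(0)
plain = pd.read_csv(buffer)  # the index comes back as a regular column

buffer.seek(0)
restored = pd.read_csv(buffer, index_col=0)  # the first column becomes the index again

print(list(plain.columns))
print(list(restored.columns))
```

Reading with `index_col=0` keeps the stored index out of the data columns.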