The HTTP Archive provides information about website performance, such as the number of HTTP requests, the use of gzip, and the amount of JavaScript. This information is recorded over time, revealing trends in how the Internet is performing. Built using Open Source software, the code and data are available to everyone, allowing researchers large and small to work fr…

The HTTP Archive tracks how the Web is built

This repo contains the source code powering the HTTP Archive data collection.

What is the HTTP Archive?

Successful societies and institutions recognize the need to record their history. Doing so provides a way to review the past, find explanations for current behavior, and spot emerging trends. In 1996, Brewster Kahle realized the cultural significance of the Internet and the need to record its history. As a result, he founded the Internet Archive, which collects and permanently stores the Web's digitized content.

In addition to the content of web pages, it's important to record how this digitized content is constructed and served. The HTTP Archive provides this record. It is a permanent repository of web performance information such as the size of pages, failed requests, and the technologies used. This performance information allows us to see trends in how the Web is built and provides a common data set from which to conduct web performance research.
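
As a rough illustration of the kind of performance information described above, the sketch below summarizes a single page load from a HAR (HTTP Archive format) capture. It is not this repo's code: the function name `summarize_har`, the sample data, and the rule that a status of 0 or of 400 and above counts as a failed request are all assumptions made for the example. Only the `log.entries[].response.status` and `response.bodySize` fields come from the HAR format itself.

```python
def summarize_har(har: dict) -> dict:
    """Summarize one page load: request count, bytes transferred, failed requests.

    Hypothetical helper for illustration; not part of the HTTP Archive code base.
    """
    entries = har.get("log", {}).get("entries", [])
    total_bytes = 0
    failed = 0
    for entry in entries:
        response = entry.get("response", {})
        status = response.get("status", 0)
        body_size = response.get("bodySize", -1)
        # HAR uses -1 when the body size is unknown, so only count positive sizes.
        if body_size > 0:
            total_bytes += body_size
        # Assumption: no response (status 0) or any 4xx/5xx status is a failure.
        if status == 0 or status >= 400:
            failed += 1
    return {
        "requests": len(entries),
        "bytes": total_bytes,
        "failed_requests": failed,
    }


# Made-up sample capture with three requests, one of which fails.
sample_har = {
    "log": {
        "entries": [
            {"response": {"status": 200, "bodySize": 1200}},
            {"response": {"status": 200, "bodySize": 300}},
            {"response": {"status": 404, "bodySize": -1}},
        ]
    }
}

summary = summarize_har(sample_har)
print(summary)
```

Recording a summary like this for the same set of URLs on each crawl is what would make trends over time visible, although the actual metrics and storage used by the project may differ.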