AWS Cloud Service Metadata
Here you will find metadata in a computer-friendly format about Amazon Web Services' (AWS) cloud infrastructure such as:
- When a service (such as kinesis) became available
- Which services are regionless (iam), which operate in specific regions (s3), and which depend on availability zones within a region
- What services are available in each region and when they became available in those regions
- How many availability zones are in a region (and when there were changes)
The dataset is most easily usable as a set of the following CSV files:
- aws/services.csv contains a "main" list of services, whether they operate in regions, and whether they depend on zones.
- aws/services-state.csv records when a service was introduced and what its availability was at that point (limited preview / limited beta, public beta, or general availability/GA).
- aws/services-regions.csv lists, for services that operate in specific regions, where they are available and when they became available in each region.
- aws/zones.csv records how many availability zones each region contains (and when that number changed).
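As a rough illustration of how aws/services-regions.csv could be queried, here is a minimal Python sketch. It relies on the date, service and region columns that appear in the sample under "Data format description". The function name services_in_region is made up for this example, and the sketch assumes that every row marks a service becoming available (it does not look for removals).

```python
import csv
import io
from datetime import date


def services_in_region(lines, region, as_of):
    """Return sorted service ids listed for `region` with a date on or before `as_of`."""
    found = set()
    for row in csv.DictReader(lines):
        if row["region"] == region and date.fromisoformat(row["date"]) <= as_of:
            found.add(row["service"])
    return sorted(found)


# Header and data row copied from the sample of aws/services-regions.csv.
sample = io.StringIO(
    "date,service,region,accurate,description\n"
    "2004-11-03,sqs,us-east-1,1,http://aws.amazon.com/about-aws/whats-new/"
    "2004/11/03/introducing-the-amazon-simple-queue-service/\n"
)
print(services_in_region(sample, "us-east-1", date(2010, 1, 1)))
```

With the real file, pass an open file object instead of the in-memory sample, for example open("aws/services-regions.csv", newline="").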
This is just a dataset repository. There is nothing functional (no code) in this repository. You are free to use the data as long as you give me credit for it, as defined by the CC BY 4.0 license.
Data format description
Here are the first few lines of each file as samples:

==> aws/services-regions.csv <==
date,service,region,accurate,description
2004-11-03,sqs,us-east-1,1,http://aws.amazon.com/about-aws/whats-new/2004/11/03/introducing-the-amazon-simple-queue-service/

==> aws/services-state.csv <==
date,service,limited,beta,ga,accurate,description
2004-10-04,alexa,0,0,1,1,"Announcement http://aws.amazon.com/about-aws/whats-new/2004/10/04/introducing-the-alexa-web-information-service/ doesn't state beta, and earliest product page accessible https://web.archive.org/web/20060223102040/http://www.amazon.com/gp/browse.html?node=12782661 also doesn't state beta (GA). Since no contradictory information, assume GA on release."

==> aws/services.csv <==
service,name,hasregions,haszones,description
alexa,Alexa Web Information Service,0,0,http://aws.amazon.com/awis/

==> aws/zones.csv <==
date,region,zones,accurate,description
2007-10-22,us-east-1,1,1,Since zones were introduced 2008-03-27 before that there were ”no” zones – so one only.

- All files start with a header row.
- All dates are ISO 8601 format (YYYY-MM-DD). Dates only.
- Boolean columns (haszones among them) are stored as 1 for a true value and 0 for a false value.
- The description field is for human consumption only and contains further details such as links to announcements and rationale for judgement calls.
- Zone information is included only for zones that are actually available for use as beta or GA, not for zones with limited access (like China as of 2014-06-23). Note that govcloud is included because, although it has limited access (US public sector only), it is generally available to all those who are allowed to use it.
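The conventions above can be applied when loading a file. The following is a minimal Python sketch, not part of the dataset: load_rows and BOOLEAN_COLUMNS are names invented here, and treating limited, beta and ga as 0/1 flags is an assumption based on the sample rows rather than something the text states.

```python
import csv
import io
from datetime import date

# Columns treated as 0/1 flags. hasregions, haszones and accurate follow the
# description above; limited, beta and ga are an assumption from the samples.
BOOLEAN_COLUMNS = {"hasregions", "haszones", "limited", "beta", "ga", "accurate"}


def load_rows(lines):
    """Parse CSV lines into dicts with dates, flags and zone counts converted."""
    rows = []
    for raw in csv.DictReader(lines):  # the first line is the header row
        row = dict(raw)
        for name, value in raw.items():
            if name == "date":
                row[name] = date.fromisoformat(value)  # YYYY-MM-DD
            elif name in BOOLEAN_COLUMNS:
                row[name] = value == "1"
            elif name == "zones":
                row[name] = int(value)
        rows.append(row)
    return rows


# Header and data row copied from the sample of aws/services.csv.
sample = io.StringIO(
    "service,name,hasregions,haszones,description\n"
    "alexa,Alexa Web Information Service,0,0,http://aws.amazon.com/awis/\n"
)
for row in load_rows(sample):
    print(row["service"], row["hasregions"], row["haszones"])
```

The description column is left as a plain string, since it is meant for human readers only.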
The names and format of these data files may change in the future. Consider the dataset as a beta release :-)
There's also an aws/updated.csv file, which contains just one column and one data row. Its single value is the date up to which this dataset is considered current.
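Reading that date could look like the sketch below. It assumes the file has a header row like the other files, followed by the single date. The function name read_updated is invented for this example, and the header name and date in the in-memory sample are placeholders, not values taken from the real file.

```python
import csv
import io
from datetime import date


def read_updated(lines):
    """Return the single date stored in aws/updated.csv-style input."""
    reader = csv.reader(lines)
    next(reader)                # skip the header row (assumed to be present)
    first_row = next(reader)    # the one and only data row
    return date.fromisoformat(first_row[0])


# Placeholder content: the column name and the date are illustrative only.
sample = io.StringIO("updated\n2014-06-23\n")
print(read_updated(sample).isoformat())
```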
Why and how?
I needed this metadata for a research project and couldn't find suitable data online. Collecting it took quite a bit of effort (about 4-5 days of work), so I am sharing it in the hope that it saves someone else the work. (I'd be delighted to hear if you find it useful.)
To collect the data I primarily went through all news items from AWS. I used the Internet Archive's Wayback Machine a lot to check older versions of product pages, FAQs, infrastructure information etc. I also searched through AWS developer forum announcements. Jeff Barr's posts on the AWS blog were often a useful source of information as well.
If you spot an error in the data, please do one of:
- Edit the master version, run make to update the CSV versions of the files, and submit a pull request for those changes. (You'll need unoconv installed for this to work.) or
- Edit the CSV in GitHub's editor and send it as a pull request or
- Open an issue, send email, or otherwise get in touch with the updated information.
All dates are in ISO 8601 format (YYYY-MM-DD). Many files contain an accurate column, which should be 1 (TRUE value) if the date can be confirmed from sources and 0 if it is interpolated from the information available.
Please note that all data must be backed up with references. Use the description column for references and any rationale.
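Before sending a change, the rules above can be checked mechanically. This is only a sketch under stated assumptions: check_rows is a hypothetical helper, not a tool shipped with the dataset, and it applies only to files that have date, accurate and description columns. It cannot tell whether a description really contains a reference, only whether it is empty.

```python
import csv
import io
import re
from datetime import date


def check_rows(lines):
    """Return (line_number, problem) pairs for rows that break the rules above."""
    problems = []
    reader = csv.DictReader(lines)
    for row in reader:
        where = reader.line_num  # line on which the current row ended
        value = row.get("date") or ""
        valid = re.fullmatch(r"\d{4}-\d{2}-\d{2}", value) is not None
        if valid:
            try:
                date.fromisoformat(value)  # rejects dates such as 2014-02-30
            except ValueError:
                valid = False
        if not valid:
            problems.append((where, "date is not a valid YYYY-MM-DD value"))
        if row.get("accurate") not in ("0", "1"):
            problems.append((where, "accurate must be 0 or 1"))
        if not (row.get("description") or "").strip():
            problems.append((where, "description is empty"))
    return problems


# A deliberately broken row, made up for illustration.
broken = io.StringIO(
    "date,region,zones,accurate,description\n"
    "22/10/2007,us-east-1,1,yes,\n"
)
for line_number, problem in check_rows(broken):
    print(line_number, problem)
```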
If you want to add entirely new data (as opposed to fixing errors or providing more accurate date information), please note that, to keep the aws/updated.csv file meaningful, all changes up to the latest date must be included in the dataset.
AWS Cloud Service Metadata by Santeri Paavolainen is licensed under a Creative Commons Attribution 4.0 International License.