Met Office AWS Earth data - Subscribing to data
=============

The Met Office has made data from its world-leading weather models available for research purposes. On this page, we'll go over how to subscribe to the data feeds.

For more information on the available data and how to load it, see the Getting Started example.

## Overview

### Storage in S3

The Met Office data files are stored in the Amazon Web Services (AWS) [Simple Storage Service (S3)](https://aws.amazon.com/s3/), which is an object store. Files (known as objects) are stored in groups (known as buckets). We have a separate bucket for each of the four datasets we are publishing (UK and Global Deterministic, and UK and Global Ensembles).

While object stores are inexpensive and highly scalable, they suffer when it comes to listing the objects available in a bucket. Listing a bucket can take a very long time, and as this data is being uploaded very rapidly it is impractical to expect to discover new data by listing the bucket.

### Topics and Queues

To solve this we have provided topics using the AWS [Simple Notification Service (SNS)](https://aws.amazon.com/sns/), which you can subscribe to and which will notify you when new objects are created. Objects are held in the bucket for 7 days after the notification is sent, to give you a chance to perform analysis using them, and are then deleted.

The easiest way to subscribe to a topic is using the AWS [Simple Queue Service (SQS)](https://aws.amazon.com/sqs/). You can create a queue in your own AWS account and then subscribe it to our topics. The queue will start storing all of the notifications that we emit, ready for you to inspect and process.

## Creating a queue

To create a queue, log in to your AWS account, navigate to the SQS page and click "Get Started Now".

![SQS Getting Started Page](https://images.informaticslab.co.uk/misc/b7e227ee082349173cd1d63308ef7e60.png)

Then enter a name for your queue and select a queue type and click "Quick Create Queue". We recommend you choose a standard queue.

_FIFO queues ensure that messages are delivered in the order they were received. However, the notifications are not guaranteed to be in the order that the data was generated, so there is little benefit._

![Create a queue](https://images.informaticslab.co.uk/misc/8dd77faa5603cb6a9bc578ef06d5edfc.png)

## Subscribing to some data

Once you have your queue you need to subscribe it to our SNS topics. You can do this by right-clicking on the queue and selecting "Subscribe Queue to SNS Topic".

![Subscribe queue](https://images.informaticslab.co.uk/misc/e52ef3183fc946ea62d9c5f5a415fed0.png)

Then select "EU (London)" as the region, as all of our infrastructure for AWS Earth is in the UK, and paste in the SNS Amazon Resource Name (ARN) for the topic you wish to subscribe to. For the list of dataset SNS ARNs, see our page on the [AWS Open Data Registry](https://registry.opendata.aws/).

Let's subscribe to the UKV topic, which is the high resolution UK deterministic weather simulation feed, using the ARN `arn:aws:sns:eu-west-2:021908831235:aws-earth-mo-atmospheric-ukv-prd`.

_Hint: You can subscribe one queue to multiple topics. Just repeat this step with different ARNs._

![Subscribe to UKV](https://images.informaticslab.co.uk/misc/117548fcbc816663d08f9d66ffd86333.png)
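If you would rather script this step than click through the console, the following is a minimal sketch of how it might look with boto3. It assumes the queue already exists in your own account and lives in `eu-west-2`. The helper names `build_queue_policy` and `subscribe_queue` are hypothetical, and the queue URL and ARN you pass in are your own. The sketch sets an access policy on the queue so that the topic is allowed to send messages to it; without such a policy, notifications would not be delivered.

```python
import json

# Topic ARN for the UKV feed, as given above.
UKV_TOPIC_ARN = "arn:aws:sns:eu-west-2:021908831235:aws-earth-mo-atmospheric-ukv-prd"


def build_queue_policy(queue_arn, topic_arn):
    """Build an SQS access policy that lets one SNS topic send to one queue."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "sns.amazonaws.com"},
                "Action": "sqs:SendMessage",
                "Resource": queue_arn,
                # Only accept messages that originate from this topic.
                "Condition": {"ArnEquals": {"aws:SourceArn": topic_arn}},
            }
        ],
    }


def subscribe_queue(queue_url, queue_arn, topic_arn=UKV_TOPIC_ARN, region="eu-west-2"):
    """Attach the policy and subscribe the queue (hypothetical helper).

    Assumption: the queue is in the same region as the topic.
    Note that set_queue_attributes replaces any existing queue policy.
    """
    import boto3  # imported here so the policy helper works without boto3 installed

    sqs = boto3.client("sqs", region_name=region)
    sns = boto3.client("sns", region_name=region)

    policy = build_queue_policy(queue_arn, topic_arn)
    sqs.set_queue_attributes(
        QueueUrl=queue_url,
        Attributes={"Policy": json.dumps(policy)},
    )
    response = sns.subscribe(TopicArn=topic_arn, Protocol="sqs", Endpoint=queue_arn)
    return response["SubscriptionArn"]
```

To subscribe one queue to several topics this way, the policy would need one statement (or one `aws:SourceArn` entry) per topic, since each call above overwrites the previous policy.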

Once your subscription is set up you must wait for some new data to be generated; you will then start to see messages build up on the queue.

## Processing some messages

There are many ways to process messages from a queue, and it is up to you how you would like to manage it. However, to get you started, here is an example of using the Python library boto3 to read messages from the queue.

In [2]:
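import boto3
# Create an SQS client in the region your queue lives in (eu-west-2 in the example URL below).
sqs = boto3.client('sqs', region_name='eu-west-2')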

First let's get a message from our queue. We need to provide the queue URL which will consist of the region, your account number and the queue name. You can find this in the SQS control panel when you click on your queue. We will also limit the request to one message, but you can increase this if you like.

In [26]:
messages = sqs.receive_message(
    QueueUrl='https://sqs.eu-west-2.amazonaws.com/123456123456/my-awesome-queue',
    MaxNumberOfMessages=1)
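For anything beyond a quick look you will probably want to fetch messages in batches and remove each one from the queue once it has been handled. The sketch below is one way this could be done, not a prescribed approach. The function names are hypothetical; `sqs` is a boto3 SQS client like the one created above, and `handler` is any function of your own that takes a single message dictionary. SQS allows at most 10 messages per request and a long-poll wait of at most 20 seconds.

```python
def build_queue_url(region, account_id, queue_name):
    """Assemble a queue URL from its parts, following the pattern shown above."""
    return f"https://sqs.{region}.amazonaws.com/{account_id}/{queue_name}"


def receive_batch(sqs, queue_url, max_messages=10, wait_seconds=20):
    """Long-poll the queue and return a (possibly empty) list of messages."""
    response = sqs.receive_message(
        QueueUrl=queue_url,
        MaxNumberOfMessages=max_messages,  # SQS accepts 1 to 10
        WaitTimeSeconds=wait_seconds,      # long polling, up to 20 seconds
    )
    # The 'Messages' key is absent when nothing arrived within the wait time.
    return response.get("Messages", [])


def process_and_delete(sqs, queue_url, messages, handler):
    """Run handler on each message, then delete it so it is not redelivered."""
    for message in messages:
        handler(message)
        sqs.delete_message(
            QueueUrl=queue_url,
            ReceiptHandle=message["ReceiptHandle"],
        )
    return len(messages)
```

Deleting only after the handler returns means that a message whose processing fails stays on the queue and becomes visible again later.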

Now let's extract our single message from our list of messages.

In [19]:
[message] = messages['Messages']
message

{'MessageId': 'a355a119-2a2f-4e24-b8bd-3b7a3be3fef4',
 'ReceiptHandle': 'AQEB3FLQhPVSkFJUUa61qL7wvcaLdJVXqgD8PH+UWPte05OtPS5NdlQbhJb+PJ3LbejB6FBDz3XEVqz4hqfDNAZAzJQCXzprGJLWna5QayMaNbF5Hq4BEzfQs+jTRvRqn/b++tb5xPKk37LT0ThV7j2wg0dF5JefQlpiq8+zk0logtUYHkNzth0V41qsnr8rZGpMWG8koue+PBzYj0QR6vtpQ4Kj0YCq18GqGqK3yjOSTf8KqIFxKb/9S82oSrptmLP67vmAzfqWio2GchQevK+R7wKHYYX3DSVNzrmpRr08OpmYmBSoMq8kDyNtaBgfEBDK4i5L9RwLKP2OUvT35n4H1wfxGRz+dgcqnYI5ihpEb0d6aa+zH0yhEwwdTyAGIKLCmBd3/+STG9BDhVNyXf5PjA==',
 'MD5OfBody': '516a07c3e239fd2c84a74f06aab219d6',
 'Body': '{\n  "Type" : "Notification",\n  "MessageId" : "28608723-cb8b-5e9b-9b39-d3afba035edf",\n  "TopicArn" : "arn:aws:sns:eu-west-2:021908831235:aws-earth-mo-atmospheric-ukv-prd",\n  "Message" : "{\\"model\\": \\"mo-atmospheric-ukv-prd\\", \\"ttl\\": 1545316805, \\"time\\": \\"2018-12-12T22:00:00Z\\", \\"bucket\\": \\"aws-earth-mo-atmospheric-ukv-prd\\", \\"created_time\\": \\"2018-12-11T16:26:16Z\\", \\"name\\": \\"wind_speed\\", \\"object_size\\": 1007

Then we need to extract the SNS notification, which is stored in the `Body` attribute as a JSON-encoded string.

In [22]:
import json
notification = json.loads(message['Body'])
notification

{'Type': 'Notification',
 'MessageId': '28608723-cb8b-5e9b-9b39-d3afba035edf',
 'TopicArn': 'arn:aws:sns:eu-west-2:021908831235:aws-earth-mo-atmospheric-ukv-prd',
 'Message': '{"model": "mo-atmospheric-ukv-prd", "ttl": 1545316805, "time": "2018-12-12T22:00:00Z", "bucket": "aws-earth-mo-atmospheric-ukv-prd", "created_time": "2018-12-11T16:26:16Z", "name": "wind_speed", "object_size": 100758303.0, "forecast_period": "111600", "forecast_reference_time": "2018-12-11T15:00:00Z", "pressure": "100000.0 95000.0 92500.0 90000.0 85000.0 80000.0 75000.0 70000.0 60000.0 50000.0 45000.0 40000.0 37500.0 35000.0 32500.0 30000.0 27500.0 25000.0 22500.0 20000.0 17500.0 15000.0 12500.0 10000.0 7000.0 5000.0 4000.0 3000.0 2000.0 1000.0", "forecast_period_units": "seconds", "pressure_units": "Pa", "key": "54354b4b317a019b8f19f2660c2ae4f0cf5a8d26.nc"}',
 'Timestamp': '2018-12-13T14:40:08.668Z',
 'SignatureVersion': '1',
 'Signature': 'DLW2sUCvx7f8eFQe7Wgdu3/65c9gPh8A4tUnLaMDHjjgC4KDK8ScbL/3HLfQi6cBKKZsulr4

Finally, we can extract the information about the S3 object, which is stored in the notification's `Message` attribute as a JSON-encoded string.

In [24]:
s3_object = json.loads(notification['Message'])
s3_object

{'model': 'mo-atmospheric-ukv-prd',
 'ttl': 1545316805,
 'time': '2018-12-12T22:00:00Z',
 'bucket': 'aws-earth-mo-atmospheric-ukv-prd',
 'created_time': '2018-12-11T16:26:16Z',
 'name': 'wind_speed',
 'object_size': 100758303.0,
 'forecast_period': '111600',
 'forecast_reference_time': '2018-12-11T15:00:00Z',
 'pressure': '100000.0 95000.0 92500.0 90000.0 85000.0 80000.0 75000.0 70000.0 60000.0 50000.0 45000.0 40000.0 37500.0 35000.0 32500.0 30000.0 27500.0 25000.0 22500.0 20000.0 17500.0 15000.0 12500.0 10000.0 7000.0 5000.0 4000.0 3000.0 2000.0 1000.0',
 'forecast_period_units': 'seconds',
 'pressure_units': 'Pa',
 'key': '54354b4b317a019b8f19f2660c2ae4f0cf5a8d26.nc'}

This dictionary of information should provide you with some insight into what the file contains and should also look very similar to the example notification used in the "Getting Started" guide.
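The two decoding steps above can be wrapped into a small helper. The sketch below does that and also derives an S3 location from the `bucket` and `key` fields. The function names are hypothetical. Treating `ttl` as a Unix timestamp (in seconds) for when the object expires is an assumption, although for the sample above it works out to roughly seven days after the notification timestamp, which matches the retention period described earlier.

```python
import json
from datetime import datetime, timezone


def parse_sqs_message(message):
    """Unwrap SQS message -> SNS notification -> S3 object description."""
    notification = json.loads(message["Body"])
    return json.loads(notification["Message"])


def s3_uri(s3_object):
    """Build an s3:// URI from the bucket and key fields."""
    return f"s3://{s3_object['bucket']}/{s3_object['key']}"


def expiry_time(s3_object):
    """Interpret ttl as a Unix timestamp in seconds (an assumption)."""
    return datetime.fromtimestamp(s3_object["ttl"], tz=timezone.utc)


# A cut-down message using a few of the fields from the example above.
sample_message = {
    "Body": json.dumps({
        "Type": "Notification",
        "Message": json.dumps({
            "bucket": "aws-earth-mo-atmospheric-ukv-prd",
            "key": "54354b4b317a019b8f19f2660c2ae4f0cf5a8d26.nc",
            "name": "wind_speed",
            "ttl": 1545316805,
        }),
    })
}

sample_object = parse_sqs_message(sample_message)
print(s3_uri(sample_object))
print(expiry_time(sample_object).isoformat())
```

A helper like `parse_sqs_message` can be used as the first step of whatever processing you run on each message you receive.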