# Read data from the cloud

There's an open-data fork of the Groningen project in a Software Underground S3 bucket on AWS.

Let's try to read data from it.

First you'll need to do this in your environment:

    pip install boto3
    
    
## No anonymous access

As far as I can tell, it is no longer possible to use `boto3` anonymously (that is, without AWS credentials) to read data, get file listings, etc., from an S3 bucket, even if the bucket is public.

I used the following code, with my own credentials, to make a file listing, which is on AWS here: https://swung-hosted.s3.ca-central-1.amazonaws.com/groningen/FILENAMES.txt

    import boto3
    import secrets  # Not the stdlib module: a local secrets.py holding the AWS keys.

    session = boto3.Session(
        aws_access_key_id=secrets.AWS_ACCESS_KEY_ID,
        aws_secret_access_key=secrets.AWS_SECRET_ACCESS_KEY,
    )

    s3 = session.resource('s3')

    bucket = s3.Bucket('swung-hosted')

    # Write one object key per line.
    with open('../data/FILENAMES.txt', 'wt') as f:
        for obj in bucket.objects.all():
            f.write(obj.key + '\n')
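The published listing can stand in for a bucket listing when no credentials are at hand. Here is a minimal sketch, assuming the listing holds one object key per line (as the code above writes it), that it is UTF-8 encoded, and that each key is reachable at the bucket's base URL plus the key. The helper names `fetch_listing`, `filter_keys` and `key_to_url` are hypothetical.

```python
from urllib.parse import quote
from urllib.request import urlopen

# Base URL of the bucket, taken from the links in this notebook.
BASE_URL = "https://swung-hosted.s3.ca-central-1.amazonaws.com/"
LISTING_URL = BASE_URL + "groningen/FILENAMES.txt"


def fetch_listing(url=LISTING_URL):
    """Download the listing and return it as text (needs network access)."""
    with urlopen(url) as response:
        return response.read().decode("utf-8")  # Assumes UTF-8.


def filter_keys(listing_text, prefix="", suffix=""):
    """Return the keys that start with `prefix` and end with `suffix`."""
    keys = (line.strip() for line in listing_text.splitlines())
    return [k for k in keys if k and k.startswith(prefix) and k.endswith(suffix)]


def key_to_url(key, base_url=BASE_URL):
    """Build a URL for a key, percent-encoding characters such as spaces."""
    return base_url + quote(key, safe="/")


# Example (uncomment to hit the network):
# las_keys = filter_keys(fetch_listing(), prefix="groningen/Well_data/", suffix=".las")
# las_urls = [key_to_url(k) for k in las_keys]
```

Keeping the download separate from the filtering means the listing only has to be fetched once, however many times it is searched.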

## Read directly from a URL

Some libraries can read a file straight from its URL:

    import pandas as pd

    url = "https://swung-hosted.s3.ca-central-1.amazonaws.com/groningen/Formation_tops/Groningen__Formation_tops__EPSG_28992.csv"

    df = pd.read_csv(url)
    df.head()

| Unnamed: 0 | X | Y | Z | TWT picked | TWT auto | Geological age | MD | Type | Surface | Well | ... | Used by dep.conv. | Used by geo mod | Zone log | Edited by user | Symbol | Locked to fault | FLOAT,Continuous | FLOAT,Carb_net2 | FLOAT,SH_WS_belowcontact | PVD auto |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 256256.0 | 591586.0 | -2824.0 | | -1875.9 | | 2831.5 | Horizon | USS_3.1_T | AMR- 1 | ... | False | False | 0.0 | False | 0.0 | 0.0 | | | | -2824.0 |
| 1 | 256634.0 | 591613.0 | -2790.0 | | | | 2818.75 | Horizon | USS_3.1_T | AMR- 2 | ... | False | True | 0.0 | False | 0.0 | 0.0 | | | | -2790.0 |
| 2 | 256627.0 | 591617.0 | -2789.0 | | | | 2828.31 | Horizon | USS_3.1_T | AMR- 3 | ... | False | True | 0.0 | False | 0.0 | 0.0 | | | | -2789.0 |
| 3 | 256583.0 | 591606.0 | -2786.0 | | | | 2829.86 | Horizon | USS_3.1_T | AMR- 4 | ... | False | True | 0.0 | False | 0.0 | 0.0 | | | | -2786.0 |
| 4 | 256533.0 | 591778.0 | -2791.0 | | | | 2888.83 | Horizon | USS_3.1_T | AMR- 5B | ... | False | True | 0.0 | False | 0.0 | 0.0 | | | | -2791.0 |


    from welly import Well

    url = "https://swung-hosted.s3.ca-central-1.amazonaws.com/groningen/Well_data/Oude_Pekela_field/OPK-__1.las"

    w = Well.from_las(url)
    w



| OPK- 1 11000080112101 | |
|---|---|
| crs | CRS({}) |
| location | |
| province | |
| api | |
| td | |
| data | CAL, DENS, FACIES, FACIES_PP, FACIES_PP_ED, FLDE, FLGR, FLSO, GENERALTIME1, GR, NET_NOV14, NEUT, PERMNET_2015, PERMNET_NOV14, PORNET_NOV14, RESD, RESM, SH, SON |


## Download with requests

For everything else, the `requests` library is nice. Streaming the response in chunks avoids loading the whole file into memory:

    import requests

    # NB This file is about 12GB.
    url = "https://swung-hosted.s3.ca-central-1.amazonaws.com/groningen/Seismic_Volume/R3136_15UnrPrDMkD_Full_D_Rzn_RMO_Shp_vG.SEGY"

    with requests.get(url, stream=True) as r:
        r.raise_for_status()
        with open('../data/NAM/Seismic_Volume/R3136_15UnrPrDMkD_Full_D_Rzn_RMO_Shp_vG.SEGY', 'wb') as f:
            for chunk in r.iter_content(chunk_size=16_384):  # Bytes in chunk.
                f.write(chunk)
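If `requests` is not installed, the standard library can do a similar chunked download. This is a sketch of that alternative, not the method used above: it swaps in `urllib.request` and `shutil.copyfileobj`, names the output file after the last segment of the URL, and creates the target folder if it is missing. The function name `download` and its parameters are hypothetical.

```python
import shutil
from pathlib import Path, PurePosixPath
from urllib.parse import unquote, urlsplit
from urllib.request import urlopen


def download(url, dest_dir=".", chunk_size=16_384):
    """Stream `url` into `dest_dir` and return the path of the saved file."""
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)  # Create missing folders.

    # Name the local file after the last segment of the URL path.
    filename = unquote(PurePosixPath(urlsplit(url).path).name)
    dest = dest_dir / filename

    # Copy in chunks so the whole file is never held in memory.
    with urlopen(url) as response, open(dest, "wb") as f:
        shutil.copyfileobj(response, f, length=chunk_size)
    return dest


# Example (uncomment to download; the URL is any file from the bucket):
# download(url, dest_dir="../data/NAM/Seismic_Volume")
```

Note that `urlopen` raises `urllib.error.HTTPError` on a 4xx or 5xx response from an HTTP server, which plays the role of `raise_for_status()` here.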
