## AWS S3 and Deepnote
> Integrating AWS S3 with Deepnote

#### Prerequisites

1. [Sign up for an AWS account](https://docs.aws.amazon.com/AmazonS3/latest/gsg/SigningUpforS3.html).
> Make sure you **create an IAM user** and **copy its credentials (save the .csv file).**

2. [Create an S3 bucket](https://docs.aws.amazon.com/AmazonS3/latest/gsg/CreatingABucket.html).
> Place it in a region close to you, e.g. ```eu-west-2```.

#### S3 Integration with Deepnote

1. Navigate to the [Integrations tab](https://docs.deepnote.com/features/integrations) of your Deepnote project.

    > Make sure you have created an IAM user and downloaded its security credentials. See the [getting started docs](https://docs.aws.amazon.com/AmazonS3/latest/gsg/SigningUpforS3.html) if you haven't already.

2. Add [Amazon S3 Integration](https://docs.deepnote.com/integrations/aws-s3).

    > Use the IAM user's Access Key ID and Secret Access Key.
    >
    > Make sure you have [created a bucket](https://docs.aws.amazon.com/AmazonS3/latest/gsg/CreatingABucket.html) and use that bucket's name.
    >
    > You can name the integration whatever you like, e.g. ```sdsacademys3```.

3. Once connected, click ```how to use``` to get started.

    > Deepnote exposes data storage services such as S3 by mounting them into your project under the ```/datasets``` directory (more than one integration can be mounted at a time). You can then work with the S3 bucket as if it were a directory within your project. Check your S3 bucket as you go along to see the changes.
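
To check which integrations are currently mounted, a minimal sketch along these lines may help. It assumes each integration appears as a subdirectory of ```/datasets``` (as the ```/datasets/sdsacademys3``` path used later suggests); the helper name `list_mounted_integrations` is hypothetical, not part of Deepnote.

```python
from pathlib import Path


def list_mounted_integrations(root="/datasets"):
    """Return the names of the directories found directly under `root`.

    Assumption: each connected integration shows up as one subdirectory.
    """
    root_path = Path(root)
    if not root_path.is_dir():
        # Outside Deepnote, or before any integration has been connected
        return []
    return sorted(p.name for p in root_path.iterdir() if p.is_dir())


print(list_mounted_integrations())
```

An empty list here usually means the integration has not been connected to the project yet.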


## Sample Project
> Requests CSV data (swimming personal bests from the 2018/19 season) and filters for 'Great Britain' athletes.

In [None]:
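import pandas as pd

# Public CSV of swimming personal bests, hosted on S3
url = 'https://sportsdatasolutionsacademy.s3.eu-west-2.amazonaws.com/data/swimming_psb_data.csv'
df = pd.read_csv(url)
# Keep only Great Britain athletes and drop duplicate rows
df = df[df['c_NOC'] == 'Great Britain'].drop_duplicates()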

#### Upload

In [None]:
# Writing into the mounted directory uploads the file to the S3 bucket
df.to_csv('/datasets/sdsacademys3/gb_swimming_psb_data.csv', index=False)

#### List

In [None]:
!ls /datasets/sdsacademys3
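
The `!ls` shell escape only works inside a notebook cell. If you need the listing as Python data (for example in a script), a rough equivalent is sketched below. It assumes the mounted bucket behaves like an ordinary directory; `list_bucket_files` is a hypothetical helper name and the default path is the integration name used in this walkthrough.

```python
from pathlib import Path


def list_bucket_files(mount_dir="/datasets/sdsacademys3"):
    """Return sorted (name, size_in_bytes) pairs for files directly in `mount_dir`."""
    mount = Path(mount_dir)
    if not mount.is_dir():
        # The integration is not mounted (or the name is different)
        return []
    return sorted((p.name, p.stat().st_size) for p in mount.iterdir() if p.is_file())


for name, size in list_bucket_files():
    print(f"{name}\t{size} bytes")
```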

#### Read

In [None]:
df = pd.read_csv('/datasets/sdsacademys3/gb_swimming_psb_data.csv')

df

#### Delete

In [None]:
!rm /datasets/sdsacademys3/gb_swimming_psb_data.csv

!ls /datasets/sdsacademys3
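
The same delete can be done from Python instead of `!rm`, which avoids an error when the file is already gone. This is a sketch under the assumption that the mount accepts ordinary file operations; `delete_from_bucket` is a hypothetical helper name.

```python
from pathlib import Path


def delete_from_bucket(filename, mount_dir="/datasets/sdsacademys3"):
    """Delete `filename` from the mounted directory; return True if it existed."""
    target = Path(mount_dir) / filename
    if not target.is_file():
        return False
    target.unlink()
    return True


removed = delete_from_bucket("gb_swimming_psb_data.csv")
print("removed" if removed else "nothing to remove")
```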