# Feed data to S3

The goal of this short exercise is to practise using `boto3` to communicate with Amazon S3, the AWS object storage service commonly used as a data lake.

The exercise is organized in two parts:
* Part one has you upload a file from your local system directly to S3
* Part two shows you how to send data that exists only as local Python variables to your S3 bucket

## Part 1

1. Start by installing `boto3` if it is not already installed

In [11]:
# Install boto3 using pip
# Keep the leading '!' only when running from a Jupyter Notebook
!pip install boto3



2. Import the necessary libraries

In [1]:
import json
import os
import pandas as pd
import boto3
from dotenv import load_dotenv   # python-dotenv: reads variables from a .env file
load_dotenv()

True

3. Create an instance of `boto3.Session` that connects to your AWS account.

In [2]:
access_key = os.environ.get('aws_access')
secret_key = os.environ.get('aws_secret')

session = boto3.Session(aws_access_key_id=access_key, 
                        aws_secret_access_key=secret_key)
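`os.environ.get` returns `None` when a variable is missing, and a session built from `None` values may fall back to other credential sources, which can hide a misconfigured `.env` file. Below is a minimal sketch of a fail-fast check, assuming the same variable names as above. `require_env` is a hypothetical helper, not part of boto3 or python-dotenv, and the values are placeholders:

```python
import os

def require_env(name, environ=os.environ):
    """Return the value of an environment variable or raise a clear error."""
    value = environ.get(name)
    if not value:
        raise RuntimeError(f"Missing environment variable: {name}")
    return value

# Demonstration with a stand-in mapping instead of the real environment
fake_env = {"aws_access": "AKIAEXAMPLE", "aws_secret": "example-secret"}
access_key = require_env("aws_access", fake_env)
secret_key = require_env("aws_secret", fake_env)
```

In the notebook, calling `require_env('aws_access')` without the second argument would read from the real environment.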

4. Create a variable called `s3` that connects your session to the S3 resource.

In [9]:
s3 = session.resource("s3")

5. Create a variable called `bucket` that connects to an existing bucket in your S3 account, or to a bucket you create now.

In [4]:
# Uncomment to create the bucket the first time:
# bucket = s3.create_bucket(
#     Bucket="bucketcrypto",
#     CreateBucketConfiguration={"LocationConstraint": "eu-west-3"},
# )

# Connect to the existing bucket
bucket = s3.Bucket("bucketcrypto")
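One detail worth knowing when creating a bucket: the `LocationConstraint` is needed for regions such as `eu-west-3`, but to my knowledge it must be left out for `us-east-1`, which is the default location. Here is a minimal sketch of how the arguments could be built for either case. `create_bucket_kwargs` is a hypothetical helper, not a boto3 function:

```python
def create_bucket_kwargs(bucket_name, region):
    """Build keyword arguments for s3.create_bucket (hypothetical helper)."""
    kwargs = {"Bucket": bucket_name}
    # us-east-1 is the default location and is not passed as a constraint
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return kwargs

paris_kwargs = create_bucket_kwargs("bucketcrypto", "eu-west-3")
virginia_kwargs = create_bucket_kwargs("bucketcrypto", "us-east-1")
# Usage with the resource from step 4: s3.create_bucket(**paris_kwargs)
```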

6. In the [documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Bucket.upload_file), find a method that lets you upload files to S3, and upload the file `crypto_data.json` to your bucket in the folder of your choice.

In [None]:
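# upload_file(Filename, Key): local path first, then the object key in the bucket.
# The call returns None, so there is nothing useful to assign.
bucket.upload_file('crypto_data.json', 'crypto_datffa1.json')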
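S3 has no real folders: a "folder" is simply a prefix in the object key, separated by `/`. The sketch below shows one way to build such a key. `build_key` is a hypothetical helper and the `raw/` prefix is only an example:

```python
import posixpath

def build_key(folder, filename):
    """Join a folder-like prefix and a file name into an S3 object key."""
    # Keys always use forward slashes, whatever the local operating system
    return posixpath.join(folder.strip("/"), posixpath.basename(filename))

key = build_key("raw/", "crypto_data.json")
print(key)  # raw/crypto_data.json
# With the bucket from step 5: bucket.upload_file("crypto_data.json", key)
```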

## Part 2

1. Using the `json` library, load the `crypto_data.json` file into a variable called `crypto`

In [7]:
with open("crypto_data.json", 'r') as file:
    crypto = json.load(file)


2. What is the type of the variable `crypto`?

In [8]:
type(crypto)

dict

3. Display the different possible keys in `crypto` in the form of a list

In [9]:
print(list(crypto.keys()))

['btc-bitcoin', 'eth-ethereum', 'usdt-tether', 'bnb-binance-coin', 'hex-hex', 'usdc-usd-coin', 'ada-cardano', 'sol-solana', 'xrp-xrp', 'luna-terra']


4. Have a look at the value associated with the first key. Do you think it could easily be converted to a pandas DataFrame?

In [10]:
crypto.get('btc-bitcoin')

[{'time_open': '2012-01-01T00:00:00Z',
  'time_close': '2012-01-01T23:59:59Z',
  'open': 4.72,
  'high': 4.72,
  'low': 4.72,
  'close': 4.72},
 {'time_open': '2012-01-02T00:00:00Z',
  'time_close': '2012-01-02T23:59:59Z',
  'open': 5.27,
  'high': 5.27,
  'low': 5.27,
  'close': 5.27},
 {'time_open': '2012-01-03T00:00:00Z',
  'time_close': '2012-01-03T23:59:59Z',
  'open': 5.22,
  'high': 5.22,
  'low': 5.22,
  'close': 5.22},
 {'time_open': '2012-01-04T00:00:00Z',
  'time_close': '2012-01-04T23:59:59Z',
  'open': 4.88,
  'high': 4.88,
  'low': 4.88,
  'close': 4.88},
 {'time_open': '2012-01-05T00:00:00Z',
  'time_close': '2012-01-05T23:59:59Z',
  'open': 5.57,
  'high': 5.57,
  'low': 5.57,
  'close': 5.57},
 {'time_open': '2012-01-06T00:00:00Z',
  'time_close': '2012-01-06T23:59:59Z',
  'open': 6.95,
  'high': 6.95,
  'low': 6.95,
  'close': 6.95},
 {'time_open': '2012-01-07T00:00:00Z',
  'time_close': '2012-01-07T23:59:59Z',
  'open': 6.7,
  'high': 6.7,
  'low': 6.7,
  'close': 6.

5. Convert the previous object to a pandas dataframe

In [11]:
df_btc = pd.DataFrame(crypto.get('btc-bitcoin'))
df_btc.head()

|   | time_open | time_close | open | high | low | close | market_cap | volume |
|---|---|---|---|---|---|---|---|---|
| 0 | 2012-01-01T00:00:00Z | 2012-01-01T23:59:59Z | 4.72 | 4.72 | 4.72 | 4.72 | NaN | NaN |
| 1 | 2012-01-02T00:00:00Z | 2012-01-02T23:59:59Z | 5.27 | 5.27 | 5.27 | 5.27 | NaN | NaN |
| 2 | 2012-01-03T00:00:00Z | 2012-01-03T23:59:59Z | 5.22 | 5.22 | 5.22 | 5.22 | NaN | NaN |
| 3 | 2012-01-04T00:00:00Z | 2012-01-04T23:59:59Z | 4.88 | 4.88 | 4.88 | 4.88 | NaN | NaN |
| 4 | 2012-01-05T00:00:00Z | 2012-01-05T23:59:59Z | 5.57 | 5.57 | 5.57 | 5.57 | NaN | NaN |
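The `market_cap` and `volume` columns have no values in these first rows, presumably because the earliest records (see step 4) do not contain those keys while later ones do. Here is a small sketch of that behaviour, which also parses the timestamp strings. The records are made up for illustration and are not taken from `crypto_data.json`:

```python
import pandas as pd

# Illustrative records: the second one has a key the first one lacks
records = [
    {"time_open": "2012-01-01T00:00:00Z", "close": 4.72},
    {"time_open": "2020-01-01T00:00:00Z", "close": 100.0, "volume": 12345},
]

# pd.DataFrame takes the union of all keys and fills the gaps with NaN
df = pd.DataFrame(records)

# The timestamps arrive as strings; to_datetime turns them into datetimes
df["time_open"] = pd.to_datetime(df["time_open"])
```

Whether to parse the dates before uploading depends on how the CSV files will be consumed later, since `to_csv` writes them back as text either way.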


6. Now that you know how to convert one element of `crypto` into a dataframe, write ONE line of code that will convert each element into a dataframe and store them in a `list` object.

<details>
  <summary>Spoiler</summary>
  Use a list comprehension.
</details>


In [12]:
list_values = [pd.DataFrame(crypto.get(k)) for k in crypto.keys()]
print(len(list_values))

10


7. For each of these dataframes add a column called `"coin"` that contains the name of the key it was associated with.

<details>
  <summary>Spoiler</summary>
  Use a for loop and the `zip` function to loop over two iterables at once.
</details>

In [21]:
# Pair each DataFrame with the key it was built from
for df, coin in zip(list_values, crypto.keys()):
    df['coin'] = coin
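Steps 6 and 7 can also be combined into a single pass with `dict.items()` and `DataFrame.assign`. This is a sketch of one possible alternative, not the exercise's expected answer. `crypto_sample` is a tiny stand-in for `crypto` with illustrative values:

```python
import pandas as pd

# Stand-in for the real `crypto` dict (illustrative values only)
crypto_sample = {
    "btc-bitcoin": [{"time_open": "2012-01-01T00:00:00Z", "close": 4.72}],
    "eth-ethereum": [{"time_open": "2015-08-07T00:00:00Z", "close": 2.83}],
}

# Build each DataFrame and tag it with its coin name in one expression
frames = [
    pd.DataFrame(rows).assign(coin=name)
    for name, rows in crypto_sample.items()
]

# Optional: stack everything into one long table
combined = pd.concat(frames, ignore_index=True)
```

Iterating over `items()` keeps each name next to its data, so there is no need to rely on two separate iterables staying in the same order.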

8. Upload each of these dataframes to the folder of your choice in your S3 bucket

<details>
  <summary>Spoiler</summary>
  Use the pandas method `to_csv` to convert your dataframes to CSV text, and the appropriate method of the bucket object to send that text to S3.
</details>

In [None]:
# to_csv() without a path returns the CSV as a string, so nothing is written to disk
for df, coin in zip(list_values, crypto.keys()):
    bucket.put_object(Key=f"{coin}/values-{coin}.csv", Body=df.to_csv(index=False))
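The same idea works for any Python object that can be serialized in memory, not only dataframes. The sketch below sends a dictionary as JSON bytes through `put_object`. `FakeBucket` is a hypothetical stand-in that records uploads so the example runs without AWS credentials, and `upload_json` is likewise an illustrative helper:

```python
import json

class FakeBucket:
    """Stand-in for a boto3 Bucket that stores uploads in memory (hypothetical)."""

    def __init__(self):
        self.objects = {}

    def put_object(self, Key, Body, **extra):
        # Mimics the keyword-argument style of Bucket.put_object
        self.objects[Key] = Body

def upload_json(bucket, key, obj):
    """Serialize a Python object to JSON bytes and send it with put_object."""
    body = json.dumps(obj).encode("utf-8")
    bucket.put_object(Key=key, Body=body, ContentType="application/json")

fake_bucket = FakeBucket()
upload_json(fake_bucket, "btc-bitcoin/sample.json", {"close": 4.72})
```

With a real bucket, `upload_json(bucket, key, obj)` would make the same call against S3.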

9. Now go to your S3 to make sure the files have been uploaded correctly!