# Feed data to S3

The goal of this short exercise is to practise using `boto3` to communicate with Amazon S3, the AWS object storage service commonly used as a data lake.

The exercise is organized in two parts:
* Part one will have you upload a file from your local system directly to S3
* Part two will show you how to send data to S3 that only exists as local Python variables

## Part 1

1. Start by installing `boto3` if it is not already installed

In [23]:
# Install boto3 using pip
# The leading '!' is only needed when the command is run from a Jupyter notebook cell
!pip install boto3

2. Import the necessary libraries

In [24]:
import json
import os
import pandas as pd
import boto3

3. Create an instance of `boto3.Session` that connects to your AWS account.

4. Create a variable called `s3` that connects your session to the S3 resource.

5. Create a variable called `bucket` that points to an existing bucket in your S3 account, or to a bucket you create now.

6. In the [documentation](https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Bucket.upload_file), find a method that lets you upload files to S3, then use it to upload the file `crypto.json` to your bucket, in the folder of your choice.
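
A minimal sketch of step 6, assuming `bucket` is the `Bucket` object from step 5 and using its `upload_file` method (the one the documentation link points to). The helper name `upload_local_file` and the folder name `raw` are hypothetical.

```python
from pathlib import Path


def upload_local_file(bucket, local_path, folder):
    """Upload one local file under `folder/` in the bucket and return its key."""
    local_path = Path(local_path)
    # S3 has no real folders: the "folder" is simply a prefix of the object key.
    key = f"{folder.strip('/')}/{local_path.name}"
    bucket.upload_file(Filename=str(local_path), Key=key)
    return key


# Hypothetical usage:
# upload_local_file(bucket, "crypto.json", "raw")
```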

## Part 2

1. Using the `json` library, load the `crypto_data.json` file into a variable called `crypto`.
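
One way to do this, as a sketch: the helper name `load_json` is hypothetical, and the file is assumed to sit in the current working directory.

```python
import json


def load_json(path):
    """Read a JSON file and return the parsed Python object."""
    with open(path, encoding="utf-8") as f:
        return json.load(f)


# Hypothetical usage:
# crypto = load_json("crypto_data.json")
```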

2. What is the type of the variable `crypto`?

dict

3. Display the keys of `crypto` as a list.

['btc-bitcoin',
 'eth-ethereum',
 'usdt-tether',
 'bnb-binance-coin',
 'hex-hex',
 'usdc-usd-coin',
 'ada-cardano',
 'sol-solana',
 'xrp-xrp',
 'luna-terra']
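
Steps 2 and 3 can be sketched on a trimmed stand-in for `crypto`. The small dictionary below is an assumption made so the snippet runs on its own: it keeps only two keys and one shortened record.

```python
# Trimmed stand-in for the real `crypto` object (placeholder content).
crypto = {
    "btc-bitcoin": [{"time_open": "2012-01-01T00:00:00Z", "close": 4.72}],
    "eth-ethereum": [],
}

# Step 2: inspect the type of the loaded object
print(type(crypto))

# Step 3: a dict iterates over its keys, so list(crypto) would work too
keys = list(crypto.keys())
print(keys)
```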

4. Have a look at the value associated with the first key. Do you think it could easily be converted to a pandas DataFrame?

[{'time_open': '2012-01-01T00:00:00Z',
  'time_close': '2012-01-01T23:59:59Z',
  'open': 4.72,
  'high': 4.72,
  'low': 4.72,
  'close': 4.72},
 {'time_open': '2012-01-02T00:00:00Z',
  'time_close': '2012-01-02T23:59:59Z',
  'open': 5.27,
  'high': 5.27,
  'low': 5.27,
  'close': 5.27},
 {'time_open': '2012-01-03T00:00:00Z',
  'time_close': '2012-01-03T23:59:59Z',
  'open': 5.22,
  'high': 5.22,
  'low': 5.22,
  'close': 5.22},
 {'time_open': '2012-01-04T00:00:00Z',
  'time_close': '2012-01-04T23:59:59Z',
  'open': 4.88,
  'high': 4.88,
  'low': 4.88,
  'close': 4.88},
 {'time_open': '2012-01-05T00:00:00Z',
  'time_close': '2012-01-05T23:59:59Z',
  'open': 5.57,
  'high': 5.57,
  'low': 5.57,
  'close': 5.57}]

5. Convert the previous object to a pandas DataFrame.

| | time_open | time_close | open | high | low | close | market_cap | volume |
|---|---|---|---|---|---|---|---|---|
| 0 | 2012-01-01T00:00:00Z | 2012-01-01T23:59:59Z | 4.720000 | 4.720000 | 4.720000 | 4.720000 | | |
| 1 | 2012-01-02T00:00:00Z | 2012-01-02T23:59:59Z | 5.270000 | 5.270000 | 5.270000 | 5.270000 | | |
| 2 | 2012-01-03T00:00:00Z | 2012-01-03T23:59:59Z | 5.220000 | 5.220000 | 5.220000 | 5.220000 | | |
| 3 | 2012-01-04T00:00:00Z | 2012-01-04T23:59:59Z | 4.880000 | 4.880000 | 4.880000 | 4.880000 | | |
| 4 | 2012-01-05T00:00:00Z | 2012-01-05T23:59:59Z | 5.570000 | 5.570000 | 5.570000 | 5.570000 | | |
| ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 3645 | 2021-12-27T00:00:00Z | 2021-12-27T23:59:59Z | 50871.136201 | 52040.010462 | 50648.783664 | 50805.926381 | 9.620695e+11 | 2.706242e+10 |
| 3646 | 2021-12-28T00:00:00Z | 2021-12-28T23:59:59Z | 50785.014226 | 50785.014226 | 47532.642786 | 47754.479284 | 9.604874e+11 | 3.441168e+10 |
| 3647 | 2021-12-29T00:00:00Z | 2021-12-29T23:59:59Z | 47686.412549 | 48148.756050 | 46457.923665 | 46565.851628 | 9.019232e+11 | 3.235327e+10 |
| 3648 | 2021-12-30T00:00:00Z | 2021-12-30T23:59:59Z | 46622.167212 | 47903.118342 | 46378.878022 | 47171.745143 | 8.818376e+11 | 2.812671e+10 |
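
A list of dictionaries like this one can be passed directly to `pd.DataFrame`. The sketch below uses three records copied from the outputs in this exercise rather than the full file; records that lack `market_cap` and `volume` simply get missing values in those columns.

```python
import pandas as pd

# Three records copied from the outputs shown in this exercise
records = [
    {"time_open": "2012-01-01T00:00:00Z", "time_close": "2012-01-01T23:59:59Z",
     "open": 4.72, "high": 4.72, "low": 4.72, "close": 4.72},
    {"time_open": "2012-01-02T00:00:00Z", "time_close": "2012-01-02T23:59:59Z",
     "open": 5.27, "high": 5.27, "low": 5.27, "close": 5.27},
    {"time_open": "2021-12-30T00:00:00Z", "time_close": "2021-12-30T23:59:59Z",
     "open": 46622.167212, "high": 47903.118342, "low": 46378.878022,
     "close": 47171.745143, "market_cap": 8.818376e11, "volume": 2.812671e10},
]

# Each dict becomes a row; the union of the keys becomes the columns
df = pd.DataFrame(records)
print(df)
```

With the real data, the equivalent call would be `pd.DataFrame(crypto["btc-bitcoin"])`.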


6. Now that you know how to convert one element of `crypto` into a dataframe, write ONE line of code that will convert each element into a dataframe and store them in a `list` object.
<details>
  <summary>Spoiler</summary>
Use a list comprehension :)
</details>


[                 time_open            time_close          open          high  \
 0     2012-01-01T00:00:00Z  2012-01-01T23:59:59Z      4.720000      4.720000   
 1     2012-01-02T00:00:00Z  2012-01-02T23:59:59Z      5.270000      5.270000   
 2     2012-01-03T00:00:00Z  2012-01-03T23:59:59Z      5.220000      5.220000   
 3     2012-01-04T00:00:00Z  2012-01-04T23:59:59Z      4.880000      4.880000   
 4     2012-01-05T00:00:00Z  2012-01-05T23:59:59Z      5.570000      5.570000   
 ...                    ...                   ...           ...           ...   
 3645  2021-12-27T00:00:00Z  2021-12-27T23:59:59Z  50871.136201  52040.010462   
 3646  2021-12-28T00:00:00Z  2021-12-28T23:59:59Z  50785.014226  50785.014226   
 3647  2021-12-29T00:00:00Z  2021-12-29T23:59:59Z  47686.412549  48148.756050   
 3648  2021-12-30T00:00:00Z  2021-12-30T23:59:59Z  46622.167212  47903.118342   
 3649  2021-12-31T00:00:00Z  2021-12-31T23:59:59Z  47183.140352  48511.448334   
 
                low       
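
A sketch of the one-line answer, run on a trimmed stand-in for `crypto`. The `btc-bitcoin` records come from the output shown earlier; the `eth-ethereum` record is a made-up placeholder.

```python
import pandas as pd

# Trimmed stand-in for the real `crypto` dictionary
crypto = {
    "btc-bitcoin": [
        {"time_open": "2012-01-01T00:00:00Z", "close": 4.72},
        {"time_open": "2012-01-02T00:00:00Z", "close": 5.27},
    ],
    # Placeholder record, not real data
    "eth-ethereum": [{"time_open": "2016-01-01T00:00:00Z", "close": 1.0}],
}

# The one-liner: one DataFrame per value, in the same order as the keys
dataframes = [pd.DataFrame(records) for records in crypto.values()]
```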

7. For each of these dataframes, add a column called `"coin"` that contains the key it was associated with.
<details>
  <summary>Spoiler</summary>
Use a for loop and the `zip` function to loop over two iterables at once.
</details>

| | time_open | time_close | open | high | low | close | market_cap | volume | coin |
|---|---|---|---|---|---|---|---|---|---|
| 0 | 2012-01-01T00:00:00Z | 2012-01-01T23:59:59Z | 4.720000 | 4.720000 | 4.720000 | 4.720000 | | | btc-bitcoin |
| 1 | 2012-01-02T00:00:00Z | 2012-01-02T23:59:59Z | 5.270000 | 5.270000 | 5.270000 | 5.270000 | | | btc-bitcoin |
| 2 | 2012-01-03T00:00:00Z | 2012-01-03T23:59:59Z | 5.220000 | 5.220000 | 5.220000 | 5.220000 | | | btc-bitcoin |
| 3 | 2012-01-04T00:00:00Z | 2012-01-04T23:59:59Z | 4.880000 | 4.880000 | 4.880000 | 4.880000 | | | btc-bitcoin |
| 4 | 2012-01-05T00:00:00Z | 2012-01-05T23:59:59Z | 5.570000 | 5.570000 | 5.570000 | 5.570000 | | | btc-bitcoin |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 3645 | 2021-12-27T00:00:00Z | 2021-12-27T23:59:59Z | 50871.136201 | 52040.010462 | 50648.783664 | 50805.926381 | 9.620695e+11 | 2.706242e+10 | btc-bitcoin |
| 3646 | 2021-12-28T00:00:00Z | 2021-12-28T23:59:59Z | 50785.014226 | 50785.014226 | 47532.642786 | 47754.479284 | 9.604874e+11 | 3.441168e+10 | btc-bitcoin |
| 3647 | 2021-12-29T00:00:00Z | 2021-12-29T23:59:59Z | 47686.412549 | 48148.756050 | 46457.923665 | 46565.851628 | 9.019232e+11 | 3.235327e+10 | btc-bitcoin |
| 3648 | 2021-12-30T00:00:00Z | 2021-12-30T23:59:59Z | 46622.167212 | 47903.118342 | 46378.878022 | 47171.745143 | 8.818376e+11 | 2.812671e+10 | btc-bitcoin |
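
A sketch of the `zip` approach on a trimmed stand-in for `crypto` (the `eth-ethereum` record is a made-up placeholder). It relies on the list of dataframes having been built from `crypto.values()`, so that keys and dataframes line up in the same order.

```python
import pandas as pd

# Trimmed stand-in for the real `crypto` dictionary
crypto = {
    "btc-bitcoin": [
        {"time_open": "2012-01-01T00:00:00Z", "close": 4.72},
        {"time_open": "2012-01-02T00:00:00Z", "close": 5.27},
    ],
    # Placeholder record, not real data
    "eth-ethereum": [{"time_open": "2016-01-01T00:00:00Z", "close": 1.0}],
}
dataframes = [pd.DataFrame(records) for records in crypto.values()]

# zip pairs each key with the dataframe built from its value
for coin, df in zip(crypto.keys(), dataframes):
    df["coin"] = coin  # the scalar is broadcast to every row

print(dataframes[0])
```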


8. Upload each of these dataframes to the folder of your choice in your S3 bucket.
<details>
  <summary>Spoiler</summary>
Use the pandas method `to_csv` to convert your dataframes to CSV, and the appropriate method of the bucket object to upload them.
</details>
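
One possible sketch for step 8, assuming `bucket` is a `boto3` `Bucket` object and each dataframe has the `"coin"` column from step 7. It uses `put_object` so that the CSV never has to be written to disk. The helper name `upload_dataframes`, the `<coin>.csv` key naming, and the folder name are hypothetical choices.

```python
import pandas as pd


def upload_dataframes(bucket, dataframes, folder):
    """Send each dataframe to the bucket as `folder/<coin>.csv`; return the keys."""
    keys = []
    for df in dataframes:
        coin = df["coin"].iloc[0]
        key = f"{folder.strip('/')}/{coin}.csv"
        # With no path argument, to_csv returns the CSV content as a string
        body = df.to_csv(index=False).encode("utf-8")
        bucket.put_object(Key=key, Body=body)
        keys.append(key)
    return keys


# Hypothetical usage:
# upload_dataframes(bucket, dataframes, "crypto_csv")
```

Writing each CSV to a local file first and calling `bucket.upload_file` would work as well; the in-memory route just avoids temporary files.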

9. Now go to your S3 to make sure the files have been uploaded correctly!
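
Besides checking in the AWS console, the uploaded keys can also be listed from Python. This is a sketch that assumes `bucket` is a `boto3` `Bucket` object; the helper name `list_keys` and the folder name are hypothetical.

```python
def list_keys(bucket, folder):
    """Return the keys of all objects stored under `folder/` in the bucket."""
    prefix = folder.strip("/") + "/"
    return [obj.key for obj in bucket.objects.filter(Prefix=prefix)]


# Hypothetical usage:
# print(list_keys(bucket, "crypto_csv"))
```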