# Amazon S3 Bucket Connection
This recipe shows you how to connect to an Amazon S3 bucket. Using the Integrations tab, you can make this connection without exposing sensitive data; you can learn more about integrations [here](https://workspace-docs.datacamp.com/integrations/environment-variables).

As an example, you can connect your workspace to a fictional online ticket sales dataset. The files for this dataset are stored in an S3 bucket that is hosted on a DataCamp server. The ER diagram of this sample database is shown in the appendix of this recipe. The environment variables needed to connect to this sample database are listed in [this section](https://workspace-docs.datacamp.com/integrations/s3-bucket#sample-database-online-ticket-sales) of the documentation.

Once you are familiar with this example, you can also connect to your own Amazon S3 bucket by inserting your own credentials as the environment variables.
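
Before connecting to your own bucket, it can help to confirm that the credentials are actually visible to the notebook. The sketch below is one way to do that. `AWS_ACCESS_KEY_ID` and `AWS_SECRET_ACCESS_KEY` are the standard variable names that boto3 reads; that the integration sets exactly these, together with `AWS_BUCKET_NAME`, is an assumption, and `missing_env_vars` is a hypothetical helper name.

```python
import os

# Standard variable names that boto3 reads credentials from, plus the
# bucket-name variable used in this recipe (assumed set by the integration).
REQUIRED_VARS = ("AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY", "AWS_BUCKET_NAME")


def missing_env_vars(required=REQUIRED_VARS, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


missing = missing_env_vars()
if missing:
    print("Missing environment variables:", ", ".join(missing))
else:
    print("All required environment variables are set.")
```

The check only reports variable names, so no secret values are printed.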

In [6]:
# Import packages
import os
import io
import boto3
import pandas as pd


In [7]:
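# Name of the bucket and the key prefix ("directory") that holds the sample files
os.environ["AWS_BUCKET_NAME"] = "datacamp-workspacedemo-workspacedemos3-prod"
DIR_NAME = "workspacedemos3/"
# boto3 picks up the AWS credentials from the environment variables
s3 = boto3.resource("s3")
# Load the bucket with the specified name
bucket = s3.Bucket(os.environ["AWS_BUCKET_NAME"])
# The data will be loaded into a dictionary of pandas DataFrames,
# each accessible with a key (the filename without extension)
dfs = {}
for bucket_object in bucket.objects.filter(Prefix=DIR_NAME):
    # Skip the placeholder object for the "directory" itself, if present
    if bucket_object.key.endswith("/"):
        continue
    # Download the object body into an in-memory buffer
    obj = bucket.Object(key=bucket_object.key)
    data = io.BytesIO(obj.get()["Body"].read())
    # Drop the prefix and the extension to get the dictionary key
    name = os.path.splitext(bucket_object.key.replace(DIR_NAME, "", 1))[0]
    # The files are pipe-delimited, have no header row, and use the first column as the index
    dfs[name] = pd.read_csv(data, encoding="utf8", sep="|", index_col=0, header=None)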

In [8]:
# List the names of all loaded DataFrames; they match the file names in the S3 bucket
list(dfs.keys())

['categories', 'dates', 'events', 'listings', 'sales', 'users', 'venues']
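
These names are derived from the S3 object keys by dropping the directory prefix and the file extension. A minimal sketch of that mapping as a standalone function follows; `frame_name` is a hypothetical helper, and the `.csv` extension in the docstring example is an assumption, since the actual file extension is not shown in this recipe.

```python
import posixpath


def frame_name(key, prefix):
    """Map an object key such as 'workspacedemos3/sales.csv' to 'sales'."""
    # Remove the leading prefix only, leaving the rest of the key untouched
    relative = key[len(prefix):] if key.startswith(prefix) else key
    # Drop the extension, whatever its length
    stem, _extension = posixpath.splitext(relative)
    return stem


print(frame_name("workspacedemos3/sales.csv", "workspacedemos3/"))
```

Splitting on the extension avoids relying on a fixed number of characters, so files with longer or shorter extensions would still map to a clean name.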

In [9]:
# Print out the head of a DataFrame as an example
dfs['categories'].head()

| 0 | 1 | 2 | 3 |
| --- | --- | --- | --- |
| 1 | Sports | MLB | Major League Baseball |
| 2 | Sports | NHL | National Hockey League |
| 3 | Sports | NFL | National Football League |
| 4 | Sports | NBA | National Basketball Association |
| 5 | Sports | MLS | Major League Soccer |
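
Because the files are read with `header=None`, the columns are labeled with integers. If you prefer descriptive names, one option is to pass them to `pd.read_csv` through `names`. The sketch below uses two in-memory rows copied from the output above instead of the bucket; the column names in `CATEGORY_COLUMNS` are an assumption based on the sample-database schema linked in the appendix, not something this recipe defines.

```python
import io

import pandas as pd

# Two rows in the same pipe-delimited, header-less layout as the sample files
raw = b"1|Sports|MLB|Major League Baseball\n2|Sports|NHL|National Hockey League\n"

# Assumed column names for the categories table (check them against the ER diagram)
CATEGORY_COLUMNS = ["catid", "catgroup", "catname", "catdesc"]

categories = pd.read_csv(
    io.BytesIO(raw),
    sep="|",
    header=None,             # the data has no header row
    names=CATEGORY_COLUMNS,  # supply the column names explicitly
    index_col="catid",       # use the first column as the index
    encoding="utf8",
)
print(categories.head())
```

The same pattern would apply to the other tables, with one list of column names per file.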


## Appendix
#### 1. ER diagram of online ticket sales database
This ER diagram contains information about all the tables in the sample database, and shows how they relate to each other. ([database and ER diagram - source](https://docs.aws.amazon.com/redshift/latest/dg/c_sampledb.html))

![ER diagram](tickitdb-er.png)