# How to Upload And Download Files From AWS S3 Using Python (2022)
## Learn how to use cloud resources in your Python scripts
![](images/pexels.jpg)

I am writing this post out of sheer frustration. 

Every post I've read on this topic assumed that I already had an AWS account, an S3 bucket and a mound of data stored in it. They just show the code but conveniently gloss over the most important part - how to make the code work through your AWS account.

Well, I could've figured out the code easily, thank you very much. I had to sift through many SO threads and the AWS docs to get rid of every nasty authentication error along the way.

So that you won't have to go through the same hard work, I will share all the technicalities of managing an S3 bucket programmatically, right from account creation to adding permissions to your local machine to access your AWS resources.

### Step 1: Set up an account

Right, let's start with creating your AWS account, if you haven't already. Nothing unusual, just follow the steps from this [link](https://aws.amazon.com/free/?trk=ps_a131L0000085EJvQAM&trkCampaign=acq_paid_search_brand&sc_channel=ps&sc_campaign=acquisition_US&sc_publisher=google&sc_category=core-main&sc_country=US&sc_geo=NAMER&sc_outcome=acq&sc_detail=aws&sc_content=Brand_Core_aws_e&sc_segment=432339156150&sc_medium=ACQ-P|PS-GO|Brand|Desktop|SU|Core-Main|Core|US|EN|Text&s_kwcid=AL!4422!3!432339156150!e!!g!!aws&ef_id=EAIaIQobChMIxa2ogpvA9QIVxP7jBx2iAwgNEAAYASAAEgJGR_D_BwE:G:s&s_kwcid=AL!4422!3!432339156150!e!!g!!aws&all-free-tier.sort-by=item.additionalFields.SortRank&all-free-tier.sort-order=asc&awsf.Free%20Tier%20Types=*all&awsf.Free%20Tier%20Categories=*all):

![](images/sign_up.gif)

Then, we will go to the [AWS IAM (Identity and Access Management) console](https://console.aws.amazon.com/), which is the place we will be doing most of the work.

![](images/aws_console.gif)

From the console, you can easily switch between different AWS servers, create users, add policies and allow access to your user account. We will do each one by one.

### Step 2: Create a user

For one AWS account, you can create multiple users and each user can have various levels of access to your account's resources. Let's create a sample user for this tutorial:

![](images/create_user.gif)

In the IAM console:

1. Go to the users tab.
2. Click on Add users.
3. Enter a username in the field.
4. Tick the "Access key - Programmatic access" field (essential).
5. Click "Next" and "Attach existing policies directly".
6. Tick the "AdministratorAccess" policy.
7. Click "Next" until you see the "Create user" button.
8. Finally, download the given CSV file of your user's credentials.

It should look like this:

![](images/credentials.png)

Store it somewhere safe, because we will be using the credentials later. 

### Step 3: Create a bucket

Now, let's create an S3 bucket where we can store data.

![](images/create_bucket.gif)

In the [IAM console](https://console.aws.amazon.com/):

1. Click services in the top left corner.
2. Scroll down to storage and select S3 from the right-hand list.
3. Click "Create bucket" and give it a name. 

You can choose any region you want. Leave the rest of the settings as is and click "Create bucket" once more. 
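If you would rather script this step, the same bucket can be created from Python once your credentials are configured (Step 5). Below is a minimal sketch, assuming a Boto3 S3 client is passed in. The helper name `create_bucket_in_region` is made up for illustration and is not part of the console walkthrough. One known quirk: S3 rejects an explicit `LocationConstraint` for `us-east-1`, so the helper leaves it out for that region.

```python
def create_bucket_in_region(s3_client, bucket_name, region):
    """Create an S3 bucket in the given region (hypothetical helper).

    `s3_client` is expected to be a Boto3 S3 client, for example
    boto3.client("s3", region_name=region).
    """
    kwargs = {"Bucket": bucket_name}
    # us-east-1 is the default location and must not be passed explicitly
    if region != "us-east-1":
        kwargs["CreateBucketConfiguration"] = {"LocationConstraint": region}
    return s3_client.create_bucket(**kwargs)
```

With a real client, this would be called as `create_bucket_in_region(boto3.client("s3", region_name="eu-west-1"), "your-bucket-name", "eu-west-1")`.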

### Step 4: Create a policy and add it to your user

In AWS, access is managed through policies. A policy can be a set of settings or a JSON file attached to an AWS object (user, resource, group, roles) and it controls what aspects of the object you can use. 

Below, we will create a policy that enables us to interact with our bucket programmatically - i.e., through the CLI or in a script. 

![](images/create_policy.gif)

In the [IAM console](https://console.aws.amazon.com/):

1. Go to the Policies tab and click "Create a policy".
2. Click the "JSON" tab and insert the code below:

```json
{
    "Version": "2012-10-17",
    "Statement": [
        {
            "Sid": "ConsoleAccess",
            "Effect": "Allow",
            "Action": [
                "s3:*"
            ],
            "Resource": [
                "arn:aws:s3:::your-bucket-name",
                "arn:aws:s3:::your-bucket-name/*"
            ]
        }
    ]
}
```
Replace *your-bucket-name* with the name of your own bucket. Note that in the Action field of the JSON, we put `s3:*` to allow any type of interaction with our bucket. This is very broad, so you may want to allow only certain actions. In that case, check out [this page](https://docs.aws.amazon.com/IAM/latest/UserGuide/reference_policies_examples_s3_rw-bucket-console.html) of the AWS docs to learn how to limit access.
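If `s3:*` is broader than you need, the policy can be narrowed to a handful of actions. The sketch below builds such a document in Python and prints it as JSON, ready to paste into the editor. The function name and the particular choice of actions (`s3:ListBucket`, `s3:GetObject`, `s3:PutObject`) are assumptions for a simple read/write workflow, so adjust them to your needs.

```python
import json


def build_bucket_policy(bucket_name):
    """Return a policy dict limited to listing, reading and writing objects."""
    bucket_arn = f"arn:aws:s3:::{bucket_name}"
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                # Listing applies to the bucket itself
                "Sid": "ListBucket",
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": [bucket_arn],
            },
            {
                # Reading and writing apply to the objects inside it
                "Sid": "ReadWriteObjects",
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": [f"{bucket_arn}/*"],
            },
        ],
    }


print(json.dumps(build_bucket_policy("your-bucket-name"), indent=4))
```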

So far, the policy exists on its own: it names the bucket as its resource but is not attached to any user. We should attach it to our user so that the API credentials work properly. Here are the instructions:

![](images/add_permission.gif)

In the [IAM console](https://console.aws.amazon.com/):

1. Go to the Users tab and click on the user we created in the last section.
2. Click the "Add permissions" button.
3. Click the "Attach existing policies" tab.
4. Filter them by the policy we just created.
5. Tick the policy, review it and click "Add" one final time.
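The console steps above also have a scripted equivalent. As a rough sketch, assuming a Boto3 IAM client whose credentials are allowed to manage policies: `create_policy` registers the JSON document and `attach_user_policy` attaches it to the user. The helper name and its arguments are placeholders for illustration.

```python
import json


def create_and_attach_policy(iam_client, user_name, policy_name, policy_document):
    """Create a managed policy from a dict and attach it to a user."""
    response = iam_client.create_policy(
        PolicyName=policy_name,
        PolicyDocument=json.dumps(policy_document),  # IAM expects a JSON string
    )
    policy_arn = response["Policy"]["Arn"]
    iam_client.attach_user_policy(UserName=user_name, PolicyArn=policy_arn)
    return policy_arn
```

With a real client, `iam_client` would be `boto3.client("iam")` and `policy_document` the same structure as the JSON shown earlier, written as a Python dict.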

### Step 5: Download AWS CLI and configure your user

Now, we download the AWS command-line tool, because it makes authentication so much easier. Kindly go to [this page](https://aws.amazon.com/cli/) and download the executable for your platform:

![](images/download_cli.gif)

Run the executable and reopen any active terminal sessions to let the changes take effect. Then, type `aws configure`:

![](images/configure_aws.gif)

Enter your AWS Access Key ID and Secret Access Key (both are in the CSV file), along with the region you created your bucket in. You can find the region name of your bucket in the S3 page of the console:

![](images/region.png)

Just press Enter when you reach the Default Output Format field in the configuration. There won't be any output.
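Behind the scenes, `aws configure` stores what you typed in two plain-text files, `~/.aws/credentials` and `~/.aws/config`, which Boto3 reads automatically. If authentication fails later, a quick check like the sketch below can confirm that the keys were saved. The function name is made up for illustration, and it assumes the default file location and the `default` profile. It only returns the names of the stored settings, never their values.

```python
import configparser
from pathlib import Path


def saved_profile_keys(credentials_path=None, profile="default"):
    """Return the names of the settings stored for a profile (never the values)."""
    if credentials_path is None:
        credentials_path = Path.home() / ".aws" / "credentials"
    parser = configparser.ConfigParser()
    parser.read(credentials_path)  # a missing file is silently skipped
    if not parser.has_section(profile):
        return []
    return sorted(parser[profile].keys())
```

After a successful `aws configure`, calling `saved_profile_keys()` should return `aws_access_key_id` and `aws_secret_access_key`. An empty list means the profile was not written.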

### Step 6: Upload your files

We are nearly there.

Now, we upload a sample dataset to our bucket so that we can download it in a script later:

![](images/upload_file.gif)

It should be easy once you go to the S3 page and open your bucket. 

### Step 7: Check if authentication is working

Finally, pip install the [Boto3 package](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html) and run this snippet:

```python
import boto3  # pip install boto3

# Let's use Amazon S3
s3 = boto3.resource("s3")

# Print out bucket names
for bucket in s3.buckets.all():
    print(bucket.name)
```

```
sample-bucket-1801
```


If the output contains your bucket name(s), congratulations - you now have full access to many AWS services through `boto3`, not just S3.
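If the list comes back empty, or you are unsure whose credentials Boto3 picked up, the STS `get_caller_identity` call reports the identity behind them. Here is a minimal sketch, assuming an STS client is passed in (the helper name is hypothetical):

```python
def describe_caller(sts_client):
    """Return a one-line summary of the identity behind the current credentials."""
    identity = sts_client.get_caller_identity()
    return f"account {identity['Account']} as {identity['Arn']}"
```

With real credentials, `print(describe_caller(boto3.client("sts")))` should show an ARN that contains the user name from Step 2.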

### Using Python Boto3 to download files from S3 bucket

With the Boto3 package, you have programmatic access to many AWS services such as SQS, EC2, SES and many aspects of the IAM console. 

However, as a regular data scientist, you will mostly need to upload and download data from an S3 bucket, so we will cover only those operations. 

Let's start with the download. After importing the package, create an S3 client object using the `client` function:

```python
import boto3

# Create an S3 access object
s3 = boto3.client("s3")
```

To download a file from an S3 bucket and immediately save it to a file, we can use the `download_file` function:

```python
s3.download_file(
    Bucket="sample-bucket-1801", Key="train.csv", Filename="data/downloaded_from_s3.csv"
)
```

There won't be any output if the download is successful. Pass the exact key of the file to be downloaded (its full path inside the bucket) to the `Key` parameter. `Filename` should contain the local path you want to save the file to.
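One thing to watch out for: `download_file` does not create missing local folders, so the call fails if `data/` does not exist yet. A small wrapper can take care of that. This is only a sketch, with a hypothetical helper name, and it assumes the S3 client is passed in:

```python
from pathlib import Path


def download_to(s3_client, bucket, key, destination):
    """Download one object, creating the local folder first if needed."""
    destination = Path(destination)
    # download_file does not create missing folders, so do it up front
    destination.parent.mkdir(parents=True, exist_ok=True)
    s3_client.download_file(Bucket=bucket, Key=key, Filename=str(destination))
    return destination
```

With the client from above, the earlier download becomes `download_to(s3, "sample-bucket-1801", "train.csv", "data/downloaded_from_s3.csv")`.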

Uploading is also very straightforward:

```python
s3.upload_file(
    Filename="data/downloaded_from_s3.csv",
    Bucket="sample-bucket-1801",
    Key="new_file.csv",
)
```

The function is `upload_file`. It takes the same three parameters as the download function; only their positional order differs, with `Filename` coming first. 
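Since `upload_file` handles one file at a time, uploading a whole folder means looping over its files and choosing a key for each. Here is one way to sketch it, assuming the S3 client is passed in. Both helper names and the optional `key_prefix` argument are inventions for this example.

```python
from pathlib import Path


def plan_uploads(local_dir, key_prefix=""):
    """Yield (local_path, s3_key) pairs for every file under local_dir."""
    local_dir = Path(local_dir)
    for path in sorted(local_dir.rglob("*")):
        if path.is_file():
            # S3 keys use forward slashes regardless of the local OS
            relative = path.relative_to(local_dir).as_posix()
            yield path, f"{key_prefix}{relative}"


def upload_directory(s3_client, bucket, local_dir, key_prefix=""):
    """Upload every file under local_dir and return the keys that were written."""
    keys = []
    for path, key in plan_uploads(local_dir, key_prefix):
        s3_client.upload_file(Filename=str(path), Bucket=bucket, Key=key)
        keys.append(key)
    return keys
```

For example, `upload_directory(s3, "sample-bucket-1801", "data", key_prefix="raw/")` would store `data/train.csv` under the key `raw/train.csv`.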

![](images/new_file.png)

### Conclusion

I suggest reading the [Boto3 docs](https://boto3.amazonaws.com/v1/documentation/api/latest/index.html) for more advanced examples of managing your AWS resources. They cover services other than S3 and contain code recipes for the most common tasks with each one. 

Thanks for reading!

#### You can reach out to me on [LinkedIn](https://www.linkedin.com/in/bextuychiev/) or [Twitter](https://twitter.com/BexTuychiev) for a friendly chat about all things data.