# S3 Write API

## What is the S3 Write API?

AWS provides S3 write APIs for putting, copying, and deleting objects, among other operations. Because a write can have irreversible effects, make sure you understand how an API behaves before you call it. This section shows how to use these APIs.

## Simple Text / Bytes Read and Write

In [1]:
from s3pathlib import S3Path

s3path = S3Path("s3://s3pathlib/file.txt")
s3path

S3Path('s3://s3pathlib/file.txt')

In [2]:
s3path.write_text("Hello Alice!")
s3path.read_text()

'Hello Alice!'

In [3]:
s3path.write_bytes(b"Hello Bob!")
s3path.read_bytes()

b'Hello Bob!'

Note that ``s3path.write_bytes()`` and ``s3path.write_text()`` silently overwrite an existing object; they do not raise an error if it already exists. To avoid overwriting, check whether the object exists before writing.

In [46]:
if not s3path.exists():
    s3path.write_text("Hello Alice!")
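
The check above can be wrapped in a small helper. The following is a minimal sketch, not part of ``s3pathlib``: ``write_text_if_absent`` is a hypothetical name, and it assumes only that the path object exposes ``exists()`` and ``write_text()`` as shown above. A ``MagicMock`` stands in for the ``S3Path`` so the snippet runs without AWS credentials.

```python
from unittest.mock import MagicMock


def write_text_if_absent(path, text):
    """Write ``text`` only when nothing exists at ``path``.

    Hypothetical helper: ``path`` is assumed to expose ``exists()`` and
    ``write_text()``. Returns True if the write happened, False otherwise.
    """
    if path.exists():
        return False
    path.write_text(text)
    return True


# Stand-ins for S3Path objects (assumption: used only so this runs offline).
existing = MagicMock()
existing.exists.return_value = True
missing = MagicMock()
missing.exists.return_value = False

skipped = write_text_if_absent(existing, "Hello Alice!")  # no write is issued
written = write_text_if_absent(missing, "Hello Alice!")   # write is issued
```

The existence check and the write are two separate requests, so another writer could still create the object in between. Treat this as a convenience guard rather than a lock.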

``s3path.write_bytes()`` and ``s3path.write_text()`` return a new object representing the object you just put. This is because, on a versioning-enabled bucket, the ``put_object`` API creates a new version of the object on every write, so the returned object has to represent that new version.

In [49]:
# in a regular bucket, there is no versioning
s3path_new = s3path.write_text("Hello Alice!")
print(s3path_new == s3path)
print(s3path_new is s3path)

False


In [55]:
# in a versioning-enabled bucket, write_text() creates a new version
s3path = S3Path("s3://s3pathlib-versioning-enabled/file.txt")
s3path_v1 = s3path.write_text("v1")
s3path_v2 = s3path.write_text("v2")

In [56]:
s3path_v1.read_text(version_id=s3path_v1.version_id)

'v1'

In [57]:
s3path_v2.read_text(version_id=s3path_v2.version_id)

'v2'

In [59]:
print(f"v1 = {s3path_v1.version_id}")
print(f"v2 = {s3path_v2.version_id}")

v1 = 5e6tkjttW8HOiKFTUlhzHDbQaYYd_Hnb
v2 = ujV73FmN0pPMIVFZ1YaviZyGeNCQ1a04
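
To make the difference between the two bucket types concrete, here is a toy in-memory model. It is an illustration only: ``ToyBucket`` and its methods are hypothetical names, the version IDs are made up, and nothing here talks to S3 or reflects how S3 is implemented. It shows only the behavior described above: a regular bucket keeps one copy per key, while a versioning-enabled bucket keeps every version and lets you read each one by ID.

```python
import itertools


class ToyBucket:
    """Toy stand-in for a bucket; an illustration, not a model of S3 internals."""

    def __init__(self, versioning_enabled):
        self.versioning_enabled = versioning_enabled
        self._history = {}  # key -> list of (version_id, body)
        self._counter = itertools.count(1)

    def put(self, key, body):
        if self.versioning_enabled:
            # Every put appends a new version with a fresh (made-up) ID.
            version_id = f"version-{next(self._counter)}"
            self._history.setdefault(key, []).append((version_id, body))
        else:
            # Without versioning, the previous content is simply replaced.
            version_id = None
            self._history[key] = [(version_id, body)]
        return version_id

    def get(self, key, version_id=None):
        history = self._history[key]
        if version_id is None:
            return history[-1][1]  # latest content
        return dict(history)[version_id]

    def count_versions(self, key):
        return len(self._history[key])


regular = ToyBucket(versioning_enabled=False)
regular.put("file.txt", "v1")
regular.put("file.txt", "v2")  # overwrites "v1"

versioned = ToyBucket(versioning_enabled=True)
id_v1 = versioned.put("file.txt", "v1")
id_v2 = versioned.put("file.txt", "v2")  # "v1" stays readable by its ID
```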


## File-like object IO

A [file object](https://docs.python.org/3/glossary.html#term-file-object) is an object exposing a file-oriented API (with methods such as ``read()`` or ``write()``) to an underlying resource. Depending on the way it was created, a file object can mediate access to a real on-disk file or to another type of storage or communication device (for example standard input/output, in-memory buffers, sockets, pipes, etc.). File objects are also called file-like objects or streams.

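``s3path.open()`` gives you a file-like object, so it works with libraries that read from or write to file objects, such as:

- [json](https://docs.python.org/3/library/json.html)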
- [yaml](https://pyyaml.org/wiki/PyYAMLDocumentation)
- [pandas](https://pandas.pydata.org/docs/reference/io.html)
- [polars](https://pola-rs.github.io/polars/py-polars/html/reference/io.html)
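
As a small illustration of why file-like objects matter, the sketch below writes JSON into an in-memory ``io.StringIO`` buffer. ``save_record`` is a hypothetical helper name; the point is that a function written against the file-object interface does not care whether the stream is an in-memory buffer, a local file, or the object returned by ``s3path.open()`` in the examples that follow.

```python
import io
import json


def save_record(record, f):
    """Serialize ``record`` as JSON into any writable text file object."""
    json.dump(record, f)


# An in-memory text buffer is a file-like object too.
buffer = io.StringIO()
save_record({"name": "Alice"}, buffer)

# Rewind the buffer and read the data back through the same interface.
buffer.seek(0)
loaded = json.load(buffer)
```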

### JSON

In [4]:
import json

s3path = S3Path("s3://s3pathlib/data.json")

# write to s3
with s3path.open("w") as f:
    json.dump({"name": "Alice"}, f)

In [5]:
# read from s3
with s3path.open("r") as f:
    print(json.load(f))

{'name': 'Alice'}


### YAML

In [36]:
import yaml

s3path = S3Path("s3://s3pathlib/config.yml")

# write to s3
with s3path.open("w") as f:
    yaml.dump({"name": "Alice"}, f)

In [37]:
# read from s3
with s3path.open("r") as f:
    print(yaml.load(f, Loader=yaml.SafeLoader))

{'name': 'Alice'}


### Pandas

In [25]:
import pandas as pd

s3path = S3Path("s3://s3pathlib/data.csv")

df = pd.DataFrame(
    [
        (1, "Alice"),
        (2, "Bob"),
    ],
    columns=["id", "name"]
)

# write to s3
with s3path.open("w") as f:
    df.to_csv(f, index=False)

In [27]:
# read from s3
with s3path.open("r") as f:
    df = pd.read_csv(f)
    print(df)

   id   name
0   1  Alice
1   2    Bob


### Polars

In [28]:
import polars as pl

s3path = S3Path("s3://s3pathlib/data.parquet")

df = pl.DataFrame(
    [
        (1, "Alice"),
        (2, "Bob"),
    ],
    schema=["id", "name"]
)

# write to s3
with s3path.open("wb") as f:
    df.write_parquet(f)

In [29]:
# read from s3
with s3path.open("rb") as f:
    df = pl.read_parquet(f)
    print(df)

shape: (2, 2)
┌─────┬───────┐
│ id  ┆ name  │
│ --- ┆ ---   │
│ i64 ┆ str   │
╞═════╪═══════╡
│ 1   ┆ Alice │
│ 2   ┆ Bob   │
└─────┴───────┘


## Metadata and Tagging

In [38]:
s3path = S3Path("s3://s3pathlib/file.txt")

In [39]:
s3path.write_text("Hello", metadata={"name": "alice", "age": "18"}, tags={"name": "alice", "age": "18"})

S3Path('s3://s3pathlib/file.txt')

In [40]:
s3path.metadata

{'age': '18', 'name': 'alice'}

In [43]:
s3path.get_tags()

(None, {'name': 'alice', 'age': '18'})

There is [no way to update only the metadata without also rewriting the content](https://docs.aws.amazon.com/AmazonS3/latest/userguide/UsingMetadata.html). You have to put the object again with the new metadata.

In [44]:
s3path.write_text("Hello", metadata={"name": "alice", "age": "24"})

S3Path('s3://s3pathlib/file.txt')

In [45]:
S3Path("s3://s3pathlib/file.txt").metadata

{'age': '18', 'name': 'alice'}
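
Because metadata can only change as part of a new put, a metadata update amounts to read, merge, and rewrite. The following is a minimal sketch under stated assumptions: ``update_metadata`` is a hypothetical helper, not an ``s3pathlib`` API, and it relies only on the ``metadata`` attribute, ``read_text()``, and ``write_text(..., metadata=...)`` shown above. It handles text content only and does not touch tags. A ``MagicMock`` stands in for the ``S3Path`` so the snippet runs without AWS credentials.

```python
from unittest.mock import MagicMock


def update_metadata(path, changes):
    """Rewrite the object at ``path`` with merged metadata.

    Hypothetical helper: reads the current content and metadata, applies
    ``changes`` on top, and puts the object again with the same content.
    """
    merged = {**path.metadata, **changes}
    content = path.read_text()
    return path.write_text(content, metadata=merged)


# Stand-in for an S3Path (assumption: used only so this runs offline).
path = MagicMock()
path.metadata = {"name": "alice", "age": "18"}
path.read_text.return_value = "Hello"

update_metadata(path, {"age": "24"})

# The keyword arguments of the single write_text call that was issued.
sent_metadata = path.write_text.call_args.kwargs["metadata"]
```

The whole object is downloaded and uploaded again, so this approach is only reasonable for small objects.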

## Delete, Copy, Move (Cut)



**Delete**

From 2.X.Y onward, ``delete`` is the recommended API for deleting:

- an object
- a directory
- a specific version of an object
- all versions of an object
- all versions of all objects in a directory

By default, if you try to delete everything in an S3 bucket, it prompts you to confirm the deletion. You can skip the confirmation by setting ``skip_prompt=True``.

In [17]:
s3dir = S3Path("s3://s3pathlib/tmp/")
s3dir.joinpath("README.txt").write_text("readme")
s3dir.joinpath("file.txt").write_text("Hello")
s3dir.joinpath("folder/file.txt").write_text("Hello")
s3dir.count_objects()

3

In [18]:
# Delete a file
s3path_readme = s3dir.joinpath("README.txt")
s3path_readme.delete()
s3path_readme.exists()

False

In [19]:
s3dir.count_objects()

2

In [20]:
# Delete the entire folder
s3dir.delete()
s3dir.count_objects()

0

In [24]:
# Delete a specific version of an object (permanently delete)
s3path = S3Path("s3://s3pathlib-versioning-enabled/file.txt")
s3path.delete(is_hard_delete=True)
v1 = s3path.write_text("v1").version_id
v2 = s3path.write_text("v2").version_id
v3 = s3path.write_text("v3").version_id
s3path.list_object_versions().all()

[S3Path('s3://s3pathlib-versioning-enabled/file.txt'),
 S3Path('s3://s3pathlib-versioning-enabled/file.txt'),
 S3Path('s3://s3pathlib-versioning-enabled/file.txt')]

In [25]:
s3path.delete(version_id=v1)
s3path.read_text(version_id=v1)

ClientError: An error occurred (NoSuchVersion) when calling the GetObject operation: The specified version does not exist.

In [26]:
s3path.list_object_versions().all()

[S3Path('s3://s3pathlib-versioning-enabled/file.txt'),
 S3Path('s3://s3pathlib-versioning-enabled/file.txt')]

In [27]:
# Delete all versions of an object (permanently delete)
s3path.delete(is_hard_delete=True)
s3path.list_object_versions().all()

[]

In [28]:
# Delete all versions of all objects in a directory (permanently delete)
s3dir = S3Path("s3://s3pathlib-versioning-enabled/tmp/")
s3path1 = s3dir.joinpath("file1.txt")
s3path2 = s3dir.joinpath("file2.txt")
s3dir.delete(is_hard_delete=True)
s3path1.write_text("v1")
s3path1.write_text("v2")
s3path2.write_text("v1")
s3path2.write_text("v2")
s3dir.list_object_versions().all()

[S3Path('s3://s3pathlib-versioning-enabled/tmp/file1.txt'),
 S3Path('s3://s3pathlib-versioning-enabled/tmp/file1.txt'),
 S3Path('s3://s3pathlib-versioning-enabled/tmp/file2.txt'),
 S3Path('s3://s3pathlib-versioning-enabled/tmp/file2.txt')]

In [29]:
s3dir.delete(is_hard_delete=True)
s3dir.list_object_versions().all()

[]

**Copy**

In [11]:
s3path_source = S3Path("s3://s3pathlib/source/data.json")
s3path_source.write_text("this is data")
s3path_target = s3path_source.change(new_dirname="target")
print(f"Copy {s3path_source.uri} to {s3path_target.uri} ...")
s3path_source.copy_to(s3path_target, overwrite=True)
print(f"content of {s3path_target.uri} is: {s3path_target.read_text()!r}")
print(f"{s3path_source} still exists: {s3path_source.exists()}")

Copy s3://s3pathlib/source/data.json to s3://s3pathlib/target/data.json ...
content of s3://s3pathlib/target/data.json is: 'this is data'
S3Path('s3://s3pathlib/source/data.json') still exists: True


**Move**

A move is a copy followed by a delete of the original object. ``move_to`` is a shortcut for calling ``copy_to`` and then ``delete``.

In [12]:
s3path_source = S3Path("s3://s3pathlib/source/config.yml")
s3path_source.write_text("this is config")
s3path_target = s3path_source.change(new_dirname="target")
print(f"Move {s3path_source.uri} to {s3path_target.uri} ...")
s3path_source.move_to(s3path_target, overwrite=True)
print(f"content of {s3path_target.uri} is: {s3path_target.read_text()!r}")
print(f"{s3path_source} still exists: {s3path_source.exists()}")

Move s3://s3pathlib/source/config.yml to s3://s3pathlib/target/config.yml ...
content of s3://s3pathlib/target/config.yml is: 'this is config'
S3Path('s3://s3pathlib/source/config.yml') still exists: False
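
The copy-then-delete idea can be sketched as follows. This is an illustration of the description above, not the actual ``move_to`` implementation: ``move`` is a hypothetical function that assumes only the ``copy_to(target, overwrite=...)`` and ``delete()`` methods used earlier, and ``MagicMock`` objects stand in for the ``S3Path`` objects so the snippet runs without AWS credentials.

```python
from unittest.mock import MagicMock


def move(source, target, overwrite=False):
    """Copy ``source`` to ``target``, then delete ``source``.

    Hypothetical sketch: if the copy raises, the delete is never reached,
    so the original object is left in place.
    """
    source.copy_to(target, overwrite=overwrite)
    source.delete()
    return target


# Stand-ins for S3Path objects (assumption: used only so this runs offline).
source = MagicMock()
target = MagicMock()

moved = move(source, target, overwrite=True)

# Names of the methods called on the source, in order.
call_order = [call[0] for call in source.mock_calls]
```

Doing the copy first matters: the original is removed only after the copy call returns without raising.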
