## Multipart Upload with AIStore SDK

This notebook demonstrates how to use AIStore's multipart upload functionality to efficiently upload large files by splitting them into smaller parts that can be uploaded concurrently.


### Setup: Create a client and bucket


In [None]:
from aistore import Client

# Connect to AIStore cluster
ais_url = "http://localhost:8080"
client = Client(ais_url)

# Create or get a bucket for our multipart upload examples
bucket = client.bucket("multipart-demo-bck").create(exist_ok=True)
print(f"Using bucket: {bucket.name}")


Using bucket: multipart-demo-bck


## Basic Multipart Upload Workflow

A multipart upload consists of three main steps, plus an optional abort:
1. **Create** a multipart upload session
2. **Add parts** by uploading content to each part
3. **Complete** the upload to assemble all parts into the final object
4. **Abort** (optional) to cancel the upload instead of completing it
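
The abort step from the list above is easiest to see as an error-handling pattern. The following is a minimal sketch, not the SDK's prescribed usage: the helper name `upload_parts_or_abort` is hypothetical, and it assumes the upload object exposes an `abort()` method alongside the `add_part()` and `complete()` calls used in this notebook.

```python
def upload_parts_or_abort(mpu, parts):
    """Upload parts in order and complete; abort the session on any failure.

    Assumptions: `mpu` offers add_part(n).put_content(data), complete(),
    and abort(). `parts` is a list of bytes objects; part numbers are
    assigned starting from 1.
    """
    try:
        for part_number, content in enumerate(parts, start=1):
            mpu.add_part(part_number).put_content(content)
        return mpu.complete()
    except Exception:
        # Cancel the session (step 4), then re-raise so the caller
        # still sees the original error.
        mpu.abort()
        raise
```

With a session created as `mpu = obj.multipart_upload().create()`, the call would look like `upload_parts_or_abort(mpu, [b"part one", b"part two"])`.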


### Example 1: Basic Multipart Upload


In [2]:
# Get a reference to the object we want to upload
obj = bucket.object("my-multipart-object")

# Step 1: Create a multipart upload session
mpu = obj.multipart_upload().create()
print(f"Created multipart upload with ID: {mpu.upload_id}")

# Step 2: Add parts (part numbers must start from 1)
part1_content = b"This is the content of part 1. "
part2_content = b"This is the content of part 2. "
part3_content = b"This is the content of part 3."

# Upload each part
mpu.add_part(1).put_content(part1_content)
mpu.add_part(2).put_content(part2_content)
mpu.add_part(3).put_content(part3_content)

# Step 3: Complete the upload
response = mpu.complete()
print(f"Multipart upload completed with status: {response.status_code}")

# Verify the final object content
final_content = obj.get_reader().read_all()
print(f"Final object size: {len(final_content)} bytes")


Created multipart upload with ID: N3eXTrPE4d
Multipart upload completed with status: 200
Final object size: 92 bytes
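
Example 1 uses three hand-written parts. For a real payload, the parts usually come from slicing one large byte string. The sketch below shows one way to do that in plain Python; `split_into_parts` and `part_size` are hypothetical names, not part of the AIStore SDK.

```python
def split_into_parts(data: bytes, part_size: int):
    """Yield (part_number, chunk) pairs, numbering parts from 1."""
    if part_size <= 0:
        raise ValueError("part_size must be positive")
    for index, offset in enumerate(range(0, len(data), part_size), start=1):
        yield index, data[offset:offset + part_size]

payload = b"x" * 10
parts = list(split_into_parts(payload, part_size=4))
print([(n, len(chunk)) for n, chunk in parts])  # [(1, 4), (2, 4), (3, 2)]
```

Each `(part_number, chunk)` pair can then be passed to `mpu.add_part(part_number).put_content(chunk)`, as in Example 1.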


### Example 2: Parallel Part Upload

For better performance with large files, you can upload parts in parallel using a thread pool.


In [3]:
import concurrent.futures

def upload_part(mpu, part_number, content):
    """Upload a single part and return its part number."""
    mpu.add_part(part_number).put_content(content)
    return part_number

# Prepare parts for parallel upload
obj = bucket.object("parallel-upload-object")
mpu = obj.multipart_upload().create()

# Create parts with substantial content
parts_data = []
for i in range(1, 6):  # 5 parts
    content = f"Parallel part {i} content: " * 200  # Larger content
    parts_data.append((i, content.encode()))

# Upload parts in parallel using ThreadPoolExecutor
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    # Submit all upload tasks
    futures = [
        executor.submit(upload_part, mpu, part_num, content)
        for part_num, content in parts_data
    ]
    
    # Collect results
    results = []
    for future in concurrent.futures.as_completed(futures):
        part_num = future.result()
        results.append(part_num)

# Complete the upload
mpu.complete()

print(f"Parallel upload completed: {results}")
print(f"Final object size: {len(obj.get_reader().read_all())} bytes")


Parallel upload completed: [1, 2, 3, 5, 4]
Final object size: 25000 bytes
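
The result list above is in completion order (`[1, 2, 3, 5, 4]`) because `as_completed` yields futures as they finish. If results in submission order are preferred, `executor.map` is one alternative. The sketch below uses a hypothetical `fake_upload` stand-in instead of a real part upload, so it runs without a cluster.

```python
import concurrent.futures
import random
import time

def fake_upload(part_number):
    """Stand-in for a real part upload: sleep briefly, then return the number."""
    time.sleep(random.uniform(0, 0.01))
    return part_number

part_numbers = [1, 2, 3, 4, 5]
with concurrent.futures.ThreadPoolExecutor(max_workers=3) as executor:
    # executor.map yields results in input order, regardless of completion order
    in_order = list(executor.map(fake_upload, part_numbers))

print(in_order)  # [1, 2, 3, 4, 5]
```

Either way, the order in which uploads finish does not affect the final object, since parts are assembled by part number.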


### Example 3: Part Number Guidelines and Out-of-Order Upload

- Part numbers must be positive consecutive integers starting with 1 (1, 2, 3, ...)
- Parts are assembled in part number order, not upload order
- You can upload parts out of order


In [4]:
# Demonstrate out-of-order upload
obj = bucket.object("out-of-order-object")
mpu = obj.multipart_upload().create()

# Part contents, keyed by part number
parts_content = {
    3: b"This should be third. ",
    1: b"This should be first. ",
    2: b"This should be second. "
}

# Upload in order: 3, 1, 2
for part_num in [3, 1, 2]:
    writer = mpu.add_part(part_num)
    writer.put_content(parts_content[part_num])
    print(f"Uploaded part {part_num}")

print(f"Parts uploaded in order: {mpu.parts}")

# Complete and verify order
mpu.complete()
final_content = obj.get_reader().read_all().decode()
print(f"Final content: '{final_content}'")
print("Content is assembled in part number order (1, 2, 3), not upload order!")


Uploaded part 3
Uploaded part 1
Uploaded part 2
Parts uploaded in order: [3, 1, 2]
Final content: 'This should be first. This should be second. This should be third. '
Content is assembled in part number order (1, 2, 3), not upload order!
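
The numbering rules for this example can be modeled locally. The following sketch is only an illustration of the rules as stated here, not how AIStore assembles objects internally; `assemble_in_part_order` is a hypothetical helper.

```python
def assemble_in_part_order(parts: dict) -> bytes:
    """Join part payloads by ascending part number after checking the numbering."""
    numbers = sorted(parts)
    expected = list(range(1, len(numbers) + 1))
    if numbers != expected:
        raise ValueError(
            f"part numbers must be 1..{len(numbers)} with no gaps, got {numbers}"
        )
    return b"".join(parts[n] for n in numbers)

# Parts recorded in upload order 3, 1, 2
uploaded = {3: b"third", 1: b"first ", 2: b"second "}
print(assemble_in_part_order(uploaded))  # b'first second third'
```

A mapping with a gap, such as `{1: b"a", 3: b"c"}`, raises `ValueError` in this sketch.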


## Summary

This notebook demonstrated the key aspects of multipart uploads in AIStore:

1. **Basic workflow**: create → add parts → complete
2. **Parallel uploads** for better performance with threading
3. **Part number rules**: consecutive positive integers starting at 1, assembled in part number order regardless of upload sequence