@codeflash-ai codeflash-ai bot commented Oct 21, 2025

📄 12% (0.12x) speedup for MultipartUploadManager._calculate_parts in src/together/filemanager.py

⏱️ Runtime: 36.4 microseconds → 32.5 microseconds (best of 664 runs)

📝 Explanation and details

The optimized code achieves a 12% speedup through two key algorithmic improvements:

**1. Bit shifting for power-of-2 multiplications**

- Replaced `* 1024 * 1024` with `<< 20` (a left shift by 20 bits)
- This eliminates two multiplication operations per constant calculation; a single shift is cheaper than repeated multiplication by powers of 2

**2. Manual ceiling division instead of `math.ceil()`**

- Replaced `math.ceil(file_size / target_part_size)` with integer division plus a conditional increment: `file_size // target_part_size`, then `num_parts += 1` when `file_size % target_part_size` is nonzero
- This avoids floating-point division, the call overhead of `math.ceil()`, and the `math` module import
- The same pattern is applied when calculating both `num_parts` and `part_size`
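As a quick sanity check, both rewrites are behavior-preserving for non-negative integers. A minimal standalone sketch (independent of the Together SDK):

```python
import math

# Power-of-2 multiplication vs. bit shift: identical integer results.
assert 100 * 1024 * 1024 == (100 << 20)  # 100 MB either way

# math.ceil over float division vs. manual integer ceiling division.
def ceil_div(a: int, b: int) -> int:
    q = a // b
    if a % b:      # any remainder rounds the quotient up
        q += 1
    return q

for a in (0, 1, 99, 100, 101, 10**12 + 7):
    assert ceil_div(a, 100) == math.ceil(a / 100)
```

Beyond speed, the integer form is also exact for arbitrarily large sizes, whereas `math.ceil(a / b)` goes through a float and can lose precision once the quotient exceeds 2**53.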

The optimizations are most effective for larger files that require multipart uploads, as shown in the test results where cases like "file_just_over_target_size" and "file_multiple_of_target_size" show 36%+ improvements. Small files that return early (single part) see minimal or slight slowdowns due to the additional conditional checks, but the overall performance gain is substantial for the primary use case of large file uploads.

These micro-optimizations compound because _calculate_parts() is called frequently during file upload operations, making the cumulative performance improvement significant.
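For orientation, here is a hypothetical reconstruction of the optimized method's logic, pieced together from the constants and behavior described above and in the generated tests. The real implementation lives in src/together/filemanager.py and may differ (for example, it may also clamp parts to a minimum size per MIN_PART_SIZE_MB):

```python
# Constants mirrored from the generated tests; assumed, not copied from the SDK.
TARGET_PART_SIZE = 100 << 20      # 100 MB, via bit shift instead of 100 * 1024 * 1024
MAX_MULTIPART_PARTS = 1000

def calculate_parts(file_size: int) -> tuple[int, int]:
    """Return (part_size, num_parts); a sketch of the described logic."""
    # Files at or below the target size upload as a single part (early return).
    if file_size <= TARGET_PART_SIZE:
        return file_size, 1
    # Manual ceiling division: how many target-sized parts cover the file?
    num_parts = file_size // TARGET_PART_SIZE
    if file_size % TARGET_PART_SIZE:
        num_parts += 1
    # Cap the part count; very large files get proportionally larger parts.
    if num_parts > MAX_MULTIPART_PARTS:
        num_parts = MAX_MULTIPART_PARTS
    # Same manual-ceiling pattern for the per-part size.
    part_size = file_size // num_parts
    if file_size % num_parts:
        part_size += 1
    return part_size, num_parts
```

The early single-part return is the path where the extra modulo checks never pay off, which matches the near-flat or slightly slower timings reported for the small-file tests.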

Correctness verification report:

| Test | Status |
|------|--------|
| ⚙️ Existing Unit Tests | 🔘 None Found |
| 🌀 Generated Regression Tests | 58 Passed |
| ⏪ Replay Tests | 6 Passed |
| 🔎 Concolic Coverage Tests | 4 Passed |
| 📊 Tests Coverage | 80.0% |
🌀 Generated Regression Tests and Runtime
import math

# imports
import pytest
from together.filemanager import MultipartUploadManager

# Constants for test (mocking together.constants)
MAX_CONCURRENT_PARTS = 10  # Not used in _calculate_parts
MAX_MULTIPART_PARTS = 1000
MIN_PART_SIZE_MB = 5
TARGET_PART_SIZE_MB = 100

# Mock TogetherClient (not used in the calculation)
class TogetherClient:
    pass


# Helper function to create a manager
def make_manager():
    return MultipartUploadManager(TogetherClient())

# ------------------ UNIT TESTS ------------------

# 1. BASIC TEST CASES

def test_small_file_exactly_target_size():
    # File size exactly at target part size (100MB)
    mgr = make_manager()
    file_size = TARGET_PART_SIZE_MB * 1024 * 1024
    part_size, num_parts = mgr._calculate_parts(file_size) # 596ns -> 658ns (9.42% slower)

def test_small_file_less_than_target():
    # File size less than target part size (e.g. 50MB)
    mgr = make_manager()
    file_size = 50 * 1024 * 1024
    part_size, num_parts = mgr._calculate_parts(file_size) # 611ns -> 574ns (6.45% faster)

def test_file_just_over_target_size():
    # File size just over 100MB
    mgr = make_manager()
    file_size = (TARGET_PART_SIZE_MB * 1024 * 1024) + 1
    part_size, num_parts = mgr._calculate_parts(file_size) # 1.61μs -> 1.18μs (36.7% faster)
    # Should split into 2 parts
    expected_num_parts = 2
    expected_part_size = math.ceil(file_size / expected_num_parts)
    assert num_parts == expected_num_parts
    assert part_size == expected_part_size

def test_file_multiple_of_target_size():
    # File size is a multiple of target part size (e.g. 300MB)
    mgr = make_manager()
    file_size = 3 * TARGET_PART_SIZE_MB * 1024 * 1024
    part_size, num_parts = mgr._calculate_parts(file_size) # 1.26μs -> 922ns (36.3% faster)

def test_file_just_above_min_part_size():
    # File size just above min part size (e.g. 5MB + 1 byte)
    mgr = make_manager()
    file_size = (MIN_PART_SIZE_MB * 1024 * 1024) + 1
    part_size, num_parts = mgr._calculate_parts(file_size) # 530ns -> 564ns (6.03% slower)

# 2. EDGE TEST CASES

def test_file_exactly_min_part_size():
    # File size exactly at min part size (5MB)
    mgr = make_manager()
    file_size = MIN_PART_SIZE_MB * 1024 * 1024
    part_size, num_parts = mgr._calculate_parts(file_size) # 543ns -> 552ns (1.63% slower)

def test_file_smaller_than_min_part_size():
    # File size smaller than min part size (e.g. 1MB)
    mgr = make_manager()
    file_size = 1 * 1024 * 1024
    part_size, num_parts = mgr._calculate_parts(file_size) # 560ns -> 522ns (7.28% faster)

def test_file_just_under_target_size():
    # File size just under 100MB
    mgr = make_manager()
    file_size = (TARGET_PART_SIZE_MB * 1024 * 1024) - 1
    part_size, num_parts = mgr._calculate_parts(file_size) # 518ns -> 530ns (2.26% slower)

def test_file_just_under_min_part_size_multiple_parts():
    # File size that would result in a part size just under min part size
    mgr = make_manager()
    # Set file size so that ceil(file_size / 2) < min_part_size
    min_part_size = MIN_PART_SIZE_MB * 1024 * 1024
    file_size = (min_part_size * 2) - 1
    part_size, num_parts = mgr._calculate_parts(file_size) # 501ns -> 527ns (4.93% slower)

def test_file_size_requires_max_parts():
    # File size large enough to require exactly MAX_MULTIPART_PARTS
    mgr = make_manager()
    target_part_size = TARGET_PART_SIZE_MB * 1024 * 1024
    file_size = MAX_MULTIPART_PARTS * target_part_size + 1  # Just over the limit
    part_size, num_parts = mgr._calculate_parts(file_size) # 1.63μs -> 1.54μs (5.77% faster)
    # Each part should be ceil(file_size / MAX_MULTIPART_PARTS)
    expected_part_size = math.ceil(file_size / MAX_MULTIPART_PARTS)
    assert num_parts == MAX_MULTIPART_PARTS
    assert part_size == expected_part_size

def test_file_size_requires_more_than_max_parts():
    # File size so large that it would need more than MAX_MULTIPART_PARTS parts
    mgr = make_manager()
    target_part_size = TARGET_PART_SIZE_MB * 1024 * 1024
    file_size = (MAX_MULTIPART_PARTS + 10) * target_part_size
    part_size, num_parts = mgr._calculate_parts(file_size) # 1.30μs -> 1.19μs (9.45% faster)

def test_file_size_not_multiple_of_part_size():
    # File size that is not a multiple of part size
    mgr = make_manager()
    file_size = (TARGET_PART_SIZE_MB * 1024 * 1024 * 3) + 123
    part_size, num_parts = mgr._calculate_parts(file_size) # 1.20μs -> 1.09μs (9.61% faster)

def test_file_size_one_byte():
    # File size of 1 byte
    mgr = make_manager()
    file_size = 1
    part_size, num_parts = mgr._calculate_parts(file_size) # 556ns -> 552ns (0.725% faster)

def test_file_size_zero():
    # File size of 0 bytes (edge case)
    mgr = make_manager()
    file_size = 0
    part_size, num_parts = mgr._calculate_parts(file_size) # 562ns -> 555ns (1.26% faster)

# 3. LARGE SCALE TEST CASES

def test_large_file_max_parts_min_part_size():
    # File size that requires MAX_MULTIPART_PARTS parts, each at min part size
    mgr = make_manager()
    min_part_size = MIN_PART_SIZE_MB * 1024 * 1024
    file_size = MAX_MULTIPART_PARTS * min_part_size
    part_size, num_parts = mgr._calculate_parts(file_size) # 1.46μs -> 1.19μs (22.9% faster)

def test_large_file_max_parts_plus_one_byte():
    # File size just over the max parts * min part size
    mgr = make_manager()
    min_part_size = MIN_PART_SIZE_MB * 1024 * 1024
    file_size = (MAX_MULTIPART_PARTS * min_part_size) + 1
    part_size, num_parts = mgr._calculate_parts(file_size) # 1.27μs -> 1.25μs (1.60% faster)

def test_large_file_multiple_gb():
    # File size of 500GB
    mgr = make_manager()
    file_size = 500 * 1024 * 1024 * 1024
    part_size, num_parts = mgr._calculate_parts(file_size) # 1.34μs -> 1.09μs (22.1% faster)

def test_large_file_with_non_divisible_size():
    # File size that is not divisible by target part size
    mgr = make_manager()
    file_size = (MAX_MULTIPART_PARTS - 1) * TARGET_PART_SIZE_MB * 1024 * 1024 + 12345
    part_size, num_parts = mgr._calculate_parts(file_size) # 1.22μs -> 1.34μs (8.81% slower)

def test_large_file_edge_case_min_part_size():
    # File size just enough to require all parts to be at min_part_size
    mgr = make_manager()
    min_part_size = MIN_PART_SIZE_MB * 1024 * 1024
    file_size = (MAX_MULTIPART_PARTS - 1) * min_part_size + 1
    part_size, num_parts = mgr._calculate_parts(file_size) # 1.24μs -> 1.23μs (0.570% faster)

# Parametrized test for a range of sizes (Basic to Large)
@pytest.mark.parametrize("file_size", [
    1,  # 1 byte
    1024,  # 1 KB
    1024*1024,  # 1 MB
    5*1024*1024,  # 5 MB
    50*1024*1024,  # 50 MB
    100*1024*1024,  # 100 MB
    101*1024*1024,  # 101 MB
    500*1024*1024,  # 500 MB
    1*1024*1024*1024,  # 1 GB
    10*1024*1024*1024,  # 10 GB
])
def test_various_sizes(file_size):
    # Ensure the function always returns a valid (part_size, num_parts) pair
    mgr = make_manager()
    part_size, num_parts = mgr._calculate_parts(file_size) # 9.10μs -> 8.06μs (12.9% faster)
    # The parts must cover the whole file without exceeding the part limit
    assert part_size * num_parts >= file_size
    assert 1 <= num_parts <= MAX_MULTIPART_PARTS
# codeflash_output is used to check that the output of the original code is the same as that of the optimized code.
#------------------------------------------------
from together.filemanager import MultipartUploadManager
from together.types.abstract import TogetherClient

def test_MultipartUploadManager__calculate_parts():
    client = TogetherClient(api_key='', base_url=None, timeout=None, max_retries=0, supplied_headers={})
    MultipartUploadManager(client)._calculate_parts(104857601)

def test_MultipartUploadManager__calculate_parts_2():
    client = TogetherClient(api_key='', base_url=None, timeout=None, max_retries=0, supplied_headers={})
    MultipartUploadManager(client)._calculate_parts(0)
⏪ Replay Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| test_pytest_testsunittest_multipart_upload_manager_py_testsunittest_files_upload_routing_py_testsunittest__replay_test_0.py::test_together_filemanager_MultipartUploadManager__calculate_parts | 6.85μs | 5.63μs | 21.7% ✅ |
🔎 Concolic Coverage Tests and Runtime
| Test File::Test Function | Original ⏱️ | Optimized ⏱️ | Speedup |
|---|---|---|---|
| codeflash_concolic_atws5rsq/tmpu3uu2bbi/test_concolic_coverage.py::test_MultipartUploadManager__calculate_parts | 1.44μs | 1.17μs | 22.6% ✅ |
| codeflash_concolic_atws5rsq/tmpu3uu2bbi/test_concolic_coverage.py::test_MultipartUploadManager__calculate_parts_2 | 564ns | 548ns | 2.92% ✅ |

To edit these changes, `git checkout codeflash/optimize-MultipartUploadManager._calculate_parts-mgzy2g7l` and push.

@codeflash-ai codeflash-ai bot requested a review from mashraf-222 October 21, 2025 02:27
@codeflash-ai codeflash-ai bot added the ⚡️ codeflash Optimization PR opened by Codeflash AI label Oct 21, 2025