
Conversation


Copilot AI commented Oct 27, 2025

Add test for subdirectory loading with data transfer profiling and optimize with blobless clone

Plan:

  • Create a custom HTTP handler to track bytes transferred
  • Add test for loading the specific EVE Online bot subdirectory from Viir/bots repo
  • Add profiling code to measure data transfer
  • Assert a reasonable bound on the transferred data
  • Verify test passes and data transfer is optimized
  • Fix code review feedback - header preservation bug
  • Pass CodeQL security scan
  • Optimize data transfer for subdirectory loading (shallow clone - 60% reduction)
  • Consolidate redundant code
  • Improve parameter flexibility
  • Document optimization limitations
  • Implement blobless clone with selective blob fetching
  • Consolidate redundancies to reduce code duplication
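
The first step in the plan — a custom HTTP handler that tracks bytes transferred — can be sketched as follows. This is an illustrative Python sketch only (the repository's own implementation is presumably a delegating HTTP message handler in its own language); the names here are hypothetical.

```python
# Minimal sketch: wrap a send callable and count bytes in both directions,
# so a test can assert a bound on total transferred data.
class ByteCountingTransport:
    def __init__(self, send):
        self._send = send          # the real transport being wrapped
        self.bytes_sent = 0
        self.bytes_received = 0

    def __call__(self, request_body):
        self.bytes_sent += len(request_body)
        response = self._send(request_body)
        self.bytes_received += len(response)
        return response


# Example: a fake transport standing in for the network.
fake_server = lambda request: b"PACK" + bytes(100)
transport = ByteCountingTransport(fake_server)
transport(b"0009done\n")
```

A test can then route all fetches through such a wrapper and assert, for example, `transport.bytes_received < 8 * 1024 * 1024`.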

Latest Update - Code Consolidation:

Eliminated redundant code to improve maintainability and reduce duplication:

Changes Made:

  1. Merged BuildUploadPackRequest and BuildUploadPackRequestWithFilter:

    • Combined into single method with optional filter parameter
    • Removed ~40 lines of duplicate code
    • Single source of truth for Git protocol request building
  2. Extracted FetchPackFileWithRequestBodyAsync helper:

    • Consolidated common HTTP request pattern used by FetchBloblessPackFileAsync and FetchSpecificObjectsAsync
    • Eliminates duplication of URL setup, refs discovery, and HTTP request handling
    • Both methods now delegate to this common helper
  3. Extracted ParsePackFileAndGetCommit helper:

    • Consolidates pack file parsing and commit extraction logic
    • Used by both LoadTreeContentsFromPackFile and LoadSubdirectoryContentsFromPackFile
    • Single implementation of pack file indexing and commit validation
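
The merged request builder described in item 1 can be sketched like this in Python. This is only an illustration of the pkt-line framing and the optional-parameter consolidation; the real method also negotiates protocol capabilities, which are omitted here.

```python
# Illustrative pkt-line encoding used by git's smart HTTP protocol:
# every line carries a 4-hex-digit length prefix that includes itself.
def pkt_line(payload):
    return "{:04x}{}".format(len(payload) + 4, payload)


# One parameterized builder instead of separate plain/filtered/shallow
# variants, mirroring the consolidation described above.
def build_upload_pack_request(want_sha, filter_spec=None, shallow_depth=None):
    lines = [pkt_line("want {}\n".format(want_sha))]
    if shallow_depth is not None:
        lines.append(pkt_line("deepen {}\n".format(shallow_depth)))
    if filter_spec is not None:
        lines.append(pkt_line("filter {}\n".format(filter_spec)))
    lines.append("0000")               # flush-pkt terminates the want section
    lines.append(pkt_line("done\n"))
    return "".join(lines)
```

With this shape, the plain, shallow, and blobless cases all flow through one code path, which is the "single source of truth" the consolidation aims for.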

Results:

  • ✅ Reduced code by 57 lines (from 830 to 773 lines)
  • ✅ Improved maintainability - changes to common logic only need to be made once
  • ✅ Enhanced robustness - single implementation reduces risk of inconsistencies
  • ✅ All 6 integration tests passing
  • ✅ All 3 unit tests passing
  • ✅ No functional changes - identical behavior maintained
Original prompt

Add a test which loads the file tree from this subdirectory: https://github.com/Viir/bots/tree/c42f50d6b4dc4640c62b1c3ecade7187eaabf888/implement/applications/eve-online/eve-online-combat-anomaly-bot
That repository is larger, so we now want to optimize the data transferred to correspond to the subdirectory. Add some profiling code that tracks the amount of transferred data, and have the test assert a bound.



Copilot AI and others added 2 commits October 27, 2025 20:28
…rofiling

Co-authored-by: Viir <19209696+Viir@users.noreply.github.com>
Co-authored-by: Viir <19209696+Viir@users.noreply.github.com>
Copilot AI changed the title [WIP] Add test to load file tree from subdirectory with data profiling Add test for subdirectory loading with data transfer profiling Oct 27, 2025
Copilot AI requested a review from Viir October 27, 2025 20:35

github-actions bot commented Oct 28, 2025

Test Results Overview macos-14

9 tests  +9   9 ✅ +9   4s ⏱️ +4s
2 suites +2   0 💤 ±0 
2 files   +2   0 ❌ ±0 

Results for commit 656b1cb. ± Comparison against base commit 1742625.

♻️ This comment has been updated with latest results.


github-actions bot commented Oct 28, 2025

Test Results Overview ubuntu-24.04

9 tests  ±0   9 ✅ ±0   3s ⏱️ -1s
2 suites ±0   0 💤 ±0 
2 files   ±0   0 ❌ ±0 

Results for commit 656b1cb. ± Comparison against base commit 1742625.

This pull request removes 1 and adds 1 tests. Note that renamed tests count towards both.
GitCore.IntegrationTests.LoadFromGitHubTests ‑ Placeholder
GitCore.IntegrationTests.LoadFromGitHubTests ‑ Load_relatively_small_subdirectory_from_larger_repository



Viir commented Oct 28, 2025

@copilot It looks like the implementation downloads the whole repository contents. The test fails because the downloaded amount is > 8 MB. Can you optimize it to only load a subset that is sufficient for the subdirectory the user requested?


github-actions bot commented Oct 28, 2025

Test Results Overview windows-2022

9 tests  ±0   9 ✅ ±0   3s ⏱️ +2s
2 suites ±0   0 💤 ±0 
2 files   ±0   0 ❌ ±0 

Results for commit 656b1cb. ± Comparison against base commit 1742625.

This pull request removes 1 and adds 1 tests. Note that renamed tests count towards both.
GitCore.IntegrationTests.LoadFromGitHubTests ‑ Placeholder
GitCore.IntegrationTests.LoadFromGitHubTests ‑ Load_relatively_small_subdirectory_from_larger_repository


…fer by 60%

Co-authored-by: Viir <19209696+Viir@users.noreply.github.com>

Copilot AI commented Oct 28, 2025

@copilot It looks like the implementation downloads the whole repository contents. The test fails because downloaded amount is > 8 MB. Can you optimize it to only load a subset that is sufficient for ...

Optimized subdirectory loading to reduce data transfer by ~60%. The implementation now uses Git's shallow clone capability (depth 1) to fetch only the specific commit without history, reducing transfer from ~8.7 MB to ~3.5 MB. Test now passes with subdirectory containing 490 KB of content across 23 files. (commit 7e86ca8)

Copilot AI changed the title Add test for subdirectory loading with data transfer profiling Add test for subdirectory loading with data transfer profiling and optimize with shallow fetch Oct 28, 2025

Viir commented Oct 28, 2025

@copilot consolidate the redundant code in BuildUploadPackRequest and similar

Co-authored-by: Viir <19209696+Viir@users.noreply.github.com>

Copilot AI commented Oct 28, 2025

@copilot consolidate the redundant code in BuildUploadPackRequest and similar

Consolidated BuildUploadPackRequest and BuildUploadPackRequestWithShallow into a single parameterized method with an optional useShallow parameter. This eliminates ~24 lines of duplicate code while maintaining identical functionality. (commit 909116e)


Viir commented Oct 28, 2025

@copilot make the parameter int? shallowDepth = null

Co-authored-by: Viir <19209696+Viir@users.noreply.github.com>

Copilot AI commented Oct 28, 2025

@copilot make the parameter int? shallowDepth = null

Changed parameter from bool useShallow to int? shallowDepth = null. The method now uses the depth value directly in the deepen command, making it more flexible and expressive. (commit 21e331a)


Viir commented Oct 28, 2025

@copilot shallow clone is nice, but it does not scale to our use case, because the download size is still independent from the selected subdirectory. How can we adapt cloning to not download contents of other directories?


Copilot AI commented Oct 28, 2025

@copilot shallow clone is nice, but it does not scale to our use case, because the download size is still independent from the selected subdirectory. How can we adapt cloning to not download contents ...

You're right - shallow clone reduces history but still fetches all files at that commit (~3.5 MB). True subdirectory-only fetching would require Git Protocol v2 with partial clone filters or a multi-stage approach: (1) fetch trees only with filter=blob:none, (2) navigate to the subdirectory tree SHA, (3) request only those specific blobs. However, GitHub's protocol support and the complexity of implementing multi-round-trip fetching with blob-by-blob requests make this challenging. The current shallow approach provides ~60% reduction but can't scale further without protocol v2 or sparse checkout support. Would you like me to explore implementing the multi-stage approach despite the added complexity?
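
Stage (2) of the multi-stage approach — walking the already-fetched trees to find which blobs to request — can be sketched as below. This is a simplified Python illustration of git's tree object format (entries are `<mode> <name>\0` followed by a 20-byte binary SHA); the helper names are hypothetical.

```python
# Sketch of stage (2): given tree objects already fetched with
# filter=blob:none, walk from the subdirectory's root tree and collect
# the blob SHAs that stage (3) must request explicitly.
def parse_tree_entries(tree_body):
    # Yield (mode, name, sha_hex) from a raw git tree object body.
    i = 0
    while i < len(tree_body):
        space = tree_body.index(b" ", i)
        nul = tree_body.index(b"\x00", space)
        yield (tree_body[i:space].decode(),
               tree_body[space + 1:nul].decode(),
               tree_body[nul + 1:nul + 21].hex())
        i = nul + 21


def blobs_under(trees, root_sha):
    # trees maps sha_hex -> raw tree body; returns the blob SHAs to fetch.
    wanted = []
    for mode, _name, sha in parse_tree_entries(trees[root_sha]):
        if mode == "40000":            # subtree entry: recurse
            wanted += blobs_under(trees, sha)
        else:                          # blob entry (e.g. mode 100644)
            wanted.append(sha)
    return wanted
```

Because only the blobs reachable from the subdirectory's tree end up in the want list, the final transfer scales with the subdirectory rather than the whole repository.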

Co-authored-by: Viir <19209696+Viir@users.noreply.github.com>

Viir commented Oct 28, 2025

@copilot For the optimization of cloning subdirectories from larger repositories, proceed as follows:
If a subdirectory is given, start with a blobless clone. Then use the trees to identify which blobs we need to request.
For further optimization, also add two function delegates which allow for configuring a cache for blobs that a consumer can maintain on their side. The first delegate asks for a blob by sha, returning a nullable content. The second delegate is to be invoked when we receive a blob's content, with sha and content as arguments, so that a caller can update their cache accordingly.

…port

Co-authored-by: Viir <19209696+Viir@users.noreply.github.com>

Copilot AI commented Oct 28, 2025

@copilot For the optimization of cloning subdirectories from larger repositories, proceed as follows:
If a subdirectory is given, start with a blobless clone. Then use the trees to identify which bl...

Implemented blobless clone with selective blob fetching and caching support. The optimization uses filter=blob:none to fetch only trees first, navigates to the subdirectory to identify needed blobs, then requests only those specific blobs. Added getBlobFromCache and storeBlobInCache delegates for consumer-side blob caching. Data transfer reduced from ~3.5 MB to ~131 KB (96% reduction), now scaling with subdirectory size instead of repository size. (commit c30e31f)
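
The two cache delegates (getBlobFromCache and storeBlobInCache in the PR) can be sketched in Python as plain callables. The `download_blob` parameter is a hypothetical stand-in for the network fetch; only cache misses reach it.

```python
# Sketch of consumer-side blob caching via two delegates: a lookup that
# may return None, and a store callback invoked on every fresh download.
def fetch_blobs(shas, download_blob,
                get_blob_from_cache=None, store_blob_in_cache=None):
    contents = {}
    for sha in shas:
        cached = get_blob_from_cache(sha) if get_blob_from_cache else None
        if cached is not None:
            contents[sha] = cached     # served from the consumer's cache
            continue
        content = download_blob(sha)   # cache miss: fetch from the server
        contents[sha] = content
        if store_blob_in_cache:
            store_blob_in_cache(sha, content)
    return contents
```

A plain dict works as a consumer-side cache: pass `cache.get` and `cache.__setitem__` as the two delegates, and repeated loads of the same subdirectory transfer almost nothing.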

Copilot AI changed the title Add test for subdirectory loading with data transfer profiling and optimize with shallow fetch Add test for subdirectory loading with data transfer profiling and optimize with blobless clone Oct 28, 2025

Viir commented Oct 28, 2025

@copilot consolidate the redundancies in BuildUploadPackRequest and BuildUploadPackRequestWithFilter. Also consolidate other redundancies where appropriate to reduce code duplication and improve robustness.

…rsing methods

Co-authored-by: Viir <19209696+Viir@users.noreply.github.com>

Copilot AI commented Oct 28, 2025

@copilot consolidate the redundancies in BuildUploadPackRequest and BuildUploadPackRequestWithFilter. Also consolidate other redundancies where appropriate to reduce code duplication and improve robus...

Consolidated redundant code across multiple methods. Merged BuildUploadPackRequest and BuildUploadPackRequestWithFilter into single method, extracted FetchPackFileWithRequestBodyAsync helper to eliminate HTTP request duplication, and created ParsePackFileAndGetCommit helper for common pack file parsing logic. Reduced codebase by 57 lines while maintaining identical functionality. (commit 7eb5066)

@Viir Viir force-pushed the copilot/add-file-tree-test-eve-online branch from a0f93f1 to fb51c1e on October 28, 2025 08:44
@Viir Viir marked this pull request as ready for review October 28, 2025 08:46
@Viir Viir merged commit ec8f540 into main Oct 28, 2025
9 checks passed
@Viir Viir deleted the copilot/add-file-tree-test-eve-online branch October 28, 2025 12:00