Title says it
Chat session about this at https://copilot.microsoft.com/shares/u6XR97qe9ZMMRqTpWcEjW
I have a huge GitHub monorepo and a local bundle containing a change to that repo. I need to create a PR from that bundle, but the repo is too big to clone its full history before applying the bundle. Is it possible to create the PR without doing a deep clone locally?
You can absolutely automate this — even when any of the three situations may occur — without ever needing a deep clone.
The key is to design a workflow that is robust to unknown bundle ancestry.
Below is a unified, automation‑friendly algorithm for applying external patches or bundles onto moving branches.
✅ One algorithm that handles ALL cases
This workflow works whether:
- the bundle’s base commit is still in the PR branch
- the PR branch has moved ahead
- the bundle’s base commit is missing
- the repo is huge
- history depth is unknown
It never requires a full clone.
🧩 Algorithm overview
- Shallow or partial clone the repo
- Fetch the PR branch
- Try to fetch the bundle
- If the bundle fetch fails due to missing base commits → incrementally deepen history
- Once the bundle imports cleanly → rebase or merge
- Push the updated branch
This is deterministic and scriptable.
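The "try the bundle, deepen on failure" decision can be made without importing anything, because `git bundle verify` exits non-zero when the repository lacks the bundle's prerequisite commits. The following is a minimal sketch under that assumption; `bundle_ready` and `bundle_tips` are hypothetical helper names, not Git commands.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not part of Git.

# Succeeds when the repository at $1 already contains every prerequisite
# commit that the bundle at $2 depends on; fails otherwise.
bundle_ready() {
    local repo=$1 bundle=$2
    git -C "$repo" bundle verify "$bundle" >/dev/null 2>&1
}

# Prints the tips stored in the bundle as "<sha> <refname>" lines, which is
# useful when the bundle does not record HEAD and a ref must be named.
bundle_tips() {
    local repo=$1 bundle=$2
    git -C "$repo" bundle list-heads "$bundle"
}

# Possible use inside a deepening loop:
#   until bundle_ready . /path/to/bundle.bundle; do
#       ...deepen history...
#   done
```

Checking first keeps the loop condition free of side effects: nothing is written to `FETCH_HEAD` until the bundle is known to apply.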
🛠️ Automation‑ready script (pseudocode)
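```bash
#!/usr/bin/env bash
set -euo pipefail

BUNDLE=/path/to/bundle.bundle
depth=50

# 1. Clone minimal repo: shallow history, blobs downloaded only on demand
git clone --depth="$depth" --filter=blob:none https://github.com/ORG/REPO.git
cd REPO

# 2. Fetch the PR branch (PR_NUMBER must be set by the caller)
git fetch --depth="$depth" origin "+pull/$PR_NUMBER/head:pr-branch"

# 3. Try to fetch the bundle
#    (with no refspec this fetches the bundle's HEAD; if the bundle has no
#    HEAD, name a ref shown by `git bundle list-heads "$BUNDLE"` instead)
# 4. Missing base commit: deepen history gradually until the bundle imports
while ! git fetch "$BUNDLE"; do
    depth=$((depth * 2))
    if [ "$depth" -gt 500000 ]; then
        echo "Bundle base commit unreachable even after deepening" >&2
        exit 1
    fi
    git fetch --depth="$depth" origin
    git fetch --depth="$depth" origin "+pull/$PR_NUMBER/head:pr-branch"
done

# 5. Create branch from bundle tip (FETCH_HEAD was last set by the bundle fetch)
git checkout -b feature FETCH_HEAD

# 6. Rebase onto updated PR branch
git rebase pr-branch

# 7. Push
git push origin feature
```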
This handles all cases automatically.
🧠 Why this works
Git only needs:
- the PR branch commits
- the bundle commits
- the base commit(s) connecting them
You never need the rest of the monorepo.
By deepening history only when needed, you avoid downloading gigabytes of irrelevant data.
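To see which connecting commits are actually needed, the bundle's own header can be read directly: per Git's bundle format, prerequisite commits are listed as lines starting with `-` before the first blank line. The sketch below relies on that layout; `bundle_prereqs` and `missing_prereqs` are hypothetical helper names.

```bash
#!/usr/bin/env bash
# Sketch only: helper names are illustrative, not part of Git.

# Prints the prerequisite commit IDs recorded in the text header of the
# bundle at $1. The header ends at the first blank line; the binary pack
# data after it is never parsed.
bundle_prereqs() {
    LC_ALL=C awk '/^$/ { exit } /^-/ { print substr($1, 2) }' "$1"
}

# Prints each prerequisite of the bundle at $2 that is not yet present as a
# commit in the repository at $1. Empty output means nothing is missing.
missing_prereqs() {
    local repo=$1 bundle=$2 sha
    bundle_prereqs "$bundle" | while read -r sha; do
        git -C "$repo" cat-file -e "${sha}^{commit}" 2>/dev/null || echo "$sha"
    done
}
```

An empty result from `missing_prereqs` means the local history already reaches every base commit, so no further deepening should be needed.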
🧨 Edge cases handled automatically
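✔ PR branch rebased
Your script deepens history until the bundle’s base commit is found, provided that commit is still reachable from a fetched branch.
✔ PR branch force‑pushed
Same as above: deepen until the base commit is reachable. If the force‑push left the base commit unreachable from every branch, this becomes the last case below.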
✔ Bundle created from a very old commit
The loop deepens history until the commit appears.
✔ Bundle created from a commit that no longer exists
The loop eventually hits a depth limit and fails cleanly.
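For the last case, the loop does not have to run all the way to its depth limit. Once deepening has pulled in the complete history, Git should no longer report the repository as shallow, and further deepening cannot help. A small sketch of that early-exit check, where `can_deepen` is a hypothetical helper name:

```bash
#!/usr/bin/env bash
# Sketch only: the helper name is illustrative, not part of Git.

# Succeeds while the repository at $1 still has truncated (shallow) history,
# i.e. while fetching with a larger --depth could still add commits.
can_deepen() {
    [ "$(git -C "$1" rev-parse --is-shallow-repository)" = "true" ]
}

# Possible use inside the deepening loop:
#   if ! can_deepen .; then
#       echo "Full history fetched; bundle base commit does not exist upstream" >&2
#       exit 1
#   fi
```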
🧭 Optional improvements
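- Use partial clone instead of shallow clone
A blobless partial clone downloads every commit and tree but defers file contents until they are needed. The bundle's base commit is then always present, so no deepening is required:
```bash
git clone --filter=blob:none --no-checkout https://github.com/ORG/REPO.git
```
- Use git fetch --negotiation-tip to reduce history fetches
This limits which local commits Git reports to the server as already present, which can shorten negotiation in repositories with many refs.
- Cache the repo between runs
If your automation runs frequently, caching avoids repeated fetches.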
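Caching between runs can be as simple as reusing an existing clone directory and fetching into it. The sketch below assumes the cache directory survives between runs; `ensure_clone` is a hypothetical helper name and the depth of 50 simply mirrors the starting depth used in the script.

```bash
#!/usr/bin/env bash
# Sketch only: the helper name and depth are illustrative assumptions.

# Reuses the clone at $2 when it already exists, otherwise creates a shallow
# clone of the repository at URL $1 without checking out any files.
ensure_clone() {
    local url=$1 dir=$2
    if [ -d "$dir/.git" ]; then
        # Cached clone: only download what changed since the last run.
        git -C "$dir" fetch --quiet origin
    else
        git clone --quiet --depth=50 --no-checkout "$url" "$dir"
    fi
}

# Possible use:
#   ensure_clone https://github.com/ORG/REPO.git /path/to/cache/REPO
```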
🎯 If you want, I can generate a production‑ready script
I can produce:
- a POSIX‑compliant shell script
- a Python script using subprocess
- a GitHub Actions workflow
- a Jenkins pipeline step
- a Bazel/Buildkite/TeamCity integration
Just tell me the format you want:
- shell script
- Python script
- GitHub Actions workflow
- Jenkins pipeline
This will give you a fully automated, robust solution.