Skip to content

Replicate CDN ranges automatically#139

Merged
blotus merged 5 commits intomasterfrom
external-allowlist-replication
Apr 1, 2026
Merged

Replicate CDN ranges automatically#139
blotus merged 5 commits intomasterfrom
external-allowlist-replication

Conversation

@blotus
Copy link
Copy Markdown
Member

@blotus blotus commented Mar 30, 2026

No description provided.

Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a scheduled GitHub Actions workflow to periodically download CDN/provider IP ranges from external sources, write them into repo-managed whitelist files, and open a PR with the updates.

Changes:

  • Introduces a new scheduled/dispatchable workflow to fetch, transform (optional jq), and write external source outputs.
  • Adds a JSON configuration file describing each external source (URL, output path, optional jq_filter/header).
  • Automates PR creation (and auto-merge) when outputs change.

Reviewed changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.

File Description
.github/workflows/sync-external-sources.yml New workflow to download/process sources, commit updates, open PR, and auto-merge.
.github/sync-sources.json Defines the external sources and output file locations (including AWS ip-ranges jq filters).

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines +39 to +90
for i in $(seq 0 $(( COUNT - 1 ))); do
NAME=$(jq -r ".[$i].name" "$CONFIG_FILE")
URL=$(jq -r ".[$i].url" "$CONFIG_FILE")
OUTPUT=$(jq -r ".[$i].output" "$CONFIG_FILE")
JQ_FILTER=$(jq -r ".[$i].jq_filter // empty" "$CONFIG_FILE")
HEADER=$(jq -r ".[$i].header // empty" "$CONFIG_FILE")

echo "--- Processing: $NAME ---"
echo " URL: $URL"
echo " Output: $OUTPUT"

TMPFILE=$(mktemp)
if ! curl -fsSL "$URL" -o "$TMPFILE"; then
echo "::warning::Failed to download $NAME from $URL"
ERRORS="$ERRORS $NAME"
rm -f "$TMPFILE"
continue
fi

OUTFILE=$(mktemp)

if [ -n "$HEADER" ]; then
echo "$HEADER" > "$OUTFILE"
else
: > "$OUTFILE"
fi

if [ -n "$JQ_FILTER" ]; then
if ! jq -r "$JQ_FILTER" "$TMPFILE" >> "$OUTFILE"; then
echo "::warning::jq filter failed for $NAME"
ERRORS="$ERRORS $NAME"
rm -f "$TMPFILE" "$OUTFILE"
continue
fi
else
cat "$TMPFILE" >> "$OUTFILE"
fi

rm -f "$TMPFILE"

mkdir -p "$(dirname "$OUTPUT")"

if [ -f "$OUTPUT" ] && cmp -s "$OUTFILE" "$OUTPUT"; then
echo " No changes for $NAME"
rm -f "$OUTFILE"
else
mv "$OUTFILE" "$OUTPUT"
echo " Updated: $OUTPUT"
CHANGED_FILES="$CHANGED_FILES $OUTPUT"
CHANGED_NAMES="$CHANGED_NAMES $NAME"
fi
done
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If .github/sync-sources.json is an empty array, seq 0 $((COUNT - 1)) becomes seq 0 -1 and the step will fail under set -e. Consider guarding for COUNT == 0 (skip loop) or iterating via jq -c '.[]' instead of index arithmetic.

Suggested change
for i in $(seq 0 $(( COUNT - 1 ))); do
NAME=$(jq -r ".[$i].name" "$CONFIG_FILE")
URL=$(jq -r ".[$i].url" "$CONFIG_FILE")
OUTPUT=$(jq -r ".[$i].output" "$CONFIG_FILE")
JQ_FILTER=$(jq -r ".[$i].jq_filter // empty" "$CONFIG_FILE")
HEADER=$(jq -r ".[$i].header // empty" "$CONFIG_FILE")
echo "--- Processing: $NAME ---"
echo " URL: $URL"
echo " Output: $OUTPUT"
TMPFILE=$(mktemp)
if ! curl -fsSL "$URL" -o "$TMPFILE"; then
echo "::warning::Failed to download $NAME from $URL"
ERRORS="$ERRORS $NAME"
rm -f "$TMPFILE"
continue
fi
OUTFILE=$(mktemp)
if [ -n "$HEADER" ]; then
echo "$HEADER" > "$OUTFILE"
else
: > "$OUTFILE"
fi
if [ -n "$JQ_FILTER" ]; then
if ! jq -r "$JQ_FILTER" "$TMPFILE" >> "$OUTFILE"; then
echo "::warning::jq filter failed for $NAME"
ERRORS="$ERRORS $NAME"
rm -f "$TMPFILE" "$OUTFILE"
continue
fi
else
cat "$TMPFILE" >> "$OUTFILE"
fi
rm -f "$TMPFILE"
mkdir -p "$(dirname "$OUTPUT")"
if [ -f "$OUTPUT" ] && cmp -s "$OUTFILE" "$OUTPUT"; then
echo " No changes for $NAME"
rm -f "$OUTFILE"
else
mv "$OUTFILE" "$OUTPUT"
echo " Updated: $OUTPUT"
CHANGED_FILES="$CHANGED_FILES $OUTPUT"
CHANGED_NAMES="$CHANGED_NAMES $NAME"
fi
done
if [ "$COUNT" -eq 0 ]; then
echo "No sources configured in $CONFIG_FILE; nothing to do."
else
for i in $(seq 0 $(( COUNT - 1 ))); do
NAME=$(jq -r ".[$i].name" "$CONFIG_FILE")
URL=$(jq -r ".[$i].url" "$CONFIG_FILE")
OUTPUT=$(jq -r ".[$i].output" "$CONFIG_FILE")
JQ_FILTER=$(jq -r ".[$i].jq_filter // empty" "$CONFIG_FILE")
HEADER=$(jq -r ".[$i].header // empty" "$CONFIG_FILE")
echo "--- Processing: $NAME ---"
echo " URL: $URL"
echo " Output: $OUTPUT"
TMPFILE=$(mktemp)
if ! curl -fsSL "$URL" -o "$TMPFILE"; then
echo "::warning::Failed to download $NAME from $URL"
ERRORS="$ERRORS $NAME"
rm -f "$TMPFILE"
continue
fi
OUTFILE=$(mktemp)
if [ -n "$HEADER" ]; then
echo "$HEADER" > "$OUTFILE"
else
: > "$OUTFILE"
fi
if [ -n "$JQ_FILTER" ]; then
if ! jq -r "$JQ_FILTER" "$TMPFILE" >> "$OUTFILE"; then
echo "::warning::jq filter failed for $NAME"
ERRORS="$ERRORS $NAME"
rm -f "$TMPFILE" "$OUTFILE"
continue
fi
else
cat "$TMPFILE" >> "$OUTFILE"
fi
rm -f "$TMPFILE"
mkdir -p "$(dirname "$OUTPUT")"
if [ -f "$OUTPUT" ] && cmp -s "$OUTFILE" "$OUTPUT"; then
echo " No changes for $NAME"
rm -f "$OUTFILE"
else
mv "$OUTFILE" "$OUTPUT"
echo " Updated: $OUTPUT"
CHANGED_FILES="$CHANGED_FILES $OUTPUT"
CHANGED_NAMES="$CHANGED_NAMES $NAME"
fi
done
fi

Copilot uses AI. Check for mistakes.
Comment on lines +102 to +105
fi

- name: Create PR with changes
if: env.NEEDS_UPDATE == 'true'
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The workflow records download/jq failures in ERRORS but still proceeds to open and auto-merge a PR if any other source changed. This can merge partial/incomplete updates. Consider failing the job when ERRORS is non-empty, or at least skipping PR creation/merge when any source failed.

Suggested change
fi
- name: Create PR with changes
if: env.NEEDS_UPDATE == 'true'
echo "HAS_ERRORS=true" >> "$GITHUB_ENV"
fi
- name: Create PR with changes
if: env.NEEDS_UPDATE == 'true' && env.HAS_ERRORS != 'true'

Copilot uses AI. Check for mistakes.
Comment on lines +125 to +126

gh pr merge --auto --merge "$PR_URL"
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Auto-merging content pulled directly from third-party URLs (gh pr merge --auto --merge) creates a supply-chain risk (a compromised upstream could push arbitrary changes into this repo). Consider removing auto-merge, requiring review, and/or validating that the generated files only contain expected IP/CIDR lines before creating/merging the PR.

Suggested change
gh pr merge --auto --merge "$PR_URL"

Copilot uses AI. Check for mistakes.
Comment on lines +1 to +7
name: Sync External Sources

on:
schedule:
- cron: "0 0 * * *" # every day at midnight UTC
workflow_dispatch: {}

Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There’s no concurrency configured, so a manual dispatch overlapping with the scheduled run (or a slow run overlapping the next schedule) can create multiple branches/PRs and potential merge conflicts. Consider adding a workflow-level concurrency group (e.g., based on workflow name) with cancel-in-progress: true.

Copilot uses AI. Check for mistakes.
Comment thread .github/sync-sources.json
{
"name": "cloudflare-ipv6",
"url": "https://www.cloudflare.com/ips-v6",
"output": "whitelists/cdn_providers/cloudflare_ip6s.txt"
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cloudflare_ip6s.txt is a slightly ambiguous filename ("ip6s" isn’t a common abbreviation elsewhere in this repo). Consider renaming to something clearer/consistent like cloudflare_ipv6.txt/cloudflare_ipv6s.txt to make it easier to discover and avoid accidental typos in downstream consumers.

Suggested change
"output": "whitelists/cdn_providers/cloudflare_ip6s.txt"
"output": "whitelists/cdn_providers/cloudflare_ipv6s.txt"

Copilot uses AI. Check for mistakes.
Comment on lines +85 to +89
mv "$OUTFILE" "$OUTPUT"
echo " Updated: $OUTPUT"
CHANGED_FILES="$CHANGED_FILES $OUTPUT"
CHANGED_NAMES="$CHANGED_NAMES $NAME"
fi
Copy link

Copilot AI Apr 1, 2026

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

CHANGED_FILES/CHANGED_NAMES are built by prefixing each append with a space, which leaves a leading space in the final values (affecting PR title/body formatting) and relies on word-splitting later. Consider using a newline-delimited list or a bash array and joining when needed.

Copilot uses AI. Check for mistakes.
@blotus blotus merged commit 4460cc1 into master Apr 1, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants