Skip to content
Merged
Show file tree
Hide file tree
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
26 changes: 26 additions & 0 deletions .github/workflows/_link_check.yml
Original file line number Diff line number Diff line change
Expand Up @@ -55,3 +55,29 @@ jobs:
echo "Or add \`@lint-ignore\` somewhere on the same line as the reference you want to skip checking."
exit 1
}

lint-file-size:
if: ${{ github.event_name == 'pull_request' }}
uses: pytorch/test-infra/.github/workflows/linux_job_v2.yml@main
with:
runner: linux.2xlarge
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do we have smaller runners we can use? this isn't particularly resource-intensive

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tried linux.large and docker image pulling failed. Bump it to 2x

docker-image: ci-image:executorch-ubuntu-22.04-linter
submodules: false
fetch-depth: 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

IIRC 0 is unlimited. couldn't we make this some large (because it gates the maximum depth of a stack) but finite number, like say 50?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I tried look up fetch-depth in both executorch and pytorch, and looks like all are defined as 0 or 1...I can't find definition for this variable so using the same as other jobs to be safe

ref: ${{ inputs.ref }}
timeout: 30
script: |
chmod +x ./scripts/lint_file_size.sh
./scripts/lint_file_size.sh $(
if [ "${{ github.event_name }}" = "pull_request" ]; then
echo "${{ github.event.pull_request.base.sha }}" "${{ github.event.pull_request.head.sha }}"
else
echo "${{ github.event.before }}" "${{ github.sha }}"
fi
) || {
echo
echo "File size lint failed: some files exceed the 1 MB limit."
echo "If you really need large files, consider using Git LFS or storing them elsewhere."
echo "If you really need to get unblocked and check in the file, can add it to the EXCEPTIONS list in scripts/lint_file_size.sh."
exit 1
}
73 changes: 73 additions & 0 deletions scripts/lint_file_size.sh
Original file line number Diff line number Diff line change
@@ -0,0 +1,73 @@
#!/bin/bash
# Copyright (c) Meta Platforms, Inc. and affiliates.
# All rights reserved.
#
# This source code is licensed under the BSD-style license found in the
# LICENSE file in the root directory of this source tree.

set -euo pipefail

status=0

green='\e[1;32m'; red='\e[1;31m'; cyan='\e[1;36m'; reset='\e[0m'

# Following is the rules for the file size linting:
# 1. For all files, the file size can't be larger than 1MB
# 2. For images/videos, the file size can't be larger than 8MB
# 3. There is an exception list defined in the script if it's really needed

# List of files to skip (relative paths)
EXCEPTIONS=(
"examples/models/llama/params/demo_rand_params.pth"
"examples/models/llama/tokenizer/test/resources/test_tiktoken_tokenizer.model"
"examples/qualcomm/oss_scripts/llama/artifacts/stories260k_hybrid_llama_qnn.pte"
# Following needs to be clean up
"examples/mediatek/models/llm_models/weights/Llama-3.2-1B-Instruct/tokenizer.json"
"examples/mediatek/models/llm_models/weights/Llama-3.2-3B-Instruct/tokenizer.json"
"examples/mediatek/models/llm_models/weights/llama3-8B-instruct/tokenizer.json"
)

is_exception() {
local f=$1
for ex in "${EXCEPTIONS[@]}"; do
if [[ "$f" == "$ex" ]]; then
return 0
fi
done
return 1
}

if [ $# -eq 2 ]; then
base=$1
head=$2
echo "Checking changed files between $base...$head"
files=$(git diff --name-only "$base...$head")
else
echo "Checking all files in repository"
files=$(git ls-files)
fi

for file in $files; do
if is_exception "$file"; then
echo -e "${cyan}SKIP${reset} $file (in exception list)"
continue
fi
if [ -f "$file" ]; then
# Set size limit depending on extension
if [[ "$file" =~ \.(png|jpg|jpeg|gif|svg|mp3|mp4)$ ]]; then
MAX_SIZE=$((8 * 1024 * 1024)) # 8 MB for pictures
else
MAX_SIZE=$((1 * 1024 * 1024)) # 1 MB for others
fi

size=$(wc -c <"$file")
if [ "$size" -gt "$MAX_SIZE" ]; then
echo -e "${red}FAIL${reset} $file (${cyan}${size} bytes${reset}) exceeds ${MAX_SIZE} bytes"
status=1
else
echo -e "${green}OK${reset} $file (${size} bytes)"
fi
fi
done

exit $status
Loading