-
Notifications
You must be signed in to change notification settings - Fork 685
Add CI for checking file size #14274
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Changes from all commits
0d07492
d33dc1d
2537273
6febe36
cfbadfa
b746e93
5bba482
8e846bf
4d3f622
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -55,3 +55,29 @@ jobs: | |
echo "Or add \`@lint-ignore\` somewhere on the same line as the reference you want to skip checking." | ||
exit 1 | ||
} | ||
|
||
lint-file-size: | ||
if: ${{ github.event_name == 'pull_request' }} | ||
uses: pytorch/test-infra/.github/workflows/linux_job_v2.yml@main | ||
with: | ||
runner: linux.2xlarge | ||
docker-image: ci-image:executorch-ubuntu-22.04-linter | ||
submodules: false | ||
fetch-depth: 0 | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. IIRC 0 is unlimited. couldn't we make this some large (because it gates the maximum depth of a stack) but finite number, like say 50? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I tried look up |
||
ref: ${{ inputs.ref }} | ||
timeout: 30 | ||
script: | | ||
chmod +x ./scripts/lint_file_size.sh | ||
./scripts/lint_file_size.sh $( | ||
if [ "${{ github.event_name }}" = "pull_request" ]; then | ||
echo "${{ github.event.pull_request.base.sha }}" "${{ github.event.pull_request.head.sha }}" | ||
else | ||
echo "${{ github.event.before }}" "${{ github.sha }}" | ||
fi | ||
) || { | ||
echo | ||
echo "File size lint failed: some files exceed the 1 MB limit." | ||
echo "If you really need large files, consider using Git LFS or storing them elsewhere." | ||
echo "If you really need to get unblocked and check in the file, can add it to the EXCEPTIONS list in scripts/lint_file_size.sh." | ||
exit 1 | ||
} |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,73 @@ | ||
#!/bin/bash | ||
# Copyright (c) Meta Platforms, Inc. and affiliates. | ||
# All rights reserved. | ||
# | ||
# This source code is licensed under the BSD-style license found in the | ||
# LICENSE file in the root directory of this source tree. | ||
|
||
set -euo pipefail | ||
|
||
status=0 | ||
|
||
green='\e[1;32m'; red='\e[1;31m'; cyan='\e[1;36m'; reset='\e[0m' | ||
|
||
# Following is the rules for the file size linting: | ||
# 1. For all files, the file size can't be larger than 1MB | ||
# 2. For images/videos, the file size can't be larger than 8MB | ||
# 3. There is an exception list defined in the script if it's really needed | ||
|
||
# List of files to skip (relative paths) | ||
EXCEPTIONS=( | ||
"examples/models/llama/params/demo_rand_params.pth" | ||
"examples/models/llama/tokenizer/test/resources/test_tiktoken_tokenizer.model" | ||
"examples/qualcomm/oss_scripts/llama/artifacts/stories260k_hybrid_llama_qnn.pte" | ||
# Following needs to be clean up | ||
"examples/mediatek/models/llm_models/weights/Llama-3.2-1B-Instruct/tokenizer.json" | ||
"examples/mediatek/models/llm_models/weights/Llama-3.2-3B-Instruct/tokenizer.json" | ||
"examples/mediatek/models/llm_models/weights/llama3-8B-instruct/tokenizer.json" | ||
) | ||
|
||
is_exception() { | ||
local f=$1 | ||
for ex in "${EXCEPTIONS[@]}"; do | ||
if [[ "$f" == "$ex" ]]; then | ||
return 0 | ||
fi | ||
done | ||
return 1 | ||
} | ||
|
||
if [ $# -eq 2 ]; then | ||
base=$1 | ||
head=$2 | ||
echo "Checking changed files between $base...$head" | ||
files=$(git diff --name-only "$base...$head") | ||
else | ||
echo "Checking all files in repository" | ||
files=$(git ls-files) | ||
fi | ||
|
||
for file in $files; do | ||
if is_exception "$file"; then | ||
echo -e "${cyan}SKIP${reset} $file (in exception list)" | ||
continue | ||
fi | ||
if [ -f "$file" ]; then | ||
# Set size limit depending on extension | ||
if [[ "$file" =~ \.(png|jpg|jpeg|gif|svg|mp3|mp4)$ ]]; then | ||
MAX_SIZE=$((8 * 1024 * 1024)) # 8 MB for pictures | ||
else | ||
MAX_SIZE=$((1 * 1024 * 1024)) # 1 MB for others | ||
fi | ||
|
||
size=$(wc -c <"$file") | ||
if [ "$size" -gt "$MAX_SIZE" ]; then | ||
echo -e "${red}FAIL${reset} $file (${cyan}${size} bytes${reset}) exceeds ${MAX_SIZE} bytes" | ||
status=1 | ||
else | ||
echo -e "${green}OK${reset} $file (${size} bytes)" | ||
fi | ||
fi | ||
done | ||
|
||
exit $status |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we have smaller runners we can use? this isn't particularly resource-intensive
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tried linux.large and docker image pulling failed. Bump it to 2x