Apply mlagility to create ONNX model in CI #605

Merged (57 commits) on Jun 20, 2023

Commits (57)
a5a82ef
apply mlagility to create ONNX model in CI
jcwchen Apr 27, 2023
9db5fb2
correct path
jcwchen Apr 27, 2023
65c5d7e
which benchit
jcwchen Apr 27, 2023
d913609
cd mlagility
jcwchen Apr 27, 2023
ccbaf72
pip install -e .
jcwchen Apr 27, 2023
b24081d
add models
jcwchen Apr 27, 2023
acf2326
../mlagility_models/alexnet_torch_hub_2891f54c/onnx/
jcwchen Apr 27, 2023
2731ca4
git diff --exit-code -- .
jcwchen Apr 27, 2023
398b63d
add opset 17 models
jcwchen May 2, 2023
f6c54a1
add faster-rcnn in CIs
jcwchen May 2, 2023
0c06ef3
cd
jcwchen May 2, 2023
ac5f5ec
use 18 instead of 17
jcwchen May 2, 2023
05c495d
Merge branch 'jcw/add-mlagility' of https://github.com/jcwchen/models…
jcwchen May 2, 2023
1a583ec
test opset_version 18
jcwchen May 2, 2023
1f1db90
git diff subdirectory
jcwchen May 3, 2023
500b542
torch==2.0.0 torchvision==0.15.1
jcwchen May 3, 2023
ab889ae
run_mlagility.py
jcwchen May 4, 2023
0380b2f
use better path to store .onnx
jcwchen May 5, 2023
6ff1014
print exception
jcwchen May 6, 2023
db69afb
copy
jcwchen May 6, 2023
a0f0339
+base_name
jcwchen May 6, 2023
dd41d77
correct path from mlagility
jcwchen May 6, 2023
2795d20
add another way to upload new model
jcwchen Jun 6, 2023
37cf55e
validate_model
jcwchen Jun 6, 2023
cea6c52
import os.path as osp
jcwchen Jun 6, 2023
19f6bc3
correct path
jcwchen Jun 6, 2023
934765a
from pathlib import Path
jcwchen Jun 6, 2023
a6f9618
pip install -r
jcwchen Jun 6, 2023
fa5d290
[]
jcwchen Jun 6, 2023
00e4e01
fix
jcwchen Jun 6, 2023
5edb28f
dir_path
jcwchen Jun 6, 2023
388806f
correct path
jcwchen Jun 6, 2023
943fc34
add bart.py
jcwchen Jun 7, 2023
d789edd
change name
jcwchen Jun 7, 2023
3b34a19
update path
jcwchen Jun 7, 2023
ff87370
bart
jcwchen Jun 7, 2023
28c7da1
not os.path.exists(model_name)
jcwchen Jun 8, 2023
d755004
"18"
jcwchen Jun 8, 2023
9eff439
correct
jcwchen Jun 8, 2023
e0f0ff0
change path
jcwchen Jun 14, 2023
358887a
add new-models to CI trigger
jcwchen Jun 14, 2023
bec94e5
add test_data_set
jcwchen Jun 14, 2023
2e99292
add .yml
jcwchen Jun 14, 2023
9a1c97a
ls
jcwchen Jun 14, 2023
6192ee7
stdout=sys.stdout
jcwchen Jun 14, 2023
53b3c98
change commit
jcwchen Jun 14, 2023
0158b32
ls .cache
jcwchen Jun 14, 2023
6196319
ls .cache/alexnet_torch_hub_7df2a577/onnx
jcwchen Jun 14, 2023
2f6d3ee
ls
jcwchen Jun 14, 2023
cfeacf1
stdout=sys.stdout
jcwchen Jun 14, 2023
38d44af
torch==2.0.0
jcwchen Jun 14, 2023
d7d3d27
mlagility_models_dir
jcwchen Jun 14, 2023
7af0369
new config
jcwchen Jun 14, 2023
355c6c3
add 3 more models
jcwchen Jun 15, 2023
4dae47a
add config
jcwchen Jun 15, 2023
0e1a570
remove bert and gpt2 for now
jcwchen Jun 15, 2023
0ef4ef4
clean path
jcwchen Jun 15, 2023
5 changes: 2 additions & 3 deletions .github/workflows/codeql.yml
@@ -13,10 +13,9 @@ name: "CodeQL"

on:
  push:
-    branches: [ "main" ]
+    branches: [ main, new-models]
  pull_request:
-    # The branches below must be a subset of the branches above
-    branches: [ "main" ]
+    branches: [ main, new-models]
  schedule:
    - cron: '31 11 * * 4'

13 changes: 4 additions & 9 deletions .github/workflows/linux_ci.yml
@@ -3,12 +3,11 @@

name: Linux CI

-# Triggers the workflow on push or pull request events but only for the main branch
on:
  push:
-    branches: [ main ]
+    branches: [ main, new-models]
  pull_request:
-    branches: [ main ]
+    branches: [ main, new-models]

jobs:
  # This workflow contains a single job called "build"
@@ -43,10 +42,6 @@ jobs:
        python workflow_scripts/generate_onnx_hub_manifest.py --target diff --drop
        git diff --exit-code -- ONNX_HUB_MANIFEST.json || { echo 'Please use "python workflow_scripts/generate_onnx_hub_manifest.py --target diff" to update ONNX_HUB_MANIFEST.json.' ; exit 1; }

-    - name: Test new models by onnx
+    - name: Test new models by onnx and onnxruntime
      run: |
-        python workflow_scripts/test_models.py --target onnx --drop
-
-    - name: Test new models by onnxruntime
-      run: |
-        python workflow_scripts/test_models.py --target onnxruntime --drop
+        python workflow_scripts/test_models.py --target all --drop
39 changes: 39 additions & 0 deletions .github/workflows/mlagility_validation.yml
@@ -0,0 +1,39 @@
name: Validate created ONNX model from mlagility

on:
  push:
    branches: [ main, new-models]
  pull_request:
    branches: [ main, new-models]

jobs:
  build:
    runs-on: ubuntu-latest
    strategy:
      matrix:
        python-version: ['3.8']

    steps:
      - uses: actions/checkout@v3
        name: Checkout repo
      - uses: conda-incubator/setup-miniconda@v2
        with:
          miniconda-version: "latest"
          activate-environment: mla
          python-version: ${{ matrix.python-version }}

      - name: Install dependencies and mlagility
        run: |
          python -m pip install --upgrade pip
          python -m pip install onnx onnxruntime requests py-cpuinfo

Review comment (Contributor):
Line 36 installs both onnx and onnxruntime. Do we need a separate pip install here?

Reply (Member Author):
I guess you are talking about this requirements file: https://github.com/groq/mlagility/blob/main/models/requirements.txt, but actually onnx and onnxruntime are not there?

Reply (Contributor):
Installation of MLAgility here installs onnx and onnxruntime.

          # Print CPU info for debugging ONNX Runtime inference difference
          python -m cpuinfo
          git clone https://github.com/groq/mlagility.git
          cd mlagility
          pip install -r models/requirements.txt
          pip install -e .

      - name: Validate created ONNX model from mlagility
        run: |
          pip install -r models/mlagility/requirements.txt
          python workflow_scripts/run_mlagility.py
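
To ground the dependency discussion above, here is a minimal sanity-check sketch (not part of this PR; purely illustrative) confirming that onnx and onnxruntime are importable once the "Install dependencies and mlagility" step has run:

# Illustrative sketch only: after installing mlagility, both onnx and
# onnxruntime should be importable, as the reviewer notes above.
import importlib

for pkg in ("onnx", "onnxruntime"):
    mod = importlib.import_module(pkg)  # raises ImportError if the package is missing
    print(pkg, getattr(mod, "__version__", "unknown"))
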
13 changes: 4 additions & 9 deletions .github/workflows/windows_ci.yml
@@ -3,12 +3,11 @@

name: Windows CI

-# Triggers the workflow on push or pull request events but only for the main branch
on:
  push:
-    branches: [ main ]
+    branches: [ main, new-models]
  pull_request:
-    branches: [ main ]
+    branches: [ main, new-models]

jobs:
  # This workflow contains a single job called "build"
@@ -37,10 +36,6 @@ jobs:
        # Print CPU info for debugging ONNX Runtime inference difference
        python -m cpuinfo

-    - name: Test new models by onnx
+    - name: Test new models by onnx and onnxruntime
      run: |
-        python workflow_scripts/test_models.py --target onnx --drop
-
-    - name: Test new models by onnxruntime
-      run: |
-        python workflow_scripts/test_models.py --target onnxruntime --drop
+        python workflow_scripts/test_models.py --target all --drop
3 changes: 3 additions & 0 deletions models/mlagility/alexnet/alexnet-18.onnx
Git LFS file not shown
3 changes: 3 additions & 0 deletions models/mlagility/alexnet/test_data_set_0/input_0.pb
Git LFS file not shown
3 changes: 3 additions & 0 deletions models/mlagility/alexnet/test_data_set_0/output_0.pb
Git LFS file not shown
Git LFS files not shown (5 additional LFS-tracked files)
2 changes: 2 additions & 0 deletions models/mlagility/requirements.txt
@@ -0,0 +1,2 @@
torch==2.0.0
torchvision==0.15.1

Review comment on lines +1 to +2 (Contributor):
These requirements are also covered by mlagility. Do we need a separate requirements file here?

Reply (Member Author):
IIUC, mlagility is still using torch<=1.14.0 according to https://github.com/groq/mlagility/blob/main/models/requirements.txt. ONNX needs PyTorch 2.0 here to convert opset_version 18.

Reply (Contributor):
MLAgility includes a setup.py that installs the base dependencies and a requirements.txt that installs additional packages for the mlagility/models. The setup.py installs torch>=1.12.1 here, which should install torch 2.0.
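
For context on the opset constraint discussed above, here is a minimal, hypothetical export sketch (not part of this PR; the model choice and output file name are illustrative): the opset_version=18 argument is what requires the torch==2.0.0 pinned in this requirements file.

# Illustrative sketch only: exporting a torchvision model at ONNX opset 18,
# which needs the torch==2.0.0 / torchvision==0.15.1 pinned above.
import torch
import torchvision

model = torchvision.models.alexnet(weights=None).eval()
dummy_input = torch.randn(1, 3, 224, 224)

# torch 2.0 accepts opset_version=18; older torch releases cap the exporter at a lower opset.
torch.onnx.export(model, dummy_input, "alexnet-op18-example.onnx", opset_version=18)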

3 changes: 3 additions & 0 deletions models/mlagility/resnet50/resnet50-18.onnx
Git LFS file not shown
3 changes: 3 additions & 0 deletions models/mlagility/resnet50/test_data_set_0/input_0.pb
Git LFS file not shown
3 changes: 3 additions & 0 deletions models/mlagility/resnet50/test_data_set_0/output_0.pb
Git LFS file not shown
9 changes: 8 additions & 1 deletion workflow_scripts/check_model.py
@@ -16,7 +16,7 @@ def has_vnni_support():

def run_onnx_checker(model_path):
    model = onnx.load(model_path)
-    onnx.checker.check_model(model)
+    onnx.checker.check_model(model, full_check=True)


def ort_skip_reason(model_path):
@@ -66,3 +66,10 @@ def run_backend_ort(model_path, test_data_set=None, tar_gz_path=None):
    ort_test_dir_utils.run_test_dir(test_dir_from_tar)
    # remove the produced test_dir from ORT
    test_utils.remove_onnxruntime_test_dir()
+
+def run_backend_ort_with_data(model_path):
+    skip_reason = ort_skip_reason(model_path)
+    if skip_reason:
+        print(skip_reason)
+        return
+    ort_test_dir_utils.run_test_dir(model_path)
6 changes: 6 additions & 0 deletions workflow_scripts/config.py
@@ -0,0 +1,6 @@
models_info = [
    # (script_path, model_name, model_zoo_directory)
    ("torch_hub/alexnet.py", "alexnet_torch_hub_7df2a577", "alexnet"),
    ("torch_hub/resnet50.py", "resnet50_torch_hub_31acb52e", "resnet50"),
    ("torchvision/fasterrcnn_resnet50_fpn_v2.py", "fasterrcnn_resnet50_fpn_v2_torchvision_ec445cac", "fasterrcnn_resnet50_fpn_v2"),
]
39 changes: 39 additions & 0 deletions workflow_scripts/run_mlagility.py
@@ -0,0 +1,39 @@
import config
import os.path as osp
from pathlib import Path
import shutil
import subprocess
import sys


ZOO_OPSET_VERSION = "18"
base_name = f"-op{ZOO_OPSET_VERSION}-base.onnx"
cwd_path = Path.cwd()
mlagility_root = "mlagility/models"
mlagility_models_dir = "models/mlagility"
cache_converted_dir = ".cache"

errors = 0

for script_path, model_name, model_zoo_dir in config.models_info:
    try:
        print(f"----------------Checking {model_zoo_dir}----------------")
        final_model_path = osp.join(mlagility_models_dir, model_zoo_dir, f"{model_zoo_dir}-{ZOO_OPSET_VERSION}.onnx")
        subprocess.run(["benchit", osp.join(mlagility_root, script_path), "--cache-dir", cache_converted_dir,
                        "--onnx-opset", ZOO_OPSET_VERSION, "--export-only"],
                       cwd=cwd_path, stdout=sys.stdout,
                       stderr=sys.stderr)
        shutil.copy(osp.join(cache_converted_dir, model_name, "onnx", model_name + base_name), final_model_path)
        subprocess.run(["git", "diff", "--exit-code", "--", final_model_path],
                       cwd=cwd_path, stdout=sys.stdout,
                       stderr=sys.stderr)
        print(f"Successfully checked {model_zoo_dir}.")
    except Exception as e:
        errors += 1
        print(f"Failed to check {model_zoo_dir} because of {e}.")

if errors > 0:
    print(f"All {len(config.models_info)} model(s) have been checked, but {errors} model(s) failed.")
    sys.exit(1)
else:
    print(f"All {len(config.models_info)} model(s) have been checked.")
29 changes: 27 additions & 2 deletions workflow_scripts/test_models.py
@@ -15,7 +15,7 @@

def get_all_models():
    model_list = []
-    for directory in ["text", "vision"]:
+    for directory in ["text", "vision", "models"]:
        for root, _, files in os.walk(directory):
            for file in files:
                if file.endswith(tar_ext_name) or file.endswith(onnx_ext_name):
@@ -100,10 +100,35 @@ def main():
                print("[PASS] {} is checked by onnx. ".format(model_name))
            # check uploaded standalone ONNX model by ONNX
            elif onnx_ext_name in model_name:
-                test_utils.pull_lfs_file(model_path)
+                if args.target == "onnx" or args.target == "all":
+                    test_utils.pull_lfs_file(model_path)
                check_model.run_onnx_checker(model_path)
                print("[PASS] {} is checked by onnx. ".format(model_name))
+                if args.target == "onnxruntime" or args.target == "all":
+                    try:
+                        # git lfs pull those test_data_set_* folders
+                        root_dir = Path(model_path).parent
+                        for _, dirs, _ in os.walk(root_dir):
+                            for dir in dirs:
+                                if "test_data_set_" in dir:
+                                    test_data_set_dir = os.path.join(root_dir, dir)
+                                    for _, _, files in os.walk(test_data_set_dir):
+                                        for file in files:
+                                            if file.endswith(".pb"):
+                                                test_utils.pull_lfs_file(os.path.join(test_data_set_dir, file))
+                        check_model.run_backend_ort_with_data(model_path)
+                        print("[PASS] {} is checked by onnxruntime. ".format(model_name))
+                    except Exception as e:
+                        if not args.create:
+                            raise
+                        else:
+                            print("Warning: original test data for {} is broken: {}".format(model_path, e))
+                            test_utils.remove_onnxruntime_test_dir()
+                            if (not model_name.endswith("-int8.onnx") and not model_name.endswith("-qdq.onnx")) or check_model.has_vnni_support():
+                                check_model.run_backend_ort(model_path, None, model_path)
+                            else:
+                                print("Skip quantized models because their test_data_set was created in avx512vnni machines. ")
+                            print("[PASS] {} is checked by onnxruntime. ".format(model_name))

        except Exception as e:
            print("[FAIL] {}: {}".format(model_name, e))