
Commit

Merge branch 'main' into LayoutLMv3-TFLite-conversion-support
salmanmaq committed Apr 26, 2024
2 parents 382bda3 + c55f882 commit 4901d1d
Showing 48 changed files with 1,029 additions and 104 deletions.
10 changes: 10 additions & 0 deletions .github/PULL_REQUEST_TEMPLATE.md
Original file line number Diff line number Diff line change
@@ -20,3 +20,13 @@ Fixes # (issue)
- [ ] Did you make sure to update the documentation with your changes?
- [ ] Did you write any new necessary tests?

## Who can review?

<!--
For faster review, we strongly recommend that you ping the following people:
- ONNX / ONNX Runtime : @fxmarty, @echarlaix, @JingyaHuang, @michaelbenayoun
- ONNX Runtime Training: @JingyaHuang
- BetterTransformer: @fxmarty
- GPTQ, quantization: @fxmarty, @SunMarc
- TFLite export: @michaelbenayoun
-->
20 changes: 18 additions & 2 deletions .github/workflows/build_main_documentation.yml
@@ -49,10 +49,14 @@ jobs:
repository: 'huggingface/optimum-amd'
path: optimum-amd

- uses: actions/checkout@v2
with:
repository: 'huggingface/optimum-tpu'
path: optimum-tpu

- name: Free disk space
run: |
df -h
sudo apt-get update
sudo apt-get purge -y '^apache.*'
sudo apt-get purge -y '^imagemagick.*'
sudo apt-get purge -y '^dotnet.*'
@@ -133,6 +137,8 @@ jobs:
run: |
cd optimum-furiosa
pip install .
sudo apt install software-properties-common
sudo add-apt-repository --remove https://packages.microsoft.com/ubuntu/22.04/prod
sudo apt update
sudo apt install -y ca-certificates apt-transport-https gnupg
sudo apt-key adv --keyserver keyserver.ubuntu.com --recv-key 5F03AFA423A751913F249259814F888B20B09A7E
@@ -150,6 +156,16 @@
mv furiosa-doc-build ../optimum
cd ..
- name: Make TPU documentation
run: |
sudo docker system prune -a -f
cd optimum-tpu
pip install -U pip
pip install . -f https://storage.googleapis.com/libtpu-releases/index.html
doc-builder build optimum.tpu docs/source/ --build_dir tpu-doc-build --version pr_$PR_NUMBER --version_tag_suffix "" --html --clean
mv tpu-doc-build ../optimum
cd ..
- name: Make AMD documentation
run: |
sudo docker system prune -a -f
@@ -171,7 +187,7 @@ jobs:
- name: Combine subpackage documentation
run: |
cd optimum
sudo python docs/combine_docs.py --subpackages nvidia amd intel neuron habana furiosa --version ${{ env.VERSION }}
sudo python docs/combine_docs.py --subpackages nvidia amd intel neuron tpu habana furiosa --version ${{ env.VERSION }}
cd ..
- name: Push to repositories
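The "Combine subpackage documentation" step above stitches each accelerator's HTML build into one tree before pushing. As a rough, stdlib-only sketch of that kind of merge (the helper function and the `<name>-doc-build` layout are assumptions for illustration, not the actual `docs/combine_docs.py`):

```python
import shutil
from pathlib import Path

def combine_subpackage_docs(build_root, subpackages, out_dir):
    """Copy each subpackage's built docs under one shared tree.

    Assumed layout: <build_root>/<name>-doc-build holds the HTML build
    for subpackage <name>; it is copied to <out_dir>/<name>.
    """
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    for name in subpackages:
        src = Path(build_root) / f"{name}-doc-build"
        if src.is_dir():  # tolerate subpackages that were not built
            shutil.copytree(src, out / name, dirs_exist_ok=True)
    return out
```

A call such as `combine_subpackage_docs(".", ["nvidia", "amd", "tpu"], "optimum")` would mirror the per-accelerator layout the workflow assembles.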
17 changes: 16 additions & 1 deletion .github/workflows/build_pr_documentation.yml
@@ -53,6 +53,11 @@ jobs:
repository: 'huggingface/optimum-amd'
path: optimum-amd

- uses: actions/checkout@v2
with:
repository: 'huggingface/optimum-tpu'
path: optimum-tpu

- name: Setup environment
run: |
pip uninstall -y doc-builder
@@ -91,6 +96,16 @@
sudo mv amd-doc-build ../optimum
cd ..
- name: Make TPU documentation
run: |
sudo docker system prune -a -f
cd optimum-tpu
pip install -U pip
pip install . -f https://storage.googleapis.com/libtpu-releases/index.html
doc-builder build optimum.tpu docs/source/ --build_dir tpu-doc-build --version pr_$PR_NUMBER --version_tag_suffix "" --html --clean
mv tpu-doc-build ../optimum
cd ..
- name: Make Optimum documentation
run: |
sudo docker system prune -a -f
@@ -101,7 +116,7 @@
- name: Combine subpackage documentation
run: |
cd optimum
sudo python docs/combine_docs.py --subpackages nvidia amd intel neuron habana furiosa --version pr_$PR_NUMBER
sudo python docs/combine_docs.py --subpackages nvidia amd intel neuron tpu habana furiosa --version pr_$PR_NUMBER
sudo mv optimum-doc-build ../
cd ..
4 changes: 2 additions & 2 deletions .github/workflows/dev_test_bettertransformer.yml
@@ -16,7 +16,7 @@ jobs:
- 3.8
os:
- ubuntu-20.04
- macos-latest
- macos-13
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v2
@@ -35,4 +35,4 @@
- name: Test with unittest
working-directory: tests
run: |
python -m unittest discover -s bettertransformer -p test_*.py
python -m unittest discover -s bettertransformer -p test_*.py
4 changes: 2 additions & 2 deletions .github/workflows/dev_test_dummy_inputs.yml
@@ -17,7 +17,7 @@ jobs:
- 3.9
os:
- ubuntu-20.04
- macos-latest
- macos-13
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v2
@@ -35,4 +35,4 @@
- name: Test with unittest
working-directory: tests
run: |
python -m unittest discover -s utils -p test_*.py
python -m unittest discover -s utils -p test_*.py
4 changes: 2 additions & 2 deletions .github/workflows/dev_test_fx.yml
@@ -17,7 +17,7 @@ jobs:
- 3.9
os:
- ubuntu-20.04
- macos-latest
- macos-13
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v2
@@ -35,4 +35,4 @@
- name: Test with unittest
working-directory: tests
run: |
python -m pytest fx/optimization/test_transformations.py --exitfirst
python -m pytest fx/optimization/test_transformations.py --exitfirst
4 changes: 2 additions & 2 deletions .github/workflows/dev_test_onnx.yml
@@ -17,7 +17,7 @@ jobs:
- 3.9
os:
- ubuntu-20.04
- macos-latest
- macos-13
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v2
@@ -34,4 +34,4 @@
- name: Test with unittest
working-directory: tests
run: |
python -m unittest discover -s onnx -p test_*.py
python -m unittest discover -s onnx -p test_*.py
4 changes: 2 additions & 2 deletions .github/workflows/dev_test_onnxruntime.yml
@@ -18,7 +18,7 @@ jobs:
os:
- ubuntu-20.04
- windows-2019
- macos-latest
- macos-13
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v2
@@ -36,4 +36,4 @@
working-directory: tests
run: |
python -m pytest -n auto -m "not run_in_series" onnxruntime
python -m pytest -m "run_in_series" onnxruntime
python -m pytest -m "run_in_series" onnxruntime
4 changes: 2 additions & 2 deletions .github/workflows/dev_test_optimum_common.yml
@@ -19,7 +19,7 @@ jobs:
os:
- ubuntu-20.04
- windows-2019
- macos-latest
- macos-13
runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v2
@@ -42,4 +42,4 @@
as the staging tests cannot run in parallel.
export HUGGINGFACE_CO_STAGING=${{ matrix.python-version == 3.8 && matrix.os
== ubuntu-20.04 }}
python -m unittest discover -s tests -p test_*.py
python -m unittest discover -s tests -p test_*.py
2 changes: 1 addition & 1 deletion .github/workflows/test_bettertransformer.yml
@@ -16,7 +16,7 @@ jobs:
fail-fast: false
matrix:
python-version: [3.8, 3.9]
os: [ubuntu-20.04, macos-latest]
os: [ubuntu-20.04, macos-13]

runs-on: ${{ matrix.os }}
steps:
2 changes: 1 addition & 1 deletion .github/workflows/test_cli.yml
@@ -18,7 +18,7 @@ jobs:
fail-fast: false
matrix:
python-version: [3.8, 3.9]
os: [ubuntu-20.04, macos-latest]
os: [ubuntu-20.04, macos-13]

runs-on: ${{ matrix.os }}
steps:
2 changes: 1 addition & 1 deletion .github/workflows/test_dummy_inputs.yml
@@ -18,7 +18,7 @@ jobs:
fail-fast: false
matrix:
python-version: [3.8, 3.9]
os: [ubuntu-20.04, macos-latest]
os: [ubuntu-20.04, macos-13]

runs-on: ${{ matrix.os }}
steps:
2 changes: 1 addition & 1 deletion .github/workflows/test_fx.yml
@@ -16,7 +16,7 @@ jobs:
fail-fast: false
matrix:
python-version: [3.8, 3.9]
os: [ubuntu-20.04, macos-latest]
os: [ubuntu-20.04, macos-13]

runs-on: ${{ matrix.os }}
steps:
43 changes: 43 additions & 0 deletions .github/workflows/test_offline.yml
@@ -0,0 +1,43 @@
name: Offline usage / Python - Test

on:
push:
branches: [ main ]
pull_request:
branches: [ main ]

concurrency:
group: ${{ github.workflow }}-${{ github.head_ref || github.run_id }}
cancel-in-progress: true

jobs:
build:
strategy:
fail-fast: false
matrix:
python-version: [3.9]
os: [ubuntu-20.04]

runs-on: ${{ matrix.os }}
steps:
- uses: actions/checkout@v2
- name: Setup Python ${{ matrix.python-version }}
uses: actions/setup-python@v2
with:
python-version: ${{ matrix.python-version }}
- name: Install dependencies for pytorch export
run: |
pip install .[tests,exporters,onnxruntime]
- name: Test with unittest
run: |
HF_HOME=/tmp/ huggingface-cli download hf-internal-testing/tiny-random-gpt2
HF_HOME=/tmp/ HF_HUB_OFFLINE=1 optimum-cli export onnx --model hf-internal-testing/tiny-random-gpt2 gpt2_onnx --task text-generation
huggingface-cli download hf-internal-testing/tiny-random-gpt2
HF_HUB_OFFLINE=1 optimum-cli export onnx --model hf-internal-testing/tiny-random-gpt2 gpt2_onnx --task text-generation
pytest tests/onnxruntime/test_modeling.py -k "test_load_model_from_hub and not from_hub_onnx" -s -vvvvv
HF_HUB_OFFLINE=1 pytest tests/onnxruntime/test_modeling.py -k "test_load_model_from_hub and not from_hub_onnx" -s -vvvvv
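The new workflow above pre-downloads a model, then re-runs the export with `HF_HUB_OFFLINE=1` to prove the cached copy is used without network access. A minimal sketch of the environment-variable gating pattern it exercises; the function and exception names here are hypothetical stand-ins, not the `huggingface_hub` implementation:

```python
import os

class OfflineModeError(RuntimeError):
    """Raised when a download is attempted while offline mode is enabled."""

def offline_mode_enabled():
    # HF_HUB_OFFLINE=1 is the flag the workflow toggles; this sketch
    # accepts a few truthy spellings.
    return os.environ.get("HF_HUB_OFFLINE", "").strip().lower() in {"1", "true", "yes"}

def fetch_or_cache(repo_id, cache):
    """Return a cached artifact; refuse to touch the network when offline."""
    if repo_id in cache:
        return cache[repo_id]
    if offline_mode_enabled():
        raise OfflineModeError(f"'{repo_id}' is not cached and HF_HUB_OFFLINE is set")
    cache[repo_id] = f"downloaded:{repo_id}"  # stand-in for a real download
    return cache[repo_id]
```

Seeding the cache first and then setting the flag reproduces, in miniature, the download-then-go-offline sequence the test steps follow.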
2 changes: 1 addition & 1 deletion .github/workflows/test_onnx.yml
@@ -16,7 +16,7 @@ jobs:
fail-fast: false
matrix:
python-version: [3.8, 3.9]
os: [ubuntu-20.04, macos-latest]
os: [ubuntu-20.04, macos-13]

runs-on: ${{ matrix.os }}
steps:
2 changes: 1 addition & 1 deletion .github/workflows/test_onnxruntime.yml
@@ -18,7 +18,7 @@ jobs:
fail-fast: false
matrix:
python-version: [3.8, 3.9]
os: [ubuntu-20.04, windows-2019, macos-latest]
os: [ubuntu-20.04, windows-2019, macos-13]

runs-on: ${{ matrix.os }}
steps:
2 changes: 1 addition & 1 deletion .github/workflows/test_optimum_common.yml
@@ -18,7 +18,7 @@ jobs:
fail-fast: false
matrix:
python-version: [3.8, 3.9]
os: [ubuntu-20.04, windows-2019, macos-latest]
os: [ubuntu-20.04, windows-2019, macos-13]

runs-on: ${{ matrix.os }}
steps:
34 changes: 20 additions & 14 deletions README.md
@@ -14,16 +14,18 @@ python -m pip install optimum

If you'd like to use the accelerator-specific features of 🤗 Optimum, you can install the required dependencies according to the table below:

| Accelerator | Installation |
|:-----------------------------------------------------------------------------------------------------------------------|:--------------------------------------------------|
| [ONNX Runtime](https://huggingface.co/docs/optimum/onnxruntime/overview) | `pip install --upgrade-strategy eager optimum[onnxruntime]` |
| [Intel Neural Compressor](https://huggingface.co/docs/optimum/intel/index) | `pip install --upgrade-strategy eager optimum[neural-compressor]`|
| [OpenVINO](https://huggingface.co/docs/optimum/intel/index) | `pip install --upgrade-strategy eager optimum[openvino,nncf]` |
| [AMD Instinct GPUs and Ryzen AI NPU](https://huggingface.co/docs/optimum/amd/index) | `pip install --upgrade-strategy eager optimum[amd]` |
| [Habana Gaudi Processor (HPU)](https://huggingface.co/docs/optimum/habana/index) | `pip install --upgrade-strategy eager optimum[habana]` |
| [FuriosaAI](https://huggingface.co/docs/optimum/furiosa/index) | `pip install --upgrade-strategy eager optimum[furiosa]` |

The `--upgrade-strategy eager` option is needed to ensure the different packages are upgraded to the latest possible version.
| Accelerator | Installation |
|:-----------------------------------------------------------------------------------------------------------------------|:------------------------------------------------------------------|
| [ONNX Runtime](https://huggingface.co/docs/optimum/onnxruntime/overview) | `pip install --upgrade --upgrade-strategy eager optimum[onnxruntime]` |
| [Intel Neural Compressor](https://huggingface.co/docs/optimum/intel/index) | `pip install --upgrade --upgrade-strategy eager optimum[neural-compressor]`|
| [OpenVINO](https://huggingface.co/docs/optimum/intel/index) | `pip install --upgrade --upgrade-strategy eager optimum[openvino]` |
| [NVIDIA TensorRT-LLM](https://huggingface.co/docs/optimum/main/en/nvidia_overview) | `docker run -it --gpus all --ipc host huggingface/optimum-nvidia` |
| [AMD Instinct GPUs and Ryzen AI NPU](https://huggingface.co/docs/optimum/amd/index) | `pip install --upgrade --upgrade-strategy eager optimum[amd]` |
| [AWS Trainium & Inferentia](https://huggingface.co/docs/optimum-neuron/index)                                            | `pip install --upgrade --upgrade-strategy eager optimum[neuronx]` |
| [Habana Gaudi Processor (HPU)](https://huggingface.co/docs/optimum/habana/index) | `pip install --upgrade --upgrade-strategy eager optimum[habana]` |
| [FuriosaAI](https://huggingface.co/docs/optimum/furiosa/index) | `pip install --upgrade --upgrade-strategy eager optimum[furiosa]` |

The `--upgrade --upgrade-strategy eager` option is needed to ensure the different packages are upgraded to the latest possible version.

To install from source:

@@ -45,6 +47,8 @@ python -m pip install optimum[onnxruntime]@git+https://github.com/huggingface/op
- TensorFlow Lite
- [OpenVINO](https://huggingface.co/docs/optimum/intel/inference)
- Habana first-gen Gaudi / Gaudi2, more details [here](https://huggingface.co/docs/optimum/main/en/habana/usage_guides/accelerate_inference)
- AWS Inferentia 2 / Inferentia 1, more details [here](https://huggingface.co/docs/optimum-neuron/en/guides/models)
- NVIDIA TensorRT-LLM, more details [here](https://huggingface.co/blog/optimum-nvidia)

The [export](https://huggingface.co/docs/optimum/exporters/overview) and optimizations can be done both programmatically and with a command line.

@@ -66,7 +70,7 @@ The [export](https://huggingface.co/docs/optimum/exporters/overview) and optimiz
Before you begin, make sure you have all the necessary libraries installed:

```bash
pip install --upgrade-strategy eager optimum[openvino,nncf]
pip install --upgrade --upgrade-strategy eager optimum[openvino]
```

It is possible to export 🤗 Transformers and Diffusers models to the OpenVINO format easily:
@@ -75,7 +79,8 @@ It is possible to export 🤗 Transformers and Diffusers models to the OpenVINO
optimum-cli export openvino --model distilbert-base-uncased-finetuned-sst-2-english distilbert_sst2_ov
```

If you add `--int8`, the weights will be quantized to INT8. Static quantization can also be applied on the activations using [NNCF](https://github.com/openvinotoolkit/nncf), more information can be found in the [documentation](https://huggingface.co/docs/optimum/main/en/intel/optimization_ov).
If you add `--weight-format int8`, the weights will be quantized to `int8`; check out our [documentation](https://huggingface.co/docs/optimum/main/en/intel/optimization_ov#weight-only-quantization) for more detail on weight-only quantization. To apply quantization on both weights and activations, you can find more information [here](https://huggingface.co/docs/optimum/main/en/intel/optimization_ov#static-quantization).
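To illustrate the idea behind weight-only `int8` quantization, here is a deliberately simplified symmetric per-tensor sketch (not NNCF's actual algorithm, which works per-channel with far more care):

```python
def quantize_int8(weights):
    """Symmetric per-tensor int8 quantization: w ~= q * scale."""
    max_abs = max(abs(w) for w in weights) or 1.0
    scale = max_abs / 127.0  # map the largest magnitude onto the int8 range
    q = [max(-127, min(127, round(w / scale))) for w in weights]
    return q, scale

def dequantize_int8(q, scale):
    """Recover approximate float weights from int8 values and the scale."""
    return [qi * scale for qi in q]
```

Each weight is stored as one signed byte plus a shared scale, which is where the memory savings over `float32` come from; the reconstruction error per weight is bounded by half the scale.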


To load a model and run inference with OpenVINO Runtime, you can just replace your `AutoModelForXxx` class with the corresponding `OVModelForXxx` class. To load a PyTorch checkpoint and convert it to the OpenVINO format on-the-fly, you can set `export=True` when loading your model.

@@ -100,7 +105,7 @@ You can find more examples in the [documentation](https://huggingface.co/docs/op
Before you begin, make sure you have all the necessary libraries installed:

```bash
pip install --upgrade-strategy eager optimum[neural-compressor]
pip install --upgrade --upgrade-strategy eager optimum[neural-compressor]
```

Dynamic quantization can be applied on your model:
@@ -190,14 +195,15 @@ optimum-cli export tflite \
We support many providers:

- Habana's Gaudi processors
- AWS Trainium instances, more details [here](https://huggingface.co/docs/optimum-neuron/en/guides/distributed_training)
- ONNX Runtime (optimized for GPUs)

### Habana

Before you begin, make sure you have all the necessary libraries installed:

```bash
pip install --upgrade-strategy eager optimum[habana]
pip install --upgrade --upgrade-strategy eager optimum[habana]
```

```diff