This repository was archived by the owner on May 8, 2024. It is now read-only.
Add xpu device support. #1
Closed
@@ -182,7 +182,7 @@ Using the fine-tuned model to generate new text can be done using the `generate_
 The arguments for the `generate_text.py` script are as follows:

 ```
-usage: generate_text.py [-h] --model_config MODEL_CONFIG [--benchmark_mode] [--benchmark_seq_length BENCHMARK_SEQ_LENGTH]
+usage: generate_text.py [-h] --model_config MODEL_CONFIG [--benchmark_mode] [--benchmark_seq_length BENCHMARK_SEQ_LENGTH] [--device cpu]

 optional arguments:
   -h, --help            show this help message and exit

@@ -191,6 +191,8 @@ optional arguments:
   --benchmark_mode      use intel pytorch extension to optimize model.
   --benchmark_seq_length BENCHMARK_SEQ_LENGTH
                         length of generation if benchmark mode is used.
+  --device cpu          Choose to run on cpu or xpu; default is cpu.
 ```

 **Configuration Parameters**
@@ -242,27 +244,35 @@ Within the yaml configuration file, the following optional arguments can be spec

 | **YAML file** | **Environment Name** | **Configuration** |
 | :---: | :---: | :---: |
-| `env/intel/text-intel-torch.yml` | `text-intel-torch` | Python=3.9.7, PyTorch v1.13, Intel® Extension for PyTorch v1.13, Intel® Neural Compressor v2.0.0 |
+| `env/intel/text-intel-torch-cpu.yml` | `text-intel-torch` | Python=3.9.7, PyTorch v1.13, Intel® Extension for PyTorch v1.13, Intel® Neural Compressor v2.0.0 |
+| `env/intel/text-intel-torch-xpu.yml` | `text-intel-torch` | Python=3.9.7, PyTorch v1.13, Intel® Extension for PyTorch v1.13+xpu, Intel® Neural Compressor v2.0.0 |
 ### Optimized Solution Setup

-Follow the below conda installation commands to setup the Intel® oneAPI optimized PyTorch environment for model training and text generation.
+Follow the below conda installation commands to set up the Intel® oneAPI optimized PyTorch environment for model training and text generation. Choose `env/intel/text-intel-torch-xpu.yml` if you have an Intel GPU available.

 ```sh
-conda env create -f env/intel/text-intel-torch.yml
+conda env create -f env/intel/text-intel-torch-cpu.yml  # For CPU
+# or
+conda env create -f env/intel/text-intel-torch-xpu.yml  # For XPU
 ```

-*Activate stock conda environment*
+Use the following command to activate the environment that was created:
 ```sh
 conda activate text-intel-torch
 ```
+Perform this additional installation step only if you are using an Intel GPU; CPU users can skip it:
+```sh
+pip install -r env/intel/update_to_torch_xpu.txt
+```
-This script utilizes the dependencies found in the `env/intel/text-intel-torch.yml` file to create an environment as follows:
+This script utilizes the dependencies found in the `env/intel/text-intel-torch-cpu.yml` or `env/intel/text-intel-torch-xpu.yml` file to create an environment as follows:

 | **YAML file** | **Environment Name** | **Configuration** |
 | :-------------------: | :------------------: | :-----------------------------: |
-| `env/intel/text-intel-torch.yml` | `text-intel-torch` | Python=3.9.x with PyTorch v1.13, Intel® Extension For PyTorch v1.13.0, Intel® Neural Compressor v2.0.0 |
+| `env/intel/text-intel-torch-cpu.yml` | `text-intel-torch` | Python=3.9.x with PyTorch v1.13, Intel® Extension For PyTorch v1.13.0, Intel® Neural Compressor v2.0.0 |
+| `env/intel/text-intel-torch-xpu.yml`, `update_to_torch_xpu.txt` | `text-intel-torch` | Python=3.9.x with PyTorch v1.13, Intel® Extension For PyTorch v1.13.0+xpu, Intel® Neural Compressor v2.0.0 |
 ### Intel® oneAPI Optimized Implementation

@@ -282,6 +292,33 @@ The command to fine-tune the model with Intel® optimizations enabled is:
 ```sh
 ipexrun --use_logical_core --enable_tcmalloc src/finetune_model.py --model_config configs/config_base.yml --data_path data/abcnews-date-text.csv --save_path saved_models/gpt2-medium-finetuned-intel --intel
 ```
+**For Intel GPU training**, device 'xpu' must be added to `python/site-packages/transformers/training_args.py` as shown in the diff below.
+It can be changed manually as follows:
+```diff
+diff --git a/training_args.py b/training_args.py
+index 1a90710..9421ccd 100644
+--- a/training_args.py
++++ b/training_args.py
+@@ -1466,6 +1466,8 @@ class TrainingArguments:
+             torch.distributed.init_process_group(
+                 backend=self.xpu_backend, rank=rank, world_size=size, timeout=self.ddp_timeout_delta
+             )
++        elif torch.xpu.is_available():
++            device = torch.device("xpu")
+         elif is_torch_tpu_available():
+             device = xm.xla_device()
+             self._n_gpu = 0
+```
+Or, to apply the patch automatically, run the following once:
+```bash
+cd src
+python ./apply_xpu_patch.py
+```
+and then run the fine-tuning step as follows:
+```sh
+ipexrun --use_logical_core --enable_tcmalloc src/finetune_model.py --model_config configs/config_base.yml --data_path data/abcnews-date-text.csv --save_path saved_models/gpt2-medium-finetuned-intel --intel
+```

 **Expected Output**<br>
 The trained model will be saved as a `huggingface` pre-trained model to `saved_models/gpt2-medium-finetuned-intel`. The training time in seconds is printed at the end of the training module.
@@ -308,6 +345,11 @@ As above, this trained model can be used to generate text using the provided `ge
 ```sh
 python src/generate_text.py --model_config configs/config_finetuned_intel.yml
 ```

+To run inference on an Intel GPU, run the following:
+```sh
+python src/generate_text.py --model_config configs/config_finetuned_inc.yml --device xpu
+```

 **Expected Output**<br>
 The converted ONNX fine-tuned model will be saved to `saved_models/gpt2-medium-finetuned-intel-onnx`.
@@ -348,6 +390,11 @@ Once the quantized model is created, we can use the `generate_text.py` script on
 ```sh
 python src/generate_text.py --model_config configs/config_finetuned_inc.yml
 ```

+To run inference on an Intel GPU, run the following:
+```sh
+python src/generate_text.py --model_config configs/config_finetuned_inc.yml --device xpu
+```

 ## Performance Observations

 In the following, we report some results comparing Intel® technologies vs the stock alternative on the task of generating new texts of various lengths.
File renamed without changes.
@@ -0,0 +1,15 @@
# Refer to https://intel.github.io/intel-extension-for-pytorch/xpu/1.13.10+xpu/tutorials/getting_started.html
# for intel_extension_for_pytorch installation.
name: text-intel-torch
channels:
  - conda-forge
dependencies:
  - python=3.9.7
  - pip=21.2.4
  - pip:
    - neural-compressor==2.0
    - pandas==1.5.3
    - numpy==1.23.5
    - transformers==4.26.1
    - datasets==2.10.1
    - optimum[onnxruntime]==1.6.4
@@ -0,0 +1,3 @@
-i https://developer.intel.com/ipex-whl-stable-xpu
torch==1.13.0a0
intel_extension_for_pytorch==1.13.10+xpu
@@ -0,0 +1,12 @@
import os
import subprocess

import transformers

# locate the installed training_args.py inside the transformers package
module_path = transformers.__path__[0]
target_file_path = os.path.join(module_path, "training_args.py")

# apply the patch; check=True raises if the `patch` command fails,
# so the success message below cannot print on failure
subprocess.run(["patch", target_file_path, "transformers_xpu.patch"], check=True)

print("patch applied successfully")
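The script above resolves the install location of `transformers` through its `__path__` attribute. The same lookup can be done with the standard library's `importlib`; the sketch below uses the stdlib `json` package as a stand-in target, since `transformers` may not be installed in every environment:

```python
import importlib.util
import os

# Find where a package is installed, as apply_xpu_patch.py does for
# transformers. "json" is used here purely as a stand-in package name.
spec = importlib.util.find_spec("json")
module_dir = os.path.dirname(spec.origin)  # directory holding the package

# A file inside that directory is what a patch would target;
# encoder.py is a real module of the stdlib json package.
target_file_path = os.path.join(module_dir, "encoder.py")
print(os.path.isfile(target_file_path))  # True
```

Either approach yields an absolute path suitable as the first argument to the `patch` command.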
@@ -26,7 +26,7 @@
     AutoModelForCausalLM,
     set_seed
 )

 import intel_extension_for_pytorch as ipex

 def generate_text(
     tokenizer: PreTrainedTokenizer,
@@ -36,7 +36,8 @@ def generate_text(
         min_length: int = 0,
         max_length: int = 10,
         algorithm: str = 'greedy',
-        stop_token: str = '.') -> Tuple[List[int],float]:
+        stop_token: str = '.',
+        device: str = 'cpu') -> Tuple[List[int],float]:
     """Generate text using the provided model and algorithm.

     Args:
@@ -60,25 +61,41 @@ def generate_text(
         Tuple(List[int], float): generated sequence of token_ids, total time for
             of model inference calls
     """

     all_token_ids = input_ids.clone()
     all_attention_masks = attention_mask.clone()
     eos_token_id = tokenizer([stop_token], return_tensors='np')[
         'input_ids'][0][0]
     has_eos = torch.zeros(1, dtype=torch.bool)
+    ones_mask = torch.ones([1, 1])

+    if device == 'xpu':
+        print("#### xpu")
+        all_token_ids = input_ids.to("xpu")
+        all_attention_masks = attention_mask.to("xpu")
+        eos_token_id = torch.tensor([eos_token_id], device=torch.device("xpu"))
+        has_eos = has_eos.to("xpu")
+        ones_mask = ones_mask.to("xpu")
+        model = model.to("xpu")

     total_time = 0
     for step in range(max_length):

-        if isinstance(model, torch.nn.Module):
+        if isinstance(model, torch.nn.Module) and device == 'cpu':
             start = time.time()
             next_token_logits = torch.nn.functional.softmax(
                 model(
                     input_ids=all_token_ids,
                     attention_mask=all_attention_masks)[:, -1, :], dim=1)
             end = time.time()
             total_time += end - start
+        elif isinstance(model, torch.nn.Module) and device == 'xpu':
+            start = time.time()
+            next_token_logits = torch.nn.functional.softmax(
+                model(
+                    input_ids=all_token_ids,
+                    attention_mask=all_attention_masks)[0][:, -1, :], dim=1)
+            end = time.time()
+            total_time += end - start
         elif isinstance(model, ort.InferenceSession):
             ort_input = {
                 "input_ids": np.array(all_token_ids),

> Review comment on the `if`/`elif` branches: @feng-intel please make the 'if-else' change as we discussed.
@@ -101,7 +118,7 @@ def generate_text(
         all_token_ids = torch.cat(
             [all_token_ids, tokens_to_add.unsqueeze(-1)], dim=-1)
         all_attention_masks = torch.cat(
-            [all_attention_masks, torch.ones([1, 1])], dim=-1).type_as(all_attention_masks)
+            [all_attention_masks, ones_mask], dim=-1).type_as(all_attention_masks)
         if step > min_length and next_tokens == eos_token_id:
             break
@@ -115,8 +132,8 @@ def generate(
         max_length: int = 10,
         prompt_file: str = None,
         benchmark_mode: bool = False,
-        n_runs: int = 100):
+        n_runs: int = 100,
+        device: str = 'cpu'):
     # read prompts from file into a list for batch processing
     tokenized_input = []
     if not benchmark_mode and prompt_file is not None:
@@ -143,7 +160,8 @@ def generate(
                 tokenized_prompt.attention_mask,
                 min_length=max_length,
                 max_length=max_length,
-                algorithm='greedy')
+                algorithm='greedy',
+                device=device)
             if i > 10:
                 times.append(total_time)
     print(f"Average Generation time: {np.mean(times)}s")
@@ -160,7 +178,8 @@ def generate(
                 tokenized_prompt.attention_mask,
                 min_length=min_length,
                 max_length=max_length,
-                algorithm='sample')
+                algorithm='sample',
+                device=device)
             tokenized_output.append(res)

     out_json = []
@@ -216,7 +235,8 @@ def main(flags):
         min_length=min_length,
         max_length=max_length,
         prompt_file=prompt_file,
-        benchmark_mode=flags.benchmark_mode
+        benchmark_mode=flags.benchmark_mode,
+        device=flags.device
     )
@@ -241,6 +261,11 @@ def main(flags):
         type=int,
         default=10
     )
+    parser.add_argument('--device',
+                        type=str,
+                        required=False,
+                        default='cpu',
+                        help='Choose to run on cpu or xpu; default is cpu.')

     FLAGS = parser.parse_args()
     main(FLAGS)

> Review comment on `--device`: please add a choice here between cpu and xpu.
> Author: don't get it.
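The reviewer's request can be met with argparse's `choices` parameter, which rejects any value other than the ones listed. A small stand-alone sketch (this parser is illustrative, not the script's actual one):

```python
import argparse

# Restrict --device to the two supported values; argparse exits with a
# usage error if anything else is passed on the command line.
parser = argparse.ArgumentParser()
parser.add_argument('--device',
                    type=str,
                    choices=['cpu', 'xpu'],
                    default='cpu',
                    help='Choose to run on cpu or xpu; default is cpu.')

args = parser.parse_args(['--device', 'xpu'])
print(args.device)  # xpu
```

With `choices` in place, a typo such as `--device gpu` fails fast at argument parsing instead of surfacing later as an unknown-device error inside `generate_text`.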
@@ -0,0 +1,20 @@
--- a.py	2023-04-04 11:50:27.000000000 +0000
+++ b.py	2023-04-04 11:52:00.000000000 +0000
@@ -53,6 +53,7 @@
     requires_backends,
 )

+import intel_extension_for_pytorch

 if is_torch_available():
     import torch
@@ -1466,6 +1467,8 @@
             torch.distributed.init_process_group(
                 backend=self.xpu_backend, rank=rank, world_size=size, timeout=self.ddp_timeout_delta
             )
+        elif torch.xpu.is_available():
+            device = torch.device("xpu")
         elif is_torch_tpu_available():
             device = xm.xla_device()
             self._n_gpu = 0
> Review comment: Please update the diff with the latest one (one that also has "import intel_extension_for_pytorch as ipex").