
Conversation

dhruvbird
Contributor

@dhruvbird dhruvbird commented Sep 9, 2022

Stack from ghstack (oldest at bottom):

Summary: Currently, the model tracer generates the selected features YAML file only with used operators. This change adds support for dtypes and custom classes as well.

We need to add the flag `-DENABLE_RECORD_KERNEL_FUNCTION_DTYPE` when building PyTorch in Instrumentation Mode (i.e. with `TRACING_BASED=1` for server builds) to enable capturing this data.

Test Plan: Built using `USE_NUMPY=0 USE_DISTRIBUTED=0 USE_CUDA=0 TRACING_BASED=1 python setup.py develop`
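(For reference, a minimal sketch of how the define above could be folded into that build on a clean checkout; forwarding it via `CXXFLAGS` is an assumption about the environment, not necessarily the exact mechanism used for this run.)

```
# Hypothetical: pass the dtype-recording define to the instrumented build.
# Relying on CXXFLAGS being picked up at CMake configure time is an assumption.
CXXFLAGS="-DENABLE_RECORD_KERNEL_FUNCTION_DTYPE" \
USE_NUMPY=0 USE_DISTRIBUTED=0 USE_CUDA=0 TRACING_BASED=1 \
python setup.py develop
```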

Ran the model tracer to observe this generated file: https://gist.github.com/dhruvbird/50e1860b39ae065e57d58f17e0912136
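(The tracer invocation was roughly along these lines; the binary path, flag names, and the `/tmp/model.ptl` input below are assumptions for illustration, not an exact transcript of the run.)

```
# Hypothetical model_tracer invocation producing the selected features YAML.
# Flag names and paths are assumptions, not copied from the actual run.
./build/bin/model_tracer \
    --model_input_path /tmp/model.ptl \
    --build_yaml_path /tmp/selected_ops.yaml
```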

Then used the generated YAML to build PyTorch (minimal build) using the command:

```
BUILD_PYTORCH_MOBILE_WITH_HOST_TOOLCHAIN=1 \
USE_LIGHTWEIGHT_DISPATCH=0 BUILD_LITE_INTERPRETER=1 \
SELECTED_OP_LIST=/tmp/selected_ops.yaml \
TRACING_BASED=1 \
./scripts/build_mobile.sh
```

After that, I generated a binary using this command:

```
g++ /tmp/main.cpp -L build_mobile/lib/ -I build_mobile/install/include/ -ffunction-sections -fdata-sections -Wl,--gc-sections \
    -lpthread -lc10 -Wl,--whole-archive -ltorch_cpu -Wl,--no-whole-archive -ltorch -lXNNPACK \
    -lpytorch_qnnpack -lcpuinfo -lclog -lpthreadpool -lkineto -lfmt -ldl -lc10
```
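(A sketch of how the unstripped vs. stripped sizes below can be reproduced; `a.out` is simply the default g++ output name from the command above, since no `-o` is passed.)

```
# Compare binary size before and after stripping symbols.
ls -lh a.out     # unstripped size
strip a.out
ls -lh a.out     # stripped size
```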

The table below shows the size reduction in all build modes.

| Build Type  | Unstripped | Stripped |
| ----------- | ----------- | ----------- |
| Standard | 49MiB | 34MiB |
| Minimal w/o dtype | 6.1MiB (12%) | 4.5MiB (18%) |
| Minimal w/ dtype | 3.7MiB (7%) | 2.7MiB (11%) |

@pytorch-bot

pytorch-bot bot commented Sep 9, 2022

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/84795

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit bd8f366:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@pytorch-bot pytorch-bot bot added the release notes: mobile release notes category label Sep 9, 2022
@dhruvbird dhruvbird requested a review from cccclai September 9, 2022 23:53
@facebook-github-bot facebook-github-bot added the oncall: jit Add this issue/PR to JIT oncall triage queue label Sep 9, 2022
@cccclai
Contributor

cccclai commented Sep 10, 2022

Hmm, I think we need to have the PR include the mobile build as well. Have we tried that?

@dhruvbird
Contributor Author

> Hmm, I think we need to have the PR include the mobile build as well. Have we tried that?

Thanks for flagging! Yes, I just updated the PR summary to include it - I seem to have missed it earlier!

dhruvbird added a commit that referenced this pull request Sep 10, 2022
Contributor

@cccclai cccclai left a comment


LGTM. Thanks!

@dhruvbird
Contributor Author

The error `RuntimeError: Error compiling objects for extension` in the job `linux-bionic-py3_7-clang8-xla / test (xla, 1, 1, linux.2xlarge)` seems unrelated to this change, so force merging, since all other CI checks are passing.

@dhruvbird dhruvbird added the topic: new features topic category label Sep 10, 2022
@dhruvbird
Contributor Author

@pytorchbot merge

@pytorchmergebot
Collaborator

@pytorchbot successfully started a merge job. Check the current status here.
The merge job was triggered without a flag. This means that your change will be merged once all checks on your PR have passed (ETA: 0-4 Hours). If this is not the intended behavior, feel free to use some of the other merge options in the wiki.
Please reach out to the PyTorch DevX Team with feedback or questions!

@facebook-github-bot facebook-github-bot deleted the gh/dhruvbird/99/head branch September 14, 2022 14:20