Issues · Search results · repo:pytorch/ao language:Python

432 results (69 ms)

Hi, I am following this example and want to save the INT8 static quantization result, but it’s failing. Could you take a look, thanks! ... # quantized linear represented as an nn.Linear with ...
  • yiliu30
  • Opened 2 days ago
  • #1950
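
A minimal sketch of the save/load pattern this issue is about, with the caveat that the excerpt doesn't show the static-quant config from the linked example; `int8_weight_only` is used here purely as a stand-in torchao config:

```python
# Sketch only: int8_weight_only stands in for the static-quant config
# referenced in the issue; the exact config used there is an assumption.
import torch
import torch.nn as nn
from torchao.quantization import quantize_, int8_weight_only

model = nn.Sequential(nn.Linear(128, 256), nn.ReLU(), nn.Linear(256, 64))
quantize_(model, int8_weight_only())  # replaces Linear weights with quantized tensors in-place

# Saving the quantized model's state dict; loading it back typically
# requires the same quantization setup on the target model, which is
# where reports like this one tend to fail.
torch.save(model.state_dict(), "int8_model.pt")
```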

placeholder, TODO fill me out
float8
  • vkuzo
  • Opened 3 days ago
  • #1945

Hi folks, not a bug. In torchtune, importing the library takes ~7s. When I profile it, the majority comes from torchao imports. Just a simple `import torchao` takes ~4s: import time start = time.perf_counter() ...
  • felipemello1
  • 1
  • Opened 3 days ago
  • #1944
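
The timing snippet the report starts is roughly the following; a minimal, self-contained version:

```python
import time

start = time.perf_counter()
import torchao  # noqa: F401  -- the import being measured
print(f"import torchao took {time.perf_counter() - start:.2f}s")

# For a per-module breakdown, Python's built-in import profiler helps:
#   python -X importtime -c "import torchao"
```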

When using FSDP2 for Float8 training, an issue occurs when the number of GPUs exceeds the out_features of an nn.Linear layer. Specifically, FSDP2 splits the weight tensor into a shape of [0, in_features] ...
float8
  • HIT-cwh
  • 3
  • Opened 4 days ago
  • #1938
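
A rough sketch of the setup being described, under the assumptions that torchao's `convert_to_float8_training` and the FSDP2 `fully_shard` API are the pieces involved, and that this runs under torchrun with more ranks than `out_features`:

```python
# Sketch of the reported setup, not a reproduction: a tiny out_features so
# that world_size > out_features and FSDP2 hands some ranks a
# [0, in_features] shard of the weight.
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.distributed.fsdp import fully_shard  # FSDP2 API; import path may differ by torch version
from torchao.float8 import convert_to_float8_training

dist.init_process_group("nccl")
torch.cuda.set_device(dist.get_rank() % torch.cuda.device_count())

model = nn.Sequential(nn.Linear(1024, 2), nn.Linear(2, 1024)).cuda()  # out_features=2 < world_size
convert_to_float8_training(model)
for layer in model:
    fully_shard(layer)
fully_shard(model)

out = model(torch.randn(8, 1024, device="cuda"))
out.sum().backward()
```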

In meta kernels in torchao/experimental/ops, we do things like: return torch::empty({num_out, k}).to("meta"); We should create meta tensors directly if possible.
  • metascroy
  • 4
  • Opened 6 days ago
  • #1936
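
For illustration, the Python equivalent of the pattern in question (the actual code in torchao/experimental/ops is C++):

```python
import torch

num_out, k = 16, 64

# Pattern the issue points at: allocate real storage, then move to meta.
a = torch.empty((num_out, k)).to("meta")

# Proposed alternative: construct the tensor on the meta device directly,
# so no real memory is ever allocated.
b = torch.empty((num_out, k), device="meta")

assert a.shape == b.shape and b.device.type == "meta"
```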

I am having a strange issue with the low-bit optimizer and the combination of FSDP2 and CPU offloading: torch._dynamo.exc.TorchRuntimeError: Dynamo failed to run FX node with fake tensors: call_method ...
  • psinger
  • 1
  • Opened 6 days ago
  • #1931
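
A sketch of the combination described, under assumptions: the 8-bit AdamW lives in `torchao.optim` (older releases expose it as `torchao.prototype.low_bit_optim`), and FSDP2's CPU offloading is enabled via `CPUOffloadPolicy`:

```python
# Sketch of the low-bit optimizer + FSDP2 + CPU offloading combination;
# import paths are assumptions and vary across torch/torchao versions.
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.distributed.fsdp import fully_shard, CPUOffloadPolicy
from torchao.optim import AdamW8bit  # older torchao: torchao.prototype.low_bit_optim

dist.init_process_group("nccl")
model = nn.Sequential(nn.Linear(512, 512), nn.Linear(512, 512)).cuda()
fully_shard(model, offload_policy=CPUOffloadPolicy())

optim = AdamW8bit(model.parameters(), lr=1e-4)  # quantized optimizer state
loss = model(torch.randn(4, 512, device="cuda")).sum()
loss.backward()
optim.step()  # the report hits a dynamo fake-tensor error around here
```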

Hi, I did some benchmarks on LLM models with int4_weight_only on CPU/GPU/XPU and expected to see E2E speedups compared with pure bf16/fp16. From the kernel perspective, int4 GEMM kernels are ...
  • LuFinch
  • 11
  • Opened 7 days ago
  • #1930
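
A minimal sketch of the kind of comparison described: torchao's int4 weight-only quantization timed against a bf16 baseline (crude wall-clock timing, CUDA assumed; not the reporter's actual benchmark):

```python
import time
import torch
import torch.nn as nn
from torchao.quantization import quantize_, int4_weight_only

@torch.no_grad()
def bench(model, x, iters=50, warmup=10):
    for _ in range(warmup):       # warm up (and trigger compilation if compiled)
        model(x)
    torch.cuda.synchronize()
    t0 = time.perf_counter()
    for _ in range(iters):
        model(x)
    torch.cuda.synchronize()
    return (time.perf_counter() - t0) / iters

model = nn.Sequential(*[nn.Linear(4096, 4096) for _ in range(4)]).to(torch.bfloat16).cuda()
x = torch.randn(1, 4096, dtype=torch.bfloat16, device="cuda")

baseline = bench(torch.compile(model), x)
quantize_(model, int4_weight_only())      # in-place int4 weight-only quantization
quantized = bench(torch.compile(model), x)
print(f"bf16 {baseline * 1e3:.2f} ms vs int4 {quantized * 1e3:.2f} ms")
```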

When I fp8 quantize a model and then shard it using FSDP2, it reports an error: [rank1]: Traceback (most recent call last): [rank1]: File /mnt/teams/algo-teams/shared/code/wanx-inference/generate.py ...
  • happynear
  • 2
  • Opened 8 days ago
  • #1929
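
A sketch of the reported flow, assuming an inference-style float8 weight quantization via torchao's `float8_weight_only` followed by FSDP2 sharding (the exact config in the report isn't shown):

```python
# Quantize first, then shard: the ordering the report describes as breaking.
import torch
import torch.nn as nn
import torch.distributed as dist
from torch.distributed.fsdp import fully_shard
from torchao.quantization import quantize_, float8_weight_only

dist.init_process_group("nccl")
model = nn.Sequential(nn.Linear(1024, 1024), nn.Linear(1024, 1024)).cuda()

quantize_(model, float8_weight_only())  # fp8 weight quantization
fully_shard(model)                      # then FSDP2 sharding

out = model(torch.randn(2, 1024, device="cuda"))
```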

Grouped GEMM kernels (https://github.com/fanshiqing/grouped_gemm) are used in many MoE models. I wonder whether torchao supports FP8 kernels for grouped GEMM, such as the three commonly used ops: ...
float8
  • zigzagcai
  • 5
  • Opened 8 days ago
  • #1928
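
For context, "grouped GEMM" here means one kernel launch that performs many independent matmuls with varying group sizes, as in MoE token routing. A plain-PyTorch illustration of the semantics only (not FP8 and not a torchao API):

```python
# Reference semantics of a grouped GEMM; the FP8 kernels the issue asks
# about would fuse this per-expert loop into a single kernel call.
import torch

def grouped_mm_reference(tokens, expert_weights, group_sizes):
    # tokens: [total_tokens, hidden]; one weight matrix per expert/group
    outputs, offset = [], 0
    for w, n in zip(expert_weights, group_sizes):
        outputs.append(tokens[offset:offset + n] @ w)  # independent GEMM per group
        offset += n
    return torch.cat(outputs)

hidden, ffn = 64, 128
weights = [torch.randn(hidden, ffn) for _ in range(3)]
out = grouped_mm_reference(torch.randn(10, hidden), weights, group_sizes=[3, 5, 2])
print(out.shape)  # torch.Size([10, 128])
```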

We've come up with a training recipe for 2:4 activation sparsity, which is outlined in this paper: https://openreview.net/pdf?id=O5feVk7p6Y. The gist of the approach is that: 1) we find a high level of ...
good first issue
  • jcaip
  • Opened 9 days ago
  • #1920
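
For context, the 2:4 pattern keeps at most two non-zero values in every contiguous group of four. A toy illustration of enforcing it on an activation tensor (not the recipe from the paper):

```python
# Keep the 2 largest-magnitude values in every group of 4, zero the rest.
import torch

def prune_2_to_4(x):
    groups = x.reshape(-1, 4)                               # view as groups of 4
    idx = groups.abs().topk(2, dim=-1).indices              # 2 largest per group
    mask = torch.zeros_like(groups, dtype=torch.bool).scatter_(-1, idx, True)
    return (groups * mask).reshape_as(x)

x = torch.randn(8, 16)
sparse_x = prune_2_to_4(x)
assert (sparse_x.reshape(-1, 4) != 0).sum(dim=-1).max() <= 2
```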