huggingface / optimum-quanto Public

Issues: huggingface/optimum-quanto

18 Open 112 Closed

Issues list

Static Quantization - Weights Still in FP32

#347 opened Nov 6, 2024 by ClaraLovesFunk

How to support activation 4bit quantization?

#346 opened Nov 4, 2024 by Ther-nullptr

Only random noise is generated with Flux + LoRA with optimum-quanto >= 0.2.5

#343 opened Oct 30, 2024 by nelapetrzelkova

Inference Speed Slowdown with Static Quantization

#340 opened Oct 29, 2024 by ClaraLovesFunk

Will module output not be quantized when the model is directly trained after Calibration?

#336 opened Oct 11, 2024 by tusiqi1

Corrupted outputs with Marlin int4 kernels as parallelization increases

Labels: bug, help wanted

#332 opened Oct 6, 2024 by dacorvo

optimum-quanto 0.25 requires ninja but 'pip check flux' reports 'ninja-1.11.1.1 is not supported on this platform'

Labels: Stale

#331 opened Oct 5, 2024 by Davros666

Accuracy issue when using torch._int_mm on AMD CPUs

#319 opened Sep 26, 2024 by dacorvo

qint4 failed for diffusers: QBitsTensor cannot be changed

#312 opened Sep 19, 2024 by liyihao1230

qin4 inference fails with RuntimeError: Cannot set version_counter for inference tensor

#304 opened Sep 3, 2024 by BenjaminBossan

Potential Gradient Error when Reloading Frozen Weights in qmodule.py _load_from_state_dict

Labels: Stale

#293 opened Aug 24, 2024 by cjfghk5697

Support for FP8 Matmuls

#275 opened Aug 9, 2024 by maktukmak

Support for new diffuser: flux1.schnell

#272 opened Aug 7, 2024 by KoppAlexander

Packages created on the CI are missing cpp and cuda extension files

#254 opened Jul 23, 2024 by dacorvo

qint4 failing with PixArt Transformer

#228 opened Jul 3, 2024 by sayakpaul

Inference from a reload quantized open clip model (by .load_state_dict) resulted in IndexError

Labels: Stale

#217 opened Jun 24, 2024 by kechan

Switch to ruff native formatter

Labels: good first issue, help wanted, Stale

#186 opened Apr 22, 2024 by dacorvo

QTensor cannot be created from inside a dynamo graph

#46 opened Dec 11, 2023 by dacorvo
