[PERFORMANCE] Fix Packing thread regression in code #642

Merged
merged 2 commits into from
Jun 27, 2024

Conversation

Qubitium
Collaborator

@Qubitium Qubitium commented Apr 16, 2024

Reason for PR:

However, this will need a new package dependency: https://github.com/joblib/threadpoolctl

Python helpers to limit the number of threads used in native libraries that handle their own internal threadpool (BLAS and OpenMP implementations)
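As a rough illustration of how threadpoolctl can cap native threads around a CPU-bound step, here is a minimal sketch. It is not the code from this PR: `pack_layers`, `pack_fn`, and `limited_threads` are hypothetical names chosen for the example, and the fallback to a no-op when threadpoolctl is not installed is an assumption made so the snippet runs anywhere. Only `threadpool_limits(limits=...)` is the library's own API.

```python
import contextlib

try:
    # Real threadpoolctl API: a context manager that caps BLAS/OpenMP threads.
    from threadpoolctl import threadpool_limits
except ImportError:
    # Assumption for this sketch: degrade to a no-op if the package is missing.
    threadpool_limits = None


def limited_threads(max_threads):
    """Return a context manager that caps native thread pools if possible."""
    if threadpool_limits is None:
        return contextlib.nullcontext()
    return threadpool_limits(limits=max_threads)


def pack_layers(layers, pack_fn, max_threads=1):
    """Apply pack_fn (hypothetical) to each layer with native threads capped."""
    with limited_threads(max_threads):
        return [pack_fn(layer) for layer in layers]


if __name__ == "__main__":
    # Stand-in packing function; a real one would operate on quantized weights.
    print(pack_layers([1, 2, 3], lambda x: x * 2))
```

The cap applies only inside the `with` block, so the thread settings in effect before the packing step are restored once it finishes.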

ref: #640 (comment)

Tests Passed:

  • test_quantization
  • test_serialization

@Qubitium
Collaborator Author

No need to merge this if #640 is merged first as that contains everything here and more.

@Qubitium Qubitium changed the title from "Fix Packing thread regression in code" to "[PERFORMANCE] Fix Packing thread regression in code" on Apr 28, 2024
@Qubitium
Collaborator Author

For various reasons, my team has forked and refactored AutoGPTQ into the new GPTQModel project. This bug fix has been merged there.

@Qubitium Qubitium merged commit ecd0c1c into AutoGPTQ:main Jun 27, 2024
@Qubitium Qubitium deleted the packing-threads branch June 27, 2024 06:51