Disable chunked_forward() on AVX512 CPUs #179

borzunov · 2023-01-04T18:10:03Z

See benchmarks in #176.

borzunov · 2023-01-04T18:17:42Z

src/petals/bloom/modeling_utils.py

+            if self.word_embeddings.weight.numel() * 4 < 0.9 * psutil.virtual_memory().total:
+                logger.warning(
+                    "Running the client with dtype bfloat16 on CPU may be slow, since your CPU doesn't support AVX512. "
+                    "Consider loading the model with torch_dtype='float32'"


In the future, we can also suggest using fast_but_approximate=True here, which would run faiss under the hood for greedy and top_k generation.
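The condition in the excerpt above can be read as "a float32 copy of the embeddings (4 bytes per parameter) fits within 90% of total RAM". Below is a minimal sketch of that decision as a standalone function. The function name `should_suggest_float32` and its parameters are hypothetical and not part of the PR, and combining the memory check with the AVX512 check in a single predicate is an assumption based on the wording of the warning message.

```python
FLOAT32_BYTES = 4  # size of one float32 parameter, matching the "* 4" in the diff


def should_suggest_float32(num_embedding_params: int, total_ram_bytes: int, has_avx512: bool) -> bool:
    """Hypothetical helper: decide whether to warn a bfloat16 CPU user to try float32.

    Assumption: the warning is only relevant when the CPU lacks AVX512,
    as the warning text in the diff suggests.
    """
    if has_avx512:
        # On AVX512 CPUs there is nothing to suggest in this sketch.
        return False
    # Same inequality as in the diff: the float32 weights must fit in 90% of RAM.
    return num_embedding_params * FLOAT32_BYTES < 0.9 * total_ram_bytes


if __name__ == "__main__":
    # Illustrative numbers only: a 250880 x 14336 embedding matrix on a 32 GiB machine.
    print(should_suggest_float32(250880 * 14336, 32 * 2**30, has_avx512=False))
```

In real use, `total_ram_bytes` would presumably come from `psutil.virtual_memory().total` (as in the diff) and `has_avx512` from the cpufeature module that this PR adds as a requirement. Taking both as plain arguments keeps the sketch free of third-party dependencies.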

Disable chunked_forward() on AVX512 CPUs

ba88c59

borzunov requested a review from justheuristic January 4, 2023 18:10

borzunov added 3 commits January 4, 2023 18:10

Require cpufeature module

35bbe30

black

1d034c6

Improve comment

def5dfc

borzunov force-pushed the no-chunked-forward-on-avx512 branch from 203e2ca to ce76beb January 4, 2023 18:20

Refactor

7880687

borzunov force-pushed the no-chunked-forward-on-avx512 branch from ce76beb to 7880687 January 4, 2023 18:25

borzunov added 2 commits January 4, 2023 18:31

Fix using cpufeature

bfe0f33

Fix Python 3.7

49dce00

borzunov merged commit 5569838 into main Jan 4, 2023

borzunov deleted the no-chunked-forward-on-avx512 branch January 4, 2023 19:28
