Use bnb==0.40.0.post4 to fix bias bug, use bfloat16 by default #341
Conversation
src/petals/server/server.py (outdated)

    if self.block_config.model_type == "llama" and torch_dtype == torch.bfloat16 and quant_type != QuantType.NF4:
        logger.warning(
            "LLaMA is loaded in bfloat16 for compatibility with --quant_type nf4 servers (default). "
            "If you use a private swarm without such servers, use --torch_dtype float16 to force the original float16 dtype"
        )
Suggested change:

    -            "If you use a private swarm without such servers, use --torch_dtype float16 to force the original float16 dtype"
    +            "If you want to run in float16, use --torch_dtype float16 to force the original float16 dtype"
src/petals/server/server.py (outdated)

    @@ -173,6 +173,12 @@ def __init__(
         self.quant_type = quant_type
         logger.info(f"Model weights are loaded in {get_dtype_name(torch_dtype, quant_type)} format")

    +    if self.block_config.model_type == "llama" and torch_dtype == torch.bfloat16 and quant_type != QuantType.NF4:
    +        logger.warning(
    +            "LLaMA is loaded in bfloat16 for compatibility with --quant_type nf4 servers (default). "
Suggested change:

    -            "LLaMA is loaded in bfloat16 for compatibility with --quant_type nf4 servers (default). "
    +            "LLaMA is loaded in bfloat16 for compatibility with Guanaco nf4 setup (default). "
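The dtype logic under review can be sketched in isolation. This is a hypothetical, simplified model of the PR's behavior, not the actual Petals code: `resolve_dtype` and its string-based arguments are made up for illustration, and only the decision described in the diff (default to bfloat16, warn for LLaMA when the quant type is not nf4) is reproduced.

```python
from typing import Optional, Tuple


def resolve_dtype(model_type: str, requested: Optional[str], quant_type: str) -> Tuple[str, bool]:
    """Return (effective_dtype, should_warn).

    Hypothetical sketch of the PR's logic: bfloat16 becomes the default
    dtype, and a warning fires when a LLaMA model ends up in bfloat16
    on a server whose quant type is not nf4 (the case the diff guards).
    """
    # bfloat16 is the new default when the user passes no explicit dtype
    dtype = requested or "bfloat16"
    # mirrors: model_type == "llama" and torch_dtype == torch.bfloat16
    #          and quant_type != QuantType.NF4
    warn = model_type == "llama" and dtype == "bfloat16" and quant_type != "nf4"
    return dtype, warn
```

Passing `--torch_dtype float16` explicitly (the workaround the warning suggests) both keeps the original dtype and suppresses the warning in this sketch.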
LGTM as long as we warn everyone
Force-pushed from 5d434a1 to f53c581
Transition to bfloat16 has been delayed due to bnb performance issues.
This PR: