Switch adapters slightly faster #353

Merged
borzunov merged 9 commits into main from maybe-speedup-a-bit on Jul 14, 2023

Conversation

justheuristic
Collaborator

Currently, each TransformerBackend.inference_step looks for adapters and sets the correct adapter type for each block. This is not very expensive, but it can measurably affect inference time:

image

This pull request uses faster adapter switching with just one variable assignment, without iterating over block.modules().
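The idea can be sketched roughly as follows. This is a minimal illustration, not the actual Petals code: `using_adapter`, `AdapterLayer`, and `_active_adapter` are hypothetical names, and the layers are toy objects rather than real transformer blocks. Each layer reads a shared variable at call time, so switching adapters is one assignment regardless of how many layers exist.

```python
import contextlib

# Hypothetical module-level state; every adapter-aware layer reads it at call time.
_active_adapter = None


@contextlib.contextmanager
def using_adapter(name):
    """Switch the active adapter with a single assignment and restore it on exit."""
    global _active_adapter
    previous, _active_adapter = _active_adapter, name
    try:
        yield
    finally:
        _active_adapter = previous


class AdapterLayer:
    """Toy layer: multiplies its input by a per-adapter scale (1.0 if no adapter matches)."""

    def __init__(self, scales):
        self.scales = scales  # adapter name -> scale factor

    def forward(self, x):
        # No per-layer "set adapter" call is needed: the layer looks up the shared state.
        return x * self.scales.get(_active_adapter, 1.0)


layers = [AdapterLayer({"lora_a": 2.0}) for _ in range(3)]


def run(x):
    for layer in layers:
        x = layer.forward(x)
    return x
```

With this layout, `with using_adapter("lora_a"): run(1.0)` applies the adapter in all three layers without visiting any of them beforehand.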

@@ -17,6 +17,7 @@
from petals.server.memory_cache import MemoryCache
from petals.server.task_pool import PrioritizedTaskPool
from petals.utils.misc import is_dummy
from petals.utils.peft import using_global_adapter
Collaborator

You can't import it here; this triggers the bnb import.

@@ -142,6 +144,7 @@ async def rpc_inference(
requested_backends = tuple(self.module_backends[uid] for uid in requested_uids)
max_length = metadata.get("max_length")
active_adapter = metadata.get("active_adapter", "")
assert not active_adapter or active_adapter in self.adapters
Collaborator

Please add an error message here; otherwise the client will receive P2PHandlerError("") and won't know what happened.
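One way the assertion could carry a message is sketched below. The helper name `validate_adapter` and the message wording are assumptions made for illustration; only the condition itself comes from the diff.

```python
def validate_adapter(active_adapter, available_adapters):
    """Hypothetical helper: same check as in the diff, plus a descriptive message."""
    assert not active_adapter or active_adapter in available_adapters, (
        f"Adapter {active_adapter!r} was requested but is not loaded; "
        f"available adapters: {sorted(available_adapters)}"
    )


# Capture the message that would otherwise be empty.
try:
    validate_adapter("missing", {"lora_a"})
except AssertionError as e:
    message = str(e)
```

An empty `active_adapter` still passes the check, matching the `metadata.get("active_adapter", "")` default.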

Comment on lines 125 to 126
ADAPTER_NOT_SET = "__ADAPTER_NOT_SET"
GLOBAL_ACTIVE_ADAPTER = ADAPTER_NOT_SET
Collaborator

Suggested change
- ADAPTER_NOT_SET = "__ADAPTER_NOT_SET"
- GLOBAL_ACTIVE_ADAPTER = ADAPTER_NOT_SET
+ global_active_adapter = None
  1. GLOBAL_ACTIVE_ADAPTER is not a constant, it should not be in caps
  2. Why use a predefined string constant instead of just None?

Collaborator Author

  1. ok
  2. None means no adapter; ADAPTER_NOT_SET means you didn't even bother entering the context, and you should be chastised for that
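The distinction in point 2 can be illustrated with a small sketch. The sentinel name comes from the diff; `using_adapter`, `get_active_adapter`, and `_active_adapter` are hypothetical names, and raising `RuntimeError` is an assumed way of flagging the missing context, not necessarily what the merged code does.

```python
import contextlib

ADAPTER_NOT_SET = "__ADAPTER_NOT_SET"  # sentinel name taken from the diff
_active_adapter = ADAPTER_NOT_SET  # hypothetical module-level state


@contextlib.contextmanager
def using_adapter(name):
    """Enter an adapter context; name=None explicitly means 'no adapter'."""
    global _active_adapter
    previous, _active_adapter = _active_adapter, name
    try:
        yield
    finally:
        _active_adapter = previous


def get_active_adapter():
    """Return the active adapter, or fail loudly if no context was ever entered."""
    if _active_adapter == ADAPTER_NOT_SET:
        raise RuntimeError("No adapter context is active; wrap the call in using_adapter(...)")
    return _active_adapter
```

With plain `None` as the default, forgetting the context would silently run without an adapter; the sentinel turns that mistake into an immediate error.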

borzunov changed the title from "Slightly faster adapter switching" to "Switch adapters slightly faster" on Jul 14, 2023
borzunov merged commit 37fdcb3 into main on Jul 14, 2023
7 checks passed
borzunov deleted the maybe-speedup-a-bit branch on July 14, 2023, 19:04