Add `as_actor` single-actor mode #195

DNXie · 2025-09-19T18:59:15Z

Summary:
This PR adds ForgeActor.as_actor() and refactors ForgeActor.options() and .as_service() to improve configuration handling and dynamic subclassing for actors and services. Context: #173

Key changes include:

Added as_actor() to support launching a single actor directly.
Support positional arguments in as_service() and as_actor().
Rename num_hosts to hosts and num_procs to procs in ProcessConfig.
.options() now stores all configuration parameters as class attributes rather than building a full config object immediately.
Dynamic subclasses are created only during .as_actor() or .as_service() calls, ensuring each configuration remains isolated.
Default configuration is applied automatically if .options() is not called.
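To make the last three points concrete, here is a minimal toy sketch, not the forge implementation. `ToyActor` and its constructor are hypothetical, and the sketch assumes `.options()` hands back a subclass that carries the settings as class attributes, which is one way to keep each configuration isolated while leaving the base-class defaults in place.

```python
import asyncio


class ToyActor:
    # Class-level defaults, used when .options() is never called.
    procs = 1
    hosts = None
    with_gpus = False
    num_replicas = 1

    def __init__(self, *args, **kwargs):
        self.args = args
        self.kwargs = kwargs

    @classmethod
    def options(cls, *, procs=1, hosts=None, with_gpus=False, num_replicas=1):
        # Store the settings as class attributes on a fresh subclass, so
        # configuring one variant never mutates the base class.
        attrs = {
            "procs": procs,
            "hosts": hosts,
            "with_gpus": with_gpus,
            "num_replicas": num_replicas,
        }
        return type(cls.__name__, (cls,), attrs)

    @classmethod
    async def as_actor(cls, *args, **kwargs):
        # Stand-in for spawning: real code would build a process config
        # from cls.procs / cls.hosts and launch the actor remotely.
        return cls(*args, **kwargs)


Configured = ToyActor.options(procs=2, hosts=1)
configured_actor = asyncio.run(Configured.as_actor(10))
default_actor = asyncio.run(ToyActor.as_actor(10))
```

In this sketch `Configured.procs` is 2 while `ToyActor.procs` stays at its default of 1, so two differently configured variants can coexist.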

Changes in behavior:

Single-actor initialization changes from

cfg = ProcessConfig(...)
actor = await MyForgeActor.launch(process_config=cfg, **actor_kwargs)

to

actor = await MyForgeActor.options(procs=1, ...).as_actor(**actor_kwargs)

Usage Examples 1 (Actor):

# Pre-configure a single actor
actor = await MyForgeActor.options(procs=1, hosts=1).as_actor(...)
await actor.shutdown()

# Default usage without calling options
actor = await MyForgeActor.as_actor(...)
await actor.shutdown()

Log:

Spawning single actor Counter

Usage Examples 2 (Service):

# Pre-configure a service with multiple replicas
service = await MyForgeActor.options(num_replicas=3, procs=2).as_service(...)
await service.shutdown()

# Default usage without calling options
service = await MyForgeActor.as_service(...)
await service.shutdown()

Log when num_replicas=3

The log prints the original class name rather than xxService (see #193).

INFO     forge.controller.actor:actor.py:123 Spawning Service Actor for Counter
INFO     forge.controller.actor:actor.py:207 Spawning single actor Counter
INFO     forge.controller.actor:actor.py:207 Spawning single actor Counter
INFO     forge.controller.actor:actor.py:207 Spawning single actor Counter

Usage Examples 3 (Positional argument):
This means you can now do:

await Counter.as_service(10)

instead of having to use keyword-only arguments like:

await Counter.as_service(v=10)

Test

pytest tests/unit_tests/test_service.py

Ritesh1905 · 2025-09-20T04:19:09Z

Wondering what is the motivation behind this? Why would one choose an actor creation directly over service? Single actor seems to be a special case of service?

allenwang28 · 2025-09-20T14:56:13Z

Wondering what is the motivation behind this? Why would one choose an actor creation directly over service? Single actor seems to be a special case of service?

For context: #173

You're right, though. The rationale is that only vLLM should be a service currently. The trainer, for example, will not really take advantage of fault tolerance or routing, so we should always expect it to be a singleton.

src/forge/controller/actor.py

allenwang28 · 2025-09-22T23:07:13Z

src/forge/controller/actor.py

-        num_replicas: int | None = None,
-        procs: int | None = None,
-        **service_kwargs,
+        procs: int,


also hosts: int, with_gpu: bool and num_replicas: int | None?

I kind of put them all in **kwargs since only procs is required for both service and actor. Do you think it is better to explicitly list them?

yes, please explicitly list them
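A small sketch of why explicitly listed parameters help. The two functions below are hypothetical stand-ins for `.options()`, and the defaults other than the required `procs` are assumptions. With `**kwargs` a misspelled option is silently accepted, while keyword-only parameters reject it at the call site.

```python
def options_kwargs(**kwargs):
    # Accepts anything, so a misspelled key is stored without complaint.
    return dict(kwargs)


def options_explicit(*, procs, hosts=None, with_gpu=False, num_replicas=None):
    # Keyword-only, explicitly listed parameters: unknown names raise TypeError.
    return {
        "procs": procs,
        "hosts": hosts,
        "with_gpu": with_gpu,
        "num_replicas": num_replicas,
    }


# Typo: "num_replica" instead of "num_replicas".
loose = options_kwargs(procs=2, num_replica=3)

error = None
try:
    options_explicit(procs=2, num_replica=3)
except TypeError as exc:
    error = str(exc)
```

Here `loose` quietly contains the misspelled key, whereas the explicit version fails immediately with a message naming the unexpected argument.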

allenwang28 · 2025-09-22T23:14:39Z

src/forge/controller/actor.py

+            class_attrs["num_replicas"] = 1
+        cfg = ServiceConfig(**filter_config_params(ServiceConfig, class_attrs))
+
+        service_cls = type(f"{cls.__name__}Service", (cls,), {"_service_config": cfg})


Why do we still need service_cls here? can the logic of as_service() be:

@classmethod
async def as_service(cls, **actor_kwargs) -> "ServiceInterface":
    service = Service(cfg, cls, actor_kwargs)
    await service.__initialize__()
    return ServiceInterface(service, cls)

You are right! Removed.
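The diff in this thread passes the class attributes through `filter_config_params` before building `ServiceConfig`. The helper's body is not shown in this conversation, so the following is only a guess at what such a filter could look like, assuming the config type is a dataclass. `ToyServiceConfig` and its fields are hypothetical.

```python
from dataclasses import dataclass, fields


@dataclass
class ToyServiceConfig:
    procs: int = 1
    num_replicas: int = 1
    with_gpus: bool = False


def filter_config_params(config_cls, params):
    # Keep only the keys that are declared fields of the dataclass, so
    # unrelated class attributes never reach the constructor.
    allowed = {f.name for f in fields(config_cls)}
    return {k: v for k, v in params.items() if k in allowed}


class_attrs = {"procs": 2, "num_replicas": 3, "unrelated_attr": "ignored"}
cfg = ToyServiceConfig(**filter_config_params(ToyServiceConfig, class_attrs))
```

Without the filter, passing `unrelated_attr` to the dataclass constructor would raise a `TypeError`.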

allenwang28 · 2025-09-22T23:20:06Z

src/forge/controller/actor.py

+        cfg = ProcessConfig(**filter_config_params(ProcessConfig, class_attrs))
+
+        logger.info("Spawning single actor %s", cls.__name__)
+        actor = await cls.launch(process_config=cfg, **actor_kwargs)


hmm maybe we can modify the def launch() above to simplify things?

Like this:

@classmethod
async def launch(cls, *args, **kwargs) -> "ForgeActor":
    proc_mesh = await get_proc_mesh(
        process_config=ProcessConfig(
            procs=cls._procs, hosts=cls._hosts, with_gpu=cls._with_gpu
        )
    )
    actor_name = kwargs.pop("name", cls.__name__)
    actor = await proc_mesh.spawn(actor_name, cls, *args, **kwargs)
    actor._proc_mesh = proc_mesh
    if hasattr(proc_mesh, "_hostname") and hasattr(proc_mesh, "_port"):
        host, port = proc_mesh._hostname, proc_mesh._port
        await actor.set_env.call(addr=host, port=port)
    await actor.setup.call()
    return actor

allenwang28 · 2025-09-23T14:39:18Z

src/forge/controller/actor.py

-        num_replicas: int | None = None,
-        procs: int | None = None,
-        **service_kwargs,
+        procs: int,


yes, please explicitly list them

allenwang28 · 2025-09-23T14:39:53Z

src/forge/controller/actor.py

-            # dynamically create a configured subclass for consistency
-            cls = type(f"{cls.__name__}Service", (cls,), {"_service_config": cfg})
+        class_attrs = {k: v for k, v in cls.__dict__.items() if not k.startswith("__")}
+        if "procs" not in class_attrs:


Follow-up to the comment on explicit attributes: this, for example, is unclear and can be pretty brittle.

Removed in the latest version

allenwang28 · 2025-09-23T14:40:29Z

src/forge/controller/actor.py

-        proc_mesh = await get_proc_mesh(process_config=process_config)
+        # Build process config from class attributes with defaults
+        cfg = ProcessConfig(
+            procs=getattr(cls, "procs", 1),


I generally try and use getattr as little as possible. If it's used too much it can mask real errors that can be really hard to debug later.

This is a fallback when the user doesn’t specify configs via .options(). In this case, the original ForgeActor class doesn’t have attributes like procs. If we are getting rid of getattr, one way I can think of is to add these attributes to ForgeActor class like

class ForgeActor(Actor):
    procs: int = 1
    hosts: int | None = None
    with_gpus: bool = False
    num_replicas: int = 1

    def __init__(self, *args, **kwargs):
        ...

But either way, it means the default values are specified in three places:

In types.py

As default values in .options()

As attributes on the ForgeActor class OR here in launch.

I’m not sure if there’s a cleaner way to handle this. I’ve updated the code accordingly (get rid of getattr), please take a look and let me know if you have any suggestions or improvements.
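A short illustration of the concern about `getattr`, using a hypothetical `Configured` class. When the lookup name itself is wrong, a `getattr` fallback quietly returns the default, while plain attribute access on a class that declares its defaults fails loudly.

```python
class Configured:
    procs = 8  # default declared once, as a class attribute


# Typo in the lookup name ("prcs"): the fallback hides the mistake and the
# configured value of 8 is silently replaced by 1.
masked = getattr(Configured, "prcs", 1)

# The same typo with direct attribute access raises immediately.
raised = False
try:
    Configured.prcs
except AttributeError:
    raised = True

# The correctly spelled lookup needs no fallback at all.
value = Configured.procs
```

The trade-off raised in the thread still applies: declaring defaults on the class adds one more place where they live.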

allenwang28 · 2025-09-23T14:40:44Z

src/forge/controller/actor.py

+        actor = await cls.launch(**actor_kwargs)
+
+        # Patch shutdown to bypass endpoint system
+        actor.shutdown = types.MethodType(


hmm this is a hack, we shouldn't be doing this. I'm guessing it's because we want to preserve the ability to

svc = await MyActor.as_service()
await svc.shutdown()

?

No, as_service returns a ServiceInterface. So when we call service.shutdown(), we are actually calling ServiceInterface.shutdown

The reason I have to do this hacky thing is:
Without it, actor.shutdown() gives me this error:

RuntimeError: Actor <class 'tests.unit_tests.test_service.Counter'>.shutdown is not annotated as an endpoint. To call it as one, add a @endpoint decorator to it, or directly wrap it in one as_endpoint(obj.method).call(...)

If I simply decorate shutdown with @endpoint, we'd have to call it like

await actor.shutdown.call()

But it would still give error:

AssertionError("Called shutdown on a replica with no proc_mesh.")

Any suggestions?

Ah, I see. In that case, I think we should not support actor.shutdown() for now, and just rely on e.g.

await RLTrainer.stop(trainer)

Maybe what we can do next is have the provisioner keep track of all of the proc meshes and do a global shutdown(), including all the services etc. We can discuss more; I just want to unblock this PR.

Sounds good. Done!
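For readers unfamiliar with the patch that was dropped here, this is a plain-Python sketch of what binding a function with `types.MethodType` does. `ToyActor` and `direct_shutdown` are hypothetical and no actor framework is involved. The override lives on a single instance, which is part of why it reads as a hack: the class definition no longer tells you what `shutdown` does.

```python
import asyncio
import types


class ToyActor:
    async def shutdown(self):
        return "class-level shutdown"


async def direct_shutdown(self):
    return f"patched shutdown for {type(self).__name__}"


patched = ToyActor()
untouched = ToyActor()

# Bind the replacement to this one instance only; the class and every
# other instance keep the original method.
patched.shutdown = types.MethodType(direct_shutdown, patched)

patched_result = asyncio.run(patched.shutdown())
untouched_result = asyncio.run(untouched.shutdown())
```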

allenwang28 · 2025-09-23T20:39:55Z

src/forge/controller/actor.py


    @classmethod
-    async def launch(cls, *, process_config: ProcessConfig, **kwargs) -> "ForgeActor":
+    async def launch(cls, **kwargs) -> "ForgeActor":


can you add *args here? This solves the *args related TODO that's listed here!

Added in launch and as_actor. Also tested in test_as_actor_with_kwargs_config
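A toy sketch of the `*args` forwarding discussed here, assuming `as_actor` simply delegates to `launch`. `ToyActor` and this `Counter` are hypothetical stand-ins, and construction happens locally instead of on a proc mesh. As in the `launch` suggestion earlier in the review, `name` is popped before the remaining arguments reach the constructor.

```python
import asyncio


class ToyActor:
    @classmethod
    async def launch(cls, *args, **kwargs):
        # "name" is consumed by the launcher and never reaches __init__.
        actor_name = kwargs.pop("name", cls.__name__)
        actor = cls(*args, **kwargs)
        actor.name = actor_name
        return actor

    @classmethod
    async def as_actor(cls, *args, **actor_kwargs):
        # Forward positional and keyword arguments unchanged.
        return await cls.launch(*args, **actor_kwargs)


class Counter(ToyActor):
    def __init__(self, v):
        self.v = v


positional = asyncio.run(Counter.as_actor(10))
keyword = asyncio.run(Counter.as_actor(v=10, name="second_counter"))
```

Both calls build a `Counter` with `v` set to 10; only the second overrides the default actor name.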

allenwang28

Ty @DNXie!

allenwang28 · 2025-09-23T20:56:34Z

src/forge/controller/actor.py

-            # Option C: skip options, use the default service config with num_replicas=1, procs=1
-            service = await MyForgeActor.as_service(...)
-            await service.shutdown()
+        Returns a dynamically created subclass of this ForgeActor with bound configuration.


Suggested change

-        Returns a dynamically created subclass of this ForgeActor with bound configuration.
+        Returns a version of ForgeActor with configured resource attributes.

allenwang28 · 2025-09-23T21:30:03Z

tests/unit_tests/test_service.py

+        self.kwargs = kwargs
+
+    @endpoint
+    async def get_args(self):


let's remove these tests, i think it's fine without

… service and actor (meta-pytorch#195)

* add as_actor
* add as_actor
* refactor options to support both
* rename num_hosts to hosts
* rename
* refactor actor.py and add more test cases
* options stop taking config obj
* fix lint
* fix ci
* fix broken tests
* fix lint
* remove xxService class
* simplify launch
* resolve comments
* revert shutdown
* remove shutdown patch
* support args
* update docstring
* support args in as_service
* fix ci

add as_actor

15a5f0f

DNXie requested a review from allenwang28 September 19, 2025 18:59

meta-cla bot added the CLA Signed label Sep 19, 2025

allenwang28 reviewed Sep 20, 2025

DNXie added 8 commits September 22, 2025 10:32

add as_actor

1aa0dc3

refactor options to support both

be146a4

rename num_hosts to hosts

c8a1733

rename

72d0315

refactor actor.py and add more test cases

aeb6282

Merge branch 'as_actor' of github.com:DNXie/forge into as_actor

be1fbe9

options stop taking config obj

782e67d

fix lint

310b04d

DNXie requested a review from allenwang28 September 22, 2025 21:12

DNXie added 4 commits September 22, 2025 14:18

Merge remote-tracking branch 'upstream/main' into as_actor

5565b03

fix ci

9eb7c8f

fix broken tests

56c5b5e

fix lint

53f773c

allenwang28 reviewed Sep 22, 2025

DNXie added 2 commits September 22, 2025 16:45

remove xxService class

e7f5a76

simplify launch

3e733de

DNXie requested a review from allenwang28 September 22, 2025 23:53

allenwang28 reviewed Sep 23, 2025

resolve comments

88dc895

DNXie requested a review from allenwang28 September 23, 2025 19:04

revert shutdown

82eb6ca

allenwang28 reviewed Sep 23, 2025

remove shutdown patch

691344c

allenwang28 approved these changes Sep 23, 2025

DNXie added 2 commits September 23, 2025 14:23

support args

d67e55e

update docstring

645dea4

allenwang28 reviewed Sep 23, 2025

DNXie added 2 commits September 23, 2025 14:36

support args in as_service

0862965

fix ci

48d70e2

DNXie merged commit 88fcd6b into meta-pytorch:main Sep 23, 2025
5 checks passed

DNXie mentioned this pull request Sep 23, 2025

[RFC] Add MyActor.as_actor() API #173

Closed

casteryh mentioned this pull request Sep 24, 2025

grpo and toy_rl apps broken #225

Closed

DNXie mentioned this pull request Sep 24, 2025

Fix Policy.launch to be compatible with new ForgeActor.launch API #228

Merged
