Description
Currently, pipeline modules are moved to the preferred compute device during __call__. This is reasonable, as they stay there as long as the user keeps passing the same torch_device across calls.
However, in multi-GPU model-serving scenarios, it could be useful to move each pipeline to a dedicated device during or immediately after instantiation. This would make it possible to create, say, 8 different pipelines and move each one to a different GPU, as sketched below. Doing it this way could save CPU memory while preparing the service.
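For instance, assuming the proposed to method existed, a serving setup could instantiate one pipeline per GPU in turn (a minimal sketch; model_id is the same placeholder used in the snippets below):

from diffusers import StableDiffusionPipeline

# Hypothetical: relies on the proposed pipeline.to() method.
# Each pipeline is moved to its GPU right after loading, so CPU memory
# only needs to hold one pipeline's weights at a time.
pipes = [
    StableDiffusionPipeline.from_pretrained(model_id).to(f"cuda:{i}")
    for i in range(8)  # one pipeline per GPU
]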
Currently, the workaround to achieve the same is to perform a call with dummy data immediately after instantiation.
Describe the solution you'd like
Ideally, the following should work:
pipe = StableDiffusionPipeline.from_pretrained(model_id).to("cuda:1")
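A minimal sketch of how such a method could be implemented, assuming the pipeline stores its sub-modules as plain attributes (an assumption about diffusers internals; the actual module registration may differ):

import torch

class DiffusionPipeline:
    # ... existing pipeline code ...

    def to(self, torch_device):
        # Move every attribute that is a torch.nn.Module to the target
        # device; non-module components (tokenizer, scheduler) stay as-is.
        for value in vars(self).values():
            if isinstance(value, torch.nn.Module):
                value.to(torch_device)
        # Return self so the call can be chained, mirroring nn.Module.to.
        return self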
Describe alternatives you've considered
Current workaround (a dummy one-step call moves the modules to the device as a side effect):

from diffusers import StableDiffusionPipeline

pipe = StableDiffusionPipeline.from_pretrained(model_id)
_ = pipe(["cat"], num_inference_steps=1, torch_device="cuda:1")
Another alternative would be to pass the device to the initializer. This could be done in addition to adding a to method, but I believe it's not necessary, as to is familiar enough to PyTorch users.
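For comparison, the initializer-based alternative might look like this (the torch_device keyword is hypothetical):

pipe = StableDiffusionPipeline.from_pretrained(model_id, torch_device="cuda:1")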
Additional context
See discussion in this Slack thread.