This repository was archived by the owner on Oct 9, 2024. It is now read-only.

How to understand this note: "note: Since Deepspeed-ZeRO can process multiple generate streams in parallel its throughput can be further divided by 8 or 16 ..." #99

Open
@HuipengXu


Why does the note say that only ds_zero currently runs world_size generate streams on world_size GPUs? Accelerate and DeepSpeed-Inference also use multiprocessing, so shouldn't they be doing the same?
