[Refactor] For PP, wrap model and schedule into a wrapper model so that Trainer does not need to know about PP #416

@BlueCrescent

Description

Currently, the Trainer receives the scheduled pipeline if one exists, or None otherwise. When a batch is processed, one of two different code paths is taken depending on this parameter.

Goal:
Create a wrapper, itself derived from NNModel, that contains both the actual model and the schedule, so that the PP code is hidden from the Trainer. It should probably:

  • Perform the schedule in its forward() call.
  • Do nothing in its backward() call.

Potential problems:

  • backward() is called on the loss rather than on the model (and loss computation in general differs between the PP and non-PP cases).
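
One possible shape for such a wrapper, as a minimal framework-free sketch. Everything here is hypothetical and for illustration only: `NNModel` is a plain stand-in for the project's base class, `FakeSchedule` imitates a pipeline schedule whose `step()` runs forward and backward over all microbatches, and `PipelinedModel` and `AlreadyBackpropagatedLoss` are invented names. The sketch handles the "backward is called on the loss" problem by returning a loss handle whose `backward()` is a no-op, which is only one of several conceivable answers.

```python
class NNModel:
    """Stand-in for the project's NNModel base class (assumption)."""

    def forward(self, *args, **kwargs):
        raise NotImplementedError

    def __call__(self, *args, **kwargs):
        return self.forward(*args, **kwargs)


class FakeSchedule:
    """Hypothetical pipeline schedule: step() does forward AND backward."""

    def __init__(self, stage_fn, loss_fn, n_microbatches):
        self.stage_fn = stage_fn
        self.loss_fn = loss_fn
        self.n_microbatches = n_microbatches
        self.backward_ran = False

    def step(self, inputs, target, losses):
        chunk = len(inputs) // self.n_microbatches
        for i in range(self.n_microbatches):
            lo, hi = i * chunk, (i + 1) * chunk
            outputs = [self.stage_fn(x) for x in inputs[lo:hi]]
            losses.append(self.loss_fn(outputs, target[lo:hi]))
        # A real schedule would have propagated gradients by this point.
        self.backward_ran = True


class AlreadyBackpropagatedLoss:
    """Loss handle whose backward() does nothing, because the schedule
    has already run the backward pass inside step()."""

    def __init__(self, value):
        self.value = value

    def backward(self):
        pass  # intentionally a no-op


class PipelinedModel(NNModel):
    """Wraps the model and its schedule so the caller never sees PP."""

    def __init__(self, model, schedule):
        self.model = model
        self.schedule = schedule

    def forward(self, inputs, targets):
        losses = []
        self.schedule.step(inputs, target=targets, losses=losses)
        mean_loss = sum(losses) / len(losses)
        return AlreadyBackpropagatedLoss(mean_loss)


def train_step(model, inputs, targets):
    """Trainer-side code with no PP-specific branch."""
    loss = model(inputs, targets)
    loss.backward()  # no-op for PipelinedModel
    return loss.value
```

Note that this sketch moves loss computation into the model call, so the non-PP path would need a matching wrapper (or the Trainer a matching contract) for the two cases to look identical from the outside.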

Additionally:

  • Apply the same adaptation in the evaluator.
