Support entropy loss

# Summary

Add support for an entropy loss term. This requires adding an entropy loss coefficient to the `trainer.algorithm` configuration. On the loss side, it involves computing entropy with gradients enabled in [ModelWrapper](https://github.com/NovaSky-AI/SkyRL/blob/104fefb758d2002b0f3a416ac24af3519cb0e1fc/skyrl-train/skyrl_train/model_wrapper.py#L24) and propagating the result to [PolicyWorkerBase](https://github.com/NovaSky-AI/SkyRL/blob/104fefb758d2002b0f3a416ac24af3519cb0e1fc/skyrl-train/skyrl_train/workers/worker.py#L612).
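A minimal sketch of what the loss-side change could look like, assuming a PyTorch policy that exposes per-token logits. The names `token_entropy`, `loss_with_entropy`, and `entropy_coef` are hypothetical and do not come from the SkyRL codebase. Subtracting the entropy term (so that higher entropy lowers the loss) is a common convention and is assumed here, not confirmed by this issue.

```python
import torch
import torch.nn.functional as F


def token_entropy(logits: torch.Tensor) -> torch.Tensor:
    """Per-token entropy of the softmax distribution.

    logits: (batch, seq_len, vocab). Returns (batch, seq_len).
    No torch.no_grad() here, so gradients flow back to the logits.
    """
    log_probs = F.log_softmax(logits, dim=-1)
    return -(log_probs.exp() * log_probs).sum(dim=-1)


def loss_with_entropy(
    policy_loss: torch.Tensor,
    logits: torch.Tensor,
    loss_mask: torch.Tensor,
    entropy_coef: float,
):
    """Combine a scalar policy loss with a masked mean entropy bonus.

    entropy_coef is a hypothetical stand-in for the coefficient that
    would be read from the `trainer.algorithm` configuration.
    """
    entropy = token_entropy(logits)
    # Average only over tokens selected by the mask (e.g. response tokens).
    mean_entropy = (entropy * loss_mask).sum() / loss_mask.sum().clamp(min=1)
    # Assumed sign convention: reward higher entropy by lowering the loss.
    total_loss = policy_loss - entropy_coef * mean_entropy
    return total_loss, mean_entropy
```

Returning `mean_entropy` alongside the total loss keeps it available for logging as a metric, independent of whether the coefficient is nonzero.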

Support entropy loss #583
