With the current `LocalHFBackend`, GPU memory usage seems to grow gradually across repeated calls to `instruct`, eventually resulting in an out-of-memory (OOM) error.
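One way to confirm this kind of growth is to record a memory reading after every call and check whether the readings keep climbing. The sketch below is only an illustration, not the backend's own code: `track_memory_growth` and `grows_steadily` are hypothetical helper names, and the demo at the bottom uses a stand-in "leaky" step rather than a real model call so that it runs without a GPU.

```python
from typing import Callable, List


def track_memory_growth(step: Callable[[], object],
                        probe: Callable[[], int],
                        iterations: int = 20) -> List[int]:
    """Call `step` repeatedly and record `probe()` after each call."""
    readings: List[int] = []
    for _ in range(iterations):
        step()
        readings.append(probe())
    return readings


def grows_steadily(readings: List[int]) -> bool:
    """True if readings never decrease and the last one exceeds the first."""
    if len(readings) < 2:
        return False
    never_drops = all(b >= a for a, b in zip(readings, readings[1:]))
    return never_drops and readings[-1] > readings[0]


if __name__ == "__main__":
    # Stand-in for a leaking call (an assumption for illustration only):
    # every step retains another 1024-byte buffer.
    retained = []
    readings = track_memory_growth(
        step=lambda: retained.append(bytearray(1024)),
        probe=lambda: sum(len(buf) for buf in retained),
        iterations=5,
    )
    print(readings, grows_steadily(readings))
```

On a CUDA machine, `probe` could be PyTorch's `torch.cuda.memory_allocated`, and `step` a closure around the repeated `instruct` call. The exact arguments to `instruct` are not given here, so that wiring is left to the reader.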