This repository has been archived by the owner on May 28, 2024. It is now read-only.
I tried running this inside the latest image, but after the model warm-up it just died with no error.
I was running:
aviary run --model ~/models/continuous_batching/mosaicml--mpt-7b-chat.yaml
The only change inside the yaml is removing

ray_actor_options:
  num_gpus: 1

since I don't have an 'accelerator_type_a10' GPU; I have an A6000.
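For context, a sketch of how that block is likely nested in the stock mosaicml--mpt-7b-chat.yaml. The deployment_config nesting and the accelerator_type_a10 resource entry are assumptions from memory of the aviary configs, not copied from the file:

```yaml
# Assumed layout of the relevant section of mosaicml--mpt-7b-chat.yaml.
# The commented-out lines are the ones deleted for the run described above.
deployment_config:
  # ray_actor_options:
  #   num_gpus: 1
  #   resources:
  #     accelerator_type_a10: 0.01   # assumed key; an A6000 node never exposes this resource
  autoscaling_config:
    min_replicas: 1                  # illustrative value, not from the file
```

If the stock yaml does request a custom accelerator_type_a10 resource, Ray can never schedule the replica on an A6000 node, which is why dropping the request is necessary there; but deleting num_gpus as well means the worker actor requests zero GPUs.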
Here is the tail of the logs:
ve taken more than 30s to initialize. This may be caused by a slow __init__ or reconfigure method.
(AviaryTGIInferenceWorker:mosaicml/mpt-7b-chat pid=31233) Downloaded /home/ray/data/hub/models--mosaicml--mpt-7b-chat/snapshots/64e5c9c9fb53a8e89690c2dee75a5add37f7113e/pytorch_model-00001-of-00002.bin in 0:02:35.
(AviaryTGIInferenceWorker:mosaicml/mpt-7b-chat pid=31233) Download: [1/2] -- ETA: 0:02:35
(AviaryTGIInferenceWorker:mosaicml/mpt-7b-chat pid=31233) Download file: pytorch_model-00002-of-00002.bin
(ServeController pid=30116) WARNING 2023-10-02 06:40:38,770 controller 30116 deployment_state.py:2006 - Deployment 'mosaicml--mpt-7b-chat' in application 'mosaicml--mpt-7b-chat' has 1 replicas that have taken more than 30s to initialize. This may be caused by a slow __init__ or reconfigure method.
(ServeController pid=30116) WARNING 2023-10-02 06:41:08,775 controller 30116 deployment_state.py:2006 - Deployment 'mosaicml--mpt-7b-chat' in application 'mosaicml--mpt-7b-chat' has 1 replicas that have taken more than 30s to initialize. This may be caused by a slow __init__ or reconfigure method.
(AviaryTGIInferenceWorker:mosaicml/mpt-7b-chat pid=31233) Downloaded /home/ray/data/hub/models--mosaicml--mpt-7b-chat/snapshots/64e5c9c9fb53a8e89690c2dee75a5add37f7113e/pytorch_model-00002-of-00002.bin in 0:00:58.
(AviaryTGIInferenceWorker:mosaicml/mpt-7b-chat pid=31233) Download: [2/2] -- ETA: 0
(AviaryTGIInferenceWorker:mosaicml/mpt-7b-chat pid=31233) No safetensors weights found for model mosaicml/mpt-7b-chat at revision None. Converting PyTorch weights to safetensors.
(ServeController pid=30116) WARNING 2023-10-02 06:41:38,862 controller 30116 deployment_state.py:2006 - Deployment 'mosaicml--mpt-7b-chat' in application 'mosaicml--mpt-7b-chat' has 1 replicas that have taken more than 30s to initialize. This may be caused by a slow __init__ or reconfigure method.
(AviaryTGIInferenceWorker:mosaicml/mpt-7b-chat pid=31233) Convert: [1/2] -- Took: 0:00:20.415345
(AviaryTGIInferenceWorker:mosaicml/mpt-7b-chat pid=31233) Convert: [2/2] -- Took: 0:00:06.243851
(ServeReplica:mosaicml--mpt-7b-chat:mosaicml--mpt-7b-chat pid=30186) [INFO 2023-10-02 06:42:05,045] tgi.py: 214 Warming up model on workers...
(AviaryTGIInferenceWorker:mosaicml/mpt-7b-chat pid=31233) [INFO 2023-10-02 06:42:05,054] tgi_worker.py: 650 Model is warming up. Num requests: 3 Prefill tokens: 6000 Max batch total tokens: None
(AviaryTGIInferenceWorker:mosaicml/mpt-7b-chat pid=31233) [INFO 2023-10-02 06:42:07,307] tgi_worker.py: 663 Model finished warming up (max_batch_total_tokens=None) and is ready to serve requests.
(ServeReplica:mosaicml--mpt-7b-chat:mosaicml--mpt-7b-chat pid=30186) [INFO 2023-10-02 06:42:07,520] tgi.py: 170 Rolling over to new worker group [Actor(AviaryTGIInferenceWorker, 725292a8070301f947130c2c01000000)]
(ServeReplica:mosaicml--mpt-7b-chat:mosaicml--mpt-7b-chat pid=30186) [INFO 2023-10-02 06:42:07,661] model_app.py: 83 Reconfigured and ready to serve.
(ServeReplica:mosaicml--mpt-7b-chat:mosaicml--mpt-7b-chat pid=30186) DeprecationWarning: `ray.state.actors` is a private attribute and access will be removed in a future Ray version.
/home/ray/anaconda3/lib/python3.9/tempfile.py:821: ResourceWarning: Implicitly cleaning up <TemporaryDirectory '/tmp/tmptyp67o3t'>
_warnings.warn(warn_message, ResourceWarning)
/home/ray/anaconda3/lib/python3.9/subprocess.py:1052: ResourceWarning: subprocess 28960 is still running
_warn("subprocess %s is still running" % self.pid,
ResourceWarning: Enable tracemalloc to get the object allocation traceback
(base) ray@4cd79d6dad32:~$