
Deprecate serve command line options in favor of yaml config file #146

Closed
xwu99 opened this issue Mar 15, 2024 · 10 comments · Fixed by #165

xwu99 (Contributor) commented Mar 15, 2024

We will put most of the complex configuration in the YAML file, keep only the necessary command line options, and remove the ones already present in the YAML file to avoid confusion.
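
For illustration, here is a rough sketch of how the options being moved out of the CLI might look inside a model's YAML config. The keys simply mirror the command line options discussed later in this thread; the actual key names and nesting in llm_on_ray/inference/models/*.yaml may differ.

```python
# Illustrative only: a guessed YAML shape for the options leaving the CLI.
# Key names mirror the flags discussed below; the real schema may differ.
import yaml  # pip install pyyaml

example = """
port: 8000
route_prefix: /llama-2-7b
num_replicas: 1
cpus_per_worker: 24
gpus_per_worker: 0
hpus_per_worker: 0
deepspeed: false
workers_per_group: 2
ipex: true
device: cpu
"""

config = yaml.safe_load(example)
print(config["device"])  # -> "cpu"
```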

carsonwang (Contributor) commented:

@Deegue Please list the options to keep and options to remove here.

Deegue (Contributor) commented Mar 27, 2024

As for serve.py, I'm going to remove:

--port
--route_prefix
--num_replicas
--cpus_per_worker
--gpus_per_worker
--hpus_per_worker
--deepspeed
--workers_per_group
--ipex
--device

while these options will still be kept on the command line:
--config_file
--model_id_or_path
--tokenizer_id_or_path
--models
--serve_local_only
--simple
--keep_serve_terminal
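
For reference, a minimal sketch of what the trimmed argument parser in serve.py might look like if only these options remain. This assumes argparse is used; option types and defaults are guesses, not the actual implementation.

```python
# Sketch only: a guess at the trimmed serve.py CLI after this change.
# Option types and defaults are assumptions, not the real implementation.
import argparse

parser = argparse.ArgumentParser(description="llm-on-ray serve (sketch)")
parser.add_argument("--config_file", type=str, default=None,
                    help="YAML inference config file.")
parser.add_argument("--model_id_or_path", type=str, default=None)
parser.add_argument("--tokenizer_id_or_path", type=str, default=None)
parser.add_argument("--models", nargs="*", default=None,
                    help="Subset of the predefined models to deploy.")
parser.add_argument("--serve_local_only", action="store_true")
parser.add_argument("--simple", action="store_true")
parser.add_argument("--keep_serve_terminal", action="store_true")

args = parser.parse_args()
print(args)
```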

carsonwang (Contributor) commented:

cc @xwu99 @jiafuzha @KepingYan

xwu99 (Contributor, Author) commented Apr 3, 2024

@Deegue thanks for the summary. I am OK with this. One exception: @KepingYan, please clarify the differences between the simple and OpenAI modes when using --route_prefix. Could we just put it in the yaml file?

KepingYan (Contributor) commented:

Now we support three modes of passing parameters:

  1. If --config_file is specified, all attribute values are determined by this file.
  2. --config_file is None and --model_id_or_path is specified. In this case, --tokenizer_id_or_path, --port, --route_prefix, --num_replicas, --cpus_per_worker, --gpus_per_worker, --hpus_per_worker, --deepspeed, --workers_per_group, --ipex and --device will take effect.
  3. Both --config_file and --model_id_or_path are None. Then all config files in the llm_on_ray/inference/models/ directory are used as the list of models to serve, and users can specify a sub-list of models to deploy via --models.

If we remove these parameters (--port, --route_prefix, --num_replicas, --cpus_per_worker, --gpus_per_worker, --hpus_per_worker, --deepspeed, --workers_per_group, --ipex, --device), what will determine the other attribute values when the user specifies --model_id_or_path? After the modification, which methods of passing parameters will we provide to users?
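
For illustration, the three modes above could be sketched roughly as follows. All function and helper names here are invented for the example and do not come from serve.py.

```python
# Rough sketch of the three parameter-passing modes described above.
# All names here are hypothetical; the real serve.py logic may differ.
import glob
import os

def build_config_from_cli_flags(args):
    # Hypothetical helper for mode 2: fold the per-model CLI flags
    # (--port, --route_prefix, --device, ...) into a single config dict.
    return {"model_id_or_path": args.model_id_or_path, "port": args.port,
            "route_prefix": args.route_prefix, "device": args.device}

def resolve_serve_configs(args):
    if args.config_file is not None:
        # Mode 1: all attribute values come from the given YAML file.
        return [args.config_file]
    if args.model_id_or_path is not None:
        # Mode 2 (proposed for removal): model given directly on the CLI.
        return [build_config_from_cli_flags(args)]
    # Mode 3: use every predefined config under inference/models/,
    # optionally narrowed down via --models.
    paths = sorted(glob.glob("llm_on_ray/inference/models/*.yaml"))
    if args.models:
        paths = [p for p in paths
                 if os.path.splitext(os.path.basename(p))[0] in args.models]
    return paths
```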

xwu99 (Contributor, Author) commented Apr 3, 2024

> Now we support three modes of passing parameters: [...] After the modification, which methods of passing parameters will we provide to users?

Can we keep 1 and 3 and remove 2?

@Deegue Keep in mind that we need to serve multiple models at the same time, so we also need to support multiple config files.
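
One simple way to support that on the command line (a sketch assuming argparse is used; not the merged change) would be to let --config_file accept several YAML paths:

```python
import argparse

# Sketch: accept one YAML config file per model so several models can be
# served at once. Assumes argparse; not the actual merged implementation.
parser = argparse.ArgumentParser()
parser.add_argument("--config_file", nargs="+", default=None,
                    help="One or more YAML inference config files.")

# Example (file names are made up):
args = parser.parse_args(["--config_file", "llama-2-7b.yaml", "gpt-j-6b.yaml"])
print(args.config_file)  # ['llama-2-7b.yaml', 'gpt-j-6b.yaml']
```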

Deegue (Contributor) commented Apr 3, 2024

> Can we keep 1 and 3 and remove 2?
>
> @Deegue Keep in mind that we need to serve multiple models at the same time, so we also need to support multiple config files.

Obviously, it will be clearer if we remove case 2.

xwu99 (Contributor, Author) commented Apr 8, 2024

> Obviously, it will be clearer if we remove case 2.

Yes, please go ahead and remove case 2, and update the code & docs.

xwu99 (Contributor, Author) commented Apr 8, 2024

Also rename --model_id_or_path to --model_id.

KepingYan (Contributor) commented:

> Also rename --model_id_or_path to --model_id.

If we remove case 2, can we also remove --model_id_or_path and --tokenizer_id_or_path? Just use --models to specify the list of models to deploy?

xwu99 closed this as completed May 11, 2024