Add a gRPC server that can handle eval requests from a client using SQLGen#1
Merged
tommyang merged 1 commit into GoogleCloudPlatform:main on Aug 20, 2024
- `eval_server.py` is the alternative `main()` of EvalBench. The CLI mode `evalbench.py` is untouched.
- `eval_service.py` implements the gRPC service logic. Currently it converts streaming RPC requests into a list of `EvalInput` and calls `Evaluator.evaluate()`.
- Adds a Containerfile definition for deployment, using a multi-stage build and a distroless base image.

Known limitations/future work:

- Configs are currently loaded at `eval_service` init, so the service cannot handle different databases yet. This is the main reason the GetDataset RPC is not implemented yet.
- The EvalResponse proto is under-defined; we likely need to report scoring results back to the client eventually.
- Because of the way `Evaluator.evaluate()`, `SQLPromptGenWork`, and `SQLGenWork` currently work (`nl_prompt` -> `generated_prompt` -> `generated_sql`), there is currently a hack in `eval_service.py` where client-side SQLGen-generated SQL is passed in as `EvalInput.nl_prompt`, so that it becomes `generated_sql` after going through the passthrough prompt and model generators. `Evaluator.evaluate()` needs to be refactored to get rid of this hack (e.g. the ability to skip `SQLPromptGenWork` and `SQLGenWork` entirely, not just set them to passthrough); otherwise the `generated_sql` value is still overridden by `SQLGenWork`. This would be a relatively invasive change, since `Evaluator.evaluate()` is shared by the service and the CLI mode of EvalBench.
- There is high OOM potential, since all streamed requests are buffered in memory before evaluation.
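The multi-stage distroless Containerfile mentioned above might look roughly like this. This is a hedged sketch, not the file from the PR: the Python version, dependency layout, and `eval_server.py` entrypoint path are all assumptions.

```dockerfile
# Illustrative multi-stage build; stage names and paths are assumptions.
FROM python:3.11-slim AS build
WORKDIR /app
COPY requirements.txt .
# Install dependencies into a directory we can copy into the final image.
RUN pip install --no-cache-dir --target=/app/deps -r requirements.txt
COPY . .

# Distroless final image: no shell or package manager, smaller attack surface.
FROM gcr.io/distroless/python3-debian12
WORKDIR /app
COPY --from=build /app /app
ENV PYTHONPATH=/app/deps
ENTRYPOINT ["python", "eval_server.py"]
```

The build stage carries pip and build tooling; only the installed application and its dependencies reach the distroless runtime image.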
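The conversion and the `nl_prompt` hack described above can be sketched roughly as follows. This is a minimal illustration, not the actual service code: `EvalInput` is stubbed as a dataclass here, the request field names are assumptions, and `requests_to_inputs` is a hypothetical helper standing in for the logic inside `eval_service.py`.

```python
# Hypothetical sketch of the streaming-to-batch conversion in eval_service.py.
# EvalInput is stubbed here; the real definition lives in EvalBench's protos.
from dataclasses import dataclass
from typing import Iterable, List


@dataclass
class EvalInput:
    nl_prompt: str
    generated_sql: str = ""


def requests_to_inputs(requests: Iterable[dict]) -> List[EvalInput]:
    """Drain a streaming RPC into a list for one Evaluator.evaluate() call.

    Note the hack described above: the client's already-generated SQL is
    stored in nl_prompt, so that the passthrough prompt and model
    generators copy it into generated_sql downstream.
    """
    inputs = []
    for req in requests:
        # "generated_sql" as a request field name is an assumption.
        inputs.append(EvalInput(nl_prompt=req["generated_sql"]))
    return inputs
```

Buffering the whole stream into a list before evaluating is what creates the OOM risk noted above; a refactored `Evaluator.evaluate()` that accepts an iterator would avoid it.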
IsmailMehdi approved these changes on Aug 20, 2024.