
Add a gRPC server that can handle eval requests from a client using SQLGen #1

Merged
tommyang merged 1 commit into GoogleCloudPlatform:main from tommyang:push-ootuqyuwkrku on Aug 20, 2024

Conversation

@tommyang
Collaborator

eval_server.py is the alternative main() of EvalBench; the CLI mode, evalbench.py, is untouched.

eval_service.py implements the gRPC service logic; currently it converts streaming RPC requests into a list of EvalInput and calls Evaluator.evaluate().

Known limitations/future work:

  • Configs are currently loaded at eval_service init, so the server cannot handle different databases yet. This is the main reason the GetDataset RPC is not implemented yet.
  • The EvalResponse proto is under-defined; we will likely need to report scoring results back eventually.
  • Because of how Evaluator.evaluate() + SQLPromptGenWork + SQLGenWork currently work (nl_prompt -> generated_prompt -> generated_sql), there is a hack in eval_service.py: client-side SQLGen-generated SQL is passed in as EvalInput.nl_prompt so that it becomes generated_sql after going through passthrough prompt and model generators. Evaluator.evaluate() needs to be refactored to remove this hack (e.g. by adding the ability to skip SQLPromptGenWork and SQLGenWork entirely, not just run them as passthroughs); otherwise the generated_sql value is still overridden by SQLGenWork. This would be a relatively invasive change, since Evaluator.evaluate() is shared by the service and the CLI mode of EvalBench.
  • There is high OOM potential, since the full request stream is buffered in memory before evaluation.
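
As a rough illustration, the streaming-to-batch conversion in eval_service.py could be sketched like this. All class and method names below are simplified stand-ins modeled on the description above, not EvalBench's actual API:

```python
# Simplified stand-in sketch of the streaming RPC -> batched evaluate() flow.
# EvalInput, EvalResponse, Evaluator, and the Evaluate method are assumptions.
from dataclasses import dataclass
from typing import Iterable, Iterator, List


@dataclass
class EvalInput:
    nl_prompt: str


@dataclass
class EvalResponse:
    status: str


class Evaluator:
    def evaluate(self, inputs: List[EvalInput]) -> List[EvalResponse]:
        # Stand-in for EvalBench's real Evaluator.evaluate().
        return [EvalResponse(status="ok") for _ in inputs]


class EvalService:
    """Mirrors a streaming gRPC servicer method: drain the request stream,
    run one batched evaluation, then stream the responses back."""

    def __init__(self, evaluator: Evaluator) -> None:
        self._evaluator = evaluator

    def Evaluate(self, request_iterator: Iterable[EvalInput]) -> Iterator[EvalResponse]:
        # Buffering the entire stream into a list is what creates the
        # OOM risk called out in the limitations above.
        inputs = list(request_iterator)
        yield from self._evaluator.evaluate(inputs)


responses = list(
    EvalService(Evaluator()).Evaluate(iter([EvalInput("q1"), EvalInput("q2")]))
)
```

In the real service the request and response types would come from the generated proto stubs rather than dataclasses.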

@tommyang tommyang force-pushed the push-ootuqyuwkrku branch 2 times, most recently from c8d4d36 to 2e4ae77, on August 19, 2024 23:03
@tommyang tommyang requested a review from IsmailMehdi August 20, 2024 00:22
IsmailMehdi previously approved these changes Aug 20, 2024
Collaborator

@IsmailMehdi IsmailMehdi left a comment


LGTM

…QLGen

- `eval_server.py` is the alternative `main()` of EvalBench; the CLI mode, `evalbench.py`, is untouched.
- `eval_service.py` implements the gRPC service logic; currently it converts streaming RPC requests into a list of `EvalInput` and calls `Evaluator.evaluate()`.
- Add a Containerfile definition for deployment, using a multi-stage build and a distroless base image.
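
The Containerfile itself is not shown in this thread; a generic multi-stage, distroless sketch for a Python gRPC server might look like the following (the file layout, `requirements.txt`, and `eval_server.py` entrypoint are assumptions, not the actual file):

```dockerfile
# Build stage: install dependencies with pip against a full Python image.
FROM python:3.11-slim AS build
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir --target /app/deps -r requirements.txt
COPY . .

# Runtime stage: distroless image containing little more than the interpreter.
FROM gcr.io/distroless/python3-debian12
COPY --from=build /app /app
WORKDIR /app
ENV PYTHONPATH=/app/deps
# The distroless python3 image's entrypoint is the Python interpreter,
# so CMD supplies the script to run.
CMD ["eval_server.py"]
```

Distroless images ship no shell or package manager, which shrinks both the image size and the attack surface of the deployed server.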

Known limitations/future work:
- Configs are currently loaded at eval_service init, so the server cannot handle different databases yet. This is the main reason the GetDataset RPC is not implemented yet.
- The EvalResponse proto is under-defined; we will likely need to report scoring results back eventually.
- Because of how Evaluator.evaluate() + SQLPromptGenWork + SQLGenWork currently work (`nl_prompt` -> `generated_prompt` -> `generated_sql`), there is a hack in `eval_service.py`: client-side SQLGen-generated SQL is passed in as `EvalInput.nl_prompt` so that it becomes `generated_sql` after going through passthrough prompt and model generators. `Evaluator.evaluate()` needs to be refactored to remove this hack (e.g. by adding the ability to skip SQLPromptGenWork and SQLGenWork entirely, not just run them as passthroughs); otherwise the `generated_sql` value is still overridden by SQLGenWork. This would be a relatively invasive change, since Evaluator.evaluate() is shared by the service and the CLI mode of EvalBench.
- There is high OOM potential, since the full request stream is buffered in memory before evaluation.
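
The passthrough hack in the third bullet can be illustrated with a toy sketch. The function names here are hypothetical; the real SQLPromptGenWork and SQLGenWork classes are more involved:

```python
# Hypothetical illustration of the nl_prompt -> generated_prompt ->
# generated_sql passthrough hack described above.


def passthrough_prompt_gen(nl_prompt: str) -> str:
    # Stands in for a passthrough SQLPromptGenWork: generated_prompt == nl_prompt.
    return nl_prompt


def passthrough_sql_gen(generated_prompt: str) -> str:
    # Stands in for a passthrough SQLGenWork: generated_sql == generated_prompt.
    return generated_prompt


# The client already ran SQLGen, so its SQL is smuggled in as the "NL prompt".
client_generated_sql = "SELECT name FROM customers WHERE id = 7"
nl_prompt = client_generated_sql  # the hack: SQL masquerading as an NL prompt

generated_prompt = passthrough_prompt_gen(nl_prompt)
generated_sql = passthrough_sql_gen(generated_prompt)
```

Because both stages are identity functions, the client's SQL survives the pipeline unchanged; skipping the two stages outright would make this rewiring unnecessary.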
@tommyang tommyang merged commit f39880f into GoogleCloudPlatform:main Aug 20, 2024
@tommyang tommyang deleted the push-ootuqyuwkrku branch August 21, 2024 01:22