
Logging/Saving Settings and Instructions for Inference Jobs #1646

Closed
mohammedouhibi opened this issue May 26, 2024 · 6 comments

Comments

@mohammedouhibi

There needs to be a way to log or save all settings and instructions provided for every inference job that the vLLM inference server receives. This would be useful for debugging purposes, as it would allow us to track and analyze the input data and configurations used for each job.

Proposed solution:

Implement a logging mechanism that captures and stores the following information for each inference job (a rough sketch follows this list):

  • Model parameters (e.g., model name, version, quantization settings)
  • Input data (e.g., prompt, context, or other input files)
  • Inference settings (e.g., temperature, top-k, top-p, number of tokens, etc.)
  • Timestamp and other relevant metadata
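
For illustration, a record along these lines could be appended per job. This is only a sketch; the field names and the helper are placeholders, not anything h2oGPT or vLLM actually emits:

```python
import json
import time

def log_inference_job(path, model_params, inputs, settings):
    """Append one inference-job record to a JSON-lines log file (hypothetical helper)."""
    record = {
        "timestamp": time.time(),   # when the job was received
        "model": model_params,      # e.g. {"name": ..., "revision": ..., "quantization": ...}
        "input": inputs,            # e.g. {"prompt": ..., "context": ..., "input_files": [...]}
        "settings": settings,       # e.g. {"temperature": ..., "top_k": ..., "top_p": ..., "max_new_tokens": ...}
    }
    with open(path, "a", encoding="utf-8") as f:
        f.write(json.dumps(record) + "\n")

# Example call (values are illustrative only)
log_inference_job(
    "inference_jobs.jsonl",
    model_params={"name": "some-model", "quantization": "4bit"},
    inputs={"prompt": "Summarize the attached report.", "input_files": ["report.pdf"]},
    settings={"temperature": 0.1, "top_p": 0.75, "max_new_tokens": 512},
)
```
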
@pseudotensor
Collaborator

A lot of information is stored in the save directory in history.json for every inference job. It has everything you mentioned, though perhaps not in as much detail -- e.g., no input files themselves, just whether there are input files.

In addition, vLLM or TGI can also change their own logging.

@mohammedouhibi
Author

A lot of information is stored in the save directory in history.json for every inference job. It has everything you mentioned, though perhaps not in as much detail -- e.g., no input files themselves, just whether there are input files.

I can't seem to find history.json under /save; is there perhaps a run option that I'm missing?

@pseudotensor
Collaborator

--save_dir=foo will place it in the foo directory, relative to where you start up in the repo.
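
Once a few jobs have run, you can inspect that file. A minimal sketch, assuming history.json sits directly under the save directory and holds either a JSON list or one JSON object per line (adjust if the actual layout differs):

```python
import json
from pathlib import Path

def load_history(save_dir="foo"):
    """Read <save_dir>/history.json; tolerate either a JSON list or JSON-lines."""
    text = (Path(save_dir) / "history.json").read_text(encoding="utf-8")
    try:
        return json.loads(text)
    except json.JSONDecodeError:
        return [json.loads(line) for line in text.splitlines() if line.strip()]

for entry in load_history("foo"):
    print(sorted(entry.keys()))   # see which settings/metadata were captured per job
```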

@mohammedouhibi
Author

mohammedouhibi commented May 28, 2024

--save_dir=foo will place it in the foo directory, relative to where you start up in the repo.

Great! It works now.

Though I'm still missing the docs/chunks used in the job; I really think that needs to be added.
I have noticed that the docs loaded when using the h2oGPT interface differ from those used when sending the same instruction through a gradio client (I inferred this from the fact that the "num_prompt_tokens" value is higher when handling gradio client requests), and it's causing my model to behave differently (worse, in my case).

@pseudotensor
Collaborator

Hi, the API call and history.json do contain `save_dict['sources']` as the list of sources used.
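
A sketch of pulling the sources (and num_prompt_tokens, for comparing UI vs. gradio-client requests) out of each saved entry; only those two field names come from this thread, the rest of the layout is an assumption:

```python
import json
from pathlib import Path

text = (Path("foo") / "history.json").read_text(encoding="utf-8")   # "foo" as in --save_dir=foo above
try:
    entries = json.loads(text)                 # whole file is a JSON list
except json.JSONDecodeError:
    entries = [json.loads(line) for line in text.splitlines() if line.strip()]

for entry in entries:
    # 'sources' and 'num_prompt_tokens' are the fields mentioned in this thread;
    # everything else about the per-job entry is assumed
    print(entry.get("num_prompt_tokens"), entry.get("sources"))
```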

@mohammedouhibi
Author

I see, thanks!
