Add model and adapter headings #220

Merged (13 commits) on Feb 5, 2024

magdyksaleh
Collaborator

No description provided.

@@ -397,6 +404,15 @@ async fn generate(
time_per_token.as_millis().to_string().parse().unwrap(),
);

headers.insert(
"x-predibase-model-id",
Contributor

Nit: Might be a bit strange to hard-code predibase-specific things into an open-source library.

Contributor

Agree. Can we just return the base model name / size, and the adapter name passed in request?

Collaborator Author

Sure, we can just make it model-id and adapter-id.

Contributor

The convention is still to use an X- prefix for custom headers, so would something like x-lorax-model=pb://model-name make sense and cater for HF models too?

Contributor

@brightsparc left a comment

This would suffice for immediate needs, but how would this work for HF hosted models not loaded from PB?

@magdyksaleh
Collaborator Author

> This would suffice for immediate needs, but how would this work for HF hosted models not loaded from PB?

This is agnostic to the model source, but we can also pass a model source header so that you can infer the PB internals.
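
To make the naming discussion above concrete, here is a minimal sketch of vendor-neutral headers that also report the model source. Everything in it is an assumption for illustration: the header names (`x-model-id`, `x-model-source`, `x-adapter-id`), the `ModelSource` enum, and the `model_headers` helper do not come from this PR, and plain `(name, value)` pairs stand in for the router's real header map.

```rust
/// Where the base model was loaded from. The variants and their string
/// forms are illustrative assumptions, not taken from the PR.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum ModelSource {
    Hub,       // e.g. a Hugging Face hosted model
    Predibase, // e.g. a model loaded from Predibase
}

impl ModelSource {
    fn as_str(self) -> &'static str {
        match self {
            ModelSource::Hub => "hub",
            ModelSource::Predibase => "pb",
        }
    }
}

/// Build response headers as (name, value) pairs.
/// The header names are hypothetical.
pub fn model_headers(
    model_id: &str,
    source: ModelSource,
    adapter_id: Option<&str>,
) -> Vec<(&'static str, String)> {
    let mut headers = vec![
        ("x-model-id", model_id.to_string()),
        ("x-model-source", source.as_str().to_string()),
    ];
    // Only report an adapter when the request actually named one.
    if let Some(id) = adapter_id {
        headers.push(("x-adapter-id", id.to_string()));
    }
    headers
}
```

Each pair could then be passed to a `headers.insert(...)` call like the one in the diff. Keeping the source in its own header, rather than encoding it into the model name, means clients that only need the model name do not have to parse a prefix.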

@ksbrar changed the title from "Add predibase headers for billing" to "Add predibase headers for billing and analytics" on Feb 2, 2024
@@ -286,6 +292,7 @@ async fn generate(
}

let details = req.0.parameters.details || req.0.parameters.decoder_input_details;
let adapter_id = req.0.parameters.adapter_id.clone();
Contributor

Note that following #212, adapter ID can come from the merged_parameters as well as the adapter_id.

Collaborator Author

good point
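
As a rough illustration of the point above, the value reported in an adapter header could be derived from whichever field the request populated. The struct layout here is hypothetical: only the names `adapter_id`, `adapter_ids`, and `merged_parameters` appear in this thread, and the comma-joined format for several adapters is an assumption.

```rust
/// Hypothetical stand-ins for the request types. Only the field names
/// `adapter_id` and `adapter_ids` appear in the diff; the rest is assumed.
#[derive(Debug, Clone, Default)]
pub struct MergedParameters {
    pub adapter_ids: Vec<String>,
}

#[derive(Debug, Clone, Default)]
pub struct Parameters {
    pub adapter_id: Option<String>,
    pub merged_parameters: Option<MergedParameters>,
}

/// Return the value to report in an adapter header, or `None` when the
/// request targets the base model only.
pub fn reported_adapter_id(params: &Parameters) -> Option<String> {
    // A single adapter named directly on the request wins.
    if let Some(id) = &params.adapter_id {
        return Some(id.clone());
    }
    // Otherwise fall back to the adapters listed in the merged parameters.
    match &params.merged_parameters {
        Some(m) if !m.adapter_ids.is_empty() => Some(m.adapter_ids.join(",")),
        _ => None,
    }
}
```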

@@ -31,6 +33,10 @@ use tracing::{info_span, instrument, Instrument};
use utoipa::OpenApi;
use utoipa_swagger_ui::SwaggerUi;

lazy_static! {
static ref MODEL_ID: Mutex<String> = Mutex::new("".to_string());
Contributor

What does this do?

Contributor

It smells like we're setting a global lock on the current model ID, which would break concurrency?

Collaborator Author

It allows you to create a global variable that we then update here with the model name. The Mutex is needed so that the variable is not read-only.

Contributor

I see, but why do we need to lock every time we read it if it's only initialized once at the beginning and then read-only? Feels like we're taking a concurrency hit for no reason.

Collaborator Author

@tgaddair we just set it once when we run the server for the first time. Maybe we can have a read-write lock. Or maybe we can do something different, but we want to communicate the value of model_id to the request handler here. Maybe we can create a partial function like in Python? idk
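
For reference, the closest Rust analogue to a Python partial function is a closure that captures the value. The sketch below is only an illustration of that idea: `make_handler` and its string-in, string-out signature are hypothetical and much simpler than the real `generate` handler.

```rust
use std::sync::Arc;

/// Build a handler with `model_id` already bound, similar in spirit to
/// `functools.partial` in Python. The handler shape is hypothetical.
pub fn make_handler(model_id: Arc<str>) -> impl Fn(&str) -> String {
    // `move` transfers ownership of the Arc into the closure, so no global
    // variable or lock is needed to read the model ID later.
    move |prompt: &str| format!("model={model_id} prompt={prompt}")
}
```

Calling `make_handler(Arc::from("my-model"))` once at startup yields a function that can be invoked per request. Whether this fits depends on how the web framework registers handlers, which this sketch does not cover.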

Collaborator Author

@tgaddair I think we would need to lock a global var but I could be wrong here

Collaborator Author

Actually, it looks like I can just use this: https://docs.rs/once_cell/latest/once_cell/

Contributor

Looks like exactly what we want! Nice find.
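
A minimal sketch of the write-once pattern settled on in this thread. It uses `std::sync::OnceLock`, the standard library counterpart of `once_cell::sync::OnceCell` (available since Rust 1.70), so that the example has no external dependencies. The helper names `init_model_id` and `model_id` are assumptions, not code from the PR.

```rust
use std::sync::OnceLock;

/// Process-wide model ID: written once at startup, then read without
/// taking a mutex on every request.
static MODEL_ID: OnceLock<String> = OnceLock::new();

/// Call once before the server starts handling requests.
/// Returns `Err` with the rejected value if the ID was already set.
pub fn init_model_id(id: String) -> Result<(), String> {
    MODEL_ID.set(id)
}

/// Read the model ID from any request handler.
/// Returns an empty string if it has not been initialized yet.
pub fn model_id() -> &'static str {
    MODEL_ID.get().map(String::as_str).unwrap_or("")
}
```

Compared with `Mutex<String>` inside `lazy_static!`, reads here never contend with each other, which addresses the concurrency concern raised above while still allowing the value to be set at runtime.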

@magdyksaleh changed the title from "Add predibase headers for billing and analytics" to "Add model and adapter headings" on Feb 2, 2024
adapter_ids: vec![adapter_id.clone().unwrap()],
..Default::default()
});
let (adapter_id, adapter_source, adapter_parameters) = extract_adapter_params(
Contributor

I think we may want to just discard adapter_id at this point, as it should be contained in the adapter parameters.

So I would imagine this function just returning adapter_source and adapter_parameters.

Collaborator Author

oh yeah nice!
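
One way the suggestion above could look, as a sketch only. The real signature of `extract_adapter_params` is not shown in this conversation, so the function below uses a different name, `normalize_adapter_params`, and a hypothetical `AdapterParameters` type that keeps only the `adapter_ids` field visible in the diff.

```rust
/// Hypothetical, reduced version of the adapter parameters type.
#[derive(Debug, Clone, Default, PartialEq)]
pub struct AdapterParameters {
    pub adapter_ids: Vec<String>,
}

/// Fold a standalone `adapter_id` into the parameters and return only the
/// source and the parameters, so callers look for adapter IDs in one place.
pub fn normalize_adapter_params(
    adapter_id: Option<String>,
    adapter_source: Option<String>,
    adapter_parameters: Option<AdapterParameters>,
) -> (Option<String>, Option<AdapterParameters>) {
    // Prefer explicit parameters; otherwise wrap the single ID, mirroring
    // the `adapter_ids: vec![...]` construction in the diff.
    let params = adapter_parameters.or_else(|| {
        adapter_id.map(|id| AdapterParameters { adapter_ids: vec![id] })
    });
    (adapter_source, params)
}
```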

@tgaddair merged commit f0dcb05 into main on Feb 5, 2024 (1 check passed)
@tgaddair deleted the add-headers branch on February 5, 2024 17:18
4 participants