Support/aws bedrock #120
Conversation
…nd added claude2 support, wip
…tion parameters, possibility to change teacher model
LLM_GENERATION_PARAMETERS = ["temperature", "top_p", "max_new_tokens"]

class LLama_Bedrock_API(Bedrock_API):
Strange capitalization.
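The reviewer is pointing at PEP 8, which names classes in CapWords with no underscores. A minimal sketch of the rename, with `Bedrock_API` stubbed purely for illustration (the real base class lives in the PR):

```python
class BedrockAPI:
    """Stub standing in for the PR's Bedrock_API base class."""


class LlamaBedrockAPI(BedrockAPI):
    """PEP 8-style replacement for the LLama_Bedrock_API name."""
```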
from tanuki.language_models.llm_configs.openai_config import OpenAIConfig
from tanuki.language_models.llm_configs.claude_config import ClaudeConfig
from tanuki.language_models.llm_configs.llama_config import LlamaBedrockConfig

DEFAULT_MODELS = {
So I suggest this goes to init.py
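The suggestion is to host the `DEFAULT_MODELS` registry in the `llm_configs` package's `__init__.py`, next to the config imports, so callers need only one package-level import. A sketch under that assumption, with `OpenAIConfig` stubbed as a dataclass since the real class is in the PR:

```python
from dataclasses import dataclass


@dataclass
class OpenAIConfig:
    """Stub for tanuki's real OpenAIConfig; fields mirror the diff."""
    model_name: str
    context_length: int


# Hypothetical tanuki/language_models/llm_configs/__init__.py contents:
DEFAULT_MODELS = {
    "gpt-4": OpenAIConfig(model_name="gpt-4", context_length=8192),
}
```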
    return input_config
if isinstance(input_config, str):
    # This is purely for backwards compatibility as we used to save the model as a string
    if type == "distillation":
"distillation" -> into a DISTILLATION constant.
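A hypothetical constants module replacing the bare `"distillation"` literal, so the comparison cannot drift through typos; the constant names and the `describe` helper are assumptions for illustration, not tanuki's actual API:

```python
# Model-role constants standing in for scattered string literals.
DISTILLATION = "distillation"
TEACHER = "teacher"


def describe(model_type: str) -> str:
    # Compare against the named constant instead of a raw string.
    if model_type == DISTILLATION:
        return "student model loaded from a string"
    return "teacher model"
```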
src/tanuki/function_modeler.py
Outdated

def _get_dataset_info(self, dataset_type, func_hash, type="length"):
    """
    Get the dataset size for a function hash
    """
    return self.data_worker.load_dataset(dataset_type, func_hash, return_type=type)

def _configure_teacher_models(self, teacher_models: list, func_hash: str):
Please include more expressive type definition.
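A bare `list` annotation says nothing about the element type. A minimal sketch of the more precise signature the review asks for, assuming teacher models arrive either as string aliases or config objects (`BaseModelConfig` is stubbed here; the storage logic is illustrative):

```python
from typing import List, Union


class BaseModelConfig:
    """Stub for tanuki's config base class."""


class FunctionModeler:
    """Minimal sketch; only the annotated signature matters here."""

    def _configure_teacher_models(
        self,
        teacher_models: List[Union[str, BaseModelConfig]],
        func_hash: str,
    ) -> None:
        # Store the resolved teacher models per function hash.
        self.teacher_models = {func_hash: teacher_models}
```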
src/tanuki/models/api_manager.py
Outdated
"""
Adds an API provider to the API manager.
"""
if provider == "openai":
Replace with constant.
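The same constant-over-literal point applies to provider names. A sketch with hypothetical constant names and a placeholder factory (the return strings stand in for the real API-client construction):

```python
# Hypothetical provider-name constants; not tanuki's actual identifiers.
OPENAI_PROVIDER = "openai"
BEDROCK_PROVIDER = "bedrock"


def make_provider(provider: str) -> str:
    if provider == OPENAI_PROVIDER:
        return "OpenAI_API instance"  # placeholder for the real client
    if provider == BEDROCK_PROVIDER:
        return "Bedrock_API instance"
    raise ValueError(f"Unknown provider: {provider}")
```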
"""
distilled_model: BaseModelConfig = DEFAULT_MODELS["gpt-3.5-finetune"]
current_model_stats: Dict = {
    "trained_on_datapoints": 0,
Constants...
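One reading of the review: lift the stats-dict keys into named constants. The key name is copied from the diff; everything else is illustrative:

```python
# Named constant for the stats-dict key used in the diff above.
TRAINED_ON_DATAPOINTS = "trained_on_datapoints"

current_model_stats = {
    TRAINED_ON_DATAPOINTS: 0,
}
```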
    "anthropic.claude-v2:1": ClaudeConfig(model_name = "anthropic.claude-v2:1", context_length = 200000),
    "llama_70b_chat_aws": LlamaBedrockConfig(model_name = "meta.llama2-70b-chat-v1", context_length = 4096),
    "llama_13b_chat_aws": LlamaBedrockConfig(model_name = "meta.llama2-13b-chat-v1", context_length = 4096),
    "ada-002": OpenAIConfig(model_name="text-embedding-ada-002", context_length=-1)
Let's add at least one Bedrock embedding model.
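One candidate is Bedrock's `amazon.titan-embed-text-v1` text-embedding model. A sketch of what the registry entry might look like; the `TitanBedrockConfig` class, the alias key, and the context length are assumptions mirroring the shape of the PR's other config entries:

```python
from dataclasses import dataclass


@dataclass
class TitanBedrockConfig:
    """Hypothetical config class mirroring LlamaBedrockConfig's shape."""
    model_name: str
    context_length: int


# Illustrative registry entry for a Bedrock embedding model.
BEDROCK_EMBEDDING_MODELS = {
    "aws_titan_embed_v1": TitanBedrockConfig(
        model_name="amazon.titan-embed-text-v1", context_length=8000
    ),
}
```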
Implements support for AWS Bedrock, making Llama models served through Bedrock available as teacher models.
Main changes