Support/aws bedrock #120

Merged 34 commits into master on Jan 16, 2024

Conversation

@MartBakler (Collaborator) commented Dec 12, 2023

Implements support for Bedrock, and for LLaMA models served through Bedrock, as teacher models.

Main changes:

  • Create a function config class to track function configurations (previously a dict)
  • Create a model config object to describe models with different execution backends (OpenAI vs. Bedrock), different model names, and different prompting options
  • Create Bedrock and LLamaBedrock APIs to execute Bedrock model calls
  • Add the option for users to configure teacher models for patched functions (see the sketch below)
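
A minimal sketch of what the new teacher-model option might look like from the caller's side. The model aliases come from this PR's DEFAULT_MODELS, but the exact @tanuki.patch keyword usage is my reading of the change, not a quoted example.

    # Hedged usage sketch -- teacher_models as a patch option is assumed
    # from the PR description, not copied from the diff.
    import tanuki

    @tanuki.patch(teacher_models=["llama_70b_chat_aws", "anthropic.claude-v2:1"])
    def summarise(text: str) -> str:
        """Summarise the input text in one sentence."""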

@MartBakler marked this pull request as ready for review on December 15, 2023, 15:00
LLM_GENERATION_PARAMETERS = ["temperature", "top_p", "max_new_tokens"]


class LLama_Bedrock_API(Bedrock_API):

Contributor: Strange capitalization. Consider the PEP 8 spelling LlamaBedrockAPI.
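
A sketch of the rename the comment points at (CapWords without underscores; applying the same spelling to the BedrockAPI base class is my extension of the suggestion):

    # suggested spelling only -- a sketch, not taken from the diff
    class LlamaBedrockAPI(BedrockAPI):
        ...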

from tanuki.language_models.llm_configs.openai_config import OpenAIConfig
from tanuki.language_models.llm_configs.claude_config import ClaudeConfig
from tanuki.language_models.llm_configs.llama_config import LlamaBedrockConfig
DEFAULT_MODELS = {

Contributor: I suggest this DEFAULT_MODELS mapping goes into __init__.py.
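
A sketch of the suggested move, assuming the package layout implied by the imports above:

    # tanuki/language_models/llm_configs/__init__.py (suggested location)
    from tanuki.language_models.llm_configs.claude_config import ClaudeConfig

    DEFAULT_MODELS = {
        "anthropic.claude-v2:1": ClaudeConfig(model_name="anthropic.claude-v2:1", context_length=200000),
        # ... remaining defaults elided
    }

    # callers would then import it from the package root:
    # from tanuki.language_models.llm_configs import DEFAULT_MODELS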

    return input_config
if isinstance(input_config, str):
    # This is purely for backwards compatibility as we used to save the model as a string
    if type == "distillation":

Contributor: Move the "distillation" literal into a DISTILLATION constant.


def _get_dataset_info(self, dataset_type, func_hash, type="length"):
    """
    Get the dataset size for a function hash
    """
    return self.data_worker.load_dataset(dataset_type, func_hash, return_type=type)

def _configure_teacher_models(self, teacher_models: list, func_hash: str):

Contributor: Please use a more expressive type annotation for teacher_models.
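
A sketch of a more expressive signature; accepting both string aliases and config objects is my assumption, based on the BaseModelConfig class this PR introduces:

    from typing import List, Union

    def _configure_teacher_models(self,
                                  teacher_models: List[Union[str, BaseModelConfig]],
                                  func_hash: str) -> None:
        ...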

"""
Adds an API provider to the API manager.
"""
if provider == "openai":

Contributor: Replace the "openai" literal with a constant.
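
One way to do this (an Enum is my suggestion, not something in the PR; the non-OpenAI member names are assumed):

    from enum import Enum

    class Provider(str, Enum):
        OPENAI = "openai"
        BEDROCK = "bedrock"              # assumed name
        LLAMA_BEDROCK = "llama_bedrock"  # assumed name

    # str-valued members compare equal to their raw strings,
    # so existing string comparisons keep working:
    if provider == Provider.OPENAI:
        ...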

"""
distilled_model: BaseModelConfig = DEFAULT_MODELS["gpt-3.5-finetune"]
current_model_stats : Dict = {
"trained_on_datapoints": 0,

Contributor: Constants for these string keys as well, please.
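
A hedged alternative to the stringly-typed stats dict; the dataclass is my suggestion, not something proposed in the PR:

    from dataclasses import dataclass

    @dataclass
    class ModelStats:
        trained_on_datapoints: int = 0

    current_model_stats = ModelStats()  # replaces the literal-keyed dict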

"anthropic.claude-v2:1": ClaudeConfig(model_name = "anthropic.claude-v2:1", context_length = 200000),
"llama_70b_chat_aws": LlamaBedrockConfig(model_name = "meta.llama2-70b-chat-v1", context_length = 4096),
"llama_13b_chat_aws": LlamaBedrockConfig(model_name = "meta.llama2-13b-chat-v1", context_length = 4096),
"ada-002": OpenAIConfig(model_name="text-embedding-ada-002", context_length=-1)

Contributor: Let's add at least one Bedrock embedding model.
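
A sketch of one such entry. amazon.titan-embed-text-v1 is a real Bedrock embedding model ID, but the TitanBedrockConfig class, the alias, and the context length here are all assumptions:

    # hypothetical entry -- config class and context length assumed
    "aws_titan_embed_v1": TitanBedrockConfig(model_name="amazon.titan-embed-text-v1", context_length=8192),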

@MartBakler merged commit ca36726 into master on Jan 16, 2024