Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[llm] Replace model_name with *required* base_model, add preset LLM registry, update internal adapter modules #3423

Merged
merged 48 commits into from
Jun 29, 2023

Conversation

tgaddair
Copy link
Collaborator

@tgaddair tgaddair commented May 25, 2023

Summary

The new base_model parameter is also just a string, backed by a anyOf schema that either validates whether the provided string is a Ludwig-supported preset through JSON or an otherwise valid model in the Huggingface repo at runtime (i.e. via validation checks).

It is the first required parameter in Ludwig, outside of input and output features, and it uses a bespoke schema tweak to enforce that. It is also the first parameter to use an anyOf implementation; this is necessary because oneOf validation requires a value to match exactly one option schema (a preset LLM name would partially match both schemas and cause an error).

If the user does choose a preset LLM then after JSON validation but before config object initialization this value will be swapped for the full slash-delimited path defined in the MODEL_PRESETS dictionary.

Usage

base_model: llama-7b

is valid and can be verified by JSON or by the validation checks. By the end of config object initialization it has been replaced by huggyllama/llama-7b (which is defined in MODEL_PRESETS).

base_model: bigscience/bloom-3b

is also valid but can only be verified during config-object initialization by the extra validation checks.

and

base_model:  <anything other than a string>

will cause an immediate validation failure.

Necessary follow-up PRs, prompted by changes here

  • Looks like the required field is missing for model-level parameters (e.g. input_features). Probably an accidental regression. - Fix: int: Add back required for input and output features to the Ludwig JSON schema #3442
  • Add official support for required parameters using marshmallow's in-built options
  • Add support for arbitrary selector or "controller" fields (other than type).
  • Investigate whether the extra metadata dicts inserted into marshmallow fields are necessary anymore

@tgaddair tgaddair requested a review from arnavgarg1 May 25, 2023 05:05
@github-actions
Copy link

github-actions bot commented May 25, 2023

Unit Test Results

  6 files  ±0    6 suites  ±0   1h 10m 6s ⏱️ - 6m 17s
33 tests ±0  29 ✔️ ±0    4 💤 ±0  0 ±0 
99 runs  ±0  87 ✔️ ±0  12 💤 ±0  0 ±0 

Results for commit 0eaa15f. ± Comparison against base commit f905b32.

♻️ This comment has been updated with latest results.

Copy link
Contributor

@arnavgarg1 arnavgarg1 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ksbrar ksbrar requested a review from arnavgarg1 June 23, 2023 17:18
ludwig/config_validation/validation.py Outdated Show resolved Hide resolved
ludwig/schema/llms/base_model.py Outdated Show resolved Hide resolved
ludwig/schema/metadata/configs/llm.yaml Show resolved Hide resolved
ludwig/schema/model_types/llm.py Outdated Show resolved Hide resolved
ludwig/schema/model_types/llm.py Outdated Show resolved Hide resolved
ludwig/utils/backward_compatibility.py Outdated Show resolved Hide resolved
tests/ludwig/schema/test_model_config.py Outdated Show resolved Hide resolved
@ksbrar ksbrar requested a review from justinxzhao June 27, 2023 18:30
@ksbrar ksbrar merged commit 183c3ef into master Jun 29, 2023
16 checks passed
@ksbrar ksbrar deleted the llm-model-class branch June 29, 2023 00:25
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants