

Added prompt_template preprocessing param for text features #3298

Merged 6 commits into master on Mar 30, 2023

Conversation

tgaddair (Collaborator):
This parameter is useful when fine-tuning large language models (LLMs), which benefit from additional context that improves the quality of the embeddings they generate. It is particularly valuable when the encoder weights are kept fixed (trainable=false).
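As a rough illustration of the idea (not the exact Ludwig API from this PR), a `prompt_template` in a text feature's preprocessing section would wrap each raw value in extra context before tokenization. The config layout, the `{input}` placeholder name, and the helper function below are all assumptions for demonstration purposes:

```python
# Hypothetical sketch of a config carrying a prompt_template preprocessing
# param on a text input feature. Exact schema and placeholder syntax are
# assumptions, not taken from the PR diff.
config = {
    "input_features": [
        {
            "name": "review",
            "type": "text",
            # Fixed encoder weights: the scenario the PR description calls out.
            "encoder": {"type": "auto_transformer", "trainable": False},
            "preprocessing": {
                # Template applied to each raw value before tokenization.
                "prompt_template": "Classify the sentiment of this review: {input}",
            },
        }
    ],
    "output_features": [{"name": "sentiment", "type": "category"}],
}


def apply_prompt_template(template: str, raw_text: str) -> str:
    """Substitute the raw feature value into the template (illustrative only)."""
    return template.format(input=raw_text)


template = config["input_features"][0]["preprocessing"]["prompt_template"]
print(apply_prompt_template(template, "Great movie, would watch again."))
# Prints: Classify the sentiment of this review: Great movie, would watch again.
```

With a frozen encoder, the template is the only lever for steering the embeddings, which is why the PR highlights the trainable=false case.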

github-actions bot commented Mar 29, 2023

Unit Test Results

6 files (±0), 6 suites (±0); runtime 31m 8s (−1h 21m 22s)
64 tests (−87): 61 passed (−77), 3 skipped (−10), 0 failed (±0)
104 runs (−87): 93 passed (−77), 11 skipped (−10), 0 failed (±0)

Results for commit 491ab56. ± Comparison against base commit 87a56fa.

This pull request removes 139 tests and adds 52. Note that renamed tests count towards both.
tests.integration_tests.test_automl ‑ test_auto_train
tests.integration_tests.test_automl ‑ test_autoconfig_preprocessing_balanced
tests.integration_tests.test_automl ‑ test_autoconfig_preprocessing_imbalanced
tests.integration_tests.test_automl ‑ test_autoconfig_preprocessing_text_image
tests.integration_tests.test_automl ‑ test_create_auto_config[image]
tests.integration_tests.test_automl ‑ test_create_auto_config[multimodal]
tests.integration_tests.test_automl ‑ test_create_auto_config[tabular_large]
tests.integration_tests.test_automl ‑ test_create_auto_config[tabular_small]
tests.integration_tests.test_automl ‑ test_create_auto_config[text]
tests.integration_tests.test_automl ‑ test_create_auto_config_with_dataset_profile
…
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_ray_model_training_with_augmentation_pipeline[preprocessing0-False]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_ray_model_training_with_augmentation_pipeline[preprocessing0-True]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_ray_model_training_with_augmentation_pipeline[preprocessing0-augmentation_pipeline_ops2]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_ray_model_training_with_augmentation_pipeline[preprocessing1-False]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_ray_model_training_with_augmentation_pipeline[preprocessing1-True]
tests.ludwig.augmentation.test_augmentation_pipeline ‑ test_ray_model_training_with_augmentation_pipeline[preprocessing1-augmentation_pipeline_ops2]
tests.ludwig.automl.test_base_config ‑ test_dataset_info[dask]
tests.ludwig.automl.test_base_config ‑ test_dataset_info[pandas]
tests.ludwig.automl.test_base_config ‑ test_infer_parquet_types
tests.ludwig.automl.test_base_config ‑ test_is_field_boolean[dask]
…
This pull request removes 11 skipped tests and adds 1 skipped test. Note that renamed tests count towards both.
tests.integration_tests.test_class_imbalance_feature ‑ test_imbalance_ray[oversample_minority]
tests.integration_tests.test_class_imbalance_feature ‑ test_imbalance_ray[undersample_majority]
tests.integration_tests.test_horovod ‑ test_horovod_gpu_memory_limit
tests.integration_tests.test_hyperopt_ray_horovod ‑ test_hyperopt_executor_bohb
tests.integration_tests.test_hyperopt_ray_horovod ‑ test_hyperopt_executor_with_metric
tests.integration_tests.test_hyperopt_ray_horovod ‑ test_hyperopt_run_hyperopt
tests.integration_tests.test_ray ‑ test_ray_image_modin
tests.integration_tests.test_ray ‑ test_ray_set_and_vector_outputs[csv]
tests.integration_tests.test_ray ‑ test_ray_set_and_vector_outputs[parquet]
tests.integration_tests.test_ray ‑ test_ray_split
…
tests.ludwig.models.test_training_determinism ‑ test_training_determinism_ray_backend


inverse_vocabulary=metadata[f"{prefix}str2idx"],
tokenizer_type=preprocessing_parameters[f"{prefix}tokenizer"],
Collaborator:
Would you also be able to remove the rest of the backwards compatibility workaround (introduced in #1859)?

tgaddair (Collaborator, Author):
I'm not too familiar with this code tbh, maybe you can take it in a follow-up?

@tgaddair tgaddair merged commit 5e4ceab into master Mar 30, 2023
@tgaddair tgaddair deleted the prompt-templ branch March 30, 2023 16:08