Add MLCube integration #1
* test commit
* Delete unnecessary files. Add utils and constants for supporting functions in eval
* Add core supporting functions for model training and scoring
* Add main functionality to eval, and supporting utils functions. Update requirements
* Add folder structure. Add random training file and its results for testing setup. Minor fix to constant and setup file
* Add gitignore to ignore everything except test file. Delete selection folder since it is not necessary
* Add gitignore file to ignore all files in train sets except random_500.csv
* Simplify output readout to avoid bug
* Updated file and methods to match with previous design pattern
* Add data file as input in main function and yaml file so all paths in yaml are relative
* Add docker-compose file, and modify dockerfile, requirements and main accordingly
* Fixed type hinting as suggested in PR review
@davidjurado How would someone specify the […]? Also, is there a way to point to a file outside of […]?
Hello @colbybanbury, I'm sorry for the late reply; I didn't get a notification of your comment.

To specify a different workspace folder, you can use:

    mlcube run --task=select --workspace=path/to/new_folder

To point to a file outside the workspace folder, you need a parameter for the task you want to run. This is defined in the select:
    # Run selection algorithm
    parameters:
      inputs:
        {
          allowed_training_set: { type: file, default: data/preliminary_evaluation_dataset/allowed_training_set.yaml },
          train_embeddings_dir: data/preliminary_evaluation_dataset/train_embeddings/,
        }
      outputs: { outdir: select_output/ }

Now let's say we want to use a different allowed_training_set. We need to specify the name of the parameter to override and provide the absolute path of the new file:

    mlcube run --task=select allowed_training_set=/Users/me/allowed_training_set.yaml
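For anyone scripting these runs, the two override forms above can be assembled programmatically. The following is only a minimal sketch: `build_mlcube_command` is a hypothetical helper (not part of MLCube), and it assumes nothing beyond the `--task=`, `--workspace=` and `name=path` forms shown in this thread.

```python
import os
import shlex


def build_mlcube_command(task, workspace=None, overrides=None):
    """Assemble an `mlcube run` argument list (hypothetical helper, not an MLCube API)."""
    args = ["mlcube", "run", f"--task={task}"]
    if workspace is not None:
        # Use a different workspace folder, as in the first command above.
        args.append(f"--workspace={workspace}")
    for name, path in (overrides or {}).items():
        # Parameter overrides are given as absolute paths, as described above.
        args.append(f"{name}={os.path.abspath(path)}")
    return args


# Override the allowed_training_set input of the select task.
cmd = build_mlcube_command(
    "select",
    overrides={"allowed_training_set": "/Users/me/allowed_training_set.yaml"},
)
print(shlex.join(cmd))
```

The resulting list can be passed to `subprocess.run(cmd, check=True)` on a machine where `mlcube` is installed; the sketch itself only builds and prints the command.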
DataPerf Speech Example - MLCube integration
Project setup
Project structure
Tasks execution
Execute complete pipeline
# Run all steps
mlcube run --task=download,select,evaluate -Pdocker.build_strategy=always
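When debugging a single stage, it can help to issue the same steps one at a time. The sketch below generates one command per task in pipeline order. `per_task_commands` is a hypothetical helper, and it assumes that running the tasks individually in this order is equivalent to the comma-separated form above, which this thread does not confirm.

```python
import shlex

# Task names taken from the "Run all steps" command above.
PIPELINE_TASKS = ["download", "select", "evaluate"]


def per_task_commands(tasks, extra_args=()):
    """Return one `mlcube run` argument list per task, in order (hypothetical helper)."""
    return [["mlcube", "run", f"--task={task}", *extra_args] for task in tasks]


# Print the commands instead of executing them, so the sketch runs without mlcube installed.
for args in per_task_commands(PIPELINE_TASKS, ["-Pdocker.build_strategy=always"]):
    print(shlex.join(args))
```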