Issues: kubeflow/training-operator
Issues list
fix(compatability): match-case syntax only compatible with Python3.10
#2096
opened May 2, 2024 by
PantherHawk
chore(style): provide type for STORAGE_INITIALIZER_VOLUME constant
#2093
opened May 2, 2024 by
PantherHawk
Add DeepSpeed Example with MPI Operator
area/example
good first issue
help wanted
#2091
opened Apr 29, 2024 by
andreyvelich
Flaky Test: [It] should create desired Pods and Services: Distributed TFJob (4 workers, 2 PS) is succeeded
#2086
opened Apr 27, 2024 by
tenzen-y
Not getting Kubeflow Training SDK v1.7 when installing kubeflow-training
#2082
opened Apr 24, 2024 by
JamesKunstle
Update pytorch launcher component in Kubeflow Pipelines repository
good first issue
help wanted
kind/feature
#2068
opened Apr 17, 2024 by
anishasthana
Support CertManager for the Webhook cert generation
kind/feature
#2049
opened Apr 10, 2024 by
tenzen-y
PytorchJob restartPolicy: ExitCode does not honor backoffLimit for retryable errors
kind/feature
#2045
opened Apr 5, 2024 by
kellyaa
Add more AI/ML Training Examples
area/example
good first issue
help wanted
#2040
opened Mar 29, 2024 by
andreyvelich
3 of 7 tasks
[SDK] Use HuggingFace Data Collator for more Transformers in LLM Trainer
area/sdk
#2032
opened Mar 15, 2024 by
andreyvelich
Adapt TFJob examples to TensorFlow v2
good first issue
help wanted
#2015
opened Mar 8, 2024 by
tenzen-y
2 tasks
Add workflows to verify if examples are valid
good first issue
help wanted
#2014
opened Mar 8, 2024 by
tenzen-y
How to restart a large-scale training job using OnFailure restart policy
#2000
opened Jan 30, 2024 by
hfwen0502