issues Search Results · repo:kubeflow/trainer language:Python

1k results

What you would like to be added? Add a GitHub Actions workflow for: - Publishing Helm charts to GHCR, so users can install the trainer Helm chart by specifying the OCI image repository and version: ...
kind/feature
lifecycle/needs-triage
  • ChenYi015
  • Opened 2 days ago
  • #2488

What happened? During testing, I noticed the following error when calling the list_runtimes() API: File /Users/avelichk/go/src/github.com/kubeflow/trainer/sdk/kubeflow/trainer/models/trainer_v1alpha1_ml_policy.py ...
area/sdk
kind/bug
  • andreyvelich
  • 1
  • Opened 3 days ago
  • #2485

In https://github.com/kubeflow/trainer/blob/master/CONTRIBUTING.md, paths like the one in: kubectl apply --server-side -k github.com/kubeflow/training-operator/manifests/overlays/standalone do not exist
  • Okabe-Rintarou-0
  • 3
  • Opened 3 days ago
  • #2480

What happened? As we found in our E2E tests, the ClusterTrainingRuntime factory randomly fails with the following error: { "level": "error", "ts": "2025-03-05T19:26:42.084342655Z", "caller": "runtime/signal_unix.go:917" ...
kind/bug
  • tenzen-y
  • 1
  • Opened 4 days ago
  • #2477

What you would like to be added? As I mentioned in https://github.com/kubeflow/trainer/blob/3ec8f0705f515269b5ab8744c20b9d085f50d1ce/pkg/runtime/framework/core/framework_test.go#L51-L53, it would be better ...
area/controller
kind/feature
  • tenzen-y
  • 2
  • Opened 6 days ago
  • #2468

What you would like to be added? Since we updated JobSet to v0.8.0, we should refactor the controller code to support the DependsOn API for the Initializer Job and MPI orchestration. /assign @andreyvelich ...
area/controller
kind/feature
  • andreyvelich
  • Opened 6 days ago
  • #2467

What you would like to be added? We should explore the uv project manager for the Kubeflow Python SDK. It is faster than other tools, and many Python libraries have started adopting it. In particular, ...
area/sdk
kind/discussion
kind/feature
  • andreyvelich
  • 2
  • Opened 9 days ago
  • #2462

What happened? Flaky Integration Test: TestDatasetIntegration.test_dataset_download[HuggingFace - Public dataset-huggingface-test_case0] - https://github.com/kubeflow/trainer/actions/runs/13595830014/job/38012401152 ...
area/testing
good first issue
help wanted
kind/bug
  • tenzen-y
  • 7
  • Opened 9 days ago
  • #2460

What you would like to be added? It would be great to reconsider the TrainJob Created condition. The tentative alternative candidates are Initialized and ComponentsCreated, as we discussed in https://github.com/kubeflow/trainer/pull/2439#discussion_r1959297527. ...
kind/feature
  • tenzen-y
  • Opened 9 days ago
  • #2459

What you would like to be added? It would be great to add Kubeflow Trainer Pipeline Framework documentation to https://www.kubeflow.org/docs/components/trainer/operator-guides/ Why is this needed? We ...
area/docs
good first issue
help wanted
kind/documentation
  • tenzen-y
  • 7
  • Opened 9 days ago
  • #2458