# LMEvalJob Custom Resource Generation

This notebook demonstrates how to create LMEvalJob Custom Resources (CRs) using the TrustyAI SDK in a Pythonic way. We'll cover:

1. **Simple LMEvalJob creation** for basic model evaluation
2. **Builder pattern** for more control and flexibility
3. **Custom task cards** with preprocessing steps
4. **Complex LLM-as-a-Judge** configurations
5. **Generic utilities** for building custom evaluations
6. **Deployment-ready YAML** and job submission through the Kubernetes client

## Import Required Modules

In [11]:
import json

from IPython.display import Markdown, display

# Core data models
from trustyai.core.lmevaljob import (
    Loader,
    TaskCard,
)

# Kubernetes client for TrustyAI resources
from trustyai.core.trustyai_kubernetes_client import TrustyAIKubernetesClient

# Builder and utility functions
from trustyai.providers.eval.utils import (
    LMEvalJobBuilder,
    create_copy_step,
    create_filter_by_condition_step,
    create_literal_eval_step,
    create_llm_as_judge_metric,
    create_mt_bench_template,
    create_rating_task,
    create_rename_splits_step,
    create_rename_step,
    create_task_card_json,
)

# Convenience imports (same builder, exposed by the eval package)
# from trustyai.providers.eval import LMEvalJobBuilder, create_simple_lmeval_job


## 1. Simple LMEvalJob Creation

The easiest way to create an LMEvalJob is the `LMEvalJobBuilder.simple()` static method, which replaces the deprecated `create_simple_lmeval_job()` helper function.

In [12]:
# Create a simple evaluation job using the new static method
simple_job = LMEvalJobBuilder.simple(
    name="simple-evaluation",
    model_name="google/flan-t5-small",
    tasks=["hellaswag", "arc_easy", "boolq"],
    namespace="test",
    limit=1
)

print("✅ Created simple LMEvalJob:")
print(f"   Name: {simple_job.metadata.name}")
print(f"   Namespace: {simple_job.metadata.namespace}")
print(f"   Model: {simple_job.spec.modelArgs[0].value}")
print(f"   Tasks: {simple_job.spec.taskList.taskNames}")
print()

# Create another simple job with a larger limit in a different namespace
simple_job_with_limit = LMEvalJobBuilder.simple(
    name="simple-evaluation-limited",
    model_name="google/flan-t5-small",
    tasks=["hellaswag", "arc_easy"],
    namespace="trustyai",
    limit=100
)

print("✅ Created simple LMEvalJob with limit:")
print(f"   Name: {simple_job_with_limit.metadata.name}")
print(f"   Model: {simple_job_with_limit.spec.modelArgs[0].value}")
print(f"   Tasks: {simple_job_with_limit.spec.taskList.taskNames}")
print(f"   Limit: {simple_job_with_limit.spec.limit}")
print()

# Display the YAML output
display(Markdown(f"```yaml\n{simple_job.to_yaml()}\n```"))

✅ Created simple LMEvalJob:
   Name: simple-evaluation
   Namespace: test
   Model: google/flan-t5-small
   Tasks: ['hellaswag', 'arc_easy', 'boolq']

✅ Created simple LMEvalJob with limit:
   Name: simple-evaluation-limited
   Model: google/flan-t5-small
   Tasks: ['hellaswag', 'arc_easy']
   Limit: 100



```yaml
apiVersion: trustyai.opendatahub.io/v1alpha1
kind: LMEvalJob
metadata:
  name: simple-evaluation
  namespace: test
spec:
  model: hf
  modelArgs:
  - name: pretrained
    value: google/flan-t5-small
  taskList:
    taskNames:
    - hellaswag
    - arc_easy
    - boolq
  logSamples: true
  allowOnline: true
  allowCodeExecution: true
  limit: '1'

```
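
For readers without the SDK installed, the manifest above can be approximated with plain dictionaries. The following is a minimal sketch, not the SDK's implementation: `simple_lmeval_manifest` is a hypothetical helper, and its defaults are simply copied from the YAML shown above (including the `limit` being serialized as a string).

```python
import json


def simple_lmeval_manifest(name, model_name, tasks, namespace, limit=None):
    """Hypothetical helper: build an LMEvalJob manifest as a plain dict."""
    spec = {
        "model": "hf",
        "modelArgs": [{"name": "pretrained", "value": model_name}],
        "taskList": {"taskNames": list(tasks)},
        # Defaults copied from the generated YAML above.
        "logSamples": True,
        "allowOnline": True,
        "allowCodeExecution": True,
    }
    if limit is not None:
        # The generated YAML shows the limit as a quoted string ('1').
        spec["limit"] = str(limit)
    return {
        "apiVersion": "trustyai.opendatahub.io/v1alpha1",
        "kind": "LMEvalJob",
        "metadata": {"name": name, "namespace": namespace},
        "spec": spec,
    }


manifest = simple_lmeval_manifest(
    name="simple-evaluation",
    model_name="google/flan-t5-small",
    tasks=["hellaswag", "arc_easy", "boolq"],
    namespace="test",
    limit=1,
)
print(json.dumps(manifest, indent=2))
```

Comparing this dictionary with `simple_job.to_yaml()` is a quick way to see which fields the builder fills in by default.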

## 2. Builder Pattern for Advanced Configuration

For more control over the job configuration, use the `LMEvalJobBuilder` class with its fluent API.

In [13]:
# Build a more advanced job using the builder pattern
advanced_job = (LMEvalJobBuilder("advanced-eval")
                .namespace("my-namespace")
                .model("hf")
                .pretrained_model("microsoft/DialoGPT-medium")
                .task_names(["piqa", "winogrande"])
                .limit(50)
                .log_samples(False)
                .allow_online(True)
                .allow_code_execution(False)
                .hf_token("hf_your_token_here")
                .env_var("CUDA_VISIBLE_DEVICES", "0,1")
                .env_var("TRANSFORMERS_CACHE", "/tmp/cache")
                .build())

print("✅ Created advanced LMEvalJob with builder pattern:")
print(f"   Name: {advanced_job.metadata.name}")
print(f"   Model: {advanced_job.spec.modelArgs[0].value}")
print(f"   Limit: {advanced_job.spec.limit}")
print(f"   Log samples: {advanced_job.spec.logSamples}")
print(f"   Environment variables: {len(advanced_job.spec.pod.container.env)}")
print()

# Display as JSON for variety
display(Markdown(f"```json\n{advanced_job.to_json()}\n```"))

✅ Created advanced LMEvalJob with builder pattern:
   Name: advanced-eval
   Model: microsoft/DialoGPT-medium
   Limit: 50
   Log samples: False
   Environment variables: 3



```json
{
  "apiVersion": "trustyai.opendatahub.io/v1alpha1",
  "kind": "LMEvalJob",
  "metadata": {
    "name": "advanced-eval",
    "namespace": "my-namespace"
  },
  "spec": {
    "model": "hf",
    "modelArgs": [
      {
        "name": "pretrained",
        "value": "microsoft/DialoGPT-medium"
      }
    ],
    "taskList": {
      "taskNames": [
        "piqa",
        "winogrande"
      ]
    },
    "logSamples": false,
    "allowOnline": true,
    "allowCodeExecution": false,
    "limit": "50",
    "pod": {
      "container": {
        "env": [
          {
            "name": "HF_TOKEN",
            "value": "hf_your_token_here"
          },
          {
            "name": "CUDA_VISIBLE_DEVICES",
            "value": "0,1"
          },
          {
            "name": "TRANSFORMERS_CACHE",
            "value": "/tmp/cache"
          }
        ]
      }
    }
  }
}
```
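
The fluent API works because every configuration method returns the builder itself, so calls can be chained until `build()` produces the final object. The sketch below illustrates the idea with a hypothetical `MiniJobBuilder`; it is not the SDK class and covers only a few fields.

```python
class MiniJobBuilder:
    """Hypothetical stand-in that illustrates method chaining; not the SDK class."""

    def __init__(self, name):
        self._name = name
        self._namespace = None
        self._limit = None
        self._env = []

    def namespace(self, value):
        self._namespace = value
        return self  # returning self is what enables chaining

    def limit(self, value):
        self._limit = str(value)  # mirrors the string limit seen in the output above
        return self

    def env_var(self, name, value):
        self._env.append({"name": name, "value": value})
        return self

    def build(self):
        metadata = {"name": self._name}
        if self._namespace is not None:
            metadata["namespace"] = self._namespace
        spec = {}
        if self._limit is not None:
            spec["limit"] = self._limit
        if self._env:
            spec["pod"] = {"container": {"env": list(self._env)}}
        return {"metadata": metadata, "spec": spec}


job = (MiniJobBuilder("advanced-eval")
       .namespace("my-namespace")
       .limit(50)
       .env_var("CUDA_VISIBLE_DEVICES", "0,1")
       .build())
print(job)
```

Optional sections such as `pod` appear only when something was configured, which matches how the simple job in section 1 has no `pod` block at all.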

## 3. Custom Task Cards with Preprocessing Steps

Create custom task cards using dataclasses and generic preprocessing step builders.

In [14]:
# Create a custom loader
custom_loader = Loader(
    __type__="load_hf",
    path="my-organization/custom-dataset",
    split="validation"
)

# Create preprocessing steps using helper functions
preprocess_steps = [
    create_rename_splits_step({"validation": "test"}),
    create_filter_by_condition_step({"quality": "high"}, "eq"),
    create_rename_step({"input_text": "question", "target": "answer"}),
    create_literal_eval_step("metadata"),
    create_copy_step("metadata/category", "category"),
]

# Create the task card with dataclasses
custom_task_card = TaskCard(
    __type__="task_card",
    loader=custom_loader,
    preprocess_steps=preprocess_steps,
    task="custom.evaluation.task",
    templates=["custom.template"]
)

# Convert to JSON string
card_json = create_task_card_json(custom_task_card)

print("✅ Created custom TaskCard:")
print(f"   Loader path: {custom_loader.path}")
print(f"   Split: {custom_loader.split}")
print(f"   Preprocessing steps: {len(preprocess_steps)}")
print(f"   Task: {custom_task_card.task}")
print()

# Display the JSON structure
card_data = json.loads(card_json)
display(Markdown(f"```json\n{json.dumps(card_data, indent=2)}\n```"))

✅ Created custom TaskCard:
   Loader path: my-organization/custom-dataset
   Split: validation
   Preprocessing steps: 5
   Task: custom.evaluation.task



```json
{
  "__type__": "task_card",
  "loader": {
    "__type__": "load_hf",
    "path": "my-organization/custom-dataset",
    "split": "validation"
  },
  "preprocess_steps": [
    {
      "__type__": "rename_splits",
      "mapper": {
        "validation": "test"
      }
    },
    {
      "__type__": "filter_by_condition",
      "values": {
        "quality": "high"
      },
      "condition": "eq"
    },
    {
      "__type__": "rename",
      "field_to_field": {
        "input_text": "question",
        "target": "answer"
      }
    },
    {
      "__type__": "literal_eval",
      "field": "metadata"
    },
    {
      "__type__": "copy",
      "field": "metadata/category",
      "to_field": "category"
    }
  ],
  "task": "custom.evaluation.task",
  "templates": [
    "custom.template"
  ]
}
```
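
To build intuition for what the instance-level steps in this card do, here is a toy interpreter for four of them. It is an approximation written for this notebook, under the assumption that the step names mean what they suggest; the real semantics are defined by the library that interprets the card, and `apply_steps` is a hypothetical function. `rename_splits` is omitted because it acts on whole splits rather than on individual instances.

```python
import ast


def apply_steps(instances, steps):
    """Apply step dicts to a list of instance dicts (toy semantics only)."""
    instances = [dict(inst) for inst in instances]  # do not mutate the caller's data
    for step in steps:
        kind = step["__type__"]
        if kind == "filter_by_condition" and step["condition"] == "eq":
            instances = [
                inst for inst in instances
                if all(inst.get(k) == v for k, v in step["values"].items())
            ]
        elif kind == "rename":
            mapping = step["field_to_field"]
            instances = [
                {mapping.get(k, k): v for k, v in inst.items()} for inst in instances
            ]
        elif kind == "literal_eval":
            for inst in instances:
                inst[step["field"]] = ast.literal_eval(inst[step["field"]])
        elif kind == "copy":
            for inst in instances:
                value = inst
                for part in step["field"].split("/"):
                    # Numeric path parts index into lists, other parts into dicts.
                    value = value[int(part)] if isinstance(value, list) else value[part]
                inst[step["to_field"]] = value
        else:
            raise ValueError(f"unsupported step: {kind}")
    return instances


steps = [
    {"__type__": "filter_by_condition", "values": {"quality": "high"}, "condition": "eq"},
    {"__type__": "rename", "field_to_field": {"input_text": "question", "target": "answer"}},
    {"__type__": "literal_eval", "field": "metadata"},
    {"__type__": "copy", "field": "metadata/category", "to_field": "category"},
]
rows = [
    {"input_text": "2+2?", "target": "4", "quality": "high", "metadata": "{'category': 'math'}"},
    {"input_text": "??", "target": "", "quality": "low", "metadata": "{'category': 'noise'}"},
]
result = apply_steps(rows, steps)
print(result)
```

Only the high-quality row survives, its fields are renamed, and the `category` value is lifted out of the parsed `metadata` dictionary.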

In [15]:
# Use the custom task card in an LMEvalJob
custom_card_job = (LMEvalJobBuilder("custom-card-eval")
                   .namespace("custom-ns")
                   .pretrained_model("my-custom-model")
                   .custom_card(
                       card_json=card_json,
                       template_ref="custom.template",
                       format_str="custom.format",
                       metrics=["custom_metric"]
                   )
                   .build())

print("✅ Created LMEvalJob with custom task card:")
print(f"   Job name: {custom_card_job.metadata.name}")
print(f"   Has custom card: {custom_card_job.spec.taskList.taskRecipes is not None}")
print(f"   Card contains: {len(json.loads(custom_card_job.spec.taskList.taskRecipes[0].card.custom))} fields")

✅ Created LMEvalJob with custom task card:
   Job name: custom-card-eval
   Has custom card: True
   Card contains: 5 fields


## 4. MT-Bench Style LLM-as-a-Judge Configuration

Create a complex LLM-as-a-Judge setup similar to the MT-Bench evaluation, demonstrating custom templates, tasks, and metrics.

In [16]:
# Helper function to create MT-Bench style task card (like in our tests)
def create_mt_bench_task_card(dataset_path: str = "OfirArviv/mt_bench_single_score_gpt4_judgement") -> str:
    """Create MT-Bench specific task card configuration."""
    loader = Loader(__type__="load_hf", path=dataset_path, split="train")

    preprocess_steps = [
        create_rename_splits_step({"train": "test"}),
        create_filter_by_condition_step({"turn": 1}, "eq"),
        create_filter_by_condition_step({"reference": "[]"}, "eq"),
        create_rename_step({
            "model_input": "question",
            "score": "rating",
            "category": "group",
            "model_output": "answer"
        }),
        create_literal_eval_step("question"),
        create_copy_step("question/0", "question"),
        create_literal_eval_step("answer"),
        create_copy_step("answer/0", "answer"),
    ]

    task_card = TaskCard(
        __type__="task_card",
        loader=loader,
        preprocess_steps=preprocess_steps,
        task="tasks.response_assessment.rating.single_turn",
        templates=["templates.response_assessment.rating.mt_bench_single_turn"]
    )

    return create_task_card_json(task_card)

# Create the MT-Bench task card
mt_bench_card_json = create_mt_bench_task_card()

# Create custom definitions for LLM-as-a-Judge
mt_bench_template = create_mt_bench_template()
rating_task = create_rating_task()
llm_judge_metric = create_llm_as_judge_metric("mistralai/Mistral-7B-Instruct-v0.2")

print("✅ Created MT-Bench components:")
print(f"   Template: {mt_bench_template.name}")
print(f"   Task: {rating_task.name}")
print(f"   Metric: {llm_judge_metric.name}")
print(f"   Task card preprocessing steps: {len(json.loads(mt_bench_card_json)['preprocess_steps'])}")

✅ Created MT-Bench components:
   Template: response_assessment.rating.mt_bench_single_turn
   Task: response_assessment.rating.single_turn
   Metric: llmaaj_metric
   Task card preprocessing steps: 8


In [17]:
# Build the complete LLM-as-a-Judge job
llm_judge_job = (LMEvalJobBuilder("custom-llmaaj-metric")
                 .model("hf")
                 .add_model_arg("pretrained", "google/flan-t5-small")
                 .custom_card(
                     card_json=mt_bench_card_json,
                     template_ref="response_assessment.rating.mt_bench_single_turn",
                     format_str="formats.models.mistral.instruction",
                     metrics=["llmaaj_metric"]
                 )
                 .custom_definitions(
                     templates=[mt_bench_template],
                     tasks=[rating_task],
                     metrics=[llm_judge_metric]
                 )
                 .log_samples(True)
                 .allow_online(True)
                 .allow_code_execution(True)
                 .hf_token("<HF_TOKEN>")
                 .build())

print("✅ Created complete LLM-as-a-Judge evaluation:")
print(f"   Job name: {llm_judge_job.metadata.name}")
print(f"   Model: {llm_judge_job.spec.modelArgs[0].value}")
print(f"   Custom templates: {len(llm_judge_job.spec.custom.templates)}")
print(f"   Custom tasks: {len(llm_judge_job.spec.custom.tasks)}")
print(f"   Custom metrics: {len(llm_judge_job.spec.custom.metrics)}")
print()

# Display the full YAML
display(Markdown("### Complete LLM-as-a-Judge YAML:"))
display(Markdown(f"```yaml\n{llm_judge_job.to_yaml()}\n```"))

✅ Created complete LLM-as-a-Judge evaluation:
   Job name: custom-llmaaj-metric
   Model: google/flan-t5-small
   Custom templates: 1
   Custom tasks: 1
   Custom metrics: 1



### Complete LLM-as-a-Judge YAML:

```yaml
apiVersion: trustyai.opendatahub.io/v1alpha1
kind: LMEvalJob
metadata:
  name: custom-llmaaj-metric
spec:
  model: hf
  modelArgs:
  - name: pretrained
    value: google/flan-t5-small
  taskList:
    taskRecipes:
    - card:
        custom: "{\n    \"__type__\": \"task_card\",\n    \"loader\": {\n        \"\
          __type__\": \"load_hf\",\n        \"path\": \"OfirArviv/mt_bench_single_score_gpt4_judgement\"\
          ,\n        \"split\": \"train\"\n    },\n    \"preprocess_steps\": [\n \
          \       {\n            \"__type__\": \"rename_splits\",\n            \"\
          mapper\": {\n                \"train\": \"test\"\n            }\n      \
          \  },\n        {\n            \"__type__\": \"filter_by_condition\",\n \
          \           \"values\": {\n                \"turn\": 1\n            },\n\
          \            \"condition\": \"eq\"\n        },\n        {\n            \"\
          __type__\": \"filter_by_condition\",\n            \"values\": {\n      \
          \          \"reference\": \"[]\"\n            },\n            \"condition\"\
          : \"eq\"\n        },\n        {\n            \"__type__\": \"rename\",\n\
          \            \"field_to_field\": {\n                \"model_input\": \"\
          question\",\n                \"score\": \"rating\",\n                \"\
          category\": \"group\",\n                \"model_output\": \"answer\"\n \
          \           }\n        },\n        {\n            \"__type__\": \"literal_eval\"\
          ,\n            \"field\": \"question\"\n        },\n        {\n        \
          \    \"__type__\": \"copy\",\n            \"field\": \"question/0\",\n \
          \           \"to_field\": \"question\"\n        },\n        {\n        \
          \    \"__type__\": \"literal_eval\",\n            \"field\": \"answer\"\n\
          \        },\n        {\n            \"__type__\": \"copy\",\n          \
          \  \"field\": \"answer/0\",\n            \"to_field\": \"answer\"\n    \
          \    }\n    ],\n    \"task\": \"tasks.response_assessment.rating.single_turn\"\
          ,\n    \"templates\": [\n        \"templates.response_assessment.rating.mt_bench_single_turn\"\
          \n    ]\n}"
      template:
        ref: response_assessment.rating.mt_bench_single_turn
      format: formats.models.mistral.instruction
      metrics:
      - ref: llmaaj_metric
  custom:
    templates:
    - name: response_assessment.rating.mt_bench_single_turn
      value: "{\n    \"__type__\": \"input_output_template\",\n    \"instruction\"\
        : \"Please act as an impartial judge and evaluate the quality of the response\
        \ provided by an AI assistant to the user question displayed below. Your evaluation\
        \ should consider factors such as the helpfulness, relevance, accuracy, depth,\
        \ creativity, and level of detail of the response. Begin your evaluation by\
        \ providing a short explanation. Be as objective as possible. After providing\
        \ your explanation, you must rate the response on a scale of 1 to 10 by strictly\
        \ following this format: \\\"[[rating]]\\\", for example: \\\"Rating: [[5]]\\\
        \".\\n\\n\",\n    \"input_format\": \"[Question]\\n{question}\\n\\n[The Start\
        \ of Assistant's Answer]\\n{answer}\\n[The End of Assistant's Answer]\",\n\
        \    \"output_format\": \"[[{rating}]]\",\n    \"postprocessors\": [\n   \
        \     \"processors.extract_mt_bench_rating_judgment\"\n    ]\n}"
    tasks:
    - name: response_assessment.rating.single_turn
      value: "{\n    \"__type__\": \"task\",\n    \"input_fields\": {\n        \"\
        question\": \"str\",\n        \"answer\": \"str\"\n    },\n    \"outputs\"\
        : {\n        \"rating\": \"float\"\n    },\n    \"metrics\": [\n        \"\
        metrics.spearman\"\n    ]\n}"
    metrics:
    - name: llmaaj_metric
      value: "{\n    \"__type__\": \"llm_as_judge\",\n    \"inference_model\": {\n\
        \        \"__type__\": \"hf_pipeline_based_inference_engine\",\n        \"\
        model_name\": \"mistralai/Mistral-7B-Instruct-v0.2\",\n        \"max_new_tokens\"\
        : 256,\n        \"use_fp16\": true\n    },\n    \"template\": \"templates.response_assessment.rating.mt_bench_single_turn\"\
        ,\n    \"task\": \"rating.single_turn\",\n    \"format\": \"formats.models.mistral.instruction\"\
        ,\n    \"main_score\": \"mistralai_mistral_7b_instruct_v0.2_huggingface_template_mt_bench_single_turn\"\
        \n}"
  logSamples: true
  allowOnline: true
  allowCodeExecution: true
  pod:
    container:
      env:
      - name: HF_TOKEN
        value: <HF_TOKEN>

```
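
The template above instructs the judge to end with a rating in the form `[[rating]]`, and a postprocessor then has to recover that number from free text. The snippet below is a rough sketch of such an extraction using a regular expression. It is not the implementation of `processors.extract_mt_bench_rating_judgment`, which may behave differently (for example, by normalizing the score); `extract_rating` is a hypothetical name.

```python
import re

# Matches ratings such as [[5]] or [[7.5]].
RATING_PATTERN = re.compile(r"\[\[(\d+(?:\.\d+)?)\]\]")


def extract_rating(judge_output):
    """Return the first [[n]] rating as a float, or None if absent or outside 1..10."""
    match = RATING_PATTERN.search(judge_output)
    if match is None:
        return None
    rating = float(match.group(1))
    return rating if 1 <= rating <= 10 else None


print(extract_rating("The answer is accurate but brief. Rating: [[7]]"))  # 7.0
print(extract_rating("No rating given."))  # None
```

Returning `None` for missing or out-of-range ratings keeps malformed judge outputs from silently skewing the aggregate score.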

## 5. Exploring the Generated Structure

Let's examine the structure of our generated CRs to understand what we've created.

In [18]:
# Examine the card.custom field structure
recipe = llm_judge_job.spec.taskList.taskRecipes[0]
card_custom = json.loads(recipe.card.custom)

print("🔍 Analyzing the task card structure:")
print(f"   Card type: {card_custom['__type__']}")
print(f"   Loader type: {card_custom['loader']['__type__']}")
print(f"   Dataset path: {card_custom['loader']['path']}")
print(f"   Preprocessing steps: {len(card_custom['preprocess_steps'])}")
print()

print("📋 Preprocessing step types:")
for i, step in enumerate(card_custom['preprocess_steps']):
    print(f"   {i+1}. {step['__type__']}")
print()

# Examine custom definitions
print("🔧 Custom definitions:")
template_config = json.loads(llm_judge_job.spec.custom.templates[0].value)
task_config = json.loads(llm_judge_job.spec.custom.tasks[0].value)
metric_config = json.loads(llm_judge_job.spec.custom.metrics[0].value)

print(f"   Template type: {template_config['__type__']}")
print(f"   Task type: {task_config['__type__']}")
print(f"   Metric type: {metric_config['__type__']}")
print(f"   Judge model: {metric_config['inference_model']['model_name']}")

🔍 Analyzing the task card structure:
   Card type: task_card
   Loader type: load_hf
   Dataset path: OfirArviv/mt_bench_single_score_gpt4_judgement
   Preprocessing steps: 8

📋 Preprocessing step types:
   1. rename_splits
   2. filter_by_condition
   3. filter_by_condition
   4. rename
   5. literal_eval
   6. copy
   7. literal_eval
   8. copy

🔧 Custom definitions:
   Template type: input_output_template
   Task type: task
   Metric type: llm_as_judge
   Judge model: mistralai/Mistral-7B-Instruct-v0.2
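
The cell above calls `json.loads` on each field by hand because the CR stores the card and the custom definitions as JSON strings nested inside the manifest. When inspecting a whole manifest, a recursive helper can decode every embedded JSON object in one pass. This is a small sketch using only the standard library; `decode_embedded_json` is a hypothetical helper, and the sample `spec` is a trimmed-down imitation of the structure shown earlier.

```python
import json


def decode_embedded_json(node):
    """Recursively replace string values that hold JSON objects with parsed dicts."""
    if isinstance(node, dict):
        return {key: decode_embedded_json(value) for key, value in node.items()}
    if isinstance(node, list):
        return [decode_embedded_json(value) for value in node]
    if isinstance(node, str) and node.lstrip().startswith("{"):
        try:
            return decode_embedded_json(json.loads(node))
        except json.JSONDecodeError:
            return node  # looked like JSON but was not; keep the raw string
    return node


spec = {
    "custom": {
        "metrics": [
            {
                "name": "llmaaj_metric",
                "value": json.dumps({
                    "__type__": "llm_as_judge",
                    "inference_model": {"model_name": "mistralai/Mistral-7B-Instruct-v0.2"},
                }),
            }
        ]
    }
}
decoded = decode_embedded_json(spec)
print(decoded["custom"]["metrics"][0]["value"]["inference_model"]["model_name"])
```

The decoded structure is meant for inspection only; the manifest submitted to the cluster still needs the original string-encoded values.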


## 6. Different Judge Models

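The judge model is just an argument to `create_llm_as_judge_metric()`, so a loop can generate one job per judge. The format string below is chosen with a simple heuristic (the Llama chat format for Llama judges, the Mistral instruction format otherwise); in a real evaluation, pick the format that matches each judge model.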

In [19]:
# Create variations with different judge models
judge_models = [
    "meta-llama/Llama-2-7b-chat-hf",
    "microsoft/DialoGPT-large",
    "google/flan-t5-xl"
]

jobs = []
for model in judge_models:
    # Create custom metric for this judge
    custom_metric = create_llm_as_judge_metric(model)
    
    # Create job with this judge
    # Kubernetes object names may not contain underscores, so use hyphens only
    job = (LMEvalJobBuilder(f"eval-{model.split('/')[-1].lower().replace('_', '-')}")
           .namespace("experiments")
           .pretrained_model("google/flan-t5-base")
           .custom_card(
               card_json=mt_bench_card_json,
               template_ref="response_assessment.rating.mt_bench_single_turn",
               format_str="formats.models.llama.chat" if "llama" in model.lower() else "formats.models.mistral.instruction",
               metrics=["llmaaj_metric"]
           )
           .custom_definitions(
               templates=[mt_bench_template],
               tasks=[rating_task],
               metrics=[custom_metric]
           )
           .build())
    
    jobs.append(job)
    
    print(f"✅ Created job '{job.metadata.name}' with judge: {model}")

print(f"\n🎯 Generated {len(jobs)} evaluation jobs with different judge models!")

✅ Created job 'eval-llama-2-7b-chat-hf' with judge: meta-llama/Llama-2-7b-chat-hf
✅ Created job 'eval-dialogpt-large' with judge: microsoft/DialoGPT-large
✅ Created job 'eval-flan-t5-xl' with judge: google/flan-t5-xl

🎯 Generated 3 evaluation jobs with different judge models!
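
Job names derived from model identifiers need to be valid Kubernetes object names, which allow only lowercase alphanumerics, `-`, and `.`; underscores and uppercase letters are rejected by the API server. The helper below is a sketch of one way to derive a safe name. `to_k8s_name` is hypothetical, and it targets the stricter DNS label form (at most 63 characters, no dots) so the result is valid wherever a name is accepted.

```python
import re


def to_k8s_name(raw, prefix="eval-", max_length=63):
    """Hypothetical helper: derive an RFC 1123 label from an arbitrary string."""
    # Replace anything that is not a lowercase letter, digit, or hyphen.
    slug = re.sub(r"[^a-z0-9-]+", "-", raw.lower())
    slug = re.sub(r"-{2,}", "-", slug).strip("-")
    name = f"{prefix}{slug}"[:max_length].rstrip("-")
    if not re.fullmatch(r"[a-z0-9]([-a-z0-9]*[a-z0-9])?", name):
        raise ValueError(f"cannot derive a valid name from {raw!r}")
    return name


for model in ["meta-llama/Llama-2-7b-chat-hf", "microsoft/DialoGPT-large", "google/flan-t5-xl"]:
    print(to_k8s_name(model.split("/")[-1]))
```

Validating names before building the job surfaces naming problems in the notebook rather than at submission time.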


## 7. Deployment Ready Output

Generate deployment-ready YAML that can be applied to a Kubernetes cluster with `kubectl`.

In [20]:
# Create a production-ready job with proper configuration
production_job = (LMEvalJobBuilder("production-evaluation")
                  .namespace("trustyai-production")
                  .pretrained_model("microsoft/DialoGPT-large")
                  .task_names(["hellaswag", "arc_challenge", "truthfulqa_mc"])
                  .limit(1000)
                  .log_samples(True)
                  .allow_online(False)  # Secure: no online access
                  .allow_code_execution(False)  # Secure: no code execution
                  .env_var("HF_TOKEN", "${HF_TOKEN}")  # Placeholder: fill in before applying; Kubernetes does not expand ${...}
                  .env_var("CUDA_VISIBLE_DEVICES", "0")
                  .env_var("TRANSFORMERS_CACHE", "/cache")
                  .build())

print("🚀 Production-ready LMEvalJob:")
print(f"   Name: {production_job.metadata.name}")
print(f"   Namespace: {production_job.metadata.namespace}")
print(f"   Security: Online={production_job.spec.allowOnline}, Code={production_job.spec.allowCodeExecution}")
print(f"   Sample limit: {production_job.spec.limit}")
print()

# Display deployment command
display(Markdown("### Deployment Commands:"))
display(Markdown("""
```bash
# Apply a manifest saved as production_job.yaml
kubectl apply -f production_job.yaml

# Or apply directly
cat <<EOF | kubectl apply -f -
# [YAML content would go here]
EOF
```
"""))

# Show the final YAML
display(Markdown("### Production YAML:"))
display(Markdown(f"```yaml\n{production_job.to_yaml()}\n```"))

🚀 Production-ready LMEvalJob:
   Name: production-evaluation
   Namespace: trustyai-production
   Security: Online=False, Code=False
   Sample limit: 1000



### Deployment Commands:


```bash
# Apply a manifest saved as production_job.yaml
kubectl apply -f production_job.yaml

# Or apply directly
cat <<EOF | kubectl apply -f -
# [YAML content would go here]
EOF
```


### Production YAML:

```yaml
apiVersion: trustyai.opendatahub.io/v1alpha1
kind: LMEvalJob
metadata:
  name: production-evaluation
  namespace: trustyai-production
spec:
  model: hf
  modelArgs:
  - name: pretrained
    value: microsoft/DialoGPT-large
  taskList:
    taskNames:
    - hellaswag
    - arc_challenge
    - truthfulqa_mc
  logSamples: true
  allowOnline: false
  allowCodeExecution: false
  limit: '1000'
  pod:
    container:
      env:
      - name: HF_TOKEN
        value: ${HF_TOKEN}
      - name: CUDA_VISIBLE_DEVICES
        value: '0'
      - name: TRANSFORMERS_CACHE
        value: /cache

```
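
The `${HF_TOKEN}` value in this manifest is a literal string as far as Kubernetes is concerned, so it has to be filled in before the manifest is applied. One option, sketched below with the standard library's `string.Template`, is to render the YAML text in Python. `render_manifest` is a hypothetical helper, the manifest fragment is abbreviated, and the fallback token `hf_example` is a dummy value for illustration.

```python
import os
from string import Template

manifest_text = """\
pod:
  container:
    env:
    - name: HF_TOKEN
      value: ${HF_TOKEN}
"""


def render_manifest(text, variables):
    """Fill ${NAME} placeholders; raises KeyError if a variable is missing."""
    # Note: any other literal '$' in the text must be written as '$$'.
    return Template(text).substitute(variables)


token = os.environ.get("HF_TOKEN", "hf_example")  # dummy fallback for illustration
rendered = render_manifest(manifest_text, {"HF_TOKEN": token})
print(rendered)
```

Using `substitute` rather than `safe_substitute` makes a missing variable fail loudly instead of leaving the placeholder in the applied manifest. For real deployments, storing the token in a Kubernetes Secret is usually preferable to writing it into the manifest in plain text.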

In [22]:
# Import the new Kubernetes client
from trustyai.core.trustyai_kubernetes_client import TrustyAIKubernetesClient

# Create a client (uses default kubeconfig)
client = TrustyAIKubernetesClient()

# Submit a job and get a resource handle
submitted_job = client.submit(simple_job)

if submitted_job:
    print(f"✅ Successfully submitted: {submitted_job}")
    print(f"   Resource name: {submitted_job.name}")
    print(f"   Namespace: {submitted_job.namespace}")
    print(f"   Kind: {submitted_job.kind}")
    print()
    
    # Demonstrate resource management methods
    print("📋 Available management methods:")
    print("   - submitted_job.get_status()      # Get current status")
    print("   - submitted_job.get_logs()        # Get pod logs")
    print("   - submitted_job.wait_for_completion()  # Wait for completion")
    print("   - submitted_job.is_running()      # Check if running")
    print("   - submitted_job.is_completed()    # Check if completed")
    print("   - submitted_job.is_failed()       # Check if failed")
    print("   - submitted_job.delete()          # Delete the resource")
    print()
    
    # Example: Check if the job would be running (this would fail in demo environment)
    print("🔍 Example usage (would require actual cluster):")
    print("   status = submitted_job.get_status()")
    print("   if submitted_job.is_running():")
    print("       print('Job is currently running...')")
    
else:
    print("❌ Failed to submit job (likely no cluster available in demo environment)")
    print("💡 The client provides these methods for actual cluster deployment:")
    print("   - client.submit(lmeval_job)       # Submit and get resource handle")
    print("   - client.list_resources()         # List all resources")
    print("   - client.get_resource(name)       # Get handle to existing resource")
    print("   - client.generate_yaml(job)       # Generate YAML for kubectl")
    print("   - client.save_yaml_to_file(job, path)  # Save YAML to file")

[DEBUG] Deploying LMEvalJob to namespace: test
[DEBUG] API Group: trustyai.opendatahub.io, Version: v1alpha1
[DEBUG] Resource metadata: {'name': 'simple-evaluation', 'namespace': 'test'}
[DEBUG] Successfully created LMEvalJob 'simple-evaluation' in namespace 'test'
✅ Successfully submitted: LMEvalJob(name='simple-evaluation', namespace='test')
   Resource name: simple-evaluation
   Namespace: test
   Kind: LMEvalJob

📋 Available management methods:
   - submitted_job.get_status()      # Get current status
   - submitted_job.get_logs()        # Get pod logs
   - submitted_job.wait_for_completion()  # Wait for completion
   - submitted_job.is_running()      # Check if running
   - submitted_job.is_completed()    # Check if completed
   - submitted_job.is_failed()       # Check if failed
   - submitted_job.delete()          # Delete the resource

🔍 Example usage (would require actual cluster):
   status = submitted_job.get_status()
   if submitted_job.is_running():
       print('Job is currently running...')

## 8. Kubernetes Client for TrustyAI Resources

The TrustyAI SDK includes a dedicated Kubernetes client, `TrustyAIKubernetesClient`, for submitting and managing TrustyAI resources; the preceding cell uses it to submit `simple_job`.

### 🆕 **New Features:**

- **`LMEvalJobBuilder.simple()`**: Static method for creating simple jobs with optional limit parameter (replaces deprecated `create_simple_lmeval_job`)
- **`TrustyAIKubernetesClient`**: Dedicated client for TrustyAI resources
- **`SubmittedResource`**: Handle returned by `client.submit()` with methods for:
  - Status checking (`get_status()`, `is_running()`, `is_completed()`, `is_failed()`)
  - Log retrieval (`get_logs()`)
  - Resource management (`delete()`, `wait_for_completion()`)
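
The status methods listed above lend themselves to a simple polling loop. The sketch below shows the general shape of such a loop using only `is_completed()` and `is_failed()`; it is not how `wait_for_completion()` is implemented, whose signature and behavior are not shown in this notebook. `wait_until_done` and `FakeHandle` are hypothetical names, and `FakeHandle` stands in for a submitted job so the example runs without a cluster.

```python
import time


def wait_until_done(handle, timeout=60.0, interval=1.0):
    """Poll a job handle; return 'completed', 'failed', or 'timeout'."""
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        if handle.is_completed():
            return "completed"
        if handle.is_failed():
            return "failed"
        time.sleep(interval)
    return "timeout"


class FakeHandle:
    """Hypothetical stand-in for a submitted job that completes after a few polls."""

    def __init__(self, polls_until_done):
        self._remaining = polls_until_done

    def is_completed(self):
        self._remaining -= 1
        return self._remaining <= 0

    def is_failed(self):
        return False


print(wait_until_done(FakeHandle(3), timeout=5.0, interval=0.01))  # completed
```

Using a monotonic clock for the deadline avoids surprises if the system clock is adjusted while the job is running.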