# Helmbot - Helm Chart Variable Parser and Question Generator

This notebook demonstrates how to:
1. Parse Helm template files to extract variables
2. Use LangChain and OpenAI to generate user-friendly questions for those variables
3. Create an intelligent interface for Helm chart configuration

## Overview
The goal is to automatically analyze Helm charts and generate minimal, grouped questions that help users configure their deployments without needing deep Kubernetes knowledge.

## Step 1: Install Required Dependencies

First, we need to install the necessary Python packages for working with LangChain and OpenAI.

In [1]:
# Install LangChain packages
!pip install langchain
!pip install langchain_community

# Install OpenAI package
!pip install openai

print("✅ All dependencies installed successfully!")

Defaulting to user installation because normal site-packages is not writeable
Defaulting to user installation because normal site-packages is not writeable
Defaulting to user installation because normal site-packages is not writeable
Defaulting to user installation because normal site-packages is not writeable
✅ All dependencies installed successfully!


## Step 2: Parse Helm Template Files

Now we'll scan the Helm template directory to find all template files and extract variables that need to be configured.

In [2]:
# Import required libraries
import os
import re

# Step 2a: List all Helm template files in sample_helm directory
template_dir = 'sample_helm'

# Find all YAML and template files
template_files = [f for f in os.listdir(template_dir) if f.endswith(('.yaml', '.tpl'))]
print(f'📁 Found {len(template_files)} template files:')
for file in template_files:
    print(f'   - {file}')

print("\n" + "="*50)

📁 Found 7 template files:
   - helm-sample-chart-deployment.yaml
   - helm-sample-chart-hpa.yaml
   - helm-sample-chart-ingress.yaml
   - helm-sample-chart-service.yaml
   - helm-sample-chart-serviceaccount.yaml
   - helm-sample-chart-tests_test-connection.yaml
   - helm-sample-chart-_helpers.tpl
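The listing above only looks at the top level of `sample_helm`. Standard Helm charts usually keep their templates in a `templates/` directory that can contain subfolders (for example `templates/tests/`), which `os.listdir` does not descend into. A minimal sketch of a recursive variant follows; the helper name `find_template_files` and the extra `.yml` suffix are assumptions for illustration, not part of the original notebook.

```python
from pathlib import Path

def find_template_files(root, suffixes=('.yaml', '.yml', '.tpl')):
    """Return template paths under root (searched recursively), relative to root and sorted."""
    root_path = Path(root)
    return sorted(
        p.relative_to(root_path).as_posix()
        for p in root_path.rglob('*')
        if p.is_file() and p.suffix in suffixes
    )
```

Calling `find_template_files('sample_helm')` would return relative paths such as `tests/test-connection.yaml` when the chart keeps its original directory layout.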



In [3]:
# Step 2b: Extract variables from template files
variables = set()
pattern = re.compile(r'\{\{\s*\.Values\.([a-zA-Z0-9_]+)')

print("🔍 Scanning template files for variables...")
for fname in template_files:
    file_path = os.path.join(template_dir, fname)
    try:
        with open(file_path, 'r', encoding='utf-8') as f:
            content = f.read()
            found = pattern.findall(content)
            if found:
                print(f'   📄 {fname}: {found}')
            variables.update(found)
    except Exception as e:
        print(f'   ❌ Error reading {fname}: {e}')

print(f'\n✅ Total unique variables found: {len(variables)}')
print(f'📋 Variables needed: {sorted(list(variables))}')

🔍 Scanning template files for variables...
   📄 helm-sample-chart-deployment.yaml: ['replicaCount', 'image', 'image', 'image', 'service']
   📄 helm-sample-chart-hpa.yaml: ['autoscaling', 'autoscaling', 'autoscaling', 'autoscaling']
   📄 helm-sample-chart-service.yaml: ['service', 'service']
   📄 helm-sample-chart-serviceaccount.yaml: ['serviceAccount']
   📄 helm-sample-chart-tests_test-connection.yaml: ['service']

✅ Total unique variables found: 5
📋 Variables needed: ['autoscaling', 'image', 'replicaCount', 'service', 'serviceAccount']
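The pattern above stops at the first key after `.Values.`, so `.Values.image.repository` and `.Values.image.tag` both collapse to `image`. It also requires `.Values` to follow `{{` directly, so references such as `{{- if .Values.ingress.enabled }}` are skipped, which is probably why the ingress template reports no variables. A hedged sketch that captures full dotted paths anywhere in the text is shown below. The names `VALUES_PATH` and `extract_value_paths` are hypothetical, and the sketch assumes values are referenced with plain dot access (not through `index` or `with` blocks).

```python
import re

# Match ".Values." followed by one or more dot-separated keys
VALUES_PATH = re.compile(r'\.Values\.([A-Za-z0-9_]+(?:\.[A-Za-z0-9_]+)*)')

def extract_value_paths(template_text):
    """Return the sorted, de-duplicated dotted paths referenced as .Values.<path>."""
    return sorted(set(VALUES_PATH.findall(template_text)))

sample = """
{{- if .Values.ingress.enabled }}
image: "{{ .Values.image.repository }}:{{ .Values.image.tag | default .Chart.AppVersion }}"
replicas: {{ .Values.replicaCount }}
{{- end }}
"""
print(extract_value_paths(sample))
# → ['image.repository', 'image.tag', 'ingress.enabled', 'replicaCount']
```

Passing full paths to the LLM gives it more to work with than the five top-level names, at the cost of a longer prompt.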


## Step 3: Configure OpenAI API

To generate intelligent questions from the extracted variables, we need to set up OpenAI API access.

In [None]:
# Set up OpenAI API Key
import os

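# Option 1: Set as environment variable (recommended for security)
# Best done outside the notebook (export OPENAI_API_KEY in your shell). To set it
# here instead, uncomment and replace with your actual API key:
# os.environ['OPENAI_API_KEY'] = 'sk-your-actual-openai-api-key-here'

# Option 2: For testing only (less secure)
# Uncomment and replace with your actual key. Note that a plain Python variable is
# not picked up automatically; it must be passed to the client explicitly.
# OPENAI_API_KEY = 'sk-your-actual-openai-api-key-here'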

print("🔑 API Key Setup Instructions:")
print("1. Get your API key from: https://platform.openai.com/api-keys")
print("2. Uncomment one of the options above and replace with your actual key")
print("3. Run this cell to set up authentication")

# Check if API key is set
if 'OPENAI_API_KEY' in os.environ and os.environ['OPENAI_API_KEY']:
    print("✅ API key is configured!")
else:
    print("⚠️  API key not found. Please set it up before proceeding.")

🔑 API Key Setup Instructions:
1. Get your API key from: https://platform.openai.com/api-keys
2. Uncomment one of the options above and replace with your actual key
3. Run this cell to set up authentication
✅ API key is configured!
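One way to avoid pasting a key into the notebook at all is to prompt for it without echoing the input. The sketch below uses the standard library's `getpass`; the helper name `ensure_api_key` and its parameters are assumptions for illustration.

```python
import os
from getpass import getpass

def ensure_api_key(env=os.environ, prompt_fn=getpass, name='OPENAI_API_KEY'):
    """Return the key from env, prompting once (input hidden) if it is missing."""
    key = env.get(name, '').strip()
    if not key:
        key = prompt_fn(f'Enter {name}: ').strip()
        if not key:
            raise ValueError(f'{name} was not provided')
        env[name] = key  # store it so later cells can read it from the environment
    return key
```

Running `ensure_api_key()` in a cell keeps the key out of the saved notebook, since it lives only in the process environment.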


## Step 4: Initialize LangChain and Create Prompt Template

Set up the LangChain components for generating intelligent questions from the extracted variables.

In [5]:
# Import LangChain components
from langchain_community.chat_models import ChatOpenAI
from langchain.prompts import PromptTemplate

# Initialize the ChatOpenAI model
llm = ChatOpenAI(
    model='gpt-3.5-turbo',  # Using gpt-3.5-turbo as it's more widely available
    temperature=0.7  # Controls creativity (0.0 = very focused, 1.0 = very creative)
)

# Create a prompt template for generating questions
prompt = PromptTemplate(
    input_variables=['variables'],
    template="""Given the following Helm chart variables: {variables}

Generate the minimum set of user-friendly questions needed to configure all these values.
Group related variables together where possible to minimize the number of questions.
Make the questions clear and understandable for users who may not be Kubernetes experts.

Format your response as a numbered list of questions.
For each question, briefly explain what the variable controls in parentheses.

Example format:
1. What is the name of your application? (Sets the app name used in labels and resources)
2. How many replicas do you want to run? (Controls horizontal scaling)
"""
)

print("🤖 LangChain components initialized successfully!")
print(f"📝 Using model: {llm.model_name}")
print("✅ Prompt template created for question generation")

🤖 LangChain components initialized successfully!
📝 Using model: gpt-3.5-turbo
✅ Prompt template created for question generation


## Step 5: Generate User-Friendly Questions

Use the LLM to generate intelligent, grouped questions based on the extracted Helm variables.

In [6]:
# Generate questions using the LLM
def generate_questions_for_variables(variables_list):
    """Generate user-friendly questions for Helm chart variables"""
    
    if not variables_list:
        print("❌ No variables found to generate questions for.")
        return
    
    # Check if API key is available
    if 'OPENAI_API_KEY' not in os.environ or not os.environ['OPENAI_API_KEY']:
        print("⚠️  OpenAI API key not found.")
        print("\n🔄 Showing example output instead:")
        show_example_questions(variables_list)
        return
    
    try:
        print("🚀 Generating intelligent questions using GPT-3.5-turbo...")
        print("📋 Input variables:", sorted(variables_list))
        print("\n" + "="*60)
        
        # Format the prompt with variables
        formatted_prompt = prompt.format(variables=', '.join(sorted(variables_list)))
        
        # Get response from the LLM
        response = llm.invoke(formatted_prompt)
        
        print("🎯 Generated Questions:")
        print(response.content)
        
        # Write the generated questions to a file
        with open('sample_helm/generated_questions.txt', 'w', encoding='utf-8') as file:
            file.write(response.content)
        print("\n💾 Questions saved to 'generated_questions.txt'")
        
        return response.content
        
    except Exception as e:
        print(f"❌ Error calling OpenAI API: {e}")
        print("\n🔄 Falling back to example questions:")
        show_example_questions(variables_list)

def show_example_questions(variables_list):
    """Show example questions when API is not available"""
    print("\n📝 Example questions that might be generated:")
    examples = [
        "1. What is the name of your application? (Sets app labels and resource names)",
        "2. How many replicas do you want to run? (Controls horizontal scaling)",
        "3. What Docker image should be used? (Specifies the container image)",
        "4. What port should the service expose? (Sets service port configuration)",
        "5. Do you need persistent storage? (Configures volume claims)",
    ]
    for example in examples:
        print(f"   {example}")

# Run the question generation
llm_response = None
if 'variables' in locals() and variables:
    llm_response = generate_questions_for_variables(list(variables))
else:
    print("⚠️  No variables found. Please run the previous steps first.")

🚀 Generating intelligent questions using GPT-3.5-turbo...
📋 Input variables: ['autoscaling', 'image', 'replicaCount', 'service', 'serviceAccount']

🎯 Generated Questions:
1. Do you want to enable autoscaling for your application? (Controls whether the application will automatically adjust the number of pods based on resource usage)
2. What Docker image do you want to use for your application? (Specifies the image to be used for the application)
3. How many replicas of your application do you want to run initially? (Sets the initial number of pods for the application)
4. What type of service do you want to expose for your application? (Defines the type of Kubernetes service to be created for accessing the application)
5. Do you want to specify a service account for your application? (Determines the service account used by the application pods)

💾 Questions saved to 'generated_questions.txt'
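The saved file is later read line by line, so any line the model adds that is not a question (an introduction, a closing remark) would be treated as one. A minimal sketch of a stricter parser, assuming the model follows the `N. question (explanation)` format requested in the prompt; `QUESTION_LINE` and `parse_questions` are hypothetical names.

```python
import re

# A number, "." or ")", the question text, and an optional trailing "(hint)"
QUESTION_LINE = re.compile(r'^\s*(\d+)[.)]\s+(.*?)(?:\s*\(([^()]*)\))?\s*$')

def parse_questions(text):
    """Split a numbered list into dicts with 'number', 'question' and 'hint' keys."""
    parsed = []
    for line in text.splitlines():
        match = QUESTION_LINE.match(line)
        if match:  # lines that are not numbered items are ignored
            number, question, hint = match.groups()
            parsed.append({'number': int(number), 'question': question, 'hint': hint or ''})
    return parsed
```

Keeping the hint separate makes it possible to show it as help text rather than as part of the prompt shown to the user.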

## Next Steps and Extensions

### Potential Enhancements:
1. **Interactive UI**: Create a web interface to ask questions and collect answers
2. **Values.yaml Generation**: Automatically generate `values.yaml` from user responses
3. **Variable Validation**: Add type checking and validation for user inputs
4. **Template Analysis**: Analyze dependencies between variables
5. **Multi-chart Support**: Handle Helm charts with sub-charts and dependencies
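
As an illustration of the variable validation idea, the sketch below coerces raw text answers into typed values before they reach `values.yaml`. The helper `coerce_answer` and the `schema` mapping are assumptions made for this example; the notebook does not derive types from the chart.

```python
def coerce_answer(raw, expected_type):
    """Convert a raw text answer to bool, int, or str; raise ValueError if it does not fit."""
    text = raw.strip()
    if expected_type is bool:
        lowered = text.lower()
        if lowered in ('y', 'yes', 'true', '1'):
            return True
        if lowered in ('n', 'no', 'false', '0'):
            return False
        raise ValueError(f'expected yes/no, got {raw!r}')
    if expected_type is int:
        return int(text)  # raises ValueError on non-numeric input
    return text

# Hypothetical schema: these types are assumed for illustration, not read from the chart
schema = {'autoscaling.enabled': bool, 'replicaCount': int, 'image.repository': str}
answers = {'autoscaling.enabled': 'Yes', 'replicaCount': ' 2 ', 'image.repository': 'nginx'}
values = {key: coerce_answer(answers[key], kind) for key, kind in schema.items()}
print(values)
# → {'autoscaling.enabled': True, 'replicaCount': 2, 'image.repository': 'nginx'}
```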

### Usage Tips:
- Test with different Helm charts to see how the question generation adapts
- Experiment with different GPT models (gpt-3.5-turbo vs gpt-4) for cost vs quality trade-offs
- Consider adding context about the application type to generate more relevant questions
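
The last tip could look roughly like this: a plain-Python sketch that prefixes the prompt with a one-line description of the application. `BASE_INSTRUCTIONS` and `build_prompt` are hypothetical names, and the wording of the context line is an assumption.

```python
BASE_INSTRUCTIONS = (
    "Generate the minimum set of user-friendly questions needed to configure all these values.\n"
    "Group related variables together where possible to minimize the number of questions."
)

def build_prompt(variables, app_context=None):
    """Render the question prompt, optionally prefixed with an application description."""
    lines = []
    if app_context:
        lines.append(f"The application being deployed is: {app_context}")
    lines.append(f"Given the following Helm chart variables: {', '.join(sorted(variables))}")
    lines.append("")
    lines.append(BASE_INSTRUCTIONS)
    return "\n".join(lines)

print(build_prompt({'image', 'replicaCount'}, app_context='a stateless nginx web server'))
```

The resulting string can be passed to `llm.invoke` in place of the formatted template.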

## Step 6: Generate the values.yaml File
### Steps:
1. Check that `generated_questions.txt` exists and load the questions from it
2. Collect the user's answer to each question
3. Send the questions and answers to the LLM and generate `values.yaml` from the responses
4. Save the result in the `sample_helm` directory



In [9]:
# Step 6: Generate values.yaml file using GPT-4.1
import os
from langchain_community.chat_models import ChatOpenAI

# 1. Check if generated_questions.txt exists and load the questions
questions_path = os.path.join('sample_helm', 'generated_questions.txt')
if not os.path.exists(questions_path):
    print(f"❌ {questions_path} not found. Please run the previous steps to generate questions.")
else:
    with open(questions_path, 'r', encoding='utf-8') as f:
        questions = [q.strip() for q in f.readlines() if q.strip()]
    print(f"✅ Loaded {len(questions)} questions from generated_questions.txt\n")
    
    # 2. Get the response for all the questions from the user
    answers = []
    print("Please answer the following questions to generate your values.yaml:")
    for idx, question in enumerate(questions, 1):
        # Prompt for each answer; the trailing space separates the question from the typed input
        answer = input(f"\nQ{idx}: {question} ")
        answers.append((question, answer))
    
    # 3. Use GPT-4.1 to generate values.yaml from questions and answers
    llm_gpt4 = ChatOpenAI(model='gpt-4.1', temperature=0.3)
    
    prompt_yaml = """
Given the following Helm chart configuration questions and user answers, generate a valid YAML file suitable for values.yaml. 
Only output the YAML content, no explanations or extra text.

Questions and Answers:
{qa_pairs}
"""
    qa_pairs = "\n".join([f"Q: {q}\nA: {a}" for q, a in answers])
    formatted_prompt = prompt_yaml.format(qa_pairs=qa_pairs)
    print("\n🚀 Sending questions and answers to GPT-4.1 to generate values.yaml...")
    response = llm_gpt4.invoke(formatted_prompt)
    yaml_content = response.content.strip()
    
    # 4. Store the generated YAML in sample_helm/values.yaml
    values_path = os.path.join('sample_helm', 'values.yaml')
    with open(values_path, 'w', encoding='utf-8') as f:
        f.write(yaml_content)
    print(f"\n💾 values.yaml generated and saved to {values_path}\n")
    print("--- values.yaml preview ---\n")
    print(yaml_content)


✅ Loaded 5 questions from generated_questions.txt

Please answer the following questions to generate your values.yaml:

🚀 Sending questions and answers to GPT-4.1 to generate values.yaml...

💾 values.yaml generated and saved to sample_helm\values.yaml

--- values.yaml preview ---

autoscaling:
  enabled: true

image:
  repository: nginix
  tag: latest

replicaCount: 2

service:
  type: ClusterIP

serviceAccount:
  create: false
  name: ""
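
Even when asked to output only YAML, chat models sometimes wrap the answer in a Markdown code fence, and the cell above would write that fence into `values.yaml` unchanged. A small sketch that removes a single wrapping fence before saving; `strip_code_fence` is a hypothetical helper, not part of the original notebook.

```python
import re

TICKS = '`' * 3  # the Markdown fence marker, built here to keep the source readable
FENCE = re.compile(r'^' + TICKS + r'[A-Za-z]*\s*\n(.*?)\n' + TICKS + r'\s*$', re.DOTALL)

def strip_code_fence(text):
    """Return the body of a single wrapping fence, or the stripped text if there is none."""
    stripped = text.strip()
    match = FENCE.match(stripped)
    return match.group(1) if match else stripped
```

In the cell above this would be applied as `yaml_content = strip_code_fence(response.content)`. If PyYAML is installed (an assumption; it is not among the packages installed in Step 1), running `yaml.safe_load(yaml_content)` before writing the file would additionally catch output that is not valid YAML.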


In [7]:
# 🧪 Test/Playground Cell
# Use this cell for testing and experimentation

# Quick test: Display current variables if available
if 'variables' in locals():
    print("Current variables found:", sorted(list(variables)))
else:
    print("Run the previous cells first to extract variables")

# You can add your own test code here

Current variables found: ['autoscaling', 'image', 'replicaCount', 'service', 'serviceAccount']
