# ChatBot Using OllamaLLM (llama3)

# Python and C++ Code Generator Chatbot - Documentation

## General Overview
This project is a **Python-based chatbot** that generates code solutions for user-provided programming tasks. It uses **LangChain** and **Ollama (LLaMA 3)** to interact with a local large language model (LLM). The system:

1. Accepts a **user question or request**.
2. Generates a **well-commented, optimized code solution in Python or C++**, depending on which notebook cell is run.
3. Asks the model to return the solution as a **JSON array**, which is then extracted, repaired if necessary, and displayed cleanly.
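
As a rough illustration of step 3, the snippet below parses a response that already follows the requested format: a JSON array holding one object with `question` and `code` keys. The sample response and the `parse_response` helper are made up for this sketch; real model output often needs repair before it parses.

```python
import json

# Hypothetical, well-formed model response (illustrative only).
sample_response = """
[
    {
        "question": "Print a greeting.",
        "code": "# Print a greeting\\nprint(\\"Hello\\")"
    }
]
"""

def parse_response(response_text: str) -> dict:
    """Parse the JSON array and return its single result object."""
    data = json.loads(response_text)
    if not isinstance(data, list) or len(data) != 1:
        raise ValueError("Expected a JSON array with exactly one object.")
    return data[0]

result = parse_response(sample_response)
print(result["question"])
print(result["code"])
```

The length check mirrors the prompt's requirement of exactly one object, so a malformed response fails early instead of surfacing later as a missing key.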

The chatbot also includes a **mock fallback LLM**, so the script still runs and returns a placeholder answer when Ollama is not reachable.
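
The fallback idea can be sketched without LangChain or Ollama: probe the primary backend once and switch to a stand-in if the probe raises. All names here (`pick_backend`, `unreachable_backend`, `placeholder_backend`) are hypothetical and only illustrate the pattern, not the notebook's actual functions.

```python
import sys

def pick_backend(primary_factory, fallback_factory, probe="Test"):
    """Return the primary backend if it answers a probe, else the fallback."""
    try:
        backend = primary_factory()
        backend(probe)  # fail fast if the backend is unreachable
        return backend
    except Exception as exc:
        print(f"[ERROR] Primary backend unavailable: {exc}", file=sys.stderr)
        return fallback_factory()

def unreachable_backend():
    """Stand-in for a local model server that is not running."""
    def call(prompt):
        raise ConnectionError("server not reachable")
    return call

def placeholder_backend():
    """Stand-in for the mock LLM: always returns a canned answer."""
    return lambda prompt: "placeholder answer"

backend = pick_backend(unreachable_backend, placeholder_backend)
print(backend("Write a loop."))
```

Probing at start-up means a missing server is reported once, before the user is asked for input.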

---

## 1. Dependencies

```bash
pip install langchain
pip install langchain-ollama
```

Note: in a Jupyter notebook, you may need to restart the kernel to use updated packages.
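
A quick way to confirm the installation succeeded is to check that the modules can be located. This is a minimal sketch; the `missing_packages` helper is hypothetical, and it only checks that a module can be found, not that Ollama itself is running.

```python
import importlib.util

def missing_packages(module_names):
    """Return the module names that cannot be found in the current environment."""
    return [name for name in module_names if importlib.util.find_spec(name) is None]

# Import names differ from pip names: langchain-ollama is imported as langchain_ollama.
missing = missing_packages(["langchain", "langchain_ollama"])
print("Missing:", missing if missing else "none")
```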


# Chatbot Code Python

In [13]:
import re
import json
import sys
from langchain_core.prompts import PromptTemplate  # langchain_core is installed with langchain
from langchain_ollama import OllamaLLM

# ==============================================================================
# 1. PROMPT TEMPLATE FUNCTION (Python version)
# ==============================================================================
def create_description_template():
    """
    Creates a PromptTemplate configured for a Python code generation LLM.
    """
    template = """
You are an expert Python programmer. The user will provide you with a question or request.
Your task is to write a perfect, optimized, yet simplified Python code solution for the user's request.

**Instructions:**
- Include comments in the generated code to explain key parts.
- Describe the thinking process briefly in a single comment block.
- Write short, simple, and optimized Python code that directly solves the user's request.
- Output only a valid JSON array. Do NOT wrap the code in triple quotes.
- Escape all internal double quotes in the code.
- The JSON array must contain exactly one object with the keys "question" and "code".

**Input question:**
{user_request}

**Output format** (MUST be a JSON array with one object):
[
    {{
        "question": "The user's original input question here...",
        "code": "The full, optimized, and simplified Python code with comments here..."
    }}
]

Return only the JSON array as output.
Begin.
"""
    return PromptTemplate(input_variables=["user_request"], template=template)

# ==============================================================================
# 2. ROBUST JSON EXTRACTION FUNCTION
# ==============================================================================

def extract_json_from_llm_output(output_str: str) -> list:
    """
    Robustly extracts a JSON array from the LLM output, fixing common
    issues like triple-quoted code fields and unescaped double quotes
    and newlines within the code block.
    Returns a Python list.
    """
    import re
    import json

    output_str = output_str.strip()

    # 1. Look for the JSON array structure [ ... ]
    match = re.search(r'\[\s*\{[\s\S]*?\}\s*\]', output_str, re.DOTALL)
    if not match:
        raise ValueError("No JSON array structure found in LLM output.")

    json_str = match.group(0)

    # 2. Find each "code": occurrence and replace its raw content with a safely escaped JSON string
    last_end = 0
    parts = []
    for m in re.finditer(r'"code"\s*:', json_str):
        # if this found "code": is inside a region we've already processed, skip it
        if m.start() < last_end:
            continue

        # append text before this "code":
        parts.append(json_str[last_end:m.start()])

        # advance pointer to char after the colon
        j = m.end()
        # skip whitespace
        while j < len(json_str) and json_str[j].isspace():
            j += 1

        # detect delimiter: triple quotes or single double-quote
        if json_str.startswith('"""', j) or json_str.startswith("'''", j):
            delim = json_str[j:j+3]
            content_start = j + 3
            end_idx = json_str.find(delim, content_start)
            if end_idx == -1:
                raise ValueError("Unterminated triple-quoted code block in LLM output.")
            raw_code = json_str[content_start:end_idx]
            after_end = end_idx + 3
        elif j < len(json_str) and json_str[j] == '"':
            # normal double-quoted string: find the closing unescaped quote
            content_start = j + 1
            k = content_start
            while k < len(json_str):
                if json_str[k] == '\\':
                    k += 2  # skip the backslash and the character it escapes
                    continue
                if json_str[k] == '"':
                    break
                k += 1
            if k >= len(json_str):
                raise ValueError("Unterminated quoted string for code field.")
            raw_code = json_str[content_start:k]
            after_end = k + 1
        else:
            # Unexpected format (for example code left unquoted). Try to recover:
            # capture until the next closing brace of the object or until next "key":
            # this is a best-effort fallback.
            fallback_match = re.search(r'(?:\n\s*\}\s*|\n\s*"[A-Za-z0-9_]+\s*"\s*:)', json_str[j:], re.DOTALL)
            if fallback_match:
                raw_code = json_str[j:j + fallback_match.start()]
                after_end = j + fallback_match.start()
            else:
                raise ValueError("Unexpected format for code field in LLM output.")

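        # --- sanitize/escape the raw code content for JSON ---
        code_content = raw_code
        # A value that started with a single double quote is already a JSON string,
        # so its backslashes and quotes must not be escaped a second time.
        already_json_string = json_str[j:j+1] == '"' and not json_str.startswith('"""', j)
        if not already_json_string:
            code_content = code_content.replace('\\', '\\\\')   # escape backslashes first
            code_content = code_content.replace('"', '\\"')     # escape double quotes
        code_content = code_content.replace('\n', '\\n').replace('\r', '')  # escape newlines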

        # insert the safe JSON key:value for code
        parts.append(f'"code":"{code_content}"')

        # set last_end to continue after the original code block
        last_end = after_end

    # append the remainder of the JSON string
    parts.append(json_str[last_end:])
    fixed_json_str = ''.join(parts)

    # 3. Parse JSON
    try:
        data = json.loads(fixed_json_str)
    except json.JSONDecodeError as e:
        raise ValueError(f"Failed to parse JSON: {e}\nSanitized JSON:\n{fixed_json_str}")

    return data


# ==============================================================================
# 3. LLM INITIALIZATION
# ==============================================================================
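def get_mock_llm_fallback():
    """Returns a mock LLM as a fallback when Ollama is unreachable."""
    from langchain_core.runnables import RunnableLambda

    def mock_invoke(prompt_value):
        # Inside `prompt | llm` the mock receives the formatted prompt, not the
        # input dict, so the question is recovered from the prompt text.
        text = prompt_value.to_string() if hasattr(prompt_value, "to_string") else str(prompt_value)
        found = re.search(r'\*\*Input question:\*\*\s*(.*?)\s*\*\*Output format\*\*', text, re.DOTALL)
        user_request = found.group(1) if found else text
        code = "# Mock LLM: Please start Ollama and load 'llama3'.\nprint('Mock output!')"
        # json.dumps escapes quotes and newlines, so the mock output is valid JSON
        return json.dumps([{"question": user_request, "code": code}], indent=4)

    # A plain object cannot be piped with `|`; RunnableLambda makes the mock composable
    return RunnableLambda(mock_invoke)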

def initialize_llm_with_ollama(model_name: str = "llama3"):
    print(f"Attempting to initialize Ollama with model: {model_name}...")
    try:
        llm = OllamaLLM(model=model_name, temperature=0.1)
        llm.invoke("Test")  # Test connectivity
        print("Ollama connection successful.\n")
        return llm
    except Exception as e:
        print(f"[ERROR] Ollama connection issue: {e}", file=sys.stderr)
        print("Falling back to Mock LLM.\n", file=sys.stderr)
        return get_mock_llm_fallback()

# Initialize LLM
llm_desc = initialize_llm_with_ollama("llama3")

# ==============================================================================
# 4. MAIN EXECUTION LOGIC
# ==============================================================================
def run_code_generator():
    user_request = input("Enter your Python problem (e.g., Write a Python program to calculate the factorial of a given number using a loop.): ")
    prompt_vars = {'user_request': user_request}

    # Modern LCEL chain: prompt | llm
    desc_prompt = create_description_template()
    chain = desc_prompt | llm_desc

    print("\n--- Invoking LLM Chain ---")
    response_text = chain.invoke(prompt_vars)
    print(response_text)

    # Extraction & parsing
    try:
        data = extract_json_from_llm_output(response_text)  # Python list
        print(data)
        result = data[0]
        print(result)
        extracted_question = result.get('question', 'N/A')
        print(extracted_question)
        generated_code = result.get('code', 'Error: Code not found')
        print(generated_code)

        print("\n--- Extracted Results ---")
        print(f"Original Question: {extracted_question}\n")
        print(f"Generated Python Code:\n{generated_code}\n")

    except Exception as e:
        print(f"\n--- ERROR ---", file=sys.stderr)
        print(f"Could not parse or extract data: {e}", file=sys.stderr)
        print(f"Raw output received:\n{response_text}", file=sys.stderr)

# ==============================================================================
# 5. ENTRY POINT
# ==============================================================================
if __name__ == "__main__":
    run_code_generator()


Attempting to initialize Ollama with model: llama3...
Ollama connection successful.



Enter your Python problem (e.g., Write a Python program to calculate the factorial of a given number using a loop.):  Write a Python program to calculate the factorial of a given number using a loop.



--- Invoking LLM Chain ---
[
    {
        "question": "Write a Python program to calculate the factorial of a given number using a loop.",
        "code": """
def factorial(n):
    # Initialize the result variable
    result = 1
    
    # Loop until n is 0
    for i in range(1, n + 1):
        # Multiply the result by i
        result *= i
    
    return result

# Test the function with a sample input
num = int(input("Enter a number: "))
print(f"The factorial of {num} is {factorial(num)}")
"""
    }
]
[{'question': 'Write a Python program to calculate the factorial of a given number using a loop.', 'code': '\ndef factorial(n):\n    # Initialize the result variable\n    result = 1\n    \n    # Loop until n is 0\n    for i in range(1, n + 1):\n        # Multiply the result by i\n        result *= i\n    \n    return result\n\n# Test the function with a sample input\nnum = int(input("Enter a number: "))\nprint(f"The factorial of {num} is {factorial(num)}")\n'}]
{'question': 'Write a P
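
The raw output above wraps the code in triple quotes, which is not valid JSON, even though the prompt asks the model not to do this. The snippet below is a simplified, standalone sketch of the repair idea, not the notebook's `extract_json_from_llm_output`: it handles only the triple-double-quote case and lets `json.dumps` do the escaping. The `repair_triple_quoted_code` name and the sample string are hypothetical.

```python
import json
import re

# Matches a "code" key whose value is wrapped in triple double quotes.
CODE_FIELD = re.compile(r'"code"\s*:\s*"""(.*?)"""', re.DOTALL)

def repair_triple_quoted_code(raw: str) -> list:
    """Re-encode a triple-quoted "code" value as a JSON string, then parse."""
    # A function replacement avoids backslash processing in the substituted text.
    fixed = CODE_FIELD.sub(lambda m: '"code": ' + json.dumps(m.group(1)), raw)
    return json.loads(fixed)

raw_output = '''[
    {
        "question": "Say hi.",
        "code": """
print("hi")
"""
    }
]'''

parsed = repair_triple_quoted_code(raw_output)
print(parsed[0]["code"])
```

Using `json.dumps` for the escaping covers quotes, backslashes, newlines, and control characters in one call, at the cost of not handling the other malformed cases the notebook function tries to recover from.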

# Chatbot Code C++

In [11]:
import re
import json
import sys
from langchain_core.prompts import PromptTemplate  # langchain_core is installed with langchain
from langchain_ollama import OllamaLLM

# ==============================================================================
# 1. PROMPT TEMPLATE FUNCTION (C++ version)
# ==============================================================================
def create_description_template():
    """
    Creates a PromptTemplate configured for a C++ code generation LLM.
    """
    template = """
You are an expert C++ programmer. The user will provide you with a question or request.
Your task is to write a perfect, optimized, yet simplified C++ code solution for the user's request.

**Instructions:**
- Include comments in the generated code to explain key parts.
- Describe the thinking process briefly in a single comment block.
- Write short, simple, and optimized C++ code that directly solves the user's request.
- Output only a valid JSON array. Do NOT wrap the code in triple quotes.
- Escape all internal double quotes in the code.
- The JSON array must contain exactly one object with the keys "question" and "code".

**Input question:**
{user_request}

**Output format** (MUST be a JSON array with one object):
[
    {{
        "question": "The user's original input question here...",
        "code": "The full, optimized, and simplified C++ code with comments here..."
    }}
]

Return only the JSON array as output.
Begin.
"""
    return PromptTemplate(input_variables=["user_request"], template=template)

# ==============================================================================
# 2. ROBUST JSON EXTRACTION FUNCTION
# ==============================================================================

def extract_json_from_llm_output(output_str: str) -> list:
    """
    Robustly extracts a JSON array from the LLM output, fixing common
    issues like triple-quoted code fields and unescaped double quotes
    and newlines within the code block.
    Returns a Python list.
    """
    import re
    import json

    output_str = output_str.strip()

    # 1. Look for the JSON array structure [ ... ]
    match = re.search(r'\[\s*\{[\s\S]*?\}\s*\]', output_str, re.DOTALL)
    if not match:
        raise ValueError("No JSON array structure found in LLM output.")

    json_str = match.group(0)

    # 2. Find each "code": occurrence and replace its raw content with a safely escaped JSON string
    last_end = 0
    parts = []
    for m in re.finditer(r'"code"\s*:', json_str):
        # if this found "code": is inside a region we've already processed, skip it
        if m.start() < last_end:
            continue

        # append text before this "code":
        parts.append(json_str[last_end:m.start()])

        # advance pointer to char after the colon
        j = m.end()
        # skip whitespace
        while j < len(json_str) and json_str[j].isspace():
            j += 1

        # detect delimiter: triple quotes or single double-quote
        if json_str.startswith('"""', j) or json_str.startswith("'''", j):
            delim = json_str[j:j+3]
            content_start = j + 3
            end_idx = json_str.find(delim, content_start)
            if end_idx == -1:
                raise ValueError("Unterminated triple-quoted code block in LLM output.")
            raw_code = json_str[content_start:end_idx]
            after_end = end_idx + 3
        elif j < len(json_str) and json_str[j] == '"':
            # normal double-quoted string: find the closing unescaped quote
            content_start = j + 1
            k = content_start
            while k < len(json_str):
                if json_str[k] == '\\':
                    k += 2  # skip the backslash and the character it escapes
                    continue
                if json_str[k] == '"':
                    break
                k += 1
            if k >= len(json_str):
                raise ValueError("Unterminated quoted string for code field.")
            raw_code = json_str[content_start:k]
            after_end = k + 1
        else:
            # Unexpected format (for example code left unquoted). Try to recover:
            # capture until the next closing brace of the object or until next "key":
            # this is a best-effort fallback.
            fallback_match = re.search(r'(?:\n\s*\}\s*|\n\s*"[A-Za-z0-9_]+\s*"\s*:)', json_str[j:], re.DOTALL)
            if fallback_match:
                raw_code = json_str[j:j + fallback_match.start()]
                after_end = j + fallback_match.start()
            else:
                raise ValueError("Unexpected format for code field in LLM output.")

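        # --- sanitize/escape the raw code content for JSON ---
        code_content = raw_code
        # A value that started with a single double quote is already a JSON string,
        # so its backslashes and quotes must not be escaped a second time.
        already_json_string = json_str[j:j+1] == '"' and not json_str.startswith('"""', j)
        if not already_json_string:
            code_content = code_content.replace('\\', '\\\\')   # escape backslashes first
            code_content = code_content.replace('"', '\\"')     # escape double quotes
        code_content = code_content.replace('\n', '\\n').replace('\r', '')  # escape newlines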

        # insert the safe JSON key:value for code
        parts.append(f'"code":"{code_content}"')

        # set last_end to continue after the original code block
        last_end = after_end

    # append the remainder of the JSON string
    parts.append(json_str[last_end:])
    fixed_json_str = ''.join(parts)

    # 3. Parse JSON
    try:
        data = json.loads(fixed_json_str)
    except json.JSONDecodeError as e:
        raise ValueError(f"Failed to parse JSON: {e}\nSanitized JSON:\n{fixed_json_str}")

    return data


# ==============================================================================
# 3. LLM INITIALIZATION
# ==============================================================================
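def get_mock_llm_fallback():
    """Returns a mock LLM as a fallback when Ollama is unreachable."""
    from langchain_core.runnables import RunnableLambda

    def mock_invoke(prompt_value):
        # Inside `prompt | llm` the mock receives the formatted prompt, not the
        # input dict, so the question is recovered from the prompt text.
        text = prompt_value.to_string() if hasattr(prompt_value, "to_string") else str(prompt_value)
        found = re.search(r'\*\*Input question:\*\*\s*(.*?)\s*\*\*Output format\*\*', text, re.DOTALL)
        user_request = found.group(1) if found else text
        code = (
            "// Mock LLM: Please start Ollama and load 'llama3'.\n"
            "#include <iostream>\n"
            'int main() { std::cout << "Mock output!"; return 0; }'
        )
        # json.dumps escapes quotes and newlines, so the mock output is valid JSON
        return json.dumps([{"question": user_request, "code": code}], indent=4)

    # A plain object cannot be piped with `|`; RunnableLambda makes the mock composable
    return RunnableLambda(mock_invoke)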

def initialize_llm_with_ollama(model_name: str = "llama3"):
    print(f"Attempting to initialize Ollama with model: {model_name}...")
    try:
        llm = OllamaLLM(model=model_name, temperature=0.1)
        llm.invoke("Test")  # Test connectivity
        print("Ollama connection successful.\n")
        return llm
    except Exception as e:
        print(f"[ERROR] Ollama connection issue: {e}", file=sys.stderr)
        print("Falling back to Mock LLM.\n", file=sys.stderr)
        return get_mock_llm_fallback()

# Initialize LLM
llm_desc = initialize_llm_with_ollama("llama3")

# ==============================================================================
# 4. MAIN EXECUTION LOGIC
# ==============================================================================
def run_code_generator():
    user_request = input("Enter your C++ problem (e.g., Write a C++ function to find the largest number in an integer array.): ")
    prompt_vars = {'user_request': user_request}

    # Modern LCEL chain: prompt | llm
    desc_prompt = create_description_template()
    chain = desc_prompt | llm_desc

    print("\n--- Invoking LLM Chain ---")
    response_text = chain.invoke(prompt_vars)
    print(response_text)

    # Extraction & parsing
    try:
        data = extract_json_from_llm_output(response_text)  # Python list
        print(data)
        result = data[0]
        print(result)
        extracted_question = result.get('question', 'N/A')
        print(extracted_question)
        generated_code = result.get('code', 'Error: Code not found')
        print(generated_code)

        print("\n--- Extracted Results ---")
        print(f"Original Question: {extracted_question}\n")
        print(f"Generated C++ Code:\n{generated_code}\n")

    except Exception as e:
        print(f"\n--- ERROR ---", file=sys.stderr)
        print(f"Could not parse or extract data: {e}", file=sys.stderr)
        print(f"Raw output received:\n{response_text}", file=sys.stderr)

# ==============================================================================
# 5. ENTRY POINT
# ==============================================================================
if __name__ == "__main__":
    run_code_generator()


Attempting to initialize Ollama with model: llama3...
Ollama connection successful.



Enter your C++ problem (e.g., Write a C++ function to find the largest number in an integer array.):  Write a program to add 2 digits.



--- Invoking LLM Chain ---
[
    {
        "question": "Write a program to add 2 digits.",
        "code": """
#include <iostream>

int main() {
    int num1, num2, sum;

    // Ask for two numbers
    std::cout << "Enter the first number: ";
    std::cin >> num1;
    std::cout << "Enter the second number: ";
    std::cin >> num2;

    // Add the numbers
    sum = num1 + num2;

    // Print the result
    std::cout << "The sum is: " << sum << std::endl;

    return 0;
}
"""
    }
]
[{'question': 'Write a program to add 2 digits.', 'code': '\n#include <iostream>\n\nint main() {\n    int num1, num2, sum;\n\n    // Ask for two numbers\n    std::cout << "Enter the first number: ";\n    std::cin >> num1;\n    std::cout << "Enter the second number: ";\n    std::cin >> num2;\n\n    // Add the numbers\n    sum = num1 + num2;\n\n    // Print the result\n    std::cout << "The sum is: " << sum << std::endl;\n\n    return 0;\n}\n'}]
{'question': 'Write a program to add 2 digits.', 'code': '\n#inc
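
To try generated C++ outside the notebook, the extracted `code` string can be written to a source file and then passed to a C++ compiler. The helper below is a minimal sketch; `save_generated_code`, the file name, and the sample code are assumptions made for illustration.

```python
import tempfile
from pathlib import Path

def save_generated_code(code: str, filename: str = "solution.cpp") -> Path:
    """Write generated source code to a fresh temporary directory and return its path."""
    target = Path(tempfile.mkdtemp()) / filename
    target.write_text(code, encoding="utf-8")
    return target

# Illustrative stand-in for the "code" value extracted from the model output.
sample_code = '#include <iostream>\nint main() { std::cout << "ok"; return 0; }\n'
path = save_generated_code(sample_code)
print(f"Saved to {path}")
```

Writing to a temporary directory keeps generated files out of the project folder until the code has been reviewed.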

# Test the code generated by the bot in the cell below

In [None]:
# Write a Python function that takes a list of names and returns a dictionary
# where keys are the first letters and values are lists of names starting with that letter.

def group_names_by_first_letter(names):
    # Initialize an empty dictionary to store the result
    result = {}
    
    # Iterate over each name in the input list
    for name in names:
        # Get the first letter of the current name
        first_letter = name[0].upper()
        
        # If the first letter is not already a key in the result dictionary, add it with an empty list as its value
        if first_letter not in result:
            result[first_letter] = []
        
        # Add the current name to the list of names starting with the same first letter
        result[first_letter].append(name)
    
    return result
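
A quick check of the generated function might look like the following. The function is repeated so the snippet runs on its own, and the sample names are made up. Note that the function as generated would raise an `IndexError` on an empty string, so the sample avoids empty names.

```python
def group_names_by_first_letter(names):
    """Group names by their upper-cased first letter (copied from the cell above)."""
    result = {}
    for name in names:
        first_letter = name[0].upper()
        if first_letter not in result:
            result[first_letter] = []
        result[first_letter].append(name)
    return result

# Mixed-case input shows that grouping ignores the case of the first letter.
sample_names = ["Alice", "adam", "Bob"]
grouped = group_names_by_first_letter(sample_names)
print(grouped)
```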