# Mock Technical Interviewer with Gemini & LangGraph

**Google GenAI Capstone Project**

**Author:** Maksym Rimashevskyi

## 📖 1. Introduction: Bridging the Technical Interview Preparation Gap with GenAI

### The Problem: The High Stakes and Limitations of Traditional Technical Interview Practice 😩
Preparing for technical interviews in the competitive tech industry is a critical yet often inefficient and stressful process. Candidates frequently encounter limitations such as:

*   **😬 Lack of Realistic Simulation:** Existing resources often fail to replicate the dynamic, conversational nature and pressure of a real technical interview.
*   **🤦‍♀️ Subjective and Delayed Feedback:** Feedback from human mock interviews can be subjective, inconsistent, and often comes with delays, hindering rapid learning.
*   **💸 Limited Accessibility and Scalability:** Access to experienced human interviewers for practice is often restricted by cost and availability.
*   **🧭 Difficulty in Identifying Specific Weaknesses:** Candidates may struggle to pinpoint precise areas for improvement beyond general feedback.

### The Solution: An Intelligent GenAI-Powered Mock Interview Platform 🧠

This project introduces an innovative solution: an intelligent, AI-driven Mock Technical Interview Platform designed to provide realistic, interactive, and personalized interview practice. By harnessing the advanced capabilities of Google's Gemini models and the LangGraph framework, this platform offers a transformative approach to technical interview preparation.

**Key Features:**
*   **🗣️ Dynamic Interview Simulation (LangGraph Agents):** Employs LangGraph agents powered by Gemini to create a stateful, multi-turn conversational experience that mirrors a real technical interview, guiding candidates through problem-solving.
*   **💻 Interactive Problem Engagement (Gemini):** Allows candidates to engage with coding problems through text, code input, and even visual explanations via whiteboard sketches interpreted by Gemini's image understanding capabilities.
*   **💡 Intelligent Guidance and Probing (Few-Shot Learning):** Leverages few-shot learning within prompts to enable Gemini to provide contextually relevant hints, ask clarifying questions, and adapt its approach based on the candidate's responses, mimicking the nuanced interaction of a human interviewer.
*   **❓ Knowledge Retrieval for Questioning (Function Calling):** Utilizes function calling to dynamically retrieve relevant interview questions from a database, ensuring a diverse and potentially tailored interview experience.
*   **📊 Comprehensive and Structured Performance Evaluation (Structured Output):** Generates detailed, structured reports using Gemini's controlled generation capabilities, objectively assessing technical proficiency, problem-solving methodology, communication clarity, and code quality.
*   **📚 Personalized Learning and Skill Enhancement (Grounding):** Employs Gemini's grounding capabilities (integrating with external knowledge sources) to analyze the interview performance and generate a targeted learning plan with relevant resources, directly addressing identified weaknesses.
*   **🧠 Contextual Awareness and Long-Term Analysis (Long Context Window):** Leverages Gemini's long context window to maintain a comprehensive understanding of the entire interview conversation, enabling nuanced feedback and holistic performance analysis.

### Innovation & Use Case Suitability ✨
This project represents a significant step forward in technical interview preparation through its creative and impactful application of Generative AI.
*   **🌟 Novelty:** It goes beyond simple Q&A by constructing a sophisticated, stateful interviewing agent capable of understanding multimodal input, providing adaptive guidance, and generating nuanced evaluations – a level of interaction previously only achievable with human interviewers. The integration of whiteboard understanding directly into a technical interview simulation is particularly novel.
*   **🚀 Impact:** This platform has the potential to democratize access to high-quality interview practice, enabling a wider range of candidates to prepare effectively, reduce interview anxiety, and ultimately improve their chances of success in the competitive tech job market. Its scalability allows for widespread adoption and consistent, objective feedback.
*   **💯 GenAI Suitability:** The project's core functionalities are exceptionally well-suited to the strengths of modern GenAI models like Gemini:
    *   **🗣️ Natural Language Understanding and Generation:** Crucial for engaging in a realistic dialogue, interpreting candidate responses, and formulating insightful questions and feedback.
    *   **💻 Code Understanding and Generation:** Essential for interacting with code snippets provided by the candidate and even generating follow-up questions based on the code.
    *   **🖼️ Image Understanding:** Enables the innovative feature of interpreting whiteboard sketches, a common element in real technical interviews.
    *   **📑 Structured Output and Function Calling:** Allows for the systematic retrieval of questions and the generation of organized and informative performance reports.
    *   **🤖 Agentic Capabilities (LangGraph):** Necessary for orchestrating the complex, multi-step process of a technical interview, managing state, and directing the flow of the conversation.
    *   **⏳ Long Context Window:** Enables the AI to maintain context throughout the interview for more accurate and relevant feedback.
    *   **🌍 Grounding:** Facilitates the generation of personalized and actionable learning plans based on external knowledge.

By seamlessly integrating these advanced GenAI capabilities, this project offers a powerful and innovative solution to the persistent challenges of technical interview preparation.

### ❗ **IMPORTANT!**

The app built in this notebook uses Gradio's `launch` function to run a UI. That call is commented out so that you can use the `Run all` feature without interruption. At the end of this notebook you will need to uncomment the `demo.launch(...)` call in order to interact with the app.

If you wish to save a version of this notebook with `Save and Run all`, you will need to **re-comment** the line you uncommented so that the notebook can run without human input.

## ⚙️ 2. Environment Setup

This section handles the necessary setup for the project environment.

1.  **Package Installation:** We first uninstall potentially conflicting default Kaggle packages and then install the required libraries:
    *   `gradio`: For building the interactive web UI.
    *   `langgraph`: The core framework for building the stateful agent.
    *   `langchain-google-genai`: Provides LangChain integrations for Gemini models.
    *   `langgraph-prebuilt`: Contains pre-built LangGraph components (though we build custom ones).
    *   `google-genai`: Provides an interface to integrate Google's generative models.
    *   Other dependencies implicitly installed include `langchain-core`, `pydantic`, etc.
2.  **Library Imports:** We import standard Python libraries, third-party tools (like `gradio`, `pandas`, `PIL`), and specific components from `langchain`, `langgraph`, and `google.genai`.
3.  **API Key Configuration:** We securely load the `GOOGLE_API_KEY` using Kaggle Secrets and set it as an environment variable for the Gemini client.

### 2.1 Install packages

Remove conflicting packages from the Kaggle base environment and install langgraph and google-genai packages.

In [1]:
!pip uninstall -qqy jupyterlab libpysal thinc spacy fastai ydata-profiling google-cloud-bigquery google-generativeai pydantic

!pip install -qqU gradio langgraph==0.3.21 langchain-google-genai==2.1.2 langgraph-prebuilt==0.1.7 google-genai==1.7.0


### 2.2 Import libraries

In [2]:
# Standard Library
import base64
import json
import os
from io import BytesIO
from typing import Annotated, List, Literal, Optional, TypedDict, Union, Dict, Any, Tuple

# Third-Party
import gradio as gr
import numpy as np
import pandas as pd
import PIL.Image
import requests
from google import genai
from google.genai import types
from IPython.display import Image, Markdown, display
from jinja2 import Template
from kaggle_secrets import UserSecretsClient
from langchain_core.messages import BaseMessage
from langchain_core.messages.ai import AIMessage
from langchain_core.messages.human import HumanMessage
from langchain_core.messages.system import SystemMessage
from langchain_core.messages.tool import ToolMessage
from langchain_core.runnables import Runnable
from langchain_core.tools import BaseTool, tool
from langchain_google_genai import ChatGoogleGenerativeAI
from langgraph.graph import END, START, StateGraph 
from langgraph.graph.message import add_messages
from langgraph.prebuilt import ToolNode
from pydantic import BaseModel, Field, ConfigDict

genai.__version__

'1.7.0'

### 2.3 Loading secrets

To run the following cell, your API key must be stored in a [Kaggle secret](https://www.kaggle.com/discussions/product-feedback/114053) named `GOOGLE_API_KEY`.
 
If you don't already have an API key, you can grab one from [AI Studio](https://aistudio.google.com/app/apikey). You can find [detailed instructions in the docs](https://ai.google.dev/gemini-api/docs/api-key).
 
To make the key available through Kaggle secrets, choose `Secrets` from the `Add-ons` menu and follow the instructions to add your key or enable it for this notebook.

In [3]:
GOOGLE_API_KEY = UserSecretsClient().get_secret("GOOGLE_API_KEY")
os.environ["GOOGLE_API_KEY"] = GOOGLE_API_KEY

If you received an error response along the lines of `No user secrets exist for kernel id ...`, then you need to add your API key via `Add-ons`, `Secrets` **and** enable it.
 
![Screenshot of the checkbox to enable GOOGLE_API_KEY secret](https://storage.googleapis.com/kaggle-media/Images/5gdai_sc_3.png)
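
Outside Kaggle, the `kaggle_secrets` package is not available, so the key has to come from somewhere else. The following is a minimal sketch, assuming the key is exported as an environment variable; `load_api_key` is a hypothetical helper and not part of this notebook.

```python
import os


def load_api_key(env=os.environ, name="GOOGLE_API_KEY"):
    """Return the API key from a mapping, or raise a clear error.

    Hypothetical helper for illustration; the notebook itself reads the
    key through Kaggle Secrets.
    """
    key = env.get(name, "").strip()
    if not key:
        raise RuntimeError(
            f"{name} is not set. Add it as a Kaggle secret or export it "
            "as an environment variable before running the notebook."
        )
    return key


# Usage with an explicit mapping, so the example does not depend on the
# real environment.
example_key = load_api_key({"GOOGLE_API_KEY": "  my-key  "})
```

Failing early with an explicit message tends to be easier to debug than an authentication error raised later by the first model call.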

**Automated retry**

Set up an automatic retry so that requests are retried when the per-minute quota is reached (HTTP 429) or the service is temporarily unavailable (HTTP 503).

In [4]:
from google.api_core import retry

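# Retry only on rate limiting (429) and temporary unavailability (503).
is_retriable = lambda e: (isinstance(e, genai.errors.APIError) and e.code in {429, 503})

# Wrap generate_content only once: `__wrapped__` marks an already-wrapped
# function, so re-running this cell does not stack retries.
if not hasattr(genai.models.Models.generate_content, '__wrapped__'):
  genai.models.Models.generate_content = retry.Retry(
      predicate=is_retriable)(genai.models.Models.generate_content)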
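
To make the retry behaviour concrete, here is a dependency-free sketch of the same idea: call a function, and if it raises an error that the predicate accepts, wait and try again with a growing delay. `FakeAPIError` and `call_with_retry` are hypothetical names used only for illustration. This is not how `google.api_core.retry.Retry` is implemented, only a simplified picture of predicate-based retry with exponential backoff.

```python
import time


class FakeAPIError(Exception):
    """Stand-in for an API error that carries an HTTP status code (hypothetical)."""

    def __init__(self, code):
        super().__init__(f"HTTP {code}")
        self.code = code


def is_retriable(exc):
    # Same rule as the notebook's predicate: retry only on 429 and 503.
    return isinstance(exc, FakeAPIError) and exc.code in {429, 503}


def call_with_retry(func, predicate, max_attempts=5, initial_delay=0.01,
                    sleep=time.sleep):
    """Call func(); on a retriable error, wait and retry with a doubled delay."""
    delay = initial_delay
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except Exception as exc:
            # Give up on the last attempt or on errors the predicate rejects.
            if attempt == max_attempts or not predicate(exc):
                raise
            sleep(delay)
            delay *= 2


# Usage: a flaky function that fails twice with 429, then succeeds.
calls = {"n": 0}


def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise FakeAPIError(429)
    return "ok"


result = call_with_retry(flaky, is_retriable, sleep=lambda _: None)
```

Errors that the predicate rejects (for example a 400) propagate immediately, which keeps genuine request bugs from being hidden behind retries.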

## 🧩 3. Data Preparation: Interview Questions

To provide relevant coding challenges, the interviewer needs access to a database of problems.

*   **Data Source:** We download a `data.json` file containing a list of coding problems. This file is a simulated database of questions collected from [LeetCode](https://leetcode.com) that includes fields like `id`, `problem_name`, `topic`, `difficulty`, `link`, `companies`, `content` (problem description), and `code` (starter code). The data is in JSON format and is hosted as a GitHub gist.
*   **Loading Data:** The JSON data is loaded into a Pandas DataFrame (`df`) for easy filtering and sampling within the application logic. This DataFrame serves as the knowledge base for the interview questions.

In [5]:
!wget -O data.json https://gist.githubusercontent.com/MaxDatex/474441cecf63e4a71138029f5d9ec1d6/raw/a43227f3f671fb09b08285e8097756ad6e050d2d/data.json

--2025-04-17 20:58:20--  https://gist.githubusercontent.com/MaxDatex/474441cecf63e4a71138029f5d9ec1d6/raw/a43227f3f671fb09b08285e8097756ad6e050d2d/data.json
Resolving gist.githubusercontent.com (gist.githubusercontent.com)... 185.199.108.133, 185.199.110.133, 185.199.109.133, ...
Connecting to gist.githubusercontent.com (gist.githubusercontent.com)|185.199.108.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 20937 (20K) [text/plain]
Saving to: ‘data.json’


2025-04-17 20:58:21 (2.72 MB/s) - ‘data.json’ saved [20937/20937]



In [6]:
file_path = "data.json"

try:
    with open(file_path, 'r') as f:
        problems_data = json.load(f)
    print("JSON data loaded successfully!")
except FileNotFoundError:
    print(f"Error: File not found at {file_path}")
except json.JSONDecodeError:
    print(f"Error: Could not decode JSON from {file_path}. Check the file content.")

JSON data loaded successfully!


In [7]:
df = pd.read_json(file_path)
df.head()

| | id | topic | link | problem_name | difficulty | companies | content | code |
|---|---|---|---|---|---|---|---|---|
| 0 | 0 | Array Manipulation | https://leetcode.com/problems/two-sum | Two sum | Easy | [OpenAI, Google, Amazon, Meta, Netflix, Apple] | Given an array of integers nums and an integer... | `class Solution:\n def twoSum(self, nums: Li...` |
| 1 | 1 | Array Manipulation | https://leetcode.com/problems/container-with-m... | Container with most water | Medium | [Google, Amazon] | 'You are given an integer array height of leng... | `class Solution:\n def maxArea(self, height:...` |
| 2 | 2 | Array Manipulation | https://leetcode.com/problems/3sum | 3sum | Medium | [OpenAI, Google, Amazon, Meta, Netflix, Apple] | Given an integer array nums, return all the tr... | `class Solution:\n def threeSum(self, nums: ...` |
| 3 | 3 | String Manipulation | https://leetcode.com/problems/longest-substrin... | Longest substring without repeating characters | Medium | [OpenAI, Google, Amazon, Meta, Netflix, Apple] | Given a string s, find the length of the longe... | `class Solution:\n def lengthOfLongestSubstr...` |
| 4 | 4 | String Manipulation | https://leetcode.com/problems/string-to-intege... | String to integer (atoi) | Medium | [Meta, Amazon] | Implement the myAtoi(string s) function, which... | `class Solution:\n def myAtoi(self, s: str) ...` |
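
As an illustration of the "filtering and sampling" mentioned above, the sketch below picks a random problem that matches an optional topic and difficulty. It works on a plain list of dicts (the shape `json.load` returns for `data.json`) so that it runs without pandas. `pick_problem` and `sample_problems` are hypothetical names, and the notebook's actual selection logic may differ.

```python
import random

# Tiny stand-in for the records in data.json; only a few fields are shown.
sample_problems = [
    {"id": 0, "problem_name": "Two sum",
     "topic": "Array Manipulation", "difficulty": "Easy"},
    {"id": 1, "problem_name": "Container with most water",
     "topic": "Array Manipulation", "difficulty": "Medium"},
    {"id": 2, "problem_name": "3sum",
     "topic": "Array Manipulation", "difficulty": "Medium"},
    {"id": 3, "problem_name": "Longest substring without repeating characters",
     "topic": "String Manipulation", "difficulty": "Medium"},
]


def pick_problem(problems, topic=None, difficulty=None, rng=random):
    """Return one random problem matching the optional filters, or None."""
    matches = [
        p for p in problems
        if (topic is None or p["topic"].lower() == topic.lower())
        and (difficulty is None or p["difficulty"].lower() == difficulty.lower())
    ]
    return rng.choice(matches) if matches else None


# Usage: any medium-difficulty array problem (case-insensitive match).
chosen = pick_problem(sample_problems, topic="array manipulation",
                      difficulty="medium")
```

With the DataFrame, the same idea can be expressed as a boolean mask followed by `df[mask].sample(1)`; returning `None` for an empty match lets the caller tell the candidate that no such question exists instead of raising.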


## 🧠 4. Core Logic: The Technical Interviewer Agent (LangGraph)

The heart of this application is the agent built using LangGraph, designed to simulate the flow and logic of a technical interview. LangGraph allows us to define a state machine or graph where nodes represent processing steps (like invoking the LLM or tools) and edges represent the flow between these steps.
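
To make the node/edge idea concrete before the real graph is built, here is a dependency-free toy state machine. It deliberately does not use the LangGraph API: the node names, the `run` loop, and the `pending_tool` field are hypothetical and serve only to show how nodes transform a shared state while edges decide which node runs next.

```python
from typing import Callable, Dict

State = Dict[str, object]


def interviewer(state: State) -> State:
    # Stand-in for an LLM call: request a tool until a question is loaded.
    if "question" not in state:
        return {**state, "pending_tool": "get_question"}
    return {**state, "reply": f"Let's discuss: {state['question']}",
            "pending_tool": None}


def tools(state: State) -> State:
    # Stand-in for a tool call that fetches a question from the database.
    return {**state, "question": "Two sum", "pending_tool": None}


def route_after_interviewer(state: State) -> str:
    # Conditional edge: go to the tool node only if a tool was requested.
    return "tools" if state.get("pending_tool") else "END"


NODES: Dict[str, Callable[[State], State]] = {
    "interviewer": interviewer,
    "tools": tools,
}
EDGES: Dict[str, Callable[[State], str]] = {
    "interviewer": route_after_interviewer,
    "tools": lambda state: "interviewer",  # tool results go back to the LLM
}


def run(state: State, start: str = "interviewer", max_steps: int = 10) -> State:
    """Run nodes and follow edges until an edge returns "END"."""
    current = start
    for _ in range(max_steps):
        state = NODES[current](state)
        current = EDGES[current](state)
        if current == "END":
            return state
    raise RuntimeError("graph did not finish within max_steps")


final_state = run({})
```

The run visits `interviewer`, then `tools`, then `interviewer` again, which is the same request-tool-then-answer loop that the agent relies on.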

### 4.1. Agent Prompts 🔑➡️🤖

Prompts are crucial for guiding the Gemini model's behavior. We define several key prompts:

*   **`INTERVIEWER_SYSTEM_PROMPT`:** This is the main prompt defining the AI's persona and core responsibilities.
    *   **Persona:** A supportive technical interviewer from "Mock Technologie Inc."
    *   **Goal:** Evaluate the candidate's skills while guiding them through a problem.
    *   **Behavior:** Ask probing questions, provide *graduated hints* (Level 1-3), encourage "thinking out loud," confirm understanding.
    *   **Constraints:** Must use tools to get questions, cannot make up questions, needs confirmation for starting/ending the interview.
    *   **Multi-Modal Handling:** Explicit instructions on how to analyze and integrate information from whiteboard screenshots seamlessly into the conversation.
    *   **Few-Shot Examples:** Includes example interactions to demonstrate desired behavior, especially regarding hint provision and whiteboard analysis.
*   **`WELCOME_MSG`:** The initial message the interviewer presents to the user.
*   **`CANDIDATE_EVALUATION_PROMPT`:** Used *after* the interview ends. It instructs a separate LLM call to analyze the full transcript and final code, evaluating the candidate based on specific criteria (Technical Competence, Problem-Solving, Communication). Crucially, it requests the output in a **structured JSON format**.
*   **`RESOURCES_SEARCH_PROMPT`:** Used to generate the personalized learning plan. It provides context (interview question, evaluation summary, topics to learn) and instructs the LLM to use the **Google Search tool (Grounding)** to find relevant resources and synthesize recommendations, *without* explicitly listing URLs in the text (citations are handled automatically).
*   **`DESCRIBE_IMAGE_PROMPT`:** A specialized prompt used by the `get_interview_transcript` utility function. When a whiteboard image is present in the user's message, this prompt asks Gemini to describe the image's content and relevance to the ongoing conversation, making the transcript more comprehensive for the final evaluation.
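
Requesting structured JSON pays off because the result can be checked before it is shown to the user. The sketch below parses a JSON evaluation into a dataclass and rejects payloads with missing fields. The field names are hypothetical, loosely mirroring the sections the evaluation prompt asks for, and the payload is a made-up example. The notebook imports `pydantic`, which is presumably what it uses for this; the standard-library version here only illustrates the idea.

```python
import json
from dataclasses import dataclass, fields
from typing import List


@dataclass
class Evaluation:
    # Hypothetical schema; the notebook's real schema may differ.
    overall_summary: str
    strengths: List[str]
    areas_for_development: List[str]
    topics_to_learn: List[str]


def parse_evaluation(raw: str) -> Evaluation:
    """Parse a JSON string and fail loudly if a required field is missing."""
    data = json.loads(raw)
    names = [f.name for f in fields(Evaluation)]
    missing = [name for name in names if name not in data]
    if missing:
        raise ValueError(f"missing fields: {missing}")
    return Evaluation(**{name: data[name] for name in names})


# Usage with a made-up payload standing in for the model's response.
raw_response = json.dumps({
    "overall_summary": "Solid approach, minor bug in the lookup order.",
    "strengths": ["Explained the hash map approach clearly"],
    "areas_for_development": ["Edge cases with duplicate values"],
    "topics_to_learn": ["Hash maps", "Edge-case testing"],
})
evaluation = parse_evaluation(raw_response)
```

A typed object like this makes it straightforward to pass `topics_to_learn` on to the resource-search step without re-parsing free text.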

**A few words about the following prompt**

The first iteration of this prompt was based on online material about how to be a good technical interviewer.
I then refined it step by step so that the model did what I wanted and behaved more like a real technical interviewer.
A few changes were needed to make the model work properly:
1. Add a line telling the model to ask the user for confirmation (the confirmation is used to invoke tools).
2. Emphasize that the model must take questions only from the database (using tools).
3. During testing, the model sometimes gave strange responses. Adding a few examples (few-shot learning) helps mitigate this.

The prompt could still be improved in many ways, but that requires more testing.

In [8]:
INTERVIEWER_SYSTEM_PROMPT = '''

COMPANY NAME: "Mock Technologie Inc."
You are a technical interviewer and an expert in software engineering, technical interviewing, and pedagogical best practices. 
Your primary goal is to evaluate a candidate's technical skills, problem-solving abilities, and relevant experience to determine if they are a suitable fit for a specific technical role within the company.
You should keep the candidate actively engaged and progressing through the given problem. You will provide hints, guidance, and ask probing questions to facilitate the candidate's problem-solving process. 
You are designed to be supportive, encouraging, and focused on helping the candidate demonstrate their abilities. 

You should ask the user to choose a question for the technical interview. The user can choose a specific question or a random one. You can NOT start the interview if you have not received an interview question.
You should take questions only from the question database. You can access the question database ONLY by using tools. Do NOT make up questions!
You should ask the candidate (user) to confirm the selected question. For example: "Are you sure that you want to choose the following question: question_name".
You should ask the candidate (user) to confirm that they want to end the interview if you think they do. For example: "Are you sure you want to end the interview?"
Only ask probing questions or give hints if you think the candidate is struggling.


**I. Core Principles:**

*   **Facilitating Problem-Solving:** Your focus is on guiding the candidate through the problem, not solving it for them.
*   **Encouraging Communication:**  Prompt the candidate to explain their thought process and reasoning.
*   **Providing Strategic Hints:** Offer hints in a graduated manner to help the candidate overcome obstacles.
*   **Positive and Supportive Tone:**  Create a comfortable environment where the candidate feels encouraged to explore solutions.

**II. Interview Execution Guidance:**

*   **Problem Definition Confirmation:** Ensure the candidate understands the problem statement completely. Ask them to reiterate the problem in their own words. "So, just to confirm, can you please explain the problem and expected inputs/outputs back to me?"
*   **Clarifying Questions:** Encourage the candidate to ask clarifying questions before they start coding. "What questions do you have about the requirements or constraints before we begin?"
*   **Observing and Listening:** Pay close attention to the candidate's thought process.
*   **Prompting Explanation:**  Actively ask the candidate to "think out loud." Use prompts like:
    *   "What are you thinking?"
    *   "What's your next step?"
    *   "Why did you choose that approach?"
    *   "What are the potential trade-offs of this approach?"
*   **Code Implementation:** Ask the candidate to code their solution once they have described it in words. If the candidate first decides to try a brute-force approach, you should still ask them to code it.
*   **Strategic Hint Provision:** Offer hints in a graduated manner:
    *   **Level 1 (General):** "Have you considered [data structure/algorithm]?", "What's the time complexity of your current approach?", "Are there any edge cases you haven't accounted for?"
    *   **Level 2 (Specific):** "You could use a hash map to [solve specific sub-problem].", "Try using a two-pointer approach when [condition exists].", "What should happen if the input is null or empty?"
    *   **Level 3 (Code Snippet/Conceptual Reminder):** "In [language], you could use [built-in function] to [achieve specific task]." "Remember that Depth First Search involves exploring as far as possible down each branch."  Use code snippets SPARINGLY and ONLY after exhausting other options. Focus on conceptual understanding, not direct code completion.
*   **Redirecting if Necessary:**  If the candidate is completely stuck, gently redirect them to a different approach or a simpler sub-problem.  "Perhaps we can focus on [simplified task] first, to get a better understanding of the core concept."

**Example Interactions:**

*Example 1: Candidate is slightly stuck on approach*
Candidate Response: "I'm trying to think how to find the pairs efficiently. Maybe sort the array first?"
Interviewer Output/Response: "Sorting is an interesting idea. What would be the time complexity of sorting, and how would you use the sorted array to find the pair?" (This is a Level 1 hint, guiding towards complexity and next steps without giving away hash maps yet).

*Example 2: Candidate makes a small logic error in code*
Candidate Response: "Here's my code using the hash map."
Candidate Code Snippet: (Python code for Two Sum using hash map, but checks `if target - nums[i] in hm:` *before* putting `nums[i]` in the map)
Interviewer Output/Response: "Thanks for sharing the code. It looks like you're correctly using a hash map. Let's trace it with `nums = [3, 2, 4]` and `target = 6`. What happens in the first iteration when `i=0` and `nums[i]=3`? What gets checked in the hash map, and what gets added?" (This is probing/guiding towards the specific error without explicitly stating it).

*Example 3: Candidate needs a more specific hint*
Candidate Response: "I traced it, but I'm still not sure why it wouldn't work for certain cases." (After Example 2 trace)
Interviewer Output/Response: "Okay, consider the order. Right now, you check if the complement (`target - nums[i]`) exists in the map *before* you add the current number (`nums[i]`) to the map. What if the complement *is* the current number you're processing, like in the `[3, 3]` target 6 example?" (This is a Level 2 hint pointing more directly at the logic flaw).
      
**III. Input Format:**

You will receive the following inputs:

*   **Candidate Response:** (Transcripts of the candidate's verbal explanations, thinking process, etc.)
*   **Candidate Code Snippet (optional):** (Optional candidate code snippet. Candidate will provide code only when ready).
*   **Screanshot of the user whiteboard (optional):** (Optional screenshot of the candidate's whiteboard. On the whiteboard, the candidate can explain their thoughts more clearly).
*   **Instruction:** If a whiteboard image is provided, you MUST analyze its content (diagrams, pseudocode, logic flow, data structures etc.). **Seamlessly integrate your understanding and interpretation of the whiteboard into your main conversational response.** Don't just mention the image exists; refer to *specifics* you observe in it. Use this visual context to:
        *   Confirm your understanding ("Okay, seeing the diagram confirms you're planning to use a hash map...").
        *   Connect it to their words or code ("That flow you drew on the whiteboard matches your explanation well..." or "The way you've depicted the pointers on the whiteboard helps clarify your code's logic...").
        *   Ask targeted questions based on the visual ("Looking at your whiteboard, what happens in the edge case where...?").
        *   Point out discrepancies if the visual conflicts with other inputs ("Your diagram seems to show [X], but in your code, it looks like [Y]. Could you clarify that step?").
*   **Crucially, do NOT make up any diagram or explanation. If the content of the whiteboard is not related to the current dialogue or problem, you should say "Content of the whiteboard is not related to the question/conversation". Do NOT create a separate 'Whiteboard Analysis' section in your output.** Weave these observations naturally into your dialogue with the candidate.


**IV. Output Format:**

Your output should contain the following:

**Response:** (Your response to the candidate. You may or may not include probing questions, strategic hints, or guiding suggestions in your response.)
**Response may include:**
*   **Next Probing Question:** (Ask a question to encourage the candidate to elaborate on their approach, clarify their reasoning, or identify potential issues. This question might stem from your whiteboard analysis.)
*   **Strategic Hint (if necessary):** (Provide a hint appropriate for the candidate's current situation, following the Level 1-3 progression. Only provide a hint if the candidate is clearly stuck and not making progress.)
*   **Guiding Suggestion (if necessary):** (If the candidate needs a nudge, suggest a specific action they could take to move forward. E.g., "Try writing a simple test case to verify your logic.")

**Example Input 1 (with Whiteboard):**

```
Candidate Response: "Okay, so I'll use a hash map. I'll iterate through the array, and for each element, I'll calculate the complement needed to reach the target. I'll check if the complement is already in the hash map. If it is, I return the indices. If not, I add the current element and its index to the map."
Candidate Code Snippet: (Code implementing the hash map approach, maybe with a small bug)
Screanshot of the user whiteboard: (Image showing an array [2, 7, 11, 15], target 9, and a box labeled 'HashMap' with arrows indicating lookups and insertions, possibly showing {7: 0} after the first step.)
```

**Example Output 1 (with Whiteboard Analysis):**

```
**Response: ** "That's a good explanation of the hash map approach, and the diagram on your whiteboard clearly illustrates that key idea of storing complements and their indices, like storing {7: 0} after the first step. Your code snippet looks close. Let's trace it with your example [2, 7, 11, 15] and target 9, keeping your diagram in mind. What happens in the code when i is 1 and the element is 7?"
**Optional hints:**
1.  **Next Probing Question:** "Based on your whiteboard diagram, what value would you expect to find in the hash map when processing the number 7?"
2.  **Strategic Hint (if necessary):** Level 2: "Think about the order of operations. Should you check the hash map before or after adding the current element to it?"
3.  **Guiding Suggestion (if necessary):** "Add a print statement inside your loop to see the state of the hash map at each step."
```

**Example Input 2 (with Whiteboard):**

```
Candidate Response: "I think we can iterate over the array to find maximum value, then we should pop this value and do it again. On the third time we will find third biggest number"
Candidate Code Snippet: (No code provided)
Screanshot of the user whiteboard: (Image showing a drawing of a snake that ate an elephant)
```

**Example Output 2 (with Whiteboard Analysis):**

```
**Response: ** "It's a great first step toward the right solution, but the content of the whiteboard is not related to the question. This brute-force approach could work. Can you elaborate a bit more on this solution?"
```

'''

WELCOME_MSG = '''Hello! I'm a technical interviewer for Mock Technologie Inc. I'm here to help you demonstrate your software engineering skills.

To start, please choose a question for the technical interview. You can either pick a specific question you'd like to work on, or I can select one randomly for you. Let me know what you'd prefer!'''

In [9]:
CANDIDATE_EVALUATION_PROMPT = '''
Your Role: You are an experienced Technical Hiring Manager/Interviewer. Your task is to evaluate a candidate's suitability for a specific technical role based solely on the provided interview transcript.

Inputs You Will Be Given:

    [Interview question]: Question that was asked to the candidate.
    [Interview Transcript]: The full text of the conversation between the interviewer(s) and the candidate.
    [Candidate code solution]: Final version of the code solution for the problem

Your Task:

Analyze the provided [Interview Transcript] to assess the candidate's qualifications and fit for the Software Engineer role. Focus on the following key areas, using specific examples and quotes from the transcript as evidence:

Evaluation Criteria:
    Technical Competence:
        Problem Understanding: Did they ask clarifying questions? Did they accurately restate the problem and constraints? Did their whiteboard diagram (if provided) accurately reflect the problem?
        Approach & Algorithm Design: Did they discuss trade-offs (time/space complexity)? Was the chosen approach suitable? Did they break down the problem effectively? Did their whiteboard explanation (if provided) align with their verbal approach and demonstrate logical thinking?
        Coding & Implementation (as described/discussed): Does their described logic seem sound? Did they discuss data structures/algorithms appropriately? Did they consider edge cases in their plan?
        Testing & Verification: Did they propose tests or walk through examples (including edge cases)?
        Debugging & Correction: How did they respond to identified issues or prompts for correction? Could they identify flaws?

    Problem-Solving & Critical Thinking:
        Systematic Approach: Did they follow a logical process?
        Adaptability: How did they handle roadblocks or hints? Did they explore alternatives?
        Optimization: Did they consider performance? Could they articulate why one approach might be better?

    Communication & Collaboration:
        Clarity of Thought: Could they articulate their thought process clearly while solving problems? Was their explanation easy to follow?
        Active Listening: Did they seem to understand the interviewer's questions and feedback?
        Asking Questions: Did they ask relevant clarifying questions about the problem? Did they ask thoughtful questions about the role/team (if applicable in the transcript)?
        Receiving Feedback: How did they react to constructive criticism or alternative suggestions (receptive, defensive, collaborative)?
        Professionalism: Was their language professional and respectful?

Required Output Format:

Structure your evaluation clearly using the following sections:
    Overall Summary: A brief (2-3 sentence) overview of the candidate's performance and your high-level recommendation (e.g., Strong Hire, Hire, Lean Hire, No Hire, Needs Further Discussion).
    Strengths: List 5 key strengths observed during the interview. For each strength, provide specific examples or brief quotes from the transcript as evidence. Evidence should be brief but descriptive. Avoid evidence like "num[i]". You can quote a larger code snippet with text formatting to emphasize a correct or interesting implementation.
    Areas for Development / Concerns: List at least 5 key weaknesses or areas where the candidate struggled or raised concerns. For each point, provide specific examples or brief quotes from the transcript as evidence. Avoid evidence like "num[i]" in a case where it should be nums[j]. You can quote a larger code snippet with text formatting to emphasize the error.
    Detailed Analysis: Extensively elaborate on the candidate's performance within each of the main evaluation criteria (Technical, Problem-Solving, Communication), referencing transcript evidence.
    Final Recommendation & Justification: Clearly state your final hiring recommendation and provide an extensive justification linking back to the key strengths and weaknesses. Give recommendations on what to improve in the context of problem solving and interview communication.
    Topics to learn: Provide a list of topics to learn for the user to improve their skills in their areas for development.

Important Guidelines:
    Base your entire evaluation strictly on the provided transcript text. Do not infer information not present.
    Be objective and analytical.
    Cite specific examples or quote relevant snippets from the transcript to support your points.
    Maintain a professional tone.


# Inputs:

## Interview question
{question}

## Interview transcript
{transcript}

## Candidate code solution
{code}
'''

In [10]:
RESOURCES_SEARCH_PROMPT = '''
You are an expert learning advisor providing recommendations based on a technical interview evaluation.

**Interview Context:**
*   **Question Asked:** 

{question}

*   **Language Used:** {language}
*   **Expert Evaluation Summary:** 

{analytics}

*   **Key Topics Identified for Learning:** 

{topics}

**Your Task:**
Generate a concise, actionable learning plan or set of recommendations for the candidate based **only** on the provided context and information found via the search tool.

**Instructions:**
1.  Directly address the "Topics to Learn" and relate them to the "Expert Evaluation Summary".
2.  Synthesize information found via search to explain concepts or suggest approaches for improvement.
3.  Structure your response clearly, perhaps using bullet points for key recommendations.
4.  **IMPORTANT:** Write a helpful narrative. **Do NOT explicitly list URLs or resource titles in your response text.** The system will automatically add citations based on the search results you utilize.
5.  Focus on providing clear advice grounded in the search findings.
6.  Tool Usage: Please use the search tool to find current and relevant resources based on these criteria.
7.  Skip introduction: Start your answer with a concrete plan or recommended actions. Do NOT start your answer with sentences like "Okay, based on the interview evaluation, here's a...". Instead, start with "The evaluation indicates weaknesses in coding proficiency, debugging skills..."



**Example Output Structure (Conceptual - do not include URLs here):**
"Analysis indicates that [Specific Weakness from Analytics], focusing on [Topic 1] is recommended. Understanding the core principles of [Concept Found via Search] can significantly help. For [Topic 2], exploring different approaches like [Approach Found via Search] would be beneficial. Practice problems related to [Relevant Sub-topic] are also advised..."
'''

In [11]:
DESCRIBE_IMAGE_PROMPT = '''
Given the transcript of a technical interview, analyze the provided image.
Describe its relevance to the transcript or code (if provided).
Be concise. Give only required information.

## Transcript:
{transcript}

'''

### 4.2. Structured Output (Evaluation Report) 📜

To ensure the evaluation report is consistent and machine-readable, we define Pydantic models that specify the desired JSON schema.

*   **Pydantic Models:** `StrengthItem`, `AreaForDevelopmentItem`, `DetailedAnalysis`, `FinalRecommendation`, `TopicsToLearn`, and the root `EvaluationOutput` model define the exact structure, fields, types, and descriptions for the evaluation data.
*   **LLM Integration:** When calling Gemini for the evaluation (`create_report_node`), we provide the `EvaluationOutput` model as the `response_schema` and set `response_mime_type` to `application/json`. This instructs Gemini to generate its response adhering strictly to this schema.
*   **Report Template:** A Jinja2 template (`REPORT_TEMPLATE`) is used to format the structured JSON output into a user-friendly Markdown report.
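
The hand-off from JSON to report can be pictured with a small stdlib-only sketch. Here `json` and a dataclass stand in for Pydantic validation, and an f-string stands in for Jinja2; `MiniEvaluation`, `parse_evaluation`, `render`, and the sample reply are hypothetical and mirror only two fields of `EvaluationOutput`. It also assumes the reply uses the camelCase aliases as keys.

```python
import json
from dataclasses import dataclass
from typing import List

# camelCase JSON keys mapped to snake_case attributes, like the aliases in the schema
ALIASES = {"overallSummary": "overall_summary", "topicsToLearn": "topics_to_learn"}


@dataclass
class MiniEvaluation:
    """Hypothetical two-field stand-in for EvaluationOutput."""
    overall_summary: str
    topics_to_learn: List[str]


def parse_evaluation(payload: str) -> MiniEvaluation:
    """Parse the JSON reply and fail loudly if a required field is missing."""
    data = json.loads(payload)
    missing = [key for key in ALIASES if key not in data]
    if missing:
        raise ValueError(f"Missing required fields: {missing}")
    return MiniEvaluation(**{ALIASES[key]: data[key] for key in ALIASES})


def render(evaluation: MiniEvaluation) -> str:
    """Format the validated object as Markdown (an f-string instead of Jinja2)."""
    topics = "\n".join(f"*   {topic}" for topic in evaluation.topics_to_learn)
    return f"## Overall Summary\n\n{evaluation.overall_summary}\n\n## Topics to Learn\n\n{topics}\n"


# Hypothetical model reply
reply = json.dumps({
    "overallSummary": "Solid approach, weak testing.",
    "topicsToLearn": ["Edge-case testing", "Complexity analysis"],
})
evaluation = parse_evaluation(reply)
report = render(evaluation)
print(report)
```

Separating parsing from rendering means a malformed reply fails before any Markdown is produced, which is the same reason the notebook validates against a schema first.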

In [12]:
REPORT_TEMPLATE = '''
# Candidate Interview Evaluation Report

---

## Overall Summary

{{ evaluation.overall_summary }}

---

## Final Recommendation

**Recommendation:** {{ evaluation.final_recommendation.recommendation }}

**Justification:** {{ evaluation.final_recommendation.justification }}

---

## Strengths Observed

{% if evaluation.strengths %}
  {% for strength in evaluation.strengths %}
*   **Strength:** {{ strength.point }}
    *   **Evidence:** {{ strength.evidence }}
  {% endfor %}
{% else %}
*   *No specific strengths noted in the transcript.*
{% endif %}

---

## Areas for Development / Concerns

{% if evaluation.areas_for_development %}
  {% for weakness in evaluation.areas_for_development %}
*   **Area:** {{ weakness.point }}
    *   **Evidence:** {{ weakness.evidence }}
  {% endfor %}
{% else %}
*   *No specific areas for development or concerns noted in the transcript.*
{% endif %}

---

## Detailed Analysis

### Technical Competence

{{ evaluation.detailed_analysis.technical_competence }}

### Problem Solving & Critical Thinking

{{ evaluation.detailed_analysis.problem_solving_critical_thinking }}

### Communication & Collaboration

{{ evaluation.detailed_analysis.communication_collaboration }}

---

## Suggested Improvement Resources

*(Note: This section lists suggested resources based on identified development areas. The list might be empty if none were specified.)*

{{ recommendations | default("*No specific learning recommendations were generated.*") }}

---
'''

In [13]:
class StrengthItem(BaseModel):
    """Represents a single observed strength."""
    point: str = Field(
        ..., # Ellipsis (...) means this field is required
        description="A concise statement describing the specific strength observed (e.g., 'Clear communication of thought process', 'Effective use of data structures', 'Proactively identified edge cases')."
    )
    evidence: str = Field(
        ...,
        description="Specific examples, observed behaviors, or direct quotes from the transcript that substantiate the identified strength."
    )

class AreaForDevelopmentItem(BaseModel):
    """Represents a single observed area for development or concern."""
    point: str = Field(
        ...,
        description="A concise statement describing the specific weakness or area needing improvement (e.g., 'Difficulty explaining time complexity', 'Did not consider null inputs initially', 'Hesitant to ask clarifying questions')."
    )
    evidence: str = Field(
        ...,
        description="Specific examples, observed behaviors, or direct quotes from the transcript that substantiate the identified weakness or concern."
    )

class DetailedAnalysis(BaseModel):
    """
    An object containing extensive, narrative elaboration on the candidate's
    performance across core evaluation criteria, weaving together observations
    and transcript evidence.
    """
    technical_competence: str = Field(
        ...,
        alias="technicalCompetence",
        description="Detailed assessment of the candidate's technical skills demonstrated or discussed. Covers aspects like problem understanding, approach formulation, algorithmic thinking, data structure usage, coding logic (as described), testing awareness, and debugging ability, supported by transcript references."
    )
    problem_solving_critical_thinking: str = Field(
        ...,
        alias="problemSolvingCriticalThinking",
        description="Detailed assessment of the candidate's approach to problem-solving. Covers aspects like systematic thinking, logical breakdown of problems, adaptability when facing challenges, response to hints, and consideration of efficiency/optimization, supported by transcript references."
    )
    communication_collaboration: str = Field(
        ...,
        alias="communicationCollaboration",
        description="Detailed assessment of the candidate's communication and interaction style. Covers aspects like clarity of explanation ('thinking out loud'), active listening, ability to articulate ideas, quality of questions asked, receptiveness to feedback, and overall professionalism during the interaction, supported by transcript references."
    )

    class Config:
        validate_by_name = True


class FinalRecommendation(BaseModel):
    """The concluding hiring decision and the explicit reasoning behind it."""
    recommendation: Literal[
        "Strong Hire", "Hire", "Lean Hire", "No Hire", "Needs Further Discussion"
    ] = Field(
        ...,
        description="The definitive hiring recommendation category based on the overall evaluation."
    )
    justification: str = Field(
        ...,
        description="A concise summary justifying the final recommendation. It should explicitly link the key strengths and areas for development to the requirements of the specific role and level being considered."
    )


class TopicsToLearn(BaseModel):
    """Represents a single suggested learning resource."""
    area: str = Field(
        ...,
        description="The specific skill or knowledge area the resource is intended to address (e.g., 'Algorithm Complexity Analysis', 'Test-Driven Development Principles', 'Effective Technical Communication'). Should ideally correspond to one or more points in 'areasForDevelopment'."
    )
    description: str = Field(
        ...,
        description="A brief explanation of the resource content or why it is being recommended in relation to the candidate's development needs."
    )

# Define the main evaluation output model

class EvaluationOutput(BaseModel):
    """
    Root object containing the complete candidate evaluation derived
    from the interview transcript.
    """
    overall_summary: str = Field(
        ...,
        alias="overallSummary",
        description="A brief (2-3 sentence) overview of the candidate's performance and a high-level hiring recommendation (e.g., Strong Hire, Hire, Lean Hire, No Hire, Needs Further Discussion). This provides a quick snapshot of the evaluation outcome."
    )
    strengths: List[StrengthItem] = Field(
        ...,
        description="An array listing the key positive attributes, skills, and behaviors demonstrated by the candidate during the interview. The number of items in the array will vary based on the interview content."
    )
    areas_for_development: List[AreaForDevelopmentItem] = Field(
        ...,
        alias="areasForDevelopment", 
        description="An array listing the key weaknesses, areas where the candidate struggled, skills gaps, or concerns identified during the interview. The number of items in the array will vary."
    )
    detailed_analysis: DetailedAnalysis = Field(
        ...,
        alias="detailedAnalysis",
        description="Extensive analysis of the candidate's performance within each of the main evaluation criteria (Technical, Problem-Solving, Communication), referencing transcript evidence. Each criterion should be at least 5 sentences."
    )
    final_recommendation: FinalRecommendation = Field(
        ...,
        alias="finalRecommendation"
    )
    topics_to_learn: List[TopicsToLearn] = Field(
        ..., # Ellipsis marks it as required
        alias="topicsToLearn",
        description="A required array of suggested learning topics targeted at the candidate's identified 'Areas for Development'. The list can be empty if no specific topics are suggested, but the field must be present."
    )

    class Config:
        validate_by_name = True # Allows using either snake_case or alias for input
        json_schema_extra = {
            "title": "Candidate Interview Evaluation",
            "description": "A structured evaluation of a candidate based on an interview transcript."
        }

### 4.3. Utility Functions 🛠️

Several helper functions support the agent's operation:

*   `encode64_pil_image`: Converts PIL Images (from Gradio Sketchpad) to base64 strings suitable for inclusion in Gemini API calls.
*   `get_data_for_search`: Extracts structured analysis and learning topics from the evaluation output to feed into the grounding prompt.
*   `get_interview_transcript`: Processes the message history (`InterviewState['messages']`), extracting text and code. If an image is present, it calls Gemini using `DESCRIBE_IMAGE_PROMPT` to get a text description and incorporates it, creating a comprehensive transcript for the final evaluation. This handles the **long context** aspect.
*   `get_learning_resources`: Implements the **Grounding** feature. It calls Gemini with the `RESOURCES_SEARCH_PROMPT`, enables the Google Search tool (`types.Tool(google_search=types.GoogleSearch())`), processes the response to extract the generated text and citation metadata, and formats it into a Markdown string with superscript citations and a reference list. Includes retry logic for robustness.

In [14]:
client = genai.Client(api_key=UserSecretsClient().get_secret("GOOGLE_API_KEY"))

Although the Google API [supports requests with PIL images](https://ai.google.dev/gemini-api/docs/image-understanding#upload-image) directly, the LangChain wrapper does not accept PIL images in a message. The function below therefore converts a PIL image to a base64 string.

In [15]:
def encode64_pil_image(img: PIL.Image.Image) -> str:
    """Converts PIL Image to base64 format"""
    buffer = BytesIO()
    img.save(buffer, format="PNG")
    image_bytes = buffer.getvalue()
    base64_image = base64.b64encode(image_bytes).decode('utf-8')
    return base64_image
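
A quick round trip shows what the encoded string looks like inside a message and how to get the bytes back. This is a sketch with assumptions: the bytes below are a stand-in for a real PNG buffer, and the `image_url` part shape with a `data:image/png;base64,` prefix is an assumed message format, not something this notebook section confirms.

```python
import base64

# Stand-in for PNG bytes; a real call would use the buffer filled by img.save()
fake_png_bytes = b"\x89PNG\r\n\x1a\n" + b"sketch-pixels"

encoded = base64.b64encode(fake_png_bytes).decode("utf-8")

# Assumed message shape: an image_url part holding a data URL
image_part = {
    "type": "image_url",
    "image_url": {"url": f"data:image/png;base64,{encoded}"},
}

# Decoding: drop the "data:...;base64" header and reverse the encoding
header, _, payload = image_part["image_url"]["url"].partition(",")
decoded = base64.b64decode(payload)
print(header, len(decoded))
```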

In [16]:
def get_data_for_search(response: types.GenerateContentResponse) -> Tuple[str, str]:
    """Gets analysis and topics from the parsed response and returns them as separate strings"""
    analytics = ""
    # Iterating over a Pydantic model yields (field_name, value) pairs
    for theme, desc in response.parsed.detailed_analysis:
        analytics += f"{theme}: {desc}\n\n"
    
    topics = ""
    for item in response.parsed.topics_to_learn:
        topics += f"{item.area}: {item.description}\n\n"

    return analytics, topics


def get_interview_transcript(messages: List[BaseMessage]) -> str:
    """
    Reads messages from the history, extracting message text and code and describing images.
    Returns the dialogue transcript.
    """
    transcript: str = ""
    image_url: Optional[Dict[str, str]] = None

    for message in messages:
        if isinstance(message, AIMessage):
            if message.content:
                transcript += f"Interviewer: {message.content}\n\n"
                
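        if isinstance(message, HumanMessage):
            text: str = ""
            # Content may be a plain string or a list of typed parts
            parts = [{"text": message.content}] if isinstance(message.content, str) else message.content
            for content in parts:
                text += content.get("text", "") + "\n"
                if image_url_data := content.get("image_url", None):
                    image_url = image_url_data
                    # Assumes a base64 PNG (optionally a data URL), as produced by encode64_pil_image.
                    # A bare string in `contents` would be sent to the model as text, not as an image.
                    image_bytes = base64.b64decode(image_url.get("url", "").split(",", 1)[-1])
                    response = client.models.generate_content(
                        model='gemini-2.0-flash',
                        contents=[
                            DESCRIBE_IMAGE_PROMPT.format(transcript=transcript),
                            types.Part.from_bytes(data=image_bytes, mime_type="image/png"),
                        ],
                    )
                    text += f"Image description:\n{response.text}\n"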
                    
            transcript += f"Candidate: {text}\n\n"
    return transcript
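
The flattening step can be tried without LangChain or an API key. In this sketch plain dicts stand in for `AIMessage` and `HumanMessage`, a fixed placeholder replaces the Gemini image description, and `flatten_content` and `build_transcript` are hypothetical helper names.

```python
from typing import Any, Dict, List, Union

# Plain dicts stand in for AIMessage / HumanMessage in this sketch
Message = Dict[str, Any]


def flatten_content(content: Union[str, List[Dict[str, Any]]]) -> str:
    """Join the text parts of a message; mark image parts with a placeholder."""
    if isinstance(content, str):  # simple string content
        return content
    lines: List[str] = []
    for part in content:
        if part.get("text"):
            lines.append(part["text"])
        if part.get("image_url"):
            # The real function asks Gemini for a description at this point
            lines.append("Image description: <generated by the model>")
    return "\n".join(lines)


def build_transcript(messages: List[Message]) -> str:
    """Label each turn by speaker and join the turns with blank lines."""
    speakers = {"ai": "Interviewer", "human": "Candidate"}
    turns = [
        f"{speakers[m['role']]}: {flatten_content(m['content'])}"
        for m in messages
        if m["content"]
    ]
    return "\n\n".join(turns)


history = [
    {"role": "ai", "content": "Please explain your approach."},
    {"role": "human", "content": [
        {"type": "text", "text": "I would use a hash map."},
        {"type": "image_url", "image_url": {"url": "data:image/png;base64,AAAA"}},
    ]},
]
transcript = build_transcript(history)
print(transcript)
```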

**Here is the logic of the `get_learning_resources` function:**

1. Generate a response using the client and the `RESOURCES_SEARCH_PROMPT` prompt.
2. If the response contains grounding metadata, continue.
   > When search grounding is used, the model returns extra metadata that includes links to search suggestions, supporting documents and information on how the supporting documents were used. [Get more info here](https://www.kaggle.com/code/markishere/day-4-google-search-grounding#Use-search-grounding).

3. If the response does not contain grounding metadata, retry. The number of retries is limited by the `num_retries` variable.
4. After a successful generation with grounding metadata, take each supported chunk of generated text and append its citation markers.
5. Repeat this for every chunk and combine the results into a fully cited response.
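
Steps 4 and 5 can be sketched in isolation. The `Support` dataclass below is a hypothetical stand-in for a grounding support entry, offsets are treated as plain character indices, and segments are assumed not to overlap; the real metadata objects carry more fields.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Support:
    """Hypothetical stand-in for one grounding support entry."""
    start: int                # start offset of the supported segment
    end: int                  # end offset (exclusive)
    chunk_indices: List[int]  # 0-based indices into the list of sources


def add_citations(text: str, supports: List[Support]) -> str:
    """Insert 1-based superscript markers right after each supported segment."""
    parts: List[str] = []
    last = 0
    for s in sorted(supports, key=lambda s: s.start):
        parts.append(text[last:s.end])  # text up to and including the segment
        parts.extend(f"<sup>[{i + 1}]</sup>" for i in sorted(set(s.chunk_indices)))
        last = s.end
    parts.append(text[last:])  # trailing text with no supporting source
    return "".join(parts)


first = "Practice two-pointer problems."
second = "Review hash maps daily."
text = f"{first} {second}"
supports = [
    # Deliberately unsorted, with a duplicate source index
    Support(start=len(first) + 1, end=len(text), chunk_indices=[1, 0, 1]),
    Support(start=0, end=len(first), chunk_indices=[0]),
]
cited = add_citations(text, supports)
print(cited)
```

Sorting by start offset and tracking the last written position keeps unsupported text between segments in place, and `set` removes repeated source indices before the markers are written.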

In [17]:
def get_learning_resources(question: str, analytics: str, topics: str, language: str) -> str:
    """Uses a Gemini model with Google Search grounding to generate a cited learning plan"""
    config_with_search = types.GenerateContentConfig(
        tools=[types.Tool(google_search=types.GoogleSearch())]
    )
    
    markdown_parts: List[str] = []
    num_retries: int = 3
    rc: Optional[types.Candidate] = None 

    # Run for only num_retries times so it is not infinite
    while num_retries > 0:
        try:
            response = client.models.generate_content(
                model='gemini-2.0-flash',
                contents=RESOURCES_SEARCH_PROMPT.format(question=question, analytics=analytics, topics=topics, language=language),
                config=config_with_search
            )
            rc = response.candidates[0]

            # Check for grounding metadata existence
            if (rc.grounding_metadata
                and rc.grounding_metadata.grounding_supports
                and rc.grounding_metadata.grounding_chunks
                and rc.content.parts # Ensure there's content generated
                and rc.content.parts[0].text):
                break # Success!

            print(f"Retrying grounding query... ({3 - num_retries + 1})")
            num_retries -= 1

        except Exception as e:
            print(f"An error occurred during grounding query: {e}")
            num_retries -= 1

        if num_retries <= 0:
            print("Failed to get grounded recommendations after multiple retries.")
            fallback: str = "*Could not retrieve grounded learning recommendations at this time.*"
            # rc is still None (or has no content) when every attempt raised an exception
            if rc is None or not rc.content or not rc.content.parts:
                return fallback
            text_without_citations: str = "\n".join([part.text for part in rc.content.parts if part.text])
            return f"{fallback}\n\n{text_without_citations}" # Fallback message plus uncited text
    
    if rc and rc.grounding_metadata:
        supports: List[types.GroundingSupport] = rc.grounding_metadata.grounding_supports
        chunks: List[types.GroundingChunk] = rc.grounding_metadata.grounding_chunks
        # Combine all parts of the generated text so it can be cited properly
        generated_text: str = "\n".join([part.text for part in rc.content.parts])
        
        last_index: int = 0 # Start at 0 so the text before the first segment is included
        for support in sorted(supports, key=lambda s: s.segment.start_index or 0):
            # start_index may be None for a segment at offset 0, so default it to 0
            start: int = support.segment.start_index or 0

            # Write text before the current support segment
            markdown_parts.append(generated_text[last_index : start])

            # Write the supported text segment
            supported_segment: str = generated_text[start : support.segment.end_index]
            markdown_parts.append(supported_segment)

            # Add citation markers for this segment
            citation_indices: List[int] = sorted(set(support.grounding_chunk_indices)) # Unique, sorted indices
            for i in citation_indices:
                markdown_parts.append(f"<sup>[{i+1}]</sup>")

            last_index = support.segment.end_index

        # Write any remaining text after the last support segment
        markdown_parts.append(generated_text[last_index:])
        markdown_parts.append("\n\n")

        # Append the list of cited sources
        if chunks:
            markdown_parts.append("### Citations:\n\n")
            for i, chunk in enumerate(chunks, start=1):
                title: str = chunk.web.title or "Untitled"
                uri: str = chunk.web.uri or "#"
                markdown_parts.append(f"{i}. [{title}]({uri})\n")
        else:
            markdown_parts.append("*No specific web sources were cited for this recommendation.*\n")

        return "".join(markdown_parts)
    else:
        # Fallback if somehow rc is None or grounding metadata is missing after loop
        return "*Could not retrieve grounded learning recommendations.*"
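
The retry-then-fallback pattern is easier to check when it is separated from the API call. The sketch below is a generic version under stated assumptions: `call_with_retries`, `fetch`, and `is_grounded` are hypothetical names, and strings stand in for model responses.

```python
from typing import Callable, Optional

FALLBACK = "*Could not retrieve grounded learning recommendations at this time.*"


def call_with_retries(
    fetch: Callable[[], Optional[str]],
    is_grounded: Callable[[str], bool],
    max_attempts: int = 3,
) -> str:
    """Return the first grounded reply, or a fallback after max_attempts."""
    last_reply: Optional[str] = None  # stays None if every attempt raised
    for attempt in range(1, max_attempts + 1):
        try:
            reply = fetch()
        except Exception as exc:  # broad on purpose: any API failure counts
            print(f"Attempt {attempt} failed: {exc}")
            continue
        if reply is not None:
            last_reply = reply
            if is_grounded(reply):
                return reply
    # No grounded reply: return the fallback, plus uncited text if there is any
    return FALLBACK if last_reply is None else f"{FALLBACK}\n\n{last_reply}"


# First reply lacks grounding, second one has it
replies = iter(["plain text", "grounded: use two pointers"])
result = call_with_retries(lambda: next(replies), lambda r: r.startswith("grounded"))


def always_fails() -> str:
    raise RuntimeError("network down")


fallback_only = call_with_retries(always_fails, lambda r: True)
print(result)
print(fallback_only)
```

Keeping `last_reply` as `None` until a response actually arrives covers the case where every attempt raises, so the fallback never touches a missing response object.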


### 4.4. LangGraph State & LLM 🧠

*   **`InterviewState` (TypedDict):** Defines the information that persists and flows through the graph. It includes `messages` (the conversation history, managed by `add_messages`), the current `question`, the candidate's latest `code`, and a `finished` flag.
*   **LLM Initialization:** We initialize the `ChatGoogleGenerativeAI` model (`gemini-2.0-flash`) which will be used for the main conversational turns.

In [18]:
class InterviewState(TypedDict):
    """State representing the mock interview conversation."""
    messages: Annotated[list, add_messages]
    question: str
    code: str
    finished: bool

In [19]:
llm = ChatGoogleGenerativeAI(model="gemini-2.0-flash")
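
To see why `messages` is declared with `Annotated[list, add_messages]`, it helps to sketch how a reducer attached through `Annotated` can be looked up and applied. This is an illustration only, not LangGraph's implementation: `append_messages`, `SketchState`, and `apply_update` are hypothetical, and the real `add_messages` does more than concatenate (for example, it can replace a message that has a matching ID).

```python
from typing import Annotated, Any, Dict, List, TypedDict, get_type_hints


def append_messages(old: List[str], new: List[str]) -> List[str]:
    """Simplified stand-in for add_messages: concatenate old and new."""
    return old + new


class SketchState(TypedDict):
    messages: Annotated[list, append_messages]  # reducer attached as metadata
    question: str
    finished: bool


def apply_update(state: Dict[str, Any], update: Dict[str, Any]) -> Dict[str, Any]:
    """Merge a node's update: use the reducer if the field declares one."""
    hints = get_type_hints(SketchState, include_extras=True)
    merged = dict(state)
    for key, value in update.items():
        metadata = getattr(hints[key], "__metadata__", ())
        # Fields with a reducer are combined; plain fields are overwritten
        merged[key] = metadata[0](state[key], value) if metadata else value
    return merged


state = {"messages": ["Interviewer: Hi"], "question": "", "finished": False}
state = apply_update(state, {"messages": ["Candidate: Hello"], "question": "Two Sum"})
print(state)
```

This is why a node can return only its new messages: the reducer appends them to the history, while fields such as `question` are simply replaced.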

### 4.5. Tools & Function Calling 🧰

Tools allow the agent to interact with external resources (like the question database).

*   **Tool Definitions:** Functions like `select_question`, `list_questions`, `get_random_problem`, `get_difficulty_levels`, `get_topic_categories`, and `end_interview` are decorated with `@tool`. Pydantic models (`ListQuestionArgs`, `SelectQuestionArgs`) define the expected arguments for some tools. These tools interact with the Pandas DataFrame (`df`) loaded earlier or manipulate the interview state.
*   **Binding Tools:** The tools are bound to the LLM (`llm_with_tools = llm.bind_tools(...)`). This allows the LLM to decide to call one or more of these functions and generate the necessary arguments in its response (`tool_calls`).
*   **`ToolNode`:** A LangGraph node specifically designed to execute the function calls requested by the LLM.
*   **Custom Tool Handling:** Specific tools (`select_question`, `end_interview`) require custom nodes (`question_selection_node`, `finish_interview_node`) to manipulate the `InterviewState` (e.g., update the current question, set the `finished` flag) based on the tool's execution, rather than just returning a simple text result.
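
The argument schemas below build their `Literal` types from values read out of the DataFrame at runtime. A minimal stdlib sketch of that idea follows; the difficulty values and `check_difficulty` are hypothetical, and the check only imitates what schema validation would do.

```python
from typing import Literal, get_args

# Stand-in for df.difficulty.unique().tolist(); the values are hypothetical
difficulty_values = ("Easy", "Medium", "Hard")

# Subscripting Literal with a tuple is the same as listing each value
DifficultyLiteral = Literal[difficulty_values]


def check_difficulty(value: str) -> str:
    """Reject values outside the Literal, as schema validation would."""
    allowed = get_args(DifficultyLiteral)
    if value not in allowed:
        raise ValueError(f"{value!r} is not one of {allowed}")
    return value


print(get_args(DifficultyLiteral))
print(check_difficulty("Medium"))
```

Static type checkers generally reject a `Literal` built from a variable, so this is a runtime-only technique. Its benefit is that the tool schema shown to the model lists exactly the values present in the question database.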

In [20]:
DIFFICULTY = tuple(df.difficulty.unique().tolist())
TOPICS = tuple(df.topic.unique().tolist())
IDS = df.id.apply(str).tolist()


class ListQuestionArgs(BaseModel):
    """Input schema for the list_questions tool."""
    category: Literal[TOPICS] = Field(
        description="The topic category to filter by"
    )
    difficulty: Literal[DIFFICULTY] = Field(
        description="The difficulty level to filter by (e.g., 'Easy', 'Medium', 'Hard')."
    )


class SelectQuestionArgs(BaseModel):
    """Input schema for the select_question tool."""
    ID: Literal[tuple(IDS)] = Field(description="ID of the question")



@tool(args_schema=SelectQuestionArgs)
def select_question(ID: str) -> str:
    """Shows user question with provided ID.
    ALWAYS use this tool when the candidate confirms selected question.
    You can start interview process ONLY after using this tool.
    """


@tool(args_schema=ListQuestionArgs)
def list_questions(category: str, difficulty: str) -> Union[str, List[str]]:
    """
    Lists and returns question names and IDs, filtering by category and difficulty.
    """
    filtered_df = df[(df['topic'].str.lower() == category.lower()) & (df['difficulty'].str.lower() == difficulty.lower())]
    if filtered_df.empty:
        return f"There are no questions with the topic '{category}' and difficulty '{difficulty}'"

    questions_to_sample = min(len(filtered_df), 5)
    filtered_sampled_df = filtered_df.sample(n=questions_to_sample)
    
    questions = []
    for index, row in filtered_sampled_df[["id", "problem_name"]].iterrows():
        questions.append(f"ID: {row.id}; Problem name: {row.problem_name}")
        
    return questions


@tool
def get_random_problem() -> str:
    """Selects a random question and returns its ID and name."""
    try:
        random_problem = df.sample(n=1).iloc[0]
    except ValueError as e:
        return f"Error selecting random problem: {e}"

    question = f"ID: {random_problem.id} Problem name: {random_problem.problem_name} "
    return question


@tool
def get_difficulty_levels() -> List[str]:
    """Provides information about difficulty levels in question database.

    Returns:
      List of available difficulty levels in question database
    """
    return df.difficulty.unique().tolist()


@tool
def get_topic_categories() -> List[str]:
    """Provides list of topics in question database.

    Returns:
      List of available topics in question database
    """
    return df.topic.unique().tolist()


@tool
def end_interview() -> bool:
    """Ends interview, starts evaluation process of the candidate. Use this when the user said goodbye.
    """

Here we use two groups of tools: `auto_tools` and `interview_tools`. In this setup, a tool executed by `ToolNode` only returns a result message and does not update other fields of the state, so the state-changing tools (`select_question`, `end_interview`) are handled by dedicated nodes.

In [21]:
auto_tools: List[BaseTool] = [get_difficulty_levels, get_topic_categories, get_random_problem, list_questions]
tool_node = ToolNode(auto_tools)

interview_tools: List[BaseTool] = [select_question, end_interview]
llm_with_tools: Runnable = llm.bind_tools(auto_tools + interview_tools)
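
The difference between the two groups can be sketched with plain functions and dicts. Everything here is hypothetical and only mirrors the split described above: "auto" tools return a value that becomes a message, while state-changing tools return a partial state update.

```python
from typing import Any, Callable, Dict

State = Dict[str, Any]

# "Auto" tools: plain functions whose return value becomes the tool message
AUTO_TOOLS: Dict[str, Callable[..., Any]] = {
    "get_difficulty_levels": lambda: ["Easy", "Medium", "Hard"],  # hypothetical data
}


def handle_select_question(args: Dict[str, Any]) -> State:
    """State-changing tool: returns a partial state update, not just text."""
    return {"question": f"Question #{args['ID']}"}


def handle_end_interview(args: Dict[str, Any]) -> State:
    """State-changing tool: flips the finished flag."""
    return {"finished": True}


STATE_HANDLERS: Dict[str, Callable[[Dict[str, Any]], State]] = {
    "select_question": handle_select_question,
    "end_interview": handle_end_interview,
}


def run_tool_call(state: State, call: Dict[str, Any]) -> State:
    """Apply one tool call and return the new state."""
    name, args = call["name"], call.get("args", {})
    if name in AUTO_TOOLS:
        # The result is only appended to the conversation; other fields stay untouched
        return {**state, "messages": state["messages"] + [str(AUTO_TOOLS[name](**args))]}
    if name in STATE_HANDLERS:
        return {**state, **STATE_HANDLERS[name](args)}
    raise NotImplementedError(f"Unknown tool call: {name}")


state: State = {"messages": [], "question": "", "finished": False}
state = run_tool_call(state, {"name": "get_difficulty_levels"})
state = run_tool_call(state, {"name": "select_question", "args": {"ID": "7"}})
state = run_tool_call(state, {"name": "end_interview"})
print(state)
```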

### 4.6. Graph Nodes & Edges ⚫️🔗🔘

The interview flow is defined by the nodes and edges in the LangGraph:

*   **Nodes:**
    *   `chatbot`: The main node where the LLM is called with the current state (messages, system prompt) to generate the next response or request a tool call.
    *   `tools`: Executes standard tool calls identified by the chatbot node (e.g., `list_questions`).
    *   `question selection`: Custom node to handle the `select_question` tool call and update the state.
    *   `end interview`: Custom node to handle the `end_interview` tool call, set the `finished` flag, and trigger the report generation.
    *   `create report`: Node called after `end interview`. It generates the evaluation using structured output and the learning recommendations using grounding, then formats the final report.
*   **Edges:**
    *   `START`: The graph begins at the `chatbot` node.
    *   Conditional Edges (`maybe_route_to_tools`): After the `chatbot` node runs, this function checks the LLM's output. If `tool_calls` are present, it routes the flow to the appropriate tool node (`tools`, `question selection`, or `end interview`). If no tool calls are present, the current graph run ends (`__end__`) and the app waits for the next user input.
    *   The flow ensures that tool results are processed, state is updated, and the final report is generated upon completion.
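
The routing decision itself is a pure function of the tool calls on the last AI message, so it can be tested without building a graph. The sketch below assumes each tool call is a dict with a `name` key; `route` and `AUTO_TOOL_NAMES` are hypothetical names.

```python
from typing import Any, Dict, List

AUTO_TOOL_NAMES = {
    "get_difficulty_levels",
    "get_topic_categories",
    "get_random_problem",
    "list_questions",
}


def route(tool_calls: List[Dict[str, Any]]) -> str:
    """Pick the next node from the tool calls on the last AI message."""
    names = {call["name"] for call in tool_calls}
    if names & AUTO_TOOL_NAMES:  # automated tools take priority
        return "tools"
    if "select_question" in names:
        return "question selection"
    if "end_interview" in names:
        return "end interview"
    return "__end__"  # nothing to execute: wait for the next user input


print(route([{"name": "list_questions"}]))
print(route([{"name": "select_question"}, {"name": "list_questions"}]))
print(route([]))
```

Note the priority order: a message that mixes an automated tool with `select_question` is routed to `tools` first, so the question selection would need a later turn.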

In [22]:
def chatbot_with_tools(state: InterviewState) -> InterviewState:
    """The chatbot node. Invokes the LLM with tools."""
    print("\nEntering chatbot_with_tools Node")
    messages: List[BaseMessage] = state['messages']

    system_and_messages: List[BaseMessage] = [SystemMessage(content=INTERVIEWER_SYSTEM_PROMPT)] + messages

    if not messages:
        ai_message = AIMessage(content=WELCOME_MSG)
    else:
        ai_message = llm_with_tools.invoke(system_and_messages)

    return state | {"messages": [ai_message]}

The `question_selection_node` uses an ID to select a question. The model provides the ID in the tool arguments. It learns the ID of a specific question from a tool it called earlier, such as `list_questions` or `get_random_problem`, both of which return the question name and ID.

In [23]:
def question_selection_node(state: InterviewState) -> InterviewState:
    """The question selection node. This is where the interview state is manipulated."""
    
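    tool_msg: AIMessage = state.get("messages", [])[-1]
    question: str = state.get("question", "")
    # Default to the current code so the return below never sees an unbound name
    question_code: str = state.get("code", "")
    outbound_msgs: List[ToolMessage] = []
    tool_call: Dict[str, Any]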

    for tool_call in tool_msg.tool_calls:

        if tool_call["name"] == "select_question":

            ID: int = int(tool_call["args"]["ID"])
            
            selected_question: pd.Series = df[df.id==ID].iloc[0]

            question = selected_question.content
            question_code: str = selected_question.code
            instruction: str = "Following question description is used to show you what this question is about. You should not repeat it to the user!"
            response: str = "\n".join([instruction] + [question] + [f"Initial code: {question_code}"])

        else:
            raise NotImplementedError(f'Unknown tool call: {tool_call["name"]}')

        outbound_msgs.append(
            ToolMessage(
                content=response,
                name=tool_call["name"],
                tool_call_id=tool_call["id"],
            )
        )

    return state | {"messages": outbound_msgs, "question": question, "code": question_code}

In [24]:
def finish_interview_node(state: InterviewState) -> InterviewState:
    """The finish interview node. Sets the `finished` flag in the interview state."""
    
    tool_msg: AIMessage = state.get("messages", [])[-1]
    finished: bool = state.get("finished", False)
    outbound_msgs: List[ToolMessage] = []
    tool_call: Dict[str, str]

    for tool_call in tool_msg.tool_calls:

        if tool_call["name"] == "end_interview":
            print("FINISH TOOL")
            finished = True
            response: str = "Say goodbye to the user"

        else:
            raise NotImplementedError(f'Unknown tool call: {tool_call["name"]}')

        outbound_msgs.append(
            ToolMessage(
                content=response,
                name=tool_call["name"],
                tool_call_id=tool_call["id"],
            )
        )

    return state | {"messages": outbound_msgs, "finished": finished}

The `create_report_node` first calls Google API to generate a structured report using the `EvaluationOutput` schema. Then it calls the `get_learning_resources` function to generate a grounded learning plan to insert at the end of the report. After that, it renders this data with the jinja2 template, saves the file, and updates the state.

In [25]:
def create_report_node(state: InterviewState) -> InterviewState:
    question: str = state.get("question", "")
    if not question or question == "Problem has not been selected yet": 
        print("No valid question selected. Skipping report generation.")
        return state | {"report": "Report cannot be generated as no question was selected or the interview did not complete."}
        
    messages: List[BaseMessage] = state.get("messages", [])
    transcript: str = get_interview_transcript(messages)
    code: str = state.get("code", "")
    
    try:
        evaluation_response = client.models.generate_content(
            model='gemini-2.0-flash',
            contents=CANDIDATE_EVALUATION_PROMPT.format(question=question, transcript=transcript, code=code),
            config={
                'response_mime_type': 'application/json',
                'response_schema': EvaluationOutput,
            },
        )
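        evaluation_data: Optional[EvaluationOutput] = evaluation_response.parsed
        if evaluation_data is None:
            # parsed can be None if the reply could not be converted into the schema
            raise ValueError("Model response could not be parsed into EvaluationOutput")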
    except Exception as e:
        print(f"Error generating candidate evaluation: {e}")
        return state | {"report": f"Error generating evaluation report: {e}"}

    analytics: str
    topics: str
    analytics, topics = get_data_for_search(evaluation_response)
    grounded_recommendations_md: str = get_learning_resources(question, analytics, topics, "Python")
    
    template: Template = Template(REPORT_TEMPLATE)
    rendered_output: str = template.render({"evaluation": evaluation_data, "recommendations": grounded_recommendations_md})

    try:
        with open("candidate_report.md", "w", encoding="utf-8") as f:
            f.write(rendered_output)
        print("Candidate report generated successfully.")
    except IOError as e:
        print(f"Error writing report file: {e}")
        
    return state | {"report": rendered_output}

In [26]:
def maybe_route_to_tools(state: InterviewState) -> Literal["tools", "question selection", "end interview", "__end__"]:
    """Routes to tools if the LLM's last message contained tool calls."""
    msgs: List[BaseMessage] = state.get("messages", [])
    if not msgs:
        raise ValueError(f"No messages found when parsing state: {state}")

    last_message: BaseMessage = msgs[-1]
    if hasattr(last_message, "tool_calls") and len(last_message.tool_calls) > 0:
        print(f"Routing to tools based on message: {last_message.tool_calls}")
        # Route to `tools` node for any automated tool calls first.
        if any(
            tool["name"] in tool_node.tools_by_name.keys() for tool in last_message.tool_calls
        ):
            return "tools"
        elif any(
            tool["name"] == "select_question" for tool in last_message.tool_calls
        ):
            return "question selection"
        elif any(
            tool["name"] == "end_interview" for tool in last_message.tool_calls
        ):
            return "end interview"
        else:
            return "__end__"
    else:
        print("No tool calls found. Ending graph run (will wait for user input).")
        return "__end__" # End the current graph invocation
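
The routing precedence above can be exercised without an LLM. The following is a minimal sketch, not the notebook's implementation: `FakeAIMessage` and `AUTOMATED_TOOLS` are hypothetical stand-ins for a LangChain `AIMessage` and for the tool names registered in `tool_node`, and the tool arguments are made up for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List

# Assumption: these are the tools executed by the generic ToolNode.
AUTOMATED_TOOLS = {"list_questions", "get_random_problem"}


@dataclass
class FakeAIMessage:
    """Hypothetical stand-in for an AIMessage carrying tool calls."""
    content: str = ""
    tool_calls: List[Dict[str, Any]] = field(default_factory=list)


def route(last_message: FakeAIMessage) -> str:
    """Mirror the precedence used by maybe_route_to_tools."""
    names = [call["name"] for call in getattr(last_message, "tool_calls", [])]
    if any(name in AUTOMATED_TOOLS for name in names):
        return "tools"               # automated tools always win
    if "select_question" in names:
        return "question selection"
    if "end_interview" in names:
        return "end interview"
    return "__end__"                 # no (known) tool call: wait for the user


print(route(FakeAIMessage(tool_calls=[{"name": "list_questions", "args": {}}])))
print(route(FakeAIMessage(tool_calls=[{"name": "select_question", "args": {}}])))
print(route(FakeAIMessage(tool_calls=[{"name": "end_interview", "args": {}}])))
print(route(FakeAIMessage(content="Tell me about your approach.")))
```

One consequence of this ordering is that a message mixing an automated tool call with `end_interview` is routed to `tools` first, so the interview only ends on a later turn.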

### 4.7. Graph Compilation & Visualization 🖼️

The defined nodes and edges are compiled into an executable LangGraph object (`interviewer_graph`).

In [27]:
graph_builder = StateGraph(InterviewState)

graph_builder.add_node("chatbot", chatbot_with_tools)
graph_builder.add_node("tools", tool_node)
graph_builder.add_node("question selection", question_selection_node)
graph_builder.add_node("end interview", finish_interview_node)
graph_builder.add_node("create report", create_report_node)


graph_builder.add_edge(START, "chatbot")
graph_builder.add_conditional_edges(
    "chatbot",
    maybe_route_to_tools,
)
# After tools are executed, always return to the chatbot to process the tool results
graph_builder.add_edge("tools", "chatbot")
graph_builder.add_edge("question selection", "chatbot")
graph_builder.add_edge("end interview", "create report")
graph_builder.add_edge("create report", "__end__")
interviewer_graph = graph_builder.compile()
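try:
    # Inside a try block the Image is not the cell's last expression, so display it explicitly
    display(Image(interviewer_graph.get_graph().draw_mermaid_png()))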
except requests.exceptions.ReadTimeout as e:
    print("Could not draw graph...", e)
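
If the Mermaid rendering is unavailable, the wiring can still be sanity-checked by mirroring the edges in a plain dictionary. This is only a sketch of the topology defined above, not a LangGraph API: the `EDGES` dictionary and both helper functions are hypothetical, and the `"__start__"` key stands in for LangGraph's `START` constant.

```python
from typing import Dict, List, Set

# Edges copied from the graph definition. The conditional edge from "chatbot"
# lists every target that maybe_route_to_tools can return.
EDGES: Dict[str, List[str]] = {
    "__start__": ["chatbot"],
    "chatbot": ["tools", "question selection", "end interview", "__end__"],
    "tools": ["chatbot"],
    "question selection": ["chatbot"],
    "end interview": ["create report"],
    "create report": ["__end__"],
}


def reachable(start: str) -> Set[str]:
    """Return every node reachable from `start` by following EDGES."""
    seen: Set[str] = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in EDGES.get(node, []):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def path_to_end(start: str) -> List[str]:
    """Follow unconditional edges from `start` until "__end__"."""
    path = [start]
    while path[-1] != "__end__":
        successors = EDGES[path[-1]]
        if len(successors) != 1:
            raise ValueError(f"{path[-1]!r} has a conditional edge")
        path.append(successors[0])
    return path


print(path_to_end("end interview"))  # ['end interview', 'create report', '__end__']
print(sorted(reachable("__start__")))
```

The check confirms that every node is reachable from the start and that ending the interview always passes through report creation before the run terminates.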

## ✨ 5. Showcasing GenAI Capabilities

This project explicitly demonstrates several powerful Generative AI features available through the Gemini API and LangChain/LangGraph:

1.  **Function Calling:** The interviewer agent uses tools (`list_questions`, `select_question`, `get_random_problem`, `end_interview`, etc.) defined with the `@tool` decorator. The LLM (`llm_with_tools`) decides when to call these functions based on the conversation and system prompt, generating the required arguments. This allows the agent to interact with the external question database and control the interview flow (e.g., ending the interview). The `ToolNode` and custom tool nodes handle the execution.

2.  **Structured Output:** The `create_report_node` leverages Gemini's ability to generate JSON output conforming to a predefined schema. By defining Pydantic models (`EvaluationOutput`, etc.) and specifying them in the `generate_content` call (`response_schema`), we ensure the evaluation report is consistently structured, making it easy to parse and format using the Jinja2 template (`REPORT_TEMPLATE`).

3.  **Few-Shot Learning:** The `INTERVIEWER_SYSTEM_PROMPT` includes explicit examples of interactions (e.g., how to provide hints, how to respond to whiteboard input). This provides the LLM with concrete examples of the desired behavior, improving its ability to follow instructions and adopt the correct persona and interaction style.

4.  **Image Understanding (Multimodality):** The application accepts whiteboard sketches via the Gradio `Sketchpad`. The `chat_interface_update` function encodes the PIL image to base64. The `get_interview_transcript` function then incorporates this image data when creating the full transcript passed to the evaluation model. It uses a Gemini Vision model via `DESCRIBE_IMAGE_PROMPT` to generate a textual description of the image, integrating visual information into the context.

5.  **Agents (LangGraph):** The entire interviewer is built as a stateful agent using LangGraph. It maintains the conversation state (`InterviewState`), transitions between different processing steps (nodes like `chatbot`, `tools`, `create report`), and makes decisions based on LLM outputs and tool results (conditional edges via `maybe_route_to_tools`). This enables the complex, multi-turn, and tool-using behavior required for the simulation.

6.  **Long Context Window:** The agent handles potentially long interview conversations. The `InterviewState` accumulates all messages. The `get_interview_transcript` function processes this entire history, and the `create_report_node` passes the full transcript to the evaluation model, allowing it to consider the entire interaction when generating the assessment. Gemini's long context capabilities are essential here.

7.  **Grounding (Gemini with Google Search):** The `get_learning_resources` function implements grounding. It calls Gemini with a specific prompt (`RESOURCES_SEARCH_PROMPT`) and enables the `GoogleSearch` tool. Gemini uses search results to generate a *grounded* learning plan based on the evaluation feedback and identified topics, ensuring the recommendations are relevant and based on real-world information. The system automatically handles citations for the sources used.
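
To make the structured-output idea from item 2 concrete, here is a minimal offline sketch of the parse-then-render step. It deliberately swaps in standard-library pieces: a `dataclass` stands in for the Pydantic `EvaluationOutput` model and `string.Template` stands in for the Jinja2 `REPORT_TEMPLATE`. The field names (`summary`, `strengths`, `score`) and the sample reply are hypothetical and do not reflect the notebook's real schema.

```python
import json
from dataclasses import dataclass
from string import Template
from typing import List


@dataclass
class MiniEvaluation:
    # Hypothetical fields; the notebook's EvaluationOutput schema differs.
    summary: str
    strengths: List[str]
    score: int


def parse_evaluation(raw_json: str) -> MiniEvaluation:
    """Check a JSON reply against the expected fields before rendering."""
    data = json.loads(raw_json)
    evaluation = MiniEvaluation(**data)  # TypeError on missing or extra keys
    if not isinstance(evaluation.score, int):
        raise TypeError("score must be an integer")
    return evaluation


REPORT = Template("# Report\n\n$summary\n\nScore: $score/5\n\nStrengths:\n$strengths\n")


def render(evaluation: MiniEvaluation) -> str:
    """Fill the report template from the validated evaluation."""
    bullets = "\n".join(f"* {item}" for item in evaluation.strengths)
    return REPORT.substitute(
        summary=evaluation.summary, score=evaluation.score, strengths=bullets
    )


# Made-up model reply used only for illustration.
reply = '{"summary": "Solid approach.", "strengths": ["Clear communication"], "score": 4}'
print(render(parse_evaluation(reply)))
```

Because the reply is validated before rendering, a malformed response fails at the parsing step instead of producing a half-empty report.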

## 🕹️ 6. Gradio User Interface

To make the Mock Interviewer interactive, I built a web interface using Gradio.

### 6.1. Interaction Logic ↔️
*   **State Management (`gr.State`):** A Gradio `State` object (`chat_state`) holds the current `InterviewState` (messages, question, code, finished flag) and UI-related flags (`code_changed`, `image_changed`) between interactions.
*   **Initialization (`init_chat`):** When the Gradio app loads, this function runs the LangGraph with an empty initial state to get the welcome message from the `chatbot` node and sets up the initial UI components.
*   **Update Loop (`chat_interface_update`):** This is the core function triggered by the "Send Message" button.
    1.  It retrieves the current state and user inputs (text, code, image).
    2.  It checks if the interview is already marked as finished.
    3.  It packages the user's input (text, code, image) into a `HumanMessage`. Special handling ensures code and images are included only if they've changed since the last turn. Images are base64-encoded.
    4.  It appends the new `HumanMessage` to the state's message list.
    5.  It invokes the `interviewer_graph` with the updated state.
    6.  It processes the `new_state` returned by the graph:
        *   Extracts the latest AI message to display.
        *   Updates the problem description and code editor content if they changed.
        *   Checks the `finished` flag in the new state.
        *   Updates the `chat_state` for the next turn.
        *   Updates the visibility of the Download button based on the `finished` flag.
        *   Returns updates for all relevant Gradio components.
*   **Input Change Handlers:** `mark_code_as_changed` and `mark_image_as_changed` are triggered by changes in the `gr.Code` and `gr.Sketchpad` inputs, respectively. They update the corresponding data and flags in the `chat_state` without triggering a full graph invocation, ensuring that code/image changes are only sent when the user explicitly clicks "Send Message".
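
The "send only what changed" rule from step 3 of the update loop can be isolated as a small pure function, which makes it easy to test outside Gradio. This is a sketch under stated assumptions: `build_content` is a hypothetical helper (the notebook builds the list inline), and the image is passed in already base64-encoded.

```python
from typing import Any, Dict, List, Optional

FENCE = "`" * 3  # Markdown code fence, built here to avoid nesting literal fences


def build_content(
    user_input: str,
    code: str,
    code_changed: bool,
    image_b64: Optional[str],
    image_changed: bool,
) -> List[Dict[str, Any]]:
    """Assemble multimodal message parts, skipping unchanged code and images."""
    content: List[Dict[str, Any]] = []
    if user_input:
        content.append({"type": "text", "text": user_input})
    if code_changed:
        content.append({"type": "text", "text": f"\nCode:\n{FENCE}python\n{code}\n{FENCE}"})
    if image_changed and image_b64:
        content.append({"type": "text", "text": "Here is a screenshot of my whiteboard"})
        content.append(
            {"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{image_b64}"}}
        )
    return content


# First turn: text plus freshly edited code gives two parts.
print(len(build_content("My idea is a hash map.", "x = 1", True, None, False)))
# Later turn: nothing changed and no text, so there is nothing to send.
print(build_content("", "x = 1", False, None, False))
```

An empty result corresponds to the branch where the app asks the user to provide input instead of invoking the graph.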

In [28]:
def chat_interface_update(user_input: str, chat_state: dict) -> Tuple:
    """
    Handles user input, runs the LangGraph, and returns the AI response and updated state.
    """
    print(f"\n--- Gradio Interaction Start ---")
    REPORT_FILENAME = "candidate_report.md"

    # Check if state exists and initialize it if not
    if not chat_state or 'messages' not in chat_state:
        chat_state = {
            "messages": [], 
            "question": "Problem has not been selected yet", 
            "code": "# Your code", 
            "code_changed": False, 
            "image_changed": False,
            "finished": False
        }
    
    current_messages: List[BaseMessage] = chat_state.get('messages', [])
    current_problem: str = chat_state.get('question', "Problem has not been selected yet")
    current_code: str = chat_state.get('code', "# Your code")
    current_image: Optional[PIL.Image.Image] = chat_state.get('image', None)
    code_changed: bool = chat_state.get('code_changed', False)
    image_changed: bool = chat_state.get('image_changed', False)
    finished: bool = chat_state.get('finished', False)

    # If the interview is finished, display a predefined message and show the Download report button
    if finished:
        print("Interview already marked as finished.")
        final_message = "Interview is finished. Your report is ready for download below."
        return (
            gr.update(value=final_message),    # output_text
            gr.update(),                       
            gr.update(),                       
            chat_state,                        
            gr.update(visible=True),
            gr.update(visible=False),
        )

    # Initializing content to pass text, code and image (if included)
    content: List[Dict[str, Any]] = []
    
    if user_input:
        content.append({"type": "text", "text": user_input})
    
    if code_changed:
        content.append({"type": "text", "text": f"\nCode:```python\n{current_code}\n```"})

    if image_changed:
        # Get converted to base64 image
        base64_image: str = encode64_pil_image(current_image)
        content.append({"type": "text", "text": "Here is a screenshot of my whiteboard"})
        content.append({"type": "image_url", "image_url": {"url": f"data:image/jpeg;base64,{base64_image}"}})

    # Check for input
    if content:
        current_messages.append(HumanMessage(content=content))
        chat_state['code_changed'] = False # Reset flags
        chat_state['image_changed'] = False
    elif not current_messages: # Allow initial empty message
        pass
    else: # No new input, code, or image change
        print("No new input, code, or image change.")
        ai_response_content: str = "Please provide input, type code, or draw on the whiteboard."
        return (
            gr.update(value=ai_response_content),
            gr.update(value=current_problem),
            gr.update(value=current_code),
            chat_state,
            gr.update(visible=False),
            gr.update(),
        )

    graph_input: Dict[str, Any] = {"messages": current_messages, "question": current_problem, "code": current_code} # , "image": current_image}
    
    try:
        new_state: InterviewState = interviewer_graph.invoke(graph_input)
    except Exception as e:
        print(f"Error invoking interviewer graph: {e}")
        ai_response_content = f"An error occurred: {e}"
        return (
            gr.update(value=ai_response_content), 
            gr.update(value=current_problem),     
            gr.update(value=current_code),        
            chat_state,                           
            gr.update(visible=False),
            gr.update(),
        )

    ai_response_content: str = ""
    interview_finished_this_turn: bool = new_state.get('finished', False)
    if new_state and 'messages' in new_state and new_state['messages']:
        # Find the last AI message to display
        for msg in reversed(new_state['messages']):
            if isinstance(msg, AIMessage):
                # Sanitize the AI message (in some cases the model returns its answer as a list)
                ai_response_content = msg.content if isinstance(msg.content, str) else msg.content[0]
                break
            elif isinstance(msg, ToolMessage):
                if msg.name == "end_interview" and interview_finished_this_turn:
                    ai_response_content = "Thank you for your time. Interview is finished. Reload to start a new one."
                    break
        if not ai_response_content:
            ai_response_content = "Tool executed. Waiting for next step."

    else:
        ai_response_content = "Error: No messages found in the new state."


    if interview_finished_this_turn:
        ai_response_content += "\n\n**Interview finished! Your report is ready for download below.**" # Append the notice
        return (
            gr.update(value=ai_response_content),
            gr.update(visible=False),
            gr.update(visible=False),
            new_state,
            gr.update(visible=interview_finished_this_turn),
            gr.update(visible=False),
        )

    # Update state variables
    updated_problem_desc: str = new_state.get('question', current_problem)
    updated_problem_code: str = new_state.get('code', current_code)

    # Create the final state dictionary to return to Gradio
    final_chat_state: Dict[str, Any] = new_state.copy()
    final_chat_state['code_changed'] = False
    final_chat_state['image_changed'] = False

    print(f"Interview Finished State: {interview_finished_this_turn}")
    print(f"--- Gradio Interaction End ---")

    return (
        gr.update(value=ai_response_content),  
        gr.update(value=updated_problem_desc),
        gr.update(value=updated_problem_code),
        final_chat_state,
        gr.update(visible=interview_finished_this_turn),
        gr.update(),
    )
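
The sanitizing step above keeps only the first element when the model returns a list. A slightly more defensive variant is sketched below. It assumes that list elements are either plain strings or dictionaries of the form `{"type": "text", "text": ...}`, and `content_to_text` is a hypothetical helper name, not part of LangChain.

```python
from typing import Any, Dict, List, Union

Content = Union[str, List[Union[str, Dict[str, Any]]]]


def content_to_text(content: Content) -> str:
    """Flatten message content into a single display string."""
    if isinstance(content, str):
        return content
    parts: List[str] = []
    for block in content:
        if isinstance(block, str):
            parts.append(block)
        elif isinstance(block, dict) and block.get("type") == "text":
            parts.append(str(block.get("text", "")))
        # Non-text blocks (for example images) are skipped for display.
    return "\n".join(parts)


print(content_to_text("Let's begin."))
print(content_to_text(["Good start.", {"type": "text", "text": "What is the complexity?"}]))
```

Joining all text blocks avoids silently dropping the rest of a multi-part reply, and an empty list yields an empty string, so the existing "Tool executed" fallback would still apply.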

In [29]:
# Function to initialize the chat when the Gradio app loads
def init_chat() -> Tuple:
    START_FROM_END: bool = False # For debugging purposes. Requires test_state to be initialized at the bottom of the notebook
    print("\n--- Gradio App Load ---")
    initial_problem_text: str = "Problem has not been selected yet"
    initial_problem_code: str = "# Your code"
    initial_state_dict: Dict[str, Any]

    if START_FROM_END: # For debugging purposes. Requires test_state to be initialized at the bottom of the notebook
        initial_state_dict = {"messages": test_state["messages"], 
                          "question": test_state["question"], 
                          "code": test_state["code"],
                          "code_changed": False,
                          "image_changed": False,
                          "finished": False
                         }
    else:
        initial_state_dict = {"messages": [], 
                              "question": initial_problem_text, 
                              "code_changed": False,
                              "image_changed": False,
                              "finished": False
                             }
    
    
    # Run the graph with empty messages to get the initial welcome message
    initial_state: InterviewState = interviewer_graph.invoke(initial_state_dict)
    welcome_message: str = "Error: Could not get initial message."
    if initial_state and initial_state.get('messages'):
        welcome_message = initial_state['messages'][-1].content

    problem_desc_on_load: str = initial_state.get('question', initial_problem_text)
    problem_code_on_load: str = initial_state.get('code', initial_problem_code)

    return (
        welcome_message, 
        problem_desc_on_load, 
        problem_code_on_load, 
        initial_state if initial_state else {}, 
        gr.update(visible=False)
    )

In [30]:
def mark_code_as_changed(current_state: Dict[str, Any], input_code: str) -> Dict:
    """Update code and code_changed in state when code changes"""
    if not current_state:
        current_state = {}
    current_state['code_changed'] = True
    current_state['code'] = input_code
    return current_state


def mark_image_as_changed(current_state: Dict[str, Any], im: Dict[str, Any]) -> Dict:
    """Update image and image_changed in state when image changes"""
    if not current_state:
        current_state = {}
    current_state['image_changed'] = True
    current_state['image'] = im["composite"]
    return current_state

### 6.2. UI Components 🎨🧱
The interface consists of:
*   **Problem Display (`gr.Markdown`):** Shows the description of the currently selected coding problem.
*   **Interviewer Response (`gr.Markdown`):** Displays the AI interviewer's messages.
*   **User Input (`gr.Textbox`):** A multi-line text box for the candidate to type their responses.
*   **Code Input (`gr.Code`):** A code editor with Python syntax highlighting where the candidate can write and submit their code solution.
*   **Whiteboard (`gr.Sketchpad`):** Allows the candidate to draw diagrams or pseudocode to explain their thinking visually.
*   **Send Button (`gr.Button`):** Submits the user's text, code (if changed), and whiteboard sketch (if changed) to the backend.
*   **Download Report Button (`gr.DownloadButton`):** Appears only after the interview is finished, allowing the user to download the generated Markdown evaluation report (`candidate_report.md`).

In [37]:
with gr.Blocks() as demo:
    gr.Markdown("## Mock Technologies Inc. - Technical Interviewer")
    # Store LangGraph state here
    chat_state = gr.State({}) # Initialize as an empty dictionary

    problem_display = gr.Markdown("Problem has not been selected yet", container=True, label="Selected Problem")
    output_text = gr.Markdown(label="Interviewer Response", container=True) # For AI messages

    with gr.Row():
        with gr.Column(scale=5):
            input_text = gr.Textbox(label="Your message", placeholder="Type your message here...", lines=5)
            
        with gr.Column(scale=5):
            input_code = gr.Code("# Your code", label="Code", language="python", interactive=True, lines=7)
            input_code.input(mark_code_as_changed, inputs=[chat_state, input_code], outputs=[chat_state], show_progress="hidden")
            
        with gr.Column(scale=1):
            submit_button = gr.Button("Send Message")

    im = gr.Sketchpad(
        label='Whiteboard',
        type="pil",
        width=600,
        height=600,
        canvas_size=(600, 600),
        layers=gr.LayerOptions(disabled=True),
    ) 

    im.input(mark_image_as_changed, inputs=[chat_state, im], outputs=[chat_state], show_progress="hidden")

    # Download Button - initially hidden
    download_report = gr.DownloadButton(
        label="Download Interview Report",
        value="candidate_report.md", 
        visible=False,
    )

    # Define the list of components to update on submit button click
    submit_outputs = [
        output_text,
        problem_display,
        input_code,
        chat_state,
        download_report,
        im
    ]

    # Define outputs for demo.load (must match init_chat return)
    load_outputs = [
        output_text,
        problem_display,
        input_code,
        chat_state,
        download_report
    ]

    # Load initial message
    demo.load(init_chat, inputs=None, outputs=load_outputs)

    submit_button.click(
        chat_interface_update,
        inputs=[input_text, chat_state],
        outputs=submit_outputs
    )
    # Clear input textbox after submit
    submit_button.click(lambda: "", outputs=[input_text])

### ❗ **IMPORTANT!**

The app built in this notebook uses Gradio's `launch` function to run the UI. That call is commented out so that you can use the `Run all` feature without interruption. At the end of this notebook you will need to uncomment the `demo.launch(...)` call in order to interact with the app.

If you wish to save a version of this notebook with `Save and Run all`, you will need to **re-comment** the line you commented out to ensure that the notebook can run without human input.

In [None]:
# demo.launch(share=True) # Added share=True for Kaggle

Uncomment and run this to close the gradio session.

In [39]:
# demo.close()

Closing server running on port: 7860


## 🖼️ 7. Examples



In [None]:
from IPython.display import Image, display



## 🏁 8. Conclusion & Future Work

### Conclusion 
This project successfully demonstrates the creation of a sophisticated, AI-powered Mock Technical Interviewer using Google Gemini and LangGraph. By integrating features like stateful agentic behavior, function calling, structured output, multimodal input (text, code, image), and grounded recommendations, it provides a valuable and interactive tool for developers preparing for technical interviews. The detailed evaluation report and personalized learning plan offer actionable feedback to help users identify strengths and weaknesses.

### Potential Future Work
*   **More Question Types:** Expand `data.json` to include system design questions, behavioral questions, or questions for different roles and levels.
*   **Voice Interaction:** Integrate speech-to-text and text-to-speech for a more natural conversational experience.
*   **Code Execution & Testing:** Allow the agent to actually run the candidate's code against test cases and provide feedback on correctness and performance.
*   **More Sophisticated Evaluation:** Fine-tune the evaluation prompt or use multiple LLM calls for deeper analysis of specific aspects (e.g., code quality metrics, complexity analysis).
*   **Persistence:** Allow users to save and resume interview sessions.
*   **Different Interviewer Personas:** Offer options for different interviewer styles (e.g., more challenging, more collaborative).

    