
# 🔍 Introduction
---
**JobBridge-AI** is a research-driven UX and AI Capstone project developed as part of the *Gen AI Intensive Course*.<br>
It tackles real-world challenges faced by foreigners seeking employment in Japan — challenges I personally experienced while navigating the Japanese job market.<br>

Motivated by those experiences, I built JobBridge-AI to explore user pain points and design smarter, more supportive tools.<br>
By leveraging **Generative AI** technologies like Google Gemini, LangChain, and LangGraph, the system delivers personalized résumé feedback, UX advice, and culturally-aware career suggestions based on user questions and uploaded CVs.

# 🗻 Why Japan
---
Japan is a unique job market with strong local hiring customs, a high emphasis on language ability, and limited resources tailored to non-Japanese speakers.<br>
For many foreigners, from students and new graduates to mid-career professionals, the job search process can feel confusing and isolating.<br>

Many job seekers face barriers including language proficiency (N2/N1), lack of sponsorship, and mismatch with recruiter expectations.<br>
They struggle to navigate Japanese job platforms, understand resume formats (like 履歴書 and 職務経歴書), or prepare for bilingual interviews that blend keigo and technical Q&A.<br>

While Japan offers exciting career opportunities for global talent, the path is not always clear.<br>
Language barriers, unfamiliar application standards, and cultural differences can leave even highly qualified candidates feeling lost or overlooked.<br>

With a growing number of international students and foreign professionals entering Japan each year, there's an urgent need for smarter, more inclusive support systems.<br>  
**JobBridge-AI** was created to help bridge that gap by combining generative AI with UX insights focused on real-world job search struggles.<br>

# 🎯 Project Goals
---
The goal of JobBridge-AI is to empower foreign job seekers in Japan with smarter, more empathetic support during the job-hunting process.

### 🎯 Key Objectives:
- 🔍 **Identify** common frustrations in the Japanese job market through real UX research
- 🧠 **Leverage Generative AI** to offer resume rewrites, career suggestions, and feedback
- 🌐 **Bridge the language gap** with bilingual AI outputs and simplified UX
- 🤖 **Build a working chatbot prototype** using LangGraph, LangChain, and Gemini

# 👕 Target Audience
---
**Target Group for This Project**<br>
This project focuses on foreign residents in Japan who face unique challenges in the job search process.<br>
Based on our research goals, we identified the following key user segments:

- **International Students & Language School Learners**<br>
Often on student visas, these users are actively job hunting while balancing Japanese language classes. Many aim to transition into full-time roles in tech, hospitality, or creative fields.

- **English Teachers Seeking Career Change**<br>
Many participants reported being "stuck" in ALT or eikaiwa roles, despite having skills or experience in other industries. They often struggle to move out of education due to hiring bias and unclear career pathways.

- **Highly Skilled Visa Holders or Employees**<br>
This group includes engineers, designers, or professionals on work or spouse visas. Despite strong credentials, many are overlooked due to language level, visa type, or cultural mismatch in hiring expectations.

To ensure diverse and inclusive insights, we aimed to conduct UX research across these groups,<br>
covering a broad range of Japanese proficiency levels (N5 to N1) and multiple industries, including tech, hospitality, education, and creative sectors.

# 🧠 Gather User Insights
---
To understand the challenges faced by foreign job seekers in Japan, we planned a mixed-method UX study targeting over 100 foreign professionals across multiple industries.<br>
The research included both a bilingual UX survey and in-depth interviews, conducted in English and Japanese.

Despite time constraints, we successfully collected responses from **30+ foreign participants** via survey and conducted **10 individual interviews**.<br>
These insights became the foundation of the assistant's retrieval data, grounding its advice in real-world frustrations.


### 📋 1. Bilingual Survey (Google Form)
A bilingual survey (English/Japanese) was distributed to foreign job seekers in Japan.  
The survey explored themes such as language proficiency, visa challenges, recruiter experiences, and job search pain points.

**Top 5 reported pain points:**
1. 🧱 **Language barrier** (especially N2+ requirement)
2. ⚔️ **Competition with local candidates**
3. 🔍 **Difficulty identifying suitable jobs** 
4. 🤝 **Limited networking opportunities**
5. 🌐 **Cultural differences in interviews**

**Status**: ✅ Survey completed (30+ respondents)  
**Insights Summary**:  
There’s a clear demand for tools that decode job descriptions, tailor resumes, and support bilingual candidates more effectively.


### 🗣️ 2. Individual Interviews (Google Meet)
In-depth interviews were conducted in English and Japanese to explore pain points in greater depth.  
Originally, 20+ interviews were planned, but we successfully completed 10 within the project timeline.

These sessions captured rich personal experiences and validated many of the survey's trends.

**Key insights:**
- Conflicting or vague recruiter feedback is common.
- Several reported unethical hiring practices (e.g., unpaid “test” tasks).
- Networking and insider referrals are seen as essential — yet inaccessible for most foreigners.

**Status**: ✅ Interview sampling completed (10 participants)  
**Insights Summary**:  
The job hunt is not only about language or formatting. It’s also about **clarity, trust, and fairness**.<br>
This underscores the need for tools that are not only smart — but also **empathetic** and culturally aware.


Data from both the survey and the interviews was compiled into a bilingual JSON dataset (`ux_survey.json`) and integrated into the RAG pipeline via:<br>
`ux_insights → job_advice_seeds → job_advice_node`
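
As a rough illustration of the first step in that pipeline, the sketch below turns survey records into the text snippets that are later embedded. The column names match those used in section [5]; the two sample records and the helper name `to_insight` are made up for illustration and are not taken from the real dataset.

```python
from typing import List, Optional

# Hypothetical sample records; the real answers live in ux_survey.json.
sample_records = [
    {
        "Please describe specific obstacles": "Interviews were entirely in Japanese.",
        "Current Visa Status": "Student",
        "Your Japanese Language Proficiency Level": "N3",
    },
    {
        "Please describe specific obstacles": None,  # respondent skipped the question
        "Current Visa Status": "Work",
        "Your Japanese Language Proficiency Level": "N2",
    },
]


def to_insight(record: dict) -> Optional[str]:
    """Format one survey answer as a retrievable snippet, or None if it is empty."""
    obstacle = record.get("Please describe specific obstacles")
    if not obstacle:
        return None
    visa = record.get("Current Visa Status", "unknown")
    jp_level = record.get("Your Japanese Language Proficiency Level", "unknown")
    return f"{obstacle} (Visa: {visa}, JP: {jp_level})"


# Keep only records that actually describe an obstacle.
ux_insights_demo: List[str] = [
    snippet for snippet in map(to_insight, sample_records) if snippet is not None
]
print(ux_insights_demo)
```

Attaching visa status and language level to each snippet keeps that context available when the snippet is retrieved on its own.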

# 🧑‍🎨 User Personas
Persona Media link [job-bridge-media](https://www.kaggle.com/datasets/nattaveelaws/job-bridge-media/)

---

## **Figure:** Persona #1 – Aya Suwanisan<br>
Age: 22<br>
VISA Type: Student Visa<br>
Current Status: Studying Japanese language, Working part-time at an izakaya<br>
Language: Japanese N3 (Intermediate), English Fluent<br>
Target Roles: Full-time hospitality<br><br>
### “It feels like no one will even look at your profile if you’re not fluent, even if the job says 'English OK'.”<br>
### Goals 
- To land her first full-time job in a restaurant that values work ethic and training, not just JLPT scores
- To build confidence interacting with Japanese customers and co-workers
- To transition from part-time back-of-house staff to a customer-facing or managerial role

### Frustrations
- Job posts advertise “English OK,” but interviews are conducted entirely in Japanese
- She’s unsure what to emphasize in a Japanese-style resume or 自己PR
- Feels discouraged when she’s ghosted after interviews or told she’s “not fluent enough”

### Persona Summary
Aya is a Thai student in Osaka studying Japanese while working part-time at an izakaya.<br>
She hopes to land a full-time role in Japan’s food service industry, but finds “English OK” job posts misleading and her N3 Japanese is seen as insufficient.<br>
She’s eager to grow, but struggles with unclear entry paths into stable hospitality roles.

---

## **Figure:** Persona #2 – Daniel Thompson<br>
### Daniel Thompson
Age: 31<br>
VISA Type: Work Visa<br>
Current Status: Full-time English teacher in a public elementary school<br>
Language: Japanese N2 (Fluent), English Native<br>
Target Roles: Content Creation / Creative Tech<br><br>
### “Once you become a teacher in Japan, they can’t see you as anything else, even if your resume says content creator.”<br>

### Goals 
- To pivot from teaching to a full-time role in content or creative tech
- To use his Japanese fluency and media background to create meaningful cross-cultural content
- To build a career path beyond ALT work and grow into leadership in a creative environment
  
### Frustrations
- ALT experience overshadows his previous media work. Recruiters don’t read past the first line
- Job posts claim “English OK” but interviews are entirely in Japanese and high-context
- Lack of clear criteria: portfolios are ignored unless hosted on Japanese platforms
### Persona Summary
Daniel is a New Zealand media professional working as an ALT in Tokyo.<br>
With a background in scriptwriting and fluent Japanese (N2), he came to Japan to pursue a career in creative content.<br>
But visa routes pushed him into teaching. Now three years in,<br>
he’s rebuilding his portfolio in Japanese and looking for companies that value cross-cultural creativity, not just teaching experience.

---
## **Figure:** Persona #3 – Amira Elbaz<br>
### Amira Elbaz
Age: 39<br>
VISA Type: Dependent Visa (Spouse of Permanent Resident)<br>
Current Status: Freelancing remotely while applying to tech roles<br>
Language: Japanese N4, English Fluent<br>
Target Roles: Backend Developer / QA / Infrastructure<br><br>
### “My experience is solid, but no one sees it because I’m on a spouse visa and speak only N4.”<br>
### Goals 
- To re-enter the tech industry in Japan after moving for family
- To use her 10+ years of backend development experience
- To join a diverse team that values real experience over paperwork

### Frustrations
- Gets filtered out for not having N2, even for roles labeled “English OK”
- Recruiters focus on her visa instead of her résumé
- Job boards push her toward teaching or retail, ignoring her skillset
### Persona Summary
Amira is a seasoned backend engineer from Egypt, now living in Saitama on a spouse visa.<br>
Despite 10+ years of experience, her limited Japanese and visa status often lead recruiters to overlook her.<br>
She’s now refining her GitHub and resume in Japanese, aiming to join a tech team that values skills over status.



# 🛤️ User Journeys
---
1. User uploads a resume PDF.
2. The system parses it into structured fields (name, education, experience...).
3. The user can ask the bot to:
   - Rewrite the resume for a specific company
   - Get job advice based on their situation
   - Ask for alternative roles
4. The assistant responds using:
   - Few-shot prompts (for CV)
   - RAG (for advice)
   - LLM generation (for jobs)

# 🚀 Features Implemented
---
JobBridge-AI integrates multiple generative AI capabilities to support the user journey, from CV preparation to career advice.<br>
Each feature is designed to address specific pain points identified through UX research and interviews.<br><br>

| 🧩 Feature | 🧠 Capability | 🔧 Method |
|-----------|---------------|-----------|
| **CV Parsing** | Resume document understanding | Gemini LLM + PDF parsing pipeline |
| **Rewrite 自己PR & 志望動機** | Few-shot + controlled generation | Prompt-based rewrite using user-uploaded CVs |
| **UX-Driven Advice** | Retrieval-Augmented Generation (RAG) | Embedding search over user pain points (vector DB) |
| **Recommend Alternative Jobs** | Career guidance from similar profiles | Custom retriever using ChromaDB + Gemini |
| **Conversational Agent** | Chatbot interface with smart routing | LangGraph-based flow control and input intent detection |

# 🧰 Capabilities Demonstrated
---
This project demonstrates multiple GenAI capabilities as required by the Kaggle x Google Capstone Challenge.<br>
Each technique was integrated intentionally to solve a user-validated pain point.

| Capability | Description | Example in Project |
|------------|-------------|--------------------|
| **Document Understanding** | Reads and extracts content from uploaded résumés (PDF) | CV parsing via Gemini |
| **Few-shot Prompting** | Controlled rewriting of 自己PR and 志望動機 | Uses examples to generate company-tailored versions |
| **Retrieval-Augmented Generation (RAG)** | Embeds and retrieves pain point data for contextual UX advice | Embedding + ChromaDB |
| **Dynamic Routing via LangGraph** | Multi-node chatbot routing based on user intent | LangGraph nodes: parse, tailor, ux_advice, alt_jobs |
| **Multilingual Input/Output** | Accepts and generates both English and Japanese content | All core features are bilingual |

# 🎨 UX Accessibility Considerations
---
**JobBridge-AI** was designed with empathy for real users navigating the Japanese job market — many of whom face barriers related to language, culture, and unfamiliar application systems.  
Our design was grounded in insights from bilingual UX surveys and user interviews.

---

### 🧩 Accessibility Features
- **Multilingual Support:** Accepts both English and Japanese inputs, with bilingual outputs.
- **Automatic Resume Parsing:** Users can upload a résumé PDF without formatting knowledge — the system parses and extracts content automatically.
- **Low Barrier to Entry:** No need for prior familiarity with Japanese resume formats like 履歴書 or 職務経歴書.

---

### 🫂 Inclusive Design Principles
- Supports users across all JLPT levels (N5 to N1)
- Accepts open-ended queries (e.g., “fix my PR” or “what job suits me?”)
- Transparent interactions: each AI step (e.g., tailoring, advice generation) is printed with clear system labels — no "black box" behavior.

---

### ⚠️ Known Limitations
- ❌ No support for audio input or OCR (camera-captured image CVs) at this time
- ❌ Mobile usability is constrained due to the Kaggle Notebook interface
- ❌ Does not yet handle handwritten resumes (手書き履歴書), which some companies still request in Japan


# 🧪 Usability Testing
---
### Planned Flow (Originally Intended)
1. Build a working prototype using Gemini 2 Flash to analyze user CVs.  
2. Invite users to test and provide feedback on language clarity, output relevance, and overall usefulness.  
3. Collect qualitative feedback via a short follow-up form or interview.  
4. Iterate based on input and re-test improved versions.

---

### 🛠 Tools (Prepared)
- **Kaggle Notebook**: Used as the user-facing interface for rapid prototyping.  
- **Gemini API**: Powers CV understanding, RAG, and resume rewriting logic.  

---

### ❌ Status: Testing Not Completed  
Due to time limitations during the Capstone Challenge window, we were unable to run formal usability testing sessions.  
Our team prioritized core system functionality, integration, and UX-driven logic design.

---

### ⚠️ Known Gaps
- No structured user feedback has been collected yet.  
- Improvement areas remain speculative, based on survey expectations rather than real session logs or usage data.

---

**Next steps (post-capstone):**  
We plan to conduct lightweight usability testing with foreign job seekers currently in Japan to validate and refine prompt flows, language accessibility, and overall interaction clarity.

# Let's start
In this sample, we will use the uploaded resume of Daniel Thompson, one of our personas.  
Media link [job-bridge-media](https://www.kaggle.com/datasets/nattaveelaws/job-bridge-media/)

# [1] Library Installation & Environment Setup
---
Installs the core libraries required for this project, including Gemini, LangGraph, LangChain, and ChromaDB.  
PDF parsing is handled via `PyPDF2`.  
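The Gemini SDK packages and LangGraph are pinned to specific versions for compatibility within the Kaggle environment; the remaining packages are installed at their latest available versions.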

In [1]:
# clean up any pre‑installed copies
!pip uninstall -qqy google-generativeai google-ai-generativelanguage

# system OCR package
# !apt-get -qq update && apt-get -qq install -y tesseract-ocr

# Python libs  ── note the explicit 0.8.4 / 0.6.15 pair, and upgrade langgraph
!pip install -U \
    google-generativeai==0.8.4 \
    google-ai-generativelanguage==0.6.15 \
    langgraph==0.3.30 \
    langchain langchain-community langchain-google-genai \
    chromadb PyPDF2 pandas

print("✅ Libraries installed (google-generativeai 0.8.4 / google-ai-generativelanguage 0.6.15, langgraph 0.3.30)")

Collecting google-generativeai==0.8.4
  Downloading google_generativeai-0.8.4-py3-none-any.whl.metadata (4.2 kB)
Collecting google-ai-generativelanguage==0.6.15
  Downloading google_ai_generativelanguage-0.6.15-py3-none-any.whl.metadata (5.7 kB)
Collecting langgraph==0.3.30
  Downloading langgraph-0.3.30-py3-none-any.whl.metadata (7.7 kB)
Collecting langchain
  Downloading langchain-0.3.23-py3-none-any.whl.metadata (7.8 kB)
Collecting langchain-community
  Downloading langchain_community-0.3.21-py3-none-any.whl.metadata (2.4 kB)
Collecting langchain-google-genai
  Downloading langchain_google_genai-2.1.3-py3-none-any.whl.metadata (4.7 kB)
Collecting chromadb
  Downloading chromadb-1.0.5-cp39-abi3-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (6.9 kB)
Collecting PyPDF2
  Downloading pypdf2-3.0.1-py3-none-any.whl.metadata (6.8 kB)
Collecting langgraph-checkpoint<3.0.0,>=2.0.10 (from langgraph==0.3.30)
  Downloading langgraph_checkpoint-2.0.24-py3-none-any.whl.metadata (4.6 kB)


# [2] Core Library Imports & Version Check
---
Imports essential libraries for parsing, RAG, LangGraph flow, and Gemini interaction.  
Also verifies library versions to ensure compatibility across environments.

In [2]:
# Core libraries
import importlib.metadata as im
import json
import logging
import os
import re
import textwrap
from typing import TypedDict

import google.generativeai as genai
import langchain
import langgraph
import pandas as pd
import PyPDF2

# LangChain / LangGraph components
from langchain_google_genai import (
    ChatGoogleGenerativeAI,
    GoogleGenerativeAIEmbeddings,
)
from langchain_community.vectorstores import Chroma  # replaces the deprecated langchain.vectorstores path
from langchain.chains import RetrievalQA
from langgraph.graph import StateGraph, END

# Kaggle secrets (API key storage)
from kaggle_secrets import UserSecretsClient


logging.basicConfig(level=logging.INFO, format="%(message)s")
logger = logging.getLogger(__name__)

print("Versions")
for pkg in (
    "google-generativeai",
    "google-ai-generativelanguage",
    "langchain",
    "langchain-google-genai",
    "langgraph",
    "langchain-core",
):
    print("   •", pkg, ":", im.version(pkg))
print("✅ Versions checked")

Versions
   • google-generativeai : 0.8.4
   • google-ai-generativelanguage : 0.6.15
   • langchain : 0.3.23
   • langchain-google-genai : 2.0.10
   • langgraph : 0.3.30
   • langchain-core : 0.3.54
✅ Versions checked


# [3] API Configuration & Gemini Initialization
---
Sets up the Gemini 1.5 Flash API for text generation.  
Credentials are securely handled via Kaggle’s environment.  
Also configures model endpoints used in resume rewriting and RAG chains.

In [3]:
# Set up API key
secrets = UserSecretsClient()
API_KEY = secrets.get_secret("JRAA_Gemini_API")

genai.configure(api_key=API_KEY)
os.environ["GOOGLE_API_KEY"] = API_KEY          # optional, but handy for other libs

print("✅ API key configured")

llm_chat  = ChatGoogleGenerativeAI(model="gemini-1.5-flash")   # LangChain wrapper
llm_flash = genai.GenerativeModel("gemini-1.5-flash")          # Direct SDK handle

print("✅ Gemini 1.5 Flash ready")

✅ API key configured
✅ Gemini 1.5 Flash ready


# [4] PDF Text Extraction
---
Extracts resume content from PDF files using `PyPDF2`.  
OCR fallback via Tesseract was planned but not used in the final version due to parsing reliability.

In [4]:
def extract_text_from_pdf(path: str) -> str:
    """
    Extracts all text from each PDF page using PyPDF2.
    Does not perform any OCR fallback.
    """
    reader = PyPDF2.PdfReader(path)
    return "\n".join(page.extract_text() or "" for page in reader.pages)
print("✅ PyPDF2.PdfReader")

✅ PyPDF2.PdfReader
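
Because there is no OCR fallback, a scanned or image-only PDF yields little or no text. A minimal sketch of a guard that could flag this case before the text reaches Gemini is shown below. The helper name `looks_like_scanned_pdf` and the 50-character threshold are assumptions for illustration, not part of the notebook.

```python
from typing import List


def looks_like_scanned_pdf(page_texts: List[str], min_chars_per_page: int = 50) -> bool:
    """Heuristic: True when the average extracted text per page is very short.

    `page_texts` holds one extracted string per page, e.g. the values that
    `page.extract_text() or ""` produces in extract_text_from_pdf.
    """
    if not page_texts:
        return True  # no pages at all means there is nothing usable
    total_chars = sum(len(text.strip()) for text in page_texts)
    return total_chars / len(page_texts) < min_chars_per_page


# Example with made-up page contents
print(looks_like_scanned_pdf(["", "   "]))               # image-only pages
print(looks_like_scanned_pdf(["Work experience " * 20])) # text-based page
```

A check like this lets the notebook tell the user that the file could not be read, rather than passing an empty string to the parsing prompt.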


# [5] UX Insight Embeddings & RAG Setup
---
Loads bilingual UX survey data and converts it into embeddings using Gemini.  
The insights are stored in ChromaDB and used to generate personalized job advice via Retrieval-Augmented Generation (RAG).

In [5]:
# Load UX Survey + Seed Advice
df = pd.read_json("/kaggle/input/ux-survey-career-japan/ux_survey.json")
ux_insights = df.dropna(subset=["Please describe specific obstacles"])\
                .apply(lambda r: f"{r['Please describe specific obstacles']} "
                                f"(Visa: {r['Current Visa Status']}, JP: {r['Your Japanese Language Proficiency Level']})",
                       axis=1).tolist()


# Define seeds
job_advice_seeds = [
    "In Japan, it's important to include your JLPT level and photo on your resume.",
    "Recruiters expect short and formal 自己PR statements. Don't write more than 300 characters.",
    "N4 is acceptable for entry-level or technical roles, but more companies prefer N3 or above.",
    "If you want to work in tech, learn some basic business Japanese (keigo expressions like お世話になります).",
    "Try mixing Japanese and English job boards to maximize exposure.",
    "Many companies in Japan follow fixed hiring cycles (like April or October). Timing your application can improve response rates.",
    "Japanese interviews often include questions like 'Why Japan?' or 'What do you know about our company?' Be prepared to answer these.",
    "Avoid vague or overly casual expressions in 自己PR. Japanese recruiters value humility and clear structure."
] + ux_insights
# Setup RAG chain
emb = GoogleGenerativeAIEmbeddings(model="models/embedding-001")

# The number of results is passed to the retriever through search_kwargs
job_advice_retriever = Chroma.from_texts(
    job_advice_seeds,
    embedding=emb
).as_retriever(search_kwargs={"k": 3})

rag_chain = RetrievalQA.from_chain_type(
    llm=llm_chat,
    retriever=job_advice_retriever,
    return_source_documents=False
)

print("✅ RAG chain ready with", len(job_advice_seeds), "snippets")

✅ RAG chain ready with 24 snippets
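
To make the retrieval step more concrete, here is a conceptual sketch of top-k similarity search over toy vectors. It is not how Chroma is implemented (Chroma uses its own index and distance settings), and the three-dimensional vectors are hand-made stand-ins for real Gemini embeddings.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def top_k(query_vec: Sequence[float],
          docs: List[Tuple[str, Sequence[float]]],
          k: int = 3) -> List[str]:
    """Return the k document texts whose vectors are most similar to the query."""
    ranked = sorted(docs, key=lambda doc: cosine(query_vec, doc[1]), reverse=True)
    return [text for text, _ in ranked[:k]]


# Toy corpus: (text, made-up embedding)
toy_docs = [
    ("Include your JLPT level on your resume.", [1.0, 0.0, 0.0]),
    ("Hiring cycles peak in April and October.", [0.0, 1.0, 0.0]),
    ("Learn basic keigo for interviews.", [0.7, 0.0, 0.7]),
]

print(top_k([1.0, 0.0, 0.2], toy_docs, k=2))
```

In the RAG chain, the retrieved snippets are placed into the prompt so that the model answers with that context in view.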


# [6] Tool Functions

## [6.1] CV Parsing & JSON Extraction
---
Uses Gemini to extract structured fields from uploaded Japanese résumés.  
Returns a JSON object containing name, education, skills, certifications, 自己PR, and more.

In [6]:
def parse_cv_node(cv_path: str) -> dict:
    print("[Outside loop] Running parse_cv_node")
    raw_cv = extract_text_from_pdf(cv_path)
    prompt = (
        "Extract the following fields and return ONLY valid JSON:\n"
        "- name, dob, nationality, address, phone, email,\n"
        "- education (list of strings), work_experience, certifications, skills,\n"
        "- self_pr, motivation\n\n"
        + raw_cv
    )
    out = llm_flash.generate_content(prompt).text.strip()
    m = re.search(r"\{.*\}", out, re.S)
    if not m:
        raise ValueError(f"JSON parse failed:\n{out}")
    parsed = json.loads(m.group(0))
    print("✅ Parsed CV JSON:")
    return {"raw_cv": raw_cv, "parsed_cv": parsed}

CV_PATH = "/kaggle/input/sample/Daniel-Thompson-Resume.pdf"
parsed = parse_cv_node(CV_PATH)
raw_cv, parsed_cv = parsed["raw_cv"], parsed["parsed_cv"]

[Outside loop] Running parse_cv_node
✅ Parsed CV JSON:
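
Models often wrap JSON in Markdown code fences or add a sentence before it, which is why `parse_cv_node` searches for the outermost braces. The sketch below isolates that step in a hypothetical helper, `extract_json_object`, and turns decoding failures into a single error type. The sample reply is invented for the demonstration.

```python
import json
import re


def extract_json_object(reply: str) -> dict:
    """Pull the outermost {...} span out of an LLM reply and parse it."""
    # Remove Markdown fence markers (three backticks, optionally followed by "json").
    cleaned = re.sub(r"`{3}(?:json)?", "", reply)
    match = re.search(r"\{.*\}", cleaned, re.S)
    if match is None:
        raise ValueError("No JSON object found in model reply")
    try:
        return json.loads(match.group(0))
    except json.JSONDecodeError as exc:
        raise ValueError(f"Model reply was not valid JSON: {exc}") from exc


# Invented reply that mimics a fenced JSON answer
fence = "`" * 3
sample_reply = (
    "Here is the extracted data:\n"
    + fence + "json\n"
    + '{"name": "Daniel Thompson", "skills": ["scriptwriting"]}\n'
    + fence
)
parsed_demo = extract_json_object(sample_reply)
print(parsed_demo["name"])
```

Raising one error type for both failure modes makes it simpler for a caller to retry the prompt or show a clear message.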


## [6.2.1] Tailor CV Node — Rewrite 自己PR & 志望動機
---
Generates formal Japanese self-promotion and motivation statements tailored to the target company.  
Uses few-shot prompting with structured resume input and optional company info.

In [7]:
def tailor_cv_node(state: dict) -> dict:
    import re

    # Utility: count Japanese (non-ASCII) characters
    def count_japanese_chars(s: str) -> int:
        return sum(1 for ch in s if ord(ch) > 127)

    company      = state["company"]
    raw          = state["raw_cv"]
    cv           = state["parsed_cv"]

    # Retrieve company info (fallback to LLM if not found)
    company_info = retrieve_company_info(company)
    if not company_info:
        company_info = llm_flash.generate_content(
            f"Provide the mission, values, and recent highlights of {company}."
        ).text.strip()

    # Build the rewrite prompt using structured rules
    prompt = (
        # Opening line, then no emojis or placeholders (f-string so {company} is filled in)
        f"Begin with one short English sentence, for example \"Here is your rewritten résumé.\" or \"Rewritten version aligned with {company}.\"\n"
        "After that, output ONLY the résumé content: no emojis, no “🤖” markers, no text in brackets. "
        "Use the actual names from the extracted fields. If a name is missing, substitute a generic term.\n\n"
        # Formatting rules
        "- Use Japanese résumé conventions (履歴書), formal tone.\n"
        "- Append English translations in parentheses immediately after each Japanese line.\n"
        "- List jobs in reverse‑chronological order with consistent bullet length.\n"
        "- 自己PRは200～300文字以内で作成してください。\n"
        "- 志望動機は200～300文字以内で作成してください。\n"
        "- Incorporate 貴社 into the 自己PR section to demonstrate respect.\n\n"
        # Rewrite instruction
        f"Rewrite the entire CV to align with {company}:\n\n"
        f"氏名: {cv['name']}\n"
        f"生年月日・国籍: {cv['dob']}／{cv['nationality']}\n"
        f"住所・連絡先: {cv['address']}／{cv['phone']}／{cv['email']}\n\n"
        "学歴:\n" + "\n".join(f"  * {e}" for e in cv["education"]) + "\n\n"
        "職務経歴:\n" + "\n".join(f"  * {e}" for e in cv["work_experience"]) + "\n\n"
        "資格:\n" + "\n".join(f"  * {c}" for c in cv["certifications"]) + "\n\n"
        "スキル:\n" + "\n".join(f"  * {s}" for s in cv["skills"]) + "\n\n"
        f"自己PR: {cv['self_pr']}\n\n"
        f"志望動機: {cv['motivation']}\n\n"
        # Company context
        f"※Company Info for {company}:\n{company_info}\n"
    )

    # Generate initial rewrite
    result = llm_flash.generate_content(prompt).text.strip()

    # Enforce 自己PR length (200–300 Japanese characters)
    match = re.search(r"自己PR:\s*(.+?)(?:\n\n|$)", result, re.S)
    if match:
        jiko_pr = match.group(1).strip()
        length  = count_japanese_chars(jiko_pr)
        if length < 200 or length > 300:
            followup = f"Your 自己PR is {length}文字です。200～300文字になるよう調整してください。"
            result   = llm_flash.generate_content(prompt + "\n\n" + followup).text.strip()

    return {**state, "result": result}
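
The length check above retries once. A bounded retry loop is one possible generalization, sketched below with a stub generator in place of Gemini so it runs offline. The function name `generate_within_length` and its parameters are hypothetical, and unlike `tailor_cv_node` this sketch measures the whole reply rather than only the 自己PR section.

```python
from typing import Callable


def count_non_ascii(text: str) -> int:
    """Approximate the Japanese character count by counting non-ASCII characters."""
    return sum(1 for ch in text if ord(ch) > 127)


def generate_within_length(
    generate: Callable[[str], str],
    prompt: str,
    min_len: int = 200,
    max_len: int = 300,
    max_attempts: int = 3,
) -> str:
    """Call `generate` until the reply length is in range or attempts run out."""
    reply = generate(prompt)
    for _ in range(max_attempts - 1):
        length = count_non_ascii(reply)
        if min_len <= length <= max_len:
            break
        # Tell the model how long the draft was and ask again.
        reply = generate(
            f"{prompt}\n\nThe draft was {length} characters; "
            f"adjust it to {min_len}-{max_len} characters."
        )
    return reply


# Stub generator: the first reply is too short, the second one fits.
replies = iter(["短い", "あ" * 250])
result = generate_within_length(lambda _prompt: next(replies), "自己PRを書いてください")
print(count_non_ascii(result))
```

Capping the attempts keeps API usage predictable; if the last attempt is still out of range, the caller receives it as-is and can decide what to do.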

## [6.2.2] Company Info Retrieval
---
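Looks up a short description of the target company in a small hard-coded table, which stands in for a future database or RAG lookup.  
`tailor_cv_node` is written to fall back to Gemini for the company's mission, values, and recent highlights when this lookup returns nothing. The resulting context is embedded into the CV rewriting prompt to improve personalization.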

In [8]:
def retrieve_company_info(company: str) -> str:
    # Placeholder lookup table; could be replaced by a database or RAG lookup later
    dummy_info = {
        "Capcom": "Capcom is a global video game developer known for Resident Evil and Monster Hunter.",
        "Rakuten": "Rakuten is a Japanese e-commerce and internet services company with a global presence.",
        "Toyota": "Toyota is a leading automotive company focused on innovation and sustainability."
    }
    # Return an empty string for unknown companies so the caller's LLM fallback can run
    return dummy_info.get(company, "")

## [6.3] UX Advice Node — RAG-Based Career Support
---
Generates personalized job advice using RAG.  
Combines user certifications (JLPT, TOEIC) with embedded UX survey insights to recommend actionable next steps for working in Japan.

In [9]:
def job_advice_node(state: dict) -> dict:
    """
    Use RAG to answer the user's own query (e.g. “give me job advice”),
    grounded in their profile and the UX survey insights.
    Enforce a consistent, numbered format but let the user dictate the task.
    """
    cv      = state["parsed_cv"]
    query   = state["query"]  # e.g. "give me job advice"
    
    # Extract the user's real certification data
    jlpt    = next((c for c in cv["certifications"] if "日本語能力試験" in c),
                   "日本語能力試験 N? 未取得")
    toeic   = next((c for c in cv["certifications"] if "TOEIC" in c),
                   "TOEIC 未取得")
    skills  = ", ".join(cv["skills"])
    
    # Fold in UX survey bullet points
    ux_ctx  = "\n".join(f"- {tip}" for tip in ux_insights)
    
    # Guard‑rails on formatting, independent of the user task
    formatting = (
        "Output your answer as a numbered list with at least 3 items (unless the user's request "
        "specifies otherwise). For each item:\n"
        "  1. Bold the title of the item.\n"
        "  2. On the next line, prefix “- Advice:” for a brief action.\n"
        "  3. On a following line, prefix “- Resources:” for links or tools (if applicable).\n"
        "All output must be in English; Japanese words are allowed only with translations in parentheses.\n\n"
    )
    
    # Build the dynamic prompt
    prompt = (
        formatting +
        "Candidate profile:\n"
        f"- {jlpt}\n"
        f"- {toeic}\n"
        f"- Skills: {skills}\n"
        f"- Work experience entries: {len(cv['work_experience'])}\n\n"
        "UX survey insights (reported barriers):\n"
        f"{ux_ctx}\n\n"
        "User request:\n"
        f"{query}\n\n"
        "Please answer the user's request above, following the formatting rules."
    )
    
    # Invoke RAG (use invoke to avoid deprecation)
    raw_output = rag_chain.invoke({"query": prompt})
    
    # Extract the string if we got a dict
    advice_text = raw_output.get("result", raw_output) if isinstance(raw_output, dict) else raw_output
    
    # Ensure real line breaks
    advice = advice_text.replace("\\n", "\n")
    
    return {**state, "result": advice}
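
The certification lookup above matches only entries that contain 日本語能力試験, so a résumé that writes "JLPT N2" would fall through to the default. A slightly more tolerant matcher might look like the sketch below. The helper `find_jlpt_level` and its pattern are assumptions for illustration, and full-width characters such as Ｎ２ are not handled.

```python
import re
from typing import List, Optional

# Matches "JLPT N2", "jlpt: n3", "日本語能力試験 N1", and similar spellings.
JLPT_PATTERN = re.compile(r"(?:JLPT|日本語能力試験)\s*[-:：]?\s*(N[1-5])", re.IGNORECASE)


def find_jlpt_level(certifications: List[str]) -> Optional[str]:
    """Return the first JLPT level found (e.g. 'N2'), or None if there is none."""
    for cert in certifications:
        match = JLPT_PATTERN.search(cert)
        if match:
            return match.group(1).upper()
    return None


# Made-up certification lists
print(find_jlpt_level(["TOEIC 900", "日本語能力試験 N2 合格"]))
print(find_jlpt_level(["JLPT n3"]))
print(find_jlpt_level(["TOEIC 900"]))
```

Returning `None` rather than a placeholder string leaves it to the caller to decide how to describe a missing certificate in the prompt.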

## [6.4] Alternative Job Suggestion Node
---
Suggests 5 alternative job paths based on the user’s parsed resume profile.  
The node prompts Gemini directly for role suggestions; the helper `build_alt_job_store` can additionally hold suggestions in a temporary Chroma vector index for retrieval.  
Supports structured, RAG-style output with Japanese terms and translations.

In [10]:
def recommend_alt_jobs_node(state: dict) -> dict:
    """
    Suggest 5 alternative job roles in Japan from the Resume below,
    and format each item as:
      1. **<Job Title in English>** (<Japanese title>):
         - Transition Advice: …
         - Recommended Courses/Skills: …
    Permit Japanese words only with translations in ().
    """

    raw_cv = state["raw_cv"]

    # Build a structured prompt
    prompt = (
        # Formatting guard rails
        "Output exactly 5 numbered items. For each:\n"
        "  1. Bold the English job title and put the Japanese title in parentheses.\n"
        "  2. On the next line, prefix “- Transition Advice:” and give a 1‑sentence tip.\n"
        "  3. On the following line, prefix “- Recommended Courses/Skills:” and list any study suggestions.\n"
        "All output must be in English. Japanese words are allowed only with an English translation in parentheses.\n\n"
        # What to base it on
        "Suggest alternative roles based on this résumé:\n\n"
        f"{raw_cv}\n"
    )


    # Call the LLM
    text = llm_flash.generate_content(prompt).text

    # Split into individual non-empty lines
    suggestions = [line for line in text.splitlines() if line.strip()]

    # Join back into a clean multi-line string
    result_text = "\n".join(suggestions)

    return {**state, "result": result_text}

In [11]:
def build_alt_job_store(profile: str):
    """
    Asks Gemini for alternative-job suggestions based on the candidate's
    profile text and returns an in-memory Chroma retriever over them.
    """
    prompt = (
        "Suggest 5 alternative job roles in Japan based on the résumé below. "
        "Return each suggestion as one sentence and include advice on transitioning into the new position. "
        "If there is a relevant course or skill set, recommend it. "
        "The answer MUST be in English. Japanese may be used for important terms, "
        "but each must be followed by an English translation in parentheses.\n\n" + profile
    )
    # Drop blank lines so that no empty document is embedded
    suggestions = [
        line for line in llm_flash.generate_content(prompt).text.splitlines()
        if line.strip()
    ]

    # In-memory only (no persist directory), which avoids persistence errors
    return Chroma.from_texts(suggestions, embedding=emb).as_retriever(search_kwargs={"k": 2})



# [7] LangGraph Agent Flow & Routing
---
Defines the LangGraph graph that controls the chatbot flow.  
A router checks the user's query for intent keywords and sends it to the matching node (rewrite resume, get job advice, or explore alternative jobs); a query with no match ends the flow.  
Each node reads and updates a shared state dictionary typed as `AgentState`.
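
The keyword routing can be tried on its own, without LangGraph. The snippet below is a minimal sketch rather than the notebook's implementation: `INTENT_KEYWORDS` and `pick_node` are hypothetical names, and the keyword lists are shortened copies of the ones used in `route`. It illustrates that the order of the checks decides which node wins when a query matches more than one list.

```python
# Hypothetical, shortened keyword table; dict order is the match priority.
INTENT_KEYWORDS = {
    "tailor_cv_node": ["rewrite", "resume", "志望動機"],
    "job_advice_node": ["job advice", "career advice", "how to apply"],
    "recommend_alt_jobs": ["alternative job", "what job", "career", "仕事"],
}


def pick_node(query: str, fallback: str = "default_fallback") -> str:
    """Return the first node whose keyword list matches the query."""
    lowered = query.lower()
    for node, keywords in INTENT_KEYWORDS.items():
        if any(keyword in lowered for keyword in keywords):
            return node
    return fallback


# "career advice" is caught by job_advice_node before the broader
# "career" keyword can route it to recommend_alt_jobs.
for q in ["Rewrite my resume", "Any career advice?", "What career fits me?", "hello"]:
    print(f"{q!r} -> {pick_node(q)}")
```

Keeping the priority in one ordered table makes overlaps such as "career" versus "career advice" easier to spot than a chain of `if`/`elif` branches.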

In [12]:
# Shared State
class AgentState(TypedDict):
    query: str
    result: str
    company: str
    parsed_cv: str  # Optional but useful in CV use case
    raw_cv: str

# Routing Logic
def route(state: AgentState) -> str:
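    """Return the name of the node that should handle state["query"]."""
    # Branch order matters: the first matching keyword list wins, so
    # "career advice" reaches job_advice_node before the broader "career"
    # keyword can send it to recommend_alt_jobs.

    # Fall back to the notebook-level CV when the caller did not supply one.
    if "raw_cv" not in state:
        state["raw_cv"] = raw_cv

    # Uncomment to preview the state while debugging.
    # for key, value in state.items():
    #     preview = str(value)[:100].replace("\n", " ")
    #     print(f"🔑 {key}: {preview}...")
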
    query = state.get("query", "").lower()
    if any(keyword in query for keyword in ["rewrite", "志望動機", "resume", "自己pr"]):
        #print("➡️ Routing to: tailor_cv_node")
        return "tailor_cv_node"
        
    elif any(keyword in query for keyword in ["job advice", "how to find job", "how can i get a job", "career advice", "how to apply", "job in japan"]):
        #print("➡️ Routing to: job_advice_node")
        return "job_advice_node"

    elif any(keyword in query for keyword in ["alternative job", "job suit", "recommend job", "what job", "career", "仕事", "職種"]):
        #print("➡️ Routing to: recommend_alt_jobs")
        return "recommend_alt_jobs"

    print("🛑 No match — ending flow")
    return "default_fallback"


# Build the Graph
builder = StateGraph(AgentState)

# Only add actual processing nodes
builder.add_node("tailor_cv_node", tailor_cv_node)
builder.add_node("job_advice_node", job_advice_node)
builder.add_node("recommend_alt_jobs", recommend_alt_jobs_node)

# Pass-through entry node: it only logs and returns the state unchanged.
# The routing decision itself is made by route() on the conditional edges.
def router_entry_node(state: AgentState) -> AgentState:
    print("🚦 Initial router node")
    return state

builder.add_node("start_router", router_entry_node)
builder.set_entry_point("start_router")

# Send each query from the entry node to the node chosen by route(),
# or end the flow when no keyword matches
builder.add_conditional_edges("start_router", route, {
    "tailor_cv_node": "tailor_cv_node",
    "job_advice_node": "job_advice_node",
    "recommend_alt_jobs": "recommend_alt_jobs",
    "default_fallback": END
})

# Optional: loop back to the router (needs a stop condition, since the query is unchanged)
# builder.add_edge("tailor_cv_node", "start_router")
# builder.add_edge("job_advice_node", "start_router")
# builder.add_edge("recommend_alt_jobs", "start_router")

# Compile
multi_agent = builder.compile()
print("✅ Builder Compiled")

✅ Builder Compiled


# [8] Chatbot Simulation Loop
---
Runs an interactive chatbot loop using the LangGraph agent.  
Supports natural user queries like “rewrite my resume” or “give me job advice,” routing them to the correct tool.  
Ideal for testing resume rewrites, advice generation, and job recommendations in real time.
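
For repeatable checks, the same loop can be driven by a list of scripted inputs instead of `input()`. The sketch below makes no assumptions about the real graph: `run_scripted_session` and `fake_invoke` are hypothetical helpers, and `fake_invoke` only stands in for `multi_agent.invoke`.

```python
from typing import Callable, Dict, Iterable, List

QUIT_WORDS = {"q", "quit", "exit"}


def run_scripted_session(
    invoke: Callable[[Dict[str, str]], Dict[str, str]],
    user_inputs: Iterable[str],
    raw_cv: str = "",
) -> List[str]:
    """Feed each scripted input to `invoke` and collect the replies."""
    replies = []
    for user_input in user_inputs:
        text = user_input.strip()
        if text.lower() in QUIT_WORDS:
            break  # same exit words as the interactive loop
        state = {"raw_cv": raw_cv, "query": text, "company": ""}
        response = invoke(state)
        replies.append(response.get("result", "⚠️ No result returned."))
    return replies


def fake_invoke(state: Dict[str, str]) -> Dict[str, str]:
    """Stand-in for multi_agent.invoke: echoes the query back as the result."""
    return {**state, "result": f"handled: {state['query']}"}


replies = run_scripted_session(fake_invoke, ["give me job advice", "Quit", "never reached"])
print(replies)
```

Passing the invoke function as an argument means the loop logic can be exercised without an API key, and the compiled graph can be swapped in later by passing `multi_agent.invoke`.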

In [13]:
def chatbot_simulator():
    print("🤖 JobBridge‑AI: Welcome to JobBridge‑AI!")
    print("💬 Type: “rewrite my resume”, “give me job advice”, or “suggest alternative jobs”")
    print("🛑 Quit with q / quit / exit\n")

    while True:
        user_input = input("🧠 You: ").strip()
        if user_input.lower() in {"q", "quit", "exit"}:
            print("👋 Ending Session")
            break

        state = {
            "raw_cv":    raw_cv,
            "parsed_cv": parsed_cv,
            "query":     user_input,
            "company":   ""
        }
        if "rewrite" in user_input.lower():
            state["company"] = input("🏢 Which company? ").strip()

        #print("🌀 Running agent flow…")
        response = multi_agent.invoke(state)

        # Extract the reply; fall back to a warning if no node set "result"
        advice = response.get("result", "⚠️ No result returned.")
        # Turn literal "\n" sequences in the reply into real line breaks
        advice = advice.replace("\\n", "\n")

        # Print the cleaned reply
        print(
            f"\n🤖 JobBridge‑AI:\n"
            f"{advice}\n"
            f"{'-'*40}\n"
        )

# Execute Program

In [14]:
chatbot_simulator()

🤖 JobBridge‑AI: Welcome to JobBridge‑AI!
💬 Type: “rewrite my resume”, “give me job advice”, or “suggest alternative jobs”
🛑 Quit with q / quit / exit



🧠 You:  rewrite my resume
🏢 Which company?  Rakuten


🚦 Initial router node

🤖 JobBridge‑AI:
Here's your rewritten resume.

氏名: Daniel Thompson (Name: Daniel Thompson)
生年月日・国籍: 1992年5月14日／ニュージーランド国籍 (Date of Birth・Nationality: May 14, 1992 / New Zealand)
住所・連絡先: 〒150-0002 東京都渋谷区渋谷2-15-1 渋谷クロスタワー24階／080-1234-5678／daniel.t@example.com (Address・Contact: Shibuya Crosta Tower 24F, 2-15-1 Shibuya, Shibuya-ku, Tokyo 150-0002 / 080-1234-5678 / daniel.t@example.com)

学歴:
オークランド大学 メディア・コミュニケーション学部 卒業 (Auckland University, Graduate, Faculty of Media and Communication)

職務経歴:
渋谷日本語センター  (Shibuya Japanese Language Center)
期間: 2019年1月～現在 (Period: January 2019 – Present)
・（職務内容を記述）(Description of duties)

ScriptWorks　広告代理店 (ScriptWorks, Advertising Agency)
期間: 2010年4月～2018年12月 (Period: April 2010 – December 2018)
・脚本、コピーライティング業務に従事 (Engaged in scriptwriting and copywriting)
・多文化チームとの協働によるプロジェクト多数経験 (Experienced in numerous projects collaborating with multicultural teams)
・クライアントニーズに沿った効果的な広告制作に貢献 (Contributed to effective advertisement production aligne

🧠 You:  give me job advice


🚦 Initial router node

🤖 JobBridge‑AI:
1. **Target English-Speaking Companies**

- Advice: Focus your job search on multinational companies or foreign companies based in Japan.  These companies are often more open to hiring candidates with strong English skills and may be less stringent about Japanese language proficiency, especially for entry-level positions.

- Resources: Websites like Indeed Japan, LinkedIn, and Glassdoor can be filtered to show English-speaking companies.


2. **Network Strategically**

- Advice: Leverage your existing network and actively build new connections within your field. Attend industry events (even online ones), join relevant online communities, and reach out to people working in your target roles at companies you admire. Networking can open doors to unadvertised positions and provide valuable insights.

- Resources: LinkedIn, Meetup.com, industry-specific online forums.


3. **Highlight Transferable Skills**

- Advice: Emphasize the transferable skills f

🧠 You:  suggest alternative jobs


🚦 Initial router node

🤖 JobBridge‑AI:
1. **Copywriter** (コピーライター)
- Transition Advice: Leverage your experience in script and copywriting at ScriptWorks to highlight your creative abilities and adaptability to new industries.
- Recommended Courses/Skills:  Advanced copywriting techniques, SEO copywriting, content marketing,  Japanese business writing.
2. **Localization Specialist** (ローカリゼーションスペシャリスト)
- Transition Advice:  Showcase your native English fluency, N2 Japanese proficiency, and experience working in multicultural teams to demonstrate your suitability for this role.
- Recommended Courses/Skills: Translation (especially Japanese-English), localization management, cultural sensitivity training, software localization tools (e.g., SDL Trados).
3. **Content Creator** (コンテンツクリエイター)
- Transition Advice: Highlight your experience creating marketing materials at ScriptWorks and your ability to adapt your content to different audiences, emphasizing your bilingual skills.
- Recommended 

🧠 You:  Quit


👋 Ending Session


## -------------------------------------------
