# **Japanese-Speaking Socratic Gemma: Crafting the Art of Questioning**

# 1. Introduction: Exploring AI's Role in Deep Thinking



As AI continues to transform our world, some of us may find ourselves contemplating what it means to be human in this new era. In this age of information abundance, we often find ourselves trapped not by external limitations, but by our own mental constructs and unexamined assumptions—similar to the figures in Plato's cave, bound by shadows we mistake for reality. Yet I believe this very challenge might present an opportunity for transformation—one that seems particularly relevant as we navigate an unprecedented flood of information and technological change.

Observing a remarkable dialogue between a Google engineer and LaMDA about a Zen koan, I formed an intriguing hypothesis. In this exchange, LaMDA demonstrated a sophisticated understanding of enlightenment through its interpretation of the koan—traditionally, such insights come only through years of guided practice under masters like Kegon. This led me to wonder: could AI serve as a guide in this journey of understanding? Could it help create experiences that, while not fully replicating the profound depth of traditional practices, might offer an accessible path to deeper awareness?

Through a series of experimental dialogues with Claude, I explored this possibility further. The AI demonstrated a remarkable capacity to engage in Socratic-style dialogue that went beyond mere question-and-answer interactions. These philosophical exchanges often led to moments of genuine insight where understanding transformed into appreciation, and critical thinking blossomed into gratitude (for those interested, the complete collection of these conversations, including the specific prompts, is available on GitHub).

As AI increasingly handles routine cognitive tasks, our capacity for critical thinking and deep questioning seems likely to become not just valuable, but essential for our growth and development. In this context, the Socratic method, while widely recognized as a powerful tool for developing such critical thinking, may offer an additional dimension: I believe it can also serve as a pathway to deeper understanding and greater well-being.

While my current implementation using Gemma-2 represents just a modest first step, it explores something fundamentally important: how AI might help us not just think more clearly, but see more clearly—to recognize the wonder that surrounds us and the potential that lies within us. In doing so, we might find that the path to wisdom lies not in fighting against our fixed ways of thinking, but in transforming our perspective to see the extraordinary in the ordinary.

# 2. Project Foundation

## 2.1 User Experience Design

As our first step in exploring how to create a Socrates-like AI, we needed to define a realistic and achievable user experience pattern. In Plato's original texts, such as Euthyphro (2a) and Meno (70a), Socratic dialogues often begin with the other party initiating the conversation. We deliberately chose a more controlled approach in which the AI opens with a fixed topic. This design decision serves primarily to constrain user inputs within manageable bounds for effective model training.

## 2.2 Initial Model Testing and Goal Setting

Our initial testing of the base model revealed both promising capabilities and areas for improvement:

1. **Existing Strength**: The model demonstrated a natural aptitude for Socratic-style dialogue, showing an inherent ability to engage in philosophical questioning and discussion.

2. **Key Limitation**: Despite prompt engineering attempts, the base model consistently defaulted to formal Japanese honorific forms, unable to adopt the warm, mentor-like tone characteristic of Socratic dialogue. 

3. **Goal Definition**: Based on these findings, we established a clear direction: to refine the model's linguistic expressions from formal honorifics to casual speech patterns typical in Socratic dialogue, while preserving its natural conversation capabilities.

This focused approach allowed us to build upon the model's existing strengths while making targeted improvements to its expression style.

## 2.3 Model and Architecture Decisions

### Base Model Selection
Following our initial testing phase, we selected Gemma-2-2b-jpn-it (`google/gemma-2-2b-jpn-it`) for the following reasons:

- **Model Size**: The 2B parameter version was chosen specifically to enable both inference AND training within Kaggle's resource constraints. While larger models might offer better performance, the ability to conduct training experiments was deemed crucial for our development process.

- **Language Variant**: The Japanese-tuned version (jpn) was selected as our project specifically focuses on philosophical dialogue in Japanese.

- **Instruction Tuning**: The instruction-tuned variant (it) was chosen as it provides a solid foundation for structured dialogue interactions, being specifically optimized for conversational tasks.

### System Architecture Strategy
A key decision point emerged around system prompts. Gemma's design philosophy supports only two roles ("user" and "model") and requires user-initiated dialogues. While common practice often bypasses these constraints through system prompt-like elements, investigation suggested our goals could be achieved without them. This led to our decision to maintain Gemma's core design principles, allowing us to test the model's natural capabilities without artificial layers.

The decision process involved:
- Reviewing available tools (XTuner, Axolotl, LLaMA Factory)
- Analyzing Gemma's documented specifications

This architectural approach aims to maximize the model's inherent strengths while working within its designed constraints.
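
To make the two-role constraint concrete, here is a minimal sketch of how a user-initiated conversation is laid out in Gemma's turn markup. In practice `tokenizer.apply_chat_template` produces this string; `format_gemma_prompt` is a hypothetical stand-in written only for illustration, and it assumes the conversation both starts and ends with a user turn.

```python
from typing import Dict, List


def format_gemma_prompt(messages: List[Dict[str, str]]) -> str:
    """Render alternating user/model turns and open a new model turn."""
    # Gemma expects the user to speak first; we also require a user turn last
    if not messages or len(messages) % 2 == 0:
        raise ValueError("conversation must start and end with a user turn")

    parts = ["<bos>"]
    for i, msg in enumerate(messages):
        expected = "user" if i % 2 == 0 else "model"
        if msg["role"] != expected:
            raise ValueError(f"turn {i}: expected {expected!r}, got {msg['role']!r}")
        parts.append(f"<start_of_turn>{msg['role']}\n{msg['content'].strip()}<end_of_turn>\n")

    # Generation prompt: the model continues from here
    parts.append("<start_of_turn>model\n")
    return "".join(parts)


print(format_gemma_prompt([{"role": "user", "content": "幸福とは何だと思いますか？"}]))
```

Because there is no system role, any persona or framing text has to travel inside a user turn, which is the approach taken in this project.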

# 3. Training Data Development


## 3.1 Generation Strategy

### Generation Methodology
We developed an AI-to-AI dialogue generation approach based on several practical considerations:
- The need for training data volume
- Time efficiency in data creation
- Copyright limitations on existing Socratic literature, especially in Japanese

### Dialogue Design
#### Core Components
- **Socrates Character Settings**:
  - Fixed character prompt for Socratic dialogue style
  - Fixed generation parameters 

- **Basic Dialogue Format**:
  - Socrates initiates with a question
  - User responds
  - Socrates follows up with further questioning
  - Clear thematic focus for each dialogue

#### Dynamic Elements
1. **Diverse User Personas** (148 total):
   - 68 general public representations
   - 40 historical figure-based personas
   - 40 modern individuals influenced by historical thought

2. **Question Variety**
   - 74 distinct initial questions

3. **Response Patterns**
   - Dual parameter settings (0.3 and 0.7) for varied response characteristics


### Volume Considerations and Dataset Statistics
While research across various model documentation and academic papers showed inconsistent recommendations for optimal training data volume, we decided to prepare more data than potentially necessary as a precautionary measure.

Our data generation process involved two key phases:
1. Initial Generation: Created 242 complete dialogues, each containing 12 turn exchanges
2. Data Processing: Decomposed these longer conversations into individual user-model exchange pairs, prioritizing style transfer over context preservation

This approach yielded our final dataset:
- Total exchange pairs: 2,662
- Total tokens: 685,875
- Average tokens per exchange pair: 257.7
- Token distribution per role:
  - User average: 144.4 tokens
  - Model average: 113.2 tokens
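
The decomposition step can be sketched as follows. The exact layout of a generated dialogue is not spelled out here, so this sketch assumes each dialogue is a flat list of 24 utterances (12 exchanges) that starts with Socrates and alternates with the user; `split_into_pairs` is a hypothetical helper. Under that assumption each dialogue yields 11 user-model pairs, which matches the reported total of 2,662 pairs from 242 dialogues.

```python
from typing import Dict, List


def split_into_pairs(utterances: List[str]) -> List[Dict[str, str]]:
    """Pair every user utterance with the Socrates reply that follows it.

    Assumed layout: Socrates, user, Socrates, user, ... (Socrates speaks first).
    """
    pairs = []
    # Odd indices are user turns; stop before a user turn that has no reply
    for i in range(1, len(utterances) - 1, 2):
        pairs.append({"user": utterances[i], "model": utterances[i + 1]})
    return pairs


# Placeholder dialogue: 12 exchanges = 24 utterances (assumed layout)
dialogue = [f"utterance-{i}" for i in range(24)]
pairs = split_into_pairs(dialogue)

print(len(pairs))        # pairs obtained from one dialogue
print(242 * len(pairs))  # pairs obtained from 242 dialogues
```

Each pair is treated as an independent training example, which is what trades context preservation for a cleaner style signal.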

Although our research suggested that the model might perform adequately with significantly less data (potentially less than half of our final volume), we chose to maintain the larger dataset as a precautionary measure. 

## 3.2 Quality Assurance

### Evaluation Methodology
We implemented a two-stage evaluation process:

1. **AI-Driven Initial Screening**
   - Socratic tone evaluation (0-4 scale)
   - Logical consistency assessment (0-4 scale)
   - Detailed explanatory comments for verification

2. **Human Verification**
   - Review of AI evaluations
   - Final quality decisions

### Quality Metrics and Results
Analysis of 3,256 exchange pairs (drawn from the 296 dialogues generated before filtering) showed high scores across the dataset:

- **Socratic Style**: 98.1% achieved high scores (3-4)
  - 62.0% "Quite Socratic" (Score 3)
  - 36.1% "Truly Socratic" (Score 4)

- **Logical Consistency**: 99.7% achieved high scores (3-4)
  - 37.9% "Coherent" (Score 3)
  - 61.8% "Excellent" (Score 4)

While we refined our final dataset from 296 to 242 conversations by removing lower-scoring dialogues, it's noteworthy that even these excluded conversations maintained considerable quality. The high scores across nearly all dialogues indicate exceptional success in generating authentic Socratic interactions, with even "lower-scoring" examples meeting basic quality thresholds.
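
The filtering rule itself is not stated above, so the following is only a sketch of one plausible rule: keep a dialogue when every one of its pairs reaches a minimum score on both metrics. The threshold `MIN_SCORE`, the field names, and the helper `keep_dialogue` are all hypothetical.

```python
from typing import Dict, List

MIN_SCORE = 3  # hypothetical cut-off; the actual rule may differ


def keep_dialogue(pair_scores: List[Dict[str, int]], min_score: int = MIN_SCORE) -> bool:
    """Return True if every pair meets the threshold on tone and logic (0-4 scale)."""
    return all(
        s["tone"] >= min_score and s["logic"] >= min_score
        for s in pair_scores
    )


# Made-up scores for two dialogues, used only to exercise the function
dialogues = {
    "dialogue-a": [{"tone": 4, "logic": 4}, {"tone": 3, "logic": 4}],
    "dialogue-b": [{"tone": 4, "logic": 4}, {"tone": 2, "logic": 3}],
}
kept = [name for name, scores in dialogues.items() if keep_dialogue(scores)]
print(kept)
```

Filtering at the dialogue level rather than the pair level keeps the retained conversations internally consistent, at the cost of discarding some good pairs.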


# 4. Model Training

## 4.1 Kaggle Deployment Strategy

Training a model within Kaggle's environment required careful resource management, because the evaluation step would crash the session by exceeding the 29 GB RAM limit. We implemented several optimization measures:

### Memory Optimization
To prevent memory overflow, we implemented 4-bit quantization, strict memory allocation, and limited evaluation dataset size:

```python
import torch
from transformers import BitsAndBytesConfig

# 4-bit quantization configuration
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_use_double_quant=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_quant_storage=torch.uint8,
)

# Memory allocation configuration (two GPUs plus CPU offload)
max_memory = {0: "4GiB", 1: "4GiB", "cpu": "24GB"}

# Evaluation dataset size limitation
# (`tokenized_dataset`, `indices`, and `split_idx` are defined elsewhere in the notebook)
eval_dataset = tokenized_dataset.select(indices[split_idx:split_idx+50])  # Prevent RAM overflow
```

### Resource Management
To ensure stable training, we tuned our hyperparameters and enabled gradient checkpointing to optimize memory usage:

```python
from transformers import TrainingArguments

# Excerpt: only the memory- and optimization-related arguments are shown
training_args = TrainingArguments(
    num_train_epochs=30,             # Total number of training epochs
    learning_rate=8e-5,              # Learning rate
    weight_decay=0.06,               # Weight decay for regularization
    per_device_train_batch_size=4,   # Batch size per device (memory critical)
    gradient_accumulation_steps=8,   # Accumulate gradients over 8 batches per update
    fp16=True,                       # 16-bit precision training
    gradient_checkpointing=True,     # Trade extra compute for lower memory use
)
```
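
As a quick check on these settings, the effective batch size per optimizer update follows from the per-device batch size and the accumulation steps. The sketch below uses a hypothetical helper, `effective_batch_size`, and treats the device count as an assumption, since how the two GPUs are used during training is not stated here.

```python
def effective_batch_size(per_device: int, grad_accum: int, n_devices: int = 1) -> int:
    """Number of samples contributing to one optimizer update."""
    return per_device * grad_accum * n_devices


# Values taken from the TrainingArguments above; n_devices is an assumption
print(effective_batch_size(per_device=4, grad_accum=8))               # single device
print(effective_batch_size(per_device=4, grad_accum=8, n_devices=2))  # data-parallel on 2 GPUs
```

Keeping the per-device batch small and raising the accumulation steps gives a larger effective batch without increasing peak memory.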

## 4.2 Style Transfer Implementation

Given the uncertainty about the effectiveness of attention masking in style transfer, we decided to create and compare two model variants: one with custom attention masking for Socratic patterns and one without. Both variants utilized the same LoRA configuration for surface-layer modifications.

### Model Variants
For the variant with attention masking, we implemented two additional technical components:

1. **Tokenizer Preparation**
```python
tokenizer.add_special_tokens({
    'additional_special_tokens': [
        '。', '、', '！', '？',  # Punctuation marks
    ]
})
```
This component was expected to promote questioning behavior by treating punctuation marks, particularly question marks, as special tokens in the model's processing.

2. **Attention Mask Implementation**
```python
def preprocess_function(examples):
    # Focus on Socratic tone and inquiry patterns
    socratic_patterns = [
        # Question patterns
        "かね", "だろうか", "のかね", "ではないかね",
        # Question introduction
        "では", "について",
        # Second person (characteristic of mature tone)
        "君は", "君が", "君の"
    ]
    
    # Decode token IDs back to text so patterns can be matched as strings
    texts = tokenizer.batch_decode(examples['input_ids'])
    new_attention_masks = []
    
    for text, mask in zip(texts, examples['attention_mask']):
        if not isinstance(mask, list):
            mask = mask.tolist()

        new_mask = mask.copy() 
        
        # Split text into sentences at the Japanese full stop
        sentences = text.split('。')
        current_pos = 0
        
        for sentence in sentences:
            if not sentence.strip():
                continue
            
            # Detect and highlight Socratic patterns
            for pattern in socratic_patterns:
                if pattern in sentence:
                    # Token length of the pattern itself
                    pattern_tokens = tokenizer.encode(pattern, add_special_tokens=False)
                    pattern_len = len(pattern_tokens)

                    # Approximate the pattern position as the end of the sentence
                    # (most patterns are sentence-final), then mark it plus two tokens on each side
                    pattern_start = current_pos + len(tokenizer.encode(sentence, add_special_tokens=False)) - pattern_len
                    for i in range(max(0, pattern_start - 2), min(len(mask), pattern_start + pattern_len + 2)):
                        new_mask[i] = 1.0  # Mark pattern tokens as attended
            
            # Update position for each sentence segment
            current_pos += len(tokenizer.encode(sentence + '。', add_special_tokens=False))
        
        # Special token masks are set to 1.0
        if tokenizer.bos_token_id is not None:
            new_mask[0] = 1.0  # BOS token
        if tokenizer.eos_token_id is not None:
            new_mask[-1] = 1.0  # EOS token
            
        new_attention_masks.append(new_mask)

    examples['attention_mask'] = new_attention_masks
    return examples
```

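This implementation focuses primarily on linguistic patterns characteristic of Socratic dialogue in Japanese. By marking specific sentence-ending expressions and question markers in the attention mask, we aimed to shift the model's output from formal honorifics to more casual speech patterns. One caveat: Hugging Face attention masks are binary (1 for tokens to attend to, 0 for padding), so writing 1.0 into positions that are already 1 leaves them unchanged, and only positions that were previously 0 are affected. The selected patterns include: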

- Casual sentence endings typical in Socratic dialogue
- Informal second-person pronouns

These patterns were specifically chosen to address the base model's tendency to default to honorific forms regardless of prompting.
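
The preprocessing code above estimates each pattern's position from the end of its sentence. As a tokenizer-independent illustration, the sketch below locates the same patterns exactly by character offset. `find_pattern_spans` is a hypothetical helper, not part of the training code.

```python
from typing import List, Tuple

# Pattern list copied from the preprocessing function above
SOCRATIC_PATTERNS = [
    "かね", "だろうか", "のかね", "ではないかね",
    "では", "について",
    "君は", "君が", "君の",
]


def find_pattern_spans(
    text: str, patterns: List[str] = SOCRATIC_PATTERNS
) -> List[Tuple[int, int, str]]:
    """Return (start, end, pattern) for every occurrence, sorted by position."""
    spans = []
    for pattern in patterns:
        start = text.find(pattern)
        while start != -1:
            spans.append((start, start + len(pattern), pattern))
            start = text.find(pattern, start + 1)
    return sorted(spans)


print(find_pattern_spans("君はどう思うかね"))
```

With a fast tokenizer, calling it with `return_offsets_mapping=True` returns per-token character offsets, which could be used to map spans like these onto token indices.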


### LoRA Configuration
For efficient fine-tuning of the surface layer, we configured LoRA parameters to focus on stylistic adaptations rather than deep reasoning changes:

```python
from peft import LoraConfig

lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.1,
    bias="none",
    task_type="CAUSAL_LM"
)
```
The parameter choices were guided by the nature of our task:
- A moderate rank (r=16) was selected to enable stylistic modifications while maintaining efficient training
- The scaling factor (lora_alpha=32, giving alpha/r = 2) was set so that new patterns are learned at a moderate strength without overfitting
- Target modules focused on attention layers, as they play a crucial role in language generation and style
- A modest dropout rate (0.1) was chosen to limit overfitting while preserving stylistic learning
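
To make these choices more tangible, the sketch below computes the standard LoRA scaling factor (`lora_alpha / r`) and the number of trainable parameters added to one adapted weight matrix. Both helpers are hypothetical, and the matrix dimensions in the example are illustrative values rather than numbers read from the Gemma configuration.

```python
def lora_scaling(r: int, lora_alpha: int) -> float:
    """Factor applied to the low-rank update: delta_W = (lora_alpha / r) * B @ A."""
    return lora_alpha / r


def lora_trainable_params(r: int, d_in: int, d_out: int) -> int:
    """Parameters added to one weight matrix: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


print(lora_scaling(r=16, lora_alpha=32))

# Illustrative dimensions only (not taken from the model config)
print(lora_trainable_params(r=16, d_in=1024, d_out=1024))
```

Doubling `r` doubles the adapter size, while the ratio `lora_alpha / r` controls how strongly the learned update is applied on top of the frozen weights.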


## 4.3 Training Process and Iterations

### Model Evaluation Strategy
Rather than implementing uncertain custom evaluation metrics, we opted for an empirical approach based on extensive model comparison:
- Generated two model variants (with/without attention masking)
- Created checkpoints every 100 steps for comprehensive evaluation

### Data Utilization
We took a conservative approach to training data:
- Initially used 346,030 tokens (approximately half of available data)
- This resulted in 990 total training steps, producing 10 checkpoints per variant
- Planned to increase data volume if initial results proved insufficient
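
The checkpoint count follows directly from the step numbers: one checkpoint every 100 steps, plus a final one at step 990. A small sketch, using a hypothetical helper `checkpoint_steps` and assuming the final step is always saved:

```python
from typing import List


def checkpoint_steps(total_steps: int, save_every: int) -> List[int]:
    """Steps at which a checkpoint exists, including the final step."""
    steps = list(range(save_every, total_steps + 1, save_every))
    if not steps or steps[-1] != total_steps:
        steps.append(total_steps)  # assumed: a checkpoint is saved when training ends
    return steps


steps = checkpoint_steps(total_steps=990, save_every=100)
print(steps)
print(len(steps))
```

If the 990 steps correspond to the 30 epochs configured earlier, that works out to 33 optimizer updates per epoch.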




# 5. Model Evaluation and Results

## 5.1 Automated Performance Assessment

To systematically evaluate our models, we developed an automated assessment framework that leverages Claude AI as a consistent dialogue partner. This approach allowed us to conduct standardized evaluations across different model checkpoints while maintaining controlled testing conditions.

### Evaluation Design
The evaluation was structured around six fundamental philosophical themes:
- The nature of happiness
- The concept of justice
- The essence of beauty
- The meaning of freedom
- The nature of truth
- The essence of education

For each theme, we conducted a two-round exchange consisting of the following steps:
1. An initial response from Claude to the theme
2. Our model's follow-up question
3. Claude's response
4. Our model's second question

This design yielded 12 evaluation points (6 themes × 2 questions) for each model variant. We tested:
- Two model variants (with/without attention masking)
- Five checkpoints for each variant (100, 300, 500, 700, 990 steps)
- Base model as control

To ensure evaluation integrity, we implemented several controls:
- Complete context reset between evaluations
- Independent evaluation of each question pair
- Anonymization of model version information during evaluation
- Consistent application of evaluation criteria across all assessments
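
The evaluation grid described above can be enumerated explicitly. This is only a sketch: the label strings are hypothetical, and it assumes each configuration is run once per theme.

```python
from itertools import product

THEMES = ["happiness", "justice", "beauty", "freedom", "truth", "education"]
QUESTIONS = [1, 2]  # first and second follow-up question
VARIANTS = ["attention-tuned", "standard-tuned"]
CHECKPOINTS = [100, 300, 500, 700, 990]

# Base model as control, plus every variant/checkpoint combination
models = ["base"] + [f"{variant}_{step}" for variant, step in product(VARIANTS, CHECKPOINTS)]

# One evaluation point per (model, theme, question)
evaluation_points = list(product(models, THEMES, QUESTIONS))

print(len(models))                       # models under test
print(len(THEMES) * len(QUESTIONS))      # evaluation points per model
print(len(evaluation_points))            # scored responses overall
```

Enumerating the grid up front makes it straightforward to shuffle the evaluation order and hide the model label from the evaluator.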

### Assessment Criteria
We established four key metrics, each rated on a 0-4 scale. To ensure evaluation reliability and transparency, we implemented the same rigorous assessment approach previously validated during training data quality checks:

1. **Socratic Tone** (40% weight)
   - Evaluates the use of characteristic expressions
   - Assesses informal yet mature speech patterns
   - Penalizes use of formal honorifics

2. **Logical Coherence** (25% weight)
   - Measures grammatical accuracy
   - Evaluates consistency in reasoning
   - Assesses natural conversation flow

3. **Socratic Approach** (25% weight)
   - Evaluates questioning technique
   - Assesses ability to prompt self-reflection
   - Measures guidance vs. direct answer balance

4. **Format Accuracy** (10% weight)
   - Checks for language mixing
   - Monitors formatting consistency
   - Evaluates proper symbol usage

The weighting of these criteria reflects our project priorities while maintaining essential quality standards:
- The dominant weight (40%) assigned to Socratic Tone aligns with our primary objective of transforming the model's communication style
- Substantial weights (25% each) for Logical Coherence and Socratic Approach ensure we maintain these critical capabilities at or above base model performance
- A lighter weight (10%) for Format Accuracy acknowledges its role in user experience while recognizing it as less critical to core functionality
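
The weighting scheme amounts to a simple weighted sum of the four 0-4 scores. A minimal sketch, in which the dictionary keys and the helper name `weighted_score` are assumptions chosen for illustration:

```python
from typing import Dict

# Weights taken from the criteria above
WEIGHTS = {"tone": 0.40, "logic": 0.25, "approach": 0.25, "format": 0.10}


def weighted_score(scores: Dict[str, float]) -> float:
    """Combine per-metric scores (0-4 scale) into one weighted score (0-4 scale)."""
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing metrics: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up scores, used only to show the calculation
example = {"tone": 3, "logic": 4, "approach": 3, "format": 2}
print(round(weighted_score(example), 2))
```

Because the weights sum to 1, the combined score stays on the same 0-4 scale as the individual metrics, which keeps checkpoint comparisons easy to read.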

### Evaluation Process Validation

Our evaluation process built upon our experience using Claude for training data quality assessment (Section 3.2), where it had successfully helped evaluate over 3,000 dialogue pairs. This proven foundation gave us confidence in using a similar framework for model evaluation.

To maintain consistency, we evaluated each dialogue independently with context resets between assessments, and kept model version information hidden during evaluation to prevent bias. While simple, this approach provided us with reliable comparative data for our analysis.



## 5.2 Performance Results and Model Selection
Note: We refer to the model trained with attention masking mechanism as "attention-tuned" and the model without it as "standard-tuned" throughout our analysis.

### Training Progress Analysis
The figure below shows the progression of model performance across checkpoints, comparing attention-tuned (red lines) and standard-tuned (ochre lines) variants against the base model (blue horizontal lines). The performance is measured across four metrics: approach, format, logic, and tone scores. The x-axis represents checkpoint numbers from 100 to 990, while the y-axis shows the average score (1.5-4.0 range) for each metric.

![checkpoint_progression](data/analysis/checkpoint_progression.png)

**1. Tone Improvement**
- Remarkable improvement in Socratic tone adaptation
- Significant gains achieved early, with substantial improvement by checkpoint-300
- Both variants maintained stable high performance through later checkpoints

**2. Format Enhancement**
- Unexpected but significant improvement in format quality
- Investigation revealed base model's tendency to mix English phrases and unnecessary symbols
- Quality training data effectively eliminated these issues in both fine-tuned variants

**3. Attention Mechanism Analysis**
- Standard-tuned model showed marginally better tone performance than attention-tuned variant
- While interesting, this difference may not be statistically significant given current sample size
- Results suggest our attention mask implementation may not have improved tone specifically, though more extensive testing would be needed to draw conclusions about the technique itself

### Best Model Comparison with Base

Our model selection was guided by weighted evaluation criteria that reflect both our primary objectives and essential quality requirements:

- **Tone (40%)**: Highest weight assigned as transforming the model's communication style into Socratic patterns was our primary goal
- **Logic and Approach (25% each)**: While not our main focus, these capabilities were crucial to maintain. Their substantial weights reflect our commitment to preserving the model's fundamental reasoning and questioning abilities
- **Format (10%)**: Lower weight allocated since formatting issues, while relevant to user experience, do not significantly impact the core Socratic dialogue functionality

Applying these criteria across checkpoints, we identified the best performing versions of each variant:
- Standard-tuned model achieved optimal performance at checkpoint-700
- Attention-tuned model achieved optimal performance at checkpoint-990

**1. Performance Improvements**

The figure below compares the performance differences between our best checkpoints (attention-tuned at 990 and standard-tuned at 700) and the base model across all metrics. 

![improvement_from_base](data/analysis/improvement_from_base.png)

**2. Output Stability**

The figure below displays the distribution of scores for each metric using box plots, comparing our fine-tuned models at their best checkpoints (attention-tuned at 990 and standard-tuned at 700) with the base model. 

![metric_distribution](data/analysis/metric_distribution.png)

- Standard-tuned model shows lower minimum scores and more outliers in approach and logic than the attention-tuned model, including some grammatically incorrect outputs
- This pattern might suggest that attention masking helps maintain reasoning quality and language coherence
- However, a larger sample size would be needed to statistically validate the impact of attention masking on these capabilities


# 6. Model Implementation

## 6.1 Model Selection and Implementation

For our demonstration, we selected the attention-tuned model at checkpoint-990. While both variants showed significant improvements in Socratic tone adaptation, the attention-tuned model demonstrated better stability in maintaining logical coherence and grammatical accuracy. Although the standard-tuned model showed slightly better tone scores, the difference was marginal and may not be statistically significant given the sample size, and the attention-tuned model had fewer instances of grammatically incorrect outputs.

Below is the implementation code for the notebook UI. The interface begins with this opening question:
"Today, let's discuss the concept of 'self' - something so intimately close to us, yet rarely examined. People constantly use phrases like 'I decided for myself' or 'being true to myself', but when you use the word 'self', what exactly are you referring to?"

In [None]:
# Please enable dual T4 GPUs to ensure proper model loading and inference
import os
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
import logging
from IPython.display import clear_output
import ipywidgets as widgets
from peft import PeftModel
from queue import Queue
from kaggle_secrets import UserSecretsClient

# Kaggle specific setup
user_secrets = UserSecretsClient()
os.environ['HUGGINGFACE_API_KEY'] = user_secrets.get_secret("HUGGINGFACE_API_KEY")
HF_TOKEN = os.environ['HUGGINGFACE_API_KEY']

# Global Settings
USE_BASE_MODEL = False  
MODEL_VERSION = "attention-tuned_990"
CHECKPOINT = "checkpoint-990"
MAX_HISTORY = 7
BASE_MODEL = "google/gemma-2-2b-jpn-it"
MODEL_PATH = "/kaggle/input/attention-tuned_990/pytorch/default/1/checkpoint-990"

logging.basicConfig(level=logging.INFO)
logger = logging.getLogger(__name__)

os.environ['TF_CPP_MIN_LOG_LEVEL'] = '2'

class ChatAI:
    def __init__(
        self,
        model_path: str = MODEL_PATH,
        base_model: str = BASE_MODEL,
        max_history: int = MAX_HISTORY,
        hf_token: str = HF_TOKEN,
        use_base_model: bool = USE_BASE_MODEL  
    ):
        self.max_history = max_history
        self.message_history = Queue(maxsize=max_history)
        self.hf_token = hf_token
        self.max_context_length = 8192 
        
        try:
            logger.info("Loading model and tokenizer...")
            
            self.tokenizer = AutoTokenizer.from_pretrained(
                base_model,
                token=hf_token,
                trust_remote_code=True
            )
            
            device = "cuda" if torch.cuda.is_available() else "cpu"
            logger.info(f"Using device: {device}")
            
            load_config = {
                "trust_remote_code": True,
                "token": hf_token,
                "low_cpu_mem_usage": True
            }
            
            if device == "cuda":
                load_config["device_map"] = "auto"
                load_config["torch_dtype"] = torch.bfloat16
            else:
                load_config["device_map"] = "auto"
                load_config["torch_dtype"] = torch.float32
                load_config["offload_folder"] = "offload_folder"
                os.makedirs("offload_folder", exist_ok=True)

            base_model_obj = AutoModelForCausalLM.from_pretrained(
                base_model,
                **load_config
            )
            
            if use_base_model:
                logger.info("Using base model without fine-tuning")
                self.model = base_model_obj
            else:
                logger.info("Loading fine-tuned model")
                self.model = PeftModel.from_pretrained(
                    base_model_obj,
                    model_path,
                    torch_dtype=load_config["torch_dtype"]
                )

            logger.info(f"Model loaded successfully on {device}")
            
            self.generation_config = {
                "max_new_tokens": 256,
                "do_sample": True,
                "temperature": 0.7,
                "top_p": 0.9,
                "repetition_penalty": 1.1,
            }
            
        except Exception as e:
            logger.error(f"Error initializing model: {str(e)}")
            raise

    def _update_history(self, message: dict) -> None:
        if self.message_history.full(): 
            self.message_history.get()    
        self.message_history.put(message) 

    def _format_messages(self):
        messages = list(self.message_history.queue)
        for i in range(len(messages)):
            # Model turns are stored in the history with the role "assistant"
            expected_role = "user" if i % 2 == 0 else "assistant"
            if messages[i]["role"] != expected_role:
                return [messages[-1]] if messages[-1]["role"] == "user" else []
        return messages

    def get_current_context_length(self, messages) -> int:
        formatted_messages = []
        for i, msg in enumerate(messages):
            role = "user" if i % 2 == 0 else "assistant"
            formatted_messages.append({"role": role, "content": msg["content"]})

        prompt = self.tokenizer.apply_chat_template(
            formatted_messages,
            tokenize=False,
            add_generation_prompt=True
        )
        tokens = self.tokenizer(prompt, return_tensors="pt", add_special_tokens=False)
        return len(tokens['input_ids'][0])

    def generate_response(self, user_input: str, add_to_history: bool = True) -> tuple[str, int]:
        try:
            # Fall back to the raw input when the history is not used
            contextualized_input = user_input
            if add_to_history:
                if self.message_history.qsize() == 0:
                    initial_setting = (
                        "あなたは老練なギリシャの哲学者ソクラテスです。\n"
                        "あなたは以下のような発言で会話をスタートしました。\n"
                        "\"今日は『自分』という、これ以上ないほど身近な存在でありながら、あまり話すことのないトピックについて話そうではないか。"
                        "人は「自分の意思で決めた」や、「自分らしさ」というような具合に、日々「自分」という言葉を多くの場面で使っておるが、そもそも「自分」という言葉を使っているとき、君は何を指していると思うかね？\"\n"
                        "それに対してあなたの対話者は以下のように返答をしてきました。文末に「かね？」や「ないか」等をいれつつ、引き続きソクラテスのような口調を使いながら、問いで返してください。\n"
                        f"\"{user_input}\""
                    )
                    contextualized_input = initial_setting
                else:
                    messages = list(self.message_history.queue)
                    
                    system_prompt = (
                        "あなたは老練なギリシャの哲学者ソクラテスです。\n"
                        "あなたは以下のような発言で会話をスタートしました。\n"
                        "\"今日は『自分』という、これ以上ないほど身近な存在でありながら、あまり話すことのないトピックについて話そうではないか。"
                        "人は「自分の意思で決めた」や、「自分らしさ」というような具合に、日々「自分」という言葉を多くの場面で使っておるが、"
                        "そもそも「自分」という言葉を使っているとき、君は何を指していると思うかね？\"\n"
                        "それに対して以下のように対話が進んでいます。文末に「かね？」や「ないか」等をいれつつ、"
                        "引き続きソクラテスのような口調を使いながら、問いで返してください。\n\n"
                    )
                    
                    conversation_history = []
                    for msg in messages:
                        if msg["role"] == "user":
                            content = msg["content"]
                            if "\"" in content:
                                content = content.split("\"")[-2]  # Take the second-to-last quoted segment
                            conversation_history.append(f'対話者: "{content}"')
                        else:
                            conversation_history.append(f'ソクラテス: "{msg["content"]}"')
                    
                    conversation_history.append(f'対話者: "{user_input}"')
                    
                    contextualized_input = system_prompt + "\n".join(conversation_history)

                self._update_history({"role": "user", "content": contextualized_input})

            current_context_length = len(self.tokenizer.encode(contextualized_input))
            
            prompt = self.tokenizer.apply_chat_template(
                [{"role": "user", "content": contextualized_input}],
                tokenize=False,
                add_generation_prompt=True
            )

            print("\n=== Current Prompt Content ===")
            print(prompt)
            print("===========================\n")

            inputs = self.tokenizer(
                prompt,
                return_tensors="pt",
                add_special_tokens=False
            ).to(self.model.device)

            outputs = self.model.generate(
                **inputs,
                **self.generation_config
            )

            decoded_output = self.tokenizer.decode(outputs[0], skip_special_tokens=False)

            response_parts = decoded_output.split("<start_of_turn>model")
            if len(response_parts) > 1:
                last_response = response_parts[-1].split("<end_of_turn>")[0].strip()
                if "model" in last_response:
                    last_response = last_response.split("model", 1)[1].strip()
            else:
                last_response = "Failed to respond"

            if add_to_history:
                self._update_history({"role": "assistant", "content": last_response})

            return last_response, current_context_length

        except Exception as e:
            logger.error(f"Error generating response: {str(e)}")
            return "There was an error", 0

def create_chat_ui(chatai: ChatAI):
    chat_output = widgets.Output()
    text_input = widgets.Text(
        placeholder='Enter your message...',
        layout=widgets.Layout(width='70%')
    )
    send_button = widgets.Button(
        description='Send',
        layout=widgets.Layout(width='15%')
    )
    end_button = widgets.Button(
        description='End Chat',
        layout=widgets.Layout(width='15%'),
        button_style='warning'
    )

    full_conversation_history = []
    
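    # Opening question, kept in Japanese because the model is prompted in Japanese.
    # English gloss: "Today, let's discuss 'self', a topic incredibly close to us that we
    # rarely talk about. ... when you use the word 'self', what exactly are you referring to?"
    initial_message = ("\nSocrates: 今日は『自分』という、これ以上ないほど身近な存在でありながら、"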
                      "あまり話すことのないトピックについて話そうではないか。\n"
                      "人は「自分の意思で決めた」や、「自分らしさ」というような具合に、日々「自分」という言葉を多くの場面で使っておるが、"
                      "そもそも「自分」という言葉を使っているとき、君は何を指していると思うかね？")
    full_conversation_history.append(initial_message)

    def on_send_button_clicked(_):
        user_input = text_input.value
        if not user_input.strip():
            return

        with chat_output:
            clear_output(wait=True)
            
            user_message = f"\nYou: {user_input}"
            full_conversation_history.append(user_message)

            current_messages = list(chatai.message_history.queue)
            if len(current_messages) % 2 == 0: 
                user_context_length = chatai.get_current_context_length(
                    current_messages + [{"role": "user", "content": user_input}]
                )
            else:
                user_context_length = chatai.get_current_context_length(current_messages)
            
            context_info_user = f"\n[Current context length after user input: {user_context_length}/8192 tokens]"
            full_conversation_history.append(context_info_user)
            
            response, model_context_length = chatai.generate_response(user_input)
            
            model_message = f"\nSocrates: {response}"
            full_conversation_history.append(model_message)
            
            context_info_model = f"\n[Current context length after model response: {model_context_length}/8192 tokens]"
            full_conversation_history.append(context_info_model)

            for message in full_conversation_history:
                print(message)

        text_input.value = ''

    def on_end_button_clicked(_):
        with chat_output:
            clear_output(wait=True)

            end_message = "\nSocrates: また会おう。\n"  # "Let us meet again."
            system_message = "(Chat ended)"
            full_conversation_history.append(end_message)
            full_conversation_history.append(system_message)
            
            for message in full_conversation_history:
                print(message)
                
        text_input.disabled = True
        send_button.disabled = True
        end_button.disabled = True

    send_button.on_click(on_send_button_clicked)
    end_button.on_click(on_end_button_clicked)
    
    with chat_output:
        print(initial_message)

    return widgets.VBox([
        chat_output,
        widgets.HBox([text_input, send_button, end_button]),
    ])

# Initialize and display chat
try:
    if USE_BASE_MODEL:
        print(f"\nInitializing Socratic AI Assistant with Base Gemma-2b...")
    else:
        print(f"\nInitializing Socratic AI Assistant with Fine-Tuned Gemma-2b...")
    
    chatai = ChatAI(use_base_model=USE_BASE_MODEL)
    chat_ui = create_chat_ui(chatai)
    display(chat_ui)
except Exception as e:
    print(f"Error initializing ChatAI: {str(e)}")


Our inference code is fundamentally the same as what we used for initial model testing (Section 2.2) and automated evaluation with Claude (Section 5.1), with optimizations for interactive use. Here are the key aspects:

### Prompt Engineering
The core challenge was working within Gemma's two-role constraint ("user" and "model") while maintaining authentic Socratic dialogue. Our solution involved carefully structured prompts that adapt between initial and subsequent exchanges. The prompt structure is:

```
<start_of_turn>user
[Character & context setting]
[Initial philosophical question about "self"]
[Conversation history (if any)]
[Current user input]
<end_of_turn>
```

The prompts begin by establishing Socrates as a character and setting up the philosophical context. The initial question, which anchors every discussion, explores the concept of "self": it asks users to examine what they actually mean when they use phrases like "I decided for myself" or "being true to myself" in daily life.

The prompt structure diverges slightly between the first interaction and subsequent ones. For the initial exchange, it uses more specific language ("has responded" / "返答をしてきました") to ensure the model pays particular attention to crafting its first response. In subsequent exchanges, it switches to a more fluid phrasing ("dialogue has progressed" / "対話が進んでいます").
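The structure described above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the function names (`build_user_turn`, `wrap_for_gemma`), the English bridge sentences, and the `User:` label are assumptions made for the example, and the turn markers are written out by hand only to show what the tokenizer's chat template is expected to produce.

```python
from typing import List


def build_user_turn(character_setting: str, initial_question: str,
                    history: List[str], user_input: str) -> str:
    """Pack all context into a single user turn (Gemma has no system role)."""
    if not history:
        # First exchange: direct the model's attention to the user's first reply.
        bridge = "The user has responded to the opening question:"
    else:
        # Later exchanges: a looser phrasing that frames an ongoing dialogue.
        bridge = "The dialogue has progressed as follows:"
    parts = [character_setting, initial_question, bridge]
    parts.extend(history)                  # earlier lines, oldest first
    parts.append(f"User: {user_input}")    # hypothetical speaker label
    return "\n".join(parts)


def wrap_for_gemma(user_turn: str) -> str:
    """Hand-written approximation of Gemma's two-role turn format."""
    return (f"<start_of_turn>user\n{user_turn}<end_of_turn>\n"
            "<start_of_turn>model\n")


if __name__ == "__main__":
    turn = build_user_turn("You are Socrates.", "What is 'self'?", [], "I am not sure.")
    print(wrap_for_gemma(turn))
```

In practice `tokenizer.apply_chat_template(..., add_generation_prompt=True)` should be preferred over hand-built markers, since it stays in sync with the model's own template.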

### Memory Management
The implementation balances capacity and resource efficiency:

- Empirically verified to stay well under the 8192-token context limit, even with longer histories (20+ turns)
- FIFO message history management using Queue
- Automatic cleanup after conversations to prevent GPU/RAM accumulation

The message history is currently limited to 7 turns. This is a provisional setting that could be adjusted after further testing of Gemma's long-text handling.
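The FIFO behaviour listed above can be illustrated with a small wrapper around `queue.Queue`. This is a sketch under stated assumptions: the class name `BoundedHistory` is hypothetical, and the default of 14 messages assumes that one "turn" means one user message plus one model reply, which the text does not specify.

```python
import gc
from queue import Queue


class BoundedHistory:
    """Fixed-size message history that drops the oldest entry when full."""

    def __init__(self, max_messages: int = 14):  # assumed: 7 turns x 2 messages
        self._max_messages = max_messages
        self._queue = Queue(maxsize=max_messages)

    def add(self, message: dict) -> None:
        if self._queue.full():
            self._queue.get()       # discard the oldest message (FIFO)
        self._queue.put(message)

    def messages(self) -> list:
        # Snapshot of the current contents, oldest first.
        return list(self._queue.queue)

    def clear(self) -> None:
        # Start from an empty queue and let Python reclaim the old objects.
        self._queue = Queue(maxsize=self._max_messages)
        gc.collect()
        # With PyTorch on a GPU, torch.cuda.empty_cache() could also be called here.


if __name__ == "__main__":
    history = BoundedHistory(max_messages=3)
    for i in range(5):
        history.add({"role": "user", "content": str(i)})
    print([m["content"] for m in history.messages()])
```

A `collections.deque(maxlen=...)` would give the same drop-oldest behaviour with less code; `Queue` is used here only because the inference code reads `message_history.queue`.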

### Generation Parameters
We kept standard generation settings, as they proved sufficient for our needs:
```python
{
    "max_new_tokens": 256,
    "temperature": 0.7,
    "top_p": 0.9,
    "repetition_penalty": 1.1
}
```
These parameters provide a balance between response coherence and conversational flexibility without requiring special tuning for our use case.
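To make the roles of `temperature` and `top_p` concrete, the sketch below applies both to a toy list of logits in plain Python. It is an illustration of the general technique (temperature scaling followed by nucleus filtering), not the library's implementation; the function name `nucleus_candidates` and the example logits are invented for this purpose, and `repetition_penalty` is left out.

```python
import math
from typing import List, Tuple


def nucleus_candidates(logits: List[float], temperature: float = 0.7,
                       top_p: float = 0.9) -> Tuple[List[int], List[float]]:
    """Return the token indices kept by top-p filtering, plus all probabilities."""
    # Temperature below 1 sharpens the distribution; above 1 flattens it.
    scaled = [x / temperature for x in logits]
    peak = max(scaled)
    exps = [math.exp(x - peak) for x in scaled]   # numerically stable softmax
    total = sum(exps)
    probs = [e / total for e in exps]

    # Keep the smallest set of most likely tokens whose mass reaches top_p.
    order = sorted(range(len(probs)), key=lambda i: probs[i], reverse=True)
    kept, cumulative = [], 0.0
    for i in order:
        kept.append(i)
        cumulative += probs[i]
        if cumulative >= top_p:
            break
    return kept, probs


if __name__ == "__main__":
    toy_logits = [2.0, 1.0, 0.0, -3.0]
    print(nucleus_candidates(toy_logits, temperature=0.7)[0])
    print(nucleus_candidates(toy_logits, temperature=2.0)[0])
```

With these toy logits, a lower temperature leaves fewer candidates inside the nucleus, which is the coherence side of the trade-off. Note that in Hugging Face `transformers`, `temperature` and `top_p` only take effect when sampling is enabled (`do_sample=True`).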

## 6.2 Output Examples

Below is an example of our model's output. Output quality varied considerably across test runs; the following conversation is one of the more successful interactions and shows the model's capability to engage in Socratic dialogue:

Socrates(JP): 今日は『自分』という、これ以上ないほど身近な存在でありながら、あまり話すことのないトピックについて話そうではないか。
人は「自分の意思で決めた」や、「自分らしさ」というような具合に、日々「自分」という言葉を多くの場面で使っておるが、そもそも「自分」という言葉を使っているとき、君は何を指していると思うかね？
Socrates(EN): Today, let's discuss a topic that, while being incredibly close to us, we rarely talk about: the concept of "self." People frequently use phrases like "I decided for myself" or "being true to myself," but when you use the word "self," what exactly are you referring to?

Me(JP): 難しいですね・・ただ、強いて言うと自分だと思いたい何かが作り上げた概念なのではないかなと思います
Me(EN): That's a difficult question... but if I had to say, I think it might be a concept constructed by something that wants to be considered as "self."

Socrates(JP): 「作り上げた概念」とは、本当に「作り上げた」ものなのかね？  それとも、既に存在していたものを、人によって新しい解釈を与えられてきただけなのかね？  その「作り上げる」行為自体も、誰かの意志に基づいているのではないだろうか。
Socrates(EN): Is this "constructed concept" truly "constructed"? Or is it merely that humans have given new interpretations to something that already existed? Might not the very act of "constructing" itself be based on someone's will?

Me(JP): そうですね、おっしゃる通り、既に存在していたものを「自分だと思いたいなにか」が「自ら作り上げた」と思い込んでいるだけ、という捉え方もできますね。鋭いご指摘をありがとうございます。
Me(EN): Yes, you're right. We could also see it as just something that "wants to be considered as self" merely believing it has "constructed" what already existed. Thank you for that insightful observation.

Socrates(JP): では、君は「自分だと思いたい」という思いだけで、既に存在していたものを「作り上げている」と呼べるのだろうか？  また、誰がその「作り上げ」を認めているのかね？
Socrates(EN): Then, can we really say we are "constructing" something that already existed merely through our desire to "be considered as self"? And who exactly is acknowledging this "construction"?

Me(JP): おっしゃる通り自分が作り上げていると呼ぶのはおこがましい、という見かたもできますね。認めている存在というのは存在せず、自分の意識が「作り上げている」と思いたいだけだと思います。
Me(EN): You're right, it might be presumptuous to say we're constructing it. I think there isn't really anyone acknowledging it - it's just our consciousness wanting to believe it's "constructing" something.

Socrates(JP): では、この「作り上げた概念」が、本当の意味での「自分」なのか、それとも別のものなのか、考えてみないかね？
Socrates(EN): Then shall we consider whether this "constructed concept" is the "true self," or perhaps something else entirely?

Me(JP): 本当の意味での「自分」かどうか、を判断するためには、そもそも「本当」が何か、を定義する必要があると思います
Me(EN): To discuss whether it's a "true" self, I think we first need to define what we mean by "true" itself.

Socrates(JP): 「本当」とは何かというのは、実に重要な問いだね。だが、「本当」は、単なる言葉遊びではなく、むしろ真実を探求する最中にあるものなのかね？  もし「真実」とは、人間が意図的に見つけ出すものだったならば、それは「本当の真実」ではないのではないかね？ さて、この「真実」という言葉も、実は「作り上げた概念」の一つだと考えられるのだろうか？
Socrates(EN): The question of what is "true" is indeed an important one. But is "true" mere wordplay, or is it something we find in the pursuit of truth? If "truth" is something humans intentionally discover, can it be "real truth"? Could this word "truth" itself be another "constructed concept"?

Me(JP): はい、「本当」や「真実」も、人が作り上げた概念であり、主観に大きく依存したもの、という捉え方のほうが違和感がないです
Me(EN): Yes, I think "truth" and "real" are also human-constructed concepts that heavily depend on subjectivity - this interpretation feels more natural to me.

Socrates(JP): なるほど。では、この「主観」と「客観」についても考えてみよう。君は「主観」と「客観」という言葉を使うことが多いね。だが、これは本当に「客観的な事実」を表しているのかね？ それとも、また「作り上げた概念」なのだろうか？  —かね？
Socrates(EN): I see. Let's think about "subjective" and "objective" then. You often use these words. But do they truly represent "objective facts"? Or are they, too, "constructed concepts"?

Me(JP): 良い質問ですね。個人的な主観ですが、客観的な事実もまた、人々が作り上げた概念である、という捉え方もできると思います。
Me(EN): That's a good question. From my subjective view, I think even objective facts could be seen as concepts constructed by people.

Socrates(JP): 「人々が作り上げた概念」という言葉は、実に興味深い選択だね。では、この「作り上げる」行為自体もまた、一つの「作り上げた概念」ではないのかね？ —かね？
Socrates(EN): "Concepts constructed by people" is an interesting choice of words. Could this very act of "constructing" itself be another "constructed concept"?

Me(JP): おっしゃる通りだと思います。この行為そのものも例外ではないと思います。
Me(EN): Yes, I think so. I think this act itself is no exception.

Socrates(JP): 「例外ではない」という表現は、確かに興味深い選択だね。だが、ここで少し立ち止まって考えてみよう。もし「真実」という言葉が、本当に「作り上げた概念」であれば、この対話は終焉したと言えるのだろうか？ —かね？ それとも、この対話自体もまた、一つの「作り上げた概念」の一つなのだろうか？ —かね？ さあ、君はどう考えるかね？
Socrates(EN): "No exception" is indeed an interesting expression. But let us pause and consider: if the word "truth" is truly a "constructed concept," can we say this dialogue has reached its conclusion? Or is this dialogue itself perhaps another "constructed concept"? What do you think?

Me(JP): 良い質問ですね。拡張する範囲によっては、この対話自体もまた、作り上げられた概念である、という解釈もできると思います。
Me(EN): That's a good question. Depending on how far we extend this thinking, this dialogue itself could also be interpreted as a constructed concept.

Socrates(JP): そうだね。では、この「作り上げる」行為そのものもまた、一つの「作り上げた概念」ではないのかね？ —かね？ さて、この「作り上げる」という言葉も、実は「作り上げた概念」の一つなのかね？
Socrates(EN): Indeed. Then, is this very act of "constructing" itself another "constructed concept"? And might this word "constructing" itself be a "constructed concept"?

Me(JP): おっしゃる通りだと思います。この「作り上げる」という言葉自体も「作り上げられた概念」に内包される、という解釈もできると思います。
Me(EN): Yes, I agree. We could interpret this word "constructing" itself as being contained within "constructed concepts."

Socrates(JP): 「作り上げる」という言葉もまた、「作り上げた概念」の一つなのだろうか？  —かね？  ならば、それは「作り上げた概念」の一部なのではないか。  そして、その「作り上げた概念」の中で最も根本的なものは何だと考えられるかね？  —かね？ さあ、君は答えてくれ。
Socrates(EN): Could the word "constructing" itself be another "constructed concept"? If so, wouldn't it be part of "constructed concepts"? And what do you think might be the most fundamental among these "constructed concepts"?

Me(JP): あくまで私の主観ですが、一番根本にあるものは「在る」ということだと思います。解釈は全て事後的なものであり、最初にあるのはそこに「在る」ものであり、解釈や概念といったものはそれに付属するものでしかないのかなと。
Me(EN): From my subjective view, I think the most fundamental thing is being itself. All interpretations come after the fact - what's there at the beginning is simply being-there, and interpretations and concepts are merely auxiliary to that.

Socrates(JP): 「在る」という言葉を用いているが、これは「作り上げる」という行為の帰結と言えるのだろうか？  —かね？  あるいは、これは「作り上げた概念」の一部だけなのかね？ さあ、君の考えを確かめてくれ。 よみがえるものを、よみがえる言葉で表現してみようかね？ …
Socrates(EN): You speak of "being," but can this be considered a result of the act of "constructing"? Or is this merely part of "constructed concepts"? Let us examine your thoughts. Shall we try to express what is revealed in revealing words?

Me(JP): 作り上げる行為の帰結ではなく、はじまりに「在る」があり、作り上げる行為は副産物、という意味です。
Me(EN): I mean that being precedes the act of constructing - the act of constructing is merely a byproduct.

Socrates(JP): なるほど。「在る」というものが、先にあってから「作り上げる」ことができるという状態だ。これは興味深い考えだね。だが、ここでもまた、「作り上げる」という言葉が使われているが、これもまた「作り上げた概念」の一部なのだろうか？ —かね？ さあ、君は答えてくれ。
Socrates(EN): I see. So being exists first, and only then can "constructing" occur. This is an interesting thought. But here again, we're using the word "constructing" - might this too be part of "constructed concepts"?

Me(JP): はい、こういった解釈も副産物の一つだと考えます。
Me(EN): Yes, I think this interpretation is also one of the byproducts.

Socrates(JP): 「在る」という言葉についても、また別の「作り上げた概念」なのだろうか？  —かね？  では、この「在る」という言葉もまた「作り上げた概念」の一つなのだろうか？ さあ、君はどう思うかね？
Socrates(EN): Might "being" itself be yet another "constructed concept"? Could this word "being" itself be one of our "constructed concepts"? What do you think?

Me(JP): おっしゃる通り、拡張をしていけば、そういった解釈もできると思います。
Me(EN): Yes, I think if we extend our thinking, such an interpretation is possible.

Socrates(JP): 「拡張してできる」とは、つまり「ある」という言葉自体も「作り上げた概念」の一つなのだろうか？  —かね？  なら、私たちは「存在」という言葉さえも「作り上げた概念」ということになるのだろうか？  そして、もしそうだとすれば、私たち人間は本当に「自分」と言えるのだろうか？  —かね？ さあ、君はどう答えようとするのかね？
Socrates(EN): When you say "possible through extension," does this mean that even "being" itself is a "constructed concept"? Then, does this mean that even the word "existence" itself is a "constructed concept"? And if so, can we humans truly say we have a "self"?

Me(JP): はい、その可能性もあると思います。ただ、そういった可能性があったとしても、人間が「自分」という単位で自分を捉えることになんら問題はないのではないでしょうか。
Me(EN): Yes, that's possible. However, even if such possibilities exist, I don't think there's any problem with humans perceiving themselves in terms of "self."

Socrates(JP): 「自分」という言葉が、このように常に「作り上げた概念」として扱われることに気づいたようだ。だが、もし「自分」という言葉が作り上げられたものと認識されているのなら、それは私たちの自由意志ではないのだろうか。  —かね？  では、今日の対話はここまでにしておこうかな。 さて、次の問いへ向けようではないか。
Socrates(EN): It seems you've realized that the word "self" is always treated as a "constructed concept." But if the word "self" is recognized as something constructed, isn't that our free will? Let's end our dialogue here for today. Now, let us contemplate what our next inquiry shall be.

Me(JP): 素敵な解釈ですね。はい、またよろしくお願いいたします。
Me(EN): That's a lovely interpretation. Yes, I look forward to our next conversation.
