## Executive Summary

### Problem Identified ✅
The "javascript" default language in code blocks comes from **@lexical/code's `PrismTokenizer`**, NOT from the custom transformer. `NEUTRAL_CODE_TRANSFORMER` works correctly, but `registerCodeHighlighting` later assigns the tokenizer's default language to any code node whose language is `undefined`.

### Impact Assessment
- **Severity**: Medium - Affects user experience but doesn't break functionality
- **Scope**: All code blocks without explicit language specification  
- **Root Cause**: Third-party library default behavior

### Solution Status
- **Fix Required**: Single function modification in `CodeHighlightPlugin`
- **Risk Level**: Low - Well-contained change with fallback behavior
- **Testing Required**: Basic verification of language-less and language-specific code blocks

### Technical Details
- **Source**: `/node_modules/@lexical/code/LexicalCode.dev.js:562` - `DEFAULT_CODE_LANGUAGE = 'javascript'`
- **Trigger**: Line 858 - `node.setLanguage(tokenizer.defaultLanguage)` when language is undefined
- **Solution**: Custom tokenizer with `defaultLanguage: ''` instead of `'javascript'`

This investigation confirms that the custom transformer implementation is working correctly and the issue lies in a downstream highlighting system that needs a small configuration override.

Here's the exact code that needs to be updated in `LexicalEditor.tsx`.

**BEFORE** (lines 150-158):

```javascript
function CodeHighlightPlugin() {
  const [editor] = useLexicalComposerContext();

  useEffect(() => {
    return registerCodeHighlighting(editor);
  }, [editor]);

  return null;
}
```

**AFTER** (recommended fix):

```javascript
function CodeHighlightPlugin() {
  const [editor] = useLexicalComposerContext();

  useEffect(() => {
    // Custom tokenizer that preserves an unspecified language.
    // Assumes `Prism` is in scope, e.g. via `import Prism from 'prismjs'`.
    const NEUTRAL_TOKENIZER = {
      defaultLanguage: '', // Empty string instead of 'javascript'
      tokenize(code, language) {
        const grammar = language ? Prism.languages[language] : undefined;
        // No or unknown language: return the text as one plain token.
        // The highlighter rebuilds the block from these tokens, so an
        // empty array would drop the text instead of leaving it unstyled.
        if (!grammar) return [code];
        return Prism.tokenize(code, grammar);
      }
    };

    return registerCodeHighlighting(editor, NEUTRAL_TOKENIZER);
  }, [editor]);

  return null;
}
```

## Step 4: Verification and Recommendations

### Verification Summary

✅ **Confirmed**: The issue is NOT in our custom transformer  
✅ **Confirmed**: The issue is NOT in @lexical/markdown transformers  
✅ **Confirmed**: The issue IS in @lexical/code's `registerCodeHighlighting`  

### Root Cause Location
**File**: `/node_modules/@lexical/code/LexicalCode.dev.js`  
**Lines**: 562 (constant definition), 858 (auto-assignment)  

Because `PrismTokenizer.defaultLanguage` is `'javascript'`, the highlighting transform assigns JavaScript to every code block that has no explicit language.

### Immediate Fix Recommendation

**Modify the `CodeHighlightPlugin` in `LexicalEditor.tsx`** (lines 150-158):

```javascript
function CodeHighlightPlugin() {
  const [editor] = useLexicalComposerContext();

  useEffect(() => {
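    // Custom tokenizer that doesn't force defaults.
    // Assumes `Prism` is in scope, e.g. via `import Prism from 'prismjs'`.
    const NEUTRAL_TOKENIZER = {
      defaultLanguage: '', // Empty string instead of 'javascript'
      tokenize(code, language) {
        const grammar = language ? Prism.languages[language] : undefined;
        // No or unknown language: return the text as one plain token.
        // The highlighter rebuilds the block from these tokens, so an
        // empty array would drop the text instead of leaving it unstyled.
        if (!grammar) return [code];
        return Prism.tokenize(code, grammar);
      }
    };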
    
    return registerCodeHighlighting(editor, NEUTRAL_TOKENIZER);
  }, [editor]);

  return null;
}
```

### Alternative Solutions Ranking

1. **Custom Tokenizer** (BEST) - Clean, minimal change
2. **Node Transform Override** - More complex, potential performance impact  
3. **Custom registerCodeHighlighting** - High maintenance, duplicates library code

### Testing Plan

1. Apply the fix
2. Type `` ``` `` without a language
3. Verify code block shows no language attribute
4. Verify syntax highlighting still works for explicit languages (`` ```javascript ``, `` ```python ``, etc.)
5. Check that existing code blocks aren't affected
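
Steps 2 to 4 of the plan can also be checked without a browser. The sketch below is a test harness under stated assumptions, not part of the fix: `createNeutralTokenizer` and `fakePrism` are hypothetical names, and `fakePrism` only mimics the two members of Prism that the tokenizer touches (`languages` and `tokenize`). It returns the raw text as a single token when no grammar applies, on the assumption that the highlighter rebuilds the block's contents from the returned tokens.

```javascript
// Hypothetical factory: builds a neutral tokenizer around any Prism-like object.
function createNeutralTokenizer(prismLike) {
  return {
    defaultLanguage: '',
    tokenize(code, language) {
      const known =
        Boolean(language) &&
        Object.prototype.hasOwnProperty.call(prismLike.languages, language);
      // No language or unknown language: pass the text through untouched.
      if (!known) return [code];
      return prismLike.tokenize(code, prismLike.languages[language]);
    },
  };
}

// Minimal stand-in for Prism, for tests only. It knows one "grammar".
const fakePrism = {
  languages: { javascript: { keyword: /\bconst\b/ } },
  tokenize(code, grammar) {
    // Split on the keyword and tag the matches, loosely imitating Prism tokens.
    return code
      .split(/(\bconst\b)/)
      .filter((part) => part !== '')
      .map((part) =>
        grammar.keyword.test(part) ? { type: 'keyword', content: part } : part
      );
  },
};

const tokenizer = createNeutralTokenizer(fakePrism);

// Language-less block: one plain-text token, nothing highlighted.
const plainTokens = tokenizer.tokenize('const x = 1', undefined);

// Explicit language: the keyword is tagged, the rest stays plain text.
const jsTokens = tokenizer.tokenize('const x = 1', 'javascript');

// Unknown language: treated like a language-less block.
const unknownTokens = tokenizer.tokenize('const x = 1', 'klingon');
```

Injecting the Prism-like object keeps the tokenizer logic testable in plain Node; in the editor the real `Prism` would be passed instead.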

## Step 3: Solution Analysis

### The Problem Flow

1. User types `` ``` `` with no language
2. `NEUTRAL_CODE_TRANSFORMER` creates `$createCodeNode(undefined)` ✅
3. `CodeHighlightPlugin` calls `registerCodeHighlighting(editor)` ✅
4. **BUG**: `registerCodeHighlighting` uses `PrismTokenizer` with `defaultLanguage: 'javascript'`
5. When highlighting kicks in, line 858 runs: `node.setLanguage(tokenizer.defaultLanguage)` ❌
6. Node language changes from `undefined` → `"javascript"` ❌
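
The flow above can be modelled in a few lines of plain JavaScript. This is a simplified sketch, not Lexical's source: `makeNode` and `highlightTransform` are hypothetical stand-ins for `$createCodeNode` and the transform that `registerCodeHighlighting` installs, and only the language-defaulting step is reproduced.

```javascript
// Hypothetical stand-in for $createCodeNode: stores the language as given.
function makeNode(language) {
  return { language };
}

// Hypothetical stand-in for the highlighting transform. Only the
// defaulting step from the flow above (line 858) is modelled.
function highlightTransform(node, tokenizer) {
  if (node.language === undefined) {
    node.language = tokenizer.defaultLanguage;
  }
  return node;
}

const stockTokenizer = { defaultLanguage: 'javascript' };
const neutralTokenizer = { defaultLanguage: '' };

// The transformer creates a language-less node, then highlighting runs.
const withStockDefault = highlightTransform(makeNode(undefined), stockTokenizer);
const withNeutralDefault = highlightTransform(makeNode(undefined), neutralTokenizer);

// An explicit language is never overwritten by either tokenizer.
const explicitPython = highlightTransform(makeNode('python'), stockTokenizer);

console.log(JSON.stringify(withStockDefault));   // language becomes "javascript"
console.log(JSON.stringify(withNeutralDefault)); // language becomes ""
console.log(JSON.stringify(explicitPython));     // language stays "python"
```

The model shows why the transformer cannot fix this on its own: the defaulting happens in a later step that only consults the tokenizer.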

### Possible Solutions

#### Option 1: Custom PrismTokenizer (RECOMMENDED)
Override the default tokenizer to use `null` or `"plain"` instead of `"javascript"`:

```javascript
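// Assumes `Prism` is in scope, e.g. via `import Prism from 'prismjs'`.
const CUSTOM_TOKENIZER = {
  defaultLanguage: null, // or 'plain'
  tokenize(code, language) {
    const grammar = language ? Prism.languages[language] : undefined;
    // No or unknown language: return the text as one plain token.
    // An empty array would leave the highlighter nothing to rebuild the block from.
    if (!grammar) return [code];
    return Prism.tokenize(code, grammar);
  }
};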

// In CodeHighlightPlugin
return registerCodeHighlighting(editor, CUSTOM_TOKENIZER);
```

#### Option 2: Node Transform Override  
Add a node transform to prevent language changes:

```javascript
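editor.registerNodeTransform(CodeNode, (node) => {
  // wasCreatedWithoutLanguage is a placeholder, not an existing helper:
  // it would have to track which nodes the transformer created language-less.
  // The getLanguage() check avoids re-marking an already neutral node as dirty.
  if (wasCreatedWithoutLanguage(node) && node.getLanguage() !== undefined) {
    node.setLanguage(undefined);
  }
});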
```

#### Option 3: Custom registerCodeHighlighting
Create our own version that doesn't auto-set defaults.

## Step 2: Found the Root Cause - DEFAULT_CODE_LANGUAGE

**FOUND IT!** The issue is in the Lexical library itself.

### Key Findings:

1. **@lexical/code sets a default language**: In `/node_modules/@lexical/code/LexicalCode.dev.js:562`
   ```javascript
   const DEFAULT_CODE_LANGUAGE = 'javascript';
   ```

2. **CodeNode constructor analysis**: The constructor doesn't set a default:
   ```javascript
   constructor(language, key) {
     super(key);
     this.__language = language || undefined;  // No default here
   }
   ```

3. **The default is applied during syntax highlighting**: Lines 858 and 871 show where defaults are applied:
   - Line 858: `node.setLanguage(tokenizer.defaultLanguage);` (when language is undefined)
   - Line 871: Uses `tokenizer.defaultLanguage` as fallback in tokenization

4. **PrismTokenizer uses the default**: Line 699 shows:
   ```javascript
   defaultLanguage: DEFAULT_CODE_LANGUAGE,  // = 'javascript'
   ```

### The Real Issue

Our custom `NEUTRAL_CODE_TRANSFORMER` correctly creates nodes without language (`$createCodeNode(undefined)`), but:

1. The `registerCodeHighlighting` plugin automatically applies defaults during highlighting
2. When a CodeNode has `undefined` language, the highlighting system sets it to `"javascript"`
3. This happens **after** our transformer runs, during the highlighting phase

## Step 1: Examining the Main Lexical Editor

Found the main `LexicalEditor.tsx` file. Key findings:

1. **Custom NEUTRAL_CODE_TRANSFORMER**: Lines 88-121 implement a custom transformer that should avoid defaulting to "javascript"
   - Line 110: `$createCodeNode(language)` - calls with `language` or `undefined`
   - Language is extracted from regex match[1] or set to `undefined` if no match

2. **Custom Transformers Setup**: Lines 123-127 filter out default code transformers and use the custom one

3. **$createCodeNode Usage**: The custom transformer calls `$createCodeNode(language)` where language can be `undefined`
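
A minimal sketch of the extraction described in item 1. The regex and the `extractLanguage` name are illustrative assumptions; the actual pattern lives at lines 88-121 of `LexicalEditor.tsx` and is not reproduced in this notebook.

```javascript
// Assumed fence pattern: three backticks, an optional language word, then whitespace.
const ASSUMED_FENCE_REGEX = /^```([\w-]+)?\s/;

// Returns the captured language, or undefined when the fence names none
// (or when the line is not a fence at all).
function extractLanguage(line) {
  const match = ASSUMED_FENCE_REGEX.exec(line);
  if (!match) return undefined;
  return match[1] || undefined;
}

const withLanguage = extractLanguage('```python ');
const withoutLanguage = extractLanguage('``` ');
const notAFence = extractLanguage('plain text');
```

Under this pattern `withLanguage` is `'python'`, while the other two results are `undefined`, which is the value the transformer then hands to `$createCodeNode`.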

**Next**: Need to examine the actual `$createCodeNode` implementation from @lexical/code to see if it has internal defaults.

# JavaScript Default Language Investigation

## Objective
Investigate where the "javascript" default language is being set for code blocks in the Lexical editor implementation.

## Context
- Custom NEUTRAL_CODE_TRANSFORMER implemented to avoid defaulting to "javascript"
- Transformer appears to work (shows correct regex in console)
- Code blocks still default to "javascript" when no language is specified
- Suspect CodeNode itself may have this default

## Investigation Plan
1. Search for hardcoded "javascript" defaults in codebase
2. Analyze $createCodeNode() implementation
3. Check for other transformers/plugins that might override custom transformer
4. Look for CSS/styling adding "javascript" default
5. Trace execution path from `` ``` `` input to code block creation
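
Step 1 of the plan can be scripted. The following is one possible sketch using only Node's standard library; `findJavascriptDefaults` is a hypothetical helper, and the pattern is a guess at what a hardcoded default might look like, so it may need widening.

```javascript
const fs = require('fs');
const path = require('path');

// Matches the string literals 'javascript' and "javascript".
const JAVASCRIPT_LITERAL = /['"]javascript['"]/;

// Recursively collects { file, line, text } for every matching line
// in .js/.jsx/.ts/.tsx files under dir.
function findJavascriptDefaults(dir, hits = []) {
  for (const entry of fs.readdirSync(dir, { withFileTypes: true })) {
    const fullPath = path.join(dir, entry.name);
    if (entry.isDirectory()) {
      findJavascriptDefaults(fullPath, hits);
    } else if (/\.(js|jsx|ts|tsx)$/.test(entry.name)) {
      const lines = fs.readFileSync(fullPath, 'utf8').split('\n');
      lines.forEach((text, index) => {
        if (JAVASCRIPT_LITERAL.test(text)) {
          hits.push({ file: fullPath, line: index + 1, text: text.trim() });
        }
      });
    }
  }
  return hits;
}
```

Running `findJavascriptDefaults('node_modules/@lexical/code')` from the project root would be a reasonable first pass; the same function can then be pointed at the application's own source directory.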