Auto-completion disabled for languages with ID >= 256 due to LanguageID bits overlapping with TokenType bits

Does this issue occur when all extensions are disabled?: Yes/No



- VS Code Version: 1.92.2
- OS Version: 

I have discovered an issue in VS Code's LineTokens metadata generation, specifically related to the 8-bit limit for LanguageID.

According to the official blog ([Syntax Highlighting Optimizations](https://code.visualstudio.com/blogs/2017/02/08/syntax-highlighting-optimizations#_tokens-theme-matching)), VS Code uses the lowest 8 bits of the metadata integer for LanguageID and the next three bits (bits 9-11, counting from 1) for TokenType.

<img width="767" height="486" alt="Image" src="https://github.com/user-attachments/assets/7b1ef5cf-fa62-4562-9e64-1a9b9d73f900" />

When a line has no stored tokens, getTokens falls back to getDefaultMetadata, which constructs the value as follows:

```TypeScript
// vscode/src/vs/editor/common/tokens/contiguousTokensStore.ts

public getTokens(topLevelLanguageId: string, lineIndex: number, lineText: string): LineTokens {
	let rawLineTokens: Uint32Array | ArrayBuffer | null = null;
	if (lineIndex < this._len) {
		rawLineTokens = this._lineTokens[lineIndex];
	}

	if (rawLineTokens !== null && rawLineTokens !== EMPTY_LINE_TOKENS) {
		return new LineTokens(toUint32Array(rawLineTokens), lineText, this._languageIdCodec);
	}

	const lineTokens = new Uint32Array(2);
	lineTokens[0] = lineText.length;
	lineTokens[1] = getDefaultMetadata(this._languageIdCodec.encodeLanguageId(topLevelLanguageId));
	return new LineTokens(lineTokens, lineText, this._languageIdCodec);
}

...

function getDefaultMetadata(topLevelLanguageId) {
    return ((topLevelLanguageId << MetadataConsts.LANGUAGEID_OFFSET) // <- here
        | (StandardTokenType.Other << MetadataConsts.TOKEN_TYPE_OFFSET)
        | ... ) >>> 0;
}
```

### Reproduction 

When more than 255 languages are registered (the 8-bit limit), the topLevelLanguageId of every subsequently registered language no longer fits in 8 bits. Because topLevelLanguageId is shifted and bitwise OR-ed with StandardTokenType without being masked first, its high bits spill into the bits reserved for TokenType.

For example, a LanguageID of 325 (binary `1 0100 0101`) spills its ninth bit into the TokenType bit field, so the metadata becomes `xxxx 001 0100 0101` when it should be `xxxx 000 (1)0100 0101`. The TokenType field, which should hold StandardTokenType.Other (0), is erroneously changed to StandardTokenType.Comment (1) by this spill-over.
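The arithmetic above can be checked with a small standalone sketch. The constants below are assumptions that mirror the bit layout described in this issue (8 bits for LanguageID, 3 bits for TokenType); they are not copied from the VS Code source, and `buildMetadataUnmasked` is a hypothetical stand-in for the relevant part of getDefaultMetadata.

```typescript
// Assumed bit layout, taken from the description in this issue (not from the VS Code source).
const LANGUAGEID_OFFSET = 0;
const LANGUAGEID_MASK = 0xff;                      // lowest 8 bits
const TOKEN_TYPE_OFFSET = 8;
const TOKEN_TYPE_MASK = 0x7 << TOKEN_TYPE_OFFSET;  // next 3 bits

const STANDARD_TOKEN_TYPE_OTHER = 0;
const STANDARD_TOKEN_TYPE_COMMENT = 1;

// Hypothetical stand-in: ORs the language ID into the metadata without masking it.
function buildMetadataUnmasked(languageId: number): number {
  return (
    (languageId << LANGUAGEID_OFFSET) |
    (STANDARD_TOKEN_TYPE_OTHER << TOKEN_TYPE_OFFSET)
  ) >>> 0;
}

function decodeLanguageId(metadata: number): number {
  return (metadata & LANGUAGEID_MASK) >>> LANGUAGEID_OFFSET;
}

function decodeTokenType(metadata: number): number {
  return (metadata & TOKEN_TYPE_MASK) >>> TOKEN_TYPE_OFFSET;
}

const metadata = buildMetadataUnmasked(325); // 325 = 0b1_0100_0101
console.log(decodeLanguageId(metadata));     // 69: the ninth bit is lost on decode
console.log(decodeTokenType(metadata));      // 1: reads back as Comment instead of Other
```

An ID of 255 or less leaves the decoded token type at 0 in this model, which matches the observation that only languages with an ID >= 256 are affected.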

### Actual

Since quickSuggestion logic relies on TokenType being Other to trigger by default, this metadata corruption causes auto-completion to be silently disabled for all languages with an ID >= 256.
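To illustrate why a wrong TokenType silences suggestions, here is a simplified model of a token-type gate. It is not VS Code's implementation: `quickSuggestionsAllowed` and the constant names are hypothetical, and the defaults are assumed to match the documented `editor.quickSuggestions` defaults (`other` on, `comments` and `strings` off).

```typescript
type QuickSuggestionsValue = "on" | "off" | "inline";

interface QuickSuggestionsConfig {
  other: QuickSuggestionsValue;
  comments: QuickSuggestionsValue;
  strings: QuickSuggestionsValue;
}

// Assumed defaults for the editor.quickSuggestions setting.
const defaultQuickSuggestions: QuickSuggestionsConfig = {
  other: "on",
  comments: "off",
  strings: "off",
};

// Token type values as used in this issue: Other = 0, Comment = 1; String = 2 is assumed.
const TOKEN_TYPE_OTHER = 0;
const TOKEN_TYPE_COMMENT = 1;
const TOKEN_TYPE_STRING = 2;

// Hypothetical gate: picks the config entry that corresponds to the token type.
function quickSuggestionsAllowed(tokenType: number, config: QuickSuggestionsConfig): boolean {
  switch (tokenType) {
    case TOKEN_TYPE_COMMENT:
      return config.comments !== "off";
    case TOKEN_TYPE_STRING:
      return config.strings !== "off";
    default:
      return config.other !== "off";
  }
}

console.log(quickSuggestionsAllowed(TOKEN_TYPE_OTHER, defaultQuickSuggestions));   // true
console.log(quickSuggestionsAllowed(TOKEN_TYPE_COMMENT, defaultQuickSuggestions)); // false
```

Under this model, a line whose default metadata decodes as Comment is treated like a comment everywhere, so suggestions never trigger unless the user turns them on for comments.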

### Expected

Even though it might not be a long-term goal to support more than 255 languages in a single instance, this behavior is a silent failure that leads to difficult-to-debug issues. I propose implementing a bitwise mask or modulo operation on topLevelLanguageId (e.g., topLevelLanguageId & 0xFF) before shifting to ensure that overflow does not contaminate the TokenType field.
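A minimal sketch of the proposed mask, under the same assumed bit layout as described in this issue. `buildMetadataMasked` is a hypothetical name, and the constants are illustrative rather than taken from the VS Code source.

```typescript
// Assumed bit layout, taken from the description in this issue (not from the VS Code source).
const LANGUAGEID_OFFSET = 0;
const LANGUAGEID_MASK = 0xff;                      // lowest 8 bits
const TOKEN_TYPE_OFFSET = 8;
const TOKEN_TYPE_MASK = 0x7 << TOKEN_TYPE_OFFSET;  // next 3 bits
const STANDARD_TOKEN_TYPE_OTHER = 0;

// Hypothetical variant: masks the language ID so it cannot touch the TokenType bits.
function buildMetadataMasked(languageId: number): number {
  const maskedLanguageId = (languageId << LANGUAGEID_OFFSET) & LANGUAGEID_MASK;
  return (maskedLanguageId | (STANDARD_TOKEN_TYPE_OTHER << TOKEN_TYPE_OFFSET)) >>> 0;
}

const masked = buildMetadataMasked(325);
const maskedTokenType = (masked & TOKEN_TYPE_MASK) >>> TOKEN_TYPE_OFFSET;
const maskedLanguageId = (masked & LANGUAGEID_MASK) >>> LANGUAGEID_OFFSET;

console.log(maskedTokenType);  // 0: TokenType stays Other
console.log(maskedLanguageId); // 69: the ID still wraps around
```

The mask protects the TokenType field, but the decoded LanguageID still wraps (325 reads back as 69), so this contains the corruption rather than adding support for more than 255 languages.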

Auto-completion disabled for languages with ID >= 256 due to LanguageID bits overlapping with TokenType bits #319118
