Zed selection line range is incorrect with non-ASCII text when using DB #24808

@Rostov4an1n

Description

Hi! I found an issue with Zed selection detection when OpenCode uses the Zed SQLite DB.

When the selected file contains non-ASCII text before or inside the selection, OpenCode reports an incorrect line range. It looks like Zed stores selection offsets as UTF-8 byte offsets, but OpenCode treats them as JavaScript string indexes.

Environment

  • Editor: Zed
  • File contains Cyrillic text

Example

Given this Next.js layout file:

import type { Metadata } from 'next'
import { Inter, Manrope } from 'next/font/google'
import { Toaster } from '@/shared/ui'
import { Providers } from './_providers'
import './globals.css'

const manrope = Manrope({
	subsets: ['latin', 'cyrillic'],
	weight: ['400', '500', '600', '700'],
	variable: '--font-manrope',
})

const inter = Inter({
	subsets: ['latin', 'cyrillic'],
	weight: ['400', '500', '600', '700'],
	variable: '--font-inter',
})

export const metadata: Metadata = {
	title: 'KanbanFlow',
	description: 'Эффективное управление задачами и проектами с помощью Kanban-досок',
}

export default function RootLayout({
	children,
}: Readonly<{
	children: React.ReactNode
}>) {
	return (
		<html
			lang="ru"
			className={`${inter.variable} ${manrope.variable}`}
			suppressHydrationWarning
		>
			<body>
				<Providers>
					<Toaster />
					{children}
				</Providers>
			</body>
		</html>
	)
}

Actual behavior

Selections before the Cyrillic line work correctly:

Selecting lines 13–17 is detected as 13–17

But selections that include or come after the Cyrillic line become incorrect:

Selecting lines 19–22 is detected as 19–26
Selecting only line 21 is detected as 21–26
Selecting lines 35–40 is detected as 38–44
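The drift can be reproduced without Zed at all. Below is a minimal sketch, assuming a hypothetical `lineAt` helper (not part of OpenCode) that counts newlines before a string index. Passing a UTF-8 byte offset where a string index is expected lands on a later line as soon as a Cyrillic line precedes it.

```typescript
// Hypothetical helper: 1-based line number that contains the given string index.
function lineAt(text: string, index: number): number {
	let line = 1
	for (let i = 0; i < index && i < text.length; i++) {
		if (text[i] === '\n') line++
	}
	return line
}

// Line 2 is Cyrillic; the target is the start of line 3 ("b").
const driftSample = 'a\nЭфф\nb\nc\nd\n'

// The same position expressed as a string index and as a UTF-8 byte offset.
const targetIndex = driftSample.indexOf('b')
const targetByteOffset = new TextEncoder().encode(driftSample.slice(0, targetIndex)).length

console.log(lineAt(driftSample, targetIndex)) // 3
console.log(lineAt(driftSample, targetByteOffset)) // 4, one line too far
```

The longer the non-ASCII text before the selection, the larger the gap between the two offsets, which would match the growing shift in the ranges above.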

Expected behavior

OpenCode should report the same line range that is selected in Zed.

For example:

Selecting 35–40 should be detected as 35–40
Selecting only line 21 should be detected as line 21

Possible cause

In packages/opencode/src/cli/cmd/tui/context/editor-zed.ts, OpenCode reads selection_start and selection_end from Zed’s SQLite database and then uses them directly as JavaScript string offsets:

const startOffset = Math.min(row.selection_start, row.selection_end)
const endOffset = Math.max(row.selection_start, row.selection_end)

text.slice(startOffset, endOffset)
offsetsToSelection(text, startOffset, endOffset)

However, Zed appears to store these offsets as UTF-8 byte offsets, while JavaScript string indexing uses UTF-16 code units. Because of that, any Cyrillic text, emoji, or other non-ASCII characters before the selection shift the calculated line range.

In my example, the Cyrillic description line takes more UTF-8 bytes than UTF-16 code units (each Cyrillic letter is 2 bytes in UTF-8 but only 1 code unit in a JavaScript string), so every offset after it is shifted forward.
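To make the mismatch concrete, here is a small sketch that compares `String.length` (UTF-16 code units) with the UTF-8 byte length from `TextEncoder`. The sample strings are illustrative only: one is taken from the layout file above, the emoji is my own addition.

```typescript
const utf8 = new TextEncoder()

const asciiText = "title: 'KanbanFlow',"
const cyrillicText = 'Эффективное управление'
const emojiText = '😀'

// ASCII: one byte per character, so both counts agree.
const asciiBytes = utf8.encode(asciiText).length
// Cyrillic: two bytes per letter, one code unit per letter.
const cyrillicBytes = utf8.encode(cyrillicText).length
// Emoji outside the BMP: four bytes, two code units (a surrogate pair).
const emojiBytes = utf8.encode(emojiText).length

console.log(asciiText.length === asciiBytes) // true
console.log(cyrillicText.length, cyrillicBytes) // 22 43
console.log(emojiText.length, emojiBytes) // 2 4
```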

Possible fix

Before calling text.slice() and offsetsToSelection(), the byte offsets from Zed should be converted into JavaScript string indexes.

Something like:

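// Convert a UTF-8 byte offset into a JavaScript (UTF-16) string index.
function byteOffsetToStringIndex(text: string, byteOffset: number) {
	let bytes = 0

	for (let index = 0; index < text.length; index++) {
		// Stop once the requested number of bytes has been consumed.
		if (bytes >= byteOffset) return index

		const codePoint = text.codePointAt(index)!
		const char = String.fromCodePoint(codePoint)

		bytes += Buffer.byteLength(char, 'utf8')

		// Code points above U+FFFF occupy two UTF-16 code units; skip the low surrogate.
		if (codePoint > 0xffff) index++
	}

	return text.length
}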

Then:

const startByteOffset = Math.min(row.selection_start, row.selection_end)
const endByteOffset = Math.max(row.selection_start, row.selection_end)

const startOffset = byteOffsetToStringIndex(text, startByteOffset)
const endOffset = byteOffsetToStringIndex(text, endByteOffset)

return {
	type: 'selection',
	selection: {
		text: text.slice(startOffset, endOffset),
		filePath: row.buffer_path,
		source: 'zed',
		selection: offsetsToSelection(text, startOffset, endOffset),
	},
}

I think this should fix incorrect ranges for Cyrillic, emoji, and other non-ASCII text when using the Zed DB.
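If allocating a string per character is a concern, the UTF-8 width can also be derived from the code point directly. The following is only a sketch of that variant: `utf8Width` and `byteOffsetToIndex` are hypothetical names of mine, not existing OpenCode functions, and I am still assuming that Zed's offsets are UTF-8 byte offsets.

```typescript
// Number of UTF-8 bytes needed to encode one Unicode code point.
function utf8Width(codePoint: number): number {
	if (codePoint < 0x80) return 1
	if (codePoint < 0x800) return 2
	if (codePoint < 0x10000) return 3
	return 4
}

// Hypothetical variant: walk the string by code point until the
// requested number of UTF-8 bytes has been consumed.
function byteOffsetToIndex(text: string, byteOffset: number): number {
	let bytes = 0
	let index = 0

	while (index < text.length && bytes < byteOffset) {
		const codePoint = text.codePointAt(index)!
		bytes += utf8Width(codePoint)
		// Code points above U+FFFF take two UTF-16 code units.
		index += codePoint > 0xffff ? 2 : 1
	}

	return index
}

// "b" starts at byte 13 but at string index 8.
const widthSample = 'a\nЭфф\n😀b\nc'
const sliceStart = byteOffsetToIndex(widthSample, 13)
const sliceEnd = byteOffsetToIndex(widthSample, 14)

console.log(widthSample.slice(sliceStart, sliceEnd)) // prints "b"
```

A byte offset that falls in the middle of a multi-byte character is rounded up to the next character boundary here, which is the same behavior as the function above.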

Thanks!

Plugins

none

OpenCode version

1.14.28

Steps to reproduce

  1. Run Zed
  2. Open a file that contains non-ASCII text (for example, Cyrillic)
  3. Run OpenCode
  4. Select lines that include or come after the non-ASCII text

Screenshot and/or share link

[Image: before Cyrillic]
[Image: after Cyrillic]

Operating System

Windows 11

Terminal

Windows Terminal

Labels

bug, core, opentui, windows