Closed as duplicate
🐛 Bug report
Large _search responses occasionally corrupt arbitrary non-ASCII characters (e.g., Georgian text, emoji, or any extended Unicode). Smaller responses always look correct. The same request may succeed or fail across runs, so this appears to be a transport/streaming issue surfaced through the default @elastic/elasticsearch client.
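One mechanism that would produce exactly this pattern (large payloads only, intermittent, any multi-byte character) is a response body that arrives in several chunks, where each chunk is decoded to a string on its own and a chunk boundary falls inside a multi-byte UTF-8 sequence. This is an assumption, not something the report confirms about the client or its transport. The sketch below only demonstrates the general effect with the standard `TextEncoder`/`TextDecoder` APIs; the function names `decodePerChunk` and `decodeStreaming` and the split offset are made up for illustration.

```typescript
// Illustration only: shows how decoding chunks independently can corrupt
// multi-byte UTF-8 text. Not taken from the @elastic/elasticsearch source.

const original = 'გამარჯობა, გიორგი';
const bytes = new TextEncoder().encode(original);

// Georgian letters take 3 bytes each in UTF-8, so cutting after byte 2
// splits the first character across two chunks.
const chunks: Uint8Array[] = [bytes.subarray(0, 2), bytes.subarray(2)];

// Naive approach: every chunk is decoded as if it were a complete document.
// The partial sequences at the boundary become U+FFFD replacement characters.
function decodePerChunk(parts: Uint8Array[]): string {
  let out = '';
  for (const part of parts) {
    out += new TextDecoder('utf-8').decode(part);
  }
  return out;
}

// Streaming approach: one decoder keeps the incomplete trailing bytes
// and finishes the character when the next chunk arrives.
function decodeStreaming(parts: Uint8Array[]): string {
  const decoder = new TextDecoder('utf-8');
  let out = '';
  for (const part of parts) {
    out += decoder.decode(part, { stream: true });
  }
  return out + decoder.decode(); // flush any remaining buffered bytes
}

const naive = decodePerChunk(chunks);
const streamed = decodeStreaming(chunks);

console.log(naive.includes('\uFFFD')); // true: the boundary character is lost
console.log(streamed === original);    // true: the text survives intact
```

If something like this is happening, small responses would be unaffected because they fit in a single chunk, and the same request could pass or fail depending on where the network happens to split the body.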
To reproduce
- Populate an index (e.g., persons) with documents that include any non-ASCII strings (Georgian, emoji, accented Latin, etc.). Sample document:

      {
        "id": "person-001",
        "firstName": "ანა",
        "lastName": "ტესტაშვილი",
        "personalId": "00000000001",
        "lists": [
          {
            "id": "list-entry-001",
            "content": "😄 🎉 🚀 👍 ❤️ 🙌 Greeting"
          }
        ]
      }

- Run:

      import { Client } from '@elastic/elasticsearch';

      async function repro() {
        const client = new Client({
          headers: {
            accept: 'application/vnd.elasticsearch+json; compatible-with=8',
            'content-type': 'application/vnd.elasticsearch+json; compatible-with=8',
          },
          nodes: ['http://elastic:******@<REDACTED-HOST>:9200'],
          requestTimeout: 0,
        });

        const res = await client.search({
          index: 'persons',
          size: 200, // ensures >50 hits
          query: { match_all: {} },
        });

        res.hits.hits.forEach((hit) =>
          console.log(hit._source?.content ?? hit._source?.firstName),
        );
      }

      repro().catch(console.error);

- Run the script several times. Some runs log the correct text; others log corrupted output such as ��ამარჯობა, გიორგი or 😄 🎉 🚀 👍 ��️ 🙌 Greeting (any non-ASCII characters can be affected).
Expected behavior
Responses should consistently return the original UTF-8 text (Georgian and emoji) regardless of payload size. Example expected output:
გამარჯობა, გიორგი
😄 🎉 🚀 👍 ❤️ 🙌 Greeting
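Because the failure is intermittent, it may help to check each run programmatically instead of reading the console output. The following is a minimal sketch, assuming corruption always shows up as the Unicode replacement character U+FFFD, as in the examples above; `findCorruptedPaths` is a hypothetical helper, not part of the client.

```typescript
// Hypothetical helper: walks any JSON-like value and returns the paths of
// string values that contain U+FFFD (the Unicode replacement character).
function findCorruptedPaths(value: unknown, path: string = '$'): string[] {
  const found: string[] = [];

  if (typeof value === 'string') {
    if (value.includes('\uFFFD')) found.push(path);
  } else if (Array.isArray(value)) {
    value.forEach((item, index) => {
      found.push(...findCorruptedPaths(item, `${path}[${index}]`));
    });
  } else if (value !== null && typeof value === 'object') {
    const record = value as Record<string, unknown>;
    for (const key of Object.keys(record)) {
      found.push(...findCorruptedPaths(record[key], `${path}.${key}`));
    }
  }

  return found;
}

// Example with a document shaped like the sample above, one field corrupted.
const sampleSource = {
  firstName: 'ანა',
  lists: [{ id: 'list-entry-001', content: '😄 🎉 🚀 👍 \uFFFD\uFFFD️ 🙌 Greeting' }],
};

console.log(findCorruptedPaths(sampleSource));
```

In the repro script, the same function could be called on each `hit._source` so that a loop over many runs reports which documents and fields were affected.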
Environment
- Node.js version: 20.19.5
- @elastic/elasticsearch version: 9.1.1
- Operating system: Windows 10/11 x64