Value of size 333MB fails to be fetched from the DB #205

Open
janekolszak opened this issue Dec 13, 2022 · 8 comments

Comments

@janekolszak

Hi!
Is it expected that it's not possible to get() values bigger than 333MB?
It is possible to fetch with getBinary().

Thank you!

@kriszyp
Owner

kriszyp commented Dec 14, 2022

In my tests, I was able to get values up to 2GB. What error are you seeing?

@janekolszak
Author

I'm seeing:

<--- Last few GCs --->                                                                                                                                                                                                                        
                                                                                                                                                                                                                                              
[1104752:0x5672330]     6147 ms: Scavenge 299.4 (333.2) -> 299.4 (333.2) MB, 34.5 / 0.0 ms  (average mu = 1.000, current mu = 1.000) allocation failure
[1104752:0x5672330]     7458 ms: Scavenge 491.4 (525.2) -> 491.4 (525.2) MB, 65.3 / 0.0 ms  (average mu = 1.000, current mu = 1.000) allocation failure
[1104752:0x5672330]    10035 ms: Scavenge 875.4 (909.2) -> 875.4 (909.2) MB, 128.5 / 0.0 ms  (average mu = 1.000, current mu = 1.000) allocation failure
                                                                                                                                                                                                                                              
                                                                                                                                                                                                                                              
<--- JS stacktrace --->                                                                                                                                                                                                                       
                                                                                                                                                                                                                                              
FATAL ERROR: invalid array length Allocation failed - JavaScript heap out of memory
 1: 0xa04200 node::Abort() [node]
 2: 0x94e4e9 node::FatalError(char const*, char const*) [node]
 3: 0xb797be v8::Utils::ReportOOMFailure(v8::internal::Isolate*, char const*, bool) [node]
 4: 0xb79b37 v8::internal::V8::FatalProcessOutOfMemory(v8::internal::Isolate*, char const*, bool) [node]
 5: 0xd343c5  [node]
 6: 0xd0cf05  [node]
 7: 0xe962ae  [node]
 8: 0xe9b9f4  [node]
 9: 0xe9bcb8  [node]
10: 0xeef18b v8::internal::JSObject::AddDataElement(v8::internal::Handle<v8::internal::JSObject>, unsigned int, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes) [node]
11: 0xf43c92 v8::internal::Object::AddDataProperty(v8::internal::LookupIterator*, v8::internal::Handle<v8::internal::Object>, v8::internal::PropertyAttributes, v8::Maybe<v8::internal::ShouldThrow>, v8::internal::StoreOrigin) [node]
12: 0xf46f8f v8::internal::Object::SetProperty(v8::internal::LookupIterator*, v8::internal::Handle<v8::internal::Object>, v8::internal::StoreOrigin, v8::Maybe<v8::internal::ShouldThrow>) [node]
13: 0x10709c5 v8::internal::Runtime::SetObjectProperty(v8::internal::Isolate*, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, v8::internal::Handle<v8::internal::Object>, v8::internal::StoreOrigin, v8::Maybe<v8::internal::ShouldThrow>) [node]
14: 0xdcfb6a v8::internal::Runtime_KeyedStoreIC_Slow(int, unsigned long*, v8::internal::Isolate*) [node]
15: 0x14011f9  [node]

@janekolszak
Author

janekolszak commented Dec 15, 2022

Here is a small demo. It fails with a segfault on Ubuntu 20 with lmdb-js v2.7.3.

  • Create the big file: head -c 333MB /dev/urandom > bigfile.bin
  • Run this:
/* eslint-disable */
import { open } from 'lmdb';
import * as fs from 'fs';

async function main() {
    const db = open<any, string>({
        path: `./test-cache`,
    });

    const big = fs.readFileSync("./bigfile.bin")

    await db.put("id", big.toString()) // stored as a string, not as binary
    console.log("put() success")

    let loaded = await db.getBinary("id") // raw bytes, no decoding
    console.log("getBinary() success")

    loaded = await db.getBinaryFast("id")
    console.log("getBinaryFast() success")

    loaded = await db.get("id") // decodes the value back into a string; this is the call that fails
    console.log("get() success")
    await db.close()

    console.log("OK")
}

main().catch((e) => console.error(e));

If you create a smaller file and run the demo again, it prints OK:
head -c 1MB /dev/urandom > bigfile.bin

@kriszyp
Owner

kriszyp commented Dec 15, 2022

I am kind of wondering if this is a V8 bug. I can actually trigger an error without lmdb at all by doing this with your code:

import * as fs from 'fs';
const big = fs.readFileSync("./bigfile.bin");
let str = big.toString();
let d = new TextEncoder().encode(str);
let s = (new TextDecoder()).decode(d);
console.log(s.length)

I think this error might be occurring in msgpackr's native decoder and might not be properly handled there, but even if it were, V8 doesn't seem to be capable of decoding this string.
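
One factor that may be at play here: bytes from /dev/urandom are not valid UTF-8, so toString() substitutes U+FFFD for each invalid sequence, and U+FFFD takes three bytes when encoded back to UTF-8. A 333MB random file stored as a string can therefore become a noticeably larger byte sequence. Whether that alone pushes this value over V8's limit is an assumption and is not confirmed in this thread. A minimal sketch on four bytes, where roundTripSizes is a hypothetical helper and not part of lmdb-js or msgpackr:

```typescript
// Sketch only: shows how invalid UTF-8 bytes change size across a
// bytes -> string -> bytes round trip. `roundTripSizes` is a
// hypothetical helper, not part of lmdb-js or msgpackr.
function roundTripSizes(raw: Uint8Array): { bytesIn: number; chars: number; bytesOut: number } {
  // Non-fatal decoding replaces each invalid sequence with U+FFFD.
  const str = new TextDecoder().decode(raw);
  // U+FFFD takes three bytes when encoded back to UTF-8.
  const reencoded = new TextEncoder().encode(str);
  return { bytesIn: raw.length, chars: str.length, bytesOut: reencoded.length };
}

// 0xff can never appear in valid UTF-8, so every byte is replaced.
const sizes = roundTripSizes(new Uint8Array([0xff, 0xff, 0xff, 0xff]));
console.log(sizes); // { bytesIn: 4, chars: 4, bytesOut: 12 }
```

Plain ASCII input comes back with the same byte count, so the growth only shows up for data that was never text to begin with.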

@janekolszak
Author

Maybe there should be an option for getRange() to use something like getBinary()?

@kriszyp
Owner

kriszyp commented Dec 22, 2022

Yes, @janekolszak, that seems like a reasonable option to add. I will try to get that in the next release.

@janekolszak
Author

janekolszak commented Dec 29, 2022

Maybe lmdb-js decodes values into strings?

There seems to be a limit of 512MB for string size on 32-bit systems, but I see it on my 64-bit system with node 19.3.0
(https://developer.mozilla.org/en-US/docs/Web/JavaScript/Reference/Global_Objects/String/length)

Your demo on node 18.12.1 fails with:

TypeError [ERR_ENCODING_INVALID_ENCODED_DATA]: The encoded data was not valid for encoding utf-8
    at new NodeError (node:internal/errors:393:5)
    at TextDecoder.decode (node:internal/encoding:433:15)
    at Object.<anonymous> (/home/jan/work/lmdb/tools/big-file.ts:7:29)
    at Module._compile (node:internal/modules/cjs/loader:1159:14)
    at Module.m._compile (/home/jan/.nvm/versions/node/v18.12.1/lib/node_modules/ts-node/src/index.ts:1618:23)
    at Module._extensions..js (node:internal/modules/cjs/loader:1213:10)
    at Object.require.extensions.<computed> [as .ts] (/home/jan/.nvm/versions/node/v18.12.1/lib/node_modules/ts-node/src/index.ts:1621:12)
    at Module.load (node:internal/modules/cjs/loader:1037:32)
    at Function.Module._load (node:internal/modules/cjs/loader:878:12)
    at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:81:12) {
  errno: 1,
  code: 'ERR_ENCODING_INVALID_ENCODED_DATA'
}

Your demo on node 19.3.0 fails with:

Error: Cannot create a string longer than 0x1fffffe8 characters
    at TextDecoder.decode (node:internal/encoding:428:16)
    at Object.<anonymous> (/home/jan/work/lmdb/tools/big-file.ts:7:29)
    at Module._compile (node:internal/modules/cjs/loader:1218:14)
    at Module.m._compile (/home/jan/.nvm/versions/node/v19.3.0/lib/node_modules/ts-node/src/index.ts:1618:23)
    at Module._extensions..js (node:internal/modules/cjs/loader:1272:10)
    at Object.require.extensions.<computed> [as .ts] (/home/jan/.nvm/versions/node/v19.3.0/lib/node_modules/ts-node/src/index.ts:1621:12)
    at Module.load (node:internal/modules/cjs/loader:1081:32)
    at Function.Module._load (node:internal/modules/cjs/loader:922:12)
    at Function.executeUserEntryPoint [as runMain] (node:internal/modules/run_main:82:12)
    at phase4 (/home/jan/.nvm/versions/node/v19.3.0/lib/node_modules/ts-node/src/bin.ts:649:14) {
  code: 'ERR_STRING_TOO_LONG'
}

My demo on node 18.12.1 fails with:

put() success
getBinary() success
getBinaryFast() success
TypeError: Cannot read properties of undefined (reading '0')
    at readString (/home/jan/work/lmdb/node_modules/msgpackr/unpack.js:568:22)
    at read (/home/jan/work/lmdb/node_modules/msgpackr/unpack.js:454:12)
    at checkedRead (/home/jan/work/lmdb/node_modules/msgpackr/unpack.js:195:13)
    at Packr.unpack (/home/jan/work/lmdb/node_modules/msgpackr/unpack.js:102:12)
    at Packr.decode (/home/jan/work/lmdb/node_modules/msgpackr/unpack.js:174:15)
    at LMDBStore.get (/home/jan/work/lmdb/node_modules/lmdb/read.js:230:70)
    at main (/home/jan/work/lmdb/tools/big-file.ts:22:23)

My demo on node 19.3.0 fails with:

put() success
getBinary() success
getBinaryFast() success
[1]    51080 segmentation fault (core dumped)  ts-node ./tools/big-file.ts

Your demo modified for Deno:

error: Uncaught (in promise) TypeError: Cannot allocate String: buffer exceeds maximum length.
    at async Object.readTextFile (deno:runtime/js/40_read_file.js:56:20)

@kriszyp
Owner

kriszyp commented Dec 29, 2022

> Maybe lmdb-js decodes values into strings?

lmdb-js (msgpackr) preserves the types of values, so if you encode a string, it will be decoded as a string. And you are explicitly converting your data to a string when it is stored/encoded (so lmdb-js decodes to a string to match what you requested/stored):
await db.put("id", big.toString())

(So if you don't want it decoded as a string, don't store it as a string; store it as a buffer/binary data instead.)

> There seems to be a limit of 512MB for string size on 32-bit systems, but I see it on my 64-bit system with node 19.3.0

It doesn't seem that surprising that V8 would change this without MDN being updated yet (maybe they felt it was better to be consistent, so that there are no behavioral differences that can be observed/detected between architectures).
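
The limit for a given runtime can be read from Node's buffer.constants.MAX_STRING_LENGTH, which makes it possible to check before decoding instead of crashing. A minimal sketch, assuming the caller is happy to receive raw bytes when the value is too large; decodeIfItFits is a hypothetical helper and not an lmdb-js API:

```typescript
import { constants } from "node:buffer";

// Hypothetical helper: decode to a string only when the result is
// guaranteed to fit; otherwise hand back the raw bytes untouched.
function decodeIfItFits(
  bytes: Uint8Array,
  limit: number = constants.MAX_STRING_LENGTH,
): string | Uint8Array {
  // A UTF-8 byte sequence never decodes to more UTF-16 code units than
  // it has bytes, so the byte length is a safe (conservative) upper bound.
  if (bytes.length > limit) {
    return bytes;
  }
  return new TextDecoder().decode(bytes);
}

// The limit differs between runtimes and versions, so print it rather than assume it.
console.log(constants.MAX_STRING_LENGTH.toString(16));
console.log(typeof decodeIfItFits(new TextEncoder().encode("hello"))); // string
```

The check is deliberately conservative: a value slightly above the limit in bytes might still decode to a short enough string, but the sketch treats it as binary anyway. With lmdb-js, the raw bytes could come from getBinary(), which the demo output above shows succeeding for the same key.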
