-
-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
End not working, does not work with require, and not performant. #3
Comments
Hello,
Indeed, it won't work with require. It's based on JavaScript modules. To make it work out of the box in Node.js you can enable ESM. I recommend this if possible. Otherwise you'd have to resort to transpilation or manual intervention. I suspect you have performance issues because your An optimal low level way to do this instead could be like this: import {JsonLow} from './JsonLow.js'
import {JsonLowToHigh} from './JsonLowToHigh.js'
// this is copied from https://github.com/xtao-org/utf8x2x
// it's a simple low-level library for handling utf-8 byte streams as code points
// which is compatible with JsonHilo; it's not prepared for use with Node.js yet, so
// I copied it in here, so you don't have to figure this out manually
// you can treat it as a black box
const Utf8bs2c = (next) => {
const {codePoint} = next
let partialCodePoint = 0
let bytesRemain = 0
return {
bytes: (bytes) => {
for (const byte of bytes) switch (bytesRemain) {
case 0:
if (byte < 128) codePoint(byte)
else if ((byte >> 5) === 0b110) {
bytesRemain = 1
partialCodePoint = (byte & 0b00011111) << 6
}
else if ((byte >> 4) === 0b1110) {
bytesRemain = 2
partialCodePoint = (byte & 0b00001111) << 12
}
else if ((byte >> 3) === 0b11110) {
bytesRemain = 3
partialCodePoint = (byte & 0b00000111) << 18
}
else {
throw Error(`Unexpected byte ${byte} (0x${byte.toString(16)} = 0b${byte.toString(2)})!`)
}
break
case 1:
bytesRemain = 0
codePoint(partialCodePoint | (byte & 0b00111111))
break
case 2:
bytesRemain = 1
partialCodePoint |= (byte & 0b00111111) << 6
break
case 3:
bytesRemain = 2
partialCodePoint |= (byte & 0b00111111) << 12
break
}
},
end: () => {
if (bytesRemain > 0) {
throw Error(`Unexpected end! Expected at least ${bytesRemain} more bytes. Incomplete code point at the end: ${partialCodePoint}.`)
}
return next.end?.()
}
}
}
// creating a stream which will accept bytes and pass them on to successive parser streams
const stream = Utf8bs2c(JsonLow(JsonLowToHigh({
openArray: () => {
// do something
},
closeArray: () => {
// do something
},
end() {console.log('ended')}
})))
process.stdin.on('data', (buf) => {
stream.bytes(buf)
})
process.stdin.on('end', () => {
stream.end()
}) This sketch script uses the Node.js standard input to receive the utf-8 bytes. You'd run it like so:
That's all I am able to explain for now. Hope it's better than nothing. Will try to expand later if you're interested. In the long run I suspect you might be interested by a higher-level library based on JsonHilo I've been sketching out: https://github.com/xtao-org/jsonstrum It may be sensible for your use case. If you send me a sample file which is structured the way your input is and tell me how you want the output to look like I can look into it specifically. Last thing: you shouldn't call You only call Thanks again for filing the issue and considering the library. It's a motivation to improve. Cheers |
Thank you very much for your good advice! I was actually considering parsing based off of the buffer, but it turns out the buffer.toString is pretty swift (< 10% of the total time) but still once I work out the kinks here in my approach could become more significant. Also, looks like I need to brush up on my module/script loading knowledge. I can't send anything at the moment but when I get a chance I'll send some scrubbed data and a sample program. I used sax about 20 years ago when I was using Java primarily, it was so much faster than the state of the art DOM loaders at the time. Nice work! |
End does not seem to work, and performance is not what I expected.
I'm sure I'm doing something wrong
I am reading in a string that is an array, i need to separate the string by the array elements and pass it on. I was using JSON.parse but thought I'd give a SAX like event based parser a try.
Also this doesn't seem to work with require
I get
Instead change the require of mod.js in /home/iklotzko/monorepo/packages/common/lib/utils.js to a dynamic import() which is available in all CommonJS modules.
The text was updated successfully, but these errors were encountered: