-
Notifications
You must be signed in to change notification settings - Fork 126
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Non-byte aligned primitives. #228
Non-byte aligned primitives. #228
Conversation
3 similar comments
Sounds good to me. I'm in no rush. Happy to discuss here as the PR progresses! |
So I added some rudimentary functional support code, so that the expression of the two constructors was more clear in its intent. I think that other code in the file could be refactored to take advantage of this code possibly. For bitSeq, I put in the constraints that you cannot request bits that don't total a byte boundary, as well as the Javascript 6-byte Number restriction. That should help protect some people. For bitSeqObj, I tried to maintain relative parity with the interface for seqObj, with the obvious caveat that we are not combining other parsers, but creating one. I also moved the buffer related constructors into a Parsimmon.buffers namespace, because I think it might be confusing to have them side by side with the regular string processing constructors. |
I really like the idea of putting the new buffer utilities into a buffer namespace! I think Also, we should check for buffer support when constructing the API. function noBufferOops() {
throw new Error(
"Buffer global does not exist; please consider using https://github.com/feross/buffer if you are running Parsimmon in a browser"
);
}
if (hasBuffer) {
Parsimmon.Binary = {
bitSeq: bitSeq,
bitSeqObj: bitSeqObj,
byte: byte
};
} else {
Parsimmon.Binary = {
bitSeq: noBufferOops,
bitSeqObj: noBufferOops,
byte: noBufferOops
};
} Thanks again for your work on this so far! |
Works for me, will implement tomorrow.
…On Mar 8, 2018, 5:59 PM -0600, Brian Mock ***@***.***>, wrote:
I really like the idea of putting the new buffer utilities into a buffer namespace!
I think Binary might be a better namespace than buffers though.
Also, we should check for buffer support when constructing the API.
function noBufferOops() {
throw new Error(
"Buffer global does not exist; please consider using https://github.com/feross/buffer if you are running Parsimmon in a browser"
);
}
if (hasBuffer) {
Parsimmon.Binary = {
bitSeq: bitSeq,
bitSeqObj: bitSeqObj,
byte: byte
};
} else {
Parsimmon.Binary = {
bitSeq: noBufferOops,
bitSeqObj: noBufferOops,
byte: noBufferOops
};
}
Thanks again for your work on this so far!
—
You are receiving this because you authored the thread.
Reply to this email directly, view it on GitHub, or mute the thread.
|
So I did the buffer check at runtime instead of parse/module time so I could actually cover the code with tests. I figured since the constructors are not something you are typically (if ever) rebuilding at runtime that it's not a huge deal. |
Good point about the runtime check. It shouldn't actually affect parser speed, yeah. And it makes it easier to polyfill Buffer (if someone polyfills it after Parsimmon, for example) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR is looking great! Here's a few things I'd like to see changed when you have time.
API.md
Outdated
|
||
Returns a parser that yields a byte that matches the given input. Similar to digit/letter. | ||
|
||
## Parsimmon.bitSeq(...alignments) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you should write [...alignments]
or just alignments
here since the current syntax style makes me immediately think you'd call it like bitSeq(1, 2, 5)
not bitSeq([1, 2, 5])
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that's a good point, do you think I should make it based off arguments rather than an array?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Well, I actually really don't like using arguments
and would eventually like to change the other functions to use explicit arrays instead (I just haven't because I don't think I can do this without a breaking API change; maybe detect if there's exactly one param and it's an array? could mess up seqObj
though I think—), so personally I'm cool with arrays here
|
||
## Parsimmon.byte(int) | ||
|
||
Returns a parser that yields a byte that matches the given input. Similar to digit/letter. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Even though this is simple, I would still like examples for every function/parser in Parsimmon.
Parsimmon.byte(0xf0).parse(Buffer.from([0xf0]);
// ...the output
src/parsimmon.js
Outdated
function bitSeq(alignments) { | ||
var totalBits = sum(alignments); | ||
if (totalBits % 8 !== 0) { | ||
throw new Error("Bits do not sum to byte boundary."); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think maybe this error message could be a bit more clear if it printed out the whole alignments array and also reminded you that "sum to a byte boundary" means divisible by 8.
Maybe something like "the bits [" + alignments.join(", ") + "] do not add up to an even number of bytes; the total should be divisible by 8"
or something?
src/parsimmon.js
Outdated
}, alignments); | ||
if (tooBigRange) { | ||
throw new Error( | ||
tooBigRange.toString() + |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
.toString()
shouldn't be necessary here, I think
src/parsimmon.js
Outdated
@@ -612,6 +767,11 @@ function string(str) { | |||
|
|||
function byte(b) { | |||
assertNumber(b); | |||
if (b > 0xff) { | |||
throw new Error( | |||
"Value specified to byte constructor is larger in value than a single byte." |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it would be nice to print back out the value of b
here and format it as a hex number, I think
src/parsimmon.js
Outdated
} | ||
return f.apply(null, arguments); | ||
}; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would rather this was just a function you call inside each of those Parsimmon.Binary
functions because this is going to mess up the function names inside the console and have them all show up like:
> Parsimmon.Binary.bitSeq
[Function: ensureBuffer]
Which will look weird, I think.
test/core/byte.test.js
Outdated
@@ -1,21 +1,46 @@ | |||
"use strict"; | |||
/*global context, before, after*/ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Rather than using /* global
please just add these globals to https://github.com/jneen/parsimmon/blob/master/test/.eslintrc#L5
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For this I actually just added the mocha env to eslint. I just forgot to remove this line.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oh that's much better, thanks
[], | ||
arr | ||
); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If you don't mind, I think moving arr
to be the first argument in these functions will make the code read better
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wrote this to be more in the style of like ramda, where the point in the function comes last. Putting the point at the beginning of the function makes it harder to work with if you start to use them in a partially applied way.
If you really like the array-first style, what do you think about doing a newtype wrapper style set of helpers?
like
function Func(f) {
this._f = f;
}
Func.prototype.times = function times(n) {
var self = this;
return new Func(function () {
var i = 0;
for (i; i < n; i++) {
self._f(i);
}
});
};
Func.prototype.run = function run() {
return this._f();
};
function Enum(arr) {
this._arr = arr;
}
Enum.prototype.forEach = function forEach(f) {
return new Enum(new Func(f).times(this._arr.length).run());
};
Enum.prototype.reduce = function reduce(f, seed) {
this.forEach(function (elem, i, arr) {
seed = f(seed, elem, i, arr);
});
return seed;
};
Enum.prototype.map = function map(f) {
return new Enum(this.reduce(function (acc, elem, i, a) {
return acc.concat([f(elem, i, a)]);
}, []));
};
Enum.prototype.value = function value() {
return this._arr;
};
new Enum([1, 2, 3]).map(function (x) { return x + 5; }).value();
//[ 5, 6, 7 ]
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I actually use to be a ramda 🐑λ contributor so I definitely know the style :) anyway it's not really a big deal so if you really like it array-last just keep it 👌
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think the newtype wrapper approach lends itself more towards the es6 built in array methods down the road. But I personally use point free style heavily in my own code, but as you know that typically doesn't come up much unless you're doing composition, and I don't even have a compose/pipe helper currently.
Since you're moving the other var hasBuffer = typeof Buffer !== "undefined";
function isBuffer(x) {
/* global Buffer */
return hasBuffer && Buffer.isBuffer(x);
} |
This is looking great. Is there anything else you wanted to add? |
At this point I think this is all I need to actually parse all of the binary stuff I am parsing. I can always open another PR down the road to add new features to it if something comes up. Do you have any other things I need to address before we can move forward with it? |
When I have some time this evening I'm gonna take a deeper look (particularly at testing), but it's looking really good right now! Thanks for your patience working through this together with me. |
Looks good to me overall. I'm gonna merge this in and update the changelog and do a release tomorrow. I usually tweet when I release a new version, and I'm happy to give you a shoutout if you'd like. P.S. If you're interested in discussing Parsimmon 2.x, hop over to #230 |
Sure, wouldn't mind a shout, I'm @emptylambda on Twitter. |
Opening this PR before I begin work so that discussion can reflect over the life of the PR. #224