rt/rt.mjs
@@ -124,15 +126,16 @@ function rt(engine) {
 // that's not specified on the Scheme language level. Using JS
 // strings for symbols allows us to compare with ===, as is
 // required.
-'string?': x => typeof(x) === 'string' && x.charAt(0) === 's',
+'string?': x => x instanceof String && x.charAt(0) === 's',
This looks like it's getting into subtleties of JavaScript that I'm not as familiar with. What's the difference here between `typeof(x) === 'string'` and `x instanceof String`?
From what I understand, `typeof` returns a string indicating the type of a primitive value, and returns `'object'` for most everything else (functions give `'function'`). In this case, `instanceof` looks for `String.prototype` in the prototype chain of `x`.
To hopefully get the right behavior with `eq?`, I modified the functions that construct strings (`%list->string` and `%symbol->string`) to wrap with `new String(s)`. Now, strings and gensyms are string objects, while symbols are still primitive string values.
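To make the distinction concrete, here is a small standalone sketch (not code from this PR; the variable names are made up for illustration) showing how `typeof`, `instanceof`, and `===` treat a primitive string versus a `String` wrapper object:

```javascript
// Illustrative only: these names do not appear in rt.mjs.
const prim = 'abc';                 // primitive string value
const boxedA = new String('abc');   // String wrapper object
const boxedB = new String('abc');   // a second, distinct wrapper object

const observations = {
  typeofPrim: typeof prim,                            // 'string'
  typeofBoxed: typeof boxedA,                         // 'object'
  primIsInstance: prim instanceof String,             // false
  boxedIsInstance: boxedA instanceof String,          // true
  primsEqual: prim === 'abc',                         // true: compared by value
  boxedEqual: boxedA === boxedB,                      // false: compared by reference
  boxedSelfEqual: boxedA === boxedA,                  // true: same object
  valuesEqual: boxedA.valueOf() === boxedB.valueOf()  // true: unwrapped values match
};

console.log(observations);
```

So `===` behaves like value equality on primitives and like identity on wrapper objects, which is what lets `eq?` distinguish two separately constructed strings.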
Pretty sure the typeof check is cheaper. Would be nice to preserve speed for string literals.
I just ran 9 runs of `benchmark-compiler.sh` on the current master and this change, and averaged the results for the stage3 compile.
Master: 0.684 seconds
This PR: 0.706 seconds
So, this version is about 3% slower (with all the caveats around whether compiling the compiler is a good benchmark; if I recall correctly, Schism doesn't do a lot of string comparison, but maybe those are just leftover memories from my horrible initial implementation of symbols).
It's not a small regression, and I'm willing to take it for correctness, but it's the kind of thing that could add up over time. Maybe we can do this in a way that has equivalent performance?
Another downside to this PR is that the "fix" complicates the `rt` somewhat with mixed benefits. Factoring in the performance hit and the argument against mutable strings, a split from the standard seems reasonable, at least without a better solution to get the desired behavior.
Given how JavaScript treats strings and how Schism uses `===` for `eq?`, it's not obvious to me how else to get the distinguishable string behavior. Because `===` compares by value for `'string'`s but not for `'object'`s, wouldn't we need to change `eq?` or the representation for strings? Working with `eq?` doesn't really sidestep the problem at any rate.
It seems like what we'd have to do is wrap Schism strings in a JavaScript object to make `===` do the right thing.
It's kind of bizarre to me that JavaScript doesn't have a reference equality operator. Or, maybe it does for objects but strings aren't considered objects. Maybe the idea is that strings are values like integers are.
I gather there are string values but also string objects. Here, I'm wrapping strings with the `new String()` constructor to get `===` to compare them by reference.
Actually, looking through the TSPL4 book, `symbol->string` strings should be treated differently. I just pushed some changes to allow for both behaviors.
The `%string=?` predicate first checks if `typeof` gives `'string'` before doing the `instanceof` check. Hopefully that recovers some of the performance, and doesn't make it worse!
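For readers following along, the check order described above could look roughly like the sketch below. This is not the code that was pushed; `stringValue` and `stringEquals` are hypothetical names, and the assumption is only that a string may arrive either as a primitive or as a `String` wrapper object.

```javascript
// Hypothetical helpers, NOT the PR's actual %string=? implementation.

// Unwrap a string to its primitive value, trying the cheap check first.
function stringValue(x) {
  if (typeof x === 'string') return x;            // fast path: primitive string
  if (x instanceof String) return x.valueOf();    // slower path: wrapper object
  throw new TypeError('expected a string');
}

// Value equality that ignores whether either side is wrapped.
function stringEquals(a, b) {
  return stringValue(a) === stringValue(b);
}

console.log(stringEquals('abc', new String('abc')));              // true
console.log(stringEquals(new String('abc'), new String('abd')));  // false
```

Putting the `typeof` test first means string literals never reach the `instanceof` check at all.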
I just ran my benchmark again, although on a different computer this time. The average of 9 runs with this change was 1.117 s versus 1.138 s without. So, perhaps slight improvement, but likely just within the noise. I think we're at performance parity now.
Thanks for sticking with this!
JavaScript primitive strings are immutable and there isn't a way to check them for equality by identity -- you can only compare them by value. Given that you've chosen to have immutable strings as well, you may want to consider whether you actually need JavaScript [...] Thus if you're doing JavaScript interop, you'll want to ensure that you unwrap your [...] This can be confusing because if you accidentally pass a [...] Given the potential for confusion, if you want to wrap strings in a JavaScript object so that you can test for equality by identity, you may want to use your own JavaScript object wrapper instead of using JavaScript's [...]
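One way to read the custom-wrapper suggestion is sketched below. Everything here is an assumption for illustration: `SchemeString`, `schemeStringEquals`, and `toJS` are invented names, and Schism's runtime does not necessarily look like this.

```javascript
// Hypothetical wrapper class; all names here are invented for this sketch.
class SchemeString {
  constructor(value) {
    if (typeof value !== 'string') {
      throw new TypeError('SchemeString expects a primitive string');
    }
    this.value = value;   // the underlying primitive JavaScript string
  }
}

// Type test: only wrapper instances count as Scheme strings.
const isSchemeString = x => x instanceof SchemeString;

// Value equality (string=? style): compare the wrapped primitives.
const schemeStringEquals = (a, b) => a.value === b.value;

// Unwrap at the JavaScript interop boundary; other values pass through.
const toJS = x => (isSchemeString(x) ? x.value : x);

const s1 = new SchemeString('abc');
const s2 = new SchemeString('abc');
console.log(s1 === s2);                   // false: identity (eq? style)
console.log(schemeStringEquals(s1, s2));  // true: same contents
console.log(toJS(s1) === 'abc');          // true: unwrapped for JavaScript
```

A dedicated class keeps identity comparison via `===` while making it explicit where values need unwrapping before they are handed to JavaScript code.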
Thanks for the info, @awwx. It would potentially make sense to fold all of Schism's values into a [...] For now, I think this PR is in a good state, so I'm going to go ahead and merge it. We may want to iterate some more on this, but I think this is a good step for now.
I think this fixes the reference/value equality issue for strings (#97).
Adds `%string=?` to rt.mjs, which uses `String.valueOf()` when checking for equality.
`%string=?` is used by `string=?` and has been added to `runtime-imports` and `effect-free-callee?`.
Would it be better to write `string=?` in pure Scheme? I was also wondering about how functions like `string-length` and `string-ref` should be added.