Hide internal properties from Ecmascript code #979
5ee08a1 to e0eed06
This prototype branch is now rebased, but it's probably easiest to figure out the ES6 Symbol approach (#980) and merge in basic Symbol support first.
+1 for Symbol support. :)
Following up on #980 (comment), it might make sense, if this is not already the case, to add a compiler option which would disable That said, Duktape implements Node.js Buffer, which has a native "buffer to string" primitive ( |
Adding some kind of "sandboxing safe" option would be possible, but it's maybe better implemented as a configure.py profile because there may be multiple features that need to be controlled. But even when doing so, every C binding written by the user potentially allows a 1:1 buffer-to-string conversion by accident, so this pull - or some other solution - is still necessary for better sandboxing. One possible alternative solution would be an actual symbol type, making the internal properties symbols that cannot be created via buffers. That would definitely be a conceptually cleaner solution, but I'm not sure if footprint agrees ;)
I do wonder how much of a footprint issue a new tagged type would actually be, considering all the special casing that now needs to be added to string handling... ;-)
There isn't much in the way of special casing: handling of internal strings is already in place. Symbols deviate from that only in a few locations.
Also just a tagged type wouldn't actually be enough: object property table keys are a list of untagged |
My rough guess would be that a separate tagged type would be around 4-5kB larger: it would affect internals here and there for probably around 1-2kB, and it would require a new type tag in the API with all the associated API calls. That's of course just an informed guess :) For the stripped build it would mean roughly a 4% increase of footprint which is not huge but still quite large considering the RegExp engine is less than 10kB total. It really is an honest design trade-off with several viable choices. I tend to favor low footprint choices because footprint caused by code structures is very difficult to rein in.
Hey, as long as Ecmascript compliance is honored, I'm happy with whatever implementation you decide is best. :)
One middle-of-the-way implementation approach I've toyed around with:
The upside of this is that a new tagged or API type wouldn't be needed but symbols and strings are still entirely separate and you can't create them even via custom buffer operations. Footprint-wise this would work very well. For C code it'd be a little bit awkward. In general as a C coder I'd prefer strings and symbols to be strings with no internal NULs. This would allow me to pass them around as This is a downside common with any approach introducing an actual symbol type though. On the other hand it can also be considered an upside because then also C code won't accidentally mix property and symbol lookups. That can be achieved in other ways too, e.g. with this pull,
On the other hand, C code may want guaranteed unique symbols, and in that case it would have to get a symbol through the API (the equivalent of calling |
Sure it'd be useful to have an API to create a unique symbol. But it could then behave like any other string from the API perspective. Or not, depending on which approach is used.
Of course, I was just pointing out that that's a weak argument in favor of "symbols == strings" because the C code may want to create unique symbols through the API in either case. There are still other benefits, of course.
By the way, I'm not always opposing you with my arguments, sometimes I just like to play devil's advocate. :)
Yeah, but my main concern with the API is not really just creation of the symbols - but for example:
Symbol creation is by far the smallest concern :) And it'd actually be a useful API even now: it would allow hiding the \xFF prefix from user code that didn't want to specifically deal with it.
Agreed, those were the "other benefits" I mentioned. :)
Just as a side note, in this example:
#define MY_SYMBOL ("\xa0" "fooSymbol")
one could:
#define MY_SYMBOL DUK_MAKE_SYMBOL("fooSymbol")
Similarly, for existing internal properties:
#define MY_INT_PROP DUK_MAKE_INTPROP("fooBar") /* -> "\xFF" "fooBar" */
This would hide the concrete prefix from user code, and would also avoid any potential issues with hex escape ambiguity (which makes it necessary to define the string in parts). This wouldn't need to be explained to users.
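To make the hex escape point concrete, here is a minimal sketch of how such prefix-hiding macros could be written. The `EXAMPLE_*` names are hypothetical stand-ins for the macros proposed above and are not part of any real API; the sketch only relies on standard C string literal concatenation.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper macros (not a real API). Adjacent string literals are
 * concatenated by the compiler, so the prefix escape sequence ends before
 * the property name begins.
 */
#define EXAMPLE_MAKE_INTPROP(name) ("\xFF" name)
#define EXAMPLE_MAKE_SYMBOL(name)  ("\xA0" name)

#define MY_INT_PROP EXAMPLE_MAKE_INTPROP("fooBar")   /* -> "\xFF" "fooBar" */
#define MY_SYMBOL   EXAMPLE_MAKE_SYMBOL("fooSymbol") /* -> "\xA0" "fooSymbol" */

/* Writing the key as a single literal with the name directly after the
 * escape would be ambiguous: 'f' is a valid hex digit, so the escape would
 * swallow the first letter of "fooBar" instead of stopping after FF.
 */

/* Returns 1 if the key starts with the given prefix byte, 0 otherwise. */
static int example_has_prefix(const char *key, unsigned char prefix) {
    return key != NULL && (unsigned char) key[0] == prefix;
}
```

With these definitions, `MY_INT_PROP` is a 7-byte key whose first byte is 0xFF followed by `fooBar`, and user code never spells out the prefix itself.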
Draft of an approach where internal properties are hidden from user Ecmascript code even when the correct (internal) key is used. Internal properties can only be accessed using the C API, which should fulfill sandboxing requirements for protecting the internal properties securely.
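As a rough illustration of the idea (a toy model, not the code in this pull), the lookup path can take the caller's origin into account and refuse internal keys for script callers even when the key bytes are exactly right. All `toy_*` names are hypothetical, and the 0xFF prefix byte is assumed from the discussion above.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define TOY_INTERNAL_PREFIX 0xFF  /* assumed internal key prefix byte */

typedef struct { const char *key; int value; } toy_prop;
typedef struct { const toy_prop *props; size_t count; } toy_object;

/* Who is performing the property lookup. */
typedef enum { TOY_CALLER_SCRIPT, TOY_CALLER_C_API } toy_caller;

static int toy_key_is_internal(const char *key) {
    return (unsigned char) key[0] == TOY_INTERNAL_PREFIX;
}

/* Returns a pointer to the value, or NULL if the key is missing or hidden
 * from this caller. Script callers never see internal keys, no matter how
 * the key string was constructed (e.g. via a buffer-to-string conversion).
 */
static const int *toy_get_prop(const toy_object *obj, const char *key,
                               toy_caller caller) {
    size_t i;

    if (caller == TOY_CALLER_SCRIPT && toy_key_is_internal(key)) {
        return NULL;
    }
    for (i = 0; i < obj->count; i++) {
        if (strcmp(obj->props[i].key, key) == 0) {
            return &obj->props[i].value;
        }
    }
    return NULL;
}

/* Sample object with one internal and one ordinary property. */
static const toy_prop toy_sample_props[] = {
    { "\xFF" "Value", 123 },
    { "name", 7 }
};
static const toy_object toy_sample = { toy_sample_props, 2 };
```

In this model the check sits in the lookup itself rather than in key creation, so it still holds when some binding lets script code build the prefixed key by accident.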
e0eed06 to 855c36f
I'll drop this from 2.0.0 until the symbol typing issues have been resolved.