Should Function Arguments to Functions be Disallowed by Default (or other mitigation?) #253

Closed
hostilefork opened this issue May 21, 2016 · 3 comments

Comments

@hostilefork
Member

hostilefork commented May 21, 2016

At its core, Rebol has the idea that any PATH! or WORD! might be a function, and it might wind up dispatching it. There is no special syntax for function calls or delimiting. Reading a line of code like:

foo baz bar/:mumble frotz

...could be pretty much anything.

>> foo: 10
>> baz: 20
>> bar: [asdf ghijk 30 lmno pqrs]
>> mumble: 3
>> frotz: 40
>> reduce [foo baz bar/:mumble frotz]
== [10 20 30 40]

>> foo: func [a b c] [a + b + c]
>> reduce [foo baz bar/:mumble frotz]
== [90]

>> bar: object [something: does [print "all your base!" 1000]]
>> mumble: 'something
>> reduce [foo baz bar/:mumble frotz]
all your base!
== [1060]

As per the infamous "deep lake", the goal of being so freeform is to give dialect designers as much freedom as possible: in effect, to use more or less punctuation as they want. Tools like parentheses exist, but the parentheses themselves are language elements which may or may not mean what you think parentheses mean in a given context.

But one question about this might be "how free is too free", for instance:

check-secret: func [guess] [
    secret: "abracadabra"
    print ["guess:" guess "actual:" secret]
    return guess = secret
]

tricky: func [whatever pass] [
    print ["HAHA YOUR PASSWORD IS" pass]
    quit
]

check-secret (get 'tricky)

The output of this is HAHA YOUR PASSWORD IS abracadabra. By default, a function argument with no type limits on it can be anything...function or value. So without a type constraint that prohibits FUNCTION!, the "guess" can steal values that were never meant for it, inside code written by someone who expected guess to be a single value.
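For comparison, here is a minimal sketch of the type-constraint mitigation mentioned above, assuming Rebol2/R3-Alpha-style type checking of arguments at the call site. The only change to check-secret is the [string!] annotation, which is my addition for illustration; the exact error text varies by interpreter, so it is not shown.

```rebol
check-secret: func [guess [string!]] [   ;-- assumption: only STRING! guesses are wanted
    secret: "abracadabra"
    print ["guess:" guess "actual:" secret]
    return guess = secret
]

tricky: func [whatever pass] [
    print ["HAHA YOUR PASSWORD IS" pass]
    quit
]

check-secret "open sesame"   ;-- runs normally and returns false
check-secret (get 'tricky)   ;-- rejected at the call, so the body never runs
```

The cost is that the cooperative "function acting like a value" case is rejected too, unless the type list is widened again.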

I do not want to bemoan the concept of "security" here for several reasons. One of those is that to my perception, Rebol is a kind of assembly...just a really weird sort of thing in that category in terms of malleability. So to criticize this aspect of its suitability as a tool for writing secure software is the tip of a large iceberg of many other things that it's not intended for. A bit like getting upset that C lets you work directly with memory addresses and "wow, imagine how many bad things can happen".

The question I wonder about is more "how confusing can this system get". Let's look at what might be thought of as a "good use" of passing a function in:

value: none
count: 0
cacher: does [
     unless value [
         print "calculating!" ;-- imagine this were slow
         value: reverse "arbadacarba"
     ]
     print "returning!"
     return value
]

check-secret (get 'cacher)

That will output:

calculating!
returning!
guess: abracadabra actual: abracadabra
returning!
== true

Because check-secret uses guess twice, we see two calls. But it only does the calculation the first time. So here we get a trick, that because guess has no special syntax to say it is-or-isn't a function call, it was able to handle being either. It's being cooperative, though--expecting that the caller wanted it to act "value-like".

But as brought up in #252, the picture gets a bit complicated when GET (or a GET-WORD!) is used. Consider this:

 something: func [arg key [word!]] [
     word: either some-condition ['arg] [key]
     return get word
 ]

Here we see a code author who tries to abstract out the place to get their value from. They'll either get it from the arg parameter, or from the word passed in. This won't work if arg is a function and the caller's intent was for something to invoke that function and use the result as if it were a value: GET fetches the function itself instead of calling it.
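The difference between the three ways of reaching an argument can be sketched as follows, assuming Rebol2/R3-Alpha-style evaluation (Ren-C names such as type-of may differ). The function peek is a hypothetical name used only for this illustration.

```rebol
peek: func [arg] [
    print ["plain word:" type? arg]       ;-- evaluating ARG calls it if it is a function
    print ["get-word:  " type? :arg]      ;-- :ARG fetches the value without calling it
    print ["get:       " type? get 'arg]  ;-- GET also fetches without calling
]

peek 10            ;-- all three lines report the same datatype
peek (does [10])   ;-- the first line sees the call's result, the other two see the function
```

So a callee that mixes plain words with GET or GET-WORD! access gets consistent answers only when the argument is not a function.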

The Problem Restated

So, passing functions as arguments to functions that expected a value has a lot of problems.

  1. The callee may expect the value to be the same every time it is examined, and if it's a function then calling it every time may return different results.
  2. The callee may not expect multiple inspections of the value to have side-effects, and if it's a function it may have side-effects such that each "inspection" changes program state.
  3. The callee may treat its argument as a word!, and then try to use GET in order to perform some level of indirection...which would subvert the caller's intent that the function be invoked and its result used as a value.
  4. The callee may wish to take advantage of the GET/ANY nature of GET-WORD! in order to indicate they don't mind if the value is not set, thus inadvertently triggering GET semantics when they didn't intend that.
  5. Functions that wind up by accident in places where a value was expected cause very cryptic errors.
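Points 1 and 2 can be seen in a small sketch, again assuming Rebol2/R3-Alpha-style evaluation. The names counter, next-id and show-twice are hypothetical and exist only for this example.

```rebol
counter: 0
next-id: does [counter: counter + 1]   ;-- side effect: bumps COUNTER on every call

show-twice: func [value] [
    print ["first look: " value]       ;-- each mention of VALUE is an evaluation
    print ["second look:" value]
]

show-twice 5                 ;-- a plain value looks the same both times
show-twice (get 'next-id)    ;-- each look calls NEXT-ID, so the two lines differ
print counter                ;-- COUNTER was changed just by "inspecting" the argument
```

A callee can defend itself by copying the argument into a local once, but only if it remembers that the first mention may already be a call.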

Possible Solution 1: Functions Don't Take Function Arguments By Default

In R3-Alpha, the only value type prohibited by default was the UNSET!. A function which was going to take an unset had to explicitly say that it took ANY-TYPE!.

In Ren-C, there is no UNSET! datatype...at all. Only variables in contexts can be in an unset state. So ANY-TYPE! vanished, and a new annotation was added to the generators. (Under the hood, this puts a NONE! literal in the type spec, e.g. make function! [[arg [_ integer!]] [...]], hence it is not a "keyword" of the "kernel", just something the FUNC and FUNCTION generators use.)

So this offers up the possibility that the new "all types" type, known as ANY-VALUE!, could be required in order to signal a function can also accept functions. The default would be an acceptance of "anything but"...based on the idea that every other type is not executable by default.
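As a sketch of how signatures would read under this proposal: the first comment describes hypothetical behavior that no released interpreter has, and ANY-VALUE! is assumed to be available as in Ren-C (Rebol2 and R3-Alpha spell the closest equivalent ANY-TYPE!). The name run-or-fetch is made up for the example.

```rebol
;-- No type spec: today this accepts functions; under the proposal it would
;-- default to "any-value! but function!" and reject them at the call.
check-secret: func [guess] [guess = "abracadabra"]

;-- Explicit opt-in: ANY-VALUE! would say "functions are welcome here too".
run-or-fetch: func [thing [any-value!]] [thing]

run-or-fetch 20            ;-- a plain value comes back unchanged
run-or-fetch (does [10])   ;-- a function is accepted, and evaluating THING calls it
```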

When @earl weighed in on the topic long ago, he said he probably made more functions-that-take-functions than most. The casual convenience of not having to mark them as such helped for quick-and-dirty programming. This isn't a claim that a function written without expecting a function argument has any chance of handling one intelligently. Just that "well, what else can it handle intelligently anyway? so why not let the people who want type signatures pay the cost, they really should be putting type signatures on anyway if they want the routines to be any good".

But above I've outlined how the behavior of GET and GET-WORD!, and of basic series operations, distinguishes functions from every other kind of value. If you know something is not a function, there actually are a fair number of things you can do with it. I.e. the category division is a bit more meaningful than most.

Question: How's a Function's Parameters Different from Any Object or Word?

So basically, let's say I have a function foo: func [bar] [...] and I know bar is not a function, but it might be a block or an object, and I write something like bar/:baz, or maybe :bar/:baz. All the same issues seem to apply: bar might be a block with a function in it or not. Maybe it's a MAP! with a function in it. There's only one level of protection provided by guarding the function arguments from being functions...and it's going to need to be overridable anyway.

So given that, is it worth the inconsistency? In practice, how many times do people actually try passing functions as arguments when they weren't expected?

Related Idea: Create an effectively "OPT-WORD!" (operator ?)

A motivator of this line of thought was seeing an increase in the use of GET-WORD!s to fetch values conditionally, as it is a historical way to bypass the UNSET!ness.

Another possibility would be to invent something new. I have in the past looked for a better use of ? than as a HELP synonym...so what if ? were an operator that quoted its word or path argument, and then ran it if it were a function...gave back the value if it were any other value type...and was void if the word was unset? E.g.

foo: ()
bar: does [10]
baz: 20
? foo ;-- generate no error, evaluates to no value/void
? bar ;-- 10, ran the function
? baz ;-- 20, fetched the value

It doesn't solve every part of the general problem. But the problem was always around...it only gets noticeably worse the more GET-WORD!s and GET-PATH!s you find yourself using, now that handling unset things is a more common case.

any [:foo | if set? 'bar [bar] | :baz] => any [? foo | ? bar | ? baz]

It at least helps bridge it so that you can separate out "optionality" from "get" semantics.

One disadvantage is that if ? isn't part of the type (e.g. ?foo for OPT-WORD! and ?what/ever for OPT-PATH!), it would have to be processed as a keyword in dialects or escaped, and it would need to be variadic. Hm...actually, it can't work for functions in general, because if the word were unset it would need to evaluate the arguments to know how far to skip past them. So only arity-0 functions could work with it. :-/ Leaving it here to keep thinking about, or in case it inspires other thoughts.

Other Ideas or Thoughts?

This doesn't really answer whether it's worth it for functions to not include FUNCTION! as legal arguments by default. Perhaps MAKE FUNCTION! the low-level primitive would always be explicit about it, and the FUNC and FUNCTION generators would be the ones who chose.

It seems fairly innocuous to require you to say any-value!, in the scheme of things...and it's at least a little bit of fair warning. I dunno.

Comments welcome.

@nedzadarek

The default would be an acceptance of "anything but"...based on the idea that every other type is not executable by default.

That would mean writing functions like this:
foo: func [arg1 [any-value! EXCEPT type1! type2! type3! type4! ... ]] [...]
That would make function definitions very big.

 something: func [arg key [word!]] [
     word: either some-condition ['arg] [key]
     return get word
 ]

Well, I would write the body of that function as word: either some-condition [arg] [get key], but you could use your call-a-function-or-use-get (aka ?) instead of get:

 something: func [arg key [word!]] [
     word: either some-condition ['arg] [key]
     return ? word
 ]

In general, why not use a function! datatype like this (pseudo-Rebol code):

h: func-with-function-type [
    f [function! [a [integer!] b [string!] c] ]
    arg1 [integer!]
    arg2 [type-of a]
    arg3
][
    f arg1 arg2 arg3
]

This way we can make sure that:
a) argument (f in this example) is a function not some word!
b) f & h takes proper arguments (a & arg1 must be integer!s, b must be string!s. arg2 is going to take b's type)

In addition, we can make 2 new function generators:

  1. one that cannot modify words outside its function's body, let's call it function:
number: 42
foo: function [a] [number: number + a]
foo 4
print number ; 42 not changed
  2. one that can modify words outside its function's body, let's call it closure:
number: 42
qux: closure [] [number: "CHANGED" ]
print qux ; CHANGED

@hostilefork
Member Author

@nedzadarek:

we can make 2 new function generators:

  1. one that cannot modify words outside its function's body, let's call it function:
  2. one that can modify words outside its function's body, let's call it closure:

As I mentioned, in R3-Alpha these ideas are covered by FUNCTION for 1 and FUNC for 2. FUNCTION is called FUNCT in Rebol2, so you might not have seen it:

rebol2> funct [] [x: 10]
== func [/local x][x: 10]

Calling FUNC by the name CLOSURE is an interesting idea. CLOSURE was used for a concept in R3-Alpha that is obsolete in Ren-C, and arguably what FUNC does is a better "abstract verbal mapping" of what most people think of closures as being. But the way Rebol does things is kind of its own universe, so abusing terminology may not be a good idea.

I've already suggested that FUNCTION! be called ACTION!, and then the names of these generators don't need to matter so much.


The default would be an acceptance of "anything but"...based on the idea that every other type is not executable by default.

That would mean writing functions like this:

foo: func [arg1 [any-value! EXCEPT type1! type2! type3! type4! ... ]] [...]

The suggestion was merely that the default be "any-value! but function!" as a typeset that you would get if you didn't specify otherwise.

Rebol2 and R3-Alpha currently have the default as "any-type! but unset!" for functions, such that you override it with ANY-TYPE! if you want it to include UNSET!. This feature is currently done with an annotation in Ren-C, and the name ANY-VALUE! is used to mean all the other types. So it would just be a similar situation, where if you wanted to accept any value including a function you would just say ANY-VALUE!. If you wanted to accept a more limited set of types it would be the same as today, where you list them explicitly.

So my suggestion doesn't really have a bearing on the problem you suggest about exclusionary typesets. Though that is a problem, if that's what you want to say. Having a better dialect for specifying typesets would be nice, and it's possible to make one. But a lot of those considerations have to weigh in the question of user defined types...

h: func-with-function-type [
f [function! [a [integer!] b [string!] c] ]
arg1 [integer!]
arg2 [type-of a]
arg3
][
f arg1 arg2 arg3
]

I like the "big thinking" of being able to do such things. With a function generator you could do this...basically it would have the type checking code as boilerplate on top of the low-level function. So your func-with-function-type would actually turn around and generate something like:

func [f [function!] arg1 [integer!] arg2 [any-type!] arg3] [
    if 3 != arity-of :f [
        fail "Expected function of arity 3..."
    ]
    unless find (first types-of :f) integer! [
        fail "Expected first argument of f to be integer!"
    ]
    ...
    (f arg1 arg2 arg3)
]

It's a balance, deciding how much to build in and how much to let users build for themselves. Composite and user-defined types are a big thing on the list for the system to know about. But I'm still really grappling with just the "basics", things like how to have a language which doesn't build RETURN in as a keyword (!) Those are the questions that interested me when I saw Rebol and found things it could not do.

@hostilefork
Member Author

Moved the central question of this to the forum, to attract discussion, and so this issue doesn't stay on the books forever:

https://forum.rebol.info/t/should-function-arguments-to-functions-be-disallowed-by-default/167
