Should the language specification use a term other than 'function'? (like 'routine'?) #17769

Open
bradcray opened this issue May 18, 2021 · 20 comments

Comments

@bradcray
Member

Currently, the language specification uses the term function to refer to the union of procedures and iterators. At times, when giving talks, this has caused cognitive dissonance from audience members who feel that the use of the term 'function' here is a mismatch w.r.t. what 'function' means in mathematics and/or in functional languages. This has made me wonder, for some time now, whether we should change the term to something else.

My favorite alternative is subroutine, which seems to me to better capture the general notion of "here's a chunk of code that you can call into, which takes any number of input and output arguments, can do pretty much anything, and will (probably) eventually return" than 'function' does, particularly for iterators. Conversationally, I might refer to these as routines for short.

@e-kayrakli
Contributor

Why not "procedure"?

@bradcray
Member Author

This is intended to be a term representing "procedure (proc) or iterator (iter)". I think the main downside of using "procedure" is that it seems to align too closely with proc and doesn't seem to clearly encompass iter. That's what caused us to use a different term (currently 'function') to talk about the combination of the two.

@e-kayrakli
Contributor

Oh, right. It's right in your first sentence.

This tells me that my subconscious never thinks about iterators when somebody says "function". I don't know if "subroutine" is an improvement in that sense. But I don't have an alternative at the moment.

@bradcray
Member Author

I don't naturally think of 'iterator' when I hear 'function' either, aside from being trained to by years of working with Chapel. To me, 'subroutine' reads more like "arbitrary chunk of code factored into a proc / iter", but admittedly, I'm not coming at it from a Fortran perspective. (Of course, our definition of 'function' doesn't match Fortran's either.)

@damianmoz

How is an iterator implemented please? Why is it called a function?

The word subroutine in assembler is used to implement both a procedure returning nothing and a procedure returning one or more values, i.e. what some languages call a function. I have no problems with the word subroutine and I come from a Fortran background. Personally I prefer routine because that is a shorter word but that might not be as precise.

@leekillough

Iterators are called coroutines in some languages.

@leekillough

I think [sub]routine and coroutine are similar enough, yet emphasize the difference between procedures and iterators.

Subroutine has the connotation of not returning a value, while routine does not, at least not as much.

"Chapel routines and coroutines"
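To make the routine/coroutine distinction concrete, here is a minimal sketch in Python rather than Chapel, using a Python generator as a rough analogy for an iterator. The names `routine`, `coroutine`, and `log` are made up for illustration; nothing here is Chapel's actual behavior or API.

```python
def routine(log):
    # An ordinary routine: runs from start to finish in a single call.
    log.append("routine: start")
    log.append("routine: end")
    return 42


def coroutine(log):
    # A generator: suspends at each yield and resumes when the caller
    # asks for the next value.
    log.append("coroutine: start")
    yield 1
    log.append("coroutine: resumed")
    yield 2
    log.append("coroutine: end")


log = []
result = routine(log)
for value in coroutine(log):
    # The caller's loop body runs in between the coroutine's steps.
    log.append(f"caller: got {value}")
```

The log shows the routine's entries back to back, while the coroutine's entries interleave with the caller's, which is the difference the two terms are meant to emphasize.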

@damianmoz

As @leekillough suggested, an iterator is a co-routine, while the others are just routines. And then we just have routines and co-routines. I am not a fan of using sub because it implies something lesser. But I am pretty relaxed about it.

If you followed the lead of Rust which uses the abbreviation fn for a function/procedure, we can call an iterator an it! Sorry, I could not help myself.

@damianmoz

In defence of sub-routine, it is used in assemblers for any routine, irrespective of whether (or not) its return is designed to feed data back into data from the calling routine, i.e. whether it behaves as a proc returning a value or one which does not. But we are getting into semantics here which is never a good place to be.

@bradcray
Member Author

> How is an iterator implemented please?

It depends significantly on the iterator and its use. For the simplest cases, the iterator is inlined at the point of the loop that invoked it, and then the loop's body is inlined back into its yield statements. But in more complicated cases, an object that represents a closure of the iterator's state is created.
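The two strategies can be sketched by hand. This is an illustration in Python, not Chapel and not the compiler's actual output: `evens`, `sum_inlined`, and `EvensState` are hypothetical names, and a Python generator stands in for a Chapel iterator.

```python
def evens(n):
    # Generator standing in for an iterator: yields 0, 2, 4, ... below n.
    i = 0
    while i < n:
        yield i
        i += 2


def sum_with_iterator(n):
    # A loop that invokes the iterator.
    total = 0
    for x in evens(n):
        total += x  # loop body
    return total


def sum_inlined(n):
    # Simple case, written by hand: the iterator's code replaces the loop,
    # and the loop body is substituted for the yield statement.
    total = 0
    i = 0
    while i < n:
        x = i
        total += x  # loop body, in place of `yield i`
        i += 2
    return total


class EvensState:
    # More complicated case, written by hand: an object that captures the
    # iterator's state so it can be advanced one step at a time.
    def __init__(self, n):
        self.n = n
        self.i = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self.i >= self.n:
            raise StopIteration
        value = self.i
        self.i += 2
        return value
```

All three forms produce the same values; the inlined form avoids creating any iterator object, which is presumably why it is preferred when the simple case applies.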

> Why is it called a function?

I would say "no good reason" other than that it was the best term we came up with the last time we discussed this.

> I am not a fan of using sub because it implies something lesser.

I think of sub in the sense of being "a small part of a program / module".

But if the prevailing opinion is that we should replace "function" with "routine" for the union of "procedures and iterators" I'm open to that. I'm not interested in renaming "procedures" or "iterators" themselves, nor their keywords. Specifically, I'm mostly interested in the terminology change from the perspective of talking about the language, use in error messages, etc. and wouldn't expect it to be a breaking code change (or want it to be one).

@dlongnecke-cray
Contributor

Here to throw a vote for routine as the union of procedures and iterators!

@mppf
Member

mppf commented Oct 5, 2021

FWIW I like routine better than subroutine. But either way, I'm not personally too put-off by function, but I certainly have no problem changing it. (I think the hardest part for me will be to stop calling it "function resolution" but "type resolution" or "type and call resolution" is arguably better terminology there anyway).

@bradcray changed the title from "Should the language specification use a term other than 'function'?" to "Should the language specification use a term other than 'function'? (like 'routine'?)" on Oct 28, 2021
@mppf
Member

mppf commented Feb 17, 2022

If we make this change we would also want to update a bunch of error messages.

Edit: and also the dyno uast type Function.

@bradcray
Member Author

> If we make this change we would also want to update a bunch of error messages.

Right, I think if we were to make this change, we'd basically try to git grep -i any uses of function out of our source and documentation.

@mppf
Member

mppf commented Aug 5, 2022

I keep producing stuff that would need to change if we change this terminology which makes me think, if we are trying to change it, maybe we should decide that, so new things at least can use the new terminology, even if it takes some time to update the old things.

@bradcray
Member Author

bradcray commented Aug 9, 2022

I remain in favor of changing this, but have not taken a straw poll of others (who have not commented on this issue) and don't have much of a read on the situation. This afternoon I sent out a mail to the HPE Chapel Devs to get reactions.

@damianmoz

In the days of assembler, they were called subroutines but people with a history of Fortran think 'no value returned'. Also 10 letters.

Similarly, functions in Fortran always returned a value and initially had limitations over subroutines. Functions in C always return a value now, even if it is a void value. And 8 letters.

proc means procedure and to some users from other languages it is a foreign word. Also 9 letters.

Whereas routine keeps the subroutine people happy, has no locked-in concepts about value returns or not, and is only 7 letters.

My vote is for the last but I am only a single little (old) bunny and my ideas could be off with the fairies.

@stonea
Contributor

stonea commented Aug 16, 2022

I'm in favor of having consistent terminology for the doc if nothing else.

I don't have a problem calling "the union of procedures and iterators" functions. I don't think anyone is going to confuse us for a functional language and if/when there's a need to refer to some proc as being a function in the more pure or mathematical or functional programming sense I can just call it something like a "pure function" or a "side-effect free function" or an "idempotent function that has a unique return value for every possible input".

Calling something a "subroutine" gives me flashbacks to coding BASIC but I wouldn't be too opposed to it either.

I imagine I might have to do a bit of explanation for the uninitiated every time I show a piece of Chapel code that starts proc foo and I keep calling foo a Subroutine (after all, it says "proc" right there). Although I suppose the same would be true if I kept calling something labeled "proc" as a function.

@bradcray
Member Author

bradcray commented Aug 16, 2022

> I imagine I might have to do a bit of explanation for the uninitiated every time I show a piece of Chapel code that starts proc foo and I keep calling foo a Subroutine (after all, it says "proc" right there). Although I suppose the same would be true if I kept calling something labeled "proc" as a function.

To be clear, when talking about a piece of code labeled with proc, I think we should definitely call it a 'procedure' rather than a 'function' (today) or '[sub]routine' (in my favorite counterproposal). Similarly, when talking about an iter, we should just refer to it as an 'iterator'. In my proposed world, '[sub]routine[s]' would only be used in a context where we were talking about both procs and iters without being able to distinguish between them. For example, I might write "objects store fields and have methods (which are the routines that use the type as their receiver)" as a shorthand way of saying "objects store fields and have methods (which are all the procedures and iterators that use the type as their receiver)."

Not that there's anything wrong with saying the latter too, but in practice, I think we have found ourselves wanting to talk about the union of the two using a single term rather than listing both options (e.g., isRoutine() or isSubroutine() might be nicer names for reflection routines than isProcOrIter(), where we might also have isProc() and isIter() for the more specific queries).
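As a rough sketch of how such a trio of queries relates, here is an analogy in Python rather than Chapel. The names `is_routine`, `is_proc`, and `is_iter` are hypothetical stand-ins for the reflection routines mentioned above, and Python generator functions stand in for iterators; none of this is an existing Chapel API.

```python
import inspect


def is_iter(f):
    # Analogy for isIter(): true only for generator functions.
    return inspect.isgeneratorfunction(f)


def is_proc(f):
    # Analogy for isProc(): a plain function that is not a generator.
    return inspect.isfunction(f) and not inspect.isgeneratorfunction(f)


def is_routine(f):
    # Analogy for isRoutine(): the union of the two specific queries.
    return is_proc(f) or is_iter(f)


def square(x):
    # Stands in for a procedure.
    return x * x


def squares(n):
    # Stands in for an iterator.
    for i in range(1, n + 1):
        yield i * i
```

Here `is_routine` accepts both `square` and `squares`, while `is_proc` and `is_iter` each accept exactly one of them, mirroring the idea that the umbrella term is only needed when the two kinds are not being distinguished.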

[edit: So in your example above, hopefully you'd say "pure procedure", "side-effect-free procedure" or "idempotent procedure" rather than "...function" or "...[sub]routine"].

That said, it'd also be interesting to check the spec and code to see how often we use 'function' to mean "both of these things" vs. something else. With a quick glance, I definitely came across cases that should be using 'procedure' instead of 'function' by the guidelines above. When talking about compilation, "function resolution" comes up a lot, but arguably could be "call resolution" if "subroutine resolution" was too long (and "routine resolution" seems a bit ambiguous in meaning).

> I don't think anyone is going to confuse us for a functional language

They may not confuse Chapel as being functional, but we have had several people approach us over the years to inquire about defining a pure functional subset of Chapel. And I've definitely had audiences who have asked (with something like shock or scorn in their voices) "In what way is that a function?" (coming more from the mathematical world than the functional programming world, from what I could tell).

(worse, to me, iterators don't seem particularly function-like by any definition).

Googling the terms a bit tonight, I'm seeing lots of contrary opinions of what the terms mean, where each language tends to choose its own terms (for example, some would say subroutines can't return anything, which would go against my intuition and proposal here). It did seem notable that the Wikipedia entry for this class of things was called subroutine:

> In different programming languages, a subroutine may be called a routine, subprogram, function, method, or procedure. Technically, these terms all have different definitions, and the nomenclature varies from language to language. The generic, umbrella term callable unit is sometimes used.

I'm not completely opposed to keeping the status quo here (and would still propose tightening up cases that lean on 'function' where they could say 'procedure'). But it is one of those things that's weighed on me as a choice I've felt less happy with over time, which is why I filed this.

@mppf
Member

mppf commented Aug 16, 2022

> "objects store fields and have methods (which are the routines that use the type as their receiver)"

The first couple times I read this, I didn't even notice you were using the new term. So that indicates to me that routine is fairly intuitive, for me at least.
