DagSverreSeljebotn WhyILikeSimpleGrammars

DagSverreSeljebotn edited this page May 14, 2008 · 10 revisions

Why I Like Simple Grammars

When syntax is discussed, I sometimes propose something along these lines (this is only an example; there probably won't be a type arguments feature as such):

cdef class A:
    __typearguments__ = ["T", "N"]

And this is shot down as not being as nice, while presumably

cdef class A:
    cdef typeargument T
    cdef typeargument N

has the nice quality. Well, here's why I like the former: It doesn't change the language grammar.

(Note: This doesn't mean that I'm in favor of pure Python syntax only -- I think my ideal grammar would be something like the union of Python and C, with a few necessary utilities thrown in. It is when creating rather exotic, Cython-only features that I think one should consider hard whether they can be embedded naturally into the existing grammar.)

1. Guido does it!

Take metaclasses as an example. They are a very fundamental feature (every class has one) that changes Python in many ways.

And what is the syntax for setting the metaclass of a class in Python? Is it this?:

class B(A; metaclass=mymeta): ...

Is it this?:

class B(A) {metaclass=mymeta} : ...

No, it is the almost insultingly simple

class B(A):
    __metaclass__ = mymeta

As a result:
  • Syntax highlighters don't have to change
  • Code analysis tools in Python IDEs don't have to change (unless they care very much about metaclasses)
  • Code prettifiers don't have to change

and so on. If Cython becomes as popular as we hope, it may get all of the above too, and any effort to keep its grammar simple will bring the same benefits.

2. Parser

A simpler grammar gives a simpler parser. (Of course, the contents must still be interpreted somehow, but that can happen at a much higher level and in a more flexible way, making the parser component smaller and more easily interchangeable.)

Having a very simple parser is very Pythonic by the way.
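
As a rough illustration of interpreting the contents "higher up", the sketch below lets Python's standard ast module do all the parsing and then picks out special fields from the resulting tree. The helper name find_special_fields is made up for this example, and a plain class stands in for cdef class, which the stock Python parser does not accept.

```python
import ast

SOURCE = '''
class A:
    __typearguments__ = ["T", "N"]
'''


def find_special_fields(source):
    """Map each class name to its dunder-named class-level assignments."""
    found = {}
    tree = ast.parse(source)  # stock Python grammar, no extensions needed
    for node in ast.walk(tree):
        if not isinstance(node, ast.ClassDef):
            continue
        for stmt in node.body:
            if not isinstance(stmt, ast.Assign):
                continue
            for target in stmt.targets:
                # Only simple names have an "id"; other targets are skipped.
                name = getattr(target, "id", "")
                if name.startswith("__") and name.endswith("__"):
                    found.setdefault(node.name, {})[name] = stmt.value
    return found


fields = find_special_fields(SOURCE)
print(sorted(fields["A"]))
```

The parser itself knows nothing about type arguments; the meaning is attached in a separate pass over an ordinary assignment node.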

3. Python compatibility

It would be nice to have a mode where Cython-compilable code can run in a pure Python interpreter (mostly for interactive testing purposes). "Syntax" like __typearguments__ above might not do anything in such a mode, but at least it is reduced to a no-op rather than producing a syntax error. (And some syntax might be emulated by run-time code too.)

If this is a direction we might want to go in, why add new syntax that we know might have to be complemented with a Python-compatible syntax later on, if we can simply use the latter right away?
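
A minimal sketch of what this looks like from the pure Python side, assuming the hypothetical __typearguments__ field from the example above: the interpreter treats it as an ordinary class attribute and otherwise ignores it.

```python
class A:
    # To plain Python this is just a class attribute; nothing reads it.
    __typearguments__ = ["T", "N"]

    def __init__(self, value):
        self.value = value


a = A(3)
print(a.value)               # the class behaves normally
print(A.__typearguments__)   # the declaration is still there for inspection
```

Because the name both starts and ends with two underscores, Python's private-name mangling does not apply to it, so the attribute keeps its spelling.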

4. Flexibility for macros

This one starts from a common argument against the approach, which goes like this: If you write

class A:
    __typearguments__ = ["T", "N"]

then you are really "lying", since the right-hand expression cannot be any run-time expression but has to follow a stricter format than the syntax implies. My first answer to that is that I don't care (so it may be a matter of taste): raise a compiler error telling the user that it is illegal to have run-time expressions on the right-hand side.
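
Such a check could look roughly like the following sketch, which leans on ast.literal_eval from the standard library to reject anything that is not a literal. The names SpecialFieldError and read_typearguments are invented for this example and are not part of Cython.

```python
import ast


class SpecialFieldError(Exception):
    """Stand-in for a compiler error (hypothetical name)."""


def read_typearguments(class_source):
    """Return the literal list assigned to __typearguments__, or [] if absent."""
    cls = ast.parse(class_source).body[0]
    if not isinstance(cls, ast.ClassDef):
        raise SpecialFieldError("expected a class definition")
    for stmt in cls.body:
        if (isinstance(stmt, ast.Assign)
                and len(stmt.targets) == 1
                and isinstance(stmt.targets[0], ast.Name)
                and stmt.targets[0].id == "__typearguments__"):
            try:
                # literal_eval only accepts literals, never calls or names.
                value = ast.literal_eval(stmt.value)
            except (ValueError, TypeError):
                raise SpecialFieldError(
                    "line %d: run-time expressions are not allowed in "
                    "__typearguments__" % stmt.lineno)
            if not (isinstance(value, list)
                    and all(isinstance(item, str) for item in value)):
                raise SpecialFieldError(
                    "line %d: __typearguments__ must be a list of strings"
                    % stmt.lineno)
            return value
    return []


print(read_typearguments('class A:\n    __typearguments__ = ["T", "N"]\n'))
```

A right-hand side such as make_args() would raise SpecialFieldError here instead of being silently accepted.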

However, the same aspect does leave the way open for some limited compile-time macros. We don't have to do it, but we don't have to change the Cython language if we do (and this comes as an automatic side effect)! That is, perhaps the Cython compiler at some point really can detect at compile time that this:

common_typeargs = ["N"]
def construct_typeargs(custom):
    return custom + common_typeargs

class A:
    __typearguments__ = construct_typeargs(["T"])

class B:
    __typearguments__ = construct_typeargs(["T", "meh"])

...is compile-time evaluatable and just fine. This doesn't seem completely impossible. You need much better (and more explicit, non-Python-compatible) macro support to do that with a more custom syntax.
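
To make the idea concrete, here is a deliberately tiny sketch of such an evaluator. It is not how Cython works; it only folds string constants, list literals, list concatenation with +, module-level names, and calls to module-level functions whose body is a single return statement. All names in it (fold, typearguments_by_class, NotCompileTime) are invented for this example.

```python
import ast

SOURCE = '''
common_typeargs = ["N"]
def construct_typeargs(custom):
    return custom + common_typeargs

class A:
    __typearguments__ = construct_typeargs(["T"])

class B:
    __typearguments__ = construct_typeargs(["T", "meh"])
'''


class NotCompileTime(Exception):
    """Raised when an expression is outside the supported subset."""


def fold(node, env, funcs):
    """Evaluate a small, side-effect-free subset of expressions."""
    if isinstance(node, ast.Constant) and isinstance(node.value, str):
        return node.value
    if isinstance(node, ast.List):
        return [fold(elt, env, funcs) for elt in node.elts]
    if isinstance(node, ast.Name) and node.id in env:
        return env[node.id]
    if isinstance(node, ast.BinOp) and isinstance(node.op, ast.Add):
        return fold(node.left, env, funcs) + fold(node.right, env, funcs)
    if (isinstance(node, ast.Call) and isinstance(node.func, ast.Name)
            and node.func.id in funcs and not node.keywords):
        func = funcs[node.func.id]
        params = [arg.arg for arg in func.args.args]
        if len(params) != len(node.args):
            raise NotCompileTime("wrong number of arguments")
        # Bind folded arguments to parameter names, then fold the return value.
        local_env = dict(env)
        local_env.update(zip(params, [fold(a, env, funcs) for a in node.args]))
        return fold(func.body[0].value, local_env, funcs)
    raise NotCompileTime("cannot evaluate at compile time: " + ast.dump(node))


def typearguments_by_class(source):
    """Return {class name: folded __typearguments__ value}."""
    env, funcs, result = {}, {}, {}
    for stmt in ast.parse(source).body:
        if (isinstance(stmt, ast.Assign) and len(stmt.targets) == 1
                and isinstance(stmt.targets[0], ast.Name)):
            try:
                env[stmt.targets[0].id] = fold(stmt.value, env, funcs)
            except NotCompileTime:
                pass  # not a compile-time constant; simply not recorded
        elif (isinstance(stmt, ast.FunctionDef) and len(stmt.body) == 1
                and isinstance(stmt.body[0], ast.Return)
                and stmt.body[0].value is not None):
            funcs[stmt.name] = stmt
        elif isinstance(stmt, ast.ClassDef):
            for inner in stmt.body:
                if (isinstance(inner, ast.Assign)
                        and isinstance(inner.targets[0], ast.Name)
                        and inner.targets[0].id == "__typearguments__"):
                    result[stmt.name] = fold(inner.value, env, funcs)
    return result


print(typearguments_by_class(SOURCE))
```

Anything outside that subset raises NotCompileTime, which is where a real compiler would report an error. The point is only that the source stays ordinary Python while the compiler decides how much of it to understand.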

One objection to this approach is name collision. The fix is simple: have all our special fields and special methods start with "cython". The odds of collision are then probably zero.

Another is that in some cases, forcing something into "special fields" or "special methods" might seem cumbersome or lack clarity. That is OK, but then please point to the specific cumbersomeness or lack of clarity in the discussion, not to a dislike of special fields in general.

More? Please add yours :-)