Why "Only lower-case latin letters allowed in names, not ..." #28

boeddeker · 2019-02-14T10:12:06Z

Is there a reason that einops does not support upper latin letters?
I would like to use upper and lower letters.

jotaf98 · 2019-02-14T16:23:15Z

I also wonder about this. How about other characters, e.g. underscores?

One type of name that would be useful to me would be numbers. This is useful e.g. for singleton dimensions. Their values could be set automatically, e.g. rearrange(x, 'a 1 -> a') is much more intuitive than rearrange(x, 'a dummy -> a', dummy=1).

They would still have to be unique (there's no way to disambiguate two singleton dimensions this way) but this is understandable; it would still be useful in many situations regardless.

arogozhnikov · 2019-02-15T05:49:38Z

Thanks for good questions, guys.

Is there a reason that einops does not support upper latin letters?

Yup. Einops enforces better code style (reliable, maintainable and readable code - that what it was created for), this is primarily to prevent 't for token, T for time' and things alike. Be verbose, you'll say thanks to yourself!

That said, this decision is open for reconsidering in future releases if there are important use-cases that current API doesn't meet.

I also wonder about this. How about other characters, e.g. underscores?

Underscores reserved in advance if some additional flags will be required. That way those will be disambiguated with passed dimensions. Other characters can't be passed with kwargs, AFAIK.

rearrange(x, 'a 1 -> a')

Current way is: rearrange(x, 'a () -> a'). Uses existing infra and does not introduce special meaning for 1. Because one would definitely also try rearrange(x, 'a 2 -> a').

boeddeker · 2019-02-15T10:17:13Z

Yup. Einops enforces better code style (reliable, maintainable and readable code - that what it was created for), this is primarily to prevent 't for token, T for time' and things alike. Be verbose, you'll say thanks to yourself!

That said, this decision is open for reconsidering in future releases if there are important use-cases that current API doesn't meet.
I would say it does not enforce better code style, becasue lower case letters aren't better than upper case letters. e.g. rearrange(x, 'a b -> b a') vs rearrange(x, 'A B -> B A'). Furthermore the upper case letters are more common.

When you have squared dimensions I like to use lower and upper case letters to indicate that they have the same size.
An upper case letter would allow to use even more verbose names. e.g. rearrange(x, 'timeDimension -> timeDimension) (Bad example)

Current way is: rearrange(x, 'a () -> a'). Uses existing infra and does not introduce special meaning for 1. Because one would definitely also try rearrange(x, 'a 2 -> a').
I like rearrange(x, 'a 1 -> a') more than rearrange(x, 'a () -> a'), because it is easier to read. Nevertheless rearrange(x, 'a 2 -> a') can be confusing.

jotaf98 · 2019-02-15T15:37:40Z

Thanks for the reply. I don't feel too strongly either way about enforcing style; it just seemed like a simple thing to add.

As for numbers, rearrange(x, 'a 2 -> a') should always throw an error (number of elements on both sides will be different. However, this makes sense: rearrange(x, 'a 2 -> 2 a'). My suggestion (in hindsight it should've been a different issue) was to treat numbers just like named dimensions, with no syntactic difference whatsoever, and with an already known size (in this case 2).

arogozhnikov · 2019-02-15T17:55:37Z

My suggestion (in hindsight it should've been a different issue) was to treat numbers just like named dimensions, with no syntactic difference whatsoever, and with an already known size (in this case 2).

Unfortunately, this produces even more questions down the road:

rearrange(x, 'a 2 -> 2 a')
rearrange(x, 'a 2 2 -> 2 a 2') # which one is which?
rearrange(x, '2 4 -> 4 2') # hey, I thought it is reshape, not transpose!

Axis 1 is the only that does not cause any doubt, but in this case it is an exception in notion (that can't be extended to other numbers).

When you have squared dimensions I like to use lower and upper case letters to indicate that they have the same size.

Thanks for coming up with examples. Wouldn't time1 time2 -> time2 time1 work for this case?

An upper case letter would allow to use even more verbose names. e.g. rearrange(x, 'timeDimension -> timeDimension) (Bad example)

I agree it is not good example :) So far I didn't meet need myself to write long axis names, but camelCased identifiers are easier to read, I agree with that.

boeddeker · 2019-02-15T19:35:50Z

Thanks for coming up with examples. Wouldn't time1 time2 -> time2 time1 work for this case?

Sure, but my "glue" code writes many rearranges and I have common letters for my dimensions.
There I prefare to use single letters and in the moment I use for example 'D T F -> F () T D'.lower() (I like upper case letters more than lower case).

But it is still confusing that rearrange(x, 'a () -> a') is valid and rearrange(x, 'A () -> A') not.

adynathos · 2019-08-28T19:08:31Z

I respectfully disagree.
Lowercase letters are not "better style" than uppercase letters. If anything, uppercase letters are more readable.

If this library is used to implement a formula, it would make sense to follow the symbols and case from the formula.
Thus unicode symbols like greek letters should also work.

Thanks for the trick with .lower(), it will do for now.

shoyer · 2020-08-20T22:54:26Z

+1 for relaxing this. I think a better rule would be supporting any valid Python identifier as a name:
https://docs.python.org/3/reference/lexical_analysis.html#identifiers

Python 3 names are a bit more complex with unicode support, but I would at least support any valid Python 2 name, i.e.,

the uppercase and lowercase letters A through Z, the underscore _ and, except for the first character, the digits 0 through 9.

arogozhnikov · 2020-08-23T04:59:17Z

There is enough consensus to relax requirement. Thanks everyone for input.

From now on

axis name can be any python identifier (camelCase, under_scored, and e.g. greek letters allowed as well)
- please use this freedom wisely
exception: _leading and trailing_ underscores are still booked for extensions
axes that are python reserved keywords trigger warning but not an error (if, for, while, class, etc.)
when there is need to reserve specific names, einops will go thru deprecation warning in one version, then will raise an error

Code in master is already updated to allow new friendly axes.
Last point is in action as well - name axis is reserved for compatibility with planned operations.

shoyer · 2020-08-23T05:57:03Z

Nice! Just to clarify: Is _ a valid axis name, or is that reserved?

arogozhnikov · 2020-08-23T06:26:53Z

Nice catch @shoyer

'_' is reserved and disallowed in rearrange/reduce/repeat
'_' is allowed in parse_shape and has its previous meaning (skip this axis)

arogozhnikov · 2020-09-11T06:03:40Z

This feature became part of einops 0.3 release.
Announcement at reddit/ML

boeddeker changed the title ~~Why "Only lower-case latin letters allowed in names, not '{}'"~~ Why "Only lower-case latin letters allowed in names, not ..." Feb 14, 2019

jotaf98 mentioned this issue Mar 14, 2019

[Feature suggestion] Identifiers not on both sides of the expression #29

Closed

arogozhnikov mentioned this issue Aug 23, 2020

Add support for anonymous axes #62

Merged

arogozhnikov added this to the 0.3 milestone Aug 24, 2020

arogozhnikov closed this as completed Sep 11, 2020

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Why "Only lower-case latin letters allowed in names, not ..." #28

Why "Only lower-case latin letters allowed in names, not ..." #28

boeddeker commented Feb 14, 2019

jotaf98 commented Feb 14, 2019

arogozhnikov commented Feb 15, 2019 •

edited

boeddeker commented Feb 15, 2019

jotaf98 commented Feb 15, 2019

arogozhnikov commented Feb 15, 2019

boeddeker commented Feb 15, 2019

adynathos commented Aug 28, 2019 •

edited

shoyer commented Aug 20, 2020

arogozhnikov commented Aug 23, 2020

shoyer commented Aug 23, 2020

arogozhnikov commented Aug 23, 2020

arogozhnikov commented Sep 11, 2020

Why "Only lower-case latin letters allowed in names, not ..." #28

Why "Only lower-case latin letters allowed in names, not ..." #28

Comments

boeddeker commented Feb 14, 2019

jotaf98 commented Feb 14, 2019

arogozhnikov commented Feb 15, 2019 • edited

boeddeker commented Feb 15, 2019

jotaf98 commented Feb 15, 2019

arogozhnikov commented Feb 15, 2019

boeddeker commented Feb 15, 2019

adynathos commented Aug 28, 2019 • edited

shoyer commented Aug 20, 2020

arogozhnikov commented Aug 23, 2020

shoyer commented Aug 23, 2020

arogozhnikov commented Aug 23, 2020

arogozhnikov commented Sep 11, 2020

arogozhnikov commented Feb 15, 2019 •

edited

adynathos commented Aug 28, 2019 •

edited