Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Agreeing on doc, and doc contribution style guide, about hash construction; {...}, %(...), hash(), ... #2117

Closed
Juerd opened this issue Jun 22, 2018 · 27 comments

Comments

@Juerd
Copy link
Contributor

commented Jun 22, 2018

The problem

The style guide suggests using {} for empty hashes and %() for non-empty hashes. This is inconsistent.

Suggestions

Since {} is considered confusing (although technically it's supposed to be straight forward) with blocks, and empty %() is considered a trap (see rakudo/rakudo#1946), why not just use hash instead? According to jnthn in rakudo/rakudo#1946 (comment) these are a "sure bet".

Suggested: avoid both %() (empty construct is a trap) and {} (possible confusion with blocks) for constructing hashes, and always use hash instead:

my $foo = hash;  # empty hash
my $bar = hash a => 'b', c => 'd';  # non-empty hash

Note: these can be used with or without parentheses. Since that goes for most expressions, that could probably use its own recommendation in the style guide.

@Juerd Juerd changed the title Create hashes with `hash` Create hashes with &hash Jun 22, 2018
@JJ

This comment has been minimized.

Copy link
Contributor

commented Jun 22, 2018

@JJ JJ added the meta label Jun 22, 2018
@rafaelschipiura

This comment has been minimized.

Copy link
Contributor

commented Jun 22, 2018

I'm completely against this. You'll simply not see code like that in the wild. People should be exposed to code somewhat like they will encounter in the ecosystem .

And the {} %(...) distinction is fine and, besides a WAT at first, works just fine in practice.

@Juerd

This comment has been minimized.

Copy link
Contributor Author

commented Jun 23, 2018

You'll simply not see code like that in the wild. People should be exposed to code somewhat like they will encounter in the ecosystem.

Well, is the documentation to be descriptive or prescriptive? The existence of a style guide, and the exclusion of { ... } for hashes because it can be confused with blocks, both strongly suggest prescriptivism. But your argument is based on descriptivism. That's a fair point of view, and maybe the example code in the documentation should indeed look like the code in the ecosystem, but in that case the documentation should not shy away from using { ... } for non-empty hashes.

@Juerd

This comment has been minimized.

Copy link
Contributor Author

commented Jun 23, 2018

By the way, if the style guide does want to prescribe the use of %(...) for creating hashes, instead of hash, the following uses of %(...) with an Iterable of zero items seem to reliably return a new empty hash, unlike empty %():

%(())
%([])
%({})  # lol
%(Empty)
%{}

And then there are%(Nil), %(Any), and even %(Int). I don't know why Any.hash(Any:U:) exists, but it enables a lot of new and confusing ways to create an empty hash. How about %(X::Promise::CauseOnlyValidOnBroken)? ;-)

@Juerd

This comment has been minimized.

Copy link
Contributor Author

commented Jun 23, 2018

There are some interesting performance differences, too!

Benchmark:

use Test;

my $foo;
my %subs := {
    'hash'     => sub { $foo = hash },
    '%(Nil)'   => sub { $foo = %(Nil) },
    '{}'       => sub { $foo = {} },
    '%(Empty)' => sub { $foo = %(Empty) },
    '%({})'    => sub { $foo = %({}) },
    'my %'     => sub { $foo = my % },
};

for %subs.kv -> $k, $v {
    say "Testing $k";
    $foo = "bad";
    $v();
    isa-ok($foo, Hash); 
    is($foo.elems, 0);
};

Bench.new.timethese: 1000000, %subs;

Output:

Testing {}
ok 1 - The object is-a 'Hash'
ok 2 - 
Testing %(Nil)
ok 3 - The object is-a 'Hash'
ok 4 - 
Testing hash
ok 5 - The object is-a 'Hash'
ok 6 - 
Testing my %
ok 7 - The object is-a 'Hash'
ok 8 - 
Testing %({})
ok 9 - The object is-a 'Hash'
ok 10 - 
Testing %(Empty)
ok 11 - The object is-a 'Hash'
ok 12 - 
Benchmark: 
Timing 1000000 iterations of %(Empty), %(Nil), %({}), hash, my %, {}...
  %(Empty): 2.790 wallclock secs (2.797 usr 0.000 sys 2.792 cpu) @ 358382.348/s (n=1000000)
    %(Nil): 1.777 wallclock secs (1.779 usr 0.000 sys 1.779 cpu) @ 562757.602/s (n=1000000)
     %({}): 3.194 wallclock secs (3.194 usr 0.000 sys 3.194 cpu) @ 313061.358/s (n=1000000)
      hash: 0.139 wallclock secs (0.150 usr 0.004 sys 0.154 cpu) @ 7211365.111/s (n=1000000)
      my %: 0.112 wallclock secs (0.110 usr 0.004 sys 0.114 cpu) @ 8926817.946/s (n=1000000)
        {}: 3.263 wallclock secs (3.270 usr 0.000 sys 3.270 cpu) @ 306425.967/s (n=1000000)

Going by the benchmark, &hash is the obvious winner, if you ignore the more obscure my %, which is basically how &hash is implemented. Does anyone understand why it is so much faster, though?

@labster

This comment has been minimized.

Copy link
Member

commented Jun 23, 2018

You forgot Hash.new and Hash(Empty) and Hash(()). Hash() returns Hash(Any) which is not what we want. TIMTOWTDI!!!

Benchmark (sorted), because it's interesting:

      %( ): 0.102 wallclock secs (0.099 usr 0.000 sys 0.099 cpu) @ 9844166.839/s (n=1000000)
      my %: 0.104 wallclock secs (0.101 usr 0.000 sys 0.101 cpu) @ 9642364.694/s (n=1000000)
      hash: 0.126 wallclock secs (0.136 usr 0.000 sys 0.136 cpu) @ 7956588.851/s (n=1000000)
    %(Nil): 2.418 wallclock secs (2.408 usr 0.003 sys 2.411 cpu) @ 413502.511/s (n=1000000)
  %(Empty): 2.718 wallclock secs (2.720 usr 0.006 sys 2.726 cpu) @ 367961.179/s (n=1000000)
  Any.hash: 2.720 wallclock secs (2.668 usr 0.000 sys 2.657 cpu) @ 367586.650/s (n=1000000)
  Hash.new: 2.793 wallclock secs (2.797 usr 0.021 sys 2.818 cpu) @ 357985.530/s (n=1000000)
        {}: 4.083 wallclock secs (4.065 usr 0.009 sys 4.074 cpu) @ 244889.523/s (n=1000000)
     %({}): 4.150 wallclock secs (4.157 usr 0.007 sys 4.164 cpu) @ 240949.921/s (n=1000000)
  Hash(()): 4.641 wallclock secs (4.639 usr 0.016 sys 4.655 cpu) @ 215458.872/s (n=1000000)
Hash(Empty): 5.027 wallclock secs (4.991 usr 0.003 sys 4.994 cpu) @ 198930.114/s (n=1000000)
@JJ

This comment has been minimized.

Copy link
Contributor

commented Jun 23, 2018

@JJ

This comment has been minimized.

Copy link
Contributor

commented Jun 23, 2018

@labster

This comment has been minimized.

Copy link
Member

commented Jun 23, 2018

I just did at talk at TPC about PHP, so I've been talking about that language a bit lately. One of the things I discussed with people afterwards is that PHP is a better language now, because (among other things) you can use square brackets [] for arrays now, instead of the old array(). So let's recommend everyone use hash() in Perl 6? Um, no.

Everything suggested here is a move backwards in readability and style, including my own previous suggestions. %() is a misfeature, but %( ) works, and it is in the same speed area as the entirely unreadable my %.
%( ): 0.102 wallclock secs (0.099 usr 0.000 sys 0.099 cpu) @ 9844166.839/s (n=1000000)
The anonymous variables are not really beginner material IMO, and the anon hash is easily confused with modulo.

So I propose {} and %( ) are the best recommendations. Use the space, Luke.

@raiph

This comment has been minimized.

Copy link
Contributor

commented Jun 23, 2018

While hash is a sure bet, I think we need to realize that it was always just poor doc that turned this into a problem. I urge folk to consider a return to respectful acceptance of the simple and natural nature of Larry's original DWIM and a recognition that it has successfully taken root.

Please note my latest StackOverflow attempt at formulating the precise {} hash vs block DWIM rule.

I'm going on a trip for a week, but if everyone is happy with it I will be happy to integrate it into the doc a couple weeks from now.

Imo the right way forward for the docs is to show respect for Larry's DWIM. This is consistent with jnthn's perspective and the majority of existing community usage and, I think, preferences, though I acknowledge some strong opinions against it. (I return to this aspect at the end of this comment.)

Perhaps the WAT side of this DWIM belongs on the traps page but I'm unsure. Please read the latest version of my Reddit discussion of this DWIM/WAT that complements my SO answer.

Imo the right way forward for the docs contributor style guide is to recommend use of the DWIM in all cases except where documenting the %(...) alternative and the hash routines. The current guide text contains multiple factual errors and points in the wrong direction imo:

Prefer the %() form of declaring hashes

Imo this should be "Prefer the {} form of declaring hashes".

my %hash := { this => "is", a => "hash" }; # Correct, but BAD
my %hash := %( this => "is", a => "hash" ); # GOOD

Imo this is inappropriately "disrespectful" to Larry's DWIM. (I get that the "disrespect" isn't emotive but just trying to be clear.)

Using the second form is more idiomatic

Idiom is natural to a native speaker. I'd argue that's much more {} than %().

and avoids confusion with blocks.

As we know, it unfortunately introduces confusion with %() which won't go away for a few years.

See also my discussion of strange consistency with hash subscripts and ugliness in pair literals in my reddit comment linked above.

This does not apply to empty hashes, which should be declared using {}:

Switching guidance to be use of the DWIM will eliminate this exception.

I understand that samcv and alex and others got burned over this dwim. I think that was down to poor doc. Perhaps my poor original SO answer didn't help. (Hopefully it's not so bad now.) I understand that what I'm saying almost certainly won't resonate for them. Getting burned a few times before the right mental model clicks leaves a lasting impression, especially if the right mental model still hasn't clicked.

But I'm hopeful that Larry's view (which is that the DWIM made sense), and the majority newbie view (which is that the DWIM is fine if no one tells them about it but rather just exposes them to it), and the view of some old hands (jnthn finds the DWIM simple and natural, as do many others including me), and the evident widespread use of this DWIM, and the prospect that improved doc and community understanding will be in place, are persuasive that disrespecting Larry's DWIM isn't the way to go but rather embracing it, at least for now, until we perhaps revisit this in another few years once %() has completed a deprecation cycle.

@raiph raiph changed the title Create hashes with &hash Agreeing on doc, and doc contribution style guide, about hash construction; {...}, %(...), hash(), ... Jun 23, 2018
@raiph

This comment has been minimized.

Copy link
Contributor

commented Jun 23, 2018

I posted my prior comment before reading the latest comments about speed. Those look dramatic and perhaps trump everything else. Perhaps {...} can be sped up? How is speed for :key{:a,:b} vs :key(%(:a,:b))?

I still think it makes sense to generally use {} as a hash constructor in docs, when an explicit constructor is needed, because it reads very well. We can document %(...) as being way faster and avoiding the {} hash/block DWIM/WAT though having the %() WAT.

@JJ

This comment has been minimized.

Copy link
Contributor

commented Jun 23, 2018

But really, what is WAT?
I kind of agree with using {} so I guess that's an actionable thing that can be done right now. Do you still want this issue to be assigned to you to complete it?

@AlexDaniel

This comment has been minimized.

Copy link
Member

commented Jun 23, 2018

But really, what is WAT?

http://knowyourmeme.com/memes/wat

@AlexDaniel

This comment has been minimized.

Copy link
Member

commented Jun 23, 2018

Getting burned a few times before the right mental model clicks leaves a lasting impression, especially if the right mental model still hasn't clicked.

I'm not so sure. Looking at your SO answer, it's a huge explanation on when { } creates a block and when it's a hash… I wish we simply didn't need to go into all these details to explain such a basic thing (it should be obvious, not just understandable). With %() that's much easier when it comes to creating hashes, and in the long run the issue with empty %() will be resolved.

That said, most of the cases when I was burned were other way around – a block that I for some reason wanted empty, but got a hash instead ({}). So using %() (which I already do) won't help me.

@rafaelschipiura

This comment has been minimized.

Copy link
Contributor

commented Jun 23, 2018

My point is prescriptive, there's no real problem with using {...} for both Blocks and Hashes.

It's a problem that only exists on the minds of people that had their perception warped by computer science. I mean no offense by this, everyone changes their perception because of their background.

This discussion came up some time back and it was decided the current situation was fine, therefore I'm closing the bug with WONTFIX again.

@AlexDaniel

This comment has been minimized.

Copy link
Member

commented Jun 23, 2018

This discussion came up some time back and it was decided the current situation was fine, therefore I'm closing the bug with WONTFIX again.

Link?

@AlexDaniel AlexDaniel reopened this Jun 23, 2018
@rafaelschipiura

This comment has been minimized.

Copy link
Contributor

commented Jun 23, 2018

@AlexDaniel Sorry, I can't find a link. The only thing I can say is that any seasoned Perl 6 developer is already burned out from this discussion.

So, let me summarize what I think we already have a rough consensus about (if not, just tell):

Hash.new() in the docs will never gonna happen. It's not idiomatic.
The {...} Block/Hash constructor DWIM/WAT is fine, no need to change.
%(), @(), $() is an anti-feature and will be removed in the future, so the docs should completely avoid using it at all, except to warn people about it.

Now, should the docs use {...} or %(...)? I think the former to expose people to it, which takes away a bit of surprise it might have. There's also no need to go into how it works extensively, I think. Just show it being used and tell people how to get the other result in case the compiler decides to do a WAT. We also don't go into how self-clocking works when explaining terms and operators, we don't go into how Grammars tie into the OO system when explaining it, don't go into Grammars when explaining regexes, and we also don't wonder into the MOP when explaining how to write objects, among many other examples.

@rafaelschipiura

This comment has been minimized.

Copy link
Contributor

commented Jun 23, 2018

And if the chosen form is slower, that's a problem for rakudo developers to solve, we should torture the implementation in behalf of the user.

There's no motive for the different forms to take different amounts of time, the compiler is smart enough to know they mean the same thing and to choose the faster implementation.

@JJ

This comment has been minimized.

Copy link
Contributor

commented Jun 24, 2018

So, I would say we have several options. I call a vote:

  1. Prescript the {} form for all hashes: use 👍
  2. Eliminate all kind of prescriptions in the style guide: use 👎
  3. Leave it just like it is: use 🎉
  4. Use hash as proposed initially: use ❤️

I'll leave this open for 48 hours, tally votes, and do whatever simple majority decides. Any other thing you want to propose, comment here.

@Altai-man

This comment has been minimized.

Copy link
Member

commented Jun 24, 2018

On regard of my vote, I have spend some time following the conversation(s), so will try to summarize things a little(maybe more for myself than for others though):
1)Magic behavior of %() is set to be removed, yet I have no strong feelings toward using it everywhere: in my humble preferences, {} is just visually neater.
2)I, indeed, had situations before where I was somewhat confused about Block vs Hash situation, but that is not the most horrible thing I've encountered and it took not so much time to fix things.
3)In my opinion, part of us is concerned about user experience in language grammar/semantic, to do it intuitive, yet while it is a great goal to strive for, I'd like to note that such small things exist in every language and it is not the things people usually use to decide on whether to use programming language foo or not. How much is it popular, how many jobs are there, how great ecosystem is, how fast is it, how stable is it, et cetera - you get the idea. Intuitive is a plus for sure, but well, it is IT, people are used to deal with stuff.

Considering the above, my vote is for:
1)Remove prescription for hashes only(I am not sure if it was the indention behind "Eliminate all kind of prescriptions" - this is a bit too extreme imho). As there are quite a lot options to create a hash, I think using most common/plain {} and %() depending on situation/author's style is the best.
2)I don't think we need so much consistency in docs to the extend of "use only One True Form", it is against TimToady.
3)Also, I don't think that performance of some construct should be considered as a rule here. Firstly, it is core team thing to care about various syntax forms for basically same semantics to be equal performance-wise. Secondly, we are describing language, not current implementation. If it happens that tomorrow user Foo will bring out a patch that makes my %foo = WeIrD-way (two spaces) before mah hash: one => 1, two => 2 fastest way to create a hash, lets say 100x faster, will we change all docs to use that form? I doubt that. However, we surely have page named "Performance" - if some folks want more speed, it is up to them to go to that page and read various ways to uglify code to make it faster.

@zoffixznet

This comment has been minimized.

Copy link
Contributor

commented Jun 24, 2018

Since {} is considered confusing

By whom exactly? It's a very simple rule and if there's confusion around it, perhaps the rule should be reworded to be clearer.

There are some interesting performance differences, too!

FWIW, there's definitely a bug in there and that performance landscape will change once R#1951 is resolved.

%(), @(), $() is an anti-feature and will be removed in the future

I'm not aware of it being removed. I think you're confusing the coercers with all the magic shortcuts for $/ access; only those will get removed.


The style guide suggests using {} for empty hashes and %() for non-empty hashes.

Why does it need to suggest anything? The docs are written by many programmers, each of whom use different styles, and the language's motto is There's More Than One Way To Do It. It's not overly awful for readers of the docs to be exposed to many features, rather than for the docs to mandate a particular style chosen as the "recommended" one by the 7 people in this discussion.

@rafaelschipiura

This comment has been minimized.

Copy link
Contributor

commented Jun 24, 2018

@zoffixznet I meant the shortcuts for $/ will be removed, I wasn't clear.

And in 6.e I think %() should create an empty Hash.

@abraxxa

This comment has been minimized.

Copy link
Contributor

commented Jun 26, 2018

Why should it make sense to have so many different forms of defining an empty hash?
Perl 5 left the golf era and people realized that readable code is more important than using some super cool, undocumented (mis)feature of the language.
Telling newbies that there are over ten ways to do such a basic thing in Perl 6 won't make it popular or loved.

@rafaelschipiura

This comment has been minimized.

Copy link
Contributor

commented Jun 26, 2018

@abraxxa No need to tell them.

@Juerd

This comment has been minimized.

Copy link
Contributor Author

commented Jun 26, 2018

@abraxxa That discussion is off topic here. We're discussing the documentation, not Perl 6 itself.

@JJ JJ closed this in e232c38 Jun 26, 2018
@abraxxa

This comment has been minimized.

Copy link
Contributor

commented Jun 26, 2018

Not giving a recommendation is not a good solution.

@JJ

This comment has been minimized.

Copy link
Contributor

commented Jun 26, 2018

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
9 participants
You can’t perform that action at this time.