Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Introduce a dictionary type #184

Closed
masak opened this issue Aug 26, 2016 · 17 comments
Closed

Introduce a dictionary type #184

masak opened this issue Aug 26, 2016 · 17 comments

Comments

@masak
Copy link
Owner

masak commented Aug 26, 2016

As fell out of the discussion in #144.

I'm going to be difficult and say that we should probably aim for making the dictionaries take any hashable key, not just string keys. (Like in Python.) Without that extra requirement, I would slap a "low-hanging fruit" label on this issue, but with it, it's a wee bit bigger.

Why do I want that? Because experience shows that hashes are central to many other things one might want to build, and allowing anything-hashable makes some code a lot nicer. Sets, bags, graphs come to mind. And I know more than one reason to want to key on tuples/enums, once we get them. So the high road it is.

What should "hashable" mean for the purpose of this issue? Let's go with "has a hash method". Maybe some requirements should be placed on that method, but that can probably wait. Also, once 007 has interfaces, then Hashable should probably be one. But all in due time.

A number of methods that currently sit on Val::Object (fsvo "sit") ought to move to and/or be rethought for Val::Hash:

  • get
  • keys
  • has
  • update/extend

Some of these should maybe be available in some form on Val::Object too? For example, Q::Postfix::Property in runtime.007 is currently implemented as a hash indexing operation:

        Q::Postfix::Property(op) {
            return eval(op.operand)[eval(op.property)];
        },

This is clearly leaning on the "objects-as-hashes" confusion we're about to erase — in other words, there's already a relevant use case in user code that we don't want to erase completely.

I don't feel ready to propose a solution yet. Just noting that sometimes it's useful to treat an object as something slightly hash-like.

@masak
Copy link
Owner Author

masak commented Sep 2, 2016

Before we address this issue, we should do some thinking about how we want the syntax to look. The property shorthands like varname and method() {} are useful for object literals; less so for dictionary literals.

Worse, the object literal syntax builds on object properties having string keys (which they do, and still will), but for dictionaries we will need something that doesn't assume string keys.

This, in the end, is why Python doesn't have autoquoted keys, I suspect.

@masak
Copy link
Owner Author

masak commented Sep 8, 2016

I'm going to be difficult and say that we should probably aim for making the dictionaries take any hashable key, not just string keys. (Like in Python.)

This is easier than it might first seem. We don't have all that many hashable built-in types:

  • Val::Bool
  • Val::Int
  • Val::Str
  • Val::Type
  • (Edit: now also Val::NoneType)

Hm, Val::Sub, Val::Macro, Val::Regex and Val::Exception are sort of on the fence. We could perhaps make them hashable, but it might not be in their best interest to be hashable, so let's conservatively make them not-hashable for now.

The biggest thing to watch out for is basically that 5 won't hash the same way as "5".

@vendethiel
Copy link
Collaborator

vendethiel commented Sep 8, 2016

Just as a data point...

In Ruby, hash keys can be anything. However you might need to manually rehash if you change properties of the key:

a = [ "a", "b" ]
c = [ "c", "d" ]
h = { a => 100, c => 300 }
h[a]       #=> 100
a[0] = "z"
h[a]       #=> nil
h.rehash   #=> {["z", "b"]=>100, ["c", "d"]=>300}
h[a]       #=> 100

@masak
Copy link
Owner Author

masak commented Sep 8, 2016

In Ruby, hash keys can be anything. However you might need to manually rehash

Interesting. Perl 6 could essentially also have a .rehash method if it wanted, I guess.

Putting my language designer hat on, I think 007 has enough Python in it to want to disallow up front rather than go the route of re-hashing when things mutate. I grant that the question comes down mostly to taste.

@masak
Copy link
Owner Author

masak commented Oct 29, 2016

I'm going to be difficult and say that we should probably aim for making the dictionaries take any hashable key, not just string keys. (Like in Python.) Without that extra requirement, I would slap a "low-hanging fruit" label on this issue, but with it, it's a wee bit bigger.

Vide this tweet and this tweet.

@masak
Copy link
Owner Author

masak commented Nov 1, 2016

Here's a summary of this issue, and an attempt to enumerate the forces that apply to the design. If you feel cognitive dissonance reading the below criteria, it's because they're mutually opposing in many cases.

  • Consistency. It'd be nice if we could use "the same" syntax for object terms and for dictionary terms. For example, we'll use curly braces ({}) and colons (:) in both, because that's what people expect. And we either want to autoquote everywhere, or nowhere.
  • Autoquoting. Makes sense for string-key data structures (like objects), but not for data structures where the keys can be anything hashable (like Python dictionaries).
  • Shorthands. The varname and method () {} shorthands (a) only really make sense on objects, and (b) kind of half-rely on autoquoting because that's when the sugar comes out nice.
  • Convertibility. It's perfectly reasonable, and sometimes even desirable, to render an object as a hash, in order to do more dynamic lookup on it, or process it in some other way. Converting from hash to object would also be possible, provided the hash only had string keys.
  • Hashability. Hashing only on strings is simple but arbitrary. (It could be seen as an aspect of stringly typed code.) Allowing hashing on anything which is immutable enough to function as a hash key makes a lot more sense, and allows for the easy implementation of other useful data structures, such as Set and Bag. (Those can be implemented with just string-key hashes, too, but the complexity of hashing other things would bubble up in the implementation of those data structures. The same problem would need to be solved over and over again.)

Looking at the above, the path forward looks quite straightforward after all:

  • Keep auto-quoting keys in object terms.
  • Disable auto-quoting keys in hashes (losing full Consistency, but that's the only one we lose).
  • Keep the shorthands in object terms.
  • Let objects be convertible to hashes through a method.
  • Hash on anything hashable.

Also, it would mean the proposed syntax is now:

my obj = new Q::Literal::Int {
    value: 42
};    # exactly the same as before, including auto-quoted keys if you want

my dict = {
    "value": 42
};    # the quotes are mandatory here
hash = obj.Dictionary();    # proposed conversion method syntax

I started out calling the above type Hash, but the conversion method .Hash() ends up looking very much like the hashing method .hash(), which isn't good. So Python ends up bequeathing the name "dictionary"/"dict" to 007 as well. Can't say I care much for that name, but the one upside is that it'll be easier to motivate why 007 dictionaries are so similar to Python dictionaries.

@masak masak changed the title Introduce a hash/dictionary type Introduce a dictionary type Nov 2, 2016
@masak
Copy link
Owner Author

masak commented Nov 2, 2016

The .update and .extend methods would move over to dictionaries. Well, .update could keep existing on objects too, but I don't know if there's a strong use case for it. Maybe converting-to-dictionary, then updating, then converting back is in some sense clearer and a better separation of concerns.

@masak
Copy link
Owner Author

masak commented Nov 4, 2016

What should looping over a dictionary do?

In Python, it loops over the keys:

$ python3 -c'
d = { "a":1, "b":2, "c": 3 }
for k in d:
    print(k)'
c
a
b

In Perl 6, it loops over the pairs:

$ perl6 -e'my %h = a => 1, b => 2, c => 3; .say for %h'
a => 1
c => 3
b => 2

In Perl 5, it loops over a flattened list of keys and values:

$ perl -Mstrict -wE'my %h = (a => 1, b => 2, c => 3); say for %h'
b
2
c
3
a
1

Finally in Java, you can't loop over a Map directly, but you instead have to specify whether to loop over keys, values, or entries (pairs).

Since the designs are all over the place, I'm left to my own judgement on this. I kinda like Perl 6's approach, because it gives you both key and value without judging. Together with some kind of destructuring, that's quite enough. But I hadn't planned on giving 007 pairs, at least not at this point. Tuples we may get, but also not quite yet. (Update: Got 'em.)

So I think I'll go with the Java approach, which has the advantage of being the most boring and conservative one. At least then it'll be easy to upgrade to something nicer later.

@masak
Copy link
Owner Author

masak commented Nov 4, 2016

Converting from hash to object would also be possible, provided the hash only had string keys.

I suddenly realized that the main use case for this would be the .create method; see #208. For example, runtime.007 wants to do this when implementing Q::Term::Object.eval, because its task is exactly "I have this bunch of key/value mappings (a dictionary), make me an object from that".

@masak
Copy link
Owner Author

masak commented Nov 6, 2016

I think dictionaries (like Val::Object and like Python dictionaries, but unlike Perl 5/6 hashes) should throw an exception if lookup fails. But I'm open to having one of them get() methods (with the second default being non-optional in 007's case, because we don't have optional parameters by default).

@masak
Copy link
Owner Author

masak commented Jan 3, 2017

I added the "needs-more-design" label to this, because when I picture the {} and the new Q {} syntax co-existing, both with braces but each with its own rules about key quoting and stuff, it feels... hard to use. Program authors would trip up on it. I would trip up on it.

So we probably need two slightly more different syntaxes to distinguish them. Needs more design.

In completely unrelated news, back in November and December I started a branch for this issue, and got more or less stuck — basically the kind of stuck brought on by overwhelming changes without a really clear direction.

@masak
Copy link
Owner Author

masak commented Aug 25, 2017

I'm going to be difficult and say that we should probably aim for making the dictionaries take any hashable key, not just string keys. (Like in Python.) Without that extra requirement, I would slap a "low-hanging fruit" label on this issue, but with it, it's a wee bit bigger.

I think this early requirement might have been a mistake. Even though using any hashable type for keys is nice, language-wise, the cost for it is pretty big.

Notably, the syntax

{ foo: 42 }

in a language that only supports strings as keys is "free" to assume that foo is a string key here, like so:

{ "foo": 42 }

(Python, on the other hand, can't.)

So firstly, it's nice to be able to skip quotes on dictionary keys.

Secondly, the above syntax is strangely consistent with the short-hand:

{ foo }

For

{ foo: foo }

That is

{ "foo": foo }

There is some other syntactic bit here that's also bugging me about not being able to to assume that keys are strings... related to object creation. But I can't quite put it into words now.

@masak
Copy link
Owner Author

masak commented Aug 25, 2017

Why do I want that? Because experience shows that hashes are central to many other things one might want to build, and allowing anything-hashable makes some code a lot nicer. Sets, bags, graphs come to mind. And I know more than one reason to want to key on tuples/enums, once we get them.

I mean, this is all still true, it's just that the language design cost is higher than I expected.

@masak
Copy link
Owner Author

masak commented Sep 6, 2017

Oh, hey — add Object (after #144) as a hashable type, too. Trivially hashable, even. It's counter to how people will probably use Object instances most of the time... but they are hashable and if someone wants to, they could.

@masak
Copy link
Owner Author

masak commented Feb 17, 2018

Oh, hey — add Object (after #144) as a hashable type, too.

This... does not seem like a good idea. Not sure what I was thinking. 😄 If we make Object hashable, then everything is suddenly hashable.

@masak
Copy link
Owner Author

masak commented Aug 8, 2018

So we probably need two slightly more different syntaxes to distinguish them. Needs more design.

This seems to be resolving itself now, since we're moving in the direction of #307 and #308. Sure, with #299 in the mix we'll still have the colon in both syntaxes, but they'll be differentiated enough I think by the different contexts they occur in.

Removing the needs-more-design label.

masak pushed a commit that referenced this issue Aug 8, 2018
It's more like Dict already than like Object.

There are a few things to straighten out here that need to be
fixed before this is mergeable to master -- notably, the whole
`new` syntax needs to work with objects, not with dicts.

But this is an encouraging start.

Closes #184.
masak pushed a commit that referenced this issue Jan 16, 2019
It's more like Dict already than like Object.

There are a few things to straighten out here that need to be
fixed before this is mergeable to master -- notably, the whole
`new` syntax needs to work with objects, not with dicts.

But this is an encouraging start.

Closes #184.
@masak masak mentioned this issue Feb 8, 2019
@masak
Copy link
Owner Author

masak commented Feb 9, 2019

It just occurred to me that we might be able to steal Perl 6's :{} syntax directly for hashes/dictionaries that take any hashable key, not just strings. Just a thought.

It's possible that this syntax need not be on my default. Maybe one can import it via a module.

masak pushed a commit that referenced this issue Feb 11, 2019
It's more like Dict already than like Object.

There are a few things to straighten out here that need to be
fixed before this is mergeable to master -- notably, the whole
`new` syntax needs to work with objects, not with dicts.

But this is an encouraging start.

Closes #184.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants