Commit 6c41a4b

Start rewriting of the hashmap language page
Follows #1682 advice of moving the top-heavy part of the Hash type to this page. That part has been rewritten in similarity to Array, by making references to its general behavior using parallelisms between them. This also goes towards fulfillment of the roadmap #114
1 parent 4a16209 commit 6c41a4b

2 files changed

+297
-300
lines changed

doc/Language/hashmap.pod6

Lines changed: 292 additions & 6 deletions
@@ -8,17 +8,303 @@
88
99
TBD
1010
11-
=head1 Mutability of associative data structures
11+
A Hash is a mutable mapping from keys to values (called I<dictionary>,
12+
I<hash table> or I<map> in other programming languages). The values are
13+
all scalar containers, which means you can assign to them.
1214
13-
TBD
15+
Hashes are usually stored in variables with the percent C<%> sigil.
1416
15-
=head1 Working with associative data structures
17+
Hash elements are accessed by key via the C<{ }> postcircumfix operator:
1618
17-
TBD
19+
say %*ENV{'HOME', 'PATH'}.perl;
20+
# OUTPUT: «("/home/camelia", "/usr/bin:/sbin:/bin")␤»
1821
19-
=head1 Defining your own associative structures
22+
The general L<Subscript|/language/subscripts> rules apply, providing shortcuts
23+
for lists of literal strings, with and without interpolation.
2024
21-
TBD
25+
my %h = oranges => 'round', bananas => 'bendy';
26+
say %h<oranges bananas>;
27+
# OUTPUT: «(round bendy)␤»
28+
29+
my $fruit = 'bananas';
30+
say %h«oranges "$fruit"»;
31+
# OUTPUT: «(round bendy)␤»
32+
33+
You can add new pairs simply by assigning to an unused key:
34+
35+
my %h;
36+
%h{'new key'} = 'new value';
37+
38+
=head1 Hash assignment
39+
40+
Assigning a list of elements to a hash variable first empties the variable,
41+
and then iterates the elements of the right-hand side. If an element is a
42+
L<Pair>, its key is taken as a new hash key, and its value as the new hash
43+
value for that key. Otherwise the value is coerced to L<Str> and used as a
44+
hash key, while the next element of the list is taken as the corresponding
45+
value.
46+
47+
my %h = 'a', 'b', c => 'd', 'e', 'f';
48+
# same as
49+
my %h = a => 'b', c => 'd', e => 'f';
50+
# or
51+
my %h = <a b c d e f>;
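
As a minimal sketch of the coercion rule described above (the variable name
C<%n> is only illustrative and not part of the original text), keys that are
not given as a L<Pair> end up as L<Str>, even when they were written as integers:

=begin code
my %n = 1, 'one', 2, 'two';
say %n.keys.sort;        # OUTPUT: «(1 2)␤»
say %n.keys[0].^name;    # OUTPUT: «Str␤»
say %n{1};               # the lookup key is coerced to Str as well; OUTPUT: «one␤»
=end code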
52+
53+
If a L<Pair> is encountered where a value is expected, it is used as a
54+
hash value:
55+
56+
my %h = 'a', 'b' => 'c';
57+
say %h<a>.^name; # OUTPUT: «Pair␤»
58+
say %h<a>.key; # OUTPUT: «b␤»
59+
60+
If the same key appears more than once, the value associated with its last
61+
occurrence is stored in the hash:
62+
63+
my %h = a => 1, a => 2;
64+
say %h<a>; # OUTPUT: «2␤»
65+
66+
To assign a hash to a variable which does not have the C<%> sigil, you may use the C<%()> hash
67+
constructor:
68+
69+
my $h = %( a => 1, b => 2 );
70+
say $h.^name; # OUTPUT: «Hash␤»
71+
say $h<a>; # OUTPUT: «1␤»
72+
73+
B<NOTE:> Hashes can also be constructed with C<{ }>.
74+
It is recommended to use the C<%()> hash constructor, as braces
75+
are reserved for creating L<Block|/type/Block> objects and therefore braces
76+
only create Hash objects in specific circumstances.
77+
78+
If one or more values reference the topic variable, C<$_>, the
79+
right-hand side of the assignment will be interpreted as a L<Block|/type/Block>,
80+
not a Hash:
81+
82+
=begin code :skip-test
83+
my @people = [
84+
%( id => "1A", firstName => "Andy", lastName => "Adams" ),
85+
%( id => "2B", firstName => "Beth", lastName => "Burke" ),
86+
# ...
87+
];
88+
89+
sub lookup-user (Hash $h) { #`(Do something...) $h }
90+
91+
my @names = map -> $person {
92+
# While this creates a hash:
93+
my $query = { name => "$person<firstName> $person<lastName>" };
94+
say $query.^name; # OUTPUT: «Hash␤»
95+
96+
# Doing this will create a Block. Oh no!
97+
my $query2 = { name => "$_<firstName> $_<lastName>" };
98+
say $query2.^name; # OUTPUT: «Block␤»
99+
say $query2<name>; # fails
100+
101+
CATCH { default { put .^name, ': ', .Str } };
102+
# OUTPUT: «X::AdHoc: Type Block does not support associative indexing.␤»
103+
lookup-user($query2); # Type check failed in binding $h; expected Hash but got Block
104+
}, @people;
105+
=end code
106+
107+
This would have been avoided if you had used the C<%()> hash constructor.
108+
Only use curly braces for creating Blocks.
109+
110+
=head2 Slices
111+
112+
You can assign to multiple keys at the same time with a slice.
113+
114+
my %h; %h<a b c> = 2 xx *; %h.perl.say; # OUTPUT: «{:a(2), :b(2), :c(2)}␤»
115+
my %h; %h<a b c> = ^3; %h.perl.say; # OUTPUT: «{:a(0), :b(1), :c(2)}␤»
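
Slices can be used for reading as well as for assignment. A small sketch,
in which C<%s>, C<$x> and C<$z> are illustrative names:

=begin code
my %s = a => 0, b => 1, c => 2;
say %s<a c>;              # OUTPUT: «(0 2)␤»
my ($x, $z) = %s<a c>;    # unpack the slice into two scalars
say $x + $z;              # OUTPUT: «2␤»
=end code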
116+
117+
=head2 Non-string keys (object hash)
118+
119+
X<|non-string keys>
120+
X<|object hash>
121+
X<|:{}>
122+
123+
By default keys in C<{ }> are forced to strings. To compose a hash with
124+
non-string keys, use a colon prefix:
125+
126+
my $when = :{ (now) => "Instant", (DateTime.now) => "DateTime" };
127+
128+
Note that with objects as keys, you often cannot use the C«<...>» construct
129+
for key lookup, as it creates only strings and
130+
L<allomorphs|/language/glossary#index-entry-Allomorph>. Use the C«{...}»
131+
instead:
132+
133+
:{ 0 => 42 }<0>.say; # Int as key, IntStr in lookup; OUTPUT: «(Any)␤»
134+
:{ 0 => 42 }{0}.say; # Int as key, Int in lookup; OUTPUT: «42␤»
135+
:{ '0' => 42 }<0>.say; # Str as key, IntStr in lookup; OUTPUT: «(Any)␤»
136+
:{ '0' => 42 }{'0'}.say; # Str as key, Str in lookup; OUTPUT: «42␤»
137+
:{ <0> => 42 }<0>.say; # IntStr as key, IntStr in lookup; OUTPUT: «42␤»
138+
139+
Note: The Rakudo implementation currently erroneously applies
140+
L<the same rules|/routine/{ }#(Operators)_term_{_}> for C<:{ }> as it does for C<{ }>
141+
and can construct a L<Block> in certain circumstances. To avoid that, you can
142+
instantiate a parameterized Hash directly. Parameterization of C<%>-sigiled variables
143+
is also supported:
144+
145+
my Num %foo1 = "0" => 0e0; # Str keys and Num values
146+
my %foo2{Int} = 0 => "x"; # Int keys and Any values
147+
my Num %foo3{Int} = 0 => 0e0; # Int keys and Num values
148+
Hash[Num,Int].new: 0, 0e0; # Int keys and Num values
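
To check which types a parameterized hash was declared with, you can query its
C<.keyof> and C<.of> methods. The following is only a sketch; C<%temp> is an
illustrative name:

=begin code
my Num %temp{Int};          # Int keys and Num values
%temp{3} = 21.5e0;
say %temp.keyof;            # OUTPUT: «(Int)␤»
say %temp.of;               # OUTPUT: «(Num)␤»
say %temp.keys[0].^name;    # the key is kept as an Int; OUTPUT: «Int␤»
=end code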
149+
150+
Now if you want to define a hash to preserve the objects you are using
151+
as keys I<as the B<exact> objects you are providing to the hash to use as keys>,
152+
then object hashes are what you are looking for.
153+
154+
my %intervals{Instant};
155+
my $first-instant = now;
156+
%intervals{ $first-instant } = "Our first milestone.";
157+
sleep 1;
158+
my $second-instant = now;
159+
%intervals{ $second-instant } = "Logging this Instant for spurious raisins.";
160+
for %intervals.sort -> (:$key, :$value) {
161+
state $last-instant //= $key;
162+
say "We noted '$value' at $key, with an interval of {$key - $last-instant}";
163+
$last-instant = $key;
164+
}
165+
166+
This example uses an object hash that only accepts keys of type L<Instant> to
167+
implement a rudimentary, yet type-safe, logging mechanism. We utilize a named
168+
L<state|/language/variables#The_state_Declarator>
169+
variable for keeping track of the previous C<Instant> so that we can provide an interval.
170+
171+
The whole point of object hashes is to keep keys as objects-in-themselves.
172+
Currently object hashes utilize the L<WHICH|/routine/WHICH> method of an object, which returns a
173+
unique identifier for every mutable object. This is the keystone upon which the object
174+
identity operator (L<===>) rests. Order and containers really matter here as the order of
175+
C<.keys> is undefined and one anonymous list is never L<===> to another.
176+
177+
my %intervals{Instant};
178+
my $first-instant = now;
179+
%intervals{ $first-instant } = "Our first milestone.";
180+
sleep 1;
181+
my $second-instant = now;
182+
%intervals{ $second-instant } = "Logging this Instant for spurious raisins.";
183+
say ($first-instant, $second-instant) ~~ %intervals.keys; # OUTPUT: «False␤»
184+
say ($first-instant, $second-instant) ~~ %intervals.keys.sort; # OUTPUT: «False␤»
185+
say ($first-instant, $second-instant) === %intervals.keys.sort; # OUTPUT: «False␤»
186+
say $first-instant === %intervals.keys.sort[0]; # OUTPUT: «True␤»
187+
188+
Since C<Instant> defines its own comparison methods, in our example a sort according to
189+
L<cmp> will always provide the earliest instant object as the first element in the L<List>
190+
it returns.
191+
192+
If you would like to accept any object whatsoever in your hash, you can use L<Any>!
193+
194+
my %h{Any};
195+
%h{(now)} = "This is an Instant";
196+
%h{(DateTime.now)} = "This is a DateTime, which is not an Instant";
197+
%h{"completely different"} = "Monty Python references are neither DateTimes nor Instants";
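
One consequence of keying on the objects themselves, sketched below with the
illustrative names C<%plain> and C<%typed>, is that keys which collapse to the
same string in an ordinary hash stay distinct in an object hash:

=begin code
my %plain;
%plain{1}   = 'Int key';
%plain{'1'} = 'Str key';    # same string key, overwrites the first entry
say %plain.elems;           # OUTPUT: «1␤»

my %typed{Any};
%typed{1}   = 'Int key';
%typed{'1'} = 'Str key';    # Int 1 and Str '1' are different objects
say %typed.elems;           # OUTPUT: «2␤»
say %typed{1};              # OUTPUT: «Int key␤»
=end code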
198+
199+
There is a more concise syntax which uses binding.
200+
201+
my %h := :{ (now) => "Instant", (DateTime.now) => "DateTime" };
202+
203+
The binding is necessary because an object hash is about very solid, specific objects,
204+
which is something that binding is great at keeping track of but about which assignment doesn't
205+
concern itself much.
206+
207+
=head2 Constrain value types
208+
209+
Place a type object in-between the declarator and the name to constrain the type
210+
of all values of a C<Hash>. Use a L<subset|/language/typesystem#subset> for
211+
constraints with a where-clause.
212+
213+
subset Powerful of Int where * > 9000;
214+
my Powerful %h{Str};
215+
put %h<Goku> = 9001;
216+
try {
217+
%h<Vegeta> = 900;
218+
CATCH { when X::TypeCheck::Binding { .message.put } }
219+
}
220+
221+
# OUTPUT:
222+
# 9001
223+
# Type check failed in binding assignval; expected Powerful but got Int (900)
224+
225+
=head1 Looping over hash keys and values
226+
227+
A common idiom for processing the elements in a hash is to loop over the
228+
keys and values, for instance,
229+
230+
my %vowels = 'a' => 1, 'e' => 2, 'i' => 3, 'o' => 4, 'u' => 5;
231+
for %vowels.kv -> $vowel, $index {
232+
"$vowel: $index".say;
233+
}
234+
235+
gives output similar to this:
236+
237+
=for code :skip-test
238+
a: 1
239+
e: 2
240+
o: 4
241+
u: 5
242+
i: 3
243+
244+
where we have used the C<kv> method to extract the keys and their respective
245+
values from the hash, so that we can pass these values into the loop.
246+
247+
Note that the order of the keys and values printed cannot be relied upon;
248+
the elements of a hash are not always stored the same way in memory for
249+
different runs of the same program. Sometimes one wishes to process the
250+
elements sorted on, e.g. the keys of the hash. If one wishes to print the
251+
list of vowels in alphabetical order then one would write
252+
253+
my %vowels = 'a' => 1, 'e' => 2, 'i' => 3, 'o' => 4, 'u' => 5;
254+
for %vowels.sort(*.key)>>.kv -> ($vowel, $index) {
255+
"$vowel: $index".say;
256+
}
257+
258+
which prints
259+
260+
=for code :skip-test
261+
a: 1
262+
e: 2
263+
i: 3
264+
o: 4
265+
u: 5
266+
267+
and is in alphabetical order as desired. To achieve this result, we sorted
268+
the hash of vowels by key (C<%vowels.sort(*.key)>) and then asked each resulting
269+
L<Pair> for its key and value by applying the C<.kv> method via the unary
270+
C< >> > hyperoperator, which yields a L<List> of key/value lists. To unpack
271+
each key/value list, the loop variables need to be wrapped in parentheses.
272+
273+
An alternative solution is to flatten the resulting list. Then the key/value
274+
pairs can be accessed in the same way as with plain C<.kv>:
275+
276+
my %vowels = 'a' => 1, 'e' => 2, 'i' => 3, 'o' => 4, 'u' => 5;
277+
for %vowels.sort(*.key)>>.kv.flat -> $vowel, $index {
278+
"$vowel: $index".say;
279+
}
280+
281+
You can also loop over a C<Hash> using
282+
L<destructuring|/type/Signature#Destructuring_Parameters>.
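
As a sketch of the destructuring approach, the sorted vowels can be unpacked
directly from each L<Pair> in the block signature, following the same pattern as
the C<%intervals> loop earlier on this page:

=begin code
my %vowels = 'a' => 1, 'e' => 2, 'i' => 3, 'o' => 4, 'u' => 5;
for %vowels.sort(*.key) -> (:$key, :$value) {
    say "$key: $value";
}
# OUTPUT: «a: 1␤e: 2␤i: 3␤o: 4␤u: 5␤»
=end code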
283+
284+
=head2 In place editing of values
285+
286+
There may be times when you would like to modify the values of a hash while iterating over them.
287+
288+
my %answers = illuminatus => 23, hitchhikers => 42;
289+
# %answers holds: hitchhikers => 42, illuminatus => 23
290+
for %answers.values -> $v { $v += 10 }; # Fails
291+
CATCH { default { put .^name, ': ', .Str } };
292+
# OUTPUT: «X::AdHoc: Cannot assign to a readonly variable or a value␤»
293+
294+
This is traditionally accomplished by iterating over both the key and the
295+
value, and assigning the new value back through the key, as follows.
296+
297+
my %answers = illuminatus => 23, hitchhikers => 42;
298+
for %answers.kv -> $k,$v { %answers{$k} = $v + 10 };
299+
300+
However, it is possible to leverage the signature of the block in order to
301+
specify that you would like read-write access to the values.
302+
303+
my %answers = illuminatus => 23, hitchhikers => 42;
304+
for %answers.values -> $v is rw { $v += 10 };
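
A shorter spelling of the same idea, shown here only as a sketch, is the
double-ended arrow C«<->», which marks the block parameter as read-write:

=begin code
my %answers = illuminatus => 23, hitchhikers => 42;
for %answers.values <-> $v { $v += 10 };
say %answers<illuminatus>;   # OUTPUT: «33␤»
say %answers<hitchhikers>;   # OUTPUT: «52␤»
=end code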
305+
306+
It is not, however, possible to do in-place editing of hash keys, even in the
307+
case of object hashes.
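
A common workaround, sketched here with the illustrative key name C<guide>, is
to delete the old key and store its value under the new one. The C<:delete>
adverb returns the value that was removed:

=begin code
my %answers = illuminatus => 23, hitchhikers => 42;
%answers<guide> = %answers<hitchhikers>:delete;   # move the value to a new key
say %answers<guide>;                # OUTPUT: «42␤»
say %answers<hitchhikers>:exists;   # OUTPUT: «False␤»
=end code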
22308
23309
24310
=end pod
