# Version 0.05 alpha $Revision: 1.5 $ $Date: 1999/09/17 14:57:55 $
=head1 TO DO
=over 4
=item *
LIST_CACHE doesn't work with ties to most DBM implementations, because
Memoize tries to save a listref, and DB_File etc. can only store
strings. This should at least be documented. Maybe Memoize could
detect the problem at TIE time and throw a fatal error.
Try out MLDBM here and document it if it works.
=item *
We should extend the benchmarking module to allow

    timethis(main, { MEMOIZED => [ suba, subb ] })

What would this do? It would time C<main> three times, once with
C<suba> and C<subb> unmemoized, twice with them memoized.
Why would you want to do this? By the third set of runs, the memo
tables would be fully populated, so all calls by C<main> to C<suba>
and C<subb> would return immediately. You would be able to see how
much of C<main>'s running time was due to time spent computing in
C<suba> and C<subb>. If that was just a little time, you would know
that optimizing or improving C<suba> and C<subb> would not have a
large effect on the performance of C<main>. But if there was a big
difference, you would know that C<suba> or C<subb> was a good
candidate for optimization if you needed to make C<main> go faster.
Done.
=item *
Perhaps C<memoize> should return a reference to the original function
as well as one to the memoized version? But the programmer could
always construct such a reference themselves, so perhaps it's not
necessary. We save such a reference anyway, so a new package method
could return it on demand even if it wasn't provided by C<memoize>.
We could even bless the new function reference so that it could have
accessor methods for getting to the original function, the options,
the memo table, etc.
Naah.
=item *
The TODISK feature is not ready yet. It will have to be rather
complicated, providing options for which disk method to use (GDBM?
DB_File? Flat file? Storable? User-supplied?) and which stringizing
method to use (FreezeThaw? Marshal? User-supplied?)
Done!
=item *
Maybe an option for automatic expiration of cache values? (`After one
day,' `After five uses,' etc.) Also possibly an option to limit the
number of active entries with automatic LRU expiration.
You have a long note to Mike Cariaso that outlines a good approach
that you sent on 9 April 1999.
What's the timeout stuff going to look like?

    EXPIRE_TIME => time_in_sec
    EXPIRE_USES => num_uses
    MAXENTRIES  => n

perhaps? Is EXPIRE_USES actually useful?
19990916: Memoize::Expire does EXPIRE_TIME and EXPIRE_USES.
MAXENTRIES can come later as a separate module.
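A sketch of what the expiration hook comes out to in practice, assuming
the interface of the Memoize::Expire that ships with Memoize (which
spells the options LIFETIME and NUM_USES rather than EXPIRE_TIME and
EXPIRE_USES); the function name C<slow_double> is invented for
illustration:

```perl
use Memoize;
use Memoize::Expire;

# Entries expire 10 seconds after being stored, or after being
# read twice, whichever comes first.
tie my %cache => 'Memoize::Expire',
    LIFETIME => 10,
    NUM_USES => 2;

our $calls = 0;
sub slow_double { $calls++; return $_[0] * 2 }

# Plug the expiring hash in as the scalar-context cache.
memoize('slow_double', SCALAR_CACHE => [HASH => \%cache]);

print slow_double(21), "\n";   # computed
print slow_double(21), "\n";   # served from the cache
```

The nice part of this design is that expiration policy lives entirely
in the tied hash, so Memoize itself doesn't have to know about it.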
=item *
Put in a better example than C<fibo>. Show an example of a
nonrecursive function that simply takes a long time to run.
C<getpwuid> for example? But this exposes the bug that you can't say
C<memoize('getpwuid')>, so perhaps it's not a very good example.
Well, I did add the ColorToRGB example, but it's still not so good.
These examples need a lot of work. C<factorial> might be a better
example than C<fibo>.
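The shape such an example might take (a sketch only; C<expensive_sum>
is a made-up stand-in for any slow, pure, nonrecursive function):

```perl
use Memoize;

our $work = 0;    # counts how often the expensive body actually runs

# Stand-in for any slow, pure, nonrecursive function
sub expensive_sum {
    my ($n) = @_;
    $work++;
    my $total = 0;
    $total += $_ for 1 .. $n;
    return $total;
}

memoize('expensive_sum');

expensive_sum(1000) for 1 .. 3;    # the body runs only once
```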
=item *
Add more regression tests for normalizers.
=item *
Maybe resolve normalizer function to code-ref at memoize time instead
of at function call time for efficiency? I think there was some
reason not to do this, but I can't remember what it was.
=item *
Add more array value tests to the test suite.
Does it need more now?
=item *
Fix that `Subroutine u redefined ... line 484' message.
Fixed, I think.
=item *
Get rid of any remaining *{$ref}{CODE} or similar magic hashes.
=item *
There should be an option to dump out the memoized values or to
otherwise traverse them.
What for?
Maybe the tied hash interface takes care of this anyway?
=item *
Include an example that caches DNS lookups.
=item *
Make tie for Storable (Memoize::Storable)
A prototype of Memoize::Storable is finished. Test it and add to the
test suite.
Done.
=item *
Make tie for DBI (Memoize::DBI)
=item *
I think there's a bug. See `###BUG'.
=item *
Storable probably can't be done, because it doesn't allow updating.
Maybe a different interface that supports readonly caches fronted by a
writable in-memory cache? A generic tied hash maybe?

    FETCH {
        if (it's in the memory hash) {
            return it
        } elsif (it's in the readonly disk hash) {
            return it
        } else {
            not-there
        }
    }

    STORE {
        put it into the in-memory hash
    }

Maybe `save' and `restore' methods?
It isn't working right because the destructor doesn't get called at
the right time.
This is fixed. `use strict vars' would have caught it immediately. Duh.
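Spelled out as a real tie class, the idea might look like this (a
sketch only; C<ReadThroughCache> is a made-up name, and a real module
would want the remaining tie methods too):

```perl
package ReadThroughCache;

sub TIEHASH {
    my ($class, $readonly) = @_;
    # $readonly is a reference to the (possibly disk-based) hash
    bless { mem => {}, disk => $readonly }, $class;
}

sub FETCH {
    my ($self, $key) = @_;
    return $self->{mem}{$key}  if exists $self->{mem}{$key};
    return $self->{disk}{$key} if exists $self->{disk}{$key};
    return undef;                      # not-there
}

sub STORE {
    my ($self, $key, $value) = @_;
    $self->{mem}{$key} = $value;       # writes never touch the disk hash
}

sub EXISTS {
    my ($self, $key) = @_;
    exists $self->{mem}{$key} || exists $self->{disk}{$key};
}

package main;

my %frozen = (apple => 1);             # pretend this came off the disk
tie my %cache, 'ReadThroughCache', \%frozen;
$cache{banana} = 2;                    # lands in memory only
```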
=item *
Don't forget about a generic interface to Storable-like packages.
=item *
Maybe add in TODISK after all, with TODISK => 'filename' equivalent to

    SCALAR_CACHE => [TIE, Memoize::SDBM_File, $filename, O_RDWR|O_CREAT, 0666],
    LIST_CACHE => MERGE

=item *
Maybe the default for LIST_CACHE should be MERGE anyway.
=item *
There's some terrible bug probably related to use under threaded perl,
possibly connected with line 56:

    my $wrapper = eval "sub { unshift \@_, qq{$cref}; goto &_memoizer; }";

I think because C<@_> is lexically scoped in threaded perl, the effect of
C<unshift> never makes it into C<_memoizer>. That's probably a bug in
Perl, but maybe I should work around it. Can anyone provide more
information here, or lend me a machine with threaded Perl where I can
test this theory? Line 59, currently commented out, may fix the
problem.
=item *
Maybe if the original function has a prototype, the module can use
that to select the most appropriate default normalizer. For example,
if the prototype was C<($)>, there's no reason to use `join'. If it's
C<(\@)> then it can use C<join $;, @{$_[0]};> instead of C<join $;, @_;>.
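For comparison, here is what a hand-rolled normalizer looks like with
the current interface (a sketch; C<square> is invented). For a function
known to take exactly one scalar, the argument itself is already a fine
hash key, so there's no need to pay for C<join> at all:

```perl
use Memoize;

our $runs = 0;
sub square { $runs++; $_[0] * $_[0] }

# One-argument function: skip join($;, @_) and key on the argument itself.
memoize('square', NORMALIZER => sub { $_[0] });

square(7) for 1 .. 5;    # computed once, then served from the cache
```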
=item *
Ariel Scolnikov suggests using the change counting problem as an
example. (How many ways to make change of a dollar?)
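It memoizes nicely, too, because the naive recursion recomputes the
same (amount, coin-index) pairs over and over. A sketch:

```perl
use Memoize;

my @coins = (1, 5, 10, 25, 50);

# Number of ways to make $amount using coins from index $i onward
sub change {
    my ($amount, $i) = @_;
    return 1 if $amount == 0;
    return 0 if $amount < 0 or $i > $#coins;
    return change($amount - $coins[$i], $i)    # use coin $i at least once
         + change($amount, $i + 1);            # skip coin $i entirely
}

memoize('change');

print change(100, 0), "\n";    # ways to make change of a dollar: 292
```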
=item *
I found a use for `unmemoize'. If you're using the Storable glue, and
your program gets SIGINT, you find that the cached data never reaches
the disk file, because Perl normally writes it all out at once from a
DESTROY method, and signals skip DESTROY processing. So you could add

    $SIG{INT} = sub { unmemoize ... };

(Jonathan Roy pointed this out.)
=item *
This means it would be useful to have a method to return references to
all the currently-memoized functions so that you could say

    $SIG{INT} = sub {
        for $f (Memoize->all_memoized) {
            unmemoize $f;
        }
    };

=item *
19990917 There should be a call you can make to get back the cache
itself. If there were, then you could delete stuff from it to
manually expire data items.
=item *
There was probably some other stuff that I forgot.
=back