% $Author$
% $Date$
% $Revision$
%=================================================================
\ifx\wholebook\relax\else
% --------------------------------------------
% Lulu:
\documentclass[a4paper,10pt,twoside]{book}
\usepackage[
papersize={6.13in,9.21in},
hmargin={.815in,.815in},
vmargin={.98in,.98in},
ignoreheadfoot
]{geometry}
\input{../common.tex}
\pagestyle{headings}
\setboolean{lulu}{true}
% --------------------------------------------
% A4:
% \documentclass[a4paper,11pt,twoside]{book}
% \input{../common.tex}
% \usepackage{a4wide}
% --------------------------------------------
\graphicspath{{figures/} {../figures/}}
\begin{document}
\renewcommand{\nnbb}[2]{} % Disable editorial comments
\sloppy
\fi
%=================================================================
\chapter{Basic Classes}
\label{cha:basic}
Most of the magic of Smalltalk is not in the language but in the class libraries. To program effectively with Smalltalk, you need to learn how the class libraries support the language and environment. The class libraries are entirely written in Smalltalk and can easily be extended since a package may add new functionality to a class even if it does not define this class.
Our goal here is not to present in tedious detail the whole of the Squeak class library, but rather to point out the key classes and methods that you will need to use or override to program effectively. In this chapter we cover the basic classes that you will need for nearly every application: \ct{Object}, \ct{Number} and its subclasses, \ct{Character}, \ct{String}, \ct{Symbol} and \ct{Boolean}.
\md{Here are some comments:\\
- copying: Good question... the copying in Squeak is much too complicated... there is for one the "old" Smalltalk way of
overriding postCopy, and then the "automatic" deepCopy... which is quite complex and (I think) was not a good idea...
(see class comment in DeepCopier)\\
- Debugging: Yes, needs its own chapter. We should talk about haltIf, haltOnce...\\
- assert: Object>>>assert: can take both a block and a boolean, because Boolean implements \#value.
(I will fix SUnit to allow both, too).\\
- Characters and Strings: we should talk about Unicode stuff... but I don't know too much myself.}
%=================================================================
\section{Object}
For all intents and purposes, \clsindmain{Object} is the root of the inheritance hierarchy. Actually, in Squeak the true root of the hierarchy is \clsind{ProtoObject}, which is used to define minimal entities that masquerade as objects, but we can ignore this point for the time being.
% (more on this later in the chapter on reflection).
\ct{Object} can be found in the \scatind{Kernel-Objects} category. Astonishingly, there are some 400 methods to be found here (including extensions). In other words, every class that you define will automatically provide these 400 methods, whether you know what they do or not. Note that some of these methods are superfluous, and new versions of Squeak may remove them.
\sd{I do not like to quote something that can change and that people can find simply in the image but let us keep it for now.}
The class comment for \ct{Object} states:
\needlines{4}
\begin{quote}
\textit{\ct{Object} is the root class for almost all of the other classes in the class hierarchy. The exceptions are \ct{ProtoObject} (the superclass of \ct{Object}) and its subclasses.
Class \ct{Object} provides default behaviour common to all normal objects, such as access, copying, comparison, error handling, message sending, and \ind{reflection}. Also utility messages that all objects should respond to are defined here.
\ct{Object} has no instance variables, nor should any be added. This is due to several classes of objects that inherit from \ct{Object} that have special implementations (\ct{SmallInteger} and \ct{UndefinedObject} for example) or the VM knows about and depends on the structure and layout of certain standard classes.}
\end{quote}
If we begin to browse the method categories on the instance side of \ct{Object} we start to see some of the key behaviour it provides.
%-----------------------------------------------------------------
\subsection{Printing}
Every object in Smalltalk can return a printed form of itself. You can select any expression in a workspace and choose the \menu{print it} menu item: this executes the expression and asks the returned object to print itself. In fact this sends the message \ct{printString} to the returned object. The method \mthind{Object}{printString}, which is a \ind{template method}, at its core sends the message \mthind{Object}{printOn:} to its receiver. The message \ct{printOn:} is a hook that can be specialized.
\ct{Object>>>printOn:} is very likely one of the methods that you will most frequently override. This method takes as its argument a \clsind{Stream} on which a \clsind{String} representation of the object will be written. The default implementation simply writes the class name preceded by ``\ct{a}'' or ``\ct{an}''. \ct{Object>>>printString} returns the \ct{String} that is written.
For example, the class \clsind{Browser} does not redefine the method \ct{printOn:}, so sending the message \ct{printString} to an instance executes the methods defined in \ct{Object}.
\begin{code}{@TEST}
Browser new printString --> 'a Browser'
\end{code}
The class \ct{TTCFont} shows an example of \mthind{TTCFont}{printOn:}
specialization. It prints the name of the class followed by the family
name, the size and the subfamily name of the font as shown by the code
below which prints an instance of the class.
% \needlines{7}
\begin{method}[zork]{printOn: redefinition.}
TTCFont>>>printOn: aStream
aStream nextPutAll: 'TTCFont(';
nextPutAll: self familyName; space;
print: self pointSize; space;
nextPutAll: self subfamilyName;
nextPut: $)
\end{method}\ignoredollar$
\begin{code}{@TEST}
TTCFont allInstances anyOne printString --> 'TTCFont(BitstreamVeraSans 6 Bold)'
\end{code}
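As a minimal sketch of the same idea for your own classes, suppose a hypothetical class \ct{Account} with an accessor \ct{balance} (neither exists in the image; both names are assumptions made for illustration). One possible override reuses the default form and appends the state of interest:
\begin{method}{A hypothetical \ct{printOn:} override}
Account>>>printOn: aStream
    "Account and its accessor balance are assumed names, used only for illustration."
    super printOn: aStream.
    aStream nextPutAll: ' balance: '; print: self balance
\end{method}
Under these assumptions, an instance whose balance is 100 would print as \ct{an Account balance: 100}.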
Note that the message \ct{printOn:} is not the same as \mthind{Object}{storeOn:}. The message \ct{storeOn:} puts on its argument stream an expression that can be used to recreate the receiver. This expression is evaluated when the stream is read using the message \ct{readFrom:}. \ct{printOn:} just writes a textual version of the receiver on the stream. Of course, this textual representation may happen to represent the receiver as a self-evaluating expression.
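The difference can be sketched in a workspace. The snippet below assumes that \ct{Compiler evaluate:} is used to evaluate the stored expression; the variable name \ct{oc} is arbitrary.
\begin{code}{}
oc := OrderedCollection new add: 1; add: 2; yourself.
oc printString --> 'an OrderedCollection(1 2)'
(Compiler evaluate: oc storeString) = oc --> true
\end{code}
The printed form is meant for the reader, whereas the stored form is an expression that rebuilds an equal collection.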
\paragraph{A word about representation and self-evaluating representation.}
In functional programming, expressions return values when executed. In Smalltalk, messages (expressions) return objects (values). Some objects have the nice property that their value is themselves. For example, the value of the object \ct{true} is itself, \ie the object \ct{true}. We call such objects \emphind{self-evaluating objects}. You can see a \emph{printed} version of an object value when you print the object in a workspace. Here are some examples of such self-evaluating expressions.
\begin{code}{@TEST}
true --> true
3@4 --> 3@4
$a --> $a
#(1 2 3) --> #(1 2 3)
\end{code}
Note that some objects such as arrays are self-evaluating or not depending on the objects they contain. For example, an array of booleans is self-evaluating whereas an array of persons is not. In Squeak 3.9, a mechanism was introduced (via the message \mthind{Object}{isSelfEvaluating}) to print collections in their self-evaluating forms as much as possible and this is especially true for brace arrays. The following example shows that a \subind{Array}{dynamic} array is self-evaluating only if its elements are:
\begin{code}{@TEST}
{10@10 . 100@100} --> {10@10 . 100@100}
{Browser new . 100@100} --> an Array(a Browser 100@100)
\end{code}
Remember that \subind{Array}{literal} arrays can only contain literals. Hence the following array does not contain two points but rather six literal elements.
\begin{code}{@TEST}
#(10@10 100@100) --> #(10 #@ 10 100 #@ 100)
\end{code}
Many \ct{printOn:} specializations produce self-evaluating output. The implementations of \cmind{Point}{printOn:} and \cmind{Interval}{printOn:} are two examples.
\begin{method}[Self-evaluating points]{Self-evaluation of \ct{Point}}
Point>>>printOn: aStream
"The receiver prints on aStream in terms of infix notation."
x printOn: aStream.
aStream nextPut: $@.
y printOn: aStream
\end{method}\ignoredollar$
\begin{method}[Self-evaluating intervals]{Self-evaluation of \ct{Interval}}
Interval>>>printOn: aStream
aStream nextPut: $(;
print: start;
nextPutAll: ' to: ';
print: stop.
step ~= 1 ifTrue: [aStream nextPutAll: ' by: '; print: step].
aStream nextPut: $)
\end{method}
\begin{code}{@TEST}
1 to: 10 --> (1 to: 10) "intervals are self-evaluating"
\end{code}
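One way to check that a printed form is self-evaluating is to evaluate it again and compare the result with the original. This is only a sketch, and it assumes \ct{Compiler evaluate:} as the means of evaluating a string:
\begin{code}{}
(Compiler evaluate: (3@4) printString) = (3@4) --> true
(Compiler evaluate: (1 to: 10) printString) = (1 to: 10) --> true
\end{code}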
%-----------------------------------------------------------------
\subsection{Identity and equality}
In Smalltalk, the message \ct{=} tests object \emphsubindmain{Object}{equality} (\ie whether two objects represent the same value) whereas \ct{==} tests object \emphsubindmain{Object}{identity} (\ie whether two expressions represent the same object).
\seeindex{\ct{=}}{Object, equality}
\seeindex{\ct{==}}{Object, identity}
\seeindex{equality}{Object, equality}
\seeindex{identity}{Object, identity}
The default implementation of object equality is to test for object identity:
\begin{method}{Object equality}
Object>>>= anObject
"Answer whether the receiver and the argument represent the same object.
If = is redefined in any subclass, consider also redefining the message hash."
^ self == anObject
\end{method}
\cmindex{Object}{=}
This is a method that you will frequently want to override. Consider the case of \ct{Complex} numbers:
\begin{code}{@TEST}
(1 + 2 i) = (1 + 2 i) --> true "same value"
(1 + 2 i) == (1 + 2 i) --> false "but different objects"
\end{code}
This works because \ct{Complex} overrides \ct{=} as follows:
\cmindex{Complex}{=}
\needlines{5}
\begin{method}{Equality for complex numbers}
Complex>>>= anObject
anObject isComplex
ifTrue: [^ (real = anObject real) & (imaginary = anObject imaginary)]
ifFalse: [^ anObject adaptToComplex: self andSend: #=]
\end{method}
The default implementation of \ct{Object>>>~=} simply negates \ct{Object>>>=}, and should not normally need to be changed.
%\cmindex{Object}{\~=}
\index{Object!~=@\ct{~=}} % needs special treatment due to ~
\begin{code}{@TEST}
(1 + 2 i) ~= (1 + 4 i) --> true
\end{code}
If you override \ct{=}, you should consider overriding \mthind{Object}{hash}. If instances of your class are ever used as keys in a \clsind{Dictionary}, then you should make sure that instances that are considered to be equal have the same hash value:
\cmindex{Complex}{hash}
\begin{method}{Hash must be reimplemented for complex numbers}
Complex>>>hash
"Hash is reimplemented because = is implemented."
^ real hash bitXor: imaginary hash.
\end{method}
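The same pairing can be sketched for a class of your own. The class \ct{Money} and its instance variables \ct{amount} and \ct{currency} (with matching accessors) are hypothetical and serve only as an illustration:
\begin{method}{A sketch of paired \ct{=} and \ct{hash} overrides}
Money>>>= anObject
    "Money, amount and currency are assumed names, not part of the image."
    ^ self class == anObject class
        and: [ amount = anObject amount
            and: [ currency = anObject currency ] ]

Money>>>hash
    "Instances that are equal must answer the same hash."
    ^ amount hash bitXor: currency hash
\end{method}
The effect of consistent \ct{=} and \ct{hash} methods can be observed with \ct{Complex}: two equal but distinct instances occupy a single slot in a \ct{Set}.
\begin{code}{}
(Set new add: 1 + 2 i; add: 1 + 2 i; yourself) size --> 1
\end{code}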
Although you should override \ct{=} and \ct{hash} together, you should \emph{never} override \ct{==}. (The semantics of object identity is the same for all classes.) \ct{==} is a primitive method of \clsind{ProtoObject}.
Note that Squeak has some strange behaviour compared to other Smalltalks: for example a symbol and a string can be equal. (We consider this to be a bug, not a feature.)
\begin{code}{@TEST}
#'lulu' = 'lulu' --> true
'lulu' = #'lulu' --> true
\end{code}
%-----------------------------------------------------------------
\subsection{Class membership}
Several methods allow you to query the class of an object.
\paragraph{\mthind{Object}{class}.} You can ask any object about its class using the message \ct{class}.
\begin{code}{@TEST}
1 class --> SmallInteger
\end{code}
Conversely, you can ask if an object is an instance of a specific class:
\cmindex{Object}{isMemberOf:}
\begin{code}{@TEST}
1 isMemberOf: SmallInteger --> true "must be precisely this class"
1 isMemberOf: Integer --> false
1 isMemberOf: Number --> false
1 isMemberOf: Object --> false
\end{code}
Since Smalltalk is written in itself, you can really navigate through its structure using the right combination of superclass and class messages (see \charef{metaclasses}).
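As a small illustration of such navigation, the following expressions walk from an integer up its superclass chain and across to its metaclass:
\begin{code}{}
1 class superclass --> Integer
1 class superclass superclass --> Number
1 class class --> SmallInteger class
1 class class class --> Metaclass
\end{code}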
\paragraph{\ct{isKindOf:}}
\cmind{Object}{isKindOf:} answers whether the receiver's class is either the same as, or a subclass of the argument class.
\begin{code}{@TEST}
1 isKindOf: SmallInteger --> true
1 isKindOf: Integer --> true
1 isKindOf: Number --> true
1 isKindOf: Object --> true
1 isKindOf: String --> false
1/3 isKindOf: Number --> true
1/3 isKindOf: Integer --> false
\end{code}
\ct{1/3}, which is a \clsind{Fraction}, is a kind of \clsind{Number}, since the class \ct{Number} is a superclass of the class \ct{Fraction}, but \ct{1/3} is not an \ct{Integer}.
\paragraph{\ct{respondsTo:}}
\cmind{Object}{respondsTo:} answers whether the receiver understands the message selector given as an argument.
\begin{code}{@TEST}
1 respondsTo: #, --> false
\end{code}
Normally it is a bad idea to query an object for its class, or to ask it which messages it understands.
Instead of making decisions based on the class of object, you should simply send a message to the object and let it decide (\ie on the basis of its class) how it should behave.
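The contrast can be sketched as follows. The classes \ct{Circle} and \ct{Square}, their accessors \ct{radius} and \ct{side}, and the message \ct{area} are all hypothetical names chosen for illustration:
\begin{code}{}
"Fragile: the client inspects the class of the receiver"
area := (shape isKindOf: Circle)
    ifTrue: [ shape radius squared * Float pi ]
    ifFalse: [ shape side squared ].

"Preferred: Circle and Square each implement area"
area := shape area.
\end{code}
With the second form, adding a new kind of shape requires only a new \ct{area} method, and no client code needs to change.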
%-----------------------------------------------------------------
\subsection{Copying}
Copying objects introduces some subtle issues. Since instance variables are accessed by reference, a \emphsubind{Object}{shallow copy} of an object would share its references to instance variables with the original object:
\seeindex{copy}{Object, \ct{copy}}
\seeindex{shallow copy}{Object, \ct{shallowCopy}}
\seeindex{deep copy}{Object, \ct{deepCopy}}
\begin{code}{@TEST | a1 a2 |}
a1 := { { 'harry' } }.
a1 --> #(#('harry'))
a2 := a1 shallowCopy.
a2 --> #(#('harry'))
(a1 at: 1) at: 1 put: 'sally'.
a1 --> #(#('sally'))
a2 --> #(#('sally')) "the subarray is shared!"
\end{code}
\cmind{Object}{shallowCopy} is a primitive method that creates a shallow copy of an object. Since \ct{a2} is only a shallow copy of \ct{a1}, the two arrays share a reference to the nested \ct{Array} that they contain.
\ct{Object>>>shallowCopy} is the ``public interface'' to \cmind{Object}{copy} and should be overridden if instances are unique. This is the case, for example, with the classes \clsind{Boolean}, \clsind{Character}, \clsind{SmallInteger}, \clsind{Symbol} and \clsind{UndefinedObject}.
\cmind{Object}{copyTwoLevel} does the obvious thing when a simple shallow copy does not suffice:
\begin{code}{@TEST | a1 a2 |}
a1 := { { 'harry' } } .
a2 := a1 copyTwoLevel.
(a1 at: 1) at: 1 put: 'sally'.
a1 --> #(#('sally'))
a2 --> #(#('harry')) "fully independent state"
\end{code}
\cmind{Object}{deepCopy} makes an arbitrarily deep copy of an object.
\begin{code}{@TEST | a1 a2 |}
a1 := { { { 'harry' } } } .
a2 := a1 deepCopy.
(a1 at: 1) at: 1 put: 'sally'.
a1 --> #(#('sally'))
a2 --> #(#(#('harry')))
\end{code}
The problem with \ct{deepCopy} is that it will not terminate when applied to a mutually recursive structure:
\begin{code}{NB: CANNOT TEST}
a1 := { 'harry' }.
a2 := { a1 }.
a1 at: 1 put: a2.
a1 deepCopy --> !\emph{... does not terminate!}!
\end{code}
% NB: Not a test!
Although it is possible to override \ct{deepCopy} to do the right thing, \cmind{Object}{copy} offers a better solution:
\begin{method}{Copying objects as a template method}
Object>>>copy
"Answer another instance just like the receiver. Subclasses typically override postCopy;
they typically do not override shallowCopy."
^self shallowCopy postCopy
\end{method}
You should override \mthind{Object}{postCopy} to copy any instance variables that should not be shared. \ct{postCopy} should always do a \ct{super postCopy}.
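A minimal sketch, assuming a hypothetical class \ct{Playlist} with an instance variable \ct{songs} that holds a collection, might look like this:
\begin{method}{A hypothetical \ct{postCopy} override}
Playlist>>>postCopy
    "Playlist and songs are assumed names. Give the copy its own collection,
    so that adding a song to the copy does not affect the original."
    super postCopy.
    songs := songs copy
\end{method}
Here the copy gets its own collection, while the songs themselves remain shared between the original and the copy.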
\on{I looked, but did not find a good example in the system.}
%-----------------------------------------------------------------
\subsection{Debugging}
The most important method here is \mthind{Object}{halt}. In order to set a breakpoint in a method, simply insert the message send \ct{self halt} at some point in the body of the method. When this message is sent, execution will be interrupted and a \ind{debugger} will open to this point in your program.
(See \charef{env} for more details about the debugger.)
\sd{in another chapter haltIf:, haltOnce, inspectOnce, flagging: isThisEverCalled, }
The next most important message is \mthind{Object}{assert:}, which takes a \ind{block} as its argument. If the block returns \ct{true}, execution continues. Otherwise an \ct{AssertionFailure} exception will be raised. If this exception is not otherwise caught, the debugger will open to this point in the execution. \ct{assert:} is especially useful to support \emphind{design by contract}. The most typical usage is to check non-trivial pre-conditions to public methods of objects. \cmind{Stack}{pop} could easily have been implemented as follows:
\begin{method}{Checking a pre-condition}
Stack>>>pop
"Return the first element and remove it from the stack."
self assert: [ self isEmpty not ].
^self linkedList removeFirst element
\end{method}
Do not confuse \ct{Object>>>assert:} with \cmind{TestCase}{assert:}, which occurs in the SUnit testing framework (see \charef{SUnit}). While the former expects a block as its argument\footnote{Actually, it will take any argument that understands \ct{value}, including a \ct{Boolean}.}, the latter expects a \clsind{Boolean}. Although both are useful for debugging, they each serve a very different intent.
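The following workspace sketch illustrates \ct{Object>>>assert:} with both kinds of argument, and shows that a failed assertion can be caught like any other exception. The handler string is arbitrary.
\begin{code}{}
self assert: [ 3 > 2 ].    "a block argument"
self assert: 3 > 2.    "a Boolean also understands value"
[ self assert: [ 3 < 2 ] ]
    on: AssertionFailure
    do: [ :ex | 'assertion failed' ] --> 'assertion failed'
\end{code}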
%-----------------------------------------------------------------
\subsection{Error handling}
This protocol contains several methods useful for signaling run-time errors.
Sending \lct{self deprecated: \emph{anExplanationString}} signals that the current method should no longer be used, if deprecation has been turned on in the \protind{debug} protocol of the \ind{preference browser}.
The \ct{String} argument should offer an alternative.
\cmindex{Object}{deprecated:}
\index{deprecation}
\begin{code}{NB: CANNOT TEST}
1 doIfNotNil: [ :arg | arg printString, ' is not nil' ]
--> !\emph{SmallInteger(Object)>>doIfNotNil: has been deprecated. use ifNotNilDo:}!
\end{code}
\ct{doesNotUnderstand:} is sent whenever message lookup fails. The default implementation, \ie \cmind{Object}{doesNotUnderstand:}, will trigger the debugger at this point. It may be useful to override \lct{does\-Not\-Un\-der\-stand:} to provide some other behaviour.
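One common use of such an override is to forward messages to another object. The following is only a sketch: the class \ct{LoggingProxy} and its instance variable \ct{target} are hypothetical.
\begin{method}{A sketch of a forwarding \ct{doesNotUnderstand:}}
LoggingProxy>>>doesNotUnderstand: aMessage
    "LoggingProxy and target are assumed names, used only for illustration."
    Transcript show: 'forwarding ', aMessage selector printString; cr.
    ^ target perform: aMessage selector withArguments: aMessage arguments
\end{method}
Only messages that the proxy does not itself understand reach this method, so a proxy of this kind would typically inherit as little behaviour as possible.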
\on{Add a chapter ref when we write the chapter on exceptions.}
\cmind{Object}{error} and \cmind{Object}{error:} are generic methods that can be used to raise exceptions.
(Generally it is better to raise your own custom exceptions, so you can distinguish errors arising from your code from those coming from kernel classes.)
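As a sketch of this advice, a custom exception can be defined as a subclass of \ct{Error} and caught selectively. The class name \ct{InsufficientFunds}, the category name and the message text are hypothetical.
\begin{code}{}
Error subclass: #InsufficientFunds
    instanceVariableNames: ''
    classVariableNames: ''
    poolDictionaries: ''
    category: 'SBE-Examples'.

[ InsufficientFunds new signal: 'balance too low' ]
    on: InsufficientFunds
    do: [ :ex | ex messageText ] --> 'balance too low'
\end{code}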
\lr{Maybe mention that it is preferred to create your own custom exception class. (p. 208)}
Abstract methods in Smalltalk are implemented by convention with the body \lct{self sub\-class\-Res\-pon\-si\-bi\-li\-ty}. Should an abstract class be instantiated by accident, then calls to abstract methods will result in \cmind{Object}{subclassResponsibility} being evaluated.
\begin{method}{Signaling that a method is abstract}
Object>>>subclassResponsibility
"This message sets up a framework for the behavior of the class' subclasses.
Announce that the subclass should have implemented this message."
self error: 'My subclass should have overridden ', thisContext sender selector printString
\end{method}
\clsind{Magnitude}, \clsind{Number} and \clsind{Boolean} are classical examples of \subind{class}{abstract} classes that we shall see shortly in this chapter.
\begin{code}{NB: CANNOT TEST}
Number new + 1 --> !\emph{Error: My subclass should have overridden \#+}!
\end{code}
\ct{self shouldNotImplement} is sent by convention to signal that an inherited method is not appropriate for this subclass. This is generally a sign that something is not quite right with the design of the class hierarchy. Due to the limitations of single inheritance, however, sometimes it is very hard to avoid such workarounds.
\cmindex{Object}{shouldNotImplement}
\index{inheritance!canceling}
A typical example is \cmind{Collection}{remove:} which is inherited by \clsind{Dictionary} but flagged as not implemented. (A \ct{Dictionary} provides \mthind{Dictionary}{removeKey:} instead.)
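In your own code the convention might be sketched as follows, assuming a hypothetical class \ct{FixedStack} that inherits \ct{addFirst:} from a collection superclass but must not support it:
\begin{method}{Canceling an inherited method (sketch)}
FixedStack>>>addFirst: anObject
    "FixedStack is an assumed name. Signal that this inherited method
    is not appropriate for the receiver."
    ^ self shouldNotImplement
\end{method}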
%-----------------------------------------------------------------
\sd{ subsection{Deprecation} }
\sd{to be done}
\on{There already is some text above! See second paragraph on Error handling.}
%-----------------------------------------------------------------
\subsection{Testing}
The \protind{testing} methods have nothing to do with SUnit testing! A testing method is one that lets you ask a question about the state of the receiver and returns a \clsind{Boolean}.
Numerous testing methods are provided by \ct{Object}. We have already seen \mthind{Object}{isComplex}. Others include \mthind{Object}{isArray}, \mthind{Object}{isBoolean}, \mthind{Object}{isBlock}, \mthind{Object}{isCollection} and so on. Generally such methods are to be avoided since querying an object for its class violates encapsulation. Instead of testing an object for its class, one should simply send a request and let the object decide how to handle it.
Nevertheless some of these testing methods are undeniably useful. The most useful are probably \cmind{ProtoObject}{isNil} and \cmind{Object}{notNil} (though the \patind{Null Object}\cite{Wool98a} design pattern can obviate the need for even these methods).
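A few of these testing messages, together with the related message \ct{ifNil:}, can be tried in a workspace:
\begin{code}{}
nil isNil --> true
nil notNil --> false
'abc' isNil --> false
nil ifNil: [ 'default' ] --> 'default'
'abc' ifNil: [ 'default' ] --> 'abc'
3 isNumber --> true
'abc' isString --> true
\end{code}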
% \footnote{However the \emph{Null Object} design pattern can obviate the need for even these methods. See, Bobby Woolf, ``Null Object,'' Pattern Languages of Program Design 3, Robert Martin, Dirk Riehle and Frank Buschmann (Eds.), pp. 5-18, Addison Wesley, 1998.}.
%-----------------------------------------------------------------
\subsection{Initialize release}
A final key method that occurs not in \ct{Object} but in \ct{ProtoObject} is \mthind{ProtoObject}{initialize}.
\begin{method}{\lct{initialize} as an empty hook method}
ProtoObject>>>initialize
"Subclasses should redefine this method to perform initializations on instance creation"
\end{method}
The reason this is important is that in Squeak as of version 3.9, the default \mthind{Behavior}{new} method defined for every class in the system will send \ct{initialize} to newly created instances.
\begin{method}{\lct{new} as a class-side template method}
Behavior>>>new
"Answer a new initialized instance of the receiver (which is a class) with no indexable
variables. Fail if the class is indexable."
^ self basicNew initialize
\end{method}
\cmindex{Behavior}{new}
This means that simply by overriding the \ct{initialize} \ind{hook method}, new instances of your class will automatically be initialized. The \ct{initialize} method should normally perform a \ct{super initialize} to establish the class \subind{class}{invariant} for any inherited instance variables.
(Note that this is \emph{not} the standard behaviour of other Smalltalks.)
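A minimal sketch of this hook, assuming a hypothetical class \ct{Counter} with an instance variable \ct{count} and a matching accessor:
\begin{method}{A hypothetical \ct{initialize} override}
Counter>>>initialize
    "Counter and count are assumed names, used only for illustration."
    super initialize.
    count := 0
\end{method}
\begin{code}{}
Counter new count --> 0
\end{code}
No class-side \ct{new} method is needed for \ct{Counter}, since the inherited \ct{new} sends \ct{initialize} to each new instance.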
%=================================================================
\section{Numbers}
\label{sec:Number}
Remarkably, numbers in Smalltalk are not primitive data values but true objects. Of course numbers are implemented efficiently in the virtual machine, but the \clsindmain{Number} hierarchy is as perfectly accessible and extensible as any other portion of the Smalltalk class hierarchy.
\begin{figure}[ht]
\centerline {\includegraphics[width=8cm]{NumberHierarchy}}
\caption{The Number Hierarchy \label{fig:numbers}}
\end{figure}
Numbers are found in the \scatind{Kernel-Numbers} category. The abstract root of this hierarchy is \clsind{Magnitude}, which represents all kinds of classes supporting comparison operators. \ct{Number} adds various arithmetic and other operators as mostly abstract methods. \clsind{Float} and \clsind{Fraction} represent, respectively, floating point numbers and fractional values. \clsind{Integer} is also abstract, thus distinguishing between subclasses \clsind{SmallInteger}, \clsind{LargePositiveInteger} and \clsind{LargeNegativeInteger}. For the most part users do not need to be aware of the difference between the three \ct{Integer} classes, as values are automatically converted as needed.
%-----------------------------------------------------------------
\subsection{Magnitude}
\clsindmain{Magnitude} is the parent not only of the \clsind{Number} classes, but also of other classes supporting comparison operations, such as \clsind{Character}, \clsind{Duration} and \clsind{Timespan}. (\clsind{Complex} numbers are not comparable, and so do not inherit from \clsind{Number}.)
Methods \mthind{Magnitude}{<} and \mthind{Magnitude}{=} are abstract. The remaining operators are generically defined. For example:
\begin{method}{Abstract comparison methods}
Magnitude>>> < aMagnitude
"Answer whether the receiver is less than the argument."
^self subclassResponsibility
Magnitude>>> > aMagnitude
"Answer whether the receiver is greater than the argument."
^aMagnitude < self
\end{method}
\cmindex{Magnitude}{>}
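Because the generic operations are defined in \ct{Magnitude}, they are available to every subclass, numbers and characters alike. A few examples:
\begin{code}{}
3 max: 5 --> 5
3 min: 5 --> 3
3 between: 1 and: 5 --> true
$a max: $b --> $b
$c between: $a and: $z --> true
\end{code}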
%-----------------------------------------------------------------
\subsection{Number}
Similarly, \clsindmain{Number} defines \mthind{Number}{+}, \mthind{Number}{-}, \mthind{Number}{*} and \mthind{Number}{/} to be abstract, but all other arithmetic operators are generically defined.
All \ct{Number} objects support various \emph{converting} operators, such as \mthind{Number}{asFloat} and \mthind{Number}{asInteger}. There are also numerous \emphind{shortcut constructor methods}, such as \mthind{Number}{i}, which converts a \ct{Number} to an instance of \clsind{Complex} with a zero real component, and others which generate \clsindplural{Duration}, such as \mthind{Number}{hour}, \mthind{Number}{day} and \mthind{Number}{week}.
\ct{Number}s directly support common \emph{math functions} such as \mthind{Number}{sin}, \mthind{Number}{log}, \mthind{Number}{raisedTo:}, \mthind{Number}{squared}, \mthind{Number}{sqrt} and so on.
\cmind{Number}{printOn:} is implemented in terms of the abstract method \ct{Number>>>printOn:base:}. (The default base is 10.)
Testing methods include \mthind{Number}{even}, \mthind{Number}{odd}, \mthind{Number}{positive} and \mthind{Number}{negative}. Unsurprisingly \ct{Number} overrides \lct{is\-Num\-ber}. More interestingly, \mthind{Number}{isInfinite} is defined to return \ct{false}.
\emph{Truncation} methods include \mthind{Number}{floor}, \mthind{Number}{ceiling}, \mthind{Number}{integerPart}, \mthind{Number}{fractionPart} and so on.
\begin{code}{@TEST}
1 + 2.5 --> 3.5 "Addition of two numbers"
3.4 * 5 --> 17.0 "Multiplication of two numbers"
8 / 2 --> 4 "Division of two numbers"
10 - 8.3 --> 1.7 "Subtraction of two numbers"
12 = 11 --> false "Equality between two numbers"
12 ~= 11 --> true "Test if two numbers are different"
12 > 9 --> true "Greater than"
12 >= 10 --> true "Greater than or equal to"
12 < 10 --> false "Less than"
100@10 --> 100@10 "Point creation"
\end{code}
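The converting, truncation, testing and math messages mentioned above can be explored in the same way. The following lines are a small sample rather than a complete list:
\begin{code}{}
3.7 floor --> 3
3.2 ceiling --> 4
3.7 truncated --> 3
3.5 rounded --> 4
7 odd --> true
-5 abs --> 5
2 asFloat --> 2.0
4 sqrt --> 2.0
3 squared --> 9
2 raisedTo: 10 --> 1024
255 printString: 16 --> 'FF'
\end{code}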
\on{Should check how tabbing works in the listings package ...}
The following example works surprisingly well in \st:
\begin{code}{@TEST}
1000 factorial / 999 factorial --> 1000
\end{code}
Note that \ct{1000 factorial} is really calculated, which in many other languages can be quite difficult to do. This is an excellent example of automatic coercion and exact handling of a number.
\cmindex{Integer}{factorial}
\dothis{Try to display the result of \ct{1000 factorial}. It takes more time to display it than to calculate it!}
%-----------------------------------------------------------------
\subsection{Float}
\clsindmain{Float} implements the abstract \ct{Number} methods for floating point numbers.
More interestingly, \ct{Float class} (\ie the class-side of \ct{Float}) provides methods to return the following \emph{constants}: \mthind{Float class}{e}, \mthind{Float class}{infinity}, \mthind{Float class}{nan} and \mthind{Float class}{pi}.
\begin{code}{@TEST}
Float pi --> 3.141592653589793
Float infinity --> Infinity
Float infinity isInfinite --> true
\end{code}
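Floating point values follow the usual rules of binary floating point arithmetic, so some care is needed when comparing them. A few illustrative cases:
\begin{code}{}
Float nan isNaN --> true
Float nan = Float nan --> false
Float infinity negated isInfinite --> true
(0.1 + 0.2) = 0.3 --> false
\end{code}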
%-----------------------------------------------------------------
\subsection{Fraction}
\clsindplural{Fraction} are represented by instance variables for the numerator and denominator, which should be \ct{Integer}s. \ct{Fraction}s are normally created by \ct{Integer} division (rather than using the constructor method \cmind{Fraction}{numerator:denominator:}):
\begin{code}{@TEST}
6/8 --> (3/4)
(6/8) class --> Fraction
\end{code}
Multiplying a \ct{Fraction} by an \ct{Integer} or another \ct{Fraction} may yield an \ct{Integer}:
\begin{code}{@TEST}
6/8 * 4 --> 3
\end{code}
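If a floating point result is wanted instead of a \ct{Fraction}, it suffices for one of the operands to be a \ct{Float}, or to convert the result explicitly. A short sketch:
\begin{code}{}
6.0 / 8 --> 0.75
6 asFloat / 8 --> 0.75
(3/4) asFloat --> 0.75
(1/3) + (2/3) --> 1
(3/4) numerator --> 3
\end{code}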
\lr{Maybe mention to avoid fractions in results that one of the operands has to be a float, e.g. 6.0 / 8 or 6 asFloat / 8. (p. 213)}
%-----------------------------------------------------------------
\subsection{Integer}
\clsindmain{Integer} is the abstract parent of three concrete integer implementations. In addition to providing concrete implementations of many abstract \ct{Number} methods, it also adds a few methods specific to integers, such as \mthind{Integer}{factorial}, \mthind{Integer}{atRandom}, \mthind{Integer}{isPrime}, \mthind{Integer}{gcd:} and many others.
\clsindmain{SmallInteger} is special in that its instances are represented compactly --- instead of being stored as a reference, a \ct{SmallInteger} is represented directly using the bits that would otherwise be used to hold a reference. The first bit of an object reference indicates whether the object is a \ct{SmallInteger} or not.
The class methods \mthind{SmallInteger}{minVal} and \mthind{SmallInteger}{maxVal} tell us the range of a \ct{SmallInteger}:
\begin{code}{@TEST}
SmallInteger maxVal = ((2 raisedTo: 30) - 1) --> true
SmallInteger minVal = (2 raisedTo: 30) negated --> true
\end{code}
When a \ct{SmallInteger} goes out of this range, it is automatically converted to a \clsind{LargePositiveInteger} or a \clsind{LargeNegativeInteger}, as needed:
\begin{code}{@TEST}
(SmallInteger maxVal + 1) class --> LargePositiveInteger
(SmallInteger minVal - 1) class --> LargeNegativeInteger
\end{code}
Large integers are similarly converted back to small integers when appropriate.
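As a quick check of this round trip, the following expressions (our own illustration) step just outside the \ct{SmallInteger} range and back again:
\begin{code}{}
(SmallInteger maxVal + 1 - 1) class --> SmallInteger
(SmallInteger minVal - 1 + 1) class --> SmallInteger
\end{code}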
As in most programming languages, integers can be useful for specifying iterative behaviour. There is a dedicated method \mthind{Integer}{timesRepeat:} for evaluating a block repeatedly.
We have already seen a similar example in \charef{syntax}:
\begin{code}{@TEST | n |}
n := 2.
3 timesRepeat: [ n := n*n ].
n --> 256
\end{code}
%=================================================================
\section{Characters}
\clsindmain{Character} is defined in the \scatind{Collections-Strings} category as a subclass of \clsind{Magnitude}. Printable characters are represented in Squeak as \lct{\$$\langle$\emph{char}$\rangle$}. For example:
\begin{code}{@TEST}
$a < $b --> true
\end{code}
Non-printing characters can be generated by various class methods. \mbox{\cmind{Character class}{value:}} takes the Unicode (or ASCII) integer value as argument and returns the corresponding character. The protocol \protind{accessing untypeable characters} contains a number of convenience constructor methods such as \mthind{Character class}{backspace}, \mthind{Character class}{cr}, \mthind{Character class}{escape}, \mthind{Character class}{euro}, \mthind{Character class}{space}, \mthind{Character class}{tab}, and so on.
\begin{code}{@TEST}
Character space = (Character value: Character space asciiValue) --> true
\end{code}
The \mthind{Character}{printOn:} method is clever enough to know which of the three ways to generate characters offers the most appropriate representation:
\begin{code}{@TEST}
Character value: 1 --> Character value: 1
Character value: 32 --> Character space
Character value: 97 --> $a
\end{code}\ignoredollar$
Various convenient \emph{testing} methods are built in: \mthind{Character}{isAlphaNumeric}, \mthind{Character}{isCharacter}, \mthind{Character}{isDigit}, \mthind{Character}{isLowercase}, \mthind{Character}{isVowel}, and so on.
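A few of these testing methods in action (the particular characters are chosen only for illustration):
\begin{code}{}
$a isVowel --> true
$b isVowel --> false
$5 isDigit --> true
$a isLowercase --> true
\end{code}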
To convert a \ct{Character} to the string containing just that character, send \mthind{Character}{asString}. In this case \ct{asString} and \mthind{Character}{printString} yield different results:
\begin{code}{@TEST}
$a asString --> 'a'
$a --> $a
$a printString --> '$a'
\end{code}
Every \ct{Character} with a value in the range 0 to 255 is a unique instance, stored in the class variable \cvind{CharacterTable}:
\begin{code}{@TEST}
(Character value: 97) == $a --> true
\end{code}\ignoredollar$
\ct{Characters} outside the range 0 to 255 are not unique, however:
\begin{code}{@TEST}
Character characterTable size --> 256
(Character value: 500) == (Character value: 500) --> false
\end{code}
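This matters only for identity comparisons. As a sketch, and assuming that \ct{Character>>=} compares character values (as it does in the Squeak versions we are aware of), two such characters are still \emph{equal} even though they are not identical:
\begin{code}{}
(Character value: 500) = (Character value: 500) --> true
\end{code}
In practice this suggests comparing characters with \ct{=} rather than \ct{==} unless you know they lie in the range 0 to 255.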
%=================================================================
\section{Strings}
The \clsindmain{String} class is also defined in the category \scatind{Collections-Strings}. A \ct{String} is an indexed \ct{Collection} that holds only \ct{Characters}.
\begin{figure}[ht]
\ifluluelse
{\centerline {\includegraphics[width=0.5\textwidth]{StringHierarchy}}}
{\centerline {\includegraphics[width=6cm]{StringHierarchy}}}
\caption{The String Hierarchy \label{fig:strings}}
\end{figure}
In fact, \ct{String} is abstract and Squeak \ct{Strings} are actually instances of the concrete class \clsindmain{ByteString}.
\begin{code}{@TEST}
'hello world' class --> ByteString
\end{code}
The other important subclass of \ct{String} is \clsindmain{Symbol}. The key difference is that there is only ever a single instance of \ct{Symbol} with a given value. (This is sometimes called ``the unique instance property.'') In contrast, two separately constructed \ct{String}s that happen to contain the same sequence of characters will often be different objects.
\begin{code}{@TEST}
'hel','lo' == 'hello' --> false
\end{code}
\begin{code}{@TEST}
('hel','lo') asSymbol == #hello --> true
\end{code}
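Note that it is only \emph{identity} (\ct{==}) that distinguishes the two strings. As a small illustration of our own, comparing the same strings for \emph{equality} (\ct{=}) succeeds:
\begin{code}{}
('hel','lo') = 'hello' --> true
#hello == #hello --> true
\end{code}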
\noindent
Another important difference is that a \ct{String} is mutable, whereas a \ct{Symbol} is immutable.
\begin{code}{@TEST}
'hello' at: 2 put: $u; yourself --> 'hullo'
\end{code}\ignoredollar$
\begin{code}{NB: CANNOT TEST}
#hello at: 2 put: $u --> error!
\end{code}\ignoredollar$
It is easy to forget that strings are collections and therefore understand the same messages that other collections do:
\begin{code}{@TEST}
#hello indexOf: $o --> 5
\end{code}\ignoredollar$
Although \ct{String} does not inherit from \clsind{Magnitude}, it does support the usual \protind{comparing} methods, \ct{<}, \ct{=} and so on. In addition, \cmind{String}{match:} is useful for some basic glob-style pattern-matching:
\begin{code}{@TEST}
'*or*' match: 'zorro' --> true
\end{code}
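A few more patterns may help to clarify the rules. This sketch assumes the behaviour described in the method comment of \ct{String>>match:} in Squeak: in the receiver, \ct{*} matches any sequence of characters, \ct{\#} matches exactly one character, and differences in case are ignored.
\begin{code}{}
'z*o' match: 'zorro' --> true
'z#rro' match: 'zorro' --> true
'*or*' match: 'zebra' --> false
'*OR*' match: 'zorro' --> true
\end{code}
Note that the pattern is the receiver and the text to be matched is the argument.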
Should you need more advanced support for regular expressions, there are a number of third-party implementations available, such as Vassili Bykov's Regex package.
\index{Bykov, Vassili}
\index{regular expression package}
Strings support a rather large number of conversion methods. Many of these are \ind{shortcut constructor methods} for other classes, such as \mthind{String}{asDate}, \mthind{String}{asFileName} and so on. There are also a number of useful methods for converting a string to another string, such as \mthind{String}{capitalized} and \mthind{String}{translateToLowercase}.
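Here is a short sample of such conversions. Apart from \ct{capitalized}, the selectors shown (\ct{asLowercase}, \ct{asUppercase}, \ct{asNumber} and \ct{asSymbol}) are not mentioned above; we assume they are available in your image, as they are in standard Squeak:
\begin{code}{}
'hello' capitalized --> 'Hello'
'Hello World' asLowercase --> 'hello world'
'hello' asUppercase --> 'HELLO'
'42' asNumber + 1 --> 43
'hello' asSymbol --> #hello
\end{code}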
For more on strings and collections, see \charef{collections}.
\on{There is more material we could use here:
\url{http://www.dmu.com/crb/crb7.html}.}
%=================================================================
\section{Booleans}
The class \clsindmain{Boolean} offers a fascinating insight into how much of the Smalltalk language has been pushed into the class library. \ct{Boolean} is the \subind{class}{abstract} superclass of the \patind{Singleton} classes \clsindmain{True} and \clsindmain{False}.
\begin{figure}[ht]
\ifluluelse
{\centerline {\includegraphics[width=0.6\textwidth]{BooleanHierarchy}}}
{\centerline {\includegraphics[width=6cm]{BooleanHierarchy}}}
\caption{The Boolean Hierarchy \label{fig:booleans}}
\end{figure}
Most of the behaviour of \ct{Boolean}s can be understood by considering the method \mthind{Boolean}{ifTrue:ifFalse:}, which takes two \ct{Blocks} as arguments.
\begin{code}{@TEST}
(4 factorial > 20) ifTrue: [ 'bigger' ] ifFalse: [ 'smaller' ] --> 'bigger'
\end{code}
The method is abstract in \ct{Boolean}.
The implementations in its concrete subclasses are both trivial:
\begin{method}{Implementations of \lct{ifTrue:ifFalse:}}
True>>>ifTrue: trueAlternativeBlock ifFalse: falseAlternativeBlock
^trueAlternativeBlock value
False>>>ifTrue: trueAlternativeBlock ifFalse: falseAlternativeBlock
^falseAlternativeBlock value
\end{method}
\cmindex{True}{ifTrue:}
\cmindex{False}{ifTrue:}
In fact, this is the essence of OOP: when a message is sent to an object, the object itself determines which method will be used to respond. In this case an instance of \ct{True} simply evaluates the \emph{true} alternative, while an instance of \ct{False} evaluates the \emph{false} alternative. All the abstract \ct{Boolean} methods are implemented in this way for \ct{True} and \ct{False}. For example:
\begin{method}{Implementing negation}
True>>>not
"Negation--answer false since the receiver is true."
^false
\end{method}
\cmindex{True}{not}
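The counterpart in \ct{False} is presumably the mirror image. The following is a sketch of what we would expect to find there, not a verbatim copy of the image's source:
\begin{method}{Negation in \lct{False} (sketch)}
False>>>not
	"Negation--answer true since the receiver is false."
	^true
\end{method}
Neither method needs a conditional: the choice between the two behaviours is made entirely by the message dispatch.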
\ct{Booleans} offer several useful convenience methods, such as \mthind{Boolean}{ifTrue:}, \mthind{Boolean}{ifFalse:} and \mthind{Boolean}{ifFalse:ifTrue:}. You also have the choice between eager and lazy conjunctions and disjunctions.
\begin{code}{@TEST}
(1>2) & (3<4) --> false "must evaluate both sides"
(1>2) and: [ 3<4 ] --> false "only evaluate receiver"
(1>2) and: [ (1/0) > 0 ] --> false "argument block is never evaluated, so no exception"
\end{code}
In the first example, both \ct{Boolean} subexpressions are evaluated, since \mthind{Boolean}{&} takes a \ct{Boolean} argument.
In the second and third examples, only the receiver is evaluated, since \mthind{Boolean}{and:} expects a \ct{Block} as its argument. The \ct{Block} is evaluated only if the receiver is \pvind{true}.
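Disjunctions behave analogously. In the following sketch, which mirrors the conjunction examples, \ct{|} is the eager form and \ct{or:} the lazy one; the argument block of \ct{or:} is evaluated only if the receiver is \ct{false}:
\begin{code}{}
(1<2) | (3>4) --> true "must evaluate both sides"
(1<2) or: [ 3>4 ] --> true "only evaluate receiver"
(1<2) or: [ (1/0) > 0 ] --> true "argument block is never evaluated, so no exception"
\end{code}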
\dothis{Try to imagine how \ct{and:} and \ct{or:} are implemented.
Check the implementations in \ct{Boolean}, \ct{True} and \ct{False}.}
%=================================================================
\section{Chapter summary}
\begin{itemize}
% \item Send \ct{yourself} to get back the receiver at the end of a cascade.
\item If you override \ct{=} then you should override \ct{hash} as well.
\item Override \ct{postCopy} to correctly implement copying for your objects.
\item Send \ct{self halt} to set a breakpoint.
\item Return \ct{self subclassResponsibility} to make a method abstract.
\item To give an object a \ct{String} representation you should override \ct{printOn:}.
\item Override the hook method \ct{initialize} to properly initialize instances.
\item \ct{Number} methods automatically convert between \ct{Floats}, \ct{Fractions} and \ct{Integers}.
\item \ct{Fractions} truly represent rational numbers rather than floats.
\item \ct{Characters} with values in the range 0 to 255 are unique instances.
\item \ct{Strings} are mutable; \ct{Symbols} are not.
Take care not to mutate string literals, however!
\item \ct{Symbols} are unique; \ct{Strings} are not.
\item \ct{Strings} and \ct{Symbols} are \ct{Collections} and therefore support the usual \ct{Collection} methods.
\end{itemize}
%=============================================================
\ifx\wholebook\relax\else
\bibliographystyle{jurabib}
\nobibliography{scg}
\end{document}
\fi
%=============================================================
%-----------------------------------------------------------------
%%% Local Variables:
%%% coding: utf-8
%%% mode: latex
%%% TeX-master: t
%%% TeX-PDF-mode: t
%%% ispell-local-dictionary: "english"
%%% End: