@table @b
@item What this manual presents
This document provides a tutorial introduction to the Smalltalk language
in general, and the @gst{} implementation in particular.
It does not provide exhaustive coverage of every feature of the language
and its libraries; instead, it attempts to introduce a critical mass of
ideas and techniques to get the Smalltalk novice moving in the
right direction.
@item Who this manual is written for
This manual assumes that the reader is acquainted with
the basics of computer science, and has reasonable proficiency
with a procedural language such as C. It also assumes that the reader
is already familiar with the usual janitorial tasks associated with
programming: editing, moving files, and so forth.
@end table
@menu
* Getting started:: Starting to explore @gst{}
* Some classes:: Using some of the Smalltalk classes
* The hierarchy:: The Smalltalk class hierarchy
* Creating classes:: Creating a new class of objects
* Creating subclasses:: Adding subclasses to another class
* Code blocks (I):: Control structures in Smalltalk
* Code blocks (II):: Guess what? More control structures
* Debugging:: Things go bad in Smalltalk too!
* More subclassing:: Coexisting in the class hierarchy
* Streams:: A powerful abstraction useful in scripts
* Exception handling:: More sophisticated error handling
* Behind the scenes:: Some nice stuff from the Smalltalk innards
* And now:: Some final words
* The syntax:: For the most die-hard computer scientists
@end menu
@node Getting started
@section Getting started
@menu
* Starting Smalltalk:: Starting up Smalltalk
* Saying hello:: Saying hello
* What happened:: But how does it say hello?
* Doing math:: Smalltalk too can do it!
* Math in Smalltalk:: But in a peculiar way of course...
@end menu
@node Starting Smalltalk
@subsection Starting up Smalltalk
Assuming that @gst{} has been installed on your
system, starting it is as simple as:
@example
@b{$} gst
@end example
The system loads Smalltalk and displays a startup banner
like:
@display
GNU Smalltalk ready
st>
@end display
You are now ready to try your hand at Smalltalk! By the
way, when you're ready to quit, you exit Smalltalk by typing
@kbd{control-D} on an empty line.
@node Saying hello
@subsection Saying hello
An initial exercise is to make Smalltalk say ``hello'' to
you. Type in the following line (@code{printNl} ends with an upper case
N and a lower case L):
@example
'Hello, world' printNl
@end example
The system then prints back 'Hello, world' to you. It prints it
twice, the first time because you asked to print and the second
time because the snippet evaluated to the 'Hello, world' string.@footnote{
You can also have the system print out a lot of statistics which
provide information on the performance of the underlying Smalltalk
engine. You can enable them by starting Smalltalk as:
@example
@b{$} gst -V
@end example
}
@node What happened
@subsection What actually happened
The front-line Smalltalk interpreter gathers all text
until a '!' character and executes it. So the actual
Smalltalk code executed was:
@example
'Hello, world' printNl
@end example
This code does two things. First, it creates an object of
type @code{String} which contains the characters ``Hello, world''.
Second, it sends the message named @code{printNl} to the object.
When the object is done processing the message, the code is
done and we get our prompt back.
You'll notice that we didn't say anything about printing
the string, even though that's in fact what happened.
This was very much on purpose: the code we typed in doesn't
know anything about printing strings. It knew how to get a
string object, and it knew how to send a message to that
object. That's the end of the story for the code we wrote.
But for fun, let's take a look at what happened when the
string object received the @code{printNl} message. The string object
then went to a table @footnote{Which table? This is determined by the type
of the object. An object has a type, known as the
class to which it belongs. Each class has a table
of methods. For the object we created, it is
known as a member of the @code{String} class. So we go
to the table associated with the String class.}
which lists the messages which strings can receive, and what code to
execute. It found that there is indeed an entry for
@code{printNl} in that table and ran this code. This code then walked through
its characters, printing each of them out to the terminal. @footnote{
Actually, the message @code{printNl} was inherited
from Object. It sent a @code{print} message, also
inherited from Object, which then sent @code{printOn:} to
the object, specifying that it print to the @code{Transcript}
object. The String class then prints its characters to the
standard output.}
The central point is that an object is entirely self-contained;
only the object knew how to print itself out. When we want an
object to print out, we ask the object itself to do the printing.
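As a small sketch of this lookup, you can ask an object which class it
belongs to and whether it understands a given message. This assumes the
standard reflection messages @code{class} and @code{respondsTo:}, which
this tutorial does not otherwise use; the selector @code{#snort} is made
up for illustration:
@example
'Hello, world' class
'Hello, world' respondsTo: #printNl
'Hello, world' respondsTo: #snort
@end example
@noindent
The first line should answer @code{String}, the second @code{true}, and
the third @code{false}, since no class in the string's ancestry has an
entry for @code{snort}.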
@node Doing math
@subsection Doing math
A similar piece of code prints numbers:
@example
1234 printNl
@end example
Notice how we used the same message, but have sent it to a
new type of object---an integer (from class @code{Integer}). The
way in which an integer is printed is much different from
the way a string is printed on the inside, but because we
are just sending a message, we do not have to be aware of
this. We tell it to @code{printNl}, and it prints itself out.
As a user of an object, we can thus usually send a particular
message and expect basically the same kind of behavior,
regardless of the object's internal structure (for
instance, we have seen that sending @code{printNl} to an object
makes the object print itself). In later chapters we will
see a wide range of types of objects. Yet all of them can
be printed out the same way---with @code{printNl}.
White space is ignored, except as it separates words.
This example could also have looked like:
@example
    1234      printNl
@end example
However, @gst{} tries to execute each line by itself if
possible. If you wanted to write the code on two lines, you
might have written something like:
@example
(1234
printNl)
@end example
From now on, we'll omit @code{printNl} since @gst{}
does the service of printing the answer for us.
An integer can be sent a number of messages in addition
to just printing itself. An important set of messages for
integers are the ones which do math:
@example
9 + 7
@end example
This answers (correctly!) the value 16. The way that it does
this, however, is a significant departure from a procedural
language.
@node Math in Smalltalk
@subsection Math in Smalltalk
In this case, what happened was that the object @code{9} (an
Integer) received a @code{+} message with an argument of @code{7}
(also an Integer). The @code{+} message for integers then caused
Smalltalk to create a new object @code{16} and return it as the
resultant object. This @code{16} object was then given the
@code{printNl} message, and printed @code{16} on the terminal.
Thus, math is not a special case in Smalltalk; it is
done, exactly like everything else, by creating objects, and
sending them messages. This may seem odd to the Smalltalk
novice, but this regularity turns out to be quite a boon:
once you've mastered just a few paradigms, all of the language
``falls into place''. Before you go on to the next
chapter, make sure you try math involving @code{*} (multiplication),
@code{-} (subtraction), and @code{/} (division) also. These
examples should get you started:
@example
8 * (4 / 2)
8 - (4 + 1)
5 + 4
2/3 + 7
2 + 3 * 4
2 + (3 * 4)
@end example
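If some of the answers surprise you, keep in mind that arithmetic gets no
special precedence rules: binary messages such as @code{+} and @code{*}
are simply sent from left to right, and only parentheses change the order.
As a sketch, here are the same examples with the values we would expect
written as comments (a Smalltalk comment is enclosed in double quotes):
@example
8 * (4 / 2)     "16"
8 - (4 + 1)     "3"
5 + 4           "9"
2/3 + 7         "23/3, an exact fraction"
2 + 3 * 4       "20, evaluated as (2 + 3) * 4"
2 + (3 * 4)     "14"
@end example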
@node Some classes
@section Using some of the Smalltalk classes
This chapter has examples which need a place to hold
the objects they create. Such a place is created automatically
as necessary; when you want to discard all the objects you
stored, write an exclamation mark at the end of the statement.
Now let's create some new objects.
@menu
* Arrays:: An array in Smalltalk
* Sets:: A set in Smalltalk
* Dictionaries:: Getting more sophisticated, eh?
* Closing thoughts:: There always ought to be some closing thoughts
@end menu
@node Arrays
@subsection An array in Smalltalk
An array in Smalltalk is similar to an array in any
other language, although the syntax may seem peculiar at
first. To create an array with room for 20 elements, do@footnote{
@gst{} supports completion in the same way as Bash or @sc{gdb}.
To enter the following line, you can for example type
@samp{x := Arr@kbd{<TAB>} new: 20}. This can come in handy
when you have to type long names such as @code{IdentityDictionary},
which becomes @samp{Ide@kbd{<TAB>}D@kbd{<TAB>}}. Everything
starting with a capital letter or ending with a colon can
be completed.}:
@example
x := Array new: 20
@end example
The @code{Array new: 20} creates the array; the @code{x :=} part
connects the name @code{x} with the object. Until you assign
something else to @code{x}, you can refer to this array by the name
@code{x}. Changing elements of the array is not done using the
@code{:=} operator; this operator is used only to bind names to
objects. In fact, you never modify data structures;
instead, you send a message to the object, and it will modify itself.
For instance:
@example
x at: 1
@end example
@noindent
which prints:
@example
nil
@end example
The slots of an array are initially set to ``nothing'' (which
Smalltalk calls @code{nil}). Let's set the first slot to the
number 99:
@example
x at: 1 put: 99
@end example
@noindent
and now make sure the 99 is actually there:
@example
x at: 1
@end example
@noindent
which then prints out:
@example
99
@end example
These examples show how to manipulate an array. They also
show the standard way in which messages are passed arguments.
In most cases, if a message takes an argument, its
name will end with `:'.@footnote{Alert readers will remember that the math
examples of the previous chapter deviated from this.}
So when we said @code{x at: 1} we were sending a message to whatever
object was currently bound to @code{x} with an argument of 1. For an
array, this results in the first slot of the array being returned.
The second operation, @code{x at: 1 put: 99}, is a message
with two arguments. It tells the array to place the second
argument (99) in the slot specified by the first (1). Thus,
when we re-examine the first slot, it does indeed now
contain 99.
There is a shorthand for describing the messages you
send to objects. You just run the message names together.
So we would say that our array accepts both the @code{at:} and
@code{at:put:} messages.
There is quite a bit of sanity checking built into an
array. The request
@example
6 at: 1
@end example
@noindent
fails with an error; 6 is an integer, and can't be indexed. Further,
@example
x at: 21
@end example
@noindent
fails with an error, because the array we created only has
room for 20 objects.
Finally, note that the object stored
in an array is just like any other object, so we can do
things like:
@example
(x at: 1) + 1
@end example
@noindent
which (assuming you've been typing in the examples) will
print 100.
@node Sets
@subsection A set in Smalltalk
We're done with the array we've been using, so we'll
assign something new to our @code{x} variable. Note that we
don't need to do anything special about the old array: the
fact that nobody is using it any more will be automatically
detected, and the memory reclaimed. This is known as @i{garbage collection}
and it is generally done when Smalltalk finds that it is
running low on memory. So, to get our new object, simply do:
@example
x := Set new
@end example
@noindent
which creates an empty set. To view its contents, do:
@example
x
@end example
The kind of object is printed out (i.e., @code{Set}), and then the
members are listed within parentheses. Since it's empty, we
see:
@example
Set ()
@end example
Now let's toss some stuff into it. We'll add the numbers 5
and 7, plus the string 'foo'. This is also the first example
where we're using more than one statement, and thus a good place to present
the statement separator---the @code{.} period:
@example
x add: 5. x add: 7. x add: 'foo'
@end example
Like Pascal, and unlike C, statements are separated rather than
terminated. Thus you need only use a @code{.} when you have finished
one statement and are starting another. This is why our last statement,
@code{x add: 'foo'}, does not have a @code{.} following. Once again like Pascal,
however, Smalltalk won't complain if you enter a spurious
statement separator after @i{the last} statement.
However, we can save a little typing by using a Smalltalk shorthand:
@example
x add: 5; add: 7; add: 'foo'
@end example
This line does exactly what the previous one did.
The trick is that the semicolon operator causes
the message to be sent to the same object as the last message
sent. So saying @code{; add: 7} is the same as saying
@code{x add: 7}, because @code{x} was the last thing a message was sent
to.
This may not seem like such a big savings, but compare
the ease when your variable is named @code{aVeryLongVariableName}
instead of just @code{x}! We'll revisit some other occasions
where @code{;} saves you trouble, but for now let's continue with
our set. Type either version of the example, and make sure
that we've added 5, 7, and ``foo'':
@example
x
@end example
@noindent
we'll see that it now contains our data:
@example
Set ('foo' 5 7)
@end example
What if we add something twice? No problem---it just stays in
the set. So a set is like a big checklist---either it's in
there, or it isn't. To wit:
@example
x add: 5; add: 5; add: 5; add: 5; yourself
@end example
We've added @i{5} several times, but when we print our set
back out, we just see:
@example
Set ('foo' 5 7)
@end example
@code{yourself} is commonly sent at the end of the cascade,
if what you are interested in is the object itself---in this
case, we were not interested in the return value of @code{add: 5},
which happens to be simply @code{5}. There's nothing magic in
@code{yourself}; it is a unary message like @code{printNl},
which does nothing but return the object itself. So you
can do this too:
@example
x yourself
@end example
What you put into a set with @code{add:}, you can take out
with @code{remove:}. Try:
@example
x remove: 5
x printNl
@end example
The set now prints as:
@example
Set ('foo' 7)
@end example
The ``5'' is indeed gone from the set.
We'll finish up with one more of the many things you
can do with a set---checking for membership. Try:
@example
x includes: 7
x includes: 5
@end example
From which we see that x does indeed contain 7, but not 5.
Notice that the answer is printed as @code{true} or @code{false}.
Once again, the thing returned is an object---in this case, an
object known as a boolean. We'll look at the use of
booleans later, but for now we'll just say that booleans are
nothing more than objects which can only either be true or
false---nothing else. So they're very useful for answers to
yes or no questions, like the ones we just posed. Let's
take a look at just one more kind of data structure:
@node Dictionaries
@subsection Dictionaries
A dictionary is a special kind of collection. With a
regular array, you must index it with integers. With a
dictionary, you can index it with any object at all.
Dictionaries thus provide a very powerful way of correlating
one piece of information to another. Their only downside is
that they are somewhat less efficient than simple arrays.
Try the following:
@example
y := Dictionary new
y at: 'One' put: 1
y at: 'Two' put: 2
y at: 1 put: 'One'
y at: 2 put: 'Two'
@end example
This fills our dictionary in with some data. The data is
actually stored in pairs of key and value (the key is what
you give to @code{at:}---it specifies a slot; the value is what is
actually stored at that slot). Notice how we were able to
specify not only integers but also strings as both the key
and the value. In fact, we can use any kind of object we
want as either---the dictionary doesn't care.
Now we can map each key to a value:
@example
y at: 1
y at: 'Two'
@end example
which prints respectively:
@example
'One'
2
@end example
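Asking for a key that was never stored fails with an error, much as
indexing past the end of an array does. As a sketch, assuming the standard
collection messages @code{at:ifAbsent:} and @code{includesKey:} (neither is
covered elsewhere in this chapter), you can supply a fallback value or test
for a key first:
@example
y at: 3 ifAbsent: ['no such key']
y includesKey: 'One'
y includesKey: 'Three'
@end example
@noindent
These should print @code{'no such key'}, @code{true}, and @code{false}
respectively. The square brackets enclose a block of code which is run only
when the key is missing; blocks are covered in a later chapter.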
We can also ask a dictionary to print itself:
@example
y
@end example
@noindent
which prints:
@example
Dictionary (1->'One' 2->'Two' 'One'->1 'Two'->2 )
@end example
@noindent
where the first member of each pair is the key, and the second
the value. It is now time to take a final look at the objects
we have created, and send them to oblivion:
@example
y
x!
@end example
The exclamation mark deleted @gst{}'s knowledge of both
variables. Asking for them again will return just @code{nil}.
@node Closing thoughts
@subsection Closing thoughts
You've seen how Smalltalk provides you with some very
powerful data structures. You've also seen how Smalltalk
itself uses these same facilities to implement the language.
But this is only the tip of the iceberg---Smalltalk is much
more than a collection of ``neat'' facilities to use.
The objects and methods which are automatically available
are only the beginning of the foundation on which you
build your programs---Smalltalk allows you to add your own
objects and methods into the system, and then use them along
with everything else. The art of programming in Smalltalk
is the art of looking at your problems in terms of objects,
using the existing object types to good effect, and enhancing
Smalltalk with new types of objects. Now that you've
been exposed to the basics of Smalltalk manipulation, we can
begin to look at this object-oriented technique of programming.
@node The hierarchy
@section The Smalltalk class hierarchy
When programming in Smalltalk, you sometimes need to
create new kinds of objects, and define what various
messages will do to these objects. In the next chapter we will
create some new classes, but first we need to understand how
Smalltalk organizes the types and objects it contains.
Because this is a pure ``concept'' chapter, without any actual
Smalltalk code to run, we will keep it short and to the
point.
@menu
* Class Object:: The grandfather of every class
* Animals:: A classic in learning OOP!
* But why:: The bottom line of the class hierarchy
@end menu
@node Class Object
@subsection Class @code{Object}
Smalltalk organizes all of its classes as a tree hierarchy.
At the very top of this hierarchy is class @i{Object}.
Following somewhere below it are more specific classes, such
as the ones we've worked with---strings, integers, arrays, and
so forth. They are grouped together based on their similarities;
for instance, types of objects which may be compared
as greater or less than each other fall under a class known
as @i{Magnitude}.
One of the first tasks when creating a new object is to
figure out where within this hierarchy your object falls.
Coming up with an answer to this problem is at least as much
art as science, and there are no hard-and-fast rules to nail
it down. We'll take a look at three kinds of objects to
give you a feel for how this organization matters.
@node Animals
@subsection Animals
Imagine that we have three kinds of objects, representing
@i{Animals}, @i{Parrots}, and @i{Pigs}. Our messages will be
@i{eat}, @i{sing}, and @i{snort}. Our first pass at
inserting these objects into the Smalltalk hierarchy would
organize them like:
@example
@r{Object}
    @r{Animals}
    @r{Parrots}
    @r{Pigs}
@end example
This means that Animals, Parrots, and Pigs are all direct
descendants of @i{Object}, and are not descendants of each
other.
Now we must define how each animal responds to each
kind of message.
@example
@r{Animals}
    @r{eat --> Say ``I have now eaten''}
    @r{sing --> Error}
    @r{snort --> Error}
@r{Parrots}
    @r{eat --> Say ``I have now eaten''}
    @r{sing --> Say ``Tweet''}
    @r{snort --> Error}
@r{Pigs}
    @r{eat --> Say ``I have now eaten''}
    @r{sing --> Error}
    @r{snort --> Say ``Oink''}
@end example
Notice how we kept having to indicate an action for @i{eat}.
An experienced object designer would immediately recognize
this as a clue that we haven't set up our hierarchy correctly.
Let's try a different organization:
@example
@r{Object}
    @r{Animals}
        @r{Parrots}
            @r{Pigs}
@end example
That is, Parrots inherit from Animals, and Pigs from Parrots.
Now Parrots inherit all of the actions from Animals,
and Pigs from both Parrots and Animals. Because of this
inheritance, we may now define a new set of actions which
spares us the redundancy of the previous set:
@example
@r{Animals}
    @r{eat --> Say ``I have now eaten''}
    @r{sing --> Error}
    @r{snort --> Error}
@r{Parrots}
    @r{sing --> Say ``Tweet''}
@r{Pigs}
    @r{snort --> Say ``Oink''}
@end example
Because Parrots and Pigs both inherit from Animals, we have
only had to define the @i{eat} action once. However, we have
made one mistake in our class setup---what happens when we
tell a Pig to @i{sing}? It says ``Tweet'', because we have put
Pigs as an inheritor of Parrots. Let's try one final
organization:
@example
@r{Object}
    @r{Animals}
        @r{Parrots}
        @r{Pigs}
@end example
Now Parrots and Pigs inherit from Animals, but not from each
other. Let's also define one final pithy set of actions:
@example
@r{Animals}
    @r{eat --> Say ``I have eaten''}
@r{Parrots}
    @r{sing --> Say ``Tweet''}
@r{Pigs}
    @r{snort --> Say ``Oink''}
@end example
The change is just to leave out messages which are inappropriate.
If Smalltalk detects that a message is not known by
an object or any of its ancestors, it will automatically
give an error---so you don't have to do this sort of thing
yourself. Notice that now sending @i{sing} to a Pig does
indeed not say ``Tweet''---it will cause a Smalltalk error
instead.
@node But why
@subsection The bottom line of the class hierarchy
The goal of the class hierarchy is to allow you to
organize objects into a relationship which allows a particular
object to inherit the code of its ancestors. Once you
have identified an effective organization of types, you
should find that a particular technique need only be implemented
once, then inherited by the children below. This
keeps your code smaller, and allows you to fix a bug in a
particular algorithm in only one place---then have all users
of it just inherit the fix.
You will find your decisions for adding objects change
as you gain experience. As you become more familiar with
the existing set of objects and messages, your selections
will increasingly ``fit in'' with the existing ones. But even
a Smalltalk @i{pro} stops and thinks carefully at this stage,
so don't be daunted if your first choices seem difficult and
error-prone.
@node Creating classes
@section Creating a new class of objects
With the basic techniques presented in the preceding
chapters, we're ready to do our first real Smalltalk program.
In this chapter we will construct three new types of objects
(known as @i{classes}), using the Smalltalk technique of
inheritance to tie the classes together, create new objects
belonging to these classes (known as creating instances of
the class), and send messages to these objects.
We'll exercise all this by implementing a toy home-finance
accounting system. We will keep track of our overall
cash, and will have special handling for our checking
and savings accounts. From this point on, we will be defining
classes which will be used in future chapters. Since
you will probably not be running this whole tutorial in one
Smalltalk session, it would be nice to save off the state of
Smalltalk and resume it without having to retype all the
previous examples. To save the current state of @gst{},
type:
@example
ObjectMemory snapshot: 'myimage.im'
@end example
@noindent
and from your shell, to later restart Smalltalk from this
``snapshot'':
@example
@b{$} gst -I myimage.im
@end example
Such a snapshot currently takes a little more than a megabyte,
and contains all variables, classes, and definitions you
have added.
@menu
* A new class:: Creating a new class
* Documenting the class:: So anybody will know what it's about
* Defining methods:: So it will be useful
* Instance methods:: One of two kinds of methods (the others,
                     class methods, are above)
* A look at our object:: which will sorely show that something
                     is still missing.
* Moving money around:: Let's make it more fun!
* Next coming:: Yeah, what's next?!?
@end menu
@node A new class
@subsection Creating a new class
Guess how you create a new class? This should be getting
monotonous by now---by sending a message to an object.
The way we create our first ``custom'' class is by sending the
following message:
@example
Object subclass: #Account.
Account instanceVariableNames: 'balance'.
@end example
Quite a mouthful, isn't it? @gst{} provides a
simpler way to write this, but for now let's stick with this.
Conceptually, it isn't really that bad. The Smalltalk variable
@i{Object} is bound to the grand-daddy of all classes on the
system. What we're doing here is telling the @i{Object} class
that we want to add to it a subclass known as @i{Account}.
Then, @code{instanceVariableNames: 'balance'} tells the new
class that each of its objects (@dfn{instances}) will have a
hidden variable named @code{balance}.
@node Documenting the class
@subsection Documenting the class
The next step is to associate a description with the
class. You do this by sending a message to the new class:
@example
Account comment:
    'I represent a place to deposit and withdraw money'
@end example
A description is associated with every Smalltalk class, and
it's considered good form to add a description to each new
class you define. To get the description for a given class:
@example
Account comment
@end example
And your string is printed back to you. Try this with class
Integer, too:
@example
Integer comment
@end example
However, there is another way to define classes. This still
translates to sending messages to objects, but looks more like a traditional
programming language or scripting language:
@example
Object subclass: Account [
    | balance |
    <comment:
        'I represent a place to deposit and withdraw money'>
]
@end example
This has created a class. If we want to access it again, for
example to modify the comment, we can do so like this:
@example
Account extend [
    <comment:
        'I represent a place to withdraw money that has been deposited'>
]
@end example
This instructs Smalltalk to pick an existing class, rather than
trying to create a subclass.
@node Defining methods
@subsection Defining a method for the class
We have created a class, but it isn't ready to do any
work for us---we have to define some messages which the class
can process first. We'll start at the beginning by defining
methods for instance creation:
@example
Account class extend [
    new [
        | r |
        <category: 'instance creation'>
        r := super new.
        r init.
        ^r
    ]
]
@end example
The important points about this are:
@itemize @bullet
@item
@code{Account class} means that we are defining messages which are
to be sent to the Account class itself.
@item
@code{<category: 'instance creation'>}
is more documentation support; it says that the method
we are defining supports creating objects of type
Account.
@item
The text starting with @code{new [} and ending with @code{]}
defines what action to take for the message @code{new}.
When you enter this definition, @gst{} will simply
give you another prompt, but your method has been compiled in
and is ready for use. @gst{} is pretty quiet on successful
method definitions---but you'll get plenty of error
messages if there's a problem!
If you're familiar with other Smalltalks, note that the body
of the method is always in brackets.
@end itemize
The best way to describe how this method works is to
step through it. Imagine we sent a message to the new class
Account with the command line:
@example
Account new
@end example
@code{Account} receives the message @code{new} and looks up
how to process this message. It finds our new definition, and
starts running it. The first line, @code{| r |}, creates a local
variable named @code{r} which can be used as a placeholder for
the objects we create. @code{r} will go away as soon as the message
is done being processed; note the parallel with @code{balance}, which
goes away as soon as the object is not used anymore. And note that
here you have to declare local variables explicitly, unlike what
you did in previous examples.
The first real step is to actually create the object.
The line @code{r := super new} does this using a fancy trick.
The word @code{super} stands for the same object that the message
@code{new} was originally sent to (remember? it's @code{Account}),
except that when Smalltalk goes to search for the methods,
it starts one level higher up in the hierarchy than the current
level. So for a method in the Account class, this is
the Object class (because the class Account inherits from is
Object---go back and look at how we created the Account
class), and the Object class' methods then execute some code
in response to the @code{#new} message. As it turns out, Object
will do the actual creation of the object when sent a @code{#new}
message.
One more time in slow motion: the Account method @code{#new}
wants to do some fiddling about when new objects are created,
but he also wants to let his parent do some work with
a method of the same name. By saying @code{r := super new} he
is letting his parent create the object, and then he is attaching
it to the variable @code{r}. So after this line of code executes,
we have a brand new object of type Account, and @code{r}
is bound to it. You will understand this better as time
goes on, but for now scratch your head once, accept it as a
recipe, and keep going.
We have the new object, but we haven't set it up correctly.
Remember the hidden variable @code{balance} which we saw
in the beginning of this chapter? @code{super new} gives us the
object with the @code{balance} field containing nothing, but we want
our balance field to start at 0. @footnote{And unlike C, Smalltalk
draws a distinction between @code{0} and @code{nil}. @code{nil}
is the @i{nothing} object, and you will receive an error if you
try to do, say, math on it. It really does matter that we
initialize our instance variable to the number 0 if we wish
to do math on it in the future.}
So what we need to do is ask the object to set itself up.
By saying @code{r init}, we are sending the @code{init}
message to our new Account. We'll define
this method in the next section---for now just assume that
sending the @code{init} message will get our Account set up.
Finally, we say @code{^r}. In English, this is @i{return what
r is attached to}. This means that whoever sent to Account
the @code{new} message will get back this brand new account. At
the same time, our temporary variable @code{r} ceases to exist.
@node Instance methods
@subsection Defining an instance method
We need to define the @code{init} method for our Account
objects, so that our @code{new} method defined above will work.
Here's the Smalltalk code:
@example
Account extend [
    init [
        <category: 'initialization'>
        balance := 0
    ]
]
@end example
It looks quite a bit like the previous method definition,
except that the first one said
@code{Account class extend}, and ours says
@code{Account extend}.
The difference is that the first one defined a method for
messages sent directly to @code{Account}, but the second one is
for messages which are sent to Account objects once they are
created.
The method named @code{init} has only one line, @code{balance := 0}.
This initializes the hidden variable @code{balance} (actually
called an instance variable) to zero, which makes
sense for an account balance. Notice that the method
doesn't end with @code{^r} or anything like it: this method
doesn't explicitly return a value to the message sender. When you do
not specify a return value, Smalltalk defaults the return
value to the object currently executing. For clarity of
programming, you might consider explicitly returning @code{self}
in cases where you intend the return value to be used.@footnote{
And why didn't the designers default the
return value to nil? Perhaps they didn't appreciate
the value of void functions. After all, at
the time Smalltalk was being designed, C didn't
even have a void data type.}
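To see the default return value in action, here is a quick check you
can type once the two methods above are loaded. It is only a sketch;
the variable name @code{acct} is made up for this illustration:
@example
acct := Account new.
"init has no explicit return, so it answers its receiver"
(acct init == acct) printNl
@end example
@noindent
The last line should print @code{true}, since @code{==} tests whether
both sides are the very same object.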
Before going on, here is how you could have written this code in a
single declaration (i.e.@: without using @code{extend}):
@example
Object subclass: Account [
| balance |
<comment:
'I represent a place to deposit and withdraw money'>
Account class >> new [
<category: 'instance creation'>
| r |
r := super new.
r init.
^r
]
init [
<category: 'initialization'>
balance := 0
]
]
@end example
@node A look at our object
@subsection Looking at our Account
Let's create an instance of class Account:
@example
a := Account new
@end example
Can you guess what this does? The assignment @code{a := @dots{}}
creates a Smalltalk variable. And the @code{Account new} creates a new
Account, and returns it. So this line creates a Smalltalk
variable named @code{a}, and attaches it to a new Account---all in
one line. It also prints the Account object we just created:
@example
an Account
@end example
Hmmm... not very informative. The problem is that we didn't
tell our Account how to print itself, so we're just getting
the default system @code{printNl} method---which tells what the
object is, but not what it contains. So clearly we must add
such a method:
@example
Account extend [
printOn: stream [
<category: 'printing'>
super printOn: stream.
stream nextPutAll: ' with balance: '.
balance printOn: stream
]
]
@end example
Now give it a try again:
@example
a
@end example
@noindent
which prints:
@example
an Account with balance: 0
@end example
This may seem a little strange. We added a new method,
@code{printOn:}, and our @code{printNl} message starts behaving differently.
It turns out that the @code{printOn:} message is the central
printing function---once you've defined it, all of the
other printing methods end up calling it. Its argument is a
place to print to---quite often it is the variable @code{Transcript}.
This variable is usually hooked to your terminal, and thus
you get the printout to your screen.
The @code{super printOn: stream} lets our parent do what it
did before---print out what our type is. The @code{an Account}
part of the printout came from this.
@code{stream nextPutAll: ' with balance: '} creates the
string @code{ with balance: }, and prints it out to the stream,
too; note that we don't use @code{printOn:} here because that would
enclose our string within quotes. Finally, @code{balance printOn: stream}
asks whatever object is hooked to the @code{balance} variable to print
itself to the stream. We set @code{balance} to 0, so the 0 gets printed out.
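Because every printing method funnels through @code{printOn:}, the same
text shows up however the object is asked to print itself. The following
lines are a small sketch of that idea; they assume the @code{printOn:}
method above has been loaded, and the variable name @code{b} is arbitrary:
@example
b := Account new.
"printString collects the output of printOn: into a String"
(b printString = 'an Account with balance: 0') printNl.
"displayNl also ends up calling printOn: for ordinary objects"
b displayNl
@end example
@noindent
The comparison should print @code{true}, and the last line should show
the same text that @code{printNl} gave us.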
@node Moving money around
@subsection Moving money around
We can now create accounts, and look at them. As it
stands, though, our balance will always be 0---what a tragedy!
Our final methods will let us deposit and spend money.
They're very simple:
@example
Account extend [
spend: amount [
<category: 'moving money'>
balance := balance - amount
]
deposit: amount [
<category: 'moving money'>
balance := balance + amount
]
]
@end example
With these methods you can now deposit and spend amounts of
money. Try these operations:
@example
a deposit: 125
a deposit: 20
a spend: 10
@end example
@node Next coming
@subsection What's next?
We now have a generic concept, an ``Account''. We can create them,
check their balance, and move money in and out of
them. They provide a good foundation, but leave out important
information that particular types of accounts might
want. In the next chapter, we'll take a look at fixing this
problem using subclasses.
@node Creating subclasses
@section Two Subclasses for the Account Class
This chapter continues from the previous chapter in
demonstrating how one creates classes and subclasses in
Smalltalk. In this chapter we will create two special subclasses
of Account, known as Checking and Savings. We will
continue to inherit the capabilities of Account, but will
tailor the two kinds of objects to better manage particular
kinds of accounts.
@menu
* The Savings class:: One of the two subclasses we'll put together
* The Checking class:: And here is the other
* Writing checks:: Only in Smalltalk, of course
@end menu
@node The Savings class
@subsection The Savings class
We create the Savings class as a subclass of Account.
It holds money, just like an Account, but has an additional
property that we will model: it is paid interest based on
its balance.
@example
Account subclass: Savings [
| interest |
@end example
This already tells us something:
the instance variable @code{interest} will accumulate interest
paid. Thus, in addition to the @code{spend:} and
@code{deposit:} messages which we inherit from our parent,
Account, we will need to define a method to add in interest
deposits, and a way to clear the interest variable (which
we would do yearly, after we have paid taxes). We first define
a method for allocating a new account---we need to make sure that the
interest field starts at 0.
We can do so within the @code{Account subclass: Savings} scope,
which we have not closed above.
@example
init [
<category: 'initialization'>
interest := 0.
^super init
]
@end example
Recall that the parent took care of the @code{new} message, and
created a new object of the appropriate size. After creation,
the parent also sent an @code{init} message to the new
object. As a subclass of Account, the new object will
receive the @code{init} message first; it sets up its own
instance variable, and then passes the @code{init} message up the
chain to let its parent take care of its part of the
initialization.
With our new @code{Savings} account created, we can define
two methods for dealing specially with such an account:
@example
interest: amount [
interest := interest + amount.
self deposit: amount
]
clearInterest [
| oldinterest |
oldinterest := interest.
interest := 0.
^oldinterest
]
@end example
We are now finished, and close the class scope:
@example
]
@end example
The first method says that we add the @code{amount} to our
running total of interest. The line @code{self deposit: amount}
tells Smalltalk to send ourselves a message, in this case
@code{deposit: amount}. This then causes Smalltalk to look up
the method for @code{deposit:}, which it finds in our parent,
Account. Executing this method then updates our overall
balance.@footnote{@code{self} is much like @code{super}, except that
@code{self} will start looking for a method at the bottom
of the type hierarchy for the object, while
@code{super} starts looking one level up from the current
level. Thus, using @code{super} forces inheritance,
but @code{self} will find the first definition
of the message which it can.}
One may wonder why we don't just replace this with the
simpler @code{balance := balance + amount}. The answer lies
in one of the philosophies of object-oriented languages in general,
and Smalltalk in particular. Our goal is to encode a
technique for doing something once only, and then re-using
that technique when needed. If we had directly encoded
@code{balance := balance + amount} here, there would have been
two places that knew how to update the balance from a
deposit. This may seem like a useless difference. But consider
if later we decided to start counting the number of
deposits made. If we had encoded
@code{balance := balance + amount} in each place that needed to
update the balance, we would have to hunt each of them down in
order to update the count of deposits. By sending @code{self}
the message @code{deposit:}, we need only update this method
once; each sender of this message would then automatically get the correct
up-to-date technique for updating the balance.
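The benefit of going through @code{self deposit:} can be tried out right
away. The class below is a hypothetical addition, not part of the
tutorial's account system: @code{CountingSavings} and its @code{deposits}
variable are names invented for this sketch. It overrides @code{deposit:}
to count calls, and assumes the @code{Account} and @code{Savings} classes
defined so far:
@example
Savings subclass: CountingSavings [
    | deposits |
    init [
        <category: 'initialization'>
        deposits := 0.
        ^super init
    ]
    deposit: amount [
        <category: 'moving money'>
        "Count the deposit, then let Account update the balance"
        deposits := deposits + 1.
        ^super deposit: amount
    ]
    deposits [
        <category: 'accessing'>
        ^deposits
    ]
]
s := CountingSavings new.
s deposit: 100.
s interest: 5.
s deposits printNl
@end example
@noindent
The last line should print @code{2}: the @code{interest:} method inherited
from Savings sends @code{deposit:} to @code{self}, so the lookup starts at
CountingSavings and the interest payment is counted too. Had
@code{interest:} updated @code{balance} directly, the count would have
stayed at 1.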
The second method, @code{clearInterest}, is simpler. We
create a temporary variable @code{oldinterest} to hold the current
amount of interest. We then zero out our interest to
start the year afresh. Finally, we return the old interest
as our result, so that our year-end accountant can see how
much we made.@footnote{Of course, in a real accounting system we
would never discard such information---we'd probably
throw it into a Dictionary object, indexed by the
year that we're finishing. The ambitious might
want to try their hand at implementing such an
enhancement.}
@node The Checking class
@subsection The Checking class
Our second subclass of Account represents a checking
account. We will keep track of two facets:
@itemize @bullet
@item
What check number we are on
@item
How many checks we have left in our checkbook
@end itemize
We will define this as another subclass of Account:
@example
Account subclass: Checking [
| checknum checksleft |
@end example
We have two instance variables, but we really only need to
initialize one of them---if there are no checks left, the current
check number can't matter. Remember, our parent class
Account will send us the @code{init} message. We don't need our
own class-specific @code{new} function, since our parent's will
provide everything we need.
@example
init [
<category: 'initialization'>
checksleft := 0.
^super init
]
@end example
As in Savings, we inherit most of our abilities from our superclass,
Account. For initialization, we leave @code{checknum}
alone, but set the number of checks in our checkbook to
zero. We finish by letting our parent class do its own
initialization.
@node Writing checks
@subsection Writing checks
We will finish this chapter by adding a method for
spending money through our checkbook. The mechanics of taking
a message and updating variables should be familiar:
@example
newChecks: number count: checkcount [
<category: 'spending'>
checknum := number.
checksleft := checkcount
]
writeCheck: amount [
<category: 'spending'>
| num |
num := checknum.
checknum := checknum + 1.
checksleft := checksleft - 1.
self spend: amount.
^ num
]
]
@end example
@code{newChecks:} fills our checkbook with checks. We record
what check number we're starting with, and update the count
of the number of checks in the checkbook.
@code{writeCheck:} merely notes the next check number, then
bumps up the check number, and down the check count. The
message @code{self spend: amount} resends the message
@code{spend:} to our own object. This causes its method to be looked
up by Smalltalk. The method is then found in our parent class,
Account, and our balance is then updated to reflect our
spending.
You can try the following examples:
@example
c := Checking new
c deposit: 250
c newChecks: 100 count: 50
c writeCheck: 32
c
@end example
For amusement, you might want to add a @code{printOn:} method to
the Checking class so you can see the checking-specific
information.
In this chapter, you have seen how to create subclasses
of your own classes. You have added new methods, and inherited
methods from the parent classes. These techniques provide
the majority of the structure for building solutions to
problems. In the following chapters we will be filling in
details on further language mechanisms and types, and providing
details on how to debug software written in Smalltalk.
@node Code blocks (I)
@section Code blocks
The Account/Savings/Checking example from the last chapter
has several deficiencies. It has no record of the
checks and their values. Worse, it allows you to write a
check when there are no more checks---the Integer value for
the number of checks will just calmly go negative! To fix
these problems we will need to introduce more sophisticated
control structures.
@menu
* Conditions:: Making some decisions
* Iteration:: Making some loops
@end menu
@node Conditions
@subsection Conditions and decision making
Let's first add some code to keep you from writing too
many checks. We will simply update our current method for
the Checking class; if you have entered the methods from the
previous chapters, the old definition will be overridden by
this new one.
@example
Checking extend [
writeCheck: amount [
| num |
(checksleft < 1)
ifTrue: [ ^self error: 'Out of checks' ].
num := checknum.
checknum := checknum + 1.
checksleft := checksleft - 1.
self spend: amount.
^ num
]
]
@end example
The two new lines are:
@example
(checksleft < 1)
ifTrue: [ ^self error: 'Out of checks' ].
@end example
At first glance, this appears to be a completely new structure.
But, look again! The only new construct is the square
brackets, which appear within a method and not only surround it.
The first line is a simple boolean expression. @code{checksleft}
is our integer, as initialized by our Checking class.
It is sent the message @code{<}, and the argument 1. The current
number bound to @code{checksleft} compares itself against 1, and
returns a boolean object telling whether it is less than 1.
Now this boolean, which is either true or false, is sent the
message @code{ifTrue:}, with an argument which is called a code
block. A code block is an object, just like any other. But
instead of holding a number, or a Set, it holds executable
statements. So what does a boolean do with a code block which
is an argument to an @code{ifTrue:} message? It depends on which boolean!
If the object is the @code{true} object, it executes the code
block it has been handed. If it is the @code{false} object, it
returns without executing the code block. So the traditional
@i{conditional construct} has been replaced in
Smalltalk with boolean objects which execute the indicated
code block or not, depending on their truth-value.
@footnote{It is interesting to note that because of the
way conditionals are done, conditional constructs
are not part of the Smalltalk language, instead they are
merely a defined behavior for the Boolean class of
objects.}
In the case of our example, the actual code within the
block sends an error message to the current object. @code{error:}
is handled by the parent class Object, and will pop up an
appropriate complaint when the user tries to write too many
checks. In general, the way you handle a fatal error in
Smalltalk is to send an error message to yourself (through
the @code{self} pseudo-variable), and let the error handling
mechanisms inherited from the Object class take over.
As you might guess, there is also an @code{ifFalse:} message
which booleans accept. It works exactly like @code{ifTrue:},
except that the logic has been reversed; a boolean @code{false}
will execute the code block, and a boolean @code{true} will not.
You should take a little time to play with this method
of representing conditionals. You can run your checkbook,
but can also invoke the conditional functions directly:
@example
true ifTrue: [ 'Hello, world!' printNl ]
false ifTrue: [ 'Hello, world!' printNl ]
true ifFalse: [ 'Hello, world!' printNl ]
false ifFalse: [ 'Hello, world!' printNl ]
@end example
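Since @code{ifTrue:} and friends are ordinary messages, they also answer
a value like any other message. The sketch below uses the two-block
variant @code{ifTrue:ifFalse:}, a standard Boolean message that this
chapter has not introduced:
@example
"The answer is the value of whichever block was executed"
((3 < 4) ifTrue: [ 'smaller' ] ifFalse: [ 'not smaller' ]) displayNl.
"When the block is skipped, the answer is nil"
(false ifTrue: [ 'never evaluated' ]) printNl
@end example
@noindent
The first statement should print @code{smaller} and the second
@code{nil}.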
@node Iteration
@subsection Iteration and collections
Now that we have some sanity checking in place, it
remains for us to keep a log of the checks we write. We
will do so by adding a Dictionary object to our Checking
class, logging checks into it, and providing some messages
for querying our check-writing history. But this enhancement
brings up a very interesting question---when we change
the ``shape'' of an object (in this case, by adding our dictionary
as a new instance variable to the Checking class),
what happens to the existing class, and its objects?
The answer is that the old objects are mutated to take on the
new shape, and all methods are recompiled so that they work
with the new shape. New objects will have exactly the same shape
as old ones, but old objects might happen to be initialized
incorrectly (since the newly added variables will simply be
set to nil). As this can lead to very puzzling behavior, it is
usually best to eradicate all of the old objects, and then
implement your changes.
If this were more than a toy object
accounting system, this would probably entail saving the
objects off, converting to the new class, and reading the
objects back into the new format. For now, we'll just
ignore what's currently there, and define our latest Checking
class.
@example
Checking extend [
| history |
@end example
This is the same syntax as the last time we defined a checking account,
except that we start with @code{extend} (since the class is already
there). The two instance variables we had defined remain, and we
add a new @code{history} variable; the old methods will be recompiled
without errors. We now feed in new definitions for the methods whose
behavior must change, along with a few new ones; methods we do not
redefine, such as @code{newChecks:count:}, stay as they are.
With our new Checking instance variable, we are all set to start recording
our checking history. Our first change will be in the handling of the
@code{init} message:
@example
init [
<category: 'initialization'>
checksleft := 0.
history := Dictionary new.
^ super init
]
@end example
This provides us with a Dictionary, and hooks it to our new
@code{history} variable.
Our next method records each check as it's written.
The method is a little more involved, as we've added some
more sanity checks to the writing of checks.
@example
writeCheck: amount [
<category: 'spending'>
| num |
"Sanity check that we have checks left in our checkbook"
(checksleft < 1)
ifTrue: [ ^self error: 'Out of checks' ].
"Make sure we've never used this check number before"
num := checknum.
(history includesKey: num)
ifTrue: [ ^self error: 'Duplicate check number' ].
"Record the check number and amount"
history at: num put: amount.
"Update our next checknumber, checks left, and balance"
checknum := checknum + 1.
checksleft := checksleft - 1.
self spend: amount.
^ num
]
@end example
We have added three things to our latest version of
@code{writeCheck:}. First, since our routine has become somewhat
involved, we have added comments. In Smalltalk, single
quotes are used for strings; double quotes enclose comments.
We have added comments before each section of code.
Second, we have added a sanity check on the check number
we propose to use. Dictionary objects respond to the
@code{includesKey:} message with a boolean, depending on whether
something is currently stored under the given key in the
dictionary. If the check number is already used, the @code{error:}
message is sent to our object, aborting the operation.
Finally, we add a new entry to the dictionary. We have
already seen the @code{at:put:} message (often found written
as @code{#at:put:}, with a sharp in front of it) at the start of
this tutorial. Our use here simply associates a check number with
an amount of money spent.@footnote{You might start to wonder what
one would do if you wished to associate two pieces of
information under one key. Say, the value and who the
check was written to. There are several ways; the
best would probably be to create a new, custom
object which contained this information, and then
store this object under the check number key in
the dictionary. It would also be valid (though
probably overkill) to store a dictionary as the
value---and then store as many pieces of information
as you'd like under each slot!} With this, we now have a working Checking
class, with reasonable sanity checks and per-check information.
Let us finish the chapter by enhancing our ability to
get access to all this information. We will start with some
simple print-out functions.
@example
printOn: stream [
    super printOn: stream.
    stream nextPutAll: ', checks left: '.
    checksleft printOn: stream.
    stream nextPutAll: ', checks written: '.
    history size printOn: stream
]
check: num [
    | c |
    c := history
        at: num
        ifAbsent: [ ^self error: 'No such check #' ].
    ^c
]
@end example
There should be very few surprises here. We format and
print our information, while letting our parent classes handle
their own share of the work. When looking up a check
number, we once again take advantage of the fact that blocks
of executable statements are an object; in this case, we are
using the @code{at:ifAbsent:} message supported by the
Dictionary class. As you can probably anticipate, if the
requested key value is not found in the
dictionary, the code block is executed. This allows us to
customize our error handling, as the generic error would only
tell the user ``key not found''.
While we can look up a check if we know its number, we
have not yet written a way to ``riffle through'' our collection
of checks. The following function loops over the
checks, printing them out one per line. Because there is
currently only a single numeric value under each key, this
might seem wasteful. But we have already considered storing
multiple values under each check number, so it is best to
leave some room for each item. And, of course, because we
are simply sending a printing message to an object, we will
not have to come back and re-write this code so long as the
object in the dictionary honors our @code{printNl}/@code{printOn:} messages.
@example
printChecks [
    history keysAndValuesDo: [ :key :value |
        key print.
        ' - ' display.
        value printNl.
    ]
]
]
@end example
We still see a code block object being passed to the
dictionary, but @code{:key :value |} is something new. A code
block can optionally receive arguments. In this case, the
two arguments represent a key/value pair.
If you only wanted the value portion, you could call
history with a @code{do:} message instead; if you only wanted the
key portion, you could call history with a @code{keysDo:} message instead.
We then invoke our printing interface upon them. We don't want a
newline until the end, so the key is sent the @code{print} message instead.
It is pretty much the same as @code{printNl}, since both implicitly use
@code{Transcript}, except it doesn't add a newline. The separator string
is sent @code{display} rather than @code{print}, since printing a string
would enclose it within quotes.
It is important that you be clear that in principle there is
no relationship between the code block and the dictionary you
passed it to. The dictionary just invokes the passed code block
with a key/value pair when processing a @code{keysAndValuesDo:} message. But
the same two-parameter code block can be passed to any message that
wishes to evaluate it (and passes the exact number of parameters to
it). In the next chapter
we'll see more on how code blocks are used; we'll also look at how
you can invoke code blocks in your own code.
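As a small, self-contained sketch of these points, the following lines
work on a throwaway Dictionary rather than on a Checking object; the
names @code{d} and @code{printer} exist only for this illustration. The
same two-argument block is handed to @code{keysAndValuesDo:} and then
invoked directly with @code{value:value:}, a message covered in the
next chapter:
@example
d := Dictionary new.
d at: 101 put: 32.
d at: 102 put: 15.
"at:ifAbsent: answers the stored value, or else the value of the block"
(d at: 101 ifAbsent: [ 0 ]) printNl.
(d at: 999 ifAbsent: [ 0 ]) printNl.
"A two-argument block is an object like any other"
printer := [ :key :value |
    (key printString, ' - ', value printString) displayNl ].
d keysAndValuesDo: printer.
printer value: 7 value: 'seven'.
"keysDo: passes only the keys, do: only the values"
d keysDo: [ :key | key printNl ].
d do: [ :value | value printNl ]
@end example
@noindent
The two @code{at:ifAbsent:} lines should print @code{32} and @code{0}.
The order in which a Dictionary enumerates its entries is not specified,
so the loops may list the two entries in either order.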
@node Code blocks (II)
@section Code blocks, part two
In the last chapter, we looked at how code blocks could
be used to build conditional expressions, and how you could
iterate across all entries in a collection.@footnote{The
@code{do:} message is understood by most types
of Smalltalk collections. It works for the
Dictionary class, as well as sets, arrays, strings,
intervals, linked lists, bags, and streams. The
@code{keysDo:} message, by contrast, works only with dictionaries.}
We built our own code blocks, and handed them off for use by system
objects. But there is nothing magic about invoking code
blocks; your own code will often need to do so. This chapter
will show some examples of loop construction in
Smalltalk, and then demonstrate how you invoke code blocks
for yourself.
@menu
* Integer loops:: Well, Smalltalk too has them
* Intervals:: And of course here's a peculiar way to use them
* Invoking code blocks:: You can do it, too
@end menu
@node Integer loops
@subsection Integer loops
Integer loops are constructed by telling a number to
drive the loop. Try this example to count from 1 to 20:
@example
1 to: 20 do: [:x | x printNl ]
@end example
There's also a way to count up by more than one:
@example
1 to: 20 by: 2 do: [:x | x printNl ]
@end example
Finally, counting down is done with a negative step:
@example
20 to: 1 by: -1 do: [:x | x printNl ]
@end example
Note that the @code{x} variable is local to the block.
@example
x
@end example
@noindent
just prints @code{nil}.
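The loop body can do more than print. As a small sketch (the variable
@code{sum} is just a name picked for this example), here is a loop that
adds up the numbers from 1 to 20:
@example
sum := 0.
1 to: 20 do: [:x | sum := sum + x ].
sum printNl
@end example
@noindent
This should print @code{210}. Unlike @code{x}, the variable @code{sum}
was assigned outside the block, so it is still available after the loop
ends.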
@node Intervals
@subsection Intervals
It is also possible to represent a range of numbers as
a standalone object, which can be passed around
the system.
@example
i := Interval from: 5 to: 10
i do: [:x | x printNl]
@end example
As with the integer loops, the Interval class can also
represent steps greater than 1. It is done much like it was
for our numeric loop above:
@example
i := (Interval from: 5 to: 10 by: 2)
i do: [:x| x printNl]
@end example
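Because an Interval is a collection in its own right, it understands the
usual collection messages as well. The lines below are a sketch using
@code{size} and @code{inject:into:}, two standard collection messages
that this tutorial has not covered at this point:
@example
i := Interval from: 5 to: 10 by: 2.
"The interval holds 5, 7 and 9"
i size printNl.
"inject:into: folds the elements together, starting from 0"
(i inject: 0 into: [:acc :x | acc + x ]) printNl
@end example
@noindent
These should print @code{3} and @code{21}.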
@node Invoking code blocks
@subsection Invoking code blocks
Let us revisit the checking example and add a method
for scanning only checks over a certain amount. This would
allow our user to find ``big'' checks, by passing in a value
below which we will not invoke their function. We will
invoke their code block with the check number as an argument;
they can use our existing @code{check:} message to get the
amount.
@example
Checking extend [
    checksOver: amount do: aBlock [
        history keysAndValuesDo: [:key :value |
            (value > amount)
                ifTrue: [aBlock value: key]
        ]
    ]
]
@end example
The structure of this loop is much like our @code{printChecks} message
from chapter 6. However, in this case we consider each
entry, and only invoke the supplied block if the check's
value is greater than the specified amount. The line:
@example
ifTrue: [aBlock value: key]
@end example
@noindent
invokes the user-supplied block, passing as an argument the
key, which is the check number. The @code{value:}
message, when received by a code block, causes the code
block to execute. Code blocks take @code{value}, @code{value:},
@code{value:value:}, and @code{value:value:value:} messages, so you
can pass from 0 to 3 arguments to a code block.@footnote{
There is also a @code{valueWithArguments:} message
which accepts an array holding as many arguments
as you would like.}
You might find it puzzling that an association takes a
@code{value} message, and so does a code block. Remember, each
object can do its own thing with a message. A code block gets
run when it receives a @code{value} message. An association merely
returns the value part of its key/value pair. The fact that
both take the same message is, in this case, coincidence.
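To make the @code{value} family concrete, here is a short sketch you can
type at the prompt. Nothing in it depends on the checking example:
@example
"No arguments: value"
[ 'hello' displayNl ] value.
"One and two arguments"
([:x | x * 2 ] value: 21) printNl.
([:a :b | a + b ] value: 3 value: 4) printNl.
"Any number of arguments, packed in an Array"
([:a :b | a + b ] valueWithArguments: #(3 4)) printNl.
"An association answers its value part instead"
(3 -> 4) value printNl
@end example
@noindent
The statements should print @code{hello}, @code{42}, @code{7}, @code{7}
and @code{4} in turn.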
Let's quickly set up a new checking account with $250
(wouldn't this be nice in real life?) and write a couple
checks. Then we'll see if our new method does the job
correctly:
@example
mycheck := Checking new.
mycheck deposit: 250
mycheck newChecks: 100 count: 40
mycheck writeCheck: 10
mycheck writeCheck: 52
mycheck writeCheck: 15
mycheck checksOver: 1 do: [:x | x printNl]
mycheck checksOver: 17 do: [:x | x printNl]
mycheck checksOver: 200 do: [:x | x printNl]
@end example
We will finish this chapter with an alternative way of
writing our @code{checksOver:} code. In this example, we will use
the message @code{select:} to pick the checks which exceed our
value, instead of doing the comparison ourselves. We can
then run the user's code block over the keys of the resulting
collection.
@example
Checking extend [
checksOver: amount do: aBlock [
| chosen |
chosen := history select: [:amt| amt > amount].
chosen keysDo: aBlock
]
]
@end example
Note that @code{extend} will also overwrite methods. Try
the same tests as above; they should yield the same result!
@node Debugging
@section When Things Go Bad
So far we've been working with examples which work the
first time. If you didn't type them in correctly, you probably
received a flood of unintelligible complaints. You
probably ignored the complaints, and typed the example
again.
When developing your own Smalltalk code, however, these
messages are the way you find out what went wrong. Because
your objects, their methods, the error printout, and your
interactive environment are all contained within the same
Smalltalk session, you can use these error messages to debug
your code using very powerful techniques.
@menu
* Simple errors:: Those that only happen in examples
* Nested calls:: Those that actually happen in real life
* Looking at objects:: Trying to figure it out
@end menu
@node Simple errors
@subsection A Simple Error
First, let's take a look at a typical error. Type:
@example
7 plus: 1
@end example
This will print out:
@example
7 did not understand selector 'plus:'
<blah blah>
UndefinedObject>>#executeStatements
@end example
The first line is pretty simple; we sent a message to the
@code{7} object which was not understood; not surprising since
the @code{plus:} operation should have been @code{+}. Then there are
a few lines of gobbledegook: just ignore them, they reflect
the fact that the error passed through @gst{}'s exception
handling system. The remaining line reflects the way
@gst{} invokes code which we type at our command prompt; it
generates a block of code which is invoked via an internal
method @code{executeStatements} defined in class Object and evaluated
like @code{nil executeStatements} (nil is an instance of @i{UndefinedObject}).
Thus, this output tells you that you directly typed a line which sent an
invalid message to the @code{7} object.
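If you are unsure whether an object understands a message, you can ask
it before sending. The sketch below uses @code{respondsTo:}, a standard
message that this tutorial does not otherwise cover:
@example
"Answers false: integers have no plus: method"
(7 respondsTo: #plus:) printNl.
"Answers true: + is understood by numbers"
(7 respondsTo: #+) printNl
@end example
@noindent
These should print @code{false} and @code{true}.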
All the error output but the first line is actually a
stack backtrace. The most recent call is the one nearer the
top of the screen. In the next example, we will cause an
error which happens deeper within an object.
@node Nested calls
@subsection Nested Calls
Type the following lines:
@example
x := Dictionary new
x at: 1
@end example
The error you receive will look like:
@example
Dictionary new: 31 "<0x33788>" error: key not found
@i{@r{@dots{}blah blah@dots{}}}
Dictionary>>#error:
[] in Dictionary>>#at:
[] in Dictionary>>#at:ifAbsent:
Dictionary(HashedCollection)>>#findIndex:ifAbsent:
Dictionary>>#at:ifAbsent:
Dictionary>>#at:
UndefinedObject(Object)>>#executeStatements
@end example
The error itself is pretty clear; we asked for something
within the Dictionary which wasn't there. The object
which had the error is identified as @code{Dictionary new: 31}.
A Dictionary's default size is 31; thus, this is the object
we created with @code{Dictionary new}.
The stack backtrace shows us the inner structure of how
a Dictionary responds to the @code{#at:} message. Our hand-entered
command causes the usual entry for @code{UndefinedObject(Object)}.
Then we see a Dictionary object responding to an @code{#at:} message
(the ``Dictionary>>#at:'' line). This code called the object
with an @code{#at:ifAbsent:} message. All of a sudden,
Dictionary calls that strange method @code{#findIndex:ifAbsent:},
which evaluates two blocks, and then the error happens.
To understand this better, it is necessary to know that
a very common way to handle errors in Smalltalk is to
hand down a block of code which will be called when an error
occurs. For the Dictionary code, the @code{at:} message passes
in a block of code to the @code{at:ifAbsent:} code to be called
when @code{at:ifAbsent:} can't find the given key, and
@code{at:ifAbsent:} does the same with @code{findIndex:ifAbsent:}.
Thus, without even looking at the code for Dictionary itself, we can
guess something of the code for Dictionary's implementation:
@example
findIndex: key ifAbsent: errCodeBlock [
@i{@r{@dots{}look for key@dots{}}}
(keyNotFound) ifTrue: [ ^(errCodeBlock value) ]
@i{@r{@dots{}}}
]
at: key [
^self at: key ifAbsent: [^self error: 'key not found']
]
@end example
Actually, @code{findIndex:ifAbsent:} lies in class @code{HashedCollection},
as that @code{Dictionary(HashedCollection)} in the backtrace says.
It would be nice if each entry on the stack backtrace included
source line numbers. Unfortunately, at this point @gst{} doesn't
provide this feature. Of course, you have the source code
available...
@node Looking at objects
@subsection Looking at Objects
When you are chasing an error, it is often helpful to
examine the instance variables of your objects. While
strategic calls to @code{printNl} will no doubt help, you can look at an
object without having to write all the code yourself. The
@code{inspect} message works on any object, and dumps out the
values of each instance variable within the object.@footnote{When using
the Blox GUI, it actually pops up a so-called @dfn{Inspector window}.}
Thus:
@example
x := Interval from: 1 to: 5.
x inspect
@end example
displays:
@example
An instance of Interval
start: 1
stop: 5
step: 1
contents: [
[1]: 1
[2]: 2
[3]: 3
[4]: 4
[5]: 5
]
@end example
We'll finish this chapter by emphasizing a technique
which has already been covered: the use of the @code{error:}
message in your own objects. As you saw in the case of Dictionary,
an object can send itself an @code{error:} message with a
descriptive string to abort execution and dump a stack backtrace.
You should plan on using this technique in your own
objects. It can be used both for explicit user-caused
errors, as well as in internal sanity checks.
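
As a sketch of such a sanity check, consider the hypothetical class below; @code{Thermostat}, its instance variable and the accepted range are invented for this example and are not part of @gst{}:
@example
Object subclass: Thermostat [
    | target |
    target [
        ^target
    ]
    target: degrees [
        "Sanity check: refuse values outside a plausible range"
        (degrees between: 5 and: 35)
            ifFalse: [ ^self error: 'target temperature out of range' ].
        target := degrees
    ]
]
@end example
@noindent
Evaluating @code{Thermostat new target: 21} succeeds quietly, while @code{Thermostat new target: 90} should abort with the descriptive string and a backtrace, just as the Dictionary example did.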
@node More subclassing
@section Coexisting in the Class Hierarchy
The early chapters of this tutorial discussed classes in
one of two ways. The ``toy'' classes we developed were rooted
at Object; the system-provided classes were treated as
immutable entities. While one shouldn't modify the behavior
of the standard classes lightly, ``plugging in'' your own
classes in the right place among their system-provided
brethren can provide you powerful new classes with very little
effort.
This chapter will create two complete classes which
enhance the existing Smalltalk hierarchy. The discussion
will start with the issue of where to connect our new
classes, and then continue onto implementation. Like most
programming efforts, the result will leave many possibilities
for improvements. The framework, however, should begin
to give you an intuition of how to develop your own
Smalltalk classes.
@menu
* The existing hierarchy:: We've been talking about it for a while,
so here it is at last
* Playing with Arrays:: Again.
* New kinds of Numbers:: Sounds interesting, doesn't it?
* Inheritance and Polymorphism:: Sounds daunting, doesn't it?
@end menu
@node The existing hierarchy
@subsection The Existing Class Hierarchy
To discuss where a new class might go, it is helpful to
have a map of the current classes. The following is the
basic class hierarchy of @gst{}. Indentation means
that the line inherits from the earlier line with one less
level of indentation.@footnote{This listing is courtesy of the
printHierarchy method supplied by @gst{} author Steve
Byrne. It's in the @file{kernel/Browser.st} file.}
@display
@t{   }Object
@t{       }Behavior
@t{           }ClassDescription
@t{               }Class
@t{               }Metaclass
@t{       }BlockClosure
@t{       }Boolean
@t{           }False
@t{           }True
@t{       }Browser
@t{       }CFunctionDescriptor
@t{       }CObject
@t{           }CAggregate
@t{               }CArray
@t{               }CPtr
@t{           }CCompound
@t{               }CStruct
@t{               }CUnion
@t{           }CScalar
@t{               }CChar
@t{               }CDouble
@t{               }CFloat
@t{               }CInt
@t{               }CLong
@t{               }CShort
@t{               }CSmalltalk
@t{               }CString
@t{               }CUChar
@t{                   }CByte
@t{                       }CBoolean
@t{               }CUInt
@t{               }CULong
@t{               }CUShort
@t{       }Collection
@t{           }Bag
@t{           }MappedCollection
@t{           }SequenceableCollection
@t{               }ArrayedCollection
@t{                   }Array
@t{                   }ByteArray
@t{                   }WordArray
@t{                   }LargeArrayedCollection
@t{                       }LargeArray
@t{                       }LargeByteArray
@t{                       }LargeWordArray
@t{                   }CompiledCode
@t{                       }CompiledMethod
@t{                       }CompiledBlock
@t{                   }Interval
@t{                   }CharacterArray
@t{                       }String
@t{                           }Symbol
@t{               }LinkedList
@t{                   }Semaphore
@t{               }OrderedCollection
@t{                   }RunArray
@t{                   }SortedCollection
@t{           }HashedCollection
@t{               }Dictionary
@t{                   }IdentityDictionary
@t{                       }MethodDictionary
@t{                   }RootNamespace
@t{                       }Namespace
@t{                       }SystemDictionary
@t{               }Set
@t{                   }IdentitySet
@t{       }ContextPart
@t{           }BlockContext
@t{           }MethodContext
@t{       }CType
@t{           }CArrayCType
@t{           }CPtrCType
@t{           }CScalarCType
@t{       }Delay
@t{       }DLD
@t{       }DumperProxy
@t{           }AlternativeObjectProxy
@t{               }NullProxy
@t{                   }VersionableObjectProxy
@t{               }PluggableProxy
@t{       }File
@t{           }Directory
@t{       }FileSegment
@t{       }Link
@t{           }Process
@t{           }SymLink
@t{       }Magnitude
@t{           }Association
@t{           }Character
@t{           }Date
@t{           }LargeArraySubpart
@t{           }Number
@t{               }Float
@t{               }Fraction
@t{               }Integer
@t{                   }LargeInteger
@t{                       }LargeNegativeInteger
@t{                       }LargePositiveInteger
@t{                           }LargeZeroInteger
@t{                   }SmallInteger
@t{           }Time
@t{       }Memory
@t{       }Message
@t{           }DirectedMessage
@t{       }MethodInfo
@t{       }NullProxy
@t{       }PackageLoader
@t{       }Point
@t{       }ProcessorScheduler
@t{       }Rectangle
@t{       }SharedQueue
@t{       }Signal
@t{           }Exception
@t{               }Error
@t{                   }Halt
@t{                       }ArithmeticError
@t{                           }ZeroDivide
@t{                       }MessageNotUnderstood
@t{                   }UserBreak
@t{               }Notification
@t{                   }Warning
@t{       }Stream
@t{           }ObjectDumper
@t{           }PositionableStream
@t{               }ReadStream
@t{               }WriteStream
@t{                   }ReadWriteStream
@t{                       }ByteStream
@t{                           }FileStream
@t{           }Random
@t{           }TextCollector
@t{           }TokenStream
@t{       }TrappableEvent
@t{           }CoreException
@t{           }ExceptionCollection
@t{       }UndefinedObject
@t{       }ValueAdaptor
@t{           }NullValueHolder
@t{           }PluggableAdaptor
@t{               }DelayedAdaptor
@t{           }ValueHolder
@end display
While initially a daunting list, you should take the
time to hunt down the classes we've examined in this tutorial
so far. Notice, for instance, how an Array is a subclass
below the @i{SequenceableCollection} class. This makes sense;
you can walk an Array from one end to the other. By contrast,
notice how this is not true for Sets: it doesn't make sense
to walk a Set from one end to the other.
A little puzzling is the relationship of a Bag to a Set, since
a Bag is actually a Set supporting multiple occurrences of its
elements. The answer lies in the purpose of both a Set and a
Bag. Both hold an unordered collection of objects; but a Bag
needs to be optimized for the case when an object has possibly
thousands of occurrences, while a Set is optimized for checking
object uniqueness. That's why Set being a subclass of Bag, or
the other way round, would be a source of problems in the actual
implementation of the class. Currently a Bag holds a Dictionary
associating each object to its count; it would be feasible however
to have Bag as a subclass of HashedCollection and a sibling of Set.
Look at the treatment of numbers---starting with the class
@i{Magnitude}. While numbers can indeed be ordered by @emph{less than},
@emph{greater than}, and so forth, so can a number of other
objects. Each subclass of Magnitude is such an
object. So we can compare characters with other characters,
dates with other dates, and times with other times, as well
as numbers with numbers.
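
A few one-line experiments make the point; this is only a sketch, using comparison messages that Magnitude subclasses share (@code{<}, @code{max:} and @code{between:and:}):
@example
"Numbers and characters answer the same comparison messages"
(3 < 4) printNl.
($a < $b) printNl.
($a max: $z) printNl.
(7 between: 1 and: 10) printNl.
@end example
@noindent
These should print true, true, $z and true. Dates and times can be compared in the same way.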
Finally, you have probably noticed some pretty strange classes,
representing language entities that you might have never thought
of as objects themselves: @i{Namespace}, @i{Class} and even
@i{CompiledMethod}. They are the base of Smalltalk's ``reflection''
mechanism which will be discussed later, in @ref{Why is #new
there?!?, , The truth on metaclasses}.
@node Playing with Arrays
@subsection Playing with Arrays
Imagine that you need an array that answers nil when an index
is out of bounds, instead of reporting an error. You could modify the Smalltalk
implementation, but that might break some code in the image, so it
is not practical. Why not add a subclass?
@example
"We could subclass from Array, but that class is specifically
optimized by the VM (which assumes, among other things, that
it does not have any instance variables). So we use its
abstract superclass instead. The discussion below holds
equally well."
ArrayedCollection subclass: NiledArray [
    <shape: #pointer>
    boundsCheck: index [
        ^(index < 1) | (index > (self basicSize))
    ]
    at: index [
        ^(self boundsCheck: index)
            ifTrue: [ nil ]
            ifFalse: [ super at: index ]
    ]
    at: index put: val [
        ^(self boundsCheck: index)
            ifTrue: [ val ]
            ifFalse: [ super at: index put: val ]
    ]
]
@end example
Much of the machinery of adding a class should be
familiar. We see another declaration like @code{comment:},
namely @code{shape:}. This sets up @code{NiledArray}
to have the same underlying
structure as an @code{Array} object; we'll delay discussing this
until the chapter on the nuts and bolts of arrays. In any
case, we inherit all of the actual knowledge of how to create
arrays, reference them, and so forth. All that we do is
intercept @code{at:} and @code{at:put:} messages, call our common
function to validate the array index, and do something special
if the index is not valid. The way that we coded
the bounds check bears a little examination.
Making a first cut at coding the bounds check, you
might have coded it in NiledArray's methods
twice (once for @code{at:}, and again for @code{at:put:}). As
always, it's preferable to code things once, and then re-use them.
So we instead add a method for bounds checking, @code{boundsCheck:}, and
use it for both cases. If we ever wanted to enhance the
bounds checking (perhaps emit an error if the index is < 1 and
answer nil only for indices greater than the array size?), we only
have to change it in one place.
The actual math for calculating whether the bounds have
been violated is a little interesting. The first part of
the expression returned by the method:
@example
(index < 1) | (index > (self basicSize))
@end example
@noindent
is true if the index is less than 1, otherwise it's false.
This part of the expression thus becomes the boolean object
true or false. The boolean object then receives the message
@code{|}, and the argument @code{(index > (self basicSize))}.
@code{|} means ``or''---we want to OR together the two possible
out-of-range checks. What is the second part of the expression?
@footnote{Smalltalk also offers an @code{or:} message, which
is different in a subtle way from @code{|}. @code{or:} takes
a code block, and only invokes the code block if
it's necessary to determine the value of the
expression. This is analogous to the guaranteed C
semantic that @code{||} evaluates left-to-right only as
far as needed. We could have written the expression
as @code{((index < 1) or: [index > (self basicSize)])}.
Since we expect both sides of @code{or:} to be
false most of the time, there isn't much reason to
delay evaluation of either side in this case.}
@code{index} is our argument, an integer; it receives the
message @code{>}, and thus will compare itself to the value
@code{self basicSize} returns. While we haven't covered the
underlying structures Smalltalk uses to build arrays, we can
briefly say that the @code{#basicSize} message returns the number
of elements the Array object can contain. So the index is checked
to see if it's less than 1 (the lowest legal Array index) or
greater than the highest allocated slot in the Array. If it
is either (the @code{|} operator!), the expression is true,
otherwise false.
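
The difference between @code{|} and @code{or:} can be seen in a small sketch; the numbers are arbitrary, and the division by zero is there only to show that the block is skipped:
@example
"Both operands of | are computed before the message is sent"
((3 < 1) | (3 > 2)) printNl.
"The block given to or: is evaluated only when the receiver is
 false, so the division by zero is never attempted here"
((3 > 2) or: [ (1 / 0) > 1 ]) printNl.
@end example
@noindent
Both statements should print true.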
From there it's downhill; our boolean object, returned by
@code{boundsCheck:}, receives the @code{ifTrue:ifFalse:} message,
and a code block which will do the appropriate thing. Why do we
have @code{at:put:} return val? Well, because that's what it's
supposed to do: look at every implementor of @code{at:put:}
and you'll find that it returns its second parameter. In general, the
result is discarded; but one could write a program which uses it, so
we'll write it this way anyway.
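
Assuming the @code{NiledArray} class above has been loaded, a short session such as the following sketch exercises both the normal and the out-of-bounds paths (the indices and values are arbitrary):
@example
a := NiledArray new: 3.
"In bounds: behaves like an ordinary array"
(a at: 2 put: 'two') printNl.
(a at: 2) printNl.
"Out of bounds: no error, at: answers nil and at:put: answers its argument"
(a at: 7) printNl.
(a at: 7 put: 'seven') printNl.
(a at: 0) printNl.
@end example
@noindent
If the class behaves as described, this prints 'two' twice, then nil, 'seven' and nil. Note that the out-of-bounds @code{at:put:} stores nothing; it only answers its argument.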
@node New kinds of Numbers
@subsection Adding a New Kind of Number
If we were programming an application which did a large
amount of complex math, we could probably manage it with a
number of two-element arrays. But we'd forever be writing
in-line code for the math and comparisons; it would be much
easier to just implement an object class to support the complex
numeric type. Where in the class hierarchy would it be
placed?
You've probably already guessed---but let's step down the
hierarchy anyway. Everything inherits from Object, so
that's a safe starting point. Complex numbers can not be
compared with @code{<} and @code{>}, and yet we strongly suspect that,
since they are numbers, we should place them under the Number
class. But Number inherits from Magnitude---how do we
resolve this conflict? A subclass can place itself under a
superclass which allows some operations the subclass doesn't
wish to allow. All that you must do is make sure you intercept
these messages and return an error. So we will place
our new Complex class under Number, and make sure to disallow
comparisons.
One can reasonably ask whether the real and imaginary
parts of our complex number will be integer or floating
point. In the grand Smalltalk tradition, we'll just leave
them as objects, and hope that they respond to numeric messages
reasonably. If they don't, the user will doubtless
receive errors and be able to track back their mistake with
little fuss.
We'll define the four basic math operators, as well as
the (illegal) relationals. We'll add @code{printOn:} so that the
printing methods work, and that should give us our Complex
class. The class as presented suffers some limitations,
which we'll cover later in the chapter.
@example
Number subclass: Complex [
    | realpart imagpart |
    "This is a quick way to define class-side methods."
    Complex class >> new [
        <category: 'instance creation'>
        ^self error: 'use real:imaginary:'
    ]
    Complex class >> new: ignore [
        <category: 'instance creation'>
        ^self new
    ]
    Complex class >> real: r imaginary: i [
        <category: 'instance creation'>
        ^(super new) setReal: r setImag: i
    ]
    setReal: r setImag: i [
        <category: 'basic'>
        realpart := r.
        imagpart := i.
        ^self
    ]
    real [
        <category: 'basic'>
        ^realpart
    ]
    imaginary [
        <category: 'basic'>
        ^imagpart
    ]
    + val [
        <category: 'math'>
        ^Complex real: (realpart + val real)
            imaginary: (imagpart + val imaginary)
    ]
    - val [
        <category: 'math'>
        ^Complex real: (realpart - val real)
            imaginary: (imagpart - val imaginary)
    ]
    * val [
        <category: 'math'>
        ^Complex real: (realpart * val real) - (imagpart * val imaginary)
            imaginary: (imagpart * val real) + (realpart * val imaginary)
    ]
    / val [
        <category: 'math'>
        | d r i |
        d := (val real * val real) + (val imaginary * val imaginary).
        r := ((realpart * val real) + (imagpart * val imaginary)).
        i := ((imagpart * val real) - (realpart * val imaginary)).
        ^Complex real: r / d imaginary: i / d
    ]
    = val [
        <category: 'comparison'>
        ^(realpart = val real) & (imagpart = val imaginary)
    ]
    "All other comparison methods are based on <"
    < val [
        <category: 'comparison'>
        ^self shouldNotImplement
    ]
    printOn: aStream [
        <category: 'printing'>
        realpart printOn: aStream.
        aStream nextPut: $+.
        imagpart printOn: aStream.
        aStream nextPut: $i
    ]
]
@end example
There should be surprisingly little which is actually
new in this example. The printing method uses both @code{printOn:}
as well as @code{nextPut:} to do its printing. While we haven't
covered it, it's pretty clear that @code{$+} generates the ASCII
character @code{+} as an object@footnote{A @gst{} extension
allows you to type characters by ASCII code too, as in
@code{$<43>}.}, and @code{nextPut:} puts its argument
as the next thing on the stream.
The math operations all generate a new object, calculating
the real and imaginary parts, and invoking the Complex
class to create the new object. Our creation code is a
little more compact than earlier examples; instead of using
a local variable to name the newly-created object, we just
use the return value and send a message directly to the new
object. Our initialization code explicitly returns self;
what would happen if we left this off?
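
To try the class out, a sketch like the following can be evaluated once the definition above is loaded. The operands are arbitrary, and the results are read back through the @code{real} and @code{imaginary} accessors so that the check does not depend on how the number is printed:
@example
a := Complex real: 3 imaginary: 4.
b := Complex real: 1 imaginary: 2.
"Sum: (3+4i) + (1+2i)"
sum := a + b.
sum real printNl.
sum imaginary printNl.
"Product: (3+4i) * (1+2i)"
prod := a * b.
prod real printNl.
prod imaginary printNl.
"Equality compares both parts"
(a = b) printNl.
@end example
@noindent
The sum should have parts 4 and 6, the product parts -5 and 10, and the last line should print false.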
@node Inheritance and Polymorphism
@subsection Inheritance and Polymorphism
This is a good time to look at what we've done with the
two previous examples at a higher level. With the
NiledArray class, we inherited almost all of the functionality
of arrays, with only a little bit of code added to
address our specific needs. While you may not have thought
to try it, all the existing methods for an Array continue to
work without further effort; you might find it interesting to
ponder why the following still works:
@example
a := NiledArray new: 10
a at: 5 put: 1234
a do: [:i| i printNl ]
@end example
The strength of inheritance is that you focus on the incremental
changes you make; the things you don't change will generally
continue to work.
In the Complex class, the value of polymorphism was
exercised. A Complex number responds to exactly the same
set of messages as any other number. If you had handed this
code to someone, they would know how to do math with Complex
numbers without further instruction. Compare this with C,
where a complex number package would require the user to
first find out if the complex-add function was
complex_plus(), or perhaps complex_add(), or add_complex(),
or@dots{}
However, one glaring deficiency is present in the Complex class:
what happens if you mix normal numbers with Complex numbers?
Currently, the Complex class assumes that it will only
interact with other Complex numbers. But this is unrealistic:
mathematically, a ``normal'' number is simply one with an
imaginary part of 0. Smalltalk was designed to allow numbers
to coerce themselves into a form which will work with
other numbers.
The system is clever and requires very little additional
code. Unfortunately, it would have tripled the
amount of explanation required. If you're interested in how
coercion works in @gst{}, you should find the
Smalltalk library source, and trace back the execution of
the @code{retry:coercing:} messages. You want to consider the
value which the @code{generality} message returns for each type
of number. Finally, you need to examine the @code{coerce:} handling
in each numeric class.
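
Before reading the library source, you can watch coercion at work among the built-in number classes. This is only an exploratory sketch; the values answered by @code{generality} are implementation details, so none are listed here:
@example
"Mixed arithmetic between built-in number classes just works"
(3 + 0.5) printNl.
((1/2) + 1) printNl.
"Each class answers its own generality; compare the three values"
3 generality printNl.
(1/2) generality printNl.
0.5 generality printNl.
@end example
@noindent
The first two lines should print 3.5 and 3/2. Comparing the three generality values suggests which operand is converted when two different kinds of number meet.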
@node Streams
@section Smalltalk Streams
Our examples have used a mechanism extensively, even
though we haven't discussed it yet. The Stream class provides
a framework for a number of data structures, including
input and output functionality, queues, and endless sources
of dynamically-generated data. A Smalltalk stream is quite
similar to the UNIX streams you've used from C. A stream
provides a sequential view to an underlying resource; as you
read or write elements, the stream position advances until
you finally reach the end of the underlying medium. Most
streams also allow you to set the current position, providing
random access to the medium.
@menu
* The output stream:: Which, even though you maybe didn't know
it, we've used all the time
* Your own stream:: Which, instead, is something new
* Files:: Which are streams too
* Dynamic Strings:: A useful application of Streams
@end menu
@node The output stream
@subsection The Output Stream
The examples in this book all work because they write
their output to the @code{Transcript} stream. Each class implements
the @code{printOn:} method, and writes its output to the supplied
stream. The @code{printNl} method that all objects use simply
sends the current object a @code{printOn:} message whose argument is
@code{Transcript} (by default attached to the standard output stream
found in the @code{stdout} global). You can invoke the standard output stream
directly:
@example
'Hello, world' printOn: stdout
stdout inspect
@end example
@noindent
or you can do the same for the Transcript, which is yet another stream:
@example
'Hello, world' printOn: Transcript
Transcript inspect
@end example
@noindent
The last @code{inspect} statement will show you how the @code{Transcript} is
linked to @code{stdout}@footnote{Try executing it under Blox, where the
Transcript is linked to the window of the same name!}.
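
Since both objects are streams, they also accept the ordinary stream-writing messages. The sketch below contrasts @code{printOn:}, which writes the printed representation of the string (quotes included), with @code{nextPutAll:}, which writes its characters as they are:
@example
"printOn: writes the printed representation, quotes included"
'Hello, world' printOn: stdout.
stdout nl.
"nextPutAll: writes the characters of the string themselves"
stdout nextPutAll: 'Hello, world'; nl.
"The same messages work on the Transcript"
Transcript nextPutAll: 'Hello, world'; nl.
@end example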
@node Your own stream
@subsection Your Own Stream
Unlike a pipe you might create in C, the underlying
storage of a Stream is under your control. Thus, a Stream
can provide an anonymous buffer of data, but it can also
provide a stream-like interpretation to an existing array of
data. Consider this example:
@example
a := Array new: 10
a at: 4 put: 1234
a at: 9 put: 5678
s := ReadWriteStream on: a.
s inspect
s position: 1
s inspect
s nextPut: 11; nextPut: 22
(a at: 1) printNl
a do: [:x| x printNl]
s position: 2
s do: [:x| x printNl]
s position: 5
s do: [:x| x printNl]
s inspect
@end example
The key is the @code{on:} message; it tells a stream class to
create itself in terms of the existing storage. Because of
polymorphism, the object specified by @code{on:} does not have to
be an Array; any object which responds to numeric @code{at:} messages
can be used. If you happen to have the NiledArray
class still loaded from the previous chapter, you might try
streaming over that kind of array instead.
You may be wondering whether you're stuck with having to know
how much data will be queued in a Stream at the time you
create the stream. If you use the right class of stream,
the answer is no. A ReadStream provides read-only access to
an existing collection. You will receive an error if you
try to write to it. If you try to read off the end of the
stream, you will also get an error.
By contrast, WriteStream and ReadWriteStream (used in
our example) will tell the underlying collection to grow
when you write off the end of the existing collection. Thus,
if you want to write several strings, and don't want to add up their
lengths yourself:
@example
s := ReadWriteStream on: String new
s inspect
s nextPutAll: 'Hello, '
s inspect
s nextPutAll: 'world'
s inspect
s position: 1
s inspect
s do: [:c | stdout nextPut: c ]
s contents
@end example
In this case, we have used a String as the collection
for the Stream. The @code{nextPutAll:} messages add characters to the initially
empty string. Once we've added the data, you can
continue to treat the data as a stream. Alternatively, you
can ask the stream to return to you the underlying object.
After that, you can use the object (a String, in this example)
using its own access methods.
There are many amenities available on a stream object.
You can ask if there's more to read with @code{atEnd}. You can
query the position with @code{position}, and set it with @code{position:}.
You can see what will be read next with @code{peek}, and
you can read the next element with @code{next}.
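
A short sketch of these reading messages on a three-character string (the string is arbitrary):
@example
s := ReadStream on: 'abc'.
"peek looks at the next element without consuming it"
s peek printNl.
s next printNl.
"position reports how far the stream has advanced"
s position printNl.
s atEnd printNl.
"Consume whatever is left"
[ s atEnd ] whileFalse: [ s next printNl ].
s atEnd printNl.
@end example
@noindent
Both @code{peek} and the first @code{next} should answer $a; the loop then prints $b and $c, after which @code{atEnd} answers true.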
In the writing direction, you can write an element with
@code{nextPut:}. You don't need to worry about objects doing a
@code{printOn:} with your stream as a destination; this operation
ends up as a sequence of @code{nextPut:} operations to your stream.
If you have a collection of things to write, you can use
@code{nextPutAll:} with the collection as an argument; each member
of the collection will be written onto the stream. If you
want to write an object to the stream several times, you
can use @code{next:put:}, like this:
@example
s := ReadWriteStream on: (Array new: 0)
s next: 4 put: 'Hi!'
s position: 1
s do: [:x | x printNl]
@end example
@node Files
@subsection Files
Streams can also operate on files. If you wanted to
dump the file @file{/etc/passwd} to your terminal, you could
create a stream on the file, and then stream over its contents:
@example
f := FileStream open: '/etc/passwd' mode: FileStream read
f linesDo: [ :c | Transcript nextPutAll: c; nl ]
f position: 30
25 timesRepeat: [ Transcript nextPut: (f next) ]
f close
@end example
and, of course, you can load Smalltalk source code into your
image:
@example
FileStream fileIn: '/Users/myself/src/source.st'
@end example
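
Writing works the same way. The sketch below writes two lines to a scratch file and reads them back; the path @file{/tmp/gst-tutorial.txt} is only an example, so pick any location you can write to:
@example
"Write two lines to a scratch file"
f := FileStream open: '/tmp/gst-tutorial.txt' mode: FileStream write.
f nextPutAll: 'first line'; nl.
f nextPutAll: 'second line'; nl.
f close.
"Read the file back, one line at a time"
f := FileStream open: '/tmp/gst-tutorial.txt' mode: FileStream read.
f linesDo: [ :line | Transcript nextPutAll: line; nl ].
f close.
@end example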
@node Dynamic Strings
@subsection Dynamic Strings
Streams provide a powerful abstraction for a number of
data structures. Concepts like current position, writing
the next position, and changing the way you view a data
structure when convenient combine to let you write compact,
powerful code. The last example is taken from the actual
Smalltalk source code---it shows a general method for making
an object print itself onto a string.
@example
printString [
    | stream |
    stream := WriteStream on: (String new).
    self printOn: stream.
    ^stream contents
]
@end example
This method, residing in Object, is inherited by every
class in Smalltalk. The first line creates a WriteStream
which stores on a String whose length is currently 0
(@code{String new} simply creates an empty string). It
then sends @code{printOn:} to the current object. As the
object prints itself to ``stream'', the String grows to accommodate
new characters. When the object is done printing,
the method simply returns the underlying string.
As we've written code, the assumption has been that
printOn: would go to the terminal. But replacing a stream
to a file like @file{/dev/tty} with a stream to a data
structure (@code{String new}) works just as well. The last line
tells the Stream to return its underlying collection, which will
be the string which has had all the printing added to it. The
result is that the @code{printString} message returns an object of
the String class whose contents are the printed representation
of the very object receiving the message.
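
The same technique is handy in your own code whenever a string has to be assembled from pieces. A minimal sketch, with made-up text and values:
@example
ws := WriteStream on: String new.
ws nextPutAll: 'The answer is '.
"print: asks the object to do a printOn: onto this stream"
ws print: 42.
ws nextPutAll: ' and the key is '.
ws print: #answer.
ws contents displayNl.
@end example
@noindent
This should display: The answer is 42 and the key is #answer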
@node Exception handling
@section Exception handling in Smalltalk
Up to this point in the tutorial, you have used the original Smalltalk-80
error signalling mechanism:
@example
check: num [
    | c |
    c := history
        at: num
        ifAbsent: [ ^self error: 'No such check #' ].
    ^c
]
@end example
In the above code, if a matching check number is found, the method will
answer the object associated with it. If no such check is found, Smalltalk
will unwind the stack and print an error message including the message
you gave and stack information.
@example
CheckingAccount new: 31 "<0x33788>" error: No such check #
@i{@r{@dots{}blah blah@dots{}}}
CheckingAccount>>#error:
[] in Dictionary>>#at:ifAbsent:
Dictionary(HashedCollection)>>#findIndex:ifAbsent:
Dictionary>>#at:ifAbsent:
[] in CheckingAccount>>#check:
CheckingAccount>>#check:
UndefinedObject(Object)>>#executeStatements
@end example
Above we see the object that received the #error: message, the message
text itself, and the frames (innermost-first) running when the error was
captured by the system. In addition, the rest of the code in methods
like @code{CheckingAccount>>#check:} was not executed.
So simple error reporting gives us most of the features we want:
@itemize @bullet
@item
Execution stops immediately, preventing programs from continuing as if
nothing is wrong.
@item
The failing code provides a more-or-less useful error message.
@item
Basic system state information is provided for diagnosis.
@item
A debugger can drill further into the state, providing information like
details of the receivers and arguments on the stack.
@end itemize
However, there is a more powerful and more complex error handling mechanism:
@dfn{exceptions}. They are like ``exceptions'' in other programming
languages, but are more powerful and do not always indicate error
conditions. Even though we often use the term ``signal'' with regard
to them, do not confuse them with signals like @code{SIGTERM} and
@code{SIGINT} provided by some operating systems; they are a different
concept altogether.
Deciding to use exceptions instead of @code{#error:} is a matter of
aesthetics, but you can use a simple rule: use exceptions only if you want
to provide callers with a way to recover sensibly from certain errors,
and then only for signalling those particular errors.
For example, if you are writing a word processor, you might provide the
user with a way to make regions of text read-only. Then, if the user
tries to edit the text, the objects that model the read-only text can
signal a @code{ReadOnlyText} or other kind of exception, whereupon the
user interface code can stop the exception from unwinding and report
the error to the user.
When in doubt about whether exceptions would be useful, err on the side
of simplicity; use @code{#error:} instead. It is much easier to convert an
#error: to an explicit exception than to do the opposite.
@menu
* Creating exceptions:: Starting to use the mechanism
* Raising exceptions:: What to do when exceptional events happen
* Handling exceptions:: The other side
* When an exception isn't handled:: Default actions
* Creating new exception classes:: Your own exceptions
* Hooking into the stack unwinding:: An alternative exception handling system
* Handler stack unwinding caveat:: Differences with other languages
@end menu
@node Creating exceptions
@subsection Creating exceptions
@gst{} provides a few exceptions, all of which are subclasses of
@code{Exception}. Most of the ones you might want to create yourself are
in the @code{SystemExceptions} namespace. You can browse the builtin
exceptions in the base library reference, and look at their names with
@code{Exception printHierarchy}.
Some useful examples from the system exceptions are
@code{SystemExceptions.InvalidValue}, whose meaning should be obvious, and
@code{SystemExceptions.WrongMessageSent}, which we will demonstrate below.
Let's say that you change one of your classes to no longer support #new
for creating new instances. However, because you use the first-class
classes feature of Smalltalk, it is not so easy to find and change
all sends. Now, you can do something like this:
@example
Object subclass: Toaster [
    Toaster class >> new [
        ^SystemExceptions.WrongMessageSent
            signalOn: #new useInstead: #toast:
    ]
    Toaster class >> toast: reason [
        ^super new reason: reason; yourself
    ]
    ...
]
@end example
Admittedly, this doesn't quite fit the conditions for using exceptions.
However, since the exception type is already provided, it is probably
easier to use it than #error: when you are doing defensive programming
of this sort.
@node Raising exceptions
@subsection Raising exceptions
Raising an exception is really a two-step process. First, you create
the exception object; then, you send it @code{#signal}.
If you look through the hierarchy, you'll see many class methods
that combine these steps for convenience. For example, the class
@code{Exception} provides @code{#new} and @code{#signal}, where the
latter is just @code{^self new signal}.
You may be tempted to provide only a signalling variant of your own
exception creation methods. However, this creates the problem that
your subclasses will not be able to trivially provide new instance
creation methods.
@example
Error subclass: ReadOnlyText [
    ReadOnlyText class >> signalOn: aText range: anInterval [
        ^self new initText: aText range: anInterval; signal
    ]
    initText: aText range: anInterval [
        <category: 'private'>
        ...
    ]
]
@end example
Here, if you ever want to subclass @code{ReadOnlyText} and add new
information to the instance before signalling it, you'll have to use
the private method @code{#initText:range:}.
We recommend leaving out the signalling instance-creation variant in new
code, as it saves very little work and makes signalling code less clear.
Use your own judgement and evaluation of the situation to determine when
to include a signalling variant.
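
As a sketch of the plain two-step style, here is a hypothetical exception class; @code{OutOfPaper} and its @code{tray} accessors are invented for this example:
@example
Error subclass: OutOfPaper [
    | tray |
    tray [
        ^tray
    ]
    tray: aNumber [
        tray := aNumber
    ]
]

"Step one creates the exception and fills it in; step two signals it"
OutOfPaper new tray: 2; signal: 'tray 2 is empty'.
@end example
@noindent
With no handler in place, the last statement should report the error text and a backtrace, as @code{#error:} does. A subclass can add further accessors without touching any instance creation method.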
@node Handling exceptions
@subsection Handling exceptions
To handle an exception when it occurs in a particular block of code,
use @code{#on:do:} like this:
@example
^[someText add: inputChar beforeIndex: i]
on: ReadOnlyText
do: [:sig | sig return: nil]
@end example
This code will put a handler for @code{ReadOnlyText} signals on the
handler stack while the first block is executing. If such an exception
occurs, and it is not handled by any handlers closer to the point of
signalling on the stack (known as "inner handlers"), the exception object
will pass itself to the handler block given as the @code{do:} argument.
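
Here is a self-contained sketch you can evaluate directly; it uses only the standard @code{ZeroDivide} and @code{Error} classes, and the variable names are arbitrary:
@example
"The handler supplies a substitute value for the failed division"
result := [ 10 / 0 ] on: ZeroDivide do: [ :sig | sig return: 0 ].
result printNl.
"The exception object carries the text given to error:"
text := [ nil error: 'boom' ] on: Error do: [ :sig | sig return: sig messageText ].
text displayNl.
@end example
@noindent
This should print 0 and then boom, with no backtrace, because both exceptions were handled.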
You will almost always want to use this object to handle the exception
somehow. There are six basic handler actions, all sent as messages to
the exception object:
@table @code
@item return:
Exit the block that received this @code{#on:do:}, returning the given value.
You can also leave out the argument by sending @code{#return}, in which case
it will be nil. If you want this handler to also handle exceptions in
whatever value you might provide, you should use @code{#retryUsing:} with a
block instead.
@item retry
Acts sort of like a "goto" by restarting the first block. Obviously,
this can lead to an infinite loop if you don't fix the situation that
caused the exception.
@code{#retry} is a good way to implement reinvocation upon recovery,
because it does not increase the stack height. For example, this:
@example
frobnicate: n [
    ^[do some stuff with n]
        on: SomeError
        do: [:sig | sig return: (self frobnicate: n + 1)]
]
@end example
@noindent
should be replaced with retry:
@example
frobnicate: aNumber [
    | n |
    n := aNumber.
    ^[do some stuff with n]
        on: SomeError
        do: [:sig | n := 1 + n. sig retry]
]
@end example
@item retryUsing:
Like @code{#retry}, except that it effectively replaces the original
block with the one given as an argument.
@item pass
If you want to tell the exception to let an outer handler handle it,
use @code{#pass} instead of @code{#signal}. This is just like rethrowing
a caught exception in other languages.
@item resume:
This is the really interesting one. Instead of unwinding the stack,
this will effectively answer the argument from the @code{#signal} send.
Code that sends @code{#signal} to resumable exceptions can use this
value, or ignore it, and continue executing. You can also leave out
the argument, in which case the @code{#signal} send will answer nil.
Exceptions that want to be resumable must register this capability by
answering @code{true} from the @code{#isResumable} method, which is
checked on every @code{#resume:} send.
@item outer
This is like @code{#pass}, but if an outer handler uses @code{#resume:},
this handler block will be resumed (and @code{#outer} will answer the
argument given to @code{#resume:}) rather than the piece of code that
sent @code{#signal} in the first place.
@end table
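As a small illustration of two of these actions, the following sketch
uses the standard @code{Warning} and @code{ZeroDivide} classes; the
variable name @code{r} and the message text are made up for the example.
@example
| r |
"resume: makes the #signal: send answer 5, and the block carries on"
r := [(Warning signal: 'low disk space') + 1]
         on: Warning
         do: [:sig | sig resume: 5].
r printNl.    "6"

"retryUsing: abandons the failed block and evaluates the replacement"
r := [1 / 0]
         on: ZeroDivide
         do: [:sig | sig retryUsing: [0]].
r printNl.    "0"
@end example
@noindent
In the first case the stack is never unwound: the protected block simply
continues with the value supplied by the handler.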
None of these methods return to the invoking handler block except for
@code{#outer}, and that only in certain cases described for it above.
Exceptions provide several more features; see the methods on the classes
@code{Signal} and @code{Exception} for the various things you can do
with them. Fortunately, the above methods can do what you want in almost
all cases.
If you don't use one of these methods or another exception feature to exit
your handler, Smalltalk will assume that you meant to @code{sig return:}
whatever you answer from your handler block. We don't recommend relying
on this; you should use an explicit @code{sig return:} instead.
A quick shortcut to handling multiple exception types is the
@code{ExceptionSet}, which allows you to have a single handler for the
exceptions of a union of classes:
@example
^[do some stuff with n]
on: SomeError, ReadOnlyError
do: [:sig | ...]
@end example
In this code, any @code{SomeError} or @code{ReadOnlyError} signals will
be handled by the given handler block.
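Since @code{SomeError} and @code{ReadOnlyError} above are placeholders,
here is a sketch of the same idea using two standard classes,
@code{ZeroDivide} and @code{MessageNotUnderstood}. The selector
@code{foo} is deliberately one that @code{nil} does not understand.
@example
| r |
r := [nil foo]
         on: ZeroDivide, MessageNotUnderstood
         do: [:sig | sig return: -1].
r printNl.    "-1"
@end example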
@node When an exception isn't handled
@subsection When an exception isn't handled
Every exception chooses one of the above handler actions by default when
no handler is found, or when every handler it finds uses @code{#pass}.
This is invoked by sending @code{#defaultAction} to the exception object.
One example of a default action is presented above as part of the example
of @code{#error:} usage; that default action prints a message and a
backtrace, and unwinds the stack all the way.
The easiest way to choose a default action for your own exception classes
is to subclass from an exception class that already chose the right one,
as explained in the next section. For example, some exceptions, such
as warnings, resume by default, and thus should be treated as if they
will almost always resume.
Selecting by superclass is by no means a requirement. Specializing your
@code{Error} subclass to be resumable, or even to resume by default,
is perfectly acceptable when it makes sense for your design.
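A minimal sketch of such a specialization follows. The class name
@code{RecoverableError} is hypothetical; the only thing it does is
answer @code{true} from @code{#isResumable}, so that handlers may use
@code{#resume:} on it.
@example
Error subclass: RecoverableError [
    isResumable [
        "Allow handlers to resume this Error subclass"
        ^true
    ]
]

| r |
r := [(RecoverableError new signal: 'minor glitch') + 1]
         on: RecoverableError
         do: [:sig | sig resume: 41].
r printNl.
@end example
@noindent
Here the @code{#signal:} send answers the value given to @code{#resume:},
and the protected block goes on to add one to it.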
@node Creating new exception classes
@subsection Creating new exception classes
If you want code to be able to handle your signalled exceptions, you will
probably want to provide a way to pick those kinds out automatically.
The easiest way to do this is to subclass @code{Exception}.
First, you should choose an exception class to specialize. @code{Error}
is the best choice for non-resumable exceptions, and @code{Notification}
or its subclass @code{Warning} is best for exceptions that should resume
with @code{nil} by default.
Exceptions are just normal objects; include whatever information you think
would be useful to handlers. Note that there are two textual description
fields, a @dfn{description} and a @dfn{message text}. The description,
if provided, should be a more-or-less constant string answered from an
override of @code{#description}, meant to describe all instances
of your exception class. The message text is meant to be provided at
the point of signalling, and should be used for any extra information
that code might want to provide. Your signalling code can provide the
@code{messageText} by using @code{#signal:} instead of @code{#signal}.
This is yet another reason why signalling variants of instance creation
messages can be more trouble than they're worth.
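Putting these pieces together, a new exception class might look like
the sketch below. The class @code{InsufficientFunds}, its @code{needed}
field, and the strings are all invented for illustration.
@example
Error subclass: InsufficientFunds [
    | needed |
    needed [ ^needed ]
    needed: anAmount [ needed := anAmount ]
    "Constant text describing every instance of this class"
    description [ ^'Not enough money in the account' ]
]

[InsufficientFunds new needed: 50; signal: 'check 101 bounced']
    on: InsufficientFunds
    do: [:sig |
        sig description displayNl.
        sig messageText displayNl.
        sig needed printNl.
        sig return: nil].
@end example
@noindent
The handler can read both textual fields as well as the extra
@code{needed} information carried by the exception object.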
@node Hooking into the stack unwinding
@subsection Hooking into the stack unwinding
More often useful than even @code{#on:do:} is @code{#ensure:}, which
guarantees that some code is executed when the stack unwinds, whether
because of normal execution or because of a signalled exception.
Here is an example of use of @code{#ensure:} and a situation where the
stack can unwind even without a signal:
@example
Object subclass: ExecuteWithBreak [
| breakBlock |
break: anObject [
breakBlock value: anObject
]
valueWithBreak: aBlock [
"Sets up breakBlock before evaluating the block,
and restores the old one afterwards."
| oldBreakBlock |
oldBreakBlock := breakBlock.
^[breakBlock := [:arg | ^arg].
aBlock value]
ensure: [breakBlock := oldBreakBlock]
]
]
@end example
This class provides a way to stop the execution of a block without
exiting the whole method as using @code{^} inside a block would do.
The use of @code{#ensure:} guarantees (hence the name "ensure") that even
if @code{breakBlock} is invoked or an error is handled by unwinding,
the old ``break block'' will be restored.
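A possible way to use the class is sketched here; the variable names
@code{breaker} and @code{r} are only illustrative.
@example
| breaker r |
breaker := ExecuteWithBreak new.
r := breaker valueWithBreak: [
         'before break' displayNl.
         breaker break: 42.
         'never reached' displayNl].
r printNl.    "42"
@end example
@noindent
The second @code{displayNl} is skipped, because @code{break:} makes
@code{valueWithBreak:} return immediately with the given value.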
The definition of @code{breakBlock} is extremely simple; it is an
example of the general unwinding feature of blocks, that you have
probably already used:
@example
(history includesKey: num)
ifTrue: [ ^self error: 'Duplicate check number' ].
@end example
You have probably been using @code{#ensure:} without knowing. For example,
@code{File>>#withReadStreamDo:} uses it to ensure that the file is
closed when leaving the block.
@node Handler stack unwinding caveat
@subsection Handler stack unwinding caveat
One important difference between Smalltalk and other languages is
that when a handler is invoked, the stack is not unwound.
The Smalltalk exception system is designed this way because it's rare
to write code that could break because of this difference, and the
@code{#resume:} feature doesn't make sense if the stack is unwound.
It is easy enough to unwind a stack later, and is not so easy to wind
it again if done too early.
For almost all applications, this will not matter, but it technically
changes the semantics significantly, so it should be kept in mind. One
important case in which it might matter is when using @code{#ensure:}
blocks @emph{and} exception handlers. For comparison, this Smalltalk
code:
@example
| n |
n := 42.
[[self error: 'error'] ensure: [n := 24]]
on: Error
do: [:sig | n printNl. sig return].
n printNl.
@end example
@noindent
will put "42" followed by "24" on the transcript, because the @code{n :=
24} will not be executed until @code{sig return} is invoked, unwinding
the stack. Similar Java code acts differently:
@example
int n = 42;
try
@{
try @{throw new Exception ("42");@}
finally @{n = 24;@}
@}
catch (Exception e)
@{
System.out.println (n);
@}
System.out.println (n);
@end example
@noindent
printing "24" twice, because the stack unwinds before executing the
catch block.
@node Behind the scenes
@section Some nice stuff from the Smalltalk innards
Just like with everything else, you'd probably end up asking yourself:
how's it done? So here's this chapter, just to whet your appetite...
@menu
* Inside Arrays:: Delving into something old
* Two flavors of equality:: Delving into something new
* Why is #new there?!?:: Or, the truth on metaclasses
* Performance:: Hmm... they told me Smalltalk is slow...
@end menu
@node Inside Arrays
@subsection How Arrays Work
Smalltalk provides a very adequate selection of predefined
classes from which to choose. Eventually, however,
you will find the need to code a new basic data structure.
Because Smalltalk's most fundamental storage allocation
facilities are arrays, it is important that you understand
how to use them to gain efficient access to this kind of
storage.
@b{The Array Class.} Our examples have already shown the Array class, and
its use is fairly obvious. For many applications, it will
fill all your needs---when you need an array in a new class,
you keep an instance variable, allocate a new Array and
assign it to the variable, and then send array accesses via
the instance variable.
This technique even works for string-like objects,
although it is wasteful of storage. An Array object uses a
Smalltalk pointer for each slot in the array; its exact size
is transparent to the programmer, but you can generally
guess that it'll be roughly the word size of your machine.
@footnote{For @gst{}, the size of a C @code{long}, which
is usually 32 bits.} For storing an array of characters, therefore,
an Array works but is inefficient.
@b{Arrays at a Lower Level.} So let's step down to a lower level of data
structure. A ByteArray is much like an Array, but each slot holds only
an integer from 0 to 255, and each slot uses only a byte of
storage. If you only needed to store small quantities in
each array slot, this would therefore be a much more efficient
choice than an Array. As you might guess, this is the
type of array which a String uses.
Aha! But when you go back to chapter 9 and look at the
Smalltalk hierarchy, you notice that String does not inherit
from ByteArray. To see why, we must delve down yet another
level, and arrive at the basic methods for setting up the
structure of the instances of a class.
When we implemented our NiledArray example, we used
@code{<shape: #inherit>}. The shape is exactly that:
the fundamental structure of Smalltalk objects created within a given
class. Let's consider the differences in the next sub-sections.
@table @asis
@item Nothing
The default shape specifies the simplest
Smalltalk object. The object consists only of the storage
needed to hold the instance variables. In C, this would be
a simple structure with zero or more scalar fields.@footnote{C
requires one or more; zero is allowed in Smalltalk.}
@item @code{#pointer}
Storage is still allocated for any instance
variables, but the objects of the class must be created with a
@code{new:} message.
The number passed as an argument to @code{new:} causes the new
object, in addition to the space for instance variables, to
also have that many slots of unnamed (indexed) storage allocated.
The analog in C would be to have a dynamically allocated structure
with some scalar fields, followed at its end by an array of pointers.
@item @code{#byte}
The storage allocated as specified by @code{new:} is an array of bytes.
The analog in C would be a dynamically allocated structure with
scalar fields@footnote{This is not always true for other Smalltalk
implementations, which don't allow instance variables in variableByteSubclasses
and variableWordSubclasses.}, followed by an array of @code{char}.
@item @code{#word}
The storage allocated as specified by @code{new:} is an array of C unsigned longs,
which are represented in Smalltalk by Integer objects. The analog in
C would be a dynamically allocated structure with scalar fields, followed
by an array of @code{unsigned long}. This kind of subclass is only used in a few
places in Smalltalk.
@item @code{#character}
The storage allocated as specified by @code{new:} is an array of characters.
The analog in C would be a dynamically allocated structure with
scalar fields, followed by an array of @code{char}.
@end table
There are many more shapes for more specialized usage. All of them
specify the same kind of ``array-like'' behavior, with different
data types.
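As a brief sketch of what a shape gives you, the hypothetical class
below asks for byte-shaped indexed storage and then uses the
@code{at:} and @code{at:put:} methods it inherits from @code{Object}.
The name @code{TinyByteBuffer} is made up for this example.
@example
Object subclass: TinyByteBuffer [
    <shape: #byte>
    <comment: 'I hold raw bytes in indexed storage'>
]

| b |
b := TinyByteBuffer new: 4.    "four unnamed byte slots"
b at: 1 put: 255.
(b at: 1) printNl.
b basicSize printNl.
@end example
@noindent
Each slot can only hold an integer from 0 to 255; trying to store
anything else is an error.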
How do you access these new arrays? You already know how to access instance
variables---by name. But there doesn't seem to be a name for this new
storage. The way an object accesses it is to send itself
array-type messages like @code{at:}, @code{at:put:}, and so forth.
The problem is when an object wants to add a new level
of interpretation to these messages. Consider
a Dictionary---it is a pointer-holding object
but its @code{at:} message is in terms of a key, not an integer
index of its storage. Since it has redefined the @code{at:} message, how
does it access its fundamental storage?
The answer is that Smalltalk has defined @code{basicAt:} and
@code{basicAt:put:}, which will access the basic storage even when
the @code{at:} and @code{at:put:} messages have been defined to provide
a different abstraction.
This can get pretty confusing in the abstract, so let's
do an example to show how it's pretty simple in practice.
Smalltalk arrays tend to start at 1; let's define an array
type whose permissible range is arbitrary.
@example
ArrayedCollection subclass: RangedArray [
| offset |
<shape: #pointer>
<comment: 'I am an Array whose base is arbitrary'>
RangedArray class >> new: size [
<category: 'instance creation'>
^self new: size base: 1
]
RangedArray class >> new: size base: b [
<category: 'instance creation'>
^(super new: size) init: b
]
init: b [
<category: 'init'>
offset := (b - 1). "- 1 because basicAt: works with a 1 base"
^self
]
rangeCheck: i [
<category: 'basic'>
(i <= offset) | (i > (offset + self basicSize)) ifTrue: [
'Bad index value: ' printOn: stderr.
i printOn: stderr.
Character nl printOn: stderr.
^self error: 'illegal index'
]
]
at: i [
self rangeCheck: i.
^self basicAt: i - offset
]
at: i put: v [
self rangeCheck: i.
^self basicAt: i - offset put: v
]
]
@end example
The code has two parts: an initialization, which simply
records what index you wish the array to start with, and the
@code{at:} messages, which adjust the requested index so that the
underlying storage receives its 1-based index instead.
We've included a range check; its
utility will demonstrate itself in a moment:
@example
a := RangedArray new: 10 base: 5.
a at: 5 put: 0
a at: 4 put: 1
@end example
Since 4 is below our base of 5, a range check error occurs.
But this check can catch more than just our own misbehavior!
@example
a do: [:x| x printNl]
@end example
Our do: message handling is broken! The stack backtrace
pretty much tells the story:
@example
RangedArray>>#rangeCheck:
RangedArray>>#at:
RangedArray>>#do:
@end example
Our code received a do: message. We didn't define one, so
we inherited the existing do: handling. We see that an
Integer loop was constructed, that a code block was invoked,
and that our own at: code was invoked. When we range
checked, we trapped an illegal index. Just by coincidence,
this version of our range checking code also dumps the
index. We see that do: has assumed that all arrays start at
1.
The immediate fix is obvious; we implement our own do:
@example
RangedArray extend [
do: aBlock [
<category: 'basic'>
1 to: (self basicSize) do: [:x|
aBlock value: (self basicAt: x)
]
]
]
@end example
But the issues start to run deep. If our parent class
believed that it knew enough to assume a starting index of
1@footnote{Actually, in @gst{} @code{do:} is not the only
message assuming that.}, why didn't it also assume that it could
call basicAt:? The answer is that of the two choices, the designer
of the parent class chose the one which was less likely to cause
trouble; in fact all standard Smalltalk collections do have indices
starting at 1, yet not all of them are implemented so
that calling basicAt: would work.@footnote{Some of these classes
actually redefine @code{do:} for performance reasons, but they
would work even if the parent class' implementation of @code{do:}
was kept.}
Object-oriented methodology says that one object should be
entirely opaque to another. But what sort of privacy should
there be between a higher class and its subclasses? How
many assumptions can a subclass make about its superclass,
and how many can the superclass make before it begins
infringing on the sovereignty of its subclasses?
Alas, there are rarely easy answers, and this is just an example.
For this particular problem, there is an easy solution. When the
storage need not be accessed with peak efficiency, you can use the
existing array classes. When every access counts, having the
storage be an integral part of your own object allows for
the quickest access---but remember that when you move into this
area, inheritance and polymorphism become trickier, as
each level must coordinate its use of the underlying array
with other levels.
@node Two flavors of equality
@subsection Two flavors of equality
As first seen in chapter two, Smalltalk keys its dictionary
with things like @i{#word}, whereas we generally use
@i{'word'}. The former, as it turns out, is from class Symbol.
The latter is from class String. What's the real difference
between a Symbol and a String? To answer the question, we'll
use an analogy from C.
In C, if you have a function for comparing strings, you
might try to write it:
@example
int streq(char *p, char *q)
@{
return (p == q);
@}
@end example
But clearly this is wrong! The reason is that you can have
two copies of a string, each with the same contents but each
at its own address. A correct string compare must walk its
way through the strings and compare each element.
In Smalltalk, exactly the same issue exists, although
the details of manipulating storage addresses are hidden.
If we have two Smalltalk strings, both with the same contents,
we don't necessarily know if they're at the same
storage address. In Smalltalk terms, we don't know if
they're the same object.
The Smalltalk dictionary is searched frequently. To
speed the search, it would be nice to not have to compare
the characters of each element, but only compare the address
itself. To do this, you need to have a guarantee that all
strings with the same contents are the same object. The
String class, created like:
@example
y := 'Hello'
@end example
@noindent
does not satisfy this. Each time you execute this line, you
may well get a new object. But a very similar class, Symbol,
will always return the same object:
@example
y := #Hello
@end example
In general, you can use strings for almost all your tasks.
If you ever get into a performance-critical function which
looks up strings, you can switch to Symbol. It takes longer
to create a Symbol, and the memory for a Symbol is never
freed (since the class has to keep tabs on it indefinitely
to guarantee it continues to return the same object). You
can use it, but use it with care.
This tutorial has generally used the strcmp()-ish kind of
checks for equality. If you ever need to ask the question
``is this the same object?'', you use the @code{==} operator
instead of @code{=}:
@example
x := y := 'Hello'
(x = y) printNl
(x == y) printNl
y := 'Hel', 'lo'
(x = y) printNl
(x == y) printNl
x := #Hello
y := #Hello
(x = y) printNl
(x == y) printNl
@end example
Using C terms, @code{=} compares contents like @code{strcmp()}.
@code{==} compares storage addresses, like a pointer comparison.
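The link between the two classes can be seen by converting a String
to a Symbol with @code{asSymbol}. This is only a sketch; the variable
name @code{s} is arbitrary.
@example
| s |
s := 'Hel', 'lo'.                 "a freshly built String"
(s = 'Hello') printNl.            "true: same contents"
(s == 'Hello') printNl.           "false: different objects"
(s asSymbol == #Hello) printNl.   "true: Symbols are unique"
@end example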
@node Why is #new there?!?
@subsection The truth about metaclasses
Everybody, sooner or later, looks for the implementation of the
@code{#new} method in Object class. To their surprise, they
don't find it; if they're really smart, they search for implementors
of @code{#new} in the image and they find out it is implemented by
@code{Behavior}... which turns out to be a subclass of Object! The
truth starts showing to their eyes about that sentence that everybody
says but few people understand: ``classes are objects''.
Huh? Classes are objects?!? Let me explain.
@ifinfo
Open up an image; then type the text following the
@code{st>} prompt.
@end ifinfo
@ifhtml
Open up an image; then type the text following the
@code{st>} prompt.
@end ifhtml
@iftex
Open up an image; then type the text printed in
@t{mono-spaced} font.
@end iftex
@display
st> @t{Set superclass!}
HashedCollection
st> @t{HashedCollection superclass!}
Collection
st> @t{Collection superclass!}
Object
st> @t{Object superclass!}
nil
@end display
Nothing new for now. Let's try something else:
@display
st> @t{#(1 2 3) class!}
Array
st> @t{'123' class!}
String
st> @t{Set class!}
Set class
st> @t{Set class class!}
Metaclass
@end display
You get it, that strange @code{Set class} thing is something
called ``a meta-class''... let's go on:
@display
st> @t{^Set class superclass!}
Collection class
st> @t{^Collection class superclass!}
Object class
@end display
You see, there is a sort of `parallel' hierarchy between classes
and metaclasses. When you create a class, Smalltalk creates a
metaclass; and just like a class describes how methods for its
instances work, a metaclass describes how class methods for that
same class work.
@code{Set} is an instance of the metaclass, so when you invoke
the @code{#new} class method, you can also say you are invoking
an instance method implemented by @code{Set class}. Simply put,
class methods are a lie: they're simply instance methods that
are understood by instances of metaclasses.
Now you would expect that @code{Object class superclass} answers
@code{nil class}, that is @code{UndefinedObject}. Yet you saw that
@code{#new} is not implemented there... let's try it:
@display
st> @t{^Object class superclass!}
Class
@end display
Uh?!? Try to read it aloud: the @code{Object class} class inherits
from the @code{Class} class. @code{Class} is the abstract superclass
of all metaclasses, and provides the logic that allows you to create
classes in the image. But it is not the termination point:
@display
st> @t{^Class superclass!}
ClassDescription
st> @t{^ClassDescription superclass!}
Behavior
st> @t{^Behavior superclass!}
Object
@end display
Class is a subclass of other classes. @code{ClassDescription} is
abstract; @code{Behavior} is concrete but lacks the methods
and state that allow classes to have named instance variables,
class comments and more. Its instances are called
@emph{light-weight} classes because they don't have separate
metaclasses; instead, they all share @code{Behavior} itself as
their metaclass.
Evaluating @code{Behavior superclass} we have worked our way up to
class Object again: Object is the superclass of all instances as well
as all metaclasses. This complicated system is extremely powerful,
and allows you to do very interesting things that you probably did
without thinking about it---for example, using methods such as
@code{#error:} or @code{#shouldNotImplement} in class methods.
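You can check a few of these relationships yourself. The expressions
below are just a sketch using the standard reflection messages
@code{isKindOf:}, @code{inheritsFrom:}, @code{includesSelector:} and
@code{respondsTo:}.
@example
(Set isKindOf: Class) printNl.               "true: a class is an object"
(Set class inheritsFrom: Class) printNl.     "true"
(Behavior includesSelector: #new) printNl.   "true: here is #new"
(Set respondsTo: #error:) printNl.           "true: inherited from Object"
@end example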
Now, one final question and one final step: what are metaclasses
instances of? The question makes sense: if everything has a class,
should not metaclasses have one?
Evaluate the following:
@display
st> @t{meta := Set class}
st> @t{0 to: 4 do: [ :i |}
st> @t{ i timesRepeat: [ Transcript space ]}
st> @t{ meta printNl}
st> @t{ meta := meta class}
st> @t{]}
Set class
Metaclass
Metaclass class
Metaclass
Metaclass class
0
@end display
If you send @code{#class} repeatedly, it seems that you end up
in a loop made of class @code{Metaclass}@footnote{Which turns
out to be another subclass of @code{ClassDescription}.} and its
own metaclass, @code{Metaclass class}. It looks like class
Metaclass is @i{an instance of an instance of itself}.
To understand the role of @code{Metaclass}, it can be useful
to know that the class creation is implemented there.
Think about it.
@itemize @bullet
@item
@code{Random class} implements creation and
initialization of its instances' random number seed;
analogously, @code{Metaclass class} implements creation and
initialization of its instances, which are metaclasses.
@item
And @code{Metaclass} implements creation and initialization of
its instances, which are classes (subclasses of @code{Class}).
@end itemize
The circle is closed. In the end, this mechanism implements a
clean, elegant and (with some contemplation) understandable
facility for self-definition of classes. In other words, it
is what allows classes to talk about themselves, posing the
foundation for the creation of browsers.
@node Performance
@subsection The truth of Smalltalk performance
Everybody says Smalltalk is slow, yet this is not completely true for
at least three reasons. First, most of the time in graphical applications
is spent waiting for the user to ``do something'', and most of the time
in scripting applications (which @gst{} is particularly well
versed in) is spent in disk I/O; implementing a travelling salesman
problem in Smalltalk would indeed be slow, but for most real applications
you can indeed exchange performance for Smalltalk's power and development
speed.
Second, Smalltalk's automatic memory management can be faster than C's manual
one. Many C programs are sped up if you relink them with one of the
garbage collecting systems available for C or C++.
Third, even though very few Smalltalk virtual machines are as optimized as,
say, the Self environment (which reaches half the speed of optimized C!),
they do perform some optimizations on Smalltalk code which make them run
many times faster than a naive bytecode interpreter. Peter Deutsch, who
among other things invented the idea of a just-in-time compiler like those
you are used to seeing for Java@footnote{And like the one that @gst{}
includes as an experimental feature.}, once observed that implementing a
language like Smalltalk efficiently requires the implementor to cheat...
but that's okay as long as you don't get caught. That is, as long as you
don't break the language semantics. Let's look at some of these optimizations.
For certain frequently used ``special selectors'', the compiler emits a
send-special-selector bytecode instead of a send-message bytecode.
Special selectors have one of three behaviors:
@itemize @bullet
@item
A few selectors are assigned to special bytecodes solely in order to
save space. This is the case for @code{#do:} for example.
@item
Three selectors (@code{#at:}, @code{#at:put:}, @code{#size}) are
assigned to special bytecodes because they are subject to a special
caching optimization. These selectors often result in calling a
virtual machine primitive, so @gst{} remembers which primitive
was last called as the result of sending them. If we send @code{#at:}
100 times for the same class, the last 99 sends are directly mapped
to the primitive, skipping the method lookup phase.
@item
For some pairs of receiver classes and special selectors, the
interpreter never looks up the method in the class; instead it swiftly
executes the same code which is tied to a particular primitive. Of
course a special selector whose receiver or argument is not of the
right class to make a no-lookup pair is looked up normally.
@end itemize
No-lookup methods do contain a primitive number specification,
@code{<primitive: xx>}, but it is used only when the method is
reached through a @code{#perform:@dots{}} message send. Since
the method is not normally looked up, deleting the primitive number
specification cannot in general prevent this primitive from running.
No-lookup pairs are listed below:
@multitable @columnfractions .35 .1 .55
@item @code{Integer}/@code{Integer} @*
@code{Float}/@code{Integer} @*
@code{Float}/@code{Float}
@tab @ @* for
@tab @ @* @code{+ - * = ~= > < >= <=}
@item @code{Integer}/@code{Integer}
@tab for
@tab @code{// \\ bitOr: bitShift: bitAnd:}
@item Any pair of objects
@tab for
@tab @code{== isNil notNil class}
@item BlockClosure
@tab for
@tab @code{value value: blockCopy:}@footnote{You
won't ever send this message in Smalltalk programs. The compiler uses it when
compiling blocks.}
@end multitable
Other messages are open coded by the compiler. That is, there are
no message sends for these messages---if the compiler sees blocks
without temporaries and with the correct number of arguments at the
right places, the compiler unwinds them using jump bytecodes,
producing very efficient code. These are:
@example
to:by:do: if the second argument is an integer literal
to:do:
timesRepeat:
and:, or:
ifTrue:ifFalse:, ifFalse:ifTrue:, ifTrue:, ifFalse:
whileTrue:, whileFalse:
@end example
Other minor optimizations are done. Some are done by a peephole optimizer
which is run on the compiled bytecodes. Or, for example, when @gst{} pushes a
boolean value on the stack, it automatically checks whether the following
bytecode is a jump (which is a common pattern resulting from most of the
open-coded messages above) and combines the execution of the two bytecodes.
All these snippets can be optimized this way:
@example
1 to: 5 do: [ :i | @dots{} ]
a < b and: [ @dots{} ]
myObject isNil ifTrue: [ @dots{} ]
@end example
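To get a rough feel for what open coding buys you, you could time an
open-coded loop against the same loop written as a real message send
to an Interval. This is only a sketch: the iteration count is
arbitrary and the timings depend entirely on your machine.
@example
| n sum |
n := 1000000.
"to:do: with a literal block is open coded by the compiler"
(Time millisecondsToRun: [
    sum := 0.
    1 to: n do: [:i | sum := sum + i]]) printNl.
"here do: is an ordinary message sent to an Interval"
(Time millisecondsToRun: [
    sum := 0.
    (1 to: n) do: [:i | sum := sum + i]]) printNl.
@end example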
That's all. If you want to know more, look at the virtual machine's source
code in @file{libgst/interp-bc.inl} and at the compiler in
@file{libgst/comp.c}.
@node And now
@section Some final words
The question is always how far to go in one document.
At this point, you know how to create classes. You know how
to use inheritance, polymorphism, and the basic storage management
mechanisms of Smalltalk. You've also seen a sampling
of Smalltalk's powerful classes. The rest of this
chapter simply points out areas for further study; perhaps a
newer version of this document might cover these in further
chapters.
@table @b
@item Viewing the Smalltalk Source Code
Lots of experience can be gained by looking at the source code
for system methods; all of them are visible: data structure classes,
the innards of the magic that makes classes be themselves objects and
have a class, a compiler written in Smalltalk itself, the classes
that implement the Smalltalk GUI and those that wrap sockets.
@item Other Ways to Collect Objects
We've seen Array, ByteArray, Dictionary, Set, and the
various streams. You'll want to look at the Bag,
OrderedCollection, and SortedCollection classes. For special purposes,
you'll want to examine the CObject and CType hierarchies.
@item Flow of Control
@gst{} has support for non-preemptive multiple threads of
execution. The state is embodied in a Process class object;
you'll also want to look at the Semaphore and ProcessorScheduler
class.
@item Smalltalk Virtual Machine
@gst{} is implemented as a virtual instruction
set. By invoking @gst{} with the @code{-D} option, you can
view the byte opcodes which are generated as files on the
command line are loaded. Similarly, running @gst{}
with @code{-E} will trace the execution of instructions in your
methods.
You can look at the @gst{} source to gain more information
on the instruction set. With a few modifications, it is based
on the set described in the canonical book from two of the
original designers of Smalltalk: @i{Smalltalk-80: The Language
and its Implementation}, by Adele Goldberg and David Robson.
@item Where to get Help
The Usenet @t{comp.lang.smalltalk} newsgroup is read by many people
with a great deal of Smalltalk experience. There are several
commercial Smalltalk implementations; you can buy support for
these, though it isn't cheap. For the @gst{} system in
particular, you can try the mailing list at:
@example
@mailto{help-smalltalk@@gnu.org}
@end example
No guarantees, but the subscribers will surely do their best!
@end table
@node The syntax
@section A Simple Overview of Smalltalk Syntax
Smalltalk's power comes from its treatment of objects.
In this document, we've mostly avoided the issue of syntax
by using strictly parenthesized expressions as needed. When
this leads to code which is hard to read due to the density
of parentheses, a knowledge of Smalltalk's syntax can let
you simplify expressions. In general, if it was hard for
you to tell how an expression would parse, it will be hard
for the next person, too.
The following presents the grammar a couple
of related elements at a time. We use an EBNF style of
grammar. The form:
@example
[ @dots{} ]
@end example
@noindent
means that ``@dots{}'' can occur zero or one times.
@example
[ @dots{} ]*
@end example
@noindent
means zero or more;
@example
[ @dots{} ]+
@end example
@noindent
means one or more.
@example
@dots{} | @dots{} [ | @dots{} ]*
@end example
@noindent
means that one of the variants must be chosen. Characters
in double quotes refer to the literal characters. Most elements
may be separated by white space; where this is not legal, the
elements are presented without white space
between them.
@table @b
@item @t{methods: ``!'' id [``class''] ``methodsFor:'' string ``!'' [method ``!'']+ ``!''}
Methods are introduced by first naming a class (the id element),
specifying ``class'' if you're adding class methods
instead of instance methods, and sending a string argument
to the @code{methodsFor:} message. Each method is terminated with
an ``!''; two bangs in a row (with a space in the middle)
signify the end of the new methods.
@item @t{method: message [pragma] [temps] exprs}
@itemx @t{message: id | binsel id | [keysel id]+}
@itemx @t{pragma: ``<'' keymsg ``>''}
@itemx @t{temps: ``|'' [id]* ``|''}
A method definition starts out with a kind of template. The
message to be handled is specified with the message names
spelled out and identifiers in the place of arguments. A
special kind of definition is the pragma, which has not been
covered in this tutorial. It provides a way to mark a method
specially, and also serves as the interface to the underlying
Smalltalk virtual machine. temps is the declaration
of local variables. Finally, exprs (covered soon) is
the actual code for implementing the method.
@item @t{unit: id | literal | block | arrayconstructor | ``('' expr ``)''}
@itemx @t{unaryexpr: unit [ id ]+}
@itemx @t{primary: unit | unaryexpr}
These are the ``building blocks'' of Smalltalk expressions. A
unit represents a single Smalltalk value, with the highest
syntactic precedence. A unaryexpr is simply a unit which
receives a number of unary messages. A unaryexpr has the
next highest precedence. A primary is simply a convenient
left-hand-side name for one of the above.
@item @t{exprs: [expr ``.'']* [[``^''] expr]}
@itemx @t{expr: [id ``:='']* expr2} @*
@itemx @t{expr2: primary | msgexpr [ ``;'' cascade ]*}
A sequence of expressions is separated by dots and can end
with a returned value (@code{^}). There can be leading assignments;
unlike C, assignments apply only to simple variable names. An
expression is either a primary (with highest precedence) or
a more complex message. A cascade does not apply to primary
constructions, as they are too simple to require the construct.
Since all primary constructs are unary, you can just add more unary messages:
@example
1234 printNl printNl printNl
@end example
@item @t{msgexpr: unaryexpr | binexpr | keyexpr}
A complex message is either a unary message (which we have
already covered), a binary message (@code{+}, @code{-}, and so forth),
or a keyword message (@code{at:}, @code{new:}, @dots{}). Unary messages
have the highest precedence, followed by binary messages, and keyword
messages have the lowest precedence. Examine the two versions of the
following messages. The second pair has had parentheses added
to show the default precedence.
@example
myvar at: 2 + 3 put: 4
mybool ifTrue: [ ^ 2 / 4 roundup ]
(myvar at: (2 + 3) put: (4))
(mybool ifTrue: ([ ^ (2 / (4 roundup)) ]))
@end example
@item @t{cascade: id | binmsg | keymsg}
A cascade is used to direct further messages to the same
object that received the previous message. All three types of
messages can be sent this way (id is how you send a unary message).
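For instance, assuming the standard @code{OrderedCollection} class,
the following sketch sends @code{add:} twice and then @code{yourself}
to the same newly created collection:
@example
"add: answers its argument, so yourself is sent last to answer the collection"
(OrderedCollection new add: 1; add: 2; yourself) printNl.
@end example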
@item @t{binexpr: primary binmsg [ binmsg ]*}
@itemx @t{binmsg: binsel primary}
@itemx @t{binsel: binchar[binchar]}
A binary message is sent to an object, which primary has
identified. Each binary message is a binary selector, constructed
from one or two characters, and an argument which
is also provided by a primary. All binary selectors share the
same precedence and are evaluated from left to right. Consider:
@example
1 + 2 - 3 / 4
@end example
@noindent
which parses as:
@example
(((1 + 2) - 3) / 4)
@end example
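One consequence is that arithmetic selectors get no special treatment.
As a small illustration (the numbers are arbitrary):
@example
(2 + 3 * 4) printNl.      "prints 20, since it parses as (2 + 3) * 4"
(2 + (3 * 4)) printNl.    "prints 14"
@end example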
@item @t{keyexpr: keyexpr2 keymsg}
@itemx @t{keyexpr2: binexpr | primary}
@itemx @t{keymsg: [keysel keyw2]+}
@itemx @t{keysel: id``:''}
Keyword expressions are much like binary expressions, except
that the selectors are made up of identifiers with a colon
appended. Where the argument to a binary message can only
be a primary, the arguments to a keyword message can be either
binary expressions or primaries. This is because keyword
messages have the lowest precedence.
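As a brief sketch, using the standard @code{max:} message with
arbitrary numbers, both the receiver and the argument below are
binary expressions that are evaluated before the keyword message is sent:
@example
(3 + 4 max: 2 * 5) printNl.    "prints 10: parsed as (3 + 4) max: (2 * 5)"
@end example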
@item @t{block: ``['' [[``:'' id]* ``|'' ] [temps] exprs ``]''}
A code block is square brackets around a collection of
Smalltalk expressions. The leading ``: id'' part is for block
arguments. Note that it is possible for a block to have
temporary variables of its own.
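The following sketch shows all three parts together; the names
@code{a}, @code{b}, and @code{sum} are chosen only for illustration:
@example
"Two arguments (a, b) and one block-local temporary (sum)"
([:a :b | | sum | sum := a + b. sum * 2] value: 3 value: 4) printNl.    "prints 14"
@end example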
@item @t{arrayconstructor: ``@{'' exprs ``@}''}
Not covered in this tutorial, this syntax lets you create
arrays whose values are not literals, but are instead evaluated
at run-time. Compare @code{#(a b)}, which results in an Array
of two symbols @code{#a} and @code{#b}, to @code{@{a. b+c@}} which
results in an Array whose two elements are the contents of variable
@code{a} and the sum of @code{b} and @code{c}.
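A self-contained variant, with expressions chosen arbitrarily so that
no variables need to be defined beforehand, might look like this:
@example
"Each element is an expression evaluated when the statement runs"
@{3 + 4. 'abc' size. 10 factorial@} printNl.
@end example
@noindent
The resulting Array holds the three values 7, 3, and 3628800.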
@item @t{literal: number | string | charconst | symconst | arrayconst | binding | eval}
@itemx @t{arrayconst: ``#'' array | ``#'' bytearray}
@itemx @t{bytearray: ``['' [number]* ``]''}
@itemx @t{array: ``('' [literal | array | bytearray | arraysym]* ``)''}
@itemx @t{number: [[dig]+ ``r''] [``-''] [alphanum]+ [``.'' [alphanum]+] [exp [``-''][dig]+]}
@itemx @t{string: "'"[char]*"'"}
@itemx @t{charconst: ``$''char}
@itemx @t{symconst: ``#''symbol | ``#''string }
@itemx @t{arraysym: [id | ``:'']*}
@itemx @t{exp: ``d'' | ``e'' | ``q'' | ``s''}
We have already shown the use of many of these constants.
Although not covered in this tutorial, a number can be prefixed
with a radix (as in @code{16r3F}) and followed by an exponent in
scientific notation.
We have seen examples of character, string, and symbol constants.
Array constants are simple enough; they would look like:
@example
a := #(1 2 'Hi' $x #Hello 4 16r3F)
@end example
There are also ByteArray constants, whose elements are constrained
to be integers between 0 and 255; they would look like:
@example
a := #[1 2 34 16r8F 26r3H 253]
@end example
Finally, there are three types of floating-point constants with
varying precision (the one with the @code{e} being the least
precise, followed by @code{d} and @code{q}), and scaled-decimal
constants for a special class which does exact computations but
truncates comparisons to a given number of decimals. For example,
@code{1.23s4} means ``the value @code{1.23}, with four significant
decimal digits''.
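To make the number forms concrete, here is a short sketch with
arbitrarily chosen values, one per line of the grammar above:
@example
16r1F printNl.      "radix 16: the integer 31"
2r1010 printNl.     "radix 2: the integer 10"
1.5e2 printNl.      "floating point with exponent e"
1.5d2 printNl.      "floating point with exponent d"
1.5q2 printNl.      "floating point with exponent q"
1.23s4 printNl.     "scaled decimal with four decimal digits"
@end example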
@item @t{binding: ``#@{'' [id ``.'']* id ``@}''}
This syntax has not been used in the tutorial, and results in an
Association literal (known as a @dfn{variable binding}) tied to
the class that is named between braces. For example,
@code{#@{Class@} value} is the same as @code{Class}. The
dot syntax is required for supporting namespaces:
@code{#@{Smalltalk.Class@}} is the same as
@code{Smalltalk associationAt: #Class}, but is resolved
at compile-time rather than at run-time.
@item @t{symbol: id | binsel | keysel[keysel]*}
Symbols are mostly used to represent the names of methods.
Thus, they can hold simple identifiers, binary selectors,
and keyword selectors:
@example
#hello
#+
#at:put:
@end example
@item @t{eval: ``##('' [temps] exprs ``)''}
This syntax also has not been used in the tutorial, and results
in evaluating an arbitrarily complex expression at compile-time,
and substituting the result: for example @code{##(Object allInstances
size)} is the number of instances of @code{Object} held in the
image @emph{at the time the method is compiled}.
@item @t{id: letter[alphanum]*}
@itemx @t{binchar: ``+'' | ``-'' | ``*'' | ``/'' | ``~'' | ``|'' | ``,'' |}
@itemx @t{``<'' | ``>'' | ``='' | ``&'' | ``@@'' | ``?'' | ``\'' | ``%''}
@itemx @t{alphanum: dig | letter}
@itemx @t{letter: ``A''..``Z'' | ``a''..``z''}
@itemx @t{dig: ``0''..``9''}
These are the categories of characters and how they are combined
at the most basic level. binchar simply lists the
characters which can be combined to name a binary message.
@end table