Skip to content

Latest commit

 

History

History
941 lines (713 loc) · 37 KB

chapter7.md

File metadata and controls

941 lines (713 loc) · 37 KB

Chapter 7 {docsify-ignore}

STUDENT: Solving Algebra Word Problems

[This] is an example par excellence of the power of using meaning to solve linguistic problems.

-Marvin Minsky (1968) MIT computer scientist

S S TUDENT was another early language understanding program, written by Daniel Bobrow as his Ph.D. research project in 1964. It was designed to read and solve the kind of word problems found in high school algebra books. An example is:

If the number of customers Tom gets is twice the square of 20% of the number of advertisements he runs, and the number of advertisements is 45, then what is the number of customers Tom gets?

STUDENT could correctly reply that the number of customers is 162. To do this, STUDENT must be far more sophisticated than ELIZA; it must process and "understand" a great deal of the input, rather than just concentrate on a few key words. And it must compute a response, rather than just fill in blanks. However, we shall see that the STUDENT program uses little more than the pattern-matching techniques of ELIZA to translate the input into a set of algebraic equations. From there, it must know enough algebra to solve the equations, but that is not very difficult.

The version of STUDENT we develop here is nearly a full implementation of the original. However, remember that while the original was state-of-the-art as of 1964, AI has made some progress in a quarter century, as subsequent chapters will attempt to show.

7.1 Translating English into Equations The description of STUDENT is:

  1. Break the input into phrases that will represent equations.
  2. Break each phrase into a pair of phrases on either side of the = sign.
  3. Break these phrases down further into sums and products, and so on, until finally we bottom out with numbers and variables. (By "variable" here, I mean "mathematical variable," which is distinct from the idea of a "pattern-matching variable" as used in pat-match in chapter 6).
  4. Translate each English phrase into a mathematical expression. We use the idea of a rule-based translator as developed for ELIZA.
  5. Solve the resulting mathematical equations, coming up with a value for each unknown variable.
  6. Print the values of all the variables. For example, we might have a pattern of the form (I f ?x then ?y), with an associated response that says that ?x and ?y will each be equations or lists of equations. Applying the pattern to the input above, ?y would have the value (what is the number of customers Tom gets). Another pattern of the form (?x i s ?y) could have a response corresponding to an equation where ?x and ?y are the two sides of the equation. We could then make up a mathematical variable for (what) and another for (the number of customers Tom gets). We would recognize this later phrase as a variable because there are no patterns to break it down further. In contrast, the phrase (twice the square of 20 per cent of the number of advertisements he r uns) could match a pattern of the form (twi ce ?x) and transform to (* 2 (the square of 20 per cent of the number of advertisements he runs)), and by furtherapplyingpatternsof the form (the square of ?x) and (?x per cent of ?y) we could arrive at a final response of (* 2 (expt (* (/ 20 100) n) 2)), where . is the variable generated by (the number of advertisements he runs).

Thus, we need to represent variables, expressions, equations, and sets of equations. The easiest thing to do is to use something we know: represent them just as Lisp itself does. Variables will be symbols, expressions and equations will be nested

lists with prefix operators, and sets of equations will be lists of equations. With that in mind, we can define a list of pattern-response rules corresponding to the type of statements found in algebra word problems. The structure definition for a rule is repeated here, and the structure exp, an expression, is added. 1 hs and rhs stand for left- and right-hand side, respectively. Note that the constructor mkexp is defined as a constructor that builds expressions without taking keyword arguments. In general, the notation (: constructor fn args) creates a constructor function with the given name and argument Ust.^

(defstruct (rule (:type list)) pattern response)

(defstruct (exp (:type list) (:constructor mkexp (Ihs op rhs))) op Ihs rhs)

(defun exp-p (x) (consp x)) (defun exp-args (x) (rest x))

We ignored commas and periods in ELIZA, but they are crucial for STUDENT, SO we must make allowances for them. The problem is that a "," in Lisp normally can be used only within a backquote construction, and a "." normally can be used only as a decimal point or in a dotted pair. The special meaning of these characters to the Lisp reader can be escaped either by preceding the character with a backslash (,) or by surrounding the character by vertical bars (I J) .

(pat-match-abbrev '?x* '(?* ?x)) (pat-match-abbrev '?y* (? ?y))

(defparameter student-rules (mapcar #'expand-pat-match-abbrev

'(((?x* I.I) ?x) ((?x* I.I ?y*) (?x ?y)) ((if ?x* IJ then ?y*) (?x ?y)) ((if ?x* then ?y*) (?x ?y)) ((if ?x* IJ ?y*) (?x ?y)) ((?x* . and ?y*) (?x ?y)) ((find ?x* and ?y*) ((= to-find-1 ?x) (= to-find-2 ?y))) ((find ?x*) (= to-find ?x)) ((?x* equals ?y*) (= ?x ?y)) ((?x* same as ?y*) (= ?x ?y)) ((?x* = ?y*) (= ?x ?y)) ((?x* is equal to ?y*) (= ?x ?y)) ((?x* is ?y*) (= ?x ?y)) ((?x* - ?y*) (- ?x ?y)) ((?x* minus ?y*) (- ?x ?y))

^Page 316 of Common Lisp the Language says, "Because a constructor of this type operates By Order of Arguments, it is sometimes known as a BOA constructor."

((difference between ?x* and ?y*) (- ?y ?x)) ((difference ?x* and ?y*) (- ?y ?x)) ((?x* + ?y*) (+ ?x ?y)) ((?x* plus ?y*) (+ ?x ?y)) ((sum ?x* and ?y*) (+ ?x ?y)) ((product ?x* and ?y*) (* ?x ?y)) ((?x* * ?y*) (* ?x ?y)) ((?x* times ?y*) (* ?x ?y)) ((?x* / ?y*) (/ ?x ?y)) ((?x* per ?y*) (/ ?x ?y)) ((?x* divided by ?y*) (/ ?x ?y)) ((half ?x*) (/ ?x 2)) ((one half ?x*) (/ ?x 2)) ((twice ?x*) (* 2 ?x)) ((square ?x*) (* ?x ?x)) ((?x* % less than ?y*) (* ?y (/ (- 100 ?x) 100))) ((?x* % more than ?y*) (* ?y (/ (+ 100 ?x) 100))) ((?x* % ?y*) (* (/ ?x 100) ?y)))))

The main section of STUDENT will search through the list of rules for a response, just

as ELIZA did. The first point of deviation is that before we substitute the values of the

pat-match variables into the response, we must first recursively translate the value

of each variable, using the same list of pattern-response rules. The other difference

is that once we're done, we don't just print the response; instead we have to solve the

set of equations and print the answers. The program is summarized in figure 7.1. Before looking carefully at the program, let's try a sample problem: "If . is 3, what is twice z?" Applying the rules to the input gives the following trace:

Input: (If . is 3. what is twice z) Rule: ((if ?x IJ ?y) (?x ?y)) Binding: ((?x . (z is 3)) (?y . (what is twice z)))

Input: (z is 3) Rule: ((?x is ?y) (= ?x ?y)) Result: (= . 3)

Input: (what is twice . ?) Rule: ((?x is ?y) (= ?x ?y)) Binding: ((?x . what) (?y . (twice z)))

Input: (twice z) Rule: ((twice ?x) (* 2 ?x)) Result: (* 2 z)

Result: (= what (* 2 z)) Result: ((= . 3) (= what (* 2 z)))

There are two minor complications. First, we agreed to implement sets of equations as lists of equations. For this example, everything worked out, and the response

Top-Level Function

Student Solve certain algebra word problems.

Special Variables

student-rules A list of pattern/response pairs.

Data Types

exp An operator and its arguments. rule A pattern and response.

Major Functions

translate-to-expression Translate an English phrase into an equation or expression. translate-pair Translate the value part of the pair into an equation or expression. create-list-of-equations Separate out equations embedded in nested parens. solve-equations Print the equations and their solution. solve Solve a system of equations by constraint propagation.

Auxiliary Fimctions

isolate Isolate the lone variable on the left-hand side of an expression.

noise-word-p Is this a low-content word that can be safely ignored?

make-variable Create a variable name based on the given list of words.

print-equations Print a list of equations.

inverse-op I.e., the inverse of + is -. unknown-p Is the argument an unknown (variable)?

in-exp True if X appears anywhere in exp. no-unknown Returns true if there are no unknowns in exp. one-unknown Returns the single unknown in exp, if there is exactly one.

commutative-p Is the operator commutative?

solve-arithmetic Perform arithmetic on rhs of an equation.

binary-exp-p Is this a binary expression?

prefix->infix Translate prefix to infix expressions.

mkexp Make an expression.

Previously Defined Functions

pat-match Match pattern against an input, (p. 180) rule-based-translator Apply a set of rules, (p. 189)

Figure 7.1: Glossary for the STUDENT Program

was a list of two equations. But if nested patterns are used, the response could be something like (( = a 5) ((= b (+ a 1)) (= c (->-a b)))), which is not a list of equations. The function create -1 i st-of-equati ons transforms a response like this into a proper list of equations. The other complication is choosing variable names. Givenalistof words like (the number of customers Tom gets), we want to choose a symbol to represent it. We will see below that the symbol customers is chosen, but that there are other possibilities.

Here is the main function for STUDENT. It first removes words that have no content, then translates the input to one big expression with translate-to-expression, and breaks that into separate equations with create-list-of-equations. Finally, the function solve-equations does the mathematics and prints the solution.

(defun Student (words) "Solve certain Algebra Word Problems." (solve-equations

(create-list-of-equations (translate-to-expression (remove-if #*noise-word-p words)))))

The function translate-to-expression is a rule-based translator. It either finds some rule in student-rules to transform the input, or it assumes that the entire input represents a single variable. The function translate-pair takes a variable/value binding pair and translates the value by a recursive call to translate-to - express ion.

(defun translate-to-expression (words) "Translate an English phrase into an equation or expression." (or (rule-based-translator

words student-rules :rule-if #'rule-pattern :rule-then #*rule-response :action #'(lambda (bindings response)

(sublis (mapcar #'translate-pair bindings) response))) (make-variable words)))

(defun translate-pair (pair) "Translate the value part of the pair into an equation or expression." (cons (binding-var pair)

(translate-to-expression (binding-val pair))))

The function create-1 ist-of-equations takes a single expression containing embedded equations and separates them into a list of equations:

(defun create-1ist-of-equations (exp) "Separate out equations embedded in nested parens." (cond ((null exp) nil)

((atom (first exp)) (list exp)) (t (append (create-1ist-of-equations (first exp)) (create-1ist-of-equations (rest exp))))))

Finally, the function make-vari abl e creates a variable to represent a Ust of v^ords. We do that by first removing all "noise words" from the input, and then taking the first symbol that remains. So, for example, "the distance John traveled" and "the distance traveled by John" will both be represented by the same variable, di stance, which is certainly the right thing to do. However, "the distance Mary traveled" will also be represented by the same variable, which is certainly a mistake. For (the number of customers Tom gets), the variable will be customers, since the, of and number are all noise words. This will match (the customers mentioned above) and

(the number of customers), but not (Tom's customers). For now, we will accept the first-non-noise-word solution, but note that exercise 7.3 asks for a correction.

(defun make-variable (words) "Create a variable name based on the given list of words" The list of words will already have noise words removed (first words))

(defun noise-word-p (word) "Is this a low-content word that can be safely ignored?" (member word '(a an the this number of $)))

7.2 Solving Algebraic Equations The next step is to write the equation-solving section of STUDENT. This is more an exercise in elementary algebra than in AI, but it is a good example of a symbol- manipulation task, and thus an interesting programming problem.

The STUDENT program mentioned the function sol ve-equati ons, passing it one argument, a Hst of equations to be solved, sol ve-equati ons prints the Hst of equations, attempts to solve them using sol ve, and prints the result.

(defun solve-equations (equations) "Print the equations and their solution " (print-equations "The equations to be solved are:" equations) (print-equations "The solution is:" (solve equations nil)))

The real work is done by sol ve, which has the following specification: (1) Find an equation with exactly one occurrence of an unknown in it. (2) Transform that equation so that the unknown is isolated on the left-hand side. This can be done if we limit the operators to +, -, *, and /. (3) Evaluate the arithmetic on the right-hand side, yielding a numeric value for the unknown. (4) Substitute the numeric value for the unknown in all the other equations, and remember the known value. Then try to solve the resulting set of equations. (5) If step (1) fails—if there is no equation with exactly one unknown—then just return the known values and don't try to solve anything else.

The function sol ve is passed a system of equations, along with a list of known variable/value pairs. Initially no variables are known, so this list will be empty, sol ve goes through the list of equations searching for an equation with exactly one unknown. If it can find such an equation, it caHs isolate to solve the equation in terms of that one unknown, solve then substitutes the value for the variable throughout the list of equations and calls itself recursively on the resulting list. Each

time solve calls itself, it removes one equation from the Ust of equations to be solved, and adds one to the Ust of known variable/value pairs. Since the list of equations is always growing shorter, sol ve must eventually terminate.

(defun solve (equations known)

"Solve a system of equations by constraint propagation." Try to solve for one equation, and substitute its value into the others. If that doesn't work, return what is known,

(or (some #*(lambda (equation) (let ((x (one-unknown equation))) (when X (let ((answer (solve-arithmetic (isolate equation x)))) (solve (subst (exp-rhs answer) (exp-lhs answer) (remove equation equations)) (cons answer known)))))) equations) known))

isolate is passed an equation guaranteed to have one unknown. It returns an equivalent equation with the unknown isolated on the left-hand side. There are five cases to consider: when the unknown is alone on the left, we're done. The second case is when the unknown is anywhere on the right-hand side. Because is commutative, we can reduce the problem to solving the equivalent equation with left- and right-hand sides reversed.

Next we have to deal with the case where the unknown is in a complex expression on the left-hand side. Because we are allowing four operators and the unknown can be either on the right or the left, there are eight possibilities. Letting X stand for an expression containing the unknown and A and . stand for expressions with no unknowns, the possibilities and their solutions are as follows:

(1)XA=B X=B/A (5)AX=B X=B/A (2)X+A=B ^ X=B-A (6)A+X=B ^ X=B-A (3)X/A=B => X=B*A (7)A/X=B X=A/B (4)X-A=B X=B+A (8)A-X=B => X=A-B

Possibilities (1) through (4) are handled by case III, (5) and (6) by case IV, and (7) and (8) by case V. In each case, the transformation does not give us the final answer, since X need not be the unknown; it might be a complex expression involving the unknown. So we have to caU i sol ate again on the resulting equation. The reader should try to verify that transformations (1) to (8) are valid, and that cases III to V implement them properly.

(defun isolate (e x)

"Isolate the lone . in e on the left-hand side of e." This assumes there is exactly one . in e, and that e is an equation,

(cond ((eq (exp-lhs e) x) Case I: X = A -> X = . e)

((in-exp X (exp-rhs e)) ;; Case II: A = f(X) -> f(X) = A (isolate (mkexp (exp-rhs e) '= (exp-lhs e)) x))

((in-exp X (exp-lhs (exp-lhs e))) Case III: f(X)*A = . -> f(X) = B/A (isolate (mkexp (exp-lhs (exp-lhs e)) *=

(mkexp (exp-rhs e) (inverse-op (exp-op (exp-lhs e))) (exp-rhs (exp-lhs e)))) x))

((commutative-p (exp-op (exp-lhs e))) :; Case IV: A*f(X) = . -> f(X) = B/A (isolate (mkexp (exp-rhs (exp-lhs e)) ' =

(mkexp (exp-rhs e) (inverse-op (exp-op (exp-lhs e))) (exp-lhs (exp-lhs e)))) x))

(t :; Case V: A/f(X) = . -> f(X) = A/B (isolate (mkexp (exp-rhs (exp-lhs e)) ' =

(mkexp (exp-lhs (exp-lhs e)) (exp-op (exp-lhs e)) (exp-rhs e))) x))))

Recall that to prove a function is correct, we have to prove both that it gives the correct answer when it terminates and that it will eventually terminate. For a recursive function with several alternative cases, we must show that each alternative is valid, and also that each alternative gets closer to the end in some way (that any recursive calls involve 'simpler' arguments). For i sol ate, elementary algebra will show that each step is valid—or at least nearly valid. Dividing both sides of an equation by 0 does not yield an equivalent equation, and we never checked for that. It's also possible that similar errors could sneak in during the call to eval. However, if we assume the equation does have a single valid solution, then i sol ate performs only legal transformations.

The hard part is to prove that 1 sol ate terminates. Case I clearly terminates, and the others all contribute towards isolating the unknown on the left-hand side. For any equation, the sequence will be first a possible use of case II, followed by a number of recursive calls using cases III to V. The number of calls is bounded by the number of subexpressions in the equation, since each successive call effectively removes an expression from the left and places it on the right. Therefore, assuming the input is

of finite size, we must eventually reach a recursive call to i sol ate that will use case I and terminate.

When i sol ate returns, the right-hand side must consist only of numbers and operators. We could easily write a function to evaluate such an expression. However, we don't have to go to that effort, since the function already exists. The data structure exp was carefully selected to be the same structure (lists with prefix functions) used by Lisp itself for its own expressions. So Lisp will find the right-hand side to be an acceptable expression, one that could be evaluated if typed in to the top level. Lisp evaluates expressions by calling the function eval, so we can call eval directly and have it return a number. The function sol ve -a ri t hmet i c returns an equation of the form (=var number).

Auxiliary functions for sol ve are shown below. Most are straightforward, but I will remark on a few of them. The function pref ix->inf ix takes an expression in prefix notation and converts it to a fully parenthesized infix expression. Unlike i sol ate, it assumes the expressions will be implemented as lists, pref i x->i nf i . is used by pri nt-equati ons to produce more readable output.

(defun print-equations (header equations) "Print a list of equations." (format t "%a"'{% { a}}%" header

(mapcar #*prefix->infix equations)))

(defconstant operators-and-inverses '((+ -) (- +) (* /) (/ *) (= =)))

(defun inverse-op (op) (second (assoc op operators-and-inverses)))

(defun unknown-p (exp) (symbolp exp))

(defun in-exp (x exp) "True if X appears anywhere in exp" (or (eq . exp)

(and (exp-p exp) (or (in-exp . (exp-lhs exp)) (in-exp . (exp-rhs exp))))))

(defun no-unknown (exp) "Returns true if there are no unknowns in exp." (cond ((unknown-p exp) nil)

((atom exp) t) ((no-unknown (exp-lhs exp)) (no-unknown (exp-rhs exp))) (t nil)))

(defun one-unknown (exp) "Returns the single unknown in exp, if there is exactly one." (cond ((unknown-p exp) exp)

((atom exp) nil) ((no-unknown (exp-lhs exp)) (one-unknown (exp-rhs exp))) ((no-unknown (exp-rhs exp)) (one-unknown (exp-lhs exp))) (t nil)))

(defun commutative-p (op) "Is operator commutative?" (member op '(+*=)))

(defun solve-arithmetic (equation) "Do the arithmetic for the right-hand side." This assumes that the right-hand side is in the right form, (mkexp (exp-lhs equation) *= (eval (exp-rhs equation))))

(defun binary-exp-p (x) (and (exp-p x) (= (length (exp-args x)) 2)))

(defun prefix->infix (exp) "Translate prefix to infix expressions." (if (atom exp) exp

(mapcar #'prefix->infix

(if (binary-exp-p exp) (list (exp-lhs exp) (exp-op exp) (exp-rhs exp)) exp))))

Here's an example of sol ve-equati ons in action, with a system of two equations.

The reader should go through the trace, discovering which case was used at each call

to i sol ate, and verifying that each step is accurate.

(trace isolate solve) (isolate solve)

(solve-equations '((= (+ 3 4) (* (- 5 (+ 2 x)) 7)) (= (+ (* 3 X) y) 12)))

The equations to be solved are: (3 + 4) = ((5 - (2 + X)) * 7) ((3 * X) + Y) = 12

(1 ENTER SOLVE: ((= (+ 3 4) (* (- 5 (+ 2 X)) 7)) (= (+ (* 3 X) Y) 12)) NIL) (1 ENTER ISOLATE: (= (+ 3 4) (* (- 5 (+ 2 X)) 7)) X) (2 ENTER ISOLATE: (= (* (- 5 (+ 2 X)) 7) (+ 3 4)) X) (3 ENTER ISOLATE: (= (- 5 (+ 2 X)) (/ (+ 3 4) 7)) X)

(4 ENTER ISOLATE: (= (+ 2 X) (- 5 (/ (+ 3 4) 7))) X) (5 ENTER ISOLATE: (= X (- (- 5 (/ (+ 3 4) 7)) 2)) X) (5 EXIT ISOLATE: (= X (- (- 5 (/ (+ 3 4) 7)) 2)))

(4 EXIT ISOLATE: (= X (- (- 5 (/ (+ 3 4) 7)) 2)))

(3 EXIT ISOLATE: (= . (- (- 5 (/ (+ 3 4) 7)) 2)))

(2 EXIT ISOLATE: (= X (- (- 5 (/ (+ 3 4) 7)) 2)))

(1 EXIT ISOLATE: (= X (- (- 5 (/ (+ 3 4) 7)) 2)))

(2 ENTER SOLVE: ((= (+ (* 3 2) Y) 12)) ((= X 2)))

(1 ENTER ISOLATE: (= (+ (* 3 2) Y) 12) Y)

(2 ENTER ISOLATE: (= Y (- 12 (* 3 2))) Y)

(2 EXIT ISOLATE: (= Y (- 12 (* 3 2))))

(1 EXIT ISOLATE: (= Y (- 12 (* 3 2))))

(3 ENTER SOLVE: NIL ((= Y 6) (= X 2)))

(3 EXIT SOLVE: ((= Y 6) (= X 2)))

(2 EXIT SOLVE: ((= Y 6) (= X 2)))

(1 EXIT SOLVE: ((= Y 6) (= X 2)))

The solution is:

Y = 6

X = 2

NIL

Now let's tackle the format string "'^%a{% "{ a''}}%" in print-equations. This may look like random gibberish, but there is actually sense behind it. format processes the string by printing each character, except that""" indicates some special formatting action, depending on the following character. The combination "~%" prints a newline, and "a" prints the next argument to format that has not been used yet. Thus the first four characters of the format string, " ", print a newline followed by the argument header. The combination "{" treats the corresponding argument as a list, and processes each element according to the specification between the " ~{" and the next" ~}". In this case, equati ons is a list of equations, so each one gets printed with a newline ("") followed by two spaces, followed by the processing of the equation itself as a list, where each element is printed in the "~a" format and preceded by a blank. The t given as the first argument to format means to print to the standard output; another output stream may be specified there.

One of the annoying minor holes in Lisp is that there is no standard convention on where to print newlines! In C, for example, the very first line of code in the reference manual is

printfC'hello, worldXn");

This makes it clear that newlines are printed after each line. This convention is so ingrained in the UNIX world that some UNIX programs will go into an infinite loop if the last line in a file is not terminated by a newline. In Lisp, however, the function pri nt puts in a newline before the object to be printed, and a space after. Some Lisp programs carry the newline-before policy over to format, and others use the newline- af ter policy. This only becomes a problem when you want to combine two programs written under different policies. How did the two competing policies arise? In UNIX there was only one reasonable policy, because all input to the UNIX interpreter (the

shell) is terminated by newlines, so there is no need for a newline-before. In some Lisp interpreters, however, input can be terminated by a matching right parenthesis. In that case, a newline-before is needed, lest the output appear on the same line as the input.

▣ Exercise 7.1 [m] Implement pri nt-equati ons using only primitive printing functions such as terpri and pri nc, along with explicit loops.

7.3 Examples Now we move on to examples, taken from Bobrow's thesis. In the first example, it is necessary to insert a "then" before the word "what" to get the right answer:

(student '(If the number of customers Tom gets is twice the square of 20 % of the number of advertisements he runs I.I and the number of advertisements is 45 I J then what is the number of customers Tom gets ?))

The equations to be solved are: CUSTOMERS = (2 * (((20 / 100) * ADVERTISEMENTS) *

((20 / 100) * ADVERTISEMENTS))) ADVERTISEMENTS = 45 WHAT = CUSTOMERS

The solution is: WHAT = 162 CUSTOMERS = 162 ADVERTISEMENTS = 45

NIL

Notice that our program prints the values for all variables it can solve for, while Bobrow's program only printed the values that were explicitly asked for in the text. This is an example of "more is less"—it may look impressive to print all the answers, but it is actually easier to do so than to decide just what answers should be printed. The following example is not solved correctly:

(student '(The daily cost of living for a group is the overhead cost plus the running cost for each person times the number of people in the group I.I This cost for one group equals $ 100 I,I and the number of people in the group is 40 I.I If the overhead cost is 10 times the running cost I,I find the overhead and running cost for each person I.I))

The equations to be solved are: DAILY = (OVERHEAD + (RUNNING * PEOPLE)) COST = 100 PEOPLE = 40 OVERHEAD = (10 * RUNNING) TO-FIND-1 = OVERHEAD TO-FIND-2 = RUNNING

The solution is: PEOPLE = 40 COST = 100

NIL

This example points out two important limitations of our version of student as compared to Bobrow's. The first problem is in naming of variables. The phrases "the daily cost of living for a group" and "this cost" are meant to refer to the same quantity, but our program gives them the names daily and cost respectively. Bobrow's program handled naming by first considering phrases to be the same only if they matched perfectly. If the resulting set of equations could not be solved, he would try again, this time considering phrases with words in common to be identical. (See the following exercises.)

The other problem is in our sol ve function. Assuming we got the variables equated properly, sol ve would be able to boil the set of equations down to two:

100 = (OVERHEAD + (RUNNING * 40)) OVERHEAD = (10 * RUNNING)

This is a set of two linear equations in two unknowns and has a unique solution at RUNNING = 2, OVERHEAD = 20. But our version of sol ve couldn't find this solution, since it looks for equations with one unknown. Here is another example that student handles well:

(student '(Fran's age divided by Robin's height is one half Kelly's 10 I.I Kelly's IQ minus 80 is Robin's height I.I If Robin is 4 feet tall 1,1 how old is Fran ?))

The equations to be solved are: (FRAN / ROBIN) = (KELLY / 2) (KELLY - 80) = ROBIN ROBIN = 4

HOW = FRAN

The solution is: HOW = 168 FRAN = 168 KELLY = 84 ROBIN = 4

NIL

But a slight variation leads to a problem:

(student '(Fran's age divided by Robin's height is one half Kelly's IQ I.I Kelly's IQ minus 80 is Robin's height I.I If Robin is 0 feet tall . how old is Fran ?))

The equations to be solved are: (FRAN / ROBIN) = (KELLY / 2) (KELLY - 80) = ROBIN ROBIN = 0 HOW = FRAN

The solution is: HOW = 0 FRAN = 0 KELLY = 80 ROBIN = 0

NIL

There is no valid solution to this problem, because it involves dividing by zero (Robin's height). But student is willing to transform the first equation into:

FRAN = ROBIN * (KELLY / 2)

and then substitutes to get 0 for FRAN. Worse, dividing by zero could also come up inside eval:

(student '(Fran's age times Robin's height is one half Kelly's IQ I.I Kelly's IQ minus 80 is Robin's height I.I If Robin is 0 feet tall IJ how old is Fran ?))

The equations to be solved are: (FRAN * ROBIN) = (KELLY / 2) (KELLY - 80) = ROBIN ROBIN = 0 HOW = FRAN

Error: There was an attempt to divide a number by zero

However, one could claim that nasty examples with division by zero don't show up in algebra texts.

In summary, STUDENT behaves reasonably well, doing far more than the toy program ELIZA. STUDENT is also quite efficient; on my machine it takes less than one second for each of the prior examples. However, it could still be extended to have more powerful equation-solving capabilities. Its linguistic coverage is another matter. While one could add new patterns, such patterns are really just tricks, and don't capture the underlying structure of English sentences. That is why the STUDENT approach was abandoned as a research topic.

7.4 History and References Bobrow's Ph.D. thesis contains a complete description of STUDENT. It is reprinted in Minsky 1968. Since then, there have been several systems that address the same task, with increased sophistication in both their mathematical and linguistic ability. Wong (1981) describes a system that uses its understanding of the problem to get a better linguistic analysis. Sterling et al. (1982) present a much more powerful equation solver, but it does not accept natural language input. Certainly Bobrow's language analysis techniques were not very sophisticated by today's measures. But that was largely the point: if you know that the language is describing an algebraic problem of a certain type, then you don't need to know very much linguistics to get the right answer most of the time.

7.5 Exercises ▣ Exercise 7.2 [h] We said earlier that our program was unable to solve pairs of linear equations, such as:

100 = (OVERHEAD + (RUNNING * 40))

OVERHEAD = (10 * RUNNING)

The original STUDENT could solve these equations. Write a routine to do so. You may assume there will be only two equations in two unknowns if you wish, or if you are more ambitious, you could solve a system of . linear equations with . unknowns.

▣ Exercise 7.3 [h] Implement a version of Bobrow's variable-naming algorithm. Instead of taking the first word of each equation, create a unique symbol, and associate

with it the entire list of words. In the first pass, each nonequal list of words will be considered a distinct variable. If no solution is reached, word lists that share words in common are considered to be the same variable, and the solution is attempted again. For example, an input that contains the phrases "the rectangle's width" and "the width of the rectangle" might assign these two phrases the variables . 1 and .2. If an attempt to solve the problem yields no solutions, the program should realize that vl and . 2 have the words "rectangle" and "width" in common, and add the equation (= vl v2) and try again. Since the variables are arbitrary symbols, the printing routine should probably print the phrases associated with each variable rather than the variable itself.

▣ Exercise 7.4 [h] The original STUDENT also had a set of "common knowledge" equations that it could use when necessary. These were mostly facts about conversion factors, such as (1 inch = 2.54 cm). Also included wereequations like (distance equals rate times time), which could be used to solve problems like "If the distance from Anabru to Champaign is 10 miles and the time it takes Sandy to travel this distance is 2 hours, what is Sandy's rate of speed?" Make changes to incorporate this facility. It probably only helps in conjunction with a solution to the previous exercise.

▣ Exercise 7.5 [h] Change student so that it prints values only for those variables that are being asked for in the problem. That is, given the problem "X is 3. Y is 4. How much is X + Y?" it should not print values for X and Y.

▣ Exercise 7.6 [m] Try STUDENT on the following examples. Make sure you handle special characters properly:

(a) The price of a radio is 69.70 dollars. If this price is 15% less than the marked price, find the marked price. (b) The number of soldiers the Russians have is one half of the number of guns they have. The number of guns they have is 7000. What is the number of soldiers they have? (c) If the number of customers Tom gets is twice the square of 20 % of the number of advertisements he runs, and the number of advertisements is 45, and the profit Tom receives is 10 times the number of customers he gets, then what is the profit? (d) The average score is 73, The maximum score is 97. What is the square of the difference between the average and the maximum? (e) Tom is twice Mary's age, and Jane's age is half the difference between Mary and Tom. If Mary is 18 years old, how old is Jane? (f)Whatis4 + 5*14/7?

{g)xxb = c--d.bxc = x.x = b--b.b = 5.

▣ Exercise 7.7 [h] Student's infix-to-prefix rules account for the priority of operators properly, but they don't handle associativity in the standard fashion. For example, (12 6 3) translates to (- 12 (- 6 3)) or 9, when the usual convention is to interpret this as ((- 12 6) 3) or 3. Fix student to handle this convention.

▣ Exercise 7.8 [d] Find a mathematically oriented domain that is sufficiently limited so that STUDENT can solve problems in it. The chemistry of solutions (calculating pH concentrations) might be an example. Write the necessary student-rules , and test the resulting program.

▣ Exercise 7.9 [m] efficient version. Analyze the complexity of one-unknown and implement a more

▣ Exercise 7.10 [h] Bobrow's paper on STUDENT (1968) includes an appendix that abstractly characterizes all the problems that his system can solve. Generate a similar characterization for this version of the program.

7.6 Answers Answer 7.1

(defun print-equations (header(terpri) (princ header) (dolist (equation equations)

(terpri) (princ " ")

equations)

(dolist (x (prefix->infix equation)) (princ " ") (princ x))))

Answer 7.9 one-unknown is very inefficient because it searches each subcomponent of an expression twice. For example, consider the equation:

(= (+ (+ . 2) (+ 3 4)) (+ (+ 5 6) (+ 7 8)))

To decide if this has one unknown, one - unknown will call no - unknown on the left-hand side, and since it fails, call it again on the right-hand side. Although there are only eight atoms to consider, it ends up calling no-unknown 17 times and one-unknown 4 times. In general, for a tree of depth n, approximately 2^ calls to no-unknown are made. This is clearly wasteful; there should be no need to look at each component more than once.

The following version uses an auxiliary function, f i nd - one - un known, that has an accumulator parameter, unknown. This parameter can take on three possible values: nil, indicating that no unknown has been found; or the single unknown that has been found so far; or the number 2 indicating that two unknowns have been found and therefore the final result should be nil. The function f i nd - one - unknown has four cases: (1) If we have already found two unknowns, then return 2 to indicate this. (2) If the input expression is a nonatomic expression, then first look at its left-hand side for unknowns, and pass the result found in that side as the accumulator to a search of the right-hand side. (3) If the expression is an unknown, and if it is the second one found, return 2; otherwise return the unknown itself. (4) If the expression is an atom that is not an unknown, then just return the accumulated result.

(defun one-unknown (exp) "Returns the single unknown in exp, if there is exactly one." (let ((answer (find-one-unknown exp nil)))

;; If there were two unknowns, return nil; otherwise return the unknown (if there was one)

(if (eql answer 2) nil answer)))

(defun find-one-unknown (exp unknown) "Assuming UNKNOWN is the unknown(s) found so far, decide if there is exactly one unknown in the entire expression." (cond ((eql unknown 2) 2)

((exp-p exp)

(find-one-unknown (exp-rhs exp) (find-one-unknown (exp-lhs exp) unknown)))

((unknown-p exp)

(if unknown 2 exp))

(t unknown)))