---
layout: paper
part: Part I
---
<div class="WordSection1">
<p class="romeinscijfer" id="logic_and_logic_programming">
I
</p>
<div class="AutoStyle00">
<h1 class="AutoStyle01" id="h_logic_and_logic_programming">
Logic and<br>
Logic Programming
</h1>
</div>
<p class="sektie1">
<i>Logic Programming</i> is the name of a programming paradigm which was developed in the 70s. Rather than viewing a computer program as a step-by-step description of an algorithm, the program is conceived as a logical theory, and a procedure call is viewed as a theorem of which the truth needs to be established. Thus, executing a program means searching for a proof. In traditional (imperative) programming languages, the program is a <i>procedural</i> specification of <b>how</b> a problem needs to be solved. In contrast, a logic program concentrates on a <i>declarative</i> specification of <b>what</b> the problem is. Readers familiar with imperative programming will find that Logic Programming requires quite a different way of thinking. Indeed, their knowledge of the imperative paradigm will be partly incompatible with the logic paradigm.
</p>
<p class="sektie">
This is certainly true with regard to the concept of a program <i>variable</i>. In imperative languages, a variable is a name for a memory location which can store data of certain types. While the contents of the location may vary over time, the variable always points to the same location. In fact, the term ‘variable’ is a bit of a misnomer here, since it refers to a value that is well-defined at every moment. In contrast, a variable in a logic program is a variable in the mathematical sense, i.e. a placeholder that can take on any value. In this respect, Logic Programming is therefore much closer to mathematical intuition than imperative programming.
</p>
<p class="sektie">
Imperative programming and Logic Programming also differ with respect to the <i>machine model</i> they assume. A machine model is an abstraction of the computer on which programs are executed. The imperative paradigm assumes a dynamic, state-based machine model, where the state of the computer is given by the contents of its memory. The effect of a program statement is a transition from one state to another. Logic Programming does not assume such a dynamic machine model. Computer plus program represent a certain amount of knowledge about the world, which is used to answer queries.
</p>
<p class="sektie">
The first three chapters of the book are devoted to an introduction to Logic Programming. Chapter 1, <i>A brief introduction to clausal logic</i>, introduces many concepts in Logic Programming by means of examples. These concepts get a more formal treatment in Chapter 2, <i>Clausal logic and resolution: theoretical backgrounds</i>. In Chapter 3, <i>Logic Programming and Prolog</i>, we take a closer look at Prolog as a logic programming language, explaining its main features and describing some common programming techniques.
</p>
</div>
<b>
<span class="AutoStyle02">
<br clear="all"/>
</span>
</b>
<div class="WordSection2">
<p class="cijfer" id="a_brief_introduction_to_clausal_logic">
1
</p>
<h2 id="h_a_brief_introduction_to_clausal_logic">
A brief introduction to clausal logic
</h2>
<p class="sektie1">
In this chapter, we will introduce clausal logic as a formalism for representing and reasoning with knowledge. The aim of this chapter is to acquaint the reader with the most important concepts, without going into too much detail. The theoretical aspects of clausal logic, and the practical aspects of Logic Programming, will be discussed in Chapters 2 and 3.
</p>
<p class="sektie">
Our Universe of Discourse in this chapter will be the London Underground, of which a small part is shown in fig. 1.1. Note that this picture contains a wealth of information, about lines, stations, transit between lines, relative distance, etc. We will try to capture this information in logical statements. Basically, fig. 1.1 specifies which stations are directly connected by which lines. If we follow the lines from left to right (Northern downwards), we come up with the following 11 formulas:
</p>
<div class="extract swish" id="1.0.1">
<pre class="source swish temp AutoStyle03" data-variant-id="group-1" id="swish.1.0.1" query-text="?-connected(bond_street,Y,L). ?-connected(X,piccadilly_circus,L). ?-connected(X,Y,piccadilly). ?-connected(X,Y,L),connected(Y,Z,L).">
connected(bond_street,oxford_circus,central).
connected(oxford_circus,tottenham_court_road,central).
connected(bond_street,green_park,jubilee).
connected(green_park,charing_cross,jubilee).
connected(green_park,piccadilly_circus,piccadilly).
connected(piccadilly_circus,leicester_square,piccadilly).
connected(green_park,oxford_circus,victoria).
connected(oxford_circus,piccadilly_circus,bakerloo).
connected(piccadilly_circus,charing_cross,bakerloo).
connected(tottenham_court_road,leicester_square,northern).
connected(leicester_square,charing_cross,northern).
</pre>
</div>
<p class="tekst">
Let’s define two stations to be <i>nearby</i> if they are on the same line, with at most one station in between. This relation can also be represented by a set of logical formulas:
</p>
<div class="extract swish" id="1.0.2">
<pre class="source swish AutoStyle03" data-variant-id="group-1" id="swish.1.0.2" query-text="?-nearby(bond_street,Y). ?-nearby(X,piccadilly_circus). ?-nearby(X,Y). ?-nearby(X,Y),nearby(Y,Z).">
nearby(bond_street,oxford_circus).
nearby(oxford_circus,tottenham_court_road).
nearby(bond_street,tottenham_court_road).
nearby(bond_street,green_park).
nearby(green_park,charing_cross).
nearby(bond_street,charing_cross).
nearby(green_park,piccadilly_circus).
nearby(piccadilly_circus,leicester_square).
nearby(green_park,leicester_square).
nearby(green_park,oxford_circus).
nearby(oxford_circus,piccadilly_circus).
nearby(piccadilly_circus,charing_cross).
nearby(oxford_circus,charing_cross).
nearby(tottenham_court_road,leicester_square).
nearby(leicester_square,charing_cross).
nearby(tottenham_court_road,charing_cross).
</pre>
</div>
<div class="extract figure" id="1.1">
<table align="center" cellpadding="0" cellspacing="0" hspace="0" vspace="0">
<tbody>
<tr>
<td align="left" class="AutoStyle04" valign="top">
<div class="AutoStyle05">
<p class="figure">
<img src="img/part_i/image002.svg" v:shapes="_x0000_i1025" width="100%"/>
</p>
</div>
<p class="caption">
<b>Figure 1.1.</b> Part of the London Underground. Reproduced by permission of London Regional Transport (LRT Registered User No. 94/1954).
</p>
</td>
</tr>
</tbody>
</table>
</div>
<p class="tekst">
These 16 formulas have been derived from the previous 11 formulas in a systematic way. If <i>X</i> and <i>Y</i> are directly connected via some line <i>L</i>, then <i>X</i> and <i>Y</i> are nearby. Alternatively, if there is some <i>Z</i> in between, such that <i>X</i> and <i>Z</i> are directly connected via <i>L</i>, and <i>Z</i> and <i>Y</i> are also directly connected via <i>L</i>, then <i>X</i> and <i>Y</i> are also nearby. We can formulate this in logic as follows:
</p>
<div class="extract swish" id="1.0.3">
<pre class="source swish inherit temp AutoStyle03" data-variant-id="group-1" id="swish.1.0.3" inherit-id="swish.1.0.1" query-text="?-nearby(tottenham_court_road,leicester_square). ?-nearby(tottenham_court_road,W). ?-nearby(X,leicester_square).">
nearby(X,Y):-connected(X,Y,L).
nearby(X,Y):-connected(X,Z,L),connected(Z,Y,L).
</pre>
</div>
<p class="tekst">
In these formulas, the symbol ‘ <tt>:-</tt> ’ should be read as ‘if’, and the comma between <tt>connected(X,Z,L)</tt> and <tt>connected(Z,Y,L)</tt> should be read as ‘and’. The uppercase letters stand for universally quantified variables, such that, for instance, the second formula means:
</p>
<p class="citaat">
<b>For any values</b> of <i>X</i>, <i>Y</i>, <i>Z</i> and <i>L</i>, <i>X</i> is nearby <i>Y</i> <b>if</b> <i>X</i> is directly connected to <i>Z</i> via <i>L</i>, <b>and</b> <i>Z</i> is directly connected to <i>Y</i> via <i>L</i>.
</p>
<p class="sektie">
We now have two definitions of the nearby-relation, one which simply lists all pairs of stations that are nearby each other, and one in terms of direct connections. Logical formulas of the first type, such as
</p>
<p class="p-el">
nearby(bond_street,oxford_circus)
</p>
<p class="tekst">
will be called <i>facts</i>, and formulas of the second type, such as
</p>
<p class="p-el">
nearby(X,Y):-connected(X,Z,L),connected(Z,Y,L)
</p>
<p class="tekst">
will be called <i>rules</i>. Facts express unconditional truths, while rules denote conditional truths, i.e. conclusions which can only be drawn when the premises are known to be true. Obviously, we want these two definitions to be <i>equivalent</i>: for each possible query, both definitions should give exactly the same answer. We will make this more precise in the next section.
</p>
<div class="extract exercise" id="1.1">
<div class="AutoStyle06">
<p class="exercise AutoStyle07">
<i>Exercise 1.1.</i> Two stations are ‘not too far’ if they are on the same or a different line, with at most one station in between. Define rules for the predicate <tt>not_too_far</tt>.
</p>
<div class="extract swish" id="1.0.4">
<pre class="source swish inherit AutoStyle08" data-variant-id="group-1" id="swish.1.0.4" inherit-id="swish.1.0.1" query-text="?-not_too_far(X,Y).">
not_too_far(X,Y):-true. % replace 'true' with your definition
not_too_far(X,Y):-true. % add more clauses as needed
</pre>
</div>
</div>
</div>
<h3 id="answering_queries">
1.1 Answering queries
</h3>
<p class="sektie1">
A <i>query</i> like ‘which station is nearby Tottenham Court Road?’ will be written as
</p>
<p class="p-el">
?-nearby(tottenham_court_road,W).
</p>
</div>
<p class="tekst">
where the prefix ‘ <tt>?-</tt> ’ indicates that this is a query rather than a fact. An <i>answer</i> to this query, e.g. ‘Leicester Square’, will be written { <tt>W</tt>
<span class="AutoStyle09">
→
</span>
<tt>leicester_square</tt> }, indicating a <i>substitution</i> of values for variables, such that the statement in the query, i.e.
</p>
<p class="p-el">
?-nearby(tottenham_court_road,leicester_square).
</p>
<p class="tekst">
is true. Now, if the nearby-relation is defined by means of a list of facts, answers to queries are easily found: just look for a fact that <i>matches</i> the query, by which is meant that the fact and the query can be made identical by substituting values for variables in the query. Once we have found such a fact, we also have the substitution which constitutes the answer to the query.
</p>
<p class="sektie">
If rules are involved, query-answering can take several of these steps. For answering the query <tt>?-nearby(tottenham_court_road,W)</tt>, we match it with the conclusion of the rule
</p>
<p class="p-el">
nearby(X,Y):-connected(X,Y,L)
</p>
<p class="tekst">
yielding the substitution { <tt>X</tt>
<span class="AutoStyle09">
→
</span>
<tt>tottenham_court_road</tt>, <tt>Y</tt>
<span class="AutoStyle09">
→
</span>
<tt>W</tt> }. We then try to find an answer for the premises of the rule under this substitution, i.e. we try to answer the query
</p>
<p class="p-el">
?-connected(tottenham_court_road,W,L).
</p>
<p class="tekst">
That is, we can find a station nearby Tottenham Court Road, if we can find a station directly connected to it. This second query is answered by looking at the facts for direct connections, giving the answer { <tt>W</tt>
<span class="AutoStyle09">
→
</span>
<tt>leicester_square</tt>, <tt>L</tt>
<span class="AutoStyle09">
→
</span>
<tt>northern</tt> }. Finally, since the variable <tt>L</tt> does not occur in the initial query, we just ignore it in the final answer, which becomes { <tt>W</tt>
<span class="AutoStyle09">
→
</span>
<tt>leicester_square</tt> } as above. In fig. 1.2, we give a graphical representation of this process. Since we are essentially <i>proving</i> that a statement follows logically from some other statements, this graphical representation is called a <i>proof tree</i>.
</p>
<div class="extract figure" id="1.2">
<table align="center" cellpadding="0" cellspacing="0" hspace="0" vspace="0">
<tbody>
<tr>
<td align="left" class="AutoStyle04" valign="top">
<div class="AutoStyle05">
<p class="figure">
<img src="img/part_i/image004.svg" v:shapes="_x0000_i1026" width="100%"/>
</p>
</div>
<p class="caption">
<b>Figure 1.2.</b> A proof tree for the query <tt>?-nearby(tottenham_court_road,W)</tt>.
</p>
</td>
</tr>
</tbody>
</table>
</div>
<p class="sektie">
The steps in fig. 1.2 follow a very general reasoning pattern:
</p>
<p class="citaat">
to answer a query <tt>?-</tt> <i>Q</i>
<span class="AutoStyle10">
1
</span>
<tt>,</tt> <i>Q</i>
<span class="AutoStyle10">
2
</span>
<tt>,</tt> … <tt>,</tt> <i>Q<span class="AutoStyle10">
n
</span></i>, find a rule <i>A</i> <tt>:-</tt> <i>B</i>
<span class="AutoStyle10">
1
</span>
<tt>,</tt> … <tt>,</tt> <i>B<span class="AutoStyle10">
m
</span></i> such that <i>A</i> matches with <i>Q</i>
<span class="AutoStyle10">
1
</span>
, and answer the query <tt>?-</tt> <i>B</i>
<span class="AutoStyle10">
1
</span>
<tt>,</tt> … <tt>,</tt> <i>B<span class="AutoStyle10">
m
</span></i> <tt>,</tt> <i>Q</i>
<span class="AutoStyle10">
2
</span>
<tt>,</tt> … <tt>,</tt> <i>Q<span class="AutoStyle10">
n
</span></i>.
</p>
<p class="tekst">
This reasoning pattern is called <i>resolution</i>, and we will study it extensively in Chapters 2 and 3. Resolution adds a <b>procedural interpretation</b> to logical formulas, besides their declarative interpretation (they can be either true or false). Due to this procedural interpretation, logic can be used as a programming language. In an ideal logic programming system, the procedural interpretation would exactly match the declarative interpretation: everything that is calculated procedurally is declaratively true, and <i>vice versa</i>. In such an ideal system, the programmer would just bother about the declarative interpretation of the formulas she writes down, and leave the procedural interpretation to the computer. Unfortunately, in current logic programming systems the procedural interpretation does <b>not</b> exactly match the declarative interpretation: for example, some things that are declaratively true are not calculated at all, because the system enters an infinite loop. Therefore, the programmer should also be aware of the procedural interpretation given by the computer to her logical formulas.
</p>
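<p class="tekst">
The reasoning pattern above can be sketched in Prolog itself. The following is a minimal illustration under stated assumptions, not a description of how a Prolog system is actually implemented: the predicates <tt>solve/1</tt> and <tt>body_to_list/2</tt> are hypothetical helpers introduced only for this example, a query is represented as a list of atomic queries, and only the two Northern line facts are loaded. The built-in predicate <tt>clause/2</tt> is used to find a rule or fact whose conclusion matches the first query.
</p>
<pre>
:- use_module(library(lists)).        % append/3
:- dynamic connected/3, nearby/2.     % allows clause/2 to inspect these clauses

connected(tottenham_court_road,leicester_square,northern).
connected(leicester_square,charing_cross,northern).

nearby(X,Y):-connected(X,Y,_L).
nearby(X,Y):-connected(X,Z,L),connected(Z,Y,L).

% solve(Qs): answer the query ?-Q1,...,Qn, given as the list [Q1,...,Qn]
solve([]).                            % empty query: nothing left to prove
solve([Q1|Qs]):-
    clause(Q1,Body),                  % find A:-B1,...,Bm such that A matches Q1
    body_to_list(Body,Bs),
    append(Bs,Qs,NewQs),              % new query ?-B1,...,Bm,Q2,...,Qn
    solve(NewQs).

% body_to_list(Body,List): turn a clause body into a list of atomic queries
body_to_list(true,[]):-!.             % a fact has the body 'true'
body_to_list((B,Bs),[B|Rest]):-!,body_to_list(Bs,Rest).
body_to_list(B,[B]).
</pre>
<p class="tekst">
With these definitions, the query <tt>?-solve([nearby(tottenham_court_road,W)])</tt> first binds <tt>W</tt> to <tt>leicester_square</tt>, using the first rule as described above, and on request for another answer binds <tt>W</tt> to <tt>charing_cross</tt>, using the second rule.
</p>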
<p class="sektie">
The resolution proof process makes use of a technique that is known as <i>reduction to the absurd</i>: suppose that the formula to be proved is false, and show that this leads to a contradiction, thereby demonstrating that the formula to be proved is in fact true. Such a proof is also called a <i>proof by refutation</i>. For instance, if we want to know which stations are nearby Tottenham Court Road, we negate this statement, resulting in ‘there are no stations nearby Tottenham Court Road’. In logic, this is achieved by writing the statement as a rule with an empty conclusion, i.e. a rule for which the truth of its premises would lead to falsity:
</p>
<p class="p-el">
:-nearby(tottenham_court_road,W)
</p>
<p class="tekst">
Thus, the symbols ‘ <tt>?-</tt> ’ and ‘ <tt>:-</tt> ’ are in fact equivalent. A contradiction is found if resolution leads to the empty rule, of which the premises are always true (since there are none), but the conclusion is always false. Conventionally, the empty rule is written as ‘
<span class="AutoStyle11">
□
</span>
’.
</p>
<p class="sektie">
Just before this section, we posed the question: can we show that our two definitions of the nearby-relation are equivalent? As indicated before, the idea is that to be equivalent means to provide exactly the same answers to the same queries. To formalise this, we need some additional definitions. A <i>ground</i> fact is a fact without variables. Obviously, if <tt>G</tt> is a ground fact, the query <tt>?-G</tt> never returns a substitution as answer: either it <i>succeeds</i> (<tt>G</tt> does follow from the initial assumptions), or it <i>fails</i> (<tt>G</tt> does not). The set of ground facts <tt>G</tt> for which the query <tt>?-G</tt> succeeds is called the <i>success set</i>. Thus, the success set for our first definition of the nearby-relation consists simply of those 16 formulas, since they are ground facts already, and nothing else is derivable from them. The success set for the second definition of the nearby-relation is constructed by applying the two rules to the ground facts for connectedness. Thus we can say: two definitions of a relation are (procedurally) <i>equivalent</i> if they have the same success set (restricted to that relation).
</p>
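<p class="tekst">
As a small illustration, the success set of the rule-based definition can be collected with the built-in predicate <tt>setof/3</tt>, which returns a sorted list without duplicates. The predicate name <tt>success_set/1</tt> is an assumption made for this sketch; the facts and rules are those given earlier in this chapter.
</p>
<pre>
connected(bond_street,oxford_circus,central).
connected(oxford_circus,tottenham_court_road,central).
connected(bond_street,green_park,jubilee).
connected(green_park,charing_cross,jubilee).
connected(green_park,piccadilly_circus,piccadilly).
connected(piccadilly_circus,leicester_square,piccadilly).
connected(green_park,oxford_circus,victoria).
connected(oxford_circus,piccadilly_circus,bakerloo).
connected(piccadilly_circus,charing_cross,bakerloo).
connected(tottenham_court_road,leicester_square,northern).
connected(leicester_square,charing_cross,northern).

nearby(X,Y):-connected(X,Y,_L).
nearby(X,Y):-connected(X,Z,L),connected(Z,Y,L).

% success_set(S): S is the sorted, duplicate-free list of all ground
% facts nearby(X,Y) for which the query ?-nearby(X,Y) succeeds
success_set(S):-setof(nearby(X,Y),nearby(X,Y),S).
</pre>
<p class="tekst">
The query <tt>?-success_set(S),length(S,N)</tt> yields <tt>N = 16</tt>, and the elements of <tt>S</tt> can be compared one by one with the 16 ground facts of the first definition.
</p>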
<div class="extract exercise" id="1.2">
<div class="AutoStyle06">
<p class="exercise AutoStyle07">
<i>Exercise 1.2.</i> Construct the proof trees for the query<br>
<tt>?-nearby(W,charing_cross)</tt>.
</p>
</div>
</div>
<h3 id="recursion">
1.2 Recursion
</h3>
<p class="sektie1">
Until now, we have encountered two types of logical formulas: facts and rules. There is a special kind of rule which deserves special attention: the rule which defines a relation in terms of itself. This idea of ‘self-reference’, which is called <i>recursion</i>, is also present in most procedural programming languages. Recursion is a bit difficult to grasp, but once you’ve mastered it, you can use it to write very elegant programs, e.g.
</p>
<p class="p-el">
<span class="AutoStyle12">
IF N=0<br>
THEN FAC:=1<br>
ELSE FAC:=N*FAC(N-1).
</span>
</p>
<p class="tekst">
is a recursive procedure for calculating the factorial of a given number, written in a Pascal-like procedural language. However, in such languages <i>iteration</i> (looping a pre-specified number of times) is usually preferred over recursion, because it uses memory more efficiently.
</p>
<p class="sektie">
In Prolog, however, recursion is the <b>only</b> looping structure
<span class="CustomFootnote">
<a href="#_ftn1" name="_ftnref1" title="">
<span class="MsoFootnoteReference">
<span class="AutoStyle13">
<span class="AutoStyle14">
[1]
</span>
</span>
</span>
</a>
</span>
. (This does not necessarily mean that Prolog is always less efficient than a procedural language, because there are ways to write recursive loops that are just as efficient as iterative loops, as we will see in section 3.6.) Perhaps the easiest way to think about recursion is the following: an arbitrarily large chain is described by describing how one link in the chain is connected to the next. For instance, let us define the relation of <i>reachability</i> in our underground example, where a station is reachable from another station if they are connected by one or more lines. We could define it by the following 20 ground facts:
</p>
<div class="extract swish" id="1.1.1">
<pre class="source swish temp AutoStyle03" data-variant-id="group-1" id="swish.1.1.1" query-text="?-reachable(bond_street,Y). ?-reachable(X,green_park). ?-reachable(X,Y).">
reachable(bond_street,charing_cross).
reachable(bond_street,green_park).
reachable(bond_street,leicester_square).
reachable(bond_street,oxford_circus).
reachable(bond_street,piccadilly_circus).
reachable(bond_street,tottenham_court_road).
reachable(green_park,charing_cross).
reachable(green_park,leicester_square).
reachable(green_park,oxford_circus).
reachable(green_park,piccadilly_circus).
reachable(green_park,tottenham_court_road).
reachable(leicester_square,charing_cross).
reachable(oxford_circus,charing_cross).
reachable(oxford_circus,leicester_square).
reachable(oxford_circus,piccadilly_circus).
reachable(oxford_circus,tottenham_court_road).
reachable(piccadilly_circus,charing_cross).
reachable(piccadilly_circus,leicester_square).
reachable(tottenham_court_road,charing_cross).
reachable(tottenham_court_road,leicester_square).
</pre>
</div>
<p class="tekst">
Since every station that can be reached at all can be reached by a route with at most two intermediate stations, we could instead use the following (non-recursive) definition:
</p>
<p class="p-eerst AutoStyle15">
reachable(X,Y):-connected(X,Y,L).
</p>
<p class="programma AutoStyle15">
reachable(X,Y):-connected(X,Z,L1),connected(Z,Y,L2).
</p>
<p class="p-laatst AutoStyle15">
reachable(X,Y):-connected(X,Z1,L1),connected(Z1,Z2,L2),<br>
connected(Z2,Y,L3).
</p>
<p class="tekst">
Of course, if we were to define the reachability relation for the entire London underground, we would need a lot more, longer and longer rules. Recursion is a much more convenient and natural way to define such chains of arbitrary length:
</p>
<div class="extract swish" id="1.1.2">
<pre class="source swish inherit AutoStyle03" data-variant-id="group-1" id="swish.1.1.2" inherit-id="swish.1.0.1" query-text="?-reachable(bond_street,Y). ?-reachable(X,green_park). ?-reachable(X,Y).">
reachable(X,Y):-connected(X,Y,L).
reachable(X,Y):-connected(X,Z,L),reachable(Z,Y).
</pre>
</div>
<p class="tekst">
The reading of the second rule is as follows: ‘ <i>Y</i> is reachable from <i>X</i> if <i>Z</i> is directly connected to <i>X</i> via line <i>L</i>, and <i>Y</i> is reachable from <i>Z</i> ’.
</p>
<div class="extract figure" id="1.3">
<table align="center" cellpadding="0" cellspacing="0" hspace="0" vspace="0">
<tbody>
<tr>
<td align="left" class="AutoStyle04" valign="top">
<div class="AutoStyle05">
<p class="figure">
<img src="img/part_i/image006.svg" v:shapes="_x0000_i1027" width="100%"/>
</p>
</div>
<p class="caption">
<b>Figure 1.3.</b> A proof tree for the query <tt>?-reachable(bond_street,W)</tt>.
</p>
</td>
</tr>
</tbody>
</table>
</div>
<p class="sektie">
We can now use this recursive definition to prove that Leicester Square is reachable from Bond Street (fig. 1.3). However, just as there are several routes from Bond Street to Leicester Square, there are several alternative proofs of the fact that Leicester Square is reachable from Bond Street. An alternative proof is given in fig. 1.4. The difference between these two proofs is that in the first proof we use the fact
</p>
<p class="p-el">
connected(oxford_circus,tottenham_court_road,central)
</p>
<p class="tekst">
while in the second proof we use
</p>
<p class="p-el">
connected(oxford_circus,piccadilly_circus,bakerloo)
</p>
<p class="tekst">
There is no reason to prefer one over the other, but since Prolog searches the given formulas top-down, it will find the first proof before the second. Thus, the order of the clauses determines the order in which answers are found. As we will see in Chapter 3, it sometimes even determines whether any answers are found at all.
</p>
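<p class="tekst">
The effect of clause order on the order of answers can be observed with the built-in predicates <tt>findall/3</tt> and <tt>setof/3</tt>. This is a sketch only; the predicate names <tt>all_answers/1</tt> and <tt>distinct_answers/1</tt> are assumptions introduced for this example.
</p>
<pre>
connected(bond_street,oxford_circus,central).
connected(oxford_circus,tottenham_court_road,central).
connected(bond_street,green_park,jubilee).
connected(green_park,charing_cross,jubilee).
connected(green_park,piccadilly_circus,piccadilly).
connected(piccadilly_circus,leicester_square,piccadilly).
connected(green_park,oxford_circus,victoria).
connected(oxford_circus,piccadilly_circus,bakerloo).
connected(piccadilly_circus,charing_cross,bakerloo).
connected(tottenham_court_road,leicester_square,northern).
connected(leicester_square,charing_cross,northern).

reachable(X,Y):-connected(X,Y,_L).
reachable(X,Y):-connected(X,Z,_L),reachable(Z,Y).

% all_answers(Ws): the answers to ?-reachable(bond_street,W) in the
% order in which Prolog finds them, with one entry for every proof
all_answers(Ws):-findall(W,reachable(bond_street,W),Ws).

% distinct_answers(Ws): the same answers, sorted and without duplicates
distinct_answers(Ws):-setof(W,reachable(bond_street,W),Ws).
</pre>
<p class="tekst">
The list returned by <tt>?-all_answers(Ws)</tt> starts with the two direct connections <tt>oxford_circus</tt> and <tt>green_park</tt>, and contains <tt>leicester_square</tt> once for each of its proofs. The query <tt>?-distinct_answers(Ws)</tt> returns the six stations reachable from Bond Street. Reordering the facts for connectedness changes the first list but not the second.
</p>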
<div class="extract exercise" id="1.3">
<div class="AutoStyle06">
<p class="exercise AutoStyle07">
<i>Exercise 1.3.</i> Give a third proof tree for the answer { <tt>W</tt>
<span class="AutoStyle09">
→
</span>
<tt>leicester_square</tt> }, and change the order of the facts for connectedness, such that this proof tree is constructed first.
</p>
</div>
</div>
<div class="extract figure" id="1.4">
<table align="center" cellpadding="0" cellspacing="0" hspace="0" vspace="0">
<tbody>
<tr>
<td align="left" class="AutoStyle04" valign="top">
<div class="AutoStyle05">
<p class="figure">
<img src="img/part_i/image008.svg" v:shapes="_x0000_i1028" width="100%"/>
</p>
</div>
<p class="caption">
<b>Figure 1.4.</b> Alternative proof tree for the query <tt>?-reachable(bond_street,W)</tt>.
</p>
</td>
</tr>
</tbody>
</table>
</div>
<p class="sektie">
In other words, Prolog’s query-answering process is a <i>search process</i>, in which the answer depends on all the choices made earlier. An important point is that some of these choices may lead to a dead-end later. For example, if the recursive formula for the reachability relation had been tried before the non-recursive one, the bottom part of fig. 1.3 would have been as in fig. 1.5. This proof tree cannot be completed, because there are no answers to the query <tt>?-reachable(charing_cross,W)</tt>, as can easily be checked. Prolog has to recover from this failure by climbing up the tree, reconsidering previous choices. This search process, which is called <i>backtracking</i>, will be detailed in Chapter 5.
</p>
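<p class="tekst">
The dead-end can be reproduced with a small sketch in which the recursive clause is placed first. To keep the example short, only the Central and Northern line facts on the route from Bond Street to Charing Cross are included; this selection is an assumption made for illustration.
</p>
<pre>
connected(bond_street,oxford_circus,central).
connected(oxford_circus,tottenham_court_road,central).
connected(tottenham_court_road,leicester_square,northern).
connected(leicester_square,charing_cross,northern).

% the recursive clause is now tried before the non-recursive one
reachable(X,Y):-connected(X,Z,_L),reachable(Z,Y).
reachable(X,Y):-connected(X,Y,_L).
</pre>
<p class="tekst">
The query <tt>?-reachable(charing_cross,W)</tt> fails, since no connection starts at Charing Cross. The query <tt>?-once(reachable(bond_street,W))</tt> therefore first follows the chain of connections all the way to this dead-end, backtracks to the non-recursive clause for Leicester Square, and returns <tt>W = charing_cross</tt> as its first answer.
</p>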
<h3 id="structured_terms">
1.3 Structured terms
</h3>
<p class="sektie1">
Finally, we illustrate the way Prolog can handle more complex data structures, such as a list of stations representing a route. Suppose we want to redefine the reachability relation, such that it also specifies the intermediate stations. We could adapt the non-recursive definition of <tt>reachable</tt> as follows:
</p>
<p class="p-eerst AutoStyle16">
reachable0(X,Y):-connected(X,Y,L).
</p>
<p class="programma AutoStyle17">
reachable1(X,Y,Z):-connected(X,Z,L1),<br>
connected(Z,Y,L2).
</p>
<p class="p-laatst AutoStyle18">
reachable2(X,Y,Z1,Z2):-connected(X,Z1,L1),<br>
connected(Z1,Z2,L2),<br>
connected(Z2,Y,L3).
</p>
<p class="tekst">
The suffix of <tt>reachable</tt> indicates the number of intermediate stations; it is added to stress that relations with different numbers of arguments are really different relations, even if their names are the same. The problem now is that we have to know the number of intermediate stations in advance, before we can ask the right query. This is, of course, unacceptable.
</p>
<div class="extract figure" id="1.5">
<table align="center" cellpadding="0" cellspacing="0" hspace="0" vspace="0">
<tbody>
<tr>
<td align="left" class="AutoStyle04" valign="top">
<div class="AutoStyle05">
<p class="figure">
<img src="img/part_i/image010.svg" v:shapes="_x0000_i1029" width="100%"/>
</p>
</div>
<p class="caption">
<b>Figure 1.5.</b> A failing proof tree.
</p>
</td>
</tr>
</tbody>
</table>
</div>
<p class="sektie">
We can solve this problem by means of <i>functors</i>. A functor looks just like a mathematical function, but the important difference is that <i>functor expressions are never evaluated to determine a value</i>. Instead, they provide a way to name a complex object composed of simpler objects. For instance, a route with Oxford Circus and Tottenham Court Road as intermediate stations could be represented by
</p>
<p class="p-el">
route(oxford_circus,tottenham_court_road)
</p>
<p class="tekst">
Note that this is not a ground fact, but rather an argument for a logical formula. The reachability relation can now be defined as follows:
</p>
<div class="extract swish" id="1.2.1">
<pre class="source swish inherit AutoStyle03" data-variant-id="group-1" id="swish.1.2.1" inherit-id="swish.1.0.1" query-text="?-reachable(oxford_circus,charing_cross,R).">
reachable(X,Y,noroute):-connected(X,Y,L).
reachable(X,Y,route(Z)):-connected(X,Z,L1),
connected(Z,Y,L2).
reachable(X,Y,route(Z1,Z2)):-connected(X,Z1,L1),
connected(Z1,Z2,L2),
connected(Z2,Y,L3).
</pre>
</div>
<p class="tekst">
The query <tt>?-reachable(oxford_circus,charing_cross,R)</tt> now has three possible answers:
</p>
<p class="tekst AutoStyle19">
{ <tt>R</tt>
<span class="AutoStyle09">
→
</span>
<tt>route(piccadilly_circus)</tt> }<br>
{ <tt>R</tt>
<span class="AutoStyle09">
→
</span>
<tt>route(tottenham_court_road,leicester_square)</tt> }<br>
{ <tt>R</tt>
<span class="AutoStyle09">
→
</span>
<tt>route(piccadilly_circus,leicester_square)</tt> }
</p>
<div class="extract figure" id="1.6">
<table align="center" cellpadding="0" cellspacing="0" hspace="0" vspace="0">
<tbody>
<tr>
<td align="left" class="AutoStyle04" valign="top">
<div class="AutoStyle20">
<p class="med-figure AutoStyle07">
<img src="img/part_i/image012.svg" v:shapes="_x0000_i1030" width="100%"/>
</p>
</div>
<p class="med-caption">
<b>Figure 1.6.</b> A complex object as a tree.
</p>
</td>
</tr>
</tbody>
</table>
</div>
<p class="sektie">
As argued in the previous section, we prefer the recursive definition of the reachability relation, in which case we use functors in a somewhat different way.
</p>
<div class="extract swish" id="1.2.2_2">
<pre class="source swish inherit AutoStyle03" data-variant-id="group-1" id="swish.1.2.2" inherit-id="swish.1.0.1" query-text="?-reachable(oxford_circus,charing_cross,R).">
reachable(X,Y,noroute):-connected(X,Y,L).
reachable(X,Y,route(Z,R)):-connected(X,Z,L),
reachable(Z,Y,R).
</pre>
</div>
<p class="tekst">
At first sight, there does not seem to be a big difference between this and the use of functors in the non-recursive program. However, the query
</p>
<p class="p-el">
?-reachable(oxford_circus,charing_cross,R)
</p>
<p class="tekst">
now has the following answers:
</p>
<p class="p-eerst AutoStyle21">
{R
<span class="AutoStyle09">
→
</span>
route(tottenham_court_road,<br>
route(leicester_square,noroute))}
</p>
<p class="programma">
{R
<span class="AutoStyle09">
→
</span>
route(piccadilly_circus,noroute)}
</p>
<p class="p-laatst AutoStyle21">
{R
<span class="AutoStyle09">
→
</span>
route(piccadilly_circus,<br>
route(leicester_square,noroute))}
</p>
<p class="tekst">
The functor <tt>route</tt> is now also recursive in nature: its first argument is a station, but <i>its second argument is again a route</i>. For instance, the object
</p>
<p align="right" class="p-el AutoStyle22">
route(tottenham_court_road,route(leicester_square,noroute))
</p>
<p class="tekst">
can be pictured as in fig. 1.6. Such a figure is called a <i>tree</i> (we will have a lot more to say about trees in chapter 4). In order to find out the route represented by this complex object, we read the leaves of this tree from left to right, until we reach the ‘terminator’ <tt>noroute</tt>. This would result in a linear notation like
</p>
<p class="p-el">
[tottenham_court_road,leicester_square].
</p>
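<p class="tekst">
Reading the leaves of such a tree from left to right can itself be expressed as a small recursive relation. The following is only a sketch: the predicate name <tt>on_route</tt> is our own illustration and does not occur elsewhere in the program.
</p>

```prolog
% on_route(?Station,+Route): Station occurs in the recursive route term.
% (on_route/2 is a hypothetical helper, introduced for illustration only.)
on_route(Station,route(Station,_)).          % the station at this node
on_route(Station,route(_,Rest)):-            % or a station further down
    on_route(Station,Rest).
% There is no clause for noroute: the terminator contains no station.
```

<p class="tekst">
Asking <tt>?-on_route(S,route(tottenham_court_road,route(leicester_square,noroute)))</tt> enumerates the stations in the order in which they appear on the route. Note that the <tt>route</tt> term is never evaluated; the clauses merely take it apart by matching on its structure.
</p>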
<div class="extract figure" id="1.7">
<table align="center" cellpadding="0" cellspacing="0" hspace="0" vspace="0">
<tbody>
<tr>
<td align="left" class="AutoStyle04" valign="top">
<div class="AutoStyle20">
<p class="med-figure AutoStyle07">
<img src="img/part_i/image014.svg" v:shapes="_x0000_i1031" width="100%"/>
</p>
</div>
<p class="med-caption">
<b>Figure 1.7.</b> The list <tt>[a,b,c]</tt> as a tree.
</p>
</td>
</tr>
</tbody>
</table>
</div>
<p class="sektie">
For user-defined functors, such a linear notation is not available. However, Prolog provides a built-in ‘datatype’ called <i>lists</i>, for which both the tree-like notation and the linear notation may be used. The functor for lists is <tt>.</tt> (dot), which takes two arguments: the first element of the list (which may be any object), and the rest of the list (which must be a list). The list terminator is the special symbol <tt>[]</tt>, denoting the empty list. For instance, the term
</p>
<p class="p-el">
.(a,.(b,.(c,[])))
</p>
<p class="tekst">
denotes the list consisting of <tt>a</tt> followed by <tt>b</tt> followed by <tt>c</tt> (fig. 1.7). Alternatively, we may use the linear notation, which uses square brackets:
</p>
<p class="p-el">
[a,b,c]
</p>
<p class="tekst">
To increase readability of the tree-like notation, instead of
</p>
<p class="p-el">
.(First,Rest)
</p>
<p class="tekst">
one can also write
</p>
<p class="p-el">
[First|Rest]
</p>
<p class="tekst">
Note that <tt>Rest</tt> is a list: e.g., <tt>[a,b,c]</tt> is the same list as <tt>[a|[b,c]]</tt>. <tt>a</tt> is called the <i>head</i> of the list, and <tt>[b,c]</tt> is called its <i>tail</i>. Finally, to a certain extent the two notations can be mixed: at the head of the list, you can write any number of elements in linear notation. For instance,
</p>
<p class="p-el">
[First,Second,Third|Rest]
</p>
<p class="tekst">
denotes a list with three or more elements.
</p>
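<p class="tekst">
The notations can be tried out with a few one-line definitions. This is a minimal sketch; the predicate names <tt>head_tail</tt>, <tt>three_or_more</tt> and <tt>route_list</tt> are our own and serve only as illustration.
</p>

```prolog
% head_tail(?List,?Head,?Tail): List has first element Head and rest Tail.
head_tail([Head|Tail],Head,Tail).

% three_or_more(?List): List has at least three elements.
three_or_more([_First,_Second,_Third|_Rest]).

% route_list(+Route,-List): List is the linear notation of a route term
% built from the user-defined functor route and the terminator noroute.
route_list(noroute,[]).
route_list(route(Station,Rest),[Station|Stations]):-
    route_list(Rest,Stations).
```

<p class="tekst">
For instance, <tt>?-head_tail([a,b,c],H,T)</tt> binds <tt>H</tt> to <tt>a</tt> and <tt>T</tt> to <tt>[b,c]</tt>, while <tt>?-three_or_more([a,b])</tt> fails. The last relation translates the term <tt>route(tottenham_court_road,route(leicester_square,noroute))</tt> into the list <tt>[tottenham_court_road,leicester_square]</tt>.
</p>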
<div class="extract exercise" id="1.4">
<div class="AutoStyle06">
<p class="exercise AutoStyle07">
<i>Exercise 1.4.</i> A list is either the empty list <tt>[]</tt>, or a non-empty list <tt>[First|Rest]</tt> where <tt>Rest</tt> is a list. Define a relation <tt>list(L)</tt>, which checks whether <tt>L</tt> is a list. Adapt it such that it succeeds only for lists of (<i>i</i>) even length and (<i>ii</i>) odd length.
</p>
</div>
</div>
<p class="sektie">
The recursive nature of such datastructures makes it possible to ignore the size of the objects, which is extremely useful in many situations. For instance, the definition of a route between two underground stations does not depend on the length of the route; all that matters is whether there is an intermediate station or not. For both cases, there is a clause. Expressing the route as a list, we can state the final definition of the reachability relation:
</p>
<div class="extract swish" id="1.2.3">
<pre class="source swish inherit AutoStyle03" data-variant-id="group-1" id="swish.1.2.3" inherit-id="swish.1.0.1" query-text="?-reachable(oxford_circus,charing_cross,R). ?-reachable(X,charing_cross,[A,B,C,D]). ?-reachable(bond_street,piccadilly_circus,[A,B|L]).">
reachable(X,Y,[]):-connected(X,Y,L).
reachable(X,Y,[Z|R]):-connected(X,Z,L),
reachable(Z,Y,R).
</pre>
</div>
<p class="tekst">
The query <tt>?-reachable(oxford_circus,charing_cross,R)</tt> now results in the following answers:
</p>
<p class="tekst AutoStyle19">
{ <tt>R</tt>
<span class="AutoStyle09">
→
</span>
<tt>[tottenham_court_road,leicester_square]</tt> }<br>
{ <tt>R</tt>
<span class="AutoStyle09">
→
</span>
<tt>[piccadilly_circus]</tt> }<br>
{ <tt>R</tt>
<span class="AutoStyle09">
→
</span>
<tt>[piccadilly_circus, leicester_square]</tt> }
</p>
<p class="tekst">
Note that Prolog writes out lists of fixed length in the linear notation.
</p>
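<p class="tekst">
To run this definition outside the interactive boxes, it has to be combined with the <tt>connected</tt> facts. The listing below is a self-contained sketch. The facts are an assumption: they are meant to mirror the fragment of the Underground map introduced at the start of the chapter, and should be adjusted if your map differs.
</p>

```prolog
% Assumed facts connected(Station1,Station2,Line), intended to mirror the
% Underground fragment from the start of the chapter.
connected(bond_street,oxford_circus,central).
connected(oxford_circus,tottenham_court_road,central).
connected(bond_street,green_park,jubilee).
connected(green_park,charing_cross,jubilee).
connected(green_park,piccadilly_circus,piccadilly).
connected(piccadilly_circus,leicester_square,piccadilly).
connected(green_park,oxford_circus,victoria).
connected(oxford_circus,piccadilly_circus,bakerloo).
connected(piccadilly_circus,charing_cross,bakerloo).
connected(tottenham_court_road,leicester_square,northern).
connected(leicester_square,charing_cross,northern).

% reachable(X,Y,Route): Y can be reached from X via the list of
% intermediate stations Route. The line names are irrelevant here,
% so they are written as anonymous variables.
reachable(X,Y,[]):-connected(X,Y,_).
reachable(X,Y,[Z|R]):-connected(X,Z,_),
    reachable(Z,Y,R).
```

<p class="tekst">
With these facts, the query <tt>?-reachable(oxford_circus,charing_cross,R)</tt> yields the three lists given above, in the same order, because Prolog tries the clauses and facts from top to bottom.
</p>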
<p class="sektie">
Should we for some reason want to know from which station Charing Cross can be reached via a route with four intermediate stations, we should ask the query
</p>
<p class="sektie AutoStyle23">
<tt>?-reachable(X,charing_cross,[A,B,C,D])</tt>
</p>
<p class="tekst">
which results in two answers:
</p>
<p class="p-eerst AutoStyle24">
<span class="AutoStyle25">
{
</span>
X
<span class="AutoStyle09">
→
</span>
bond_street
<span class="AutoStyle25">
,
</span>
A
<span class="AutoStyle09">
→
</span>
green_park
<span class="AutoStyle25">
,
</span>
B
<span class="AutoStyle09">
→
</span>
oxford_circus
<span class="AutoStyle25">
,
</span>
C
<span class="AutoStyle09">
→
</span>
tottenham_court_road
<span class="AutoStyle25">
,
</span>
D
<span class="AutoStyle09">
→
</span>
leicester_square
<span class="AutoStyle25">
}
</span>
</p>
<p class="p-laatst AutoStyle26">
<span class="AutoStyle25">
{
</span>
X
<span class="AutoStyle09">
→
</span>
bond_street
<span class="AutoStyle25">
,
</span>
A
<span class="AutoStyle09">
→
</span>
green_park
<span class="AutoStyle25">
,
</span>
B
<span class="AutoStyle09">
→
</span>
oxford_circus
<span class="AutoStyle25">
,
</span>
C
<span class="AutoStyle09">
→
</span>
piccadilly_circus
<span class="AutoStyle25">
,
</span>
D
<span class="AutoStyle09">
→
</span>
leicester_square
<span class="AutoStyle25">
}.
</span>
</p>
<div class="extract exercise" id="1.5">
<div class="AutoStyle06">
<p class="exercise AutoStyle07">
<i>Exercise 1.5.</i> Construct a query asking for a route from Bond Street to Piccadilly Circus with at least two intermediate stations.
</p>
</div>
</div>
<h3 id="what_else_is_there_to_know_about_clausal_logic">
1.4 What else is there to know about clausal logic?
</h3>
<p class="sektie1">
The main goal of this chapter has been to introduce the most important concepts in clausal logic, and how it can be used as a reasoning formalism. Needless to say, a subject like this needs a much more extensive and precise discussion than has been attempted here, and many important questions remain. To name a few:
</p>
<p class="opsomming AutoStyle27">
• what are the limits of expressiveness of clausal logic, i.e. what can and what cannot be expressed?
</p>
<p class="opsomming AutoStyle27">
• what are the limits of reasoning with clausal logic, i.e. what can and what cannot be (efficiently) computed?
</p>
<p class="opsomming AutoStyle27">
• how are these two limits related: is it for instance possible to enhance reasoning by limiting expressiveness?
</p>
<p class="tekst">
In order to start answering such questions, we need to be more precise in defining what clausal logic is, what expressions in clausal logic mean, and how we can reason with them. That means that we will have to introduce some theory in the next chapter. This theory will not only be useful for a better understanding of Logic Programming, but it will also be the foundation for most of the topics in Part III (<i>Advanced reasoning techniques</i>).
</p>
<p class="sektie">
Another aim of Part I of this book is to teach the skill of programming in Prolog. For this, theory alone, however important, will not suffice. Like any programming language, Prolog has a number of built-in procedures and datastructures that you should know about. Furthermore, there are of course numerous programming techniques and tricks of the trade, with which the Prolog programmer should be familiar. These subjects will be discussed in Chapter 3. Together, Chapters 2 and 3 will provide a solid foundation for the rest of the book.
</p>
<div class="WordSection3">
<p class="cijfer" id="clausal_logic_and_resolution_theoretical_backgrounds">
2
</p>
<h2 id="h_clausal_logic_and_resolution_theoretical_backgrounds">
Clausal logic and resolution:<br>
theoretical backgrounds
</h2>
<p class="sektie1">
In this chapter we develop a more formal view of Logic Programming by means of a rigorous treatment of clausal logic and resolution theorem proving. Any such treatment has three parts: syntax, semantics, and proof theory. <i>Syntax</i> defines the logical language we are using, i.e. the alphabet, different kinds of ‘words’, and the allowed ‘sentences’. <i>Semantics</i> defines, in some formal way, the meaning of words and sentences in the language. As with most logics, semantics for clausal logic is <i>truth-functional</i>, i.e. the meaning of a sentence is defined by specifying the conditions under which it is assigned certain <i>truth values</i> (in our case: <b>true</b> or <b>false</b>). Finally, <i>proof theory</i> specifies how we can obtain new sentences (theorems) from assumed ones (axioms) by means of pure symbol manipulation (inference rules).
</p>
<p class="sektie">
Of these three, proof theory is most closely related to Logic Programming, because answering queries is in fact no different from proving theorems. In addition to proof theory, we need semantics for deciding whether the things we prove actually make sense. For instance, we need to be sure that the truth of the theorems is assured by the truth of the axioms. If our inference rules guarantee this, they are said to be <i>sound</i>. But this will not be enough, because sound inference rules can actually be very weak, and unable to prove anything of interest. We also need to be sure that the inference rules are powerful enough to eventually prove any possible theorem: they should be <i>complete</i>.
</p>
<p class="sektie">
Concepts like soundness and completeness are called <i>meta-theoretical</i>, since they are not expressed <b>in</b> the logic under discussion, but rather belong to a theory <b>about</b> that logic (‘meta’ means above). Their significance is not merely theoretical, but extends to logic programming languages like Prolog. For example, if a logic programming language is unsound, it will give wrong answers to some queries; if it is incomplete, it will give no answer to some other queries. Ideally, a logic programming language should be sound and complete; in practice, this will not be the case. For instance, in the next chapter we will see that Prolog is both unsound and incomplete. This has been a deliberate design choice: a sound and complete Prolog would be much less efficient. Nevertheless, any Prolog programmer should know exactly the circumstances under which Prolog is unsound or incomplete, and avoid these circumstances in her programs.
</p>
<p class="sektie">
The structure of this chapter is as follows. We start with a very simple (propositional) logical language, and enrich this language in two steps to full clausal logic. For each of these three languages, we discuss syntax, semantics, proof theory, and meta-theory. We then discuss definite clause logic, which is the subset of clausal logic used in Prolog. Finally, we relate clausal logic to Predicate Logic, and show that they are essentially equal in expressive power.
</p>
<h3 id="propositional_clausal_logic">
2.1 Propositional clausal logic
</h3>
<p class="sektie1">
Informally, a <i>proposition</i> is any statement which is either true or false, such as ‘2 + 2 = 4’ or ‘the moon is made of green cheese’. These are the building blocks of propositional logic, the weakest form of logic.
</p>
<p class="sektie1">
<i>Syntax. </i> Propositions are abstractly denoted by <i>atoms</i>, which are single words starting with a lowercase character. For instance, <tt>married</tt> is an atom denoting the proposition ‘he/she is married’; similarly, <tt>man</tt> denotes the proposition ‘he is a man’. Using the special symbols ‘ <tt>:-</tt> ’ (<b>if</b>), ‘ <tt>;</tt> ’ (<b>or</b>) and ‘ <tt>,</tt> ’ (<b>and</b>), we can combine atoms to form <i>clauses</i>. For instance,
</p>
<p class="p-el">
married;bachelor:-man,adult
</p>
<p class="tekst">
is a clause, with intended meaning: ‘somebody is married <b>or</b> a bachelor <b>if</b> he is a man <b>and</b> an adult’
<span class="CustomFootnote">
<a href="#_ftn2" name="_ftnref2" title="">
<span class="MsoFootnoteReference">
<span class="AutoStyle13">
<span class="AutoStyle14">
[2]
</span>
</span>
</span>
</a>
</span>
. The part to the left of the if-symbol ‘ <tt>:-</tt> ’ is called the <i>head</i> of the clause, and the right part is called the <i>body</i> of the clause. The head of a clause is always a disjunction (<b>or</b>) of atoms, and the body of a clause is always a conjunction (<b>and</b>).
</p>
<div class="extract exercise" id="2.1">
<div class="AutoStyle06">
<p class="exercise AutoStyle07">
<i>Exercise 2.1.</i> Translate the following statements into clauses, using the atoms <tt>person</tt>, <tt>sad</tt> and <tt>happy</tt>:<br>
(<i>a</i>) persons are happy or sad;<br>
(<i>b</i>) no person is both happy and sad;<br>
(<i>c</i>) sad persons are not happy;<br>
(<i>d</i>) non-happy persons are sad.
</p>
</div>
</div>
<p class="sektie">
A <i>program</i> is a set of clauses, each of them terminated by a period. The clauses are to be read conjunctively; for example, the program
</p>
<p class="p-el">
woman;man:-human.<br>
human:-woman.<br>
human:-man.
</p>
<p class="tekst">
has the intended meaning ‘(<b>if</b> someone is human <b>then</b> she/he is a woman <b>or</b> a man) <b>and</b> (<b>if</b> someone is a woman <b>then</b> she is human) <b>and</b> (<b>if</b> someone is a man <b>then</b> he is human)’, or, in other words, ‘someone is human <b>if and only if</b> she/he is a woman <b>or</b> a man’.
</p>
<p class="sektie1">
<i>Semantics. </i> The <i>Herbrand base</i> of a program <i>P</i> is the set of atoms occurring in <i>P</i>. For the above program, the Herbrand base is { <tt>woman</tt>, <tt>man</tt>, <tt>human</tt> }. A <i>Herbrand interpretation</i> (or interpretation for short) for <i>P</i> is a mapping from the Herbrand base of <i>P</i> into the set of truth values { <b>true</b>, <b>false</b> }. For example, the mapping { <tt>woman</tt>
<span class="AutoStyle09">
→
</span>
<b>true</b>, <tt>man</tt>
<span class="AutoStyle09">
→
</span>
<b>false</b>, <tt>human</tt>
<span class="AutoStyle09">
→
</span>
<b>true</b> } is a Herbrand interpretation for the above program. A Herbrand interpretation can be viewed as describing a possible state of affairs in the Universe of Discourse (in this case: ‘she is a woman, she is not a man, she is human’). Since there are only two possible truth values in the semantics we are considering, we could abbreviate such mappings by listing only the atoms that are assigned the truth value <b>true</b>; by definition, the remaining ones are assigned the truth value <b>false</b>. Under this convention, which we will adopt in this book, a Herbrand interpretation is simply a subset of the Herbrand base. Thus, the previous Herbrand interpretation would be represented as { <tt>woman</tt>, <tt>human</tt> }.
</p>
<p class="sektie">
Since a Herbrand interpretation assigns truth values to every atom in a clause, it also assigns a truth value to the clause as a whole. The rules for determining the truth value of a clause from the truth values of its atoms are not so complicated, if you keep in mind that the body of a clause is a conjunction of atoms, and the head is a disjunction. Consequently, the body of a clause is <b>true</b> if every atom in it is <b>true</b>, and the head of a clause is <b>true</b> if at least one atom in it is <b>true</b>. In turn, the truth value of the clause is determined by the truth values of head and body. There are four possibilities:
</p>
<p class="opsomming">
(<i>i</i>) the body is <b>true</b>, and the head is <b>true</b>;
</p>
<p class="opsomming">
(<i>ii</i>) the body is <b>true</b>, and the head is <b>false</b>;
</p>
<p class="opsomming">
(<i>iii</i>) the body is <b>false</b>, and the head is <b>true</b>;
</p>
<p class="opsomming">
(<i>iv</i>) the body is <b>false</b>, and the head is <b>false</b>.
</p>
<p class="tekst">
The intended meaning of the clause is ‘ <b>if</b> body <b>then</b> head’, which is obviously <b>true</b> in the first case, and <b>false</b> in the second case.
</p>
<p class="sektie">
What about the remaining two cases? They cover statements like ‘ <b>if</b> the moon is made of green cheese <b>then</b> 2 + 2 = 4’, in which there is no connection at all between body and head. One would like to say that such statements are neither <b>true</b> nor <b>false</b>. However, our semantics is not sophisticated enough to deal with this: it simply insists that clauses should be assigned a truth value in every possible interpretation. Therefore, we consider the clause to be <b>true</b> whenever its body is <b>false</b>. It is not difficult to see that under these truth conditions a clause is equivalent to the statement ‘head <b>or not</b> body’. For example, the clause <tt>married;bachelor:-man,adult</tt> can also be read as ‘someone is married <b>or</b> a bachelor <b>or not</b> a man <b>or not</b> an adult’. Thus, a clause is a disjunction of atoms, which are negated if they occur in the body of the clause. Therefore, the atoms in the body of the clause are often called <i>negative literals</i>, while those in the head of the clause are called <i>positive literals</i>.
</p>
<p class="sektie">
To summarise: a clause is assigned the truth value <b>true</b> in an interpretation, if and only if at least one of the following conditions is true: (<i>a</i>) at least one atom in the body of the clause is <b>false</b> in the interpretation (cases (<i>iii</i>) and (<i>iv</i>)), or (<i>b</i>) at least one atom in the head of the clause is <b>true</b> in the interpretation (cases (<i>i</i>) and (<i>iii</i>)). If a clause is <b>true</b> in an interpretation, we say that the interpretation is a <i>model</i> for the clause. An interpretation is a model for a program if it is a model for each clause in the program. For example, the above program has the following models:
<span class="AutoStyle09">
∅
</span>
(the empty model, assigning <b>false</b> to every atom), { <tt>woman</tt>, <tt>human</tt> }, { <tt>man</tt>, <tt>human</tt> }, and { <tt>woman</tt>, <tt>man</tt>, <tt>human</tt> }. Since there are eight possible interpretations for a Herbrand base with three atoms, this means that the program contains enough information to rule out half of these.
</p>
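<p class="tekst">
The model check can be carried out mechanically by enumerating all eight interpretations. The following is a minimal sketch, assuming SWI-Prolog. The representation <tt>cl(Head,Body)</tt> and all predicate names are our own choices for illustration, and the operator <tt>\+</tt> (Prolog's negation, discussed in Chapter 3) is used ahead of its introduction.
</p>

```prolog
:- use_module(library(lists)).      % member/2 (SWI-Prolog assumed)

% Hypothetical representation: cl(Head,Body), both lists of atoms.
program([ cl([woman,man],[human]),  % woman;man:-human.
          cl([human],[woman]),      % human:-woman.
          cl([human],[man]) ]).     % human:-man.

% interpretation(+Base,-I): I lists the atoms of Base that are true.
interpretation([],[]).
interpretation([A|As],[A|I]):-interpretation(As,I).
interpretation([_|As],I):-interpretation(As,I).

% true_in(+I,+Clause): some head atom is true in I,
% or some body atom is false in I.
true_in(I,cl(Head,_)):-member(A,Head),memberchk(A,I).
true_in(I,cl(_,Body)):-member(B,Body),\+ memberchk(B,I).

% model(+I,+Program): no clause of Program is false in I.
model(I,Program):- \+ ( member(C,Program), \+ true_in(I,C) ).

% models(-Ms): all models of the program over its Herbrand base.
models(Ms):-
    program(P),
    findall(I,( interpretation([woman,man,human],I), model(I,P) ),Ms).
```

<p class="tekst">
The query <tt>?-models(Ms)</tt> returns the four models mentioned above, with the empty list <tt>[]</tt> standing for the empty model.
</p>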
<p class="sektie">
Adding more clauses to the program means restricting its set of models. For instance, if we add the clause <tt>woman</tt> (a clause with an empty body) to the program, we rule out the first and third model, which leaves us with the models { <tt>woman</tt>, <tt>human</tt> }, and { <tt>woman</tt>, <tt>man</tt>, <tt>human</tt> }. Note that in both of these models, <tt>human</tt> is <b>true</b>. We say that <tt>human</tt> is a logical consequence of the set of clauses. In general, a clause <i>C</i> is a <i>logical consequence</i> of a program <i>P</i> if every model of the program is also a model of the clause; we write <i>P</i> ⊨ <i>C</i>.
</p>
<div class="extract exercise" id="2.2">
<div class="AutoStyle06">
<p class="exercise AutoStyle07">
<i>Exercise 2.2.</i> Given the program<br>
<tt>married;bachelor:-man,adult.<br>
man.<br>
:-bachelor.</tt><br>
determine which of the following clauses are logical consequences of this program:<br>
(<i>a</i>) <tt>married:-adult</tt>;<br>
(<i>b</i>) <tt>married:-bachelor</tt>;<br>
(<i>c</i>) <tt>bachelor:-man</tt>;<br>
(<i>d</i>) <tt>bachelor:-bachelor</tt>.
</p>
</div>
</div>
<p class="sektie">
Of the two remaining models, obviously { <tt>woman</tt>, <tt>human</tt> } is the intended one; but the program does not yet contain enough information to distinguish it from the non-intended model { <tt>woman</tt>, <tt>man</tt>, <tt>human</tt> }. We can add yet another clause, to make sure that the atom <tt>man</tt> is mapped to <b>false</b>. For instance, we could add
</p>
<p class="p-el">
:-man
</p>
<p class="tekst">
(it is not a man) or
</p>
<p class="p-el">
:-man,woman
</p>
<p class="tekst">
(nobody is both a man and a woman). However, explicitly stating everything that is false in the intended model is not always feasible. Consider, for example, an airline database consulted by travel agencies: we simply want to say that if a particular flight (i.e., a combination of plane, origin, destination, date and time) is not listed in the database, then it does not exist, instead of listing all the dates that a particular plane does <b>not</b> fly from Amsterdam to London.
</p>
<p class="sektie">
So, instead of adding clauses until a single model remains, we want to add a rule to our semantics which tells us which of the several models is the intended one. The airline example shows us that, in general, we only want to accept something as <b>true</b> if we are really forced to, i.e. if it is <b>true</b> in every possible model. This means that we should take the intersection of all models of a program in order to construct the intended model. In the example, this is { <tt>woman</tt>, <tt>human</tt> }. Note that this model is <i>minimal</i> in the sense that no proper subset of it is also a model. Therefore, this semantics is called a <i>minimal model semantics</i>.
</p>
<p class="sektie">
Unfortunately, this approach is only applicable to a restricted class of programs. Consider the following program:
</p>
<p class="p-el">
woman;man:-human.<br>
human.
</p>
<p class="tekst">
This program has three models: { <tt>woman</tt>, <tt>human</tt> }, { <tt>man</tt>, <tt>human</tt> }, and { <tt>woman</tt>, <tt>man</tt>, <tt>human</tt> }. The intersection of these models is { <tt>human</tt> }, but this interpretation is not a model of the first clause! The program has in fact not one, but <b>two</b> minimal models, which is caused by the fact that the first clause has a disjunctive head. Such a clause is called <i>indefinite</i>, because it does not permit definite conclusions to be drawn.
</p>
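<p class="tekst">
The failure of the intersection construction for this program can be checked by brute force. The sketch below assumes SWI-Prolog. The representation <tt>cl(Head,Body)</tt> and the predicate names are hypothetical, chosen only for this illustration, and <tt>\+</tt> is Prolog's negation, discussed in Chapter 3.
</p>

```prolog
:- use_module(library(lists)).      % member/2, subset/2 (SWI-Prolog assumed)

% Hypothetical representation: cl(Head,Body), both lists of atoms.
program([ cl([woman,man],[human]),  % woman;man:-human.
          cl([human],[]) ]).        % human.

% interpretation(+Base,-I): I lists the atoms of Base that are true.
interpretation([],[]).
interpretation([A|As],[A|I]):-interpretation(As,I).
interpretation([_|As],I):-interpretation(As,I).

% true_in(+I,+Clause): a head atom is true in I or a body atom is false in I.
true_in(I,cl(Head,_)):-member(A,Head),memberchk(A,I).
true_in(I,cl(_,Body)):-member(B,Body),\+ memberchk(B,I).

% model(+I,+Program): no clause of Program is false in I.
model(I,Program):- \+ ( member(C,Program), \+ true_in(I,C) ).

models(Ms):-
    program(P),
    findall(I,( interpretation([woman,man,human],I), model(I,P) ),Ms).

% common(+Ms,-Common): the atoms that are true in every model in Ms.
common([M|Rest],Common):-
    findall(A,( member(A,M), \+ ( member(N,Rest), \+ memberchk(A,N) ) ),Common).

% minimal(+M,+Ms): no other model in Ms is a subset of M.
minimal(M,Ms):- \+ ( member(N,Ms), N \== M, subset(N,M) ).

minimal_models(Min):-
    models(Ms),
    findall(M,( member(M,Ms), minimal(M,Ms) ),Min).
```

<p class="tekst">
Here <tt>?-models(Ms),common(Ms,I)</tt> binds <tt>I</tt> to <tt>[human]</tt>, for which <tt>model/2</tt> fails, while <tt>?-minimal_models(Min)</tt> returns the two minimal models <tt>[woman,human]</tt> and <tt>[man,human]</tt>.
</p>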
<p class="sektie">
On the other hand, if we only allow <i>definite</i> clauses, i.e. clauses with a single positive literal, minimal models are guaranteed to be unique. We will deal with definite clauses in section 2.4, because Prolog is based on definite clause logic. In principle, this means that clauses like <tt>woman;man:-human</tt> are not expressible in Prolog. However, such a clause can be transformed into a ‘pseudo-definite’ clause by moving one of the literals in the head to the body, extended with an extra negation. This gives the following two possibilities:
</p>
<p class="p-el">
woman:-human,not(man).<br>
man:-human,not(woman).
</p>
<p class="tekst">
In Prolog, we have to choose between these two clauses, which means that we have only an approximation of the original indefinite clause. Negation in Prolog is an important subject with many aspects. In Chapter 3, we will show how Prolog handles negation in the body of clauses. In Chapter 8, we will discuss particular applications of this kind of negation.
</p>
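<p class="tekst">
As a small illustration of what this choice amounts to, the following sketch commits to the first of the two clauses. It assumes a standard Prolog system in which <tt>\+</tt> denotes negation (written <tt>not</tt> above); the <tt>dynamic</tt> declaration is our own addition, needed only so that calling <tt>man</tt> fails quietly rather than raising an error about an undefined predicate.
</p>

```prolog
:- dynamic man/0.          % man has no clauses: calling it simply fails

human.
woman:-human,\+ man.       % first pseudo-definite variant of woman;man:-human
```

<p class="tekst">
With this program the query <tt>?-woman</tt> succeeds and <tt>?-man</tt> fails. Had we chosen the second clause instead, the roles of <tt>woman</tt> and <tt>man</tt> would be reversed, which shows that each variant captures only part of the original indefinite clause.
</p>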
<p class="sektie1">
<i>Proof theory. </i> Recall that a clause <i>C</i> is a logical consequence of a program <i>P</i> (<i>P</i> ⊨ <i>C</i>) if every model of <i>P</i> is a model of <i>C</i>. Checking this condition is, in general, infeasible. Therefore, we need a more efficient way of computing logical consequences, by means of inference rules. If <i>C</i> can be derived from <i>P</i> by means of a number of applications of such inference rules, we say that <i>C</i> can be <i>proved</i> from <i>P</i>. Such inference rules are purely syntactic, and do not refer to any underlying semantics.
</p>