diff --git a/AUTHORS b/AUTHORS new file mode 100644 index 0000000..784002e --- /dev/null +++ b/AUTHORS @@ -0,0 +1,43 @@ +src -*- text -*- +--- + +* Pintos core originally written by Ben Pfaff . + +* Additional features contributed by Anthony Romano . + +* The original structure and form of this operating system is inspired + by the Nachos system from the University of California, Berkeley. A + few of the source files are more-or-less literal translations of the + Nachos C++ code into C. These files bear the original UCB license + notice. + +projects +-------- + +* The projects are primarily the creation of Ben Pfaff + . + +* Godmar Back made significant contributions to + project design. + +* Although little remains unchanged, the projects were originally + derived from those designed for Nachos by current and former CS140 + teaching assistants at Stanford University, including at least the + following people: + + - Yu Ping + + - Greg Hutchins + + - Kelly Shaw , + + - Paul Twohey + + - Sameer Qureshi + + - John Rector + + If you're not on this list but should be, please let me know. + +* Example code for condition variables is from classroom slides + originally by Dawson Engler and updated by Mendel Rosenblum. diff --git a/LICENSE b/LICENSE new file mode 100644 index 0000000..64103d0 --- /dev/null +++ b/LICENSE @@ -0,0 +1,56 @@ +Pintos, including its documentation, is subject to the following +license: + + Copyright (C) 2004, 2005, 2006 Board of Trustees, Leland Stanford + Jr. University. All rights reserved. 
+ + Permission is hereby granted, free of charge, to any person obtaining + a copy of this software and associated documentation files (the + "Software"), to deal in the Software without restriction, including + without limitation the rights to use, copy, modify, merge, publish, + distribute, sublicense, and/or sell copies of the Software, and to + permit persons to whom the Software is furnished to do so, subject to + the following conditions: + + The above copyright notice and this permission notice shall be + included in all copies or substantial portions of the Software. + + THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, + EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF + MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND + NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE + LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION + OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION + WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. + +A few individual files in Pintos were originally derived from other +projects, but they have been extensively modified for use in Pintos. +The original code falls under the original license, and modifications +for Pintos are additionally covered by the Pintos license above. + +In particular, code derived from Nachos is subject to the following +license: + +/* Copyright (c) 1992-1996 The Regents of the University of California. + All rights reserved. + + Permission to use, copy, modify, and distribute this software + and its documentation for any purpose, without fee, and + without written agreement is hereby granted, provided that the + above copyright notice and the following two paragraphs appear + in all copies of this software. 
+ + IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO + ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR + CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OF THIS SOFTWARE + AND ITS DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA + HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + + THE UNIVERSITY OF CALIFORNIA SPECIFICALLY DISCLAIMS ANY + WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED + WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR + PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS ON AN "AS IS" + BASIS, AND THE UNIVERSITY OF CALIFORNIA HAS NO OBLIGATION TO + PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR + MODIFICATIONS. +*/ diff --git a/Makefile b/Makefile new file mode 100644 index 0000000..2c374e7 --- /dev/null +++ b/Makefile @@ -0,0 +1,13 @@ +CLEAN_SUBDIRS = src doc tests + +all:: + @echo "This makefile has only 'clean' and 'check' targets." + +clean:: + for d in $(CLEAN_SUBDIRS); do $(MAKE) -C $$d $@; done + +distclean:: clean + find . 
-name '*~' -exec rm '{}' \; + +check:: + make -C tests $@ diff --git a/doc/.gitignore b/doc/.gitignore new file mode 100644 index 0000000..b5ab7c5 --- /dev/null +++ b/doc/.gitignore @@ -0,0 +1,26 @@ +*.aux +*.cp +*.dvi +*.fn +*.info* +*.ky +*.log +*.pg +*.toc +*.tp +*.vr +/pintos-ic.fns +/pintos-ic.tps +/pintos-ic.vrs +mlfqs1.pdf +mlfqs1.png +mlfqs2.pdf +mlfqs2.png +pintos-ic.html +pintos-ic.pdf +pintos-ic.ps +pintos.text +pintos-ic_*.html +projects.html +sample.tmpl.texi +task0_sheet.pdf diff --git a/doc/44bsd.texi b/doc/44bsd.texi new file mode 100644 index 0000000..471d1ab --- /dev/null +++ b/doc/44bsd.texi @@ -0,0 +1,392 @@ +@node 4.4BSD Scheduler +@appendix 4.4@acronym{BSD} Scheduler + +@iftex +@macro tm{TEX} +@math{\TEX\} +@end macro +@macro nm{TXT} +@end macro +@macro am{TEX, TXT} +@math{\TEX\} +@end macro +@end iftex + +@ifnottex +@macro tm{TEX} +@end macro +@macro nm{TXT} +@w{\TXT\} +@end macro +@macro am{TEX, TXT} +@w{\TXT\} +@end macro +@end ifnottex + +@ifhtml +@macro math{TXT} +\TXT\ +@end macro +@end ifhtml + +@macro m{MATH} +@am{\MATH\, \MATH\} +@end macro + +The goal of a general-purpose scheduler is to balance threads' different +scheduling needs. Threads that perform a lot of I/O require a fast +response time to keep input and output devices busy, but need little CPU +time. On the other hand, compute-bound threads need to receive a lot of +CPU time to finish their work, but have no requirement for fast response +time. Other threads lie somewhere in between, with periods of I/O +punctuated by periods of computation, and thus have requirements that +vary over time. A well-designed scheduler can often accommodate threads +with all these requirements simultaneously. + +For task 1, you must implement the scheduler described in this +appendix. Our scheduler resembles the one described in @bibref{McKusick}, +which is one example of a @dfn{multilevel feedback queue} scheduler. 
+This type of scheduler maintains several queues of ready-to-run threads, +where each queue holds threads with a different priority. At any given +time, the scheduler chooses a thread from the highest-priority non-empty +queue. If the highest-priority queue contains multiple threads, then +they run in ``round robin'' order. + +Multiple facets of the scheduler require data to be updated after a +certain number of timer ticks. In every case, these updates should +occur before any ordinary kernel thread has a chance to run, so that +there is no chance that a kernel thread could see a newly increased +@func{timer_ticks} value but old scheduler data values. + +The 4.4@acronym{BSD} scheduler does not include priority donation. + +@menu +* Thread Niceness:: +* Calculating Priority:: +* Calculating recent_cpu:: +* Calculating load_avg:: +* 4.4BSD Scheduler Summary:: +* Fixed-Point Real Arithmetic:: +@end menu + +@node Thread Niceness +@section Niceness + +Thread priority is dynamically determined by the scheduler using a +formula given below. However, each thread also has an integer +@dfn{nice} value that determines how ``nice'' the thread should be to +other threads. A @var{nice} of zero does not affect thread priority. A +positive @var{nice}, to the maximum of 20, decreases the priority of a +thread and causes it to give up some CPU time it would otherwise receive. +On the other hand, a negative @var{nice}, to the minimum of -20, tends +to take away CPU time from other threads. + +The initial thread starts with a @var{nice} value of zero. Other +threads start with a @var{nice} value inherited from their parent +thread. You must implement the functions described below, which are for +use by test programs. We have provided skeleton definitions for them in +@file{threads/thread.c}. + +@deftypefun int thread_get_nice (void) +Returns the current thread's @var{nice} value. 
+@end deftypefun + +@deftypefun void thread_set_nice (int @var{new_nice}) +Sets the current thread's @var{nice} value to @var{new_nice} and +recalculates the thread's priority based on the new value +(@pxref{Calculating Priority}). If the running thread no longer has the +highest priority, yields. +@end deftypefun + +@node Calculating Priority +@section Calculating Priority + +Our scheduler has 64 priorities and thus 64 ready queues, numbered 0 +(@code{PRI_MIN}) through 63 (@code{PRI_MAX}). Lower numbers correspond +to lower priorities, so that priority 0 is the lowest priority +and priority 63 is the highest. Thread priority is calculated initially +at thread initialization. It is also recalculated once every fourth +clock tick, for every thread. In either case, it is determined by +the formula + +@center @t{@var{priority} = @code{PRI_MAX} - (@var{recent_cpu} / 4) - (@var{nice} * 2)}, + +@noindent where @var{recent_cpu} is an estimate of the CPU time the +thread has used recently (see below) and @var{nice} is the thread's +@var{nice} value. The result should be rounded down to the nearest +integer (truncated). +The coefficients @math{1/4} and 2 on @var{recent_cpu} +and @var{nice}, respectively, have been found to work well in practice +but lack deeper meaning. The calculated @var{priority} is always +adjusted to lie in the valid range @code{PRI_MIN} to @code{PRI_MAX}. + +This formula gives a thread that has received CPU +time recently lower priority for being reassigned the CPU the next +time the scheduler runs. This is key to preventing starvation: a +thread that has not received any CPU time recently will have a +@var{recent_cpu} of 0, which barring a high @var{nice} value should +ensure that it receives CPU time soon. 
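The priority formula above can be sketched in C as follows. This is a minimal illustration, not the Pintos implementation: the helper name calc_priority and the constant FIXED_F are hypothetical, and the sketch assumes that recent_cpu is stored in the 17.14 fixed-point format described in the Fixed-Point Real Arithmetic section, while nice is a plain integer.

```c
#include <assert.h>
#include <stdint.h>

/* PRI_MIN and PRI_MAX are the bounds named in the text above. */
#define PRI_MIN 0
#define PRI_MAX 63

/* Assumption: recent_cpu is kept in 17.14 fixed point, so FIXED_F = 2**14. */
#define FIXED_F (1 << 14)

/* Hypothetical helper, not a Pintos function.  Evaluates
   PRI_MAX - (recent_cpu / 4) - (nice * 2), truncates the result to an
   integer, and clamps it to PRI_MIN..PRI_MAX.  recent_cpu_fp is
   recent_cpu in 17.14 fixed-point format. */
int
calc_priority (int32_t recent_cpu_fp, int nice)
{
  assert (nice >= -20 && nice <= 20);

  /* Keep everything in fixed point until the final division. */
  int32_t priority_fp = PRI_MAX * FIXED_F
                        - recent_cpu_fp / 4
                        - nice * 2 * FIXED_F;

  /* C integer division truncates toward zero. */
  int priority = priority_fp / FIXED_F;

  /* Adjust the result to lie in the valid range. */
  if (priority < PRI_MIN)
    priority = PRI_MIN;
  if (priority > PRI_MAX)
    priority = PRI_MAX;
  return priority;
}
```

For instance, with a recent_cpu of 10.5 and a nice of 1 the formula gives 58.375, which truncates to 58. Truncating toward zero rather than toward negative infinity makes no difference here, because any negative intermediate result is clamped to PRI_MIN anyway.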
+ +@node Calculating recent_cpu +@section Calculating @var{recent_cpu} + +We wish @var{recent_cpu} to measure how much CPU time each process has +received ``recently.'' Furthermore, as a refinement, more recent CPU +time should be weighted more heavily than less recent CPU time. One +approach would use an array of @var{n} elements to +track the CPU time received in each of the last @var{n} seconds. +However, this approach requires O(@var{n}) space per thread and +O(@var{n}) time per calculation of a new weighted average. + +Instead, we use an @dfn{exponentially weighted moving average}, which +takes this general form: + +@center @tm{x(0) = f(0),}@nm{x(0) = f(0),} +@center @tm{x(t) = ax(t-1) + f(t),}@nm{x(t) = a*x(t-1) + f(t),} +@center @tm{a = k/(k+1),}@nm{a = k/(k+1),} + +@noindent where @math{x(t)} is the moving average at integer time @am{t +\ge 0, t >= 0}, @math{f(t)} is the function being averaged, and @math{k +> 0} controls the rate of decay. We can iterate the formula over a few +steps as follows: + +@center @math{x(1) = f(1)}, +@center @am{x(2) = af(1) + f(2), x(2) = a*f(1) + f(2)}, +@center @am{\vdots, ...} +@center @am{x(5) = a^4f(1) + a^3f(2) + a^2f(3) + af(4) + f(5), x(5) = a**4*f(1) + a**3*f(2) + a**2*f(3) + a*f(4) + f(5)}. + +@noindent The value of @math{f(t)} has a weight of 1 at time @math{t}, a +weight of @math{a} at time @math{t+1}, @am{a^2, a**2} at time +@math{t+2}, and so on. We can also relate @math{x(t)} to @math{k}: +@math{f(t)} has a weight of approximately @math{1/e} at time @math{t+k}, +approximately @am{1/e^2, 1/e**2} at time @am{t+2k, t+2*k}, and so on. +From the opposite direction, @math{f(t)} decays to weight @math{w} at +time @am{t + \log_aw, t + ln(w)/ln(a)}. + +The initial value of @var{recent_cpu} is 0 in the first thread +created, or the parent's value in other new threads. Each time a timer +interrupt occurs, @var{recent_cpu} is incremented by 1 for the running +thread only, unless the idle thread is running. 
In addition, once per +second the value of @var{recent_cpu} +is recalculated for every thread (whether running, ready, or blocked), +using this formula: + +@center @t{@var{recent_cpu} = (2*@var{load_avg})/(2*@var{load_avg} + 1) * @var{recent_cpu} + @var{nice}}, + +@noindent where @var{load_avg} is a moving average of the number of +threads ready to run (see below). If @var{load_avg} is 1, indicating +that a single thread, on average, is competing for the CPU, then the +current value of @var{recent_cpu} decays to a weight of .1 in +@am{\log_{2/3}.1 \approx 6, ln(.1)/ln(2/3) = approx. 6} seconds; if +@var{load_avg} is 2, then decay to a weight of .1 takes @am{\log_{4/5}.1 +\approx 10, ln(.1)/ln(4/5) = approx. 10} seconds. The effect is that +@var{recent_cpu} estimates the amount of CPU time the thread has +received ``recently,'' with the rate of decay inversely proportional to +the number of threads competing for the CPU. + +Assumptions made by some of the tests require that these recalculations of +@var{recent_cpu} be made exactly when the system tick counter reaches a +multiple of a second, that is, when @code{timer_ticks () % TIMER_FREQ == +0}, and not at any other time. + +The value of @var{recent_cpu} can be negative for a thread with a +negative @var{nice} value. Do not clamp negative @var{recent_cpu} to 0. + +You may need to think about the order of calculations in this formula. +We recommend computing the coefficient of @var{recent_cpu} first, then +multiplying. Some students have reported that multiplying +@var{load_avg} by @var{recent_cpu} directly can cause overflow. + +You must implement @func{thread_get_recent_cpu}, for which there is a +skeleton in @file{threads/thread.c}. + +@deftypefun int thread_get_recent_cpu (void) +Returns 100 times the current thread's @var{recent_cpu} value, rounded +to the nearest integer. 
+@end deftypefun + +@node Calculating load_avg +@section Calculating @var{load_avg} + +Finally, @var{load_avg}, often known as the system load average, +estimates the average number of threads ready to run over the past +minute. Like @var{recent_cpu}, it is an exponentially weighted moving +average. Unlike @var{priority} and @var{recent_cpu}, @var{load_avg} is +system-wide, not thread-specific. At system boot, it is initialized to +0. Once per second thereafter, it is updated according to the following +formula: + +@center @t{@var{load_avg} = (59/60)*@var{load_avg} + (1/60)*@var{ready_threads}}, + +@noindent where @var{ready_threads} is the number of threads that are +either running or ready to run at time of update (not including the idle +thread). + +Because of assumptions made by some of the tests, @var{load_avg} must be +updated exactly when the system tick counter reaches a multiple of a +second, that is, when @code{timer_ticks () % TIMER_FREQ == 0}, and not +at any other time. + +You must implement @func{thread_get_load_avg}, for which there is a +skeleton in @file{threads/thread.c}. + +@deftypefun int thread_get_load_avg (void) +Returns 100 times the current system load average, rounded to the +nearest integer. +@end deftypefun + +@node 4.4BSD Scheduler Summary +@section Summary + +The following formulas summarize the calculations required to implement the +scheduler. They are not a complete description of scheduler requirements. + +Every thread has a @var{nice} value between -20 and 20 directly under +its control. Each thread also has a priority, between 0 +(@code{PRI_MIN}) through 63 (@code{PRI_MAX}), which is recalculated +using the following formula every fourth tick: + +@center @t{@var{priority} = @code{PRI_MAX} - (@var{recent_cpu} / 4) - (@var{nice} * 2)}. + +@var{recent_cpu} measures the amount of CPU time a thread has received +``recently.'' On each timer tick, the running thread's @var{recent_cpu} +is incremented by 1. 
Once per second, every thread's @var{recent_cpu} +is updated this way: + +@center @t{@var{recent_cpu} = (2*@var{load_avg})/(2*@var{load_avg} + 1) * @var{recent_cpu} + @var{nice}}. + +@var{load_avg} estimates the average number of threads ready to run over +the past minute. It is initialized to 0 at boot and recalculated once +per second as follows: + +@center @t{@var{load_avg} = (59/60)*@var{load_avg} + (1/60)*@var{ready_threads}}. + +@noindent where @var{ready_threads} is the number of threads that are +either running or ready to run at time of update (not including the idle +thread). + +@node Fixed-Point Real Arithmetic +@section Fixed-Point Real Arithmetic + +In the formulas above, @var{priority}, @var{nice}, and +@var{ready_threads} are integers, but @var{recent_cpu} and @var{load_avg} +are real numbers. Unfortunately, Pintos does not support floating-point +arithmetic in the kernel, because it would +complicate and slow the kernel. Real kernels often have the same +limitation, for the same reason. This means that calculations on real +quantities must be simulated using integers. This is not +difficult, but many students do not know how to do it. This +section explains the basics. + +The fundamental idea is to treat the rightmost bits of an integer as +representing a fraction. For example, we can designate the lowest 14 +bits of a signed 32-bit integer as fractional bits, so that an integer +@m{x} represents the real number +@iftex +@m{x/2^{14}}. +@end iftex +@ifnottex +@m{x/(2**14)}, where ** represents exponentiation. 
+@end ifnottex +This is called a 17.14 fixed-point number representation, because there +are 17 bits before the decimal point, 14 bits after it, and one sign +bit.@footnote{Because we are working in binary, the ``decimal'' point +might more correctly be called the ``binary'' point, but the meaning +should be clear.} A number in 17.14 format represents, at maximum, a +value of @am{(2^{31} - 1) / 2^{14} \approx, (2**31 - 1)/(2**14) = +approx.} 131,071.999. + +Suppose that we are using a @m{p.q} fixed-point format, and let @am{f = +2^q, f = 2**q}. By the definition above, we can convert an integer or +real number into @m{p.q} format by multiplying with @m{f}. For example, +in 17.14 format the fraction 59/60 used in the calculation of +@var{load_avg}, above, is @am{(59/60)2^{14}, 59/60*(2**14)} = 16,111 +(rounded to the nearest integer). +To convert a fixed-point value back to an +integer, divide by @m{f}. (The normal @samp{/} operator in C rounds +toward zero, that is, it rounds positive numbers down and negative +numbers up. To round to nearest, add @m{f / 2} to a positive number, or +subtract it from a negative number, before dividing.) + +Many operations on fixed-point numbers are straightforward. Let +@code{x} and @code{y} be fixed-point numbers, and let @code{n} be an +integer. Then the sum of @code{x} and @code{y} is @code{x + y} and +their difference is @code{x - y}. The sum of @code{x} and @code{n} is +@code{x + n * f}; difference, @code{x - n * f}; product, @code{x * n}; +quotient, @code{x / n}. + +Multiplying two fixed-point values has two complications. First, the +decimal point of the result is @m{q} bits too far to the left. Consider +that @am{(59/60)(59/60), (59/60)*(59/60)} should be slightly less than +1, but @tm{16,111\times 16,111}@nm{16,111*16,111} = 259,564,321 is much +greater than @am{2^{14},2**14} = 16,384. Shifting @m{q} bits right, we +get @tm{259,564,321/2^{14}}@nm{259,564,321/(2**14)} = 15,842, or about 0.97, +the correct answer. 
Second, the multiplication can overflow even though +the answer is representable. For example, 64 in 17.14 format is +@am{64 \times 2^{14}, 64*(2**14)} = 1,048,576 and its square @am{64^2, +64**2} = 4,096 is well within the 17.14 range, but @tm{1,048,576^2 = +2^{40}}@nm{1,048,576**2 = 2**40}, greater than the maximum signed 32-bit +integer value @am{2^{31} - 1, 2**31 - 1}. An easy solution is to do the +multiplication as a 64-bit operation. The product of @code{x} and +@code{y} is then @code{((int64_t) x) * y / f}. + +Dividing two fixed-point values has opposite issues. The +decimal point will be too far to the right, which we fix by shifting the +dividend @m{q} bits to the left before the division. The left shift +discards the top @m{q} bits of the dividend, which we can again fix by +doing the division in 64 bits. Thus, the quotient when @code{x} is +divided by @code{y} is @code{((int64_t) x) * f / y}. + +This section has consistently used multiplication or division by @m{f}, +instead of @m{q}-bit shifts, for two reasons. First, multiplication and +division do not have the surprising operator precedence of the C shift +operators. Second, multiplication and division are well-defined on +negative operands, but the C shift operators are not. Take care with +these issues in your implementation. + +The following table summarizes how fixed-point arithmetic operations can +be implemented in C. In the table, @code{x} and @code{y} are +fixed-point numbers, @code{n} is an integer, fixed-point numbers are in +signed @m{p.q} format where @m{p + q = 31}, and @code{f} is @code{1 << +q}: + +@html +
+@end html +@multitable @columnfractions .5 .5 +@item Convert @code{n} to fixed point: +@tab @code{n * f} + +@item Convert @code{x} to integer (rounding toward zero): +@tab @code{x / f} + +@item Convert @code{x} to integer (rounding to nearest): +@tab @code{(x + f / 2) / f} if @code{x >= 0}, @* +@code{(x - f / 2) / f} if @code{x < 0}. + +@item Add @code{x} and @code{y}: +@tab @code{x + y} + +@item Subtract @code{y} from @code{x}: +@tab @code{x - y} + +@item Add @code{x} and @code{n}: +@tab @code{x + n * f} + +@item Subtract @code{n} from @code{x}: +@tab @code{x - n * f} + +@item Multiply @code{x} by @code{y}: +@tab @code{((int64_t) x) * y / f} + +@item Multiply @code{x} by @code{n}: +@tab @code{x * n} + +@item Divide @code{x} by @code{y}: +@tab @code{((int64_t) x) * f / y} + +@item Divide @code{x} by @code{n}: +@tab @code{x / n} +@end multitable +@html +
+@end html diff --git a/doc/Makefile b/doc/Makefile new file mode 100644 index 0000000..7ccc712 --- /dev/null +++ b/doc/Makefile @@ -0,0 +1,42 @@ +TEXIS = pintos-ic.texi intro.texi codebase.texi threads.texi userprog.texi vm.texi \ +license.texi reference.texi 44bsd.texi standards.texi \ +doc.texi sample.tmpl.texi devel.texi debug.texi installation.texi \ +bibliography.texi localsettings.texi task0_questions.texi localgitinstructions.texi + +all: pintos-ic.html pintos-ic.info pintos-ic.dvi pintos-ic.ps pintos-ic.pdf task0_sheet.pdf + +pintos-ic.html: $(TEXIS) texi2html + ./texi2html -toc_file=$@ -split=chapter -nosec_nav -nomenu -init_file pintos-t2h.init $< + +pintos-ic.info: $(TEXIS) + makeinfo $< + +pintos-ic.text: $(TEXIS) + makeinfo --plaintext -o $@ $< + +pintos-ic.dvi: $(TEXIS) + texi2dvi $< -o $@ + +pintos-ic.ps: pintos-ic.dvi + dvips $< -o $@ + +pintos-ic.pdf: $(TEXIS) + texi2pdf $< -o $@ + +task0_sheet.pdf : task0_sheet.texi task0_questions.texi + texi2pdf $< -o $@ + +%.texi: % + sed < $< > $@ 's/\([{}@]\)/\@\1/g;' + +clean: + rm -f *.info* *.html + rm -f *.dvi *.pdf *.ps *.log *~ + rm -rf WWW + rm -f sample.tmpl.texi + +dist: pintos-ic.html pintos-ic.pdf + rm -rf WWW + mkdir WWW WWW/specs + cp *.html *.pdf *.css *.tmpl WWW + (cd ../specs && cp -r *.pdf freevga kbd sysv-abi-update.html ../doc/WWW/specs) diff --git a/doc/bibliography.texi b/doc/bibliography.texi new file mode 100644 index 0000000..efdb05f --- /dev/null +++ b/doc/bibliography.texi @@ -0,0 +1,154 @@ +@node Bibliography +@unnumbered Bibliography + +@macro bibdfn{cite} +@noindent @anchor{\cite\} +[\cite\].@w{ } +@end macro + +@menu +* Hardware References:: +* Software References:: +* Operating System Design References:: +@end menu + +@node Hardware References +@section Hardware References + +@bibdfn{IA32-v1} +IA-32 Intel Architecture Software Developer's Manual Volume 1: Basic +Architecture. Basic 80@var{x}86 architecture and programming +environment. Available via @uref{developer.intel.com}. 
Section numbers +in this document refer to revision 18. + +@bibdfn{IA32-v2a} +IA-32 Intel Architecture Software Developer's Manual +Volume 2A: Instruction Set Reference A-M. 80@var{x}86 instructions +whose names begin with A through M. Available via +@uref{developer.intel.com}. Section numbers in this document refer to +revision 18. + +@bibdfn{IA32-v2b} +IA-32 Intel Architecture Software Developer's Manual Volume 2B: +Instruction Set Reference N-Z. 80@var{x}86 instructions whose names +begin with N through Z. Available via @uref{developer.intel.com}. +Section numbers in this document refer to revision 18. + +@bibdfn{IA32-v3a} +IA-32 Intel Architecture Software Developer's Manual Volume 3A: System +Programming Guide. Operating system support, including segmentation, +paging, tasks, interrupt and exception handling. Available via +@uref{developer.intel.com}. Section numbers in this document refer to +revision 18. + +@bibdfn{FreeVGA} +@uref{specs/freevga/home.htm, , FreeVGA Project}. Documents the VGA video +hardware used in PCs. + +@bibdfn{kbd} +@uref{specs/kbd/scancodes.html, , Keyboard scancodes}. Documents PC keyboard +interface. + +@bibdfn{ATA-3} +@uref{specs/ata-3-std.pdf, , AT Attachment-3 Interface (ATA-3) Working +Draft}. Draft of an old version of the ATA aka IDE interface for the +disks used in most desktop PCs. + +@bibdfn{PC16550D} +@uref{specs/pc16550d.pdf, , National Semiconductor PC16550D Universal +Asynchronous Receiver/Transmitter with FIFOs}. Datasheet for a chip +used for PC serial ports. + +@bibdfn{8254} +@uref{specs/8254.pdf, , Intel 8254 Programmable Interval Timer}. +Datasheet for PC timer chip. + +@bibdfn{8259A} +@uref{specs/8259A.pdf, , Intel 8259A Programmable Interrupt Controller +(8259A/8259A-2)}. Datasheet for PC interrupt controller chip. + +@bibdfn{MC146818A} +@uref{specs/mc146818a.pdf, , Motorola MC146818A Real Time Clock Plus +Ram (RTC)}. Datasheet for PC real-time clock chip. 
+ +@node Software References +@section Software References + +@bibdfn{ELF1} +@uref{specs/elf.pdf, , Tool Interface Standard (TIS) Executable and +Linking Format (ELF) Specification Version 1.2 Book I: Executable and +Linking Format}. The ubiquitous format for executables in modern Unix +systems. + +@bibdfn{ELF2} +@uref{specs/elf.pdf, , Tool Interface Standard (TIS) Executable and +Linking Format (ELF) Specification Version 1.2 Book II: Processor +Specific (Intel Architecture)}. 80@var{x}86-specific parts of ELF. + +@bibdfn{ELF3} +@uref{specs/elf.pdf, , Tool Interface Standard (TIS) Executable and +Linking Format (ELF) Specification Version 1.2 Book III: Operating +System Specific (UNIX System V Release 4)}. Unix-specific parts of +ELF. + +@bibdfn{SysV-ABI} +@uref{specs/sysv-abi-4.1.pdf, , System V Application Binary Interface: +Edition 4.1}. Specifies how applications interface with the OS under +Unix. + +@bibdfn{SysV-i386} +@uref{specs/sysv-abi-i386-4.pdf, , System V Application Binary +Interface: Intel386 Architecture Processor Supplement: Fourth +Edition}. 80@var{x}86-specific parts of the Unix interface. + +@bibdfn{SysV-ABI-update} +@uref{specs/sysv-abi-update.html/contents.html, , System V Application Binary +Interface---DRAFT---24 April 2001}. A draft of a revised version of +@bibref{SysV-ABI} which was never completed. + +@bibdfn{SUSv3} +The Open Group, @uref{http://www.unix.org/single_unix_specification/, +, Single UNIX Specification V3}, 2001. + +@bibdfn{Partitions} +A.@: E.@: Brouwer, @uref{specs/partitions/partition_tables.html, , +Minimal partition table specification}, 1999. + +@bibdfn{IntrList} +R.@: Brown, @uref{http://www.ctyme.com/rbrown.htm, , Ralf Brown's +Interrupt List}, 2000. + +@node Operating System Design References +@section Operating System Design References + +@bibdfn{Christopher} +W.@: A.@: Christopher, S.@: J.@: Procter, T.@: E.@: Anderson, +@cite{The Nachos instructional operating system}. 
+Proceedings of the @acronym{USENIX} Winter 1993 Conference. +@uref{http://portal.acm.org/citation.cfm?id=1267307}. + +@bibdfn{Dijkstra} +E.@: W.@: Dijkstra, @cite{The structure of the ``THE'' +multiprogramming system}. Communications of the ACM 11(5):341--346, +1968. @uref{http://doi.acm.org/10.1145/363095.363143}. + +@bibdfn{Hoare} +C.@: A.@: R.@: Hoare, @cite{Monitors: An Operating System +Structuring Concept}. Communications of the ACM, 17(10):549--557, +1974. @uref{http://www.acm.org/classics/feb96/}. + +@bibdfn{Lampson} +B.@: W.@: Lampson, D.@: D.@: Redell, @cite{Experience with processes and +monitors in Mesa}. Communications of the ACM, 23(2):105--117, 1980. +@uref{http://doi.acm.org/10.1145/358818.358824}. + +@bibdfn{McKusick} +M.@: K.@: McKusick, K.@: Bostic, M.@: J.@: Karels, J.@: S.@: Quarterman, +@cite{The Design and Implementation of the 4.4@acronym{BSD} Operating +System}. Addison-Wesley, 1996. + +@bibdfn{Wilson} +P.@: R.@: Wilson, M.@: S.@: Johnstone, M.@: Neely, D.@: Boles, +@cite{Dynamic Storage Allocation: A Survey and Critical Review}. +International Workshop on Memory Management, 1995. +@uref{http://www.cs.utexas.edu/users/oops/papers.html#allocsrv}. diff --git a/doc/codebase.texi b/doc/codebase.texi new file mode 100644 index 0000000..1b204e4 --- /dev/null +++ b/doc/codebase.texi @@ -0,0 +1,53 @@ +@node Task 0--Codebase +@chapter Task 0: Codebase Preview + +This codebase preview has been designed to help you understand how +Pintos is structured before you actually begin to add features to it. The +exercise requires you to answer a worksheet (handed out +through CATE) that contains a few questions to check your understanding. + +Tasks 0 and 1 will count as the coursework for the operating systems +course. Task 0 will carry 25% of the coursework marks, with the rest +allocated to Task 1. 
+ +@b{Task 0 will be assessed individually, unlike the rest of the tasks in this + project.} + +@menu +* Sections Tested:: +* Files:: +@end menu + +@node Sections Tested +@section Sections Tested + +You will be expected to have fully read: + +@itemize + +@item Section 1 +@item Section 3.1 +@item Sections A.2-4 +@item Sections C, D, E and F + +@end itemize + +@node Files +@section Files +The source files you will have to understand: +@table @file + +@item src/threads/thread.c + Contains bulk of threading system code +@item src/threads/thread.h + Header file for threads, contains thread struct +@item src/threads/synch.c + Contains the implementation of major synchronisation primitives like + locks and semaphores +@item src/lib/kernel/list.c + Contains Pintos' list implementation +@end table + +@page +@section Task 0 questions +@include task0_questions.texi diff --git a/doc/debug.texi b/doc/debug.texi new file mode 100644 index 0000000..e335e89 --- /dev/null +++ b/doc/debug.texi @@ -0,0 +1,688 @@ +@node Debugging Tools +@appendix Debugging Tools + +Many tools lie at your disposal for debugging Pintos. This appendix +introduces you to a few of them. + +@menu +* printf:: +* ASSERT:: +* Function and Parameter Attributes:: +* Backtraces:: +* GDB:: +* Triple Faults:: +* Debugging Tips:: +@end menu + +@node printf +@section @code{printf()} + +Don't underestimate the value of @func{printf}. The way +@func{printf} is implemented in Pintos, you can call it from +practically anywhere in the kernel, whether it's in a kernel thread or +an interrupt handler, almost regardless of what locks are held. + +@func{printf} is useful for more than just examining data. +It can also help figure out when and where something goes wrong, even +when the kernel crashes or panics without a useful error message. The +strategy is to sprinkle calls to @func{printf} with different strings +(e.g.@: @code{"<1>"}, @code{"<2>"}, @dots{}) throughout the pieces of +code you suspect are failing. 
If you don't even see @code{<1>} printed, +then something bad happened before that point; if you see @code{<1>} +but not @code{<2>}, then something bad happened between those two +points, and so on. Based on what you learn, you can then insert more +@func{printf} calls in the new, smaller region of code you suspect. +Eventually you can narrow the problem down to a single statement. +@xref{Triple Faults}, for a related technique. + +@node ASSERT +@section @code{ASSERT} + +Assertions are useful because they can catch problems early, before +they'd otherwise be noticed. Ideally, each function should begin with a +set of assertions that check its arguments for validity. (Initializers +for functions' local variables are evaluated before assertions are +checked, so be careful not to assume that an argument is valid in an +initializer.) You can also sprinkle assertions throughout the body of +functions in places where you suspect things are likely to go wrong. +They are especially useful for checking loop invariants. + +Pintos provides the @code{ASSERT} macro, defined in @file{}, +for checking assertions. + +@defmac ASSERT (expression) +Tests the value of @var{expression}. If it evaluates to zero (false), +the kernel panics. The panic message includes the expression that +failed, its file and line number, and a backtrace, which should help you +to find the problem. @xref{Backtraces}, for more information. +@end defmac + +@node Function and Parameter Attributes +@section Function and Parameter Attributes + +These macros defined in @file{} tell the compiler special +attributes of a function or function parameter. Their expansions are +GCC-specific. + +@defmac UNUSED +Appended to a function parameter to tell the compiler that the +parameter might not be used within the function. It suppresses the +warning that would otherwise appear. +@end defmac + +@defmac NO_RETURN +Appended to a function prototype to tell the compiler that the +function never returns. 
It allows the compiler to fine-tune its +warnings and its code generation. +@end defmac + +@defmac NO_INLINE +Appended to a function prototype to tell the compiler to never emit +the function in-line. Occasionally useful to improve the quality of +backtraces (see below). +@end defmac + +@defmac PRINTF_FORMAT (@var{format}, @var{first}) +Appended to a function prototype to tell the compiler that the function +takes a @func{printf}-like format string as the argument numbered +@var{format} (starting from 1) and that the corresponding value +arguments start at the argument numbered @var{first}. This lets the +compiler tell you if you pass the wrong argument types. +@end defmac + +@node Backtraces +@section Backtraces + +When the kernel panics, it prints a ``backtrace,'' that is, a summary +of how your program got where it is, as a list of addresses inside the +functions that were running at the time of the panic. You can also +insert a call to @func{debug_backtrace}, prototyped in +@file{}, to print a backtrace at any point in your code. +@func{debug_backtrace_all}, also declared in @file{}, +prints backtraces of all threads. + +The addresses in a backtrace are listed as raw hexadecimal numbers, +which are difficult to interpret. We provide a tool called +@command{backtrace} to translate these into function names and source +file line numbers. +Give it the name of your @file{kernel.o} as the first argument and the +hexadecimal numbers composing the backtrace (including the @samp{0x} +prefixes) as the remaining arguments. It outputs the function name +and source file line numbers that correspond to each address. + +If the translated form of a backtrace is garbled, or doesn't make +sense (e.g.@: function A is listed above function B, but B doesn't +call A), then it's a good sign that you're corrupting a kernel +thread's stack, because the backtrace is extracted from the stack. 
+Alternatively, it could be that the @file{kernel.o} you passed to +@command{backtrace} is not the same kernel that produced +the backtrace. + +Sometimes backtraces can be confusing without any corruption. +Compiler optimizations can cause surprising behavior. When a function +has called another function as its final action (a @dfn{tail call}), the +calling function may not appear in a backtrace at all. Similarly, when +function A calls another function B that never returns, the compiler may +optimize such that an unrelated function C appears in the backtrace +instead of A. Function C is simply the function that happens to be in +memory just after A. In the threads task, this is commonly seen in +backtraces for test failures; see @ref{The pass function fails, , +@func{pass} fails}, for more information. + +@menu +* Backtrace Example:: +@end menu + +@node Backtrace Example +@subsection Example + +Here's an example. Suppose that Pintos printed out the following call +stack, which is taken from an actual Pintos submission: + +@example +Call stack: 0xc0106eff 0xc01102fb 0xc010dc22 0xc010cf67 0xc0102319 +0xc010325a 0x804812c 0x8048a96 0x8048ac8. +@end example + +You would then invoke the @command{backtrace} utility as shown below, +cutting and pasting the backtrace information into the command line. +This assumes that @file{kernel.o} is in the current directory. 
You +would of course enter all of the following on a single shell command +line, even though that would overflow our margins here: + +@example +backtrace kernel.o 0xc0106eff 0xc01102fb 0xc010dc22 0xc010cf67 +0xc0102319 0xc010325a 0x804812c 0x8048a96 0x8048ac8 +@end example + +The backtrace output would then look something like this: + +@example +0xc0106eff: debug_panic (lib/debug.c:86) +0xc01102fb: file_seek (filesys/file.c:405) +0xc010dc22: seek (userprog/syscall.c:744) +0xc010cf67: syscall_handler (userprog/syscall.c:444) +0xc0102319: intr_handler (threads/interrupt.c:334) +0xc010325a: intr_entry (threads/intr-stubs.S:38) +0x0804812c: (unknown) +0x08048a96: (unknown) +0x08048ac8: (unknown) +@end example + +(You will probably not see exactly the same addresses if you run the +command above on your own kernel binary, because the source code you +compiled and the compiler you used are probably different.) + +The first line in the backtrace refers to @func{debug_panic}, the +function that implements kernel panics. Because backtraces commonly +result from kernel panics, @func{debug_panic} will often be the first +function shown in a backtrace. + +The second line shows @func{file_seek} as the function that panicked, +in this case as the result of an assertion failure. In the source code +tree used for this example, line 405 of @file{filesys/file.c} is the +assertion + +@example +ASSERT (file_ofs >= 0); +@end example + +@noindent +(This line was also cited in the assertion failure message.) +Thus, @func{file_seek} panicked because it was passed a negative file +offset argument. + +The third line indicates that @func{seek} called @func{file_seek}, +presumably without validating the offset argument. In this submission, +@func{seek} implements the @code{seek} system call. + +The fourth line shows that @func{syscall_handler}, the system call +handler, invoked @func{seek}. + +The fifth and sixth lines are the interrupt handler entry path. 
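
To make the failure in the second and third lines concrete, here is a minimal sketch of a caller that checks the offset before delegating to a callee that asserts on it. Every name in it (@code{demo_file}, @code{demo_file_seek}, @code{demo_checked_seek}) is a hypothetical stand-in, not part of Pintos, and standard @code{assert} stands in for the @code{ASSERT} macro. How a real system call should respond to a bad offset is a design decision that this sketch does not settle.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for an open file; only the position matters
   here.  This is not the Pintos struct file. */
struct demo_file
  {
    int32_t pos;   /* Current offset in bytes. */
  };

/* Mirrors the callee in the example above: it insists on a
   non-negative offset, much as ASSERT (file_ofs >= 0) does. */
static void
demo_file_seek (struct demo_file *file, int32_t file_ofs)
{
  assert (file != NULL);
  assert (file_ofs >= 0);
  file->pos = file_ofs;
}

/* Hypothetical caller that validates its arguments before delegating.
   Returns true if the seek was performed, false if it was rejected. */
static bool
demo_checked_seek (struct demo_file *file, int32_t file_ofs)
{
  if (file == NULL || file_ofs < 0)
    return false;
  demo_file_seek (file, file_ofs);
  return true;
}
```

With a check like this in the caller, a negative offset is rejected before it can reach the assertion, so the kernel does not panic on bad input.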
+ +The remaining lines are for addresses below @code{PHYS_BASE}. This +means that they refer to addresses in the user program, not in the +kernel. If you know what user program was running when the kernel +panicked, you can re-run @command{backtrace} on the user program, like +so (typing the command on a single line, of course): + +@example +backtrace tests/filesys/extended/grow-too-big 0xc0106eff 0xc01102fb +0xc010dc22 0xc010cf67 0xc0102319 0xc010325a 0x804812c 0x8048a96 +0x8048ac8 +@end example + +The results look like this: + +@example +0xc0106eff: (unknown) +0xc01102fb: (unknown) +0xc010dc22: (unknown) +0xc010cf67: (unknown) +0xc0102319: (unknown) +0xc010325a: (unknown) +0x0804812c: test_main (...xtended/grow-too-big.c:20) +0x08048a96: main (tests/main.c:10) +0x08048ac8: _start (lib/user/entry.c:9) +@end example + +You can even specify both the kernel and the user program names on +the command line, like so: + +@example +backtrace kernel.o tests/filesys/extended/grow-too-big 0xc0106eff +0xc01102fb 0xc010dc22 0xc010cf67 0xc0102319 0xc010325a 0x804812c +0x8048a96 0x8048ac8 +@end example + +The result is a combined backtrace: + +@example +In kernel.o: +0xc0106eff: debug_panic (lib/debug.c:86) +0xc01102fb: file_seek (filesys/file.c:405) +0xc010dc22: seek (userprog/syscall.c:744) +0xc010cf67: syscall_handler (userprog/syscall.c:444) +0xc0102319: intr_handler (threads/interrupt.c:334) +0xc010325a: intr_entry (threads/intr-stubs.S:38) +In tests/filesys/extended/grow-too-big: +0x0804812c: test_main (...xtended/grow-too-big.c:20) +0x08048a96: main (tests/main.c:10) +0x08048ac8: _start (lib/user/entry.c:9) +@end example + +Here's an extra tip for anyone who read this far: @command{backtrace} +is smart enough to strip the @code{Call stack:} header and @samp{.} +trailer from the command line if you include them. This can save you +a little bit of trouble in cutting and pasting. 
Thus, the following +command prints the same output as the first one we used: + +@example +backtrace kernel.o Call stack: 0xc0106eff 0xc01102fb 0xc010dc22 +0xc010cf67 0xc0102319 0xc010325a 0x804812c 0x8048a96 0x8048ac8. +@end example + +@node GDB +@section GDB + +You can run Pintos under the supervision of the GDB debugger. +First, start Pintos with the @option{--gdb} option, e.g.@: +@command{pintos --gdb -- run mytest}. Second, open a second terminal on +the same machine and +use @command{pintos-gdb} to invoke GDB on +@file{kernel.o}:@footnote{@command{pintos-gdb} is a wrapper around +@command{gdb} (80@var{x}86) that loads the Pintos macros at startup.} +@example +pintos-gdb kernel.o +@end example +@noindent and issue the following GDB command: +@example +target remote localhost:1234 +@end example + +Now GDB is connected to the simulator over a local +network connection. You can now issue any normal GDB +command. If you issue the @samp{c} command, the simulated BIOS will take +control, load Pintos, and then Pintos will run in the usual way. You +can pause the process at any point with @key{Ctrl+C}. + +@menu +* Using GDB:: +* Example GDB Session:: +* GDB FAQ:: +@end menu + +@node Using GDB +@subsection Using GDB + +You can read the GDB manual by typing @code{info gdb} at a +terminal command prompt. Here are a few commonly useful GDB commands: + +@deffn {GDB Command} c +Continues execution until @key{Ctrl+C} or the next breakpoint. +@end deffn + +@deffn {GDB Command} break function +@deffnx {GDB Command} break file:line +@deffnx {GDB Command} break *address +Sets a breakpoint at @var{function}, at @var{line} within @var{file}, or +at @var{address}. +(Use a @samp{0x} prefix to specify an address in hex.) + +Use @code{break main} to make GDB stop when Pintos starts running. +@end deffn + +@deffn {GDB Command} p expression +Evaluates the given @var{expression} and prints its value. +If the expression contains a function call, that function will actually +be executed. 
+@end deffn + +@deffn {GDB Command} l *address +Lists a few lines of code around @var{address}. +(Use a @samp{0x} prefix to specify an address in hex.) +@end deffn + +@deffn {GDB Command} bt +Prints a stack backtrace similar to that output by the +@command{backtrace} program described above. +@end deffn + +@deffn {GDB Command} p/a address +Prints the name of the function or variable that occupies @var{address}. +(Use a @samp{0x} prefix to specify an address in hex.) +@end deffn + +@deffn {GDB Command} disassemble function +Disassembles @var{function}. +@end deffn + +We also provide a set of macros specialized for debugging Pintos, +written by Godmar Back @email{gback@@cs.vt.edu}. You can type +@code{help user-defined} for basic help with the macros. Here is an +overview of their functionality, based on Godmar's documentation: + +@deffn {GDB Macro} debugpintos +Attaches the debugger to a waiting Pintos process on the same machine. +Shorthand for @code{target remote localhost:1234}. +@end deffn + +@deffn {GDB Macro} dumplist list type element +Prints the elements of @var{list}, which should be a @code{struct} list +that contains elements of the given @var{type} (without the word +@code{struct}) in which @var{element} is the @struct{list_elem} member +that links the elements. + +Example: @code{dumplist all_list thread allelem} prints all elements of +@struct{thread} that are linked in @code{struct list all_list} using the +@code{struct list_elem allelem} which is part of @struct{thread}. +@end deffn + +@deffn {GDB Macro} btthread thread +Shows the backtrace of @var{thread}, which is a pointer to the +@struct{thread} of the thread whose backtrace it should show. For the +current thread, this is identical to the @code{bt} (backtrace) command. +It also works for any thread suspended in @func{schedule}, +provided you know where its kernel stack page is located. 
+@end deffn + +@deffn {GDB Macro} btthreadlist list element +Shows the backtraces of all threads in @var{list}, the @struct{list} in +which the threads are kept. Specify @var{element} as the +@struct{list_elem} field used inside @struct{thread} to link the threads +together. + +Example: @code{btthreadlist all_list allelem} shows the backtraces of +all threads contained in @code{struct list all_list}, linked together by +@code{allelem}. This command is useful to determine where your threads +are stuck when a deadlock occurs. Please see the example scenario below. +@end deffn + +@deffn {GDB Macro} btthreadall +Shorthand for @code{btthreadlist all_list allelem}. +@end deffn + +@deffn {GDB Macro} hook-stop +GDB invokes this macro every time the simulation stops, which Bochs will +do for every processor exception, among other reasons. If the +simulation stops due to a page fault, @code{hook-stop} will print a +message that says whether the page fault occurred +in the kernel or in user code. + +If the exception occurred from user code, @code{hook-stop} will say: +@example +pintos-debug: a page fault exception occurred in user mode +pintos-debug: hit 'c' to continue, or 's' to step to intr_handler +@end example + +In Task 2, a page fault in a user process leads to the termination of +the process. You should expect those page faults to occur in the +robustness tests where we test that your kernel properly terminates +processes that try to access invalid addresses. To debug those, set a +breakpoint in @func{page_fault} in @file{exception.c}, which you will +need to modify accordingly. + +In Task 3, a page fault in a user process no longer automatically +leads to the termination of a process. Instead, it may require reading in +data for the page the process was trying to access, either +because it was swapped out or because this is the first time it's +accessed. 
In either case, you will reach @func{page_fault} and need to +take the appropriate action there. + +If the page fault did not occur in user mode while executing a user +process, then it occurred in kernel mode while executing kernel code. +In this case, @code{hook-stop} will print this message: +@example +pintos-debug: a page fault occurred in kernel mode +@end example + +Before Task 3, a page fault exception in kernel code is always a bug +in your kernel, because your kernel should never crash. Starting with +Task 3, the situation will change if you use the @func{get_user} and +@func{put_user} strategy to verify user memory accesses +(@pxref{Accessing User Memory}). + +@c ---- +@c Unfortunately, this does not work with Bochs's gdb stub. +@c ---- +@c If you don't want GDB to stop for page faults, then issue the command +@c @code{handle SIGSEGV nostop}. GDB will still print a message for +@c every page fault, but it will not come back to a command prompt. +@end deffn + +@node Example GDB Session +@subsection Example GDB Session + +This section narrates a sample GDB session, provided by Godmar Back +(modified by Mark Rutland and Feroz Abdul Salam). +This example illustrates how one might debug a Task 1 solution in +which occasionally a thread that calls @func{timer_sleep} is not woken +up. With this bug, tests such as @code{mlfqs_load_1} get stuck. + +Program output is shown in normal type, user input in @strong{strong} +type. + +First, I start Pintos: + +@smallexample +$ @strong{pintos -v --qemu --gdb -- -q -mlfqs run mlfqs-load-1} + +qemu -hda /tmp/Qu7Ex4UbFv.dsk -m 4 -net none -nographic -s -S +Could not open '/dev/kqemu' - QEMU acceleration layer not activated: No such file or directory + +@end smallexample + +@noindent Then, I open a second window on the same machine and start GDB: + +@smallexample +$ @strong{pintos-gdb kernel.o} +GNU gdb 6.8-debian +Copyright (C) 2008 Free Software Foundation, Inc. 
+License GPLv3+: GNU GPL version 3 or later +This is free software: you are free to change and redistribute it. +There is NO WARRANTY, to the extent permitted by law. Type "show copying" +and "show warranty" for details. +This GDB was configured as "i486-linux-gnu"... +@end smallexample + +@noindent Then, I tell GDB to attach to the waiting Pintos emulator: + +@smallexample +(gdb) @strong{debugpintos} +Remote debugging using localhost:1234 +0x0000fff0 in ?? () +Reply contains invalid hex digit 78 +@end smallexample + +@noindent Now I tell Pintos to run by executing @code{c} (short for +@code{continue}): + +@smallexample +(gdb) @strong{c} +Continuing. +@end smallexample + +@noindent Now Pintos will continue and output: + +@smallexample +PiLo hda1 +Loading......... +Kernel command line: -q -mlfqs run mlfqs-load-1 +Pintos booting with 4,096 kB RAM... +383 pages available in kernel pool. +383 pages available in user pool. +Calibrating timer... 104,755,200 loops/s. +Boot complete. +Executing 'mlfqs-load-1': +(mlfqs-load-1) begin +(mlfqs-load-1) spinning for up to 45 seconds, please wait... +@end smallexample + +@noindent +@dots{}until it gets stuck because of the bug I had introduced. I hit +@key{Ctrl+C} in the debugger window: + +@smallexample +Program received signal 0, Signal 0. +0xc010168c in next_thread_to_run () at ../../threads/thread.c:649 +649 while (i <= PRI_MAX && list_empty (&ready_list[i])) +(gdb) +@end smallexample + +@noindent +The thread that was running when I interrupted Pintos was the idle +thread. If I run @code{backtrace}, it shows this backtrace: + +@smallexample +(gdb) @strong{bt} +#0 0xc010168c in next_thread_to_run () at ../../threads/thread.c:649 +#1 0xc0101778 in schedule () at ../../threads/thread.c:714 +#2 0xc0100f8f in thread_block () at ../../threads/thread.c:324 +#3 0xc0101419 in idle (aux=0x0) at ../../threads/thread.c:551 +#4 0xc010145a in kernel_thread (function=0xc01013ff , aux=0x0) + at ../../threads/thread.c:575 +#5 0x00000000 in ?? 
() +@end smallexample + +@noindent +Not terribly useful. What I really like to know is what's up with the +other thread (or threads). Since I keep all threads in a linked list +called @code{all_list}, linked together by a @struct{list_elem} member +named @code{allelem}, I can use the @code{btthreadlist} macro from the +macro library I wrote. @code{btthreadlist} iterates through the list of +threads and prints the backtrace for each thread: + +@smallexample +(gdb) @strong{btthreadlist all_list allelem} +pintos-debug: dumping backtrace of thread 'main' @@0xc002f000 +#0 0xc0101820 in schedule () at ../../threads/thread.c:722 +#1 0xc0100f8f in thread_block () at ../../threads/thread.c:324 +#2 0xc0104755 in timer_sleep (ticks=1000) at ../../devices/timer.c:141 +#3 0xc010bf7c in test_mlfqs_load_1 () at ../../tests/threads/mlfqs-load-1.c:49 +#4 0xc010aabb in run_test (name=0xc0007d8c "mlfqs-load-1") + at ../../tests/threads/tests.c:50 +#5 0xc0100647 in run_task (argv=0xc0110d28) at ../../threads/init.c:281 +#6 0xc0100721 in run_actions (argv=0xc0110d28) at ../../threads/init.c:331 +#7 0xc01000c7 in main () at ../../threads/init.c:140 + +pintos-debug: dumping backtrace of thread 'idle' @@0xc0116000 +#0 0xc010168c in next_thread_to_run () at ../../threads/thread.c:649 +#1 0xc0101778 in schedule () at ../../threads/thread.c:714 +#2 0xc0100f8f in thread_block () at ../../threads/thread.c:324 +#3 0xc0101419 in idle (aux=0x0) at ../../threads/thread.c:551 +#4 0xc010145a in kernel_thread (function=0xc01013ff , aux=0x0) + at ../../threads/thread.c:575 +#5 0x00000000 in ?? () +@end smallexample + +@noindent +In this case, there are only two threads, the idle thread and the main +thread. The kernel stack pages (to which the @struct{thread} points) +are at @t{0xc0116000} and @t{0xc002f000}, respectively. The main thread +is stuck in @func{timer_sleep}, called from @code{test_mlfqs_load_1}. 
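
The session above does not say what the bug actually was. Purely as an illustration, one way a sleeping thread could fail to wake is a wake-up check that compares ticks for exact equality. The sketch below is a guess at that class of bug, not the code from this session, and @code{struct sleeper}, @code{wake_if_due_buggy} and @code{wake_if_due} are hypothetical names that do not exist in Pintos.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical record for a sleeping thread; not a Pintos structure. */
struct sleeper
  {
    int64_t wake_tick;   /* Absolute tick at which the thread should wake. */
    bool awake;          /* Set once the thread has been woken. */
  };

/* Buggy check: wakes the sleeper only if the current tick equals the
   wake-up tick exactly.  If that tick is never observed, the sleeper
   stays asleep forever. */
static void
wake_if_due_buggy (struct sleeper *s, int64_t now)
{
  if (now == s->wake_tick)
    s->awake = true;
}

/* Corrected check: wakes the sleeper once its wake-up tick has been
   reached or passed. */
static void
wake_if_due (struct sleeper *s, int64_t now)
{
  if (now >= s->wake_tick)
    s->awake = true;
}
```

A thread caught by the buggy variant would look much like the main thread in the backtrace above: blocked under @func{timer_sleep} with nothing left to wake it.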
+ +Knowing where threads are stuck can be tremendously useful, for instance +when diagnosing deadlocks or unexplained hangs. + +@deffn {GDB Macro} loadusersymbols + +You can also use GDB to debug a user program running under Pintos. +To do that, use the @code{loadusersymbols} macro to load the program's +symbol table: +@example +loadusersymbols @var{program} +@end example +@noindent +where @var{program} is the name of the program's executable (in the host +file system, not in the Pintos file system). For example, you may issue: +@smallexample +(gdb) @strong{loadusersymbols tests/userprog/exec-multiple} +add symbol table from file "tests/userprog/exec-multiple" at + .text_addr = 0x80480a0 +(gdb) +@end smallexample + +After this, you should be +able to debug the user program the same way you would the kernel, by +placing breakpoints, inspecting data, etc. Your actions apply to every +user program running in Pintos, not just to the one you want to debug, +so be careful in interpreting the results: GDB does not know +which process is currently active (because that is an abstraction +the Pintos kernel creates). Also, a name that appears in +both the kernel and the user program will actually refer to the kernel +name. (The latter problem can be avoided by giving the user executable +name on the GDB command line, instead of @file{kernel.o}, and then using +@code{loadusersymbols} to load @file{kernel.o}.) +@code{loadusersymbols} is implemented via GDB's @code{add-symbol-file} +command. + +@end deffn + +@node GDB FAQ +@subsection FAQ + +@table @asis +@item GDB can't connect to QEMU (Error: localhost:1234: Connection refused) + +If the @command{target remote} command fails, then make sure that both +GDB and @command{pintos} are running on the same machine by +running @command{hostname} in each terminal. If the names printed +differ, then you need to open a new terminal for GDB on the +machine running @command{pintos}. + +@item GDB doesn't recognize any of the macros. 
+ +If you start GDB with @command{pintos-gdb}, it should load the Pintos +macros automatically. If you start GDB some other way, then you must +issue the command @code{source @var{pintosdir}/src/misc/gdb-macros}, +where @var{pintosdir} is the root of your Pintos directory, before you +can use them. + +@item Can I debug Pintos with DDD? + +Yes, you can. DDD invokes GDB as a subprocess, so you'll need to tell +it to invoke @command{pintos-gdb} instead: +@example +ddd --gdb --debugger pintos-gdb +@end example + +@item Can I use GDB inside Emacs? + +Yes, you can. Emacs has special support for running GDB as a +subprocess. Type @kbd{M-x gdb} and enter your @command{pintos-gdb} +command at the prompt. The Emacs manual has information on how to use +its debugging features in a section titled ``Debuggers.'' + +@end table + +@node Triple Faults +@section Triple Faults + +When a CPU exception handler, such as a page fault handler, cannot be +invoked because it is missing or defective, the CPU will try to invoke +the ``double fault'' handler. If the double fault handler is itself +missing or defective, that's called a ``triple fault.'' A triple fault +causes an immediate CPU reset. + +Thus, if you get yourself into a situation where the machine reboots in +a loop, that's probably a ``triple fault.'' In a triple fault +situation, you might not be able to use @func{printf} for debugging, +because the reboots might be happening even before everything needed for +@func{printf} is initialized. + +Currently, the only option is ``debugging by infinite loop.'' +Pick a place in the Pintos code, insert the infinite loop +@code{for (;;);} there, and recompile and run. There are two likely +possibilities: + +@itemize @bullet +@item +The machine hangs without rebooting. If this happens, you know that +the infinite loop is running. That means that whatever caused the +reboot must be @emph{after} the place you inserted the infinite loop. 
+Now move the infinite loop later in the code sequence. + +@item +The machine reboots in a loop. If this happens, you know that the +machine didn't make it to the infinite loop. Thus, whatever caused the +reboot must be @emph{before} the place you inserted the infinite loop. +Now move the infinite loop earlier in the code sequence. +@end itemize + +If you move around the infinite loop in a ``binary search'' fashion, you +can use this technique to pin down the exact spot that everything goes +wrong. It should only take a few minutes at most. + +@node Debugging Tips +@section Tips + +The page allocator in @file{threads/palloc.c} and the block allocator in +@file{threads/malloc.c} clear all the bytes in memory to +@t{0xcc} at time of free. Thus, if you see an attempt to +dereference a pointer like @t{0xcccccccc}, or some other reference to +@t{0xcc}, there's a good chance you're trying to reuse a page that's +already been freed. Also, byte @t{0xcc} is the CPU opcode for ``invoke +interrupt 3,'' so if you see an error like @code{Interrupt 0x03 (#BP +Breakpoint Exception)}, then Pintos tried to execute code in a freed page or +block. diff --git a/doc/devel.texi b/doc/devel.texi new file mode 100644 index 0000000..41f5c29 --- /dev/null +++ b/doc/devel.texi @@ -0,0 +1,108 @@ +@node Development Tools +@appendix Development Tools + +Here are some tools that you might find useful while developing code. + +@menu +* Tags:: +* cscope:: +* Git:: +@ifset recommendvnc +* VNC:: +@end ifset +@ifset recommendcygwin +* Cygwin:: +@end ifset +@end menu + +@node Tags +@section Tags + +Tags are an index to the functions and global variables declared in a +program. Many editors, including Emacs and @command{vi}, can use +them. The @file{Makefile} in @file{pintos-ic/src} produces Emacs-style +tags with the command @code{make TAGS} or @command{vi}-style tags with +@code{make tags}. 
+ +In Emacs, use @kbd{M-.} to follow a tag in the current window, +@kbd{C-x 4 .} in a new window, or @kbd{C-x 5 .} in a new frame. If +your cursor is on a symbol name for any of those commands, it becomes +the default target. If a tag name has multiple definitions, @kbd{M-0 +M-.} jumps to the next one. To jump back to where you were before +you followed the last tag, use @kbd{M-*}. + +@node cscope +@section cscope + +The @command{cscope} program also provides an index to functions and +variables declared in a program. It has some features that tag +facilities lack. Most notably, it can find all the points in a +program at which a given function is called. + +The @file{Makefile} in @file{pintos-ic/src} produces @command{cscope} +indexes when it is invoked as @code{make cscope}. Once the index has +been generated, run @command{cscope} from a shell command line; no +command-line arguments are normally necessary. Then use the arrow +keys to choose one of the search criteria listed near the bottom of +the terminal, type in an identifier, and hit @key{Enter}. +@command{cscope} will then display the matches in the upper part of +the terminal. You may use the arrow keys to choose a particular +match; if you then hit @key{Enter}, @command{cscope} will invoke the +default system editor@footnote{This is typically @command{vi}. To +exit @command{vi}, type @kbd{: q @key{Enter}}.} and position the +cursor on that match. To start a new search, type @key{Tab}. To exit +@command{cscope}, type @kbd{Ctrl-d}. + +Emacs and some versions of @command{vi} have their own interfaces to +@command{cscope}. For information on how to use these interfaces, +visit @url{http://cscope.sourceforge.net, the @command{cscope} home +page}. + +@node Git +@section Git + +Git is a version-control system. That is, you can use it to keep +track of multiple versions of files. The idea is that you do some +work on your code and test it, then commit it into the version-control +system. 
If you decide that the work you've done since your last +commit is no good, you can easily revert to the last version. +Furthermore, you can retrieve any old version of your code +as of some given day and time. The version control logs tell you who +made changes and when. + +Whilst Git may not be everyone's preferred version control system, it's +free, has a wealth of documentation, and is easy to install on most +Unix-like environments. + +For more information, visit the @uref{https://www.git-scm.com/, , Git +home page}. + +@include localgitinstructions.texi + +@ifset recommendvnc +@node VNC +@section VNC + +VNC stands for Virtual Network Computing. It is, in essence, a remote +display system which allows you to view a computing ``desktop'' +environment not only on the machine where it is running, but from +anywhere on the Internet and from a wide variety of machine +architectures. It is already installed on the lab machines. +For more information, look at the @uref{http://www.realvnc.com/, , VNC +Home Page}. +@end ifset + +@ifset recommendcygwin +@node Cygwin +@section Cygwin + +@uref{http://cygwin.com/, ,Cygwin} provides a Linux-compatible environment +for Windows. It includes an ssh client and an X11 server, Cygwin/X. If your +primary work environment is Windows, you will find Cygwin/X extremely +useful for these tasks. Install Cygwin/X, then start the X server +and open a new xterm. The X11 server also allows you to run Pintos while +displaying the QEMU-emulated console on your Windows desktop. +@end ifset + +@localdevelopmenttools{} + diff --git a/doc/doc.texi b/doc/doc.texi new file mode 100644 index 0000000..aba268d --- /dev/null +++ b/doc/doc.texi @@ -0,0 +1,59 @@ +@node Task Documentation +@appendix Task Documentation + +This chapter presents a sample assignment and a filled-in design +document for one possible implementation. Its purpose is to give you an +idea of what we expect to see in your own design documents. 
+ +@menu +* Sample Assignment:: +* Sample Design Document:: +@end menu + +@node Sample Assignment +@section Sample Assignment + +Implement @func{thread_join}. + +@deftypefun void thread_join (tid_t @var{tid}) +Blocks the current thread until thread @var{tid} exits. If @var{A} is +the running thread and @var{B} is the argument, then we say that +``@var{A} joins @var{B}.'' + +Incidentally, the argument is a thread id, instead of a thread pointer, +because a thread pointer is not unique over time. That is, when a +thread dies, its memory may be, whether immediately or much later, +reused for another thread. If thread @var{A} over time had two children +@var{B} and @var{C} that were stored at the same address, then +@code{thread_join(@var{B})} and @code{thread_join(@var{C})} would be +ambiguous. + +A thread may only join its immediate children. Calling +@func{thread_join} on a thread that is not the caller's child should +cause the caller to return immediately. Children are not ``inherited,'' +that is, if @var{A} has child @var{B} and @var{B} has child @var{C}, +then @var{A} always returns immediately should it try to join @var{C}, +even if @var{B} is dead. + +A thread need not ever be joined. Your solution should properly free +all of a thread's resources, including its @struct{thread}, +whether it is ever joined or not, and regardless of whether the child +exits before or after its parent. That is, a thread should be freed +exactly once in all cases. + +Joining a given thread is idempotent. That is, joining a thread +multiple times is equivalent to joining it once, because it has already +exited at the time of the later joins. Thus, joins on a given thread +after the first should return immediately. + +You must handle all the ways a join can occur: nested joins (@var{A} +joins @var{B}, then @var{B} joins @var{C}), multiple joins (@var{A} +joins @var{B}, then @var{A} joins @var{C}), and so on. 
+@end deftypefun + +@node Sample Design Document +@section Sample Design Document + +@example +@include sample.tmpl.texi +@end example diff --git a/doc/installation.texi b/doc/installation.texi new file mode 100644 index 0000000..0ee3a7d --- /dev/null +++ b/doc/installation.texi @@ -0,0 +1,112 @@ +@node Installing Pintos +@appendix Installing Pintos + +This chapter explains how to install a Pintos development environment on +your own machine. If you are using a Pintos development environment +that has been set up by someone else, you do not need to read this +chapter or follow these instructions. + +The Pintos development environment is targeted at Unix-like systems. It +has been most extensively tested on GNU/Linux, in particular the Debian +and Ubuntu distributions, and Solaris. It is not designed to install +under any form of Windows. + +Prerequisites for installing a Pintos development environment include +the following, on top of standard Unix utilities: + +@itemize @bullet +@item +Required: @uref{http://gcc.gnu.org/, GCC}. Version 4.0 or later is +preferred. Version 3.3 or later should work. If the host machine has +an 80@var{x}86 processor, then GCC should be available as @command{gcc}; +otherwise, an 80@var{x}86 cross-compiler should be available as +@command{i386-elf-gcc}. A sample set of commands for installing GCC +3.3.6 as a cross-compiler are included in +@file{src/@/misc/@/gcc-3.3.6-cross-howto}. + +@item +Required: @uref{http://www.gnu.org/software/binutils/, GNU binutils}. +Pintos uses @command{addr2line}, @command{ar}, @command{ld}, +@command{objcopy}, and @command{ranlib}. If the host machine is not an +80@var{x}86, versions targeting 80@var{x}86 should be available with an +@samp{i386-elf-} prefix. + +@item +Required: @uref{http://www.perl.org, Perl}. Version 5.8.0 or later is +preferred. Version 5.6.1 or later should work. + +@item +Required: @uref{http://www.gnu.org/software/make/, GNU make}, version +3.80 or later. 
+ +@item +Required: @uref{http://fabrice.bellard.free.fr/qemu/, QEMU}, version +0.11.0 or later. + +@item +Recommended: @uref{http://www.gnu.org/software/gdb/, GDB}. GDB is +helpful in debugging (@pxref{GDB}). If the host machine is not an +80@var{x}86, a version of GDB targeting 80@var{x}86 should be available +as @samp{i386-elf-gdb}. + +@item +Recommended: @uref{http://www.x.org/, X}. Being able to use an X server +makes the virtual machine feel more like a physical machine, but it is +not strictly necessary. + +@item +Optional: @uref{http://www.gnu.org/software/texinfo/, Texinfo}, version +4.5 or later. Texinfo is required to build the PDF version of the +documentation. + +@item +Optional: @uref{http://www.tug.org/, @TeX{}}. Also required to build +the PDF version of the documentation. + +@item +Optional: @uref{http://www.vmware.com/, VMware Player}. This is another +platform that can also be used to test Pintos. +@end itemize + +Once these prerequisites are available, follow these instructions to +install Pintos: + +@enumerate 1 +@item +Install scripts from @file{src/utils}. Copy @file{backtrace}, +@file{pintos}, @file{pintos-gdb}, @file{pintos-mkdisk}, +@file{pintos-set-cmdline}, and @file{Pintos.pm} into the default +@env{PATH}. + +@item +Install @file{src/misc/gdb-macros} in a public location. Then use a +text editor to edit the installed copy of @file{pintos-gdb}, changing +the definition of @env{GDBMACROS} to point to where you installed +@file{gdb-macros}. Test the installation by running +@command{pintos-gdb} without any arguments. If it does not complain +about missing @file{gdb-macros}, it is installed correctly. + +@item +Compile the remaining Pintos utilities by typing @command{make} in +@file{src/utils}. Install @file{squish-pty} somewhere in @env{PATH}. +To support VMware Player, install @file{squish-unix}. +If your Perl is older than version 5.8.0, also install +@file{setitimer-helper}; otherwise, it is unneeded. 
+ +@item +Pintos should now be ready for use. If you have the Pintos reference +solutions, which are provided only to faculty and their teaching +assistants, then you may test your installation by running @command{make +check} in the top-level @file{tests} directory. The tests take between +20 minutes and 1 hour to run, depending on the speed of your hardware. + +@item +Optional: Build the documentation, by running @command{make dist} in the +top-level @file{doc} directory. This creates a @file{WWW} subdirectory +within @file{doc} that contains HTML and PDF versions of the +documentation, plus the design document templates and various hardware +specifications referenced by the documentation. Building the PDF +version of the manual requires Texinfo and @TeX{} (see above). You may +install @file{WWW} wherever you find most useful. + +@end enumerate diff --git a/doc/intro.texi b/doc/intro.texi new file mode 100644 index 0000000..baa0f5a --- /dev/null +++ b/doc/intro.texi @@ -0,0 +1,483 @@ +@node Introduction +@chapter Introduction + +Welcome to Pintos. Pintos is a simple operating system framework for +the 80@var{x}86 architecture. It supports kernel threads, loading and +running user programs, and a file system, but it implements all of +these in a very simple way. In the Pintos tasks, you and your +task team will strengthen its support in the first two of these areas. +You will also add a virtual memory implementation. + +Pintos could, theoretically, run on a regular IBM-compatible PC. +Unfortunately, it is impractical to supply every @value{coursenumber} student +a dedicated PC for use with Pintos. Therefore, we will run Pintos tasks +in a system simulator, that is, a program that simulates an 80@var{x}86 +CPU and its peripheral devices accurately enough that unmodified operating +systems and software can run under it. In particular, we will be using the +@uref{http://fabrice.bellard.free.fr/qemu/, , +QEMU} simulator.
Pintos has also been tested with +@uref{http://www.vmware.com/, , VMware Player}. + +These tasks are hard. The Pintos exercises have a reputation of taking a lot of +time, and deservedly so. We will do what we can to reduce the workload, such +as providing a lot of support material, but there is plenty of +hard work that needs to be done. We welcome your +feedback. If you have suggestions on how we can reduce the unnecessary +overhead of assignments, cutting them down to the important underlying +issues, please let us know. + +This version of the exercise has been adapted for use at Imperial College +London, and is significantly different to the original exercise designed at +Stanford University. It's recommended that you only use the Imperial version +of the documentation to avoid unnecessary confusion. + +This chapter explains how to get started working with Pintos. You +should read the entire chapter before you start work on any of the +tasks. + +@menu +* Getting Started:: +* Submission:: +* Grading:: +* Legal and Ethical Issues:: +* Acknowledgements:: +* Trivia:: +@end menu + +@node Getting Started +@section Getting Started + +To get started, you'll have to log into a machine that Pintos can be +built on. +@localmachines{} +We will test your code on these machines, and the instructions given +here assume this environment. We cannot provide support for installing +and working on Pintos on your own machine, but we provide instructions +for doing so nonetheless (@pxref{Installing Pintos}). + +If you are using tcsh (the default shell for CSG-run machines), several Pintos +utilities will already be in your PATH. If you are not using either tcsh or a +CSG-run machine, you will need to add these utilities manually.
+ +@localpathsetup{} + +@menu +* Source Tree Overview:: +* Building Pintos:: +* Running Pintos:: +* Debugging versus Testing:: +@end menu + +@node Source Tree Overview +@subsection Source Tree Overview + +Now you can retrieve a copy of the source for Pintos by executing +@example +git clone @value{localpintosgitpath} pintos +@end example + +Let's take a look at what's inside. Here's the directory structure +that you should see in @file{pintos-ic/src}: + +@table @file +@item threads/ +Source code for the base kernel, which you will modify starting in +task 1. + +@item userprog/ +Source code for the user program loader, which you will modify +starting with task 2. + +@item vm/ +An almost empty directory. You will implement virtual memory here in +task 3. + +@item filesys/ +Source code for a basic file system. You will use this file system +starting with task 2. + +@item devices/ +Source code for I/O device interfacing: keyboard, timer, disk, etc. +You will modify the timer implementation in task 1. Otherwise +you should have no need to change this code. + +@item lib/ +An implementation of a subset of the standard C library. The code in +this directory is compiled into both the Pintos kernel and, starting +from task 2, user programs that run under it. In both kernel code +and user programs, headers in this directory can be included using the +@code{#include <@dots{}>} notation. You should have little need to +modify this code. + +@item lib/kernel/ +Parts of the C library that are included only in the Pintos kernel. +This also includes implementations of some data types that you are +free to use in your kernel code: bitmaps, doubly linked lists, and +hash tables. In the kernel, headers in this +directory can be included using the @code{#include <@dots{}>} +notation. + +@item lib/user/ +Parts of the C library that are included only in Pintos user programs. +In user programs, headers in this directory can be included using the +@code{#include <@dots{}>} notation. 
+ +@item tests/ +Tests for each task. You can modify this code if it helps you test +your submission, but we will replace it with the originals before we run +the tests. + +@item examples/ +Example user programs for use starting with task 2. + +@item misc/ +@itemx utils/ +These files may come in handy if you decide to try working with Pintos +on your own machine. Otherwise, you can ignore them. +@end table + +@node Building Pintos +@subsection Building Pintos + +As the next step, build the source code supplied for +the first task. First, @command{cd} into the @file{threads} +directory. Then, issue the @samp{make} command. This will create a +@file{build} directory under @file{threads}, populate it with a +@file{Makefile} and a few subdirectories, and then build the kernel +inside. The entire build should take less than 30 seconds. + +@localcrossbuild{} + +Following the build, the following are the interesting files in the +@file{build} directory: + +@table @file +@item Makefile +A copy of @file{pintos-ic/src/Makefile.build}. It describes how to build +the kernel. @xref{Adding Source Files}, for more information. + +@item kernel.o +Object file for the entire kernel. This is the result of linking +object files compiled from each individual kernel source file into a +single object file. It contains debug information, so you can run +GDB (@pxref{GDB}) or @command{backtrace} (@pxref{Backtraces}) on it. + +@item kernel.bin +Memory image of the kernel, that is, the exact bytes loaded into +memory to run the Pintos kernel. This is just @file{kernel.o} with +debug information stripped out, which saves a lot of space, which in +turn keeps the kernel from bumping up against a @w{512 kB} size limit +imposed by the kernel loader's design. + +@item loader.bin +Memory image for the kernel loader, a small chunk of code written in +assembly language that reads the kernel from disk into memory and +starts it up. It is exactly 512 bytes long, a size fixed by the +PC BIOS. 
+@end table + +Subdirectories of @file{build} contain object files (@file{.o}) and +dependency files (@file{.d}), both produced by the compiler. The +dependency files tell @command{make} which source files need to be +recompiled when other source or header files are changed. + +@node Running Pintos +@subsection Running Pintos + +We've supplied a program for conveniently running Pintos in a simulator, +called @command{pintos}. In the simplest case, you can invoke +@command{pintos} as @code{pintos @var{argument}@dots{}}. Each +@var{argument} is passed to the Pintos kernel for it to act on. + +Try it out. First @command{cd} into the newly created @file{build} +directory. Then issue the command @code{pintos run alarm-multiple}, +which passes the arguments @code{run alarm-multiple} to the Pintos +kernel. In these arguments, @command{run} instructs the kernel to run a +test and @code{alarm-multiple} is the test to run. + +Pintos boots and runs the @code{alarm-multiple} test +program, which outputs a few screenfuls of text. +You can log serial output to a file by redirecting at the +command line, e.g.@: @code{pintos run alarm-multiple > logfile}. + +The @command{pintos} program offers several options for configuring the +simulator or the virtual hardware. If you specify any options, they +must precede the commands passed to the Pintos kernel and be separated +from them by @option{--}, so that the whole command looks like +@code{pintos @var{option}@dots{} -- @var{argument}@dots{}}. Invoke +@code{pintos} without any arguments to see a list of available options. +You can run the simulator with a debugger (@pxref{GDB}). You can also set the +amount of memory to give the VM. + +The Pintos kernel has commands and options other than @command{run}. +These are not very interesting for now, but you can see a list of them +using @option{-h}, e.g.@: @code{pintos -h}. 
+ +@node Debugging versus Testing +@subsection Debugging versus Testing + +The QEMU simulator you will be using to run Pintos only supports real-time +simulations. This has ramifications with regard to both testing and debugging. + +Whilst reproducibility is extremely useful for debugging, running Pintos in +QEMU is not necessarily deterministic. You should keep this in mind when +testing for bugs in your code. In each run, timer interrupts will come at +irregularly spaced intervals, meaning that bugs may appear and disappear with +repeated tests. Therefore it's very important that you run through the tests at +least a few times. No number of runs can guarantee that your synchronisation is +perfect, but the more you do, the more confident you can be that your code +doesn't have major flaws. + +@node Submission +@section Submission +When you are finished with a task, you can submit it via CATE as a +@code{tar.gz} archive: + +@itemize @bullet +@item Run @code{make clean} from the top-level @code{pintos-ic} directory, +@item Run @code{tar -czvf task@var{[number]}.tar.gz pintos-ic} from its parent directory, +@item Submit the resulting archive file via CATE. +@end itemize + +@node Grading +@section Grading + +We will grade your assignments based on test results and design quality, +each of which comprises 50% of your grade. + +Task 0 (completing the introductory worksheet handed out through CATE) +and Task 1 will contribute towards your Operating Systems coursework mark, +while Tasks 2 and 3 will contribute towards your Laboratory work mark. + +@menu +* Testing:: +* Design:: +@end menu + +@node Testing +@subsection Testing + +Your test result grade will be based on our tests. Each task has +several tests, each of which has a name beginning with @file{tests}. +To completely test your submission, invoke @code{make check} from the +task @file{build} directory. This will build and run each test and +print a ``pass'' or ``fail'' message for each one.
When a test fails, +@command{make check} also prints some details of the reason for failure. +After running all the tests, @command{make check} also prints a summary +of the test results. + +You can also run individual tests one at a time. A given test @var{t} +writes its output to @file{@var{t}.output}, then a script scores the +output as ``pass'' or ``fail'' and writes the verdict to +@file{@var{t}.result}. To run and grade a single test, @command{make} +the @file{.result} file explicitly from the @file{build} directory, e.g.@: +@code{make tests/threads/alarm-multiple.result}. If @command{make} says +that the test result is up-to-date, but you want to re-run it anyway, +either run @code{make clean} or delete the @file{.output} file by hand. + +By default, each test provides feedback only at completion, not during +its run. If you prefer, you can observe the progress of each test by +specifying @option{VERBOSE=1} on the @command{make} command line, as in +@code{make check VERBOSE=1}. You can also provide arbitrary options to the +@command{pintos} run by the tests with @option{PINTOSOPTS='@dots{}'}. + +All of the tests and related files are in @file{pintos-ic/src/tests}. +Before we test your submission, we will replace the contents of that +directory by a pristine, unmodified copy, to ensure that the correct +tests are used. Thus, you can modify some of the tests if that helps in +debugging, but we will run the originals. + +All software has bugs, so some of our tests may be flawed. If you think +a test failure is a bug in the test, not a bug in your code, +please point it out. We will look at it and fix it if necessary. + +Please don't try to take advantage of our generosity in giving out our +test suite. Your code has to work properly in the general case, not +just for the test cases we supply. For example, it would be unacceptable +to explicitly base the kernel's behavior on the name of the running +test case. 
Such attempts to side-step the test cases will receive no +credit. If you think your solution may be in a gray area here, please +ask us about it. + +@node Design +@subsection Design + +We will judge your design based on the design document and the source +code that you submit. We will read your entire design document and much +of your source code. + +Don't forget that design quality, including the design document, is 50% +of your task grade. It +is better to spend one or two hours writing a good design document than +it is to spend that time getting the last 5% of the points for tests and +then trying to rush through writing the design document in the last 15 +minutes. + +@menu +* Design Document:: +* Source Code:: +@end menu + +@node Design Document +@subsubsection Design Document + +We provide a design document template for each task. For each +significant part of a task, the template asks questions in four +areas: + +@table @strong +@item Data Structures + +The instructions for this section are always the same: + +@quotation +Copy here the declaration of each new or changed @code{struct} or +@code{struct} member, global or static variable, @code{typedef}, or +enumeration. Identify the purpose of each in 25 words or less. +@end quotation + +The first part is mechanical. Just copy new or modified declarations +into the design document, to highlight for us the actual changes to data +structures. Each declaration should include the comment that should +accompany it in the source code (see below). + +We also ask for a very brief description of the purpose of each new or +changed data structure. The limit of 25 words or less is a guideline +intended to save your time and avoid duplication with later areas. + +@item Algorithms + +This is where you tell us how your code works, through questions that +probe your understanding of your code. We might not be able to easily +figure it out from the code, because many creative solutions exist for +most OS problems. 
Help us out a little. + +Your answers should be at a level below the high level description of +requirements given in the assignment. We have read the assignment too, +so it is unnecessary to repeat or rephrase what is stated there. On the +other hand, your answers should be at a level above the low level of the +code itself. Don't give a line-by-line run-down of what your code does. +Instead, use your answers to explain how your code works to implement +the requirements. + +@item Synchronization + +An operating system kernel is a complex, multithreaded program, in which +synchronizing multiple threads can be difficult. This section asks +about how you chose to synchronize this particular type of activity. + +@item Rationale + +Whereas the other sections primarily ask ``what'' and ``how,'' the +rationale section concentrates on ``why.'' This is where we ask you to +justify some design decisions, by explaining why the choices you made +are better than alternatives. You may be able to state these in terms +of time and space complexity, which can be made as rough or informal +arguments (formal language or proofs are unnecessary). +@end table + +An incomplete, evasive, or non-responsive design document or one that +strays from the template without good reason may be penalized. +Incorrect capitalization, punctuation, spelling, or grammar can also +cost points. @xref{Task Documentation}, for a sample design document +for a fictitious task. + +@node Source Code +@subsubsection Source Code + +Your design will also be judged by looking at your source code. We will +typically look at the differences between the original Pintos source +tree and your submission, based on the output of a command like +@code{diff -urpb pintos.orig pintos.submitted}. We will try to match up your +description of the design with the code submitted. Important +discrepancies between the description and the actual code will be +penalized, as will be any bugs we find by spot checks. 
+ +The most important aspects of source code design are those that +specifically relate to the operating system issues at stake in the +task. Other issues are much less important. For example, multiple +Pintos design problems call for a ``priority queue,'' that is, a dynamic +collection from which the minimum (or maximum) item can quickly be +extracted. Fast priority queues can be implemented many ways, but we do +not expect you to build a fancy data structure even if it might improve +performance. Instead, you are welcome to use a linked list (and Pintos +even provides one with convenient functions for sorting and finding +minimums and maximums). + +Pintos is written in a consistent style. Make your additions and +modifications in existing Pintos source files blend in, not stick out. +In new source files, adopt the existing Pintos style by preference, but +make your code self-consistent at the very least. There should not be a +patchwork of different styles that makes it obvious that three different +people wrote the code. Use horizontal and vertical white space to make +code readable. Add a brief comment on every structure, structure +member, global or static variable, typedef, enumeration, and function +definition. Update +existing comments as you modify code. Don't comment out or use the +preprocessor to ignore blocks of code (instead, remove it entirely). +Use assertions to document key invariants. Decompose code into +functions for clarity. Code that is difficult to understand because it +violates these or other ``common sense'' software engineering practices +will be penalized. + +In the end, remember your audience. Code is written primarily to be +read by humans. It has to be acceptable to the compiler too, but the +compiler doesn't care about how it looks or how well it is written. + +@node Legal and Ethical Issues +@section Legal and Ethical Issues + +Pintos is distributed under a liberal license that allows free use, +modification, and distribution. 
Students and others who work on Pintos +own the code that they write and may use it for any purpose. +Pintos comes with NO WARRANTY, not even for MERCHANTABILITY or FITNESS +FOR A PARTICULAR PURPOSE. +@xref{License}, for details of the license and lack of warranty. + +@localhonorcodepolicy{} + +@node Acknowledgements +@section Acknowledgements + +The Pintos core and this documentation were originally written by Ben +Pfaff @email{blp@@cs.stanford.edu}. + +Additional features were contributed by Anthony Romano +@email{chz@@vt.edu}. + +The GDB macros supplied with Pintos were written by Godmar Back +@email{gback@@cs.vt.edu}, and their documentation is adapted from his +work. + +The original structure and form of Pintos was inspired by the Nachos +instructional operating system from the University of California, +Berkeley (@bibref{Christopher}). + +The Pintos tasks and documentation originated with those designed for +Nachos by current and former CS 140 teaching assistants at Stanford +University, including at least Yu Ping, Greg Hutchins, Kelly Shaw, Paul +Twohey, Sameer Qureshi, and John Rector. + +Example code for monitors (@pxref{Monitors}) is +from classroom slides originally by Dawson Engler and updated by Mendel +Rosenblum. + +Additional modifications were made to the documentation and code when adapting +it for use at Imperial College London by Mark Rutland and Feroz Abdul Salam. + +@localcredits{} + +@node Trivia +@section Trivia + +Pintos originated as a replacement for Nachos with a similar design. +Since then Pintos has greatly diverged from the Nachos design. Pintos +differs from Nachos in two important ways. First, Pintos runs on real +or simulated 80@var{x}86 hardware, but Nachos runs as a process on a +host operating system. Second, Pintos is written in C like most +real-world operating systems, but Nachos is written in C++. + +Why the name ``Pintos''? First, like nachos, pinto beans are a common +Mexican food. 
Second, Pintos is small and a ``pint'' is a small amount. +Third, like drivers of the eponymous car, students are likely to have +trouble with blow-ups. diff --git a/doc/license.texi b/doc/license.texi new file mode 100644 index 0000000..412b5a7 --- /dev/null +++ b/doc/license.texi @@ -0,0 +1,62 @@ +@node License +@unnumbered License + +Pintos, including its documentation, is subject to the following +license: + +@quotation +Copyright @copyright{} 2004, 2005, 2006 Board of Trustees, Leland +Stanford Jr.@: University. All rights reserved. + +Permission is hereby granted, free of charge, to any person obtaining +a copy of this software and associated documentation files (the +``Software''), to deal in the Software without restriction, including +without limitation the rights to use, copy, modify, merge, publish, +distribute, sublicense, and/or sell copies of the Software, and to +permit persons to whom the Software is furnished to do so, subject to +the following conditions: + +The above copyright notice and this permission notice shall be +included in all copies or substantial portions of the Software. + +THE SOFTWARE IS PROVIDED ``AS IS'', WITHOUT WARRANTY OF ANY KIND, +EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF +MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND +NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE +LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION +OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION +WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. +@end quotation + +A few individual files in Pintos were originally derived from other +projects, but they have been extensively modified for use in Pintos. +The original code falls under the original license, and modifications +for Pintos are additionally covered by the Pintos license above. 
+ +In particular, code derived from Nachos is subject to the following +license: + +@quotation +Copyright @copyright{} 1992-1996 The Regents of the University of California. +All rights reserved. + +Permission to use, copy, modify, and distribute this software +and its documentation for any purpose, without fee, and +without written agreement is hereby granted, provided that the +above copyright notice and the following two paragraphs appear +in all copies of this software. + +IN NO EVENT SHALL THE UNIVERSITY OF CALIFORNIA BE LIABLE TO +ANY PARTY FOR DIRECT, INDIRECT, SPECIAL, INCIDENTAL, OR +CONSEQUENTIAL DAMAGES ARISING OUT OF THE USE OF THIS SOFTWARE +AND ITS DOCUMENTATION, EVEN IF THE UNIVERSITY OF CALIFORNIA +HAS BEEN ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. + +THE UNIVERSITY OF CALIFORNIA SPECIFICALLY DISCLAIMS ANY +WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED +WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR +PURPOSE. THE SOFTWARE PROVIDED HEREUNDER IS ON AN ``AS IS'' +BASIS, AND THE UNIVERSITY OF CALIFORNIA HAS NO OBLIGATION TO +PROVIDE MAINTENANCE, SUPPORT, UPDATES, ENHANCEMENTS, OR +MODIFICATIONS. +@end quotation diff --git a/doc/localgitinstructions.texi b/doc/localgitinstructions.texi new file mode 100644 index 0000000..dfe0603 --- /dev/null +++ b/doc/localgitinstructions.texi @@ -0,0 +1,118 @@ +@c +@c Instructions on how to set up a group environment, permissions, +@c Git repository, dealing with issues etc. +@c +@c While some of the discussion may apply to more than one environment, +@c no attempt was made to untangle and split the discussion. +@c + +@menu +* Setting Up Git:: +* Using a bare repository:: +* Using Git:: +@end menu + +@node Setting Up Git +@subsection Setting Up Git + +It is recommended when using git for this task to decide on one +group member to be in charge of the canonical repository. 
It is +recommended that those unfamiliar with git should work together in +the same repository rather than creating copies for each member. + +To make the Git log easier to read later, set the user.name and +user.email variables in your .gitconfig file: +@verbatim +[user] + name = Firstname Surname + email = example@doc.ic.ac.uk +@end verbatim + +To work on the source code, you must create a clone of the initial +repository. This can be done by running +@samp{git clone @value{localpintosgitpath} pintos} to copy +the initial repository into a directory named @samp{pintos}. + +@node Using a bare repository +@subsection Using a bare repository + +If you wish to work with repositories for each group member, you can use +a @strong{bare repository} to make synchronising your development easier. + +To create a bare repository, go to your shared directory and run the +following: + +@verbatim + chmod g+s . + git clone --bare /vol/lab/secondyear/osexercise/pintos.git pintos-ic + cd pintos-ic + git init --bare --shared=group +@end verbatim + +This will make sure group permissions are inherited by the repository when +it is cloned, and will also ensure that the repository behaves correctly with +multiple contributors. + +Once this is done, you can add your group repository as a remote from your +personal repository with +@samp{git remote add group /vol/project/2010/261/g10261@var{$GROUP}/pintos-ic}. +You can then share commits using @samp{git fetch}, @samp{git pull}, and +@samp{git push}. For instance, you can share commits from your personal repo +with @samp{git push group master}, and get commits from the group repo with +@samp{git fetch group}. These commits can be merged as usual. If you want to +merge the master branch directly, you can use @samp{git pull group master}. + +As bare repos are not `special' in any way, they can be deleted and recreated +if required. Just be sure that you have any commits you want to keep in your +personal repo before deleting.
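The bare-repository workflow described above can be rehearsed in a throwaway directory before you try it on the shared volume. The following is only a minimal sketch under stated assumptions: it assumes @samp{git} is installed, and every path and name in it (@samp{scratch}, @samp{upstream}, @samp{group.git}, @samp{personal}, @samp{NOTES}) is hypothetical and stands in for the real locations given above.

```shell
#!/bin/sh
# Rehearsal of the bare-repository workflow in a scratch directory.
# All paths and names here are hypothetical stand-ins, not the real /vol locations.
set -eu
scratch=$(mktemp -d)
ident="-c user.name=Example -c user.email=example@doc.ic.ac.uk"

# Stand-in for the initial Pintos repository, holding a single empty commit.
git init -q "$scratch/upstream"
git -C "$scratch/upstream" $ident commit -q --allow-empty -m "Initial import"

# Shared bare repository for the group, re-initialised as group-shared.
git clone -q --bare "$scratch/upstream" "$scratch/group.git"
git -C "$scratch/group.git" init -q --bare --shared=group

# Personal clone, with the group repository added as a remote named "group".
git clone -q "$scratch/upstream" "$scratch/personal"
cd "$scratch/personal"
git remote add group "$scratch/group.git"
branch=$(git rev-parse --abbrev-ref HEAD)   # "master" or "main", depending on your git

# Commit locally, then share the commit with the group.
echo "design notes" > NOTES
git add NOTES
git $ident commit -q -m "Add design notes"
git push -q group "$branch"

# Any member can now pick the commit up from the group repository.
git fetch -q group
git --no-pager log --oneline "group/$branch"
```

The sketch reads the branch name from the clone instead of hard-coding @samp{master}, because the default branch name depends on the version and configuration of git in use.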
+ +@node Using Git +@subsection Using Git + +Once you've cloned the repository, you can start working in your clone +straight away. At any point you can see what files you've modified with +@samp{git status}, and check a file in greater detail with +@samp{git diff @var{filename}}. You can view more detailed information using +tools such as @samp{tig}. + +Git uses an intermediary area between the working filesystem and the actual +repository, known as the staging area (or index). This allows you to perform +tasks such as committing only a subset of your changes, without modifying your +copy of the filesystem. Whilst the uses of the staging area are outside the +scope of this guide, it is important that you are aware of its existence. + +When you want to place your modifications into the repository, you must +first update the staging area with your changes (@samp{git add @var{filename}}), +and then use this to update the repository, using @samp{git commit}. Git +will open a text editor when committing, allowing you to provide a description +of your changes. This can be useful later for reviewing the repository, +so be sensible with your commit messages. + +If your group is using a repository per person, rather than working +together using one, you may make conflicting changes at some point, +which git is unable to resolve. These problems can be solved using +@samp{git mergetool}, but its use is outside the scope of this discussion. + +You can view the history of @var{file} in your working directory, +including the log messages, with @samp{git log @var{file}}. + +You can give a particular set of file versions a name called a +@dfn{tag}. Simply execute @samp{git tag @var{name}}. It's best +to have no local changes in the working copy when you do this, because +the tag will not include uncommitted changes. To recover the tagged +commit later, simply execute @samp{git checkout @var{tag}}.
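The commands described so far (status, diff, add, commit, log, tag and checkout) fit together roughly as in the sketch below. It runs in a scratch repository and assumes only that @samp{git} is installed; the file name @samp{notes.txt} and the tag name @samp{task1-draft} are hypothetical, chosen purely for illustration.

```shell
#!/bin/sh
# Stage, commit, tag, and recover a tagged version in a scratch repository.
# The file and tag names are hypothetical examples.
set -eu
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.name "Firstname Surname"
git config user.email "example@doc.ic.ac.uk"

# Stage and commit a first version of a file.
echo "version 1" > notes.txt
git add notes.txt
git commit -q -m "Add notes"

# Give this set of file versions a name while the working copy is clean.
git tag task1-draft

# Carry on working: inspect the change, then stage and commit it.
echo "version 2" > notes.txt
git status --short
git --no-pager diff notes.txt
git add notes.txt
git commit -q -m "Revise notes"

# History of the file, including the log messages.
git --no-pager log --oneline notes.txt

# Recover the tagged commit; the working copy returns to the first version.
git checkout -q task1-draft
cat notes.txt
```

Checking out a tag leaves the repository on a detached HEAD, so switch back to your branch before committing further work.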
+ +If you add a new file to the source tree, you'll need to add it to the +repository with @samp{git add @var{file}}. This command does not have +lasting effect until the file is committed later with @samp{git +commit}. + +To remove a file from the source tree, first remove it from the file +system with @samp{git rm @var{file}}. Again, only @samp{git commit} +will make the change permanent. + +To discard your local changes for a given file, without committing +them, use @samp{git checkout @var{file} -f}. + +For more information, visit the @uref{https://www.git-scm.com/, , Git +home page}. diff --git a/doc/localsettings.texi b/doc/localsettings.texi new file mode 100644 index 0000000..39e6c53 --- /dev/null +++ b/doc/localsettings.texi @@ -0,0 +1,60 @@ +@c Local settings + +@set coursenumber OS211 +@set localpintosgitpath /vol/lab/secondyear/osexercise/pintos.git +@set localpintosbindir /vol/lab/secondyear/bin/ + +@set recommendvnc +@clear recommendcygwin + +@macro localmachines{} +The machines officially supported for OS211 Pintos development are +the Linux machines in the labs managed by CSG, as described on +the @uref{http://www.doc.ic.ac.uk/csg/facilities/lab/workstations, , +CSG webpage}. +@end macro + +@macro localpathsetup{} +Pintos utilities can be located at @value{localpintosbindir} on CSG-run +lab machines. +@end macro + +@macro localcrossbuild{} +Watch the commands executed during the build. On the Linux machines, +the ordinary system tools are used. +@end macro + +@macro localhonorcodepolicy{} +Please respect the plagiarism policy by refraining +from reading any homework solutions available online or elsewhere. Reading +the source code for other operating system kernels, such as Linux or FreeBSD, +is allowed, but do not copy code from them literally. Please cite the code +that inspired your own in your design documentation. 
+@end macro + +@macro localcredits{} +@c none needed +@end macro + +@macro localgitpolicy{} +Instead, we recommend integrating your team's changes early and often, +using a source code control system such as Git (@pxref{Git}), for which +the Pintos-ic folder has already been set up. + +This is less likely to produce surprises, because everyone can see +everyone else's code as it is written, instead of just when it is +finished. These systems also make it possible to review changes and, +when a change introduces a bug, drop back to working versions of code. +@end macro + +@macro localcodingstandards{} +All of you should be familiar with good coding standards by now. This project +will be much easier to complete and grade if you maintain the code style used +in the original code, and employ sensible variable naming policies. Code style +makes up a large part of your final grade for this work, and will be +scrutinised carefully. +@end macro + +@macro localdevelopmenttools{} +@c Descriptions of additional, local development tools can be inserted here +@end macro diff --git a/doc/pintos-ic.texi b/doc/pintos-ic.texi new file mode 100644 index 0000000..bae593f --- /dev/null +++ b/doc/pintos-ic.texi @@ -0,0 +1,90 @@ +\input texinfo @c -*- texinfo -*- + +@c %**start of header +@setfilename pintos-ic.info +@settitle Pintos Tasks +@c %**end of header + +@c @bibref{} macro +@iftex +@macro bibref{cite} +[\cite\] +@end macro +@afourpaper +@end iftex +@ifinfo +@ifnotplaintext +@macro bibref{cite} +@ref{\cite\} +@end macro +@end ifnotplaintext +@ifplaintext +@macro bibref{cite} +[\cite\] +@end macro +@end ifplaintext +@end ifinfo +@ifhtml +@macro bibref{cite} +[@ref{\cite\}] +@end macro +@end ifhtml + +@macro func{name} +@code{\name\()} +@end macro + +@macro struct{name} +@code{struct \name\} +@end macro + +@titlepage +@title Pintos (Imperial College Edition) +Version 2 +@author Originally by Ben Pfaff +@end titlepage + +@shortcontents +@contents + +@ifnottex +@node Top, 
Introduction, (dir), (dir) +@top Pintos Tasks +@end ifnottex + +@menu +* Introduction:: +* Task 0--Codebase:: +* Task 1--Threads:: +* Task 2--User Programs:: +* Task 3--Virtual Memory:: +* Reference Guide:: +* 4.4BSD Scheduler:: +* Coding Standards:: +* Task Documentation:: +* Debugging Tools:: +* Development Tools:: +* Installing Pintos:: +* Bibliography:: +* License:: +@end menu + +@c institution-local settings +@include localsettings.texi + +@include intro.texi +@include codebase.texi +@include threads.texi +@include userprog.texi +@include vm.texi +@include reference.texi +@include 44bsd.texi +@include standards.texi +@include doc.texi +@include debug.texi +@include devel.texi +@include installation.texi +@include bibliography.texi +@include license.texi + +@bye diff --git a/doc/pintos-t2h.init b/doc/pintos-t2h.init new file mode 100644 index 0000000..064e25a --- /dev/null +++ b/doc/pintos-t2h.init @@ -0,0 +1,16 @@ +sub T2H_InitGlobals +{ + # Set the default body text, inserted between + $T2H_BODYTEXT = ''; + # text inserted after + $T2H_AFTER_BODY_OPEN = ''; + #text inserted before + $T2H_PRE_BODY_CLOSE = ''; + # this is used in footer + $T2H_ADDRESS = "$T2H_USER " if $T2H_USER; + $T2H_ADDRESS .= "on $T2H_TODAY"; + # this is added inside after and some META NAME stuff + # can be used for <style> <script>, <meta> tags + $T2H_EXTRA_HEAD = "<LINK REL=\"stylesheet\" HREF=\"pintos.css\">"; +} +1; diff --git a/doc/pintos.css b/doc/pintos.css new file mode 100644 index 0000000..0af878f --- /dev/null +++ b/doc/pintos.css @@ -0,0 +1,76 @@ +body { + background: white; + color: black; + padding: 0em 1em 0em 3em; + margin: 0; + margin-left: auto; + margin-right: auto; + max-width: 8in; + text-align: justify +} +body>p { + margin: 0pt 0pt 0pt 0em; + text-align: justify +} +body>p + p { + margin: .75em 0pt 0pt 0pt +} +H1 { + font-size: 150%; + margin-left: -1.33em +} +H2 { + font-size: 125%; + font-weight: bold; + margin-left: -.8em +} +H3 { + font-size: 100%; + 
font-weight: bold; + margin-left: -.5em } +H4 { + font-size: 100%; + margin-left: 0em +} +H1, H2, H3, H4, H5, H6 { + font-family: sans-serif; + color: blue +} +H1, H2 { + text-decoration: underline +} +html { + margin: 0; + font-weight: lighter +} +tt, code { + font-family: sans-serif +} +b, strong { + font-weight: bold +} + +a:link { + color: blue; + text-decoration: none; +} +a:visited { + color: gray; + text-decoration: none; +} +a:active { + color: black; + text-decoration: none; +} +a:hover { + text-decoration: underline +} + +address { + font-size: 90%; + font-style: normal +} + +HR { + display: none +} diff --git a/doc/reference.texi b/doc/reference.texi new file mode 100644 index 0000000..2145f27 --- /dev/null +++ b/doc/reference.texi @@ -0,0 +1,2284 @@ +@node Reference Guide +@appendix Reference Guide + +This chapter is a reference for the Pintos code. The reference guide +does not cover all of the code in Pintos, but it does cover those +pieces that students most often find troublesome. You may find that +you want to read each part of the reference guide as you work on the +task where it becomes important. + +We recommend using ``tags'' to follow along with references to function +and variable names (@pxref{Tags}). + +@menu +* Pintos Loading:: +* Threads:: +* Synchronization:: +* Interrupt Handling:: +* Memory Allocation:: +* Virtual Addresses:: +* Page Table:: +* Hash Table:: +@end menu + +@node Pintos Loading +@section Loading + +This section covers the Pintos loader and basic kernel +initialization. + +@menu +* Pintos Loader:: +* Low-Level Kernel Initialization:: +* High-Level Kernel Initialization:: +* Physical Memory Map:: +@end menu + +@node Pintos Loader +@subsection The Loader + +The first part of Pintos that runs is the loader, in +@file{threads/loader.S}. The PC BIOS loads the loader into memory. +The loader, in turn, is responsible for finding the kernel on disk, +loading it into memory, and then jumping to its start. 
It's +not important to understand exactly how the loader works, but if +you're interested, read on. You should probably read along with the +loader's source. You should also understand the basics of the +80@var{x}86 architecture as described by chapter 3, ``Basic Execution +Environment,'' of @bibref{IA32-v1}. + +The PC BIOS loads the loader from the first sector of the first hard +disk, called the @dfn{master boot record} (MBR). PC conventions +reserve 64 bytes of the MBR for the partition table, and Pintos uses +about 128 additional bytes for kernel command-line arguments. This +leaves a little over 300 bytes for the loader's own code. This is a +severe restriction that means, practically speaking, the loader must +be written in assembly language. + +The Pintos loader and kernel don't have to be on the same disk, nor +is the kernel required to be in any particular location on a +given disk. The loader's first job, then, is to find the kernel by +reading the partition table on each hard disk, looking for a bootable +partition of the type used for a Pintos kernel. + +When the loader finds a bootable kernel partition, it reads the +partition's contents into memory at physical address @w{128 kB}. The +kernel is at the beginning of the partition, which might be larger +than necessary due to partition boundary alignment conventions, so the +loader reads no more than @w{512 kB} (and the Pintos build process +will refuse to produce kernels larger than that). Reading more data +than this would cross into the region from @w{640 kB} to @w{1 MB} that +the PC architecture reserves for hardware and the BIOS, and a standard +PC BIOS does not provide any means to load the kernel above @w{1 MB}. + +The loader's final job is to extract the entry point from the loaded +kernel image and transfer control to it. The entry point is not at a +predictable location, but the kernel's ELF header contains a pointer +to it. 
The loader extracts the pointer and jumps to the location it +points to. + +The Pintos kernel command line +is stored in the boot loader. The @command{pintos} program actually +modifies a copy of the boot loader on disk each time it runs the kernel, +inserting whatever command-line arguments the user supplies to the kernel, +and then the kernel at boot time reads those arguments out of the boot +loader in memory. This is not an elegant solution, but it is simple +and effective. + +@node Low-Level Kernel Initialization +@subsection Low-Level Kernel Initialization + +The loader's last action is to transfer control to the kernel's entry +point, which is @func{start} in @file{threads/start.S}. The job of +this code is to switch the CPU from legacy 16-bit ``real mode'' into +the 32-bit ``protected mode'' used by all modern 80@var{x}86 operating +systems. + +The startup code's first task is actually to obtain the machine's +memory size, by asking the BIOS for the PC's memory size. The +simplest BIOS function to do this can only detect up to 64 MB of RAM, +so that's the practical limit that Pintos can support. The function +stores the memory size, in pages, in global variable +@code{init_ram_pages}. + +The first part of CPU initialization is to enable the A20 line, that +is, the CPU's address line numbered 20. For historical reasons, PCs +boot with this address line fixed at 0, which means that attempts to +access memory beyond the first 1 MB (2 raised to the 20th power) will +fail. Pintos wants to access more memory than this, so we have to +enable it. + +Next, the loader creates a basic page table. This page table maps +the 64 MB at the base of virtual memory (starting at virtual address +0) directly to the identical physical addresses. It also maps the +same physical memory starting at virtual address +@code{LOADER_PHYS_BASE}, which defaults to @t{0xc0000000} (3 GB). 
The +Pintos kernel only wants the latter mapping, but there's a +chicken-and-egg problem if we don't include the former: our current +virtual address is roughly @t{0x20000}, the location where the loader +put us, and we can't jump to @t{0xc0020000} until we turn on the +page table, but if we turn on the page table without jumping there, +then we've just pulled the rug out from under ourselves. + +After the page table is initialized, we load the CPU's control +registers to turn on protected mode and paging, and set up the segment +registers. We aren't yet equipped to handle interrupts in protected +mode, so we disable interrupts. The final step is to call @func{main}. + +@node High-Level Kernel Initialization +@subsection High-Level Kernel Initialization + +The kernel proper starts with the @func{main} function. The +@func{main} function is written in C, as will be most of the code we +encounter in Pintos from here on out. + +When @func{main} starts, the system is in a pretty raw state. We're +in 32-bit protected mode with paging enabled, but hardly anything else is +ready. Thus, the @func{main} function consists primarily of calls +into other Pintos modules' initialization functions. +These are usually named @func{@var{module}_init}, where +@var{module} is the module's name, @file{@var{module}.c} is the +module's source code, and @file{@var{module}.h} is the module's +header. + +The first step in @func{main} is to call @func{bss_init}, which clears +out the kernel's ``BSS'', which is the traditional name for a +segment that should be initialized to all zeros. In most C +implementations, whenever you +declare a variable outside a function without providing an +initializer, that variable goes into the BSS. Because it's all zeros, the +BSS isn't stored in the image that the loader brought into memory. We +just use @func{memset} to zero it out. 
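The BSS-clearing step described above can be sketched in ordinary user-space C. This is only an illustration under stated assumptions: the array @code{fake_image} and the pointers @code{bss_begin} and @code{bss_end} are hypothetical stand-ins for the bounds that a kernel would normally obtain from its linker script, and @code{bss_init_sketch} is not the Pintos function itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for a kernel image sitting in memory. */
static unsigned char fake_image[64];

/* Hypothetical BSS bounds.  In a real kernel these would be symbols
   supplied by the linker script, not pointers into an array. */
static unsigned char *const bss_begin = fake_image + 16;
static unsigned char *const bss_end = fake_image + 48;

/* Zeroes every byte in the half-open range [bss_begin, bss_end). */
static void
bss_init_sketch (void)
{
  assert (bss_begin <= bss_end);
  memset (bss_begin, 0, (size_t) (bss_end - bss_begin));
}

/* Fills the fake image with junk, clears the fake BSS, and returns
   the number of nonzero bytes left inside the BSS range. */
static size_t
bss_demo (void)
{
  size_t nonzero = 0;
  unsigned char *p;

  memset (fake_image, 0xcc, sizeof fake_image);
  bss_init_sketch ();
  for (p = bss_begin; p < bss_end; p++)
    if (*p != 0)
      nonzero++;
  return nonzero;
}
```

Calling @code{bss_demo ()} should leave the bytes outside the fake BSS range untouched, which mirrors why the BSS need not be stored in the on-disk image: only its bounds are needed to recreate it.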
+ +Next, @func{main} calls @func{read_command_line} to break the kernel command +line into arguments, then @func{parse_options} to read any options at +the beginning of the command line. (Actions specified on the +command line execute later.) + +@func{thread_init} initializes the thread system. We will defer full +discussion to our discussion of Pintos threads below. It is called so +early in initialization because a valid thread structure is a +prerequisite for acquiring a lock, and lock acquisition in turn is +important to other Pintos subsystems. Then we initialize the console +and print a startup message to the console. + +The next block of functions we call initializes the kernel's memory +system. @func{palloc_init} sets up the kernel page allocator, which +doles out memory one or more pages at a time (@pxref{Page Allocator}). +@func{malloc_init} sets +up the allocator that handles allocations of arbitrary-size blocks of +memory (@pxref{Block Allocator}). +@func{paging_init} sets up a page table for the kernel (@pxref{Page +Table}). + +In tasks 2 and later, @func{main} also calls @func{tss_init} and +@func{gdt_init}. + +The next set of calls initializes the interrupt system. +@func{intr_init} sets up the CPU's @dfn{interrupt descriptor table} +(IDT) to ready it for interrupt handling (@pxref{Interrupt +Infrastructure}), then @func{timer_init} and @func{kbd_init} prepare for +handling timer interrupts and keyboard interrupts, respectively. +@func{input_init} sets up to merge serial and keyboard input into one +stream. In +tasks 2 and later, we also prepare to handle interrupts caused by +user programs using @func{exception_init} and @func{syscall_init}. + +Now that interrupts are set up, we can start the scheduler +with @func{thread_start}, which creates the idle thread and enables +interrupts. +With interrupts enabled, interrupt-driven serial port I/O becomes +possible, so we use +@func{serial_init_queue} to switch to that mode. 
Finally, +@func{timer_calibrate} calibrates the timer for accurate short delays. + +If the file system is compiled in, as it will starting in task 2, we +initialize the IDE disks with @func{ide_init}, then the +file system with @func{filesys_init}. + +Boot is complete, so we print a message. + +Function @func{run_actions} now parses and executes actions specified on +the kernel command line, such as @command{run} to run a test (in task +1) or a user program (in later tasks). + +Finally, if @option{-q} was specified on the kernel command line, we +call @func{shutdown_power_off} to terminate the machine simulator. Otherwise, +@func{main} calls @func{thread_exit}, which allows any other running +threads to continue running. + +@node Physical Memory Map +@subsection Physical Memory Map + +@multitable {@t{00000000}--@t{00000000}} {Hardware} {Some much longer explanatory text} +@headitem Memory Range +@tab Owner +@tab Contents + +@item @t{00000000}--@t{000003ff} @tab CPU @tab Real mode interrupt table. +@item @t{00000400}--@t{000005ff} @tab BIOS @tab Miscellaneous data area. +@item @t{00000600}--@t{00007bff} @tab --- @tab --- +@item @t{00007c00}--@t{00007dff} @tab Pintos @tab Loader. +@item @t{0000e000}--@t{0000efff} @tab Pintos +@tab Stack for loader; kernel stack and @struct{thread} for initial +kernel thread. +@item @t{0000f000}--@t{0000ffff} @tab Pintos +@tab Page directory for startup code. +@item @t{00010000}--@t{00020000} @tab Pintos +@tab Page tables for startup code. +@item @t{00020000}--@t{0009ffff} @tab Pintos +@tab Kernel code, data, and uninitialized data segments. +@item @t{000a0000}--@t{000bffff} @tab Video @tab VGA display memory. +@item @t{000c0000}--@t{000effff} @tab Hardware +@tab Reserved for expansion card RAM and ROM. +@item @t{000f0000}--@t{000fffff} @tab BIOS @tab ROM BIOS. +@item @t{00100000}--@t{03ffffff} @tab Pintos @tab Dynamic memory allocation. 
+@end multitable + +@node Threads +@section Threads + +@menu +* struct thread:: +* Thread Functions:: +* Thread Switching:: +@end menu + +@node struct thread +@subsection @code{struct thread} + +The main Pintos data structure for threads is @struct{thread}, +declared in @file{threads/thread.h}. + +@deftp {Structure} {struct thread} +Represents a thread or a user process. In the tasks, you will have +to add your own members to @struct{thread}. You may also change or +delete the definitions of existing members. + +Every @struct{thread} occupies the beginning of its own page of +memory. The rest of the page is used for the thread's stack, which +grows downward from the end of the page. It looks like this: + +@example +@group + 4 kB +---------------------------------+ + | kernel stack | + | | | + | | | + | V | + | grows downward | + | | + | | + | | + | | + | | + | | + | | + | | +sizeof (struct thread) +---------------------------------+ + | magic | + | : | + | : | + | status | + | tid | + 0 kB +---------------------------------+ +@end group +@end example + +This has two consequences. First, @struct{thread} must not be allowed +to grow too big. If it does, then there will not be enough room for the +kernel stack. The base @struct{thread} is only a few bytes in size. It +probably should stay well under 1 kB. + +Second, kernel stacks must not be allowed to grow too large. If a stack +overflows, it will corrupt the thread state. Thus, kernel functions +should not allocate large structures or arrays as non-static local +variables. Use dynamic allocation with @func{malloc} or +@func{palloc_get_page} instead (@pxref{Memory Allocation}). +@end deftp + +@deftypecv {Member} {@struct{thread}} {tid_t} tid +The thread's thread identifier or @dfn{tid}. Every thread must have a +tid that is unique over the entire lifetime of the kernel. 
By +default, @code{tid_t} is a @code{typedef} for @code{int} and each new +thread receives the numerically next higher tid, starting from 1 for +the initial process. You can change the type and the numbering scheme +if you like. +@end deftypecv + +@deftypecv {Member} {@struct{thread}} {enum thread_status} status +@anchor{Thread States} +The thread's state, one of the following: + +@defvr {Thread State} @code{THREAD_RUNNING} +The thread is running. Exactly one thread is running at a given time. +@func{thread_current} returns the running thread. +@end defvr + +@defvr {Thread State} @code{THREAD_READY} +The thread is ready to run, but it's not running right now. The +thread could be selected to run the next time the scheduler is +invoked. Ready threads are kept in a doubly linked list called +@code{ready_list}. +@end defvr + +@defvr {Thread State} @code{THREAD_BLOCKED} +The thread is waiting for something, e.g.@: a lock to become +available, an interrupt to be invoked. The thread won't be scheduled +again until it transitions to the @code{THREAD_READY} state with a +call to @func{thread_unblock}. This is most conveniently done +indirectly, using one of the Pintos synchronization primitives that +block and unblock threads automatically (@pxref{Synchronization}). + +There is no @i{a priori} way to tell what a blocked thread is waiting +for, but a backtrace can help (@pxref{Backtraces}). +@end defvr + +@defvr {Thread State} @code{THREAD_DYING} +The thread will be destroyed by the scheduler after switching to the +next thread. +@end defvr +@end deftypecv + +@deftypecv {Member} {@struct{thread}} {char} name[16] +The thread's name as a string, or at least the first few characters of +it. +@end deftypecv + +@deftypecv {Member} {@struct{thread}} {uint8_t *} stack +Every thread has its own stack to keep track of its state. When the +thread is running, the CPU's stack pointer register tracks the top of +the stack and this member is unused. 
But when the CPU switches to +another thread, this member saves the thread's stack pointer. No +other members are needed to save the thread's registers, because the +other registers that must be saved are saved on the stack. + +When an interrupt occurs, whether in the kernel or a user program, an +@struct{intr_frame} is pushed onto the stack. When the interrupt occurs +in a user program, the @struct{intr_frame} is always at the very top of +the page. @xref{Interrupt Handling}, for more information. +@end deftypecv + +@deftypecv {Member} {@struct{thread}} {int} priority +A thread priority, ranging from @code{PRI_MIN} (0) to @code{PRI_MAX} +(63). Lower numbers correspond to lower priorities, so that +priority 0 is the lowest priority and priority 63 is the highest. +Pintos as provided ignores thread priorities, but you will implement +priority scheduling in task 1 (@pxref{Priority Scheduling}). +@end deftypecv + +@deftypecv {Member} {@struct{thread}} {@struct{list_elem}} allelem +This ``list element'' is used to link the thread into the list of all +threads. Each thread is inserted into this list when it is created +and removed when it exits. The @func{thread_foreach} function should +be used to iterate over all threads. +@end deftypecv + +@deftypecv {Member} {@struct{thread}} {@struct{list_elem}} elem +A ``list element'' used to put the thread into doubly linked lists, +either @code{ready_list} (the list of threads ready to run) or a list of +threads waiting on a semaphore in @func{sema_down}. It can do double +duty because a thread waiting on a semaphore is not ready, and vice +versa. +@end deftypecv + +@deftypecv {Member} {@struct{thread}} {uint32_t *} pagedir +Only present in task 2 and later. @xref{Page Tables}. +@end deftypecv + +@deftypecv {Member} {@struct{thread}} {unsigned} magic +Always set to @code{THREAD_MAGIC}, which is just an arbitrary number defined +in @file{threads/thread.c}, and used to detect stack overflow. 
+@func{thread_current} checks that the @code{magic} member of the running +thread's @struct{thread} is set to @code{THREAD_MAGIC}. Stack overflow +tends to change this value, triggering the assertion. For greatest +benefit, as you add members to @struct{thread}, leave @code{magic} at +the end. +@end deftypecv + +@node Thread Functions +@subsection Thread Functions + +@file{threads/thread.c} implements several public functions for thread +support. Let's take a look at the most useful: + +@deftypefun void thread_init (void) +Called by @func{main} to initialize the thread system. Its main +purpose is to create a @struct{thread} for Pintos's initial thread. +This is possible because the Pintos loader puts the initial +thread's stack at the top of a page, in the same position as any other +Pintos thread. + +Before @func{thread_init} runs, +@func{thread_current} will fail because the running thread's +@code{magic} value is incorrect. Lots of functions call +@func{thread_current} directly or indirectly, including +@func{lock_acquire} for locking a lock, so @func{thread_init} is +called early in Pintos initialization. +@end deftypefun + +@deftypefun void thread_start (void) +Called by @func{main} to start the scheduler. Creates the idle +thread, that is, the thread that is scheduled when no other thread is +ready. Then enables interrupts, which as a side effect enables the +scheduler because the scheduler runs on return from the timer interrupt, using +@func{intr_yield_on_return} (@pxref{External Interrupt Handling}). +@end deftypefun + +@deftypefun void thread_tick (void) +Called by the timer interrupt at each timer tick. It keeps track of +thread statistics and triggers the scheduler when a time slice expires. +@end deftypefun + +@deftypefun void thread_print_stats (void) +Called during Pintos shutdown to print thread statistics. 
+@end deftypefun + +@deftypefun tid_t thread_create (const char *@var{name}, int @var{priority}, thread_func *@var{func}, void *@var{aux}) +Creates and starts a new thread named @var{name} with the given +@var{priority}, returning the new thread's tid. The thread executes +@var{func}, passing @var{aux} as the function's single argument. + +@func{thread_create} allocates a page for the thread's +@struct{thread} and stack and initializes its members, then it sets +up a set of fake stack frames for it (@pxref{Thread Switching}). The +thread is initialized in the blocked state, then unblocked just before +returning, which allows the new thread to +be scheduled (@pxref{Thread States}). + +@deftp {Type} {void thread_func (void *@var{aux})} +This is the type of the function passed to @func{thread_create}, whose +@var{aux} argument is passed along as the function's argument. +@end deftp +@end deftypefun + +@deftypefun void thread_block (void) +Transitions the running thread from the running state to the blocked +state (@pxref{Thread States}). The thread will not run again until +@func{thread_unblock} is +called on it, so you'd better have some way arranged for that to happen. +Because @func{thread_block} is so low-level, you should prefer to use +one of the synchronization primitives instead (@pxref{Synchronization}). +@end deftypefun + +@deftypefun void thread_unblock (struct thread *@var{thread}) +Transitions @var{thread}, which must be in the blocked state, to the +ready state, allowing it to resume running (@pxref{Thread States}). +This is called when the event that the thread is waiting for occurs, +e.g.@: when the lock that +the thread is waiting on becomes available. +@end deftypefun + +@deftypefun {struct thread *} thread_current (void) +Returns the running thread. +@end deftypefun + +@deftypefun {tid_t} thread_tid (void) +Returns the running thread's thread id. Equivalent to +@code{thread_current ()->tid}. 
+@end deftypefun + +@deftypefun {const char *} thread_name (void) +Returns the running thread's name. Equivalent to @code{thread_current +()->name}. +@end deftypefun + +@deftypefun void thread_exit (void) @code{NO_RETURN} +Causes the current thread to exit. Never returns, hence +@code{NO_RETURN} (@pxref{Function and Parameter Attributes}). +@end deftypefun + +@deftypefun void thread_yield (void) +Yields the CPU to the scheduler, which picks a new thread to run. The +new thread might be the current thread, so you can't depend on this +function to keep this thread from running for any particular length of +time. +@end deftypefun + +@deftypefun void thread_foreach (thread_action_func *@var{action}, void *@var{aux}) +Iterates over all threads @var{t} and invokes @code{action(t, aux)} on each. +@var{action} must refer to a function that matches the signature +given by @func{thread_action_func}: + +@deftp {Type} {void thread_action_func (struct thread *@var{thread}, void *@var{aux})} +Performs some action on a thread, given @var{aux}. +@end deftp +@end deftypefun + +@deftypefun int thread_get_priority (void) +@deftypefunx void thread_set_priority (int @var{new_priority}) +Stub to set and get thread priority. @xref{Priority Scheduling}. +@end deftypefun + +@deftypefun int thread_get_nice (void) +@deftypefunx void thread_set_nice (int @var{new_nice}) +@deftypefunx int thread_get_recent_cpu (void) +@deftypefunx int thread_get_load_avg (void) +Stubs for the advanced scheduler. @xref{4.4BSD Scheduler}. +@end deftypefun + +@node Thread Switching +@subsection Thread Switching + +@func{schedule} is responsible for switching threads. It +is internal to @file{threads/thread.c} and called only by the three +public thread functions that need to switch threads: +@func{thread_block}, @func{thread_exit}, and @func{thread_yield}. 
+Before any of these functions call @func{schedule}, they disable +interrupts (or ensure that they are already disabled) and then change +the running thread's state to something other than running. + +@func{schedule} is short but tricky. It records the +current thread in local variable @var{cur}, determines the next thread +to run as local variable @var{next} (by calling +@func{next_thread_to_run}), and then calls @func{switch_threads} to do +the actual thread switch. The thread we switched to was also running +inside @func{switch_threads}, as are all the threads not currently +running, so the new thread now returns out of +@func{switch_threads}, returning the previously running thread. + +@func{switch_threads} is an assembly language routine in +@file{threads/switch.S}. It saves registers on the stack, saves the +CPU's current stack pointer in the current @struct{thread}'s @code{stack} +member, restores the new thread's @code{stack} into the CPU's stack +pointer, restores registers from the stack, and returns. + +The rest of the scheduler is implemented in @func{thread_schedule_tail}. It +marks the new thread as running. If the thread we just switched from +is in the dying state, then it also frees the page that contained the +dying thread's @struct{thread} and stack. These couldn't be freed +prior to the thread switch because the switch needed to use it. + +Running a thread for the first time is a special case. When +@func{thread_create} creates a new thread, it goes through a fair +amount of trouble to get it started properly. In particular, the new +thread hasn't started running yet, so there's no way for it to be +running inside @func{switch_threads} as the scheduler expects. To +solve the problem, @func{thread_create} creates some fake stack frames +in the new thread's stack: + +@itemize @bullet +@item +The topmost fake stack frame is for @func{switch_threads}, represented +by @struct{switch_threads_frame}. 
The important part of this frame is +its @code{eip} member, the return address. We point @code{eip} to +@func{switch_entry}, indicating it to be the function that called +@func{switch_threads}. + +@item +The next fake stack frame is for @func{switch_entry}, an assembly +language routine in @file{threads/switch.S} that adjusts the stack +pointer,@footnote{This is because @func{switch_threads} takes +arguments on the stack and the 80@var{x}86 SVR4 calling convention +requires the caller, not the called function, to remove them when the +call is complete. See @bibref{SysV-i386} chapter 3 for details.} +calls @func{thread_schedule_tail} (this special case is why +@func{thread_schedule_tail} is separate from @func{schedule}), and returns. +We fill in its stack frame so that it returns into +@func{kernel_thread}, a function in @file{threads/thread.c}. + +@item +The final stack frame is for @func{kernel_thread}, which enables +interrupts and calls the thread's function (the function passed to +@func{thread_create}). If the thread's function returns, it calls +@func{thread_exit} to terminate the thread. +@end itemize + +@node Synchronization +@section Synchronization + +If sharing of resources between threads is not handled in a careful, +controlled fashion, the result is usually a big mess. +This is especially the case in operating system kernels, where +faulty sharing can crash the entire machine. Pintos provides several +synchronization primitives to help out. + +@menu +* Disabling Interrupts:: +* Semaphores:: +* Locks:: +* Monitors:: +* Optimization Barriers:: +@end menu + +@node Disabling Interrupts +@subsection Disabling Interrupts + +The crudest way to do synchronization is to disable interrupts, that +is, to temporarily prevent the CPU from responding to interrupts. If +interrupts are off, no other thread will preempt the running thread, +because thread preemption is driven by the timer interrupt. 
If +interrupts are on, as they normally are, then the running thread may +be preempted by another at any time, whether between two C statements +or even within the execution of one. + +Incidentally, this means that Pintos is a ``preemptible kernel,'' that +is, kernel threads can be preempted at any time. Traditional Unix +systems are ``nonpreemptible,'' that is, kernel threads can only be +preempted at points where they explicitly call into the scheduler. +(User programs can be preempted at any time in both models.) As you +might imagine, preemptible kernels require more explicit +synchronization. + +You should have little need to set the interrupt state directly. Most +of the time you should use the other synchronization primitives +described in the following sections. The main reason to disable +interrupts is to synchronize kernel threads with external interrupt +handlers, which cannot sleep and thus cannot use most other forms of +synchronization (@pxref{External Interrupt Handling}). + +Some external interrupts cannot be postponed, even by disabling +interrupts. These interrupts, called @dfn{non-maskable interrupts} +(NMIs), are supposed to be used only in emergencies, e.g.@: when the +computer is on fire. Pintos does not handle non-maskable interrupts. + +Types and functions for disabling and enabling interrupts are in +@file{threads/interrupt.h}. + +@deftp Type {enum intr_level} +One of @code{INTR_OFF} or @code{INTR_ON}, denoting that interrupts are +disabled or enabled, respectively. +@end deftp + +@deftypefun {enum intr_level} intr_get_level (void) +Returns the current interrupt state. +@end deftypefun + +@deftypefun {enum intr_level} intr_set_level (enum intr_level @var{level}) +Turns interrupts on or off according to @var{level}. Returns the +previous interrupt state. +@end deftypefun + +@deftypefun {enum intr_level} intr_enable (void) +Turns interrupts on. Returns the previous interrupt state. 
+@end deftypefun + +@deftypefun {enum intr_level} intr_disable (void) +Turns interrupts off. Returns the previous interrupt state. +@end deftypefun + +@node Semaphores +@subsection Semaphores + +A @dfn{semaphore} is a nonnegative integer together with two operators +that manipulate it atomically, which are: + +@itemize @bullet +@item +``Down'' or ``P'': wait for the value to become positive, then +decrement it. + +@item +``Up'' or ``V'': increment the value (and wake up one waiting thread, +if any). +@end itemize + +A semaphore initialized to 0 may be used to wait for an event +that will happen exactly once. For example, suppose thread @var{A} +starts another thread @var{B} and wants to wait for @var{B} to signal +that some activity is complete. @var{A} can create a semaphore +initialized to 0, pass it to @var{B} as it starts it, and then +``down'' the semaphore. When @var{B} finishes its activity, it +``ups'' the semaphore. This works regardless of whether @var{A} +``downs'' the semaphore or @var{B} ``ups'' it first. + +A semaphore initialized to 1 is typically used for controlling access +to a resource. Before a block of code starts using the resource, it +``downs'' the semaphore, then after it is done with the resource it +``ups'' the semaphore. In such a case a lock, described below, may be +more appropriate. + +Semaphores can also be initialized to values larger than 1. These are +rarely used. + +Semaphores were invented by Edsger Dijkstra and first used in the THE +operating system (@bibref{Dijkstra}). + +Pintos' semaphore type and operations are declared in +@file{threads/synch.h}. + +@deftp {Type} {struct semaphore} +Represents a semaphore. +@end deftp + +@deftypefun void sema_init (struct semaphore *@var{sema}, unsigned @var{value}) +Initializes @var{sema} as a new semaphore with the given initial +@var{value}.
+@end deftypefun + +@deftypefun void sema_down (struct semaphore *@var{sema}) +Executes the ``down'' or ``P'' operation on @var{sema}, waiting for +its value to become positive and then decrementing it by one. +@end deftypefun + +@deftypefun bool sema_try_down (struct semaphore *@var{sema}) +Tries to execute the ``down'' or ``P'' operation on @var{sema}, +without waiting. Returns true if @var{sema} +was successfully decremented, or false if it was already +zero and thus could not be decremented without waiting. Calling this +function in a +tight loop wastes CPU time, so use @func{sema_down} or find a +different approach instead. +@end deftypefun + +@deftypefun void sema_up (struct semaphore *@var{sema}) +Executes the ``up'' or ``V'' operation on @var{sema}, +incrementing its value. If any threads are waiting on +@var{sema}, wakes one of them up. + +Unlike most synchronization primitives, @func{sema_up} may be called +inside an external interrupt handler (@pxref{External Interrupt +Handling}). +@end deftypefun + +Semaphores are internally built out of disabling interrupts +(@pxref{Disabling Interrupts}) and thread blocking and unblocking +(@func{thread_block} and @func{thread_unblock}). Each semaphore maintains +a list of waiting threads, using the linked list +implementation in @file{lib/kernel/list.c}. + +@node Locks +@subsection Locks + +A @dfn{lock} is like a semaphore with an initial value of 1 +(@pxref{Semaphores}). A lock's equivalent of ``up'' is called +``release'', and the ``down'' operation is called ``acquire''. + +Compared to a semaphore, a lock has one added restriction: only the +thread that acquires a lock, called the lock's ``owner'', is allowed to +release it. If this restriction is a problem, it's a good sign that a +semaphore should be used instead of a lock. + +Locks in Pintos are not ``recursive,'' that is, it is an error for the +thread currently holding a lock to try to acquire that lock.
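The ownership and non-recursion rules can be illustrated with a small
user-space model. This is only a sketch under stated assumptions, not the
Pintos implementation: every @code{model_} name, the integer thread IDs,
and the @code{current} variable are hypothetical, and the model returns
false in the places where a real kernel would block or treat the call as
an error.

```c
#include <stdbool.h>

/* Hypothetical single-threaded model of a lock layered on a semaphore
   whose initial value is 1.  Thread identity is just an integer. */
struct model_sema { unsigned value; };
struct model_lock { int holder; struct model_sema sema; }; /* holder 0: none */

int current = 1;          /* ID of the "running" thread (assumption). */

/* "Down" without waiting: succeeds only if the value is positive. */
bool model_sema_try_down (struct model_sema *s) {
  if (s->value == 0)
    return false;
  s->value--;
  return true;
}

/* "Up": increments the value. */
void model_sema_up (struct model_sema *s) { s->value++; }

void model_lock_init (struct model_lock *l) {
  l->holder = 0;
  l->sema.value = 1;
}

bool model_lock_held_by_current (const struct model_lock *l) {
  return l->holder == current;
}

/* Fails if the lock is owned by anyone, including the caller itself,
   which is what "not recursive" means in this model. */
bool model_lock_try_acquire (struct model_lock *l) {
  if (!model_sema_try_down (&l->sema))
    return false;
  l->holder = current;
  return true;
}

/* Only the owner may release; any other caller is refused. */
bool model_lock_release (struct model_lock *l) {
  if (!model_lock_held_by_current (l))
    return false;
  l->holder = 0;
  model_sema_up (&l->sema);
  return true;
}
```

In this model a second acquire by the owner fails exactly like an acquire
by any other thread, and a release by a non-owner is refused. A plain
semaphore has neither restriction, which is why it is the better fit when
one thread must ``up'' what another thread ``downed''.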
+ +Lock types and functions are declared in @file{threads/synch.h}. + +@deftp {Type} {struct lock} +Represents a lock. +@end deftp + +@deftypefun void lock_init (struct lock *@var{lock}) +Initializes @var{lock} as a new lock. +The lock is not initially owned by any thread. +@end deftypefun + +@deftypefun void lock_acquire (struct lock *@var{lock}) +Acquires @var{lock} for the current thread, first waiting for +any current owner to release it if necessary. +@end deftypefun + +@deftypefun bool lock_try_acquire (struct lock *@var{lock}) +Tries to acquire @var{lock} for use by the current thread, without +waiting. Returns true if successful, false if the lock is already +owned. Calling this function in a tight loop is a bad idea because it +wastes CPU time, so use @func{lock_acquire} instead. +@end deftypefun + +@deftypefun void lock_release (struct lock *@var{lock}) +Releases @var{lock}, which the current thread must own. +@end deftypefun + +@deftypefun bool lock_held_by_current_thread (const struct lock *@var{lock}) +Returns true if the running thread owns @var{lock}, +false otherwise. +There is no function to test whether an arbitrary thread owns a lock, +because the answer could change before the caller could act on it. +@end deftypefun + +@node Monitors +@subsection Monitors + +A @dfn{monitor} is a higher-level form of synchronization than a +semaphore or a lock. A monitor consists of data being synchronized, +plus a lock, called the @dfn{monitor lock}, and one or more +@dfn{condition variables}. Before it accesses the protected data, a +thread first acquires the monitor lock. It is then said to be ``in the +monitor''. While in the monitor, the thread has control over all the +protected data, which it may freely examine or modify. When access to +the protected data is complete, it releases the monitor lock. + +Condition variables allow code in the monitor to wait for a condition to +become true. 
Each condition variable is associated with an abstract +condition, e.g.@: ``some data has arrived for processing'' or ``over 10 +seconds has passed since the user's last keystroke''. When code in the +monitor needs to wait for a condition to become true, it ``waits'' on +the associated condition variable, which releases the lock and waits for +the condition to be signaled. If, on the other hand, it has caused one +of these conditions to become true, it ``signals'' the condition to wake +up one waiter, or ``broadcasts'' the condition to wake all of them. + +The theoretical framework for monitors was laid out by C.@: A.@: R.@: +Hoare (@bibref{Hoare}). Their practical usage was later elaborated in a +paper on the Mesa operating system (@bibref{Lampson}). + +Condition variable types and functions are declared in +@file{threads/synch.h}. + +@deftp {Type} {struct condition} +Represents a condition variable. +@end deftp + +@deftypefun void cond_init (struct condition *@var{cond}) +Initializes @var{cond} as a new condition variable. +@end deftypefun + +@deftypefun void cond_wait (struct condition *@var{cond}, struct lock *@var{lock}) +Atomically releases @var{lock} (the monitor lock) and waits for +@var{cond} to be signaled by some other piece of code. After +@var{cond} is signaled, reacquires @var{lock} before returning. +@var{lock} must be held before calling this function. + +Sending a signal and waking up from a wait are not an atomic operation. +Thus, typically @func{cond_wait}'s caller must recheck the condition +after the wait completes and, if necessary, wait again. See the next +section for an example. +@end deftypefun + +@deftypefun void cond_signal (struct condition *@var{cond}, struct lock *@var{lock}) +If any threads are waiting on @var{cond} (protected by monitor lock +@var{lock}), then this function wakes up one of them. If no threads are +waiting, returns without performing any action. +@var{lock} must be held before calling this function. 
+@end deftypefun + +@deftypefun void cond_broadcast (struct condition *@var{cond}, struct lock *@var{lock}) +Wakes up all threads, if any, waiting on @var{cond} (protected by +monitor lock @var{lock}). @var{lock} must be held before calling this +function. +@end deftypefun + +@subsubsection Monitor Example + +The classical example of a monitor is handling a buffer into which one +or more +``producer'' threads write characters and out of which one or more +``consumer'' threads read characters. To implement this we need, +besides the monitor lock, two condition variables which we will call +@var{not_full} and @var{not_empty}: + +@example +char buf[BUF_SIZE]; /* @r{Buffer.} */ +size_t n = 0; /* @r{0 <= n <= @var{BUF_SIZE}: # of characters in buffer.} */ +size_t head = 0; /* @r{@var{buf} index of next char to write (mod @var{BUF_SIZE}).} */ +size_t tail = 0; /* @r{@var{buf} index of next char to read (mod @var{BUF_SIZE}).} */ +struct lock lock; /* @r{Monitor lock.} */ +struct condition not_empty; /* @r{Signaled when the buffer is not empty.} */ +struct condition not_full; /* @r{Signaled when the buffer is not full.} */ + +@dots{}@r{initialize the lock and condition variables}@dots{} + +void put (char ch) @{ + lock_acquire (&lock); + while (n == BUF_SIZE) /* @r{Can't add to @var{buf} as long as it's full.} */ + cond_wait (&not_full, &lock); + buf[head++ % BUF_SIZE] = ch; /* @r{Add @var{ch} to @var{buf}.} */ + n++; + cond_signal (&not_empty, &lock); /* @r{@var{buf} can't be empty anymore.} */ + lock_release (&lock); +@} + +char get (void) @{ + char ch; + lock_acquire (&lock); + while (n == 0) /* @r{Can't read @var{buf} as long as it's empty.} */ + cond_wait (&not_empty, &lock); + ch = buf[tail++ % BUF_SIZE]; /* @r{Get @var{ch} from @var{buf}.} */ + n--; + cond_signal (&not_full, &lock); /* @r{@var{buf} can't be full anymore.} */ + lock_release (&lock); + return ch; /* @r{Hand the character back to the consumer.} */ +@} +@end example + +Note that @code{BUF_SIZE} must divide evenly into @code{SIZE_MAX + 1} +for the above code to be completely
correct. Otherwise, it will fail +the first time @code{head} wraps around to 0. In practice, +@code{BUF_SIZE} would ordinarily be a power of 2. + +@node Optimization Barriers +@subsection Optimization Barriers + +@c We should try to come up with a better example. +@c Perhaps something with a linked list? + +An @dfn{optimization barrier} is a special statement that prevents the +compiler from making assumptions about the state of memory across the +barrier. The compiler will not reorder reads or writes of variables +across the barrier or assume that a variable's value is unmodified +across the barrier, except for local variables whose address is never +taken. In Pintos, @file{threads/synch.h} defines the @code{barrier()} +macro as an optimization barrier. + +One reason to use an optimization barrier is when data can change +asynchronously, without the compiler's knowledge, e.g.@: by another +thread or an interrupt handler. The @func{too_many_loops} function in +@file{devices/timer.c} is an example. This function starts out by +busy-waiting in a loop until a timer tick occurs: + +@example +/* Wait for a timer tick. */ +int64_t start = ticks; +while (ticks == start) + barrier (); +@end example + +@noindent +Without an optimization barrier in the loop, the compiler could +conclude that the loop would never terminate, because @code{start} and +@code{ticks} start out equal and the loop itself never changes them. +It could then ``optimize'' the function into an infinite loop, which +would definitely be undesirable. + +Optimization barriers can be used to avoid other compiler +optimizations. The @func{busy_wait} function, also in +@file{devices/timer.c}, is an example. It contains this loop: + +@example +while (loops-- > 0) + barrier (); +@end example + +@noindent +The goal of this loop is to busy-wait by counting @code{loops} down +from its original value to 0. 
Without the barrier, the compiler could +delete the loop entirely, because it produces no useful output and has +no side effects. The barrier forces the compiler to pretend that the +loop body has an important effect. + +Finally, optimization barriers can be used to force the ordering of +memory reads or writes. For example, suppose we add a ``feature'' +that, whenever a timer interrupt occurs, the character in global +variable @code{timer_put_char} is printed on the console, but only if +global Boolean variable @code{timer_do_put} is true. The best way to +set up @samp{x} to be printed is then to use an optimization barrier, +like this: + +@example +timer_put_char = 'x'; +barrier (); +timer_do_put = true; +@end example + +Without the barrier, the code is buggy because the compiler is free to +reorder operations when it doesn't see a reason to keep them in the +same order. In this case, the compiler doesn't know that the order of +assignments is important, so its optimizer is permitted to exchange +their order. There's no telling whether it will actually do this, and +it is possible that passing the compiler different optimization flags +or using a different version of the compiler will produce different +behavior. + +Another solution is to disable interrupts around the assignments. +This does not prevent reordering, but it prevents the interrupt +handler from intervening between the assignments. It also has the +extra runtime cost of disabling and re-enabling interrupts: + +@example +enum intr_level old_level = intr_disable (); +timer_put_char = 'x'; +timer_do_put = true; +intr_set_level (old_level); +@end example + +A second solution is to mark the declarations of +@code{timer_put_char} and @code{timer_do_put} as @samp{volatile}. This +keyword tells the compiler that the variables are externally observable +and restricts its latitude for optimization. However, the semantics of +@samp{volatile} are not well-defined, so it is not a good general +solution. 
The base Pintos code does not use @samp{volatile} at all. + +The following is @emph{not} a solution, because locks neither prevent +interrupts nor prevent the compiler from reordering the code within the +region where the lock is held: + +@example +lock_acquire (&timer_lock); /* INCORRECT CODE */ +timer_put_char = 'x'; +timer_do_put = true; +lock_release (&timer_lock); +@end example + +The compiler treats invocation of any function defined externally, +that is, in another source file, as a limited form of optimization +barrier. Specifically, the compiler assumes that any externally +defined function may access any statically or dynamically allocated +data and any local variable whose address is taken. This often means +that explicit barriers can be omitted. It is one reason that Pintos +contains few explicit barriers. + +A function defined in the same source file, or in a header included by +the source file, cannot be relied upon as an optimization barrier. +This applies even to invocation of a function before its +definition, because the compiler may read and parse the entire source +file before performing optimization. + +@node Interrupt Handling +@section Interrupt Handling + +An @dfn{interrupt} notifies the CPU of some event. Much of the work +of an operating system relates to interrupts in one way or another. +For our purposes, we classify interrupts into two broad categories: + +@itemize @bullet +@item +@dfn{Internal interrupts}, that is, interrupts caused directly by CPU +instructions. System calls, attempts at invalid memory access +(@dfn{page faults}), and attempts to divide by zero are some activities +that cause internal interrupts. Because they are caused by CPU +instructions, internal interrupts are @dfn{synchronous} or synchronized +with CPU instructions. @func{intr_disable} does not disable internal +interrupts. + +@item +@dfn{External interrupts}, that is, interrupts originating outside the +CPU.
These interrupts come from hardware devices such as the system +timer, keyboard, serial ports, and disks. External interrupts are +@dfn{asynchronous}, meaning that their delivery is not +synchronized with instruction execution. Handling of external interrupts +can be postponed with @func{intr_disable} and related functions +(@pxref{Disabling Interrupts}). +@end itemize + +The CPU treats both classes of interrupts largely the same way, +so Pintos has common infrastructure to handle both classes. +The following section describes this +common infrastructure. The sections after that give the specifics of +external and internal interrupts. + +If you haven't already read chapter 3, ``Basic Execution Environment,'' +in @bibref{IA32-v1}, it is recommended that you do so now. You might +also want to skim chapter 5, ``Interrupt and Exception Handling,'' in +@bibref{IA32-v3a}. + +@menu +* Interrupt Infrastructure:: +* Internal Interrupt Handling:: +* External Interrupt Handling:: +@end menu + +@node Interrupt Infrastructure +@subsection Interrupt Infrastructure + +When an interrupt occurs, the CPU saves +its most essential state on a stack and jumps to an interrupt +handler routine. The 80@var{x}86 architecture supports 256 +interrupts, numbered 0 through 255, each with an independent +handler defined in an array called the @dfn{interrupt +descriptor table} or IDT. + +In Pintos, @func{intr_init} in @file{threads/interrupt.c} sets up the +IDT so that each entry points to a unique entry point in +@file{threads/intr-stubs.S} named @func{intr@var{NN}_stub}, where +@var{NN} is the interrupt number in +hexadecimal. Because the CPU doesn't give +us any other way to find out the interrupt number, this entry point +pushes the interrupt number on the stack. Then it jumps to +@func{intr_entry}, which pushes all the registers that the processor +didn't already push for us, and then calls @func{intr_handler}, which +brings us back into C in @file{threads/interrupt.c}. 
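Once control reaches @func{intr_handler}, the interrupt number pushed by
the stub is what selects the code to run. The following is a minimal
user-space sketch of such a table lookup, not the code in
@file{threads/interrupt.c}: @code{model_frame}, @code{model_register},
@code{model_dispatch}, and the handler table are hypothetical names, the
vector @t{0x20} is an arbitrary choice for the demonstration, and the
model returns false where a kernel would have to treat the interrupt as
unexpected.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_INTR_CNT 256      /* The 80x86 has 256 interrupt vectors. */

/* Hypothetical stand-in for the saved frame: only the vector number. */
struct model_frame { uint32_t vec_no; };

typedef void model_handler_func (struct model_frame *);

/* One slot per vector; a global array starts out all null. */
model_handler_func *model_handlers[MODEL_INTR_CNT];

int model_ticks;                /* Counts calls to the demo handler. */

/* Records HANDLER as the function to call for vector VEC. */
void model_register (uint8_t vec, model_handler_func *handler) {
  model_handlers[vec] = handler;
}

/* Looks up the handler for the vector in F and calls it.
   Returns false if the vector is out of range or unregistered. */
bool model_dispatch (struct model_frame *f) {
  model_handler_func *h;
  if (f->vec_no >= MODEL_INTR_CNT)
    return false;
  h = model_handlers[f->vec_no];
  if (h == NULL)
    return false;
  h (f);
  return true;
}

/* Demo handler: just counts how often it ran. */
void model_timer_handler (struct model_frame *f) {
  (void) f;
  model_ticks++;
}
```

Registering @code{model_timer_handler} for vector @t{0x20} and then
dispatching a frame whose @code{vec_no} is @t{0x20} increments
@code{model_ticks} once, while a frame for any unregistered vector is
rejected.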
+ +The main job of @func{intr_handler} is to call the function +registered for handling the particular interrupt. (If no +function is registered, it dumps some information to the console and +panics.) It also does some extra processing for external +interrupts (@pxref{External Interrupt Handling}). + +When @func{intr_handler} returns, the assembly code in +@file{threads/intr-stubs.S} restores all the CPU registers saved +earlier and directs the CPU to return from the interrupt. + +The following types and functions are common to all +interrupts. + +@deftp {Type} {void intr_handler_func (struct intr_frame *@var{frame})} +This is how an interrupt handler function must be declared. Its @var{frame} +argument (see below) allows it to determine the cause of the interrupt +and the state of the thread that was interrupted. +@end deftp + +@deftp {Type} {struct intr_frame} +The stack frame of an interrupt handler, as saved by the CPU, the interrupt +stubs, and @func{intr_entry}. Its most interesting members are described +below. +@end deftp + +@deftypecv {Member} {@struct{intr_frame}} uint32_t edi +@deftypecvx {Member} {@struct{intr_frame}} uint32_t esi +@deftypecvx {Member} {@struct{intr_frame}} uint32_t ebp +@deftypecvx {Member} {@struct{intr_frame}} uint32_t esp_dummy +@deftypecvx {Member} {@struct{intr_frame}} uint32_t ebx +@deftypecvx {Member} {@struct{intr_frame}} uint32_t edx +@deftypecvx {Member} {@struct{intr_frame}} uint32_t ecx +@deftypecvx {Member} {@struct{intr_frame}} uint32_t eax +@deftypecvx {Member} {@struct{intr_frame}} uint16_t es +@deftypecvx {Member} {@struct{intr_frame}} uint16_t ds +Register values in the interrupted thread, pushed by @func{intr_entry}. +The @code{esp_dummy} value isn't actually used (refer to the +description of @code{PUSHA} in @bibref{IA32-v2b} for details). +@end deftypecv + +@deftypecv {Member} {@struct{intr_frame}} uint32_t vec_no +The interrupt vector number, ranging from 0 to 255. 
+@end deftypecv + +@deftypecv {Member} {@struct{intr_frame}} uint32_t error_code +The ``error code'' pushed on the stack by the CPU for some internal +interrupts. +@end deftypecv + +@deftypecv {Member} {@struct{intr_frame}} void (*eip) (void) +The address of the next instruction to be executed by the interrupted +thread. +@end deftypecv + +@deftypecv {Member} {@struct{intr_frame}} {void *} esp +The interrupted thread's stack pointer. +@end deftypecv + +@deftypefun {const char *} intr_name (uint8_t @var{vec}) +Returns the name of the interrupt numbered @var{vec}, or +@code{"unknown"} if the interrupt has no registered name. +@end deftypefun + +@node Internal Interrupt Handling +@subsection Internal Interrupt Handling + +Internal interrupts are caused directly by CPU instructions executed by +the running kernel thread or user process (from task 2 onward). An +internal interrupt is therefore said to arise in a ``process context.'' + +In an internal interrupt's handler, it can make sense to examine the +@struct{intr_frame} passed to the interrupt handler, or even to modify +it. When the interrupt returns, modifications in @struct{intr_frame} +become changes to the calling thread or process's state. For example, +the Pintos system call handler returns a value to the user program by +modifying the saved EAX register (@pxref{System Call Details}). + +There are no special restrictions on what an internal interrupt +handler can or can't do. Generally they should run with interrupts +enabled, just like other code, and so they can be preempted by other +kernel threads. Thus, they do need to synchronize with other threads +on shared data and other resources (@pxref{Synchronization}). + +Internal interrupt handlers can be invoked recursively. For example, +the system call handler might cause a page fault while attempting to +read user memory. Deep recursion would risk overflowing the limited +kernel stack (@pxref{struct thread}), but should be unnecessary. 
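Returning a value through the saved frame, as described above for the
system call handler, can be sketched as follows. This is a stand-alone
model rather than Pintos code: @code{model_intr_frame} keeps only two
members, @code{model_handler} is a hypothetical name, and the convention
that the handler reads one 32-bit input word at the saved stack pointer
and ``returns'' twice that value is an assumption chosen purely for
illustration.

```c
#include <stdint.h>

/* Hypothetical cut-down frame: just the saved EAX and stack pointer. */
struct model_intr_frame {
  uint32_t eax;     /* Saved EAX of the interrupted code. */
  void *esp;        /* Saved stack pointer of the interrupted code. */
};

/* Model handler.  It only reads the interrupted code's stack, but it
   overwrites the saved EAX.  When the frame is restored on return from
   the interrupt, the interrupted code finds the new value in EAX. */
void model_handler (struct model_intr_frame *f) {
  const uint32_t *stack = f->esp;   /* Input word at the saved esp. */
  f->eax = stack[0] * 2;            /* The "return value". */
}
```

The handler never returns a value in the C sense. Its only output is the
modified frame, which is why the same technique would be dangerous in an
external interrupt handler, where the frame belongs to whichever thread
happened to be running.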
+ +@deftypefun void intr_register_int (uint8_t @var{vec}, int @var{dpl}, enum intr_level @var{level}, intr_handler_func *@var{handler}, const char *@var{name}) +Registers @var{handler} to be called when internal interrupt numbered +@var{vec} is triggered. Names the interrupt @var{name} for debugging +purposes. + +If @var{level} is @code{INTR_ON}, external interrupts will be processed +normally during the interrupt handler's execution, which is normally +desirable. Specifying @code{INTR_OFF} will cause the CPU to disable +external interrupts when it invokes the interrupt handler. The effect +is slightly different from calling @func{intr_disable} inside the +handler, because that leaves a window of one or more CPU instructions in +which external interrupts are still enabled. This is important for the +page fault handler; refer to the comments in @file{userprog/exception.c} +for details. + +@var{dpl} determines how the interrupt can be invoked. If @var{dpl} is +0, then the interrupt can be invoked only by kernel threads. Otherwise +@var{dpl} should be 3, which allows user processes to invoke the +interrupt with an explicit INT instruction. The value of @var{dpl} +doesn't affect user processes' ability to invoke the interrupt +indirectly, e.g.@: an invalid memory reference will cause a page fault +regardless of @var{dpl}. +@end deftypefun + +@node External Interrupt Handling +@subsection External Interrupt Handling + +External interrupts are caused by events outside the CPU. +They are asynchronous, so they can be invoked at any time that +interrupts have not been disabled. We say that an external interrupt +runs in an ``interrupt context.'' + +In an external interrupt, the @struct{intr_frame} passed to the +handler is not very meaningful. It describes the state of the thread +or process that was interrupted, but there is no way to predict which +one that is. It is possible, although rarely useful, to examine it, but +modifying it is a recipe for disaster. 
+ +Only one external interrupt may be processed at a time. Neither +internal nor external interrupts may nest within an external interrupt +handler. Thus, an external interrupt's handler must run with interrupts +disabled (@pxref{Disabling Interrupts}). + +An external interrupt handler must not sleep or yield, which rules out +calling @func{lock_acquire}, @func{thread_yield}, and many other +functions. Sleeping in interrupt context would effectively put the +interrupted thread to sleep, too, until the interrupt handler was again +scheduled and returned. This would be unfair to the unlucky thread, and +it would deadlock if the handler were waiting for the sleeping thread +to, e.g., release a lock. + +An external interrupt handler +effectively monopolizes the machine and delays all other activities. +Therefore, external interrupt handlers should complete as quickly as +they can. Anything that requires much CPU time should instead run in a +kernel thread, possibly one that the interrupt triggers using a +synchronization primitive. + +External interrupts are controlled by a +pair of devices outside the CPU called @dfn{programmable interrupt +controllers}, @dfn{PICs} for short. When @func{intr_init} sets up the +CPU's IDT, it also initializes the PICs for interrupt handling. The +PICs also must be ``acknowledged'' at the end of processing for each +external interrupt. @func{intr_handler} takes care of that by calling +@func{pic_end_of_interrupt}, which properly signals the PICs. + +The following functions relate to external +interrupts. + +@deftypefun void intr_register_ext (uint8_t @var{vec}, intr_handler_func *@var{handler}, const char *@var{name}) +Registers @var{handler} to be called when external interrupt numbered +@var{vec} is triggered. Names the interrupt @var{name} for debugging +purposes. The handler will run with interrupts disabled.
+@end deftypefun + +@deftypefun bool intr_context (void) +Returns true if we are running in an interrupt context, otherwise +false. Mainly used in functions that might sleep +or that otherwise should not be called from interrupt context, in this +form: +@example +ASSERT (!intr_context ()); +@end example +@end deftypefun + +@deftypefun void intr_yield_on_return (void) +When called in an interrupt context, causes @func{thread_yield} to be +called just before the interrupt returns. Used +in the timer interrupt handler when a thread's time slice expires, to +cause a new thread to be scheduled. +@end deftypefun + +@node Memory Allocation +@section Memory Allocation + +Pintos contains two memory allocators, one that allocates memory in +units of a page, and one that can allocate blocks of any size. + +@menu +* Page Allocator:: +* Block Allocator:: +@end menu + +@node Page Allocator +@subsection Page Allocator + +The page allocator declared in @file{threads/palloc.h} allocates +memory in units of a page. It is most often used to allocate memory +one page at a time, but it can also allocate multiple contiguous pages +at once. + +The page allocator divides the memory it allocates into two pools, +called the kernel and user pools. By default, each pool gets half of +system memory above @w{1 MB}, but the division can be changed with the +@option{-ul} kernel +command line +option (@pxref{Why PAL_USER?}). An allocation request draws from one +pool or the other. If one pool becomes empty, the other may still +have free pages. The user pool should be used for allocating memory +for user processes and the kernel pool for all other allocations. +This will only become important starting with task 3. Until then, +all allocations should be made from the kernel pool. + +Each pool's usage is tracked with a bitmap, one bit per page in +the pool. 
A request to allocate @var{n} pages scans the bitmap +for @var{n} consecutive bits set to +false, indicating that those pages are free, and then sets those bits +to true to mark them as used. This is a ``first fit'' allocation +strategy (@pxref{Wilson}). + +The page allocator is subject to fragmentation. That is, it may not +be possible to allocate @var{n} contiguous pages even though @var{n} +or more pages are free, because the free pages are separated by used +pages. In fact, in pathological cases it may be impossible to +allocate 2 contiguous pages even though half of the pool's pages are free. +Single-page requests can't fail due to fragmentation, so +requests for multiple contiguous pages should be limited as much as +possible. + +Pages may not be allocated from interrupt context, but they may be +freed. + +When a page is freed, all of its bytes are cleared to @t{0xcc}, as +a debugging aid (@pxref{Debugging Tips}). + +Page allocator types and functions are described below. + +@deftypefun {void *} palloc_get_page (enum palloc_flags @var{flags}) +@deftypefunx {void *} palloc_get_multiple (enum palloc_flags @var{flags}, size_t @var{page_cnt}) +Obtains and returns one page, or @var{page_cnt} contiguous pages, +respectively. Returns a null pointer if the pages cannot be allocated. + +The @var{flags} argument may be any combination of the following flags: + +@defvr {Page Allocator Flag} @code{PAL_ASSERT} +If the pages cannot be allocated, panic the kernel. This is only +appropriate during kernel initialization. User processes +should never be permitted to panic the kernel. +@end defvr + +@defvr {Page Allocator Flag} @code{PAL_ZERO} +Zero all the bytes in the allocated pages before returning them. If not +set, the contents of newly allocated pages are unpredictable. +@end defvr + +@defvr {Page Allocator Flag} @code{PAL_USER} +Obtain the pages from the user pool. If not set, pages are allocated +from the kernel pool. 
+@end defvr +@end deftypefun + +@deftypefun void palloc_free_page (void *@var{page}) +@deftypefunx void palloc_free_multiple (void *@var{pages}, size_t @var{page_cnt}) +Frees one page, or @var{page_cnt} contiguous pages, respectively, +starting at @var{pages}. All of the pages must have been obtained using +@func{palloc_get_page} or @func{palloc_get_multiple}. +@end deftypefun + +@node Block Allocator +@subsection Block Allocator + +The block allocator, declared in @file{threads/malloc.h}, can allocate +blocks of any size. It is layered on top of the page allocator +described in the previous section. Blocks returned by the block +allocator are obtained from the kernel pool. + +The block allocator uses two different strategies for allocating memory. +The first strategy applies to blocks that are 1 kB or smaller +(one-fourth of the page size). These allocations are rounded up to the +nearest power of 2, or 16 bytes, whichever is larger. Then they are +grouped into a page used only for allocations of that size. + +The second strategy applies to blocks larger than 1 kB. +These allocations (plus a small amount of overhead) are rounded up to +the nearest page in size, and then the block allocator requests that +number of contiguous pages from the page allocator. + +In either case, the difference between the requested allocation size +and the actual block size is wasted. A real operating system would +carefully tune its allocator to minimize this waste, but this is +unimportant in an instructional system like Pintos. + +As long as a page can be obtained from the page allocator, small +allocations always succeed. Most small allocations do not require a +new page from the page allocator at all, because they are satisfied +using part of a page already allocated. However, large allocations +always require calling into the page allocator, and any allocation +that needs more than one contiguous page can fail due to fragmentation, +as already discussed in the previous section.
Thus, you should +minimize the number of large allocations in your code, especially +those over approximately 4 kB each. + +When a block is freed, all of its bytes are cleared to @t{0xcc}, as +a debugging aid (@pxref{Debugging Tips}). + +The block allocator may not be called from interrupt context. + +The block allocator functions are described below. Their interfaces are +the same as the standard C library functions of the same names. + +@deftypefun {void *} malloc (size_t @var{size}) +Obtains and returns a new block, from the kernel pool, at least +@var{size} bytes long. Returns a null pointer if @var{size} is zero or +if memory is not available. +@end deftypefun + +@deftypefun {void *} calloc (size_t @var{a}, size_t @var{b}) +Obtains and returns a new block, from the kernel pool, at least +@code{@var{a} * @var{b}} bytes long. The block's contents will be +cleared to zeros. Returns a null pointer if @var{a} or @var{b} is zero +or if insufficient memory is available. +@end deftypefun + +@deftypefun {void *} realloc (void *@var{block}, size_t @var{new_size}) +Attempts to resize @var{block} to @var{new_size} bytes, possibly moving +it in the process. If successful, returns the new block, in which case +the old block must no longer be accessed. On failure, returns a null +pointer, and the old block remains valid. + +A call with @var{block} null is equivalent to @func{malloc}. A call +with @var{new_size} zero is equivalent to @func{free}. +@end deftypefun + +@deftypefun void free (void *@var{block}) +Frees @var{block}, which must have been previously returned by +@func{malloc}, @func{calloc}, or @func{realloc} (and not yet freed).
+@end deftypefun + +@node Virtual Addresses +@section Virtual Addresses + +A 32-bit virtual address can be divided into a 20-bit @dfn{page number} +and a 12-bit @dfn{page offset} (or just @dfn{offset}), like this: + +@example +@group + 31 12 11 0 + +-------------------+-----------+ + | Page Number | Offset | + +-------------------+-----------+ + Virtual Address +@end group +@end example + +Header @file{threads/vaddr.h} defines these functions and macros for +working with virtual addresses: + +@defmac PGSHIFT +@defmacx PGBITS +The bit index (0) and number of bits (12) of the offset part of a +virtual address, respectively. +@end defmac + +@defmac PGMASK +A bit mask with the bits in the page offset set to 1, the rest set to 0 +(@t{0xfff}). +@end defmac + +@defmac PGSIZE +The page size in bytes (4,096). +@end defmac + +@deftypefun unsigned pg_ofs (const void *@var{va}) +Extracts and returns the page offset in virtual address @var{va}. +@end deftypefun + +@deftypefun uintptr_t pg_no (const void *@var{va}) +Extracts and returns the page number in virtual address @var{va}. +@end deftypefun + +@deftypefun {void *} pg_round_down (const void *@var{va}) +Returns the start of the virtual page that @var{va} points within, that +is, @var{va} with the page offset set to 0. +@end deftypefun + +@deftypefun {void *} pg_round_up (const void *@var{va}) +Returns @var{va} rounded up to the nearest page boundary. +@end deftypefun + +Virtual memory in Pintos is divided into two regions: user virtual +memory and kernel virtual memory (@pxref{Virtual Memory Layout}). The +boundary between them is @code{PHYS_BASE}: + +@defmac PHYS_BASE +Base address of kernel virtual memory. It defaults to @t{0xc0000000} (3 +GB), but it may be changed to any multiple of @t{0x10000000} from +@t{0x80000000} to @t{0xf0000000}. + +User virtual memory ranges from virtual address 0 up to +@code{PHYS_BASE}. Kernel virtual memory occupies the rest of the +virtual address space, from @code{PHYS_BASE} up to 4 GB. 
+@end defmac + +@deftypefun {bool} is_user_vaddr (const void *@var{va}) +@deftypefunx {bool} is_kernel_vaddr (const void *@var{va}) +Returns true if @var{va} is a user or kernel virtual address, +respectively, false otherwise. +@end deftypefun + +The 80@var{x}86 doesn't provide any way to directly access memory given +a physical address. This ability is often necessary in an operating +system kernel, so Pintos works around it by mapping kernel virtual +memory one-to-one to physical memory. That is, virtual address +@code{PHYS_BASE} accesses physical address 0, virtual address +@code{PHYS_BASE} + @t{0x1234} accesses physical address @t{0x1234}, and +so on up to the size of the machine's physical memory. Thus, adding +@code{PHYS_BASE} to a physical address obtains a kernel virtual address +that accesses that address; conversely, subtracting @code{PHYS_BASE} +from a kernel virtual address obtains the corresponding physical +address. Header @file{threads/vaddr.h} provides a pair of functions to +do these translations: + +@deftypefun {void *} ptov (uintptr_t @var{pa}) +Returns the kernel virtual address corresponding to physical address +@var{pa}, which should be between 0 and the number of bytes of physical +memory. +@end deftypefun + +@deftypefun {uintptr_t} vtop (void *@var{va}) +Returns the physical address corresponding to @var{va}, which must be a +kernel virtual address. +@end deftypefun + +@node Page Table +@section Page Table + +The code in @file{pagedir.c} is an abstract interface to the 80@var{x}86 +hardware page table, also called a ``page directory'' by Intel processor +documentation. The page table interface uses a @code{uint32_t *} to +represent a page table because this is convenient for accessing their +internal structure. + +The sections below describe the page table interface and internals. 
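
The page table functions described below pass frames around as kernel virtual addresses, so the address arithmetic from the previous section is worth having at hand. The following is a minimal host-side sketch of that arithmetic, not the code in @file{threads/vaddr.h}: it assumes the default @code{PHYS_BASE} of @t{0xc0000000}, represents addresses as plain @code{uint32_t} values rather than pointers so that it also runs on a 64-bit host, and uses hypothetical @code{model_} names to avoid suggesting that these are the real Pintos signatures.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Host-side model only: addresses are 32-bit integers, not pointers.
   The constants mirror the values documented above. */
#define PGBITS 12u                 /* Number of offset bits. */
#define PGSIZE (1u << PGBITS)      /* 4,096 bytes per page. */
#define PGMASK (PGSIZE - 1u)       /* 0xfff. */
#define PHYS_BASE 0xc0000000u      /* Assumed default kernel base. */

/* Offset within the page, and page number, of virtual address VA. */
static uint32_t model_pg_ofs (uint32_t va) { return va & PGMASK; }
static uint32_t model_pg_no (uint32_t va) { return va >> PGBITS; }

/* Round VA down or up to a page boundary. */
static uint32_t model_pg_round_down (uint32_t va) { return va & ~PGMASK; }
static uint32_t model_pg_round_up (uint32_t va)
{
  return (va + PGSIZE - 1u) & ~PGMASK;
}

/* True if VA lies at or above PHYS_BASE. */
static bool model_is_kernel_vaddr (uint32_t va) { return va >= PHYS_BASE; }

/* Physical address -> kernel virtual address (one-to-one mapping). */
static uint32_t model_ptov (uint32_t pa)
{
  assert (pa <= UINT32_MAX - PHYS_BASE);   /* Must not wrap around. */
  return pa + PHYS_BASE;
}

/* Kernel virtual address -> physical address. */
static uint32_t model_vtop (uint32_t va)
{
  assert (model_is_kernel_vaddr (va));
  return va - PHYS_BASE;
}
```

With these definitions, physical address @t{0x1234} corresponds to kernel virtual address @t{0xc0001234}, matching the example given for @code{PHYS_BASE} above.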
+ +@menu +* Page Table Creation Destruction Activation:: +* Page Tables Inspection and Updates:: +* Page Table Accessed and Dirty Bits:: +* Page Table Details:: +@end menu + +@node Page Table Creation Destruction Activation +@subsection Creation, Destruction, and Activation + +These functions create, destroy, and activate page tables. The base +Pintos code already calls these functions where necessary, so it should +not be necessary to call them yourself. + +@deftypefun {uint32_t *} pagedir_create (void) +Creates and returns a new page table. The new page table contains +Pintos's normal kernel virtual page mappings, but no user virtual +mappings. + +Returns a null pointer if memory cannot be obtained. +@end deftypefun + +@deftypefun void pagedir_destroy (uint32_t *@var{pd}) +Frees all of the resources held by @var{pd}, including the page table +itself and the frames that it maps. +@end deftypefun + +@deftypefun void pagedir_activate (uint32_t *@var{pd}) +Activates @var{pd}. The active page table is the one used by the CPU to +translate memory references. +@end deftypefun + +@node Page Tables Inspection and Updates +@subsection Inspection and Updates + +These functions examine or update the mappings from pages to frames +encapsulated by a page table. They work on both active and inactive +page tables (that is, those for running and suspended processes), +flushing the TLB as necessary. + +@deftypefun bool pagedir_set_page (uint32_t *@var{pd}, void *@var{upage}, void *@var{kpage}, bool @var{writable}) +Adds to @var{pd} a mapping from user page @var{upage} to the frame identified +by kernel virtual address @var{kpage}. If @var{writable} is true, the +page is mapped read/write; otherwise, it is mapped read-only. + +User page @var{upage} must not already be mapped in @var{pd}. + +Kernel page @var{kpage} should be a kernel virtual address obtained from +the user pool with @code{palloc_get_page(PAL_USER)} (@pxref{Why +PAL_USER?}). 
+ +Returns true if successful, false on failure. Failure will occur if +additional memory required for the page table cannot be obtained. +@end deftypefun + +@deftypefun {void *} pagedir_get_page (uint32_t *@var{pd}, const void *@var{uaddr}) +Looks up the frame mapped to @var{uaddr} in @var{pd}. Returns the +kernel virtual address for that frame, if @var{uaddr} is mapped, or a +null pointer if it is not. +@end deftypefun + +@deftypefun void pagedir_clear_page (uint32_t *@var{pd}, void *@var{page}) +Marks @var{page} ``not present'' in @var{pd}. Later accesses to +the page will fault. + +Other bits in the page table for @var{page} are preserved, permitting +the accessed and dirty bits (see the next section) to be checked. + +This function has no effect if @var{page} is not mapped. +@end deftypefun + +@node Page Table Accessed and Dirty Bits +@subsection Accessed and Dirty Bits + +80@var{x}86 hardware provides some assistance for implementing page +replacement algorithms, through a pair of bits in the page table entry +(PTE) for each page. On any read or write to a page, the CPU sets the +@dfn{accessed bit} to 1 in the page's PTE, and on any write, the CPU +sets the @dfn{dirty bit} to 1. The CPU never resets these bits to 0, +but the OS may do so. + +Proper interpretation of these bits requires understanding of +@dfn{aliases}, that is, two (or more) pages that refer to the same +frame. When an aliased frame is accessed, the accessed and dirty bits +are updated in only one page table entry (the one for the page used for +access). The accessed and dirty bits for the other aliases are not +updated. + +@xref{Accessed and Dirty Bits}, on applying these bits in implementing +page replacement algorithms. 
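
The aliasing rule above can be illustrated with a small stand-alone sketch. It is not Pintos code: the PTEs are modeled as bare @code{uint32_t} values, @code{simulate_access} is a hypothetical stand-in for what the CPU does on a memory access, and @code{frame_is_dirty} and @code{frame_is_accessed} are hypothetical helpers showing one way an OS might combine the bits of every alias of a frame. Only the bit positions (5 for accessed, 6 for dirty) come from the hardware format.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PTE_A 0x20u   /* Bit 5: accessed. */
#define PTE_D 0x40u   /* Bit 6: dirty. */

/* Mimics the CPU: only the PTE actually used for the access is
   updated.  Other aliases of the same frame are left untouched. */
static void
simulate_access (uint32_t *pte, bool is_write)
{
  assert (pte != NULL);
  *pte |= PTE_A;
  if (is_write)
    *pte |= PTE_D;
}

/* Returns true if any of the N alias PTEs has FLAG set. */
static bool
any_alias_has (const uint32_t *alias_ptes, size_t n, uint32_t flag)
{
  for (size_t i = 0; i < n; i++)
    if ((alias_ptes[i] & flag) != 0)
      return true;
  return false;
}

/* A frame counts as dirty (or accessed) if any alias says so. */
static bool
frame_is_dirty (const uint32_t *alias_ptes, size_t n)
{
  return any_alias_has (alias_ptes, n, PTE_D);
}

static bool
frame_is_accessed (const uint32_t *alias_ptes, size_t n)
{
  return any_alias_has (alias_ptes, n, PTE_A);
}
```

Checking a single alias would miss a write made through another one, which is why the sketch ORs the bits across all aliases. Avoiding aliases altogether, for example by always accessing user data through one kind of address, is an alternative design.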
+ +@deftypefun bool pagedir_is_dirty (uint32_t *@var{pd}, const void *@var{page}) +@deftypefunx bool pagedir_is_accessed (uint32_t *@var{pd}, const void *@var{page}) +Returns true if page directory @var{pd} contains a page table entry for +@var{page} that is marked dirty (or accessed). Otherwise, +returns false. +@end deftypefun + +@deftypefun void pagedir_set_dirty (uint32_t *@var{pd}, const void *@var{page}, bool @var{value}) +@deftypefunx void pagedir_set_accessed (uint32_t *@var{pd}, const void *@var{page}, bool @var{value}) +If page directory @var{pd} has a page table entry for @var{page}, then +its dirty (or accessed) bit is set to @var{value}. +@end deftypefun + +@node Page Table Details +@subsection Page Table Details + +The functions provided with Pintos are sufficient to implement the +tasks. However, you may still find it worthwhile to understand the +hardware page table format, so we'll go into a little detail in this +section. + +@menu +* Page Table Structure:: +* Page Table Entry Format:: +* Page Directory Entry Format:: +@end menu + +@node Page Table Structure +@subsubsection Structure + +The top-level paging data structure is a page called the ``page +directory'' (PD) arranged as an array of 1,024 32-bit page directory +entries (PDEs), each of which represents 4 MB of virtual memory. Each +PDE may point to the physical address of another page called a +``page table'' (PT) arranged, similarly, as an array of 1,024 +32-bit page table entries (PTEs), each of which translates a single 4 +kB virtual page to a physical page. + +Translation of a virtual address into a physical address follows +the three-step process illustrated in the diagram +below:@footnote{Actually, virtual to physical translation on the +80@var{x}86 architecture occurs via an intermediate ``linear +address,'' but Pintos (and most modern 80@var{x}86 OSes) set up the CPU +so that linear and virtual addresses are one and the same. 
Thus, you +can effectively ignore this CPU feature.} + +@enumerate 1 +@item +The most-significant 10 bits of the virtual address (bits 22@dots{}31) +index the page directory. If the PDE is marked ``present,'' the +physical address of a page table is read from the PDE thus obtained. +If the PDE is marked ``not present'' then a page fault occurs. + +@item +The next 10 bits of the virtual address (bits 12@dots{}21) index +the page table. If the PTE is marked ``present,'' the physical +address of a data page is read from the PTE thus obtained. If the PTE +is marked ``not present'' then a page fault occurs. + +@item +The least-significant 12 bits of the virtual address (bits 0@dots{}11) +are added to the data page's physical base address, yielding the final +physical address. +@end enumerate + +@example +@group + 31 22 21 12 11 0 ++----------------------+----------------------+----------------------+ +| Page Directory Index | Page Table Index | Page Offset | ++----------------------+----------------------+----------------------+ + | | | + _______/ _______/ _____/ + / / / + / Page Directory / Page Table / Data Page + / .____________. / .____________. / .____________. + |1,023|____________| |1,023|____________| | |____________| + |1,022|____________| |1,022|____________| | |____________| + |1,021|____________| |1,021|____________| \__\|____________| + |1,020|____________| |1,020|____________| /|____________| + | | | | | | | | + | | | \____\| |_ | | + | | . | /| . | \ | . | + \____\| . |_ | . | | | . | + /| . | \ | . | | | . | + | . | | | . | | | . 
| + | | | | | | | | + |____________| | |____________| | |____________| + 4|____________| | 4|____________| | |____________| + 3|____________| | 3|____________| | |____________| + 2|____________| | 2|____________| | |____________| + 1|____________| | 1|____________| | |____________| + 0|____________| \__\0|____________| \____\|____________| + / / +@end group +@end example + +Pintos provides some macros and functions that are useful for working +with raw page tables: + +@defmac PTSHIFT +@defmacx PTBITS +The starting bit index (12) and number of bits (10), respectively, in a +page table index. +@end defmac + +@defmac PTMASK +A bit mask with the bits in the page table index set to 1 and the rest +set to 0 (@t{0x3ff000}). +@end defmac + +@defmac PTSPAN +The number of bytes of virtual address space that a single page table +page covers (4,194,304 bytes, or 4 MB). +@end defmac + +@defmac PDSHIFT +@defmacx PDBITS +The starting bit index (22) and number of bits (10), respectively, in a +page directory index. +@end defmac + +@defmac PDMASK +A bit mask with the bits in the page directory index set to 1 and other +bits set to 0 (@t{0xffc00000}). +@end defmac + +@deftypefun uintptr_t pd_no (const void *@var{va}) +@deftypefunx uintptr_t pt_no (const void *@var{va}) +Returns the page directory index or page table index, respectively, for +virtual address @var{va}. These functions are defined in +@file{threads/pte.h}. +@end deftypefun + +@deftypefun unsigned pg_ofs (const void *@var{va}) +Returns the page offset for virtual address @var{va}. This function is +defined in @file{threads/vaddr.h}. +@end deftypefun + +@node Page Table Entry Format +@subsubsection Page Table Entry Format + +You do not need to understand the PTE format to do the Pintos +tasks, unless you wish to incorporate the page table into your +supplemental page table (@pxref{Managing the Supplemental Page Table}). + +The actual format of a page table entry is summarized below. 
For +complete information, refer to section 3.7, ``Page Translation Using +32-Bit Physical Addressing,'' in @bibref{IA32-v3a}. + +@example +@group + 31 12 11 9 6 5 2 1 0 ++---------------------------------------+----+----+-+-+---+-+-+-+ +| Physical Address | AVL| |D|A| |U|W|P| ++---------------------------------------+----+----+-+-+---+-+-+-+ +@end group +@end example + +Some more information on each bit is given below. The names are +@file{threads/pte.h} macros that represent the bits' values: + +@defmac PTE_P +Bit 0, the ``present'' bit. When this bit is 1, the +other bits are interpreted as described below. When this bit is 0, any +attempt to access the page will page fault. The remaining bits are then +not used by the CPU and may be used by the OS for any purpose. +@end defmac + +@defmac PTE_W +Bit 1, the ``read/write'' bit. When it is 1, the page +is writable. When it is 0, write attempts will page fault. +@end defmac + +@defmac PTE_U +Bit 2, the ``user/supervisor'' bit. When it is 1, user +processes may access the page. When it is 0, only the kernel may access +the page (user accesses will page fault). + +Pintos clears this bit in PTEs for kernel virtual memory, to prevent +user processes from accessing them. +@end defmac + +@defmac PTE_A +Bit 5, the ``accessed'' bit. @xref{Page Table Accessed +and Dirty Bits}. +@end defmac + +@defmac PTE_D +Bit 6, the ``dirty'' bit. @xref{Page Table Accessed and +Dirty Bits}. +@end defmac + +@defmac PTE_AVL +Bits 9@dots{}11, available for operating system use. +Pintos, as provided, does not use them and sets them to 0. +@end defmac + +@defmac PTE_ADDR +Bits 12@dots{}31, the top 20 bits of the physical address of a frame. +The low 12 bits of the frame's address are always 0. +@end defmac + +Other bits are either reserved or uninteresting in a Pintos context and +should be set to@tie{}0. 
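
To make the bit layout above concrete, here is a minimal stand-alone sketch that builds and takes apart a PTE value. The mask values follow directly from the bit positions listed above. The helper names (@code{sketch_make_user_pte} and so on) are hypothetical, and unlike the real @file{threads/pte.h} functions this sketch takes a physical frame address directly rather than a kernel virtual address.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Masks derived from the bit positions described above. */
#define PTE_P    0x1u          /* Bit 0: present. */
#define PTE_W    0x2u          /* Bit 1: read/write. */
#define PTE_U    0x4u          /* Bit 2: user/supervisor. */
#define PTE_A    0x20u         /* Bit 5: accessed. */
#define PTE_D    0x40u         /* Bit 6: dirty. */
#define PTE_AVL  0xe00u        /* Bits 9...11: available to the OS. */
#define PTE_ADDR 0xfffff000u   /* Bits 12...31: frame address. */

/* Builds a present, user-accessible PTE for the frame at physical
   address FRAME_PA, which must be page-aligned. */
static uint32_t
sketch_make_user_pte (uint32_t frame_pa, bool writable)
{
  assert ((frame_pa & ~PTE_ADDR) == 0);
  return frame_pa | PTE_P | PTE_U | (writable ? PTE_W : 0u);
}

/* Extracts the physical frame address stored in PTE. */
static uint32_t
sketch_pte_frame (uint32_t pte)
{
  return pte & PTE_ADDR;
}

/* Tests individual flag bits. */
static bool sketch_pte_present (uint32_t pte) { return (pte & PTE_P) != 0; }
static bool sketch_pte_writable (uint32_t pte) { return (pte & PTE_W) != 0; }
```

For example, a writable user mapping of the frame at physical address @t{0x00123000} yields the PTE value @t{0x00123007}: the frame address in the top 20 bits plus the present, read/write, and user bits.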
+ +Header @file{threads/pte.h} defines three functions for working with +page table entries: + +@deftypefun uint32_t pte_create_kernel (uint32_t *@var{page}, bool @var{writable}) +Returns a page table entry that points to @var{page}, which should be a +kernel virtual address. The PTE's present bit will be set. It will be +marked for kernel-only access. If @var{writable} is true, the PTE will +also be marked read/write; otherwise, it will be read-only. +@end deftypefun + +@deftypefun uint32_t pte_create_user (uint32_t *@var{page}, bool @var{writable}) +Returns a page table entry that points to @var{page}, which should be +the kernel virtual address of a frame in the user pool (@pxref{Why +PAL_USER?}). The PTE's present bit will be set and it will be marked to +allow user-mode access. If @var{writable} is true, the PTE will also be +marked read/write; otherwise, it will be read-only. +@end deftypefun + +@deftypefun {void *} pte_get_page (uint32_t @var{pte}) +Returns the kernel virtual address for the frame that @var{pte} points +to. The @var{pte} may be present or not-present; if it is not-present +then the pointer returned is only meaningful if the address bits in the PTE +actually represent a physical address. +@end deftypefun + +@node Page Directory Entry Format +@subsubsection Page Directory Entry Format + +Page directory entries have the same format as PTEs, except that the +physical address points to a page table page instead of a frame. Header +@file{threads/pte.h} defines two functions for working with page +directory entries: + +@deftypefun uint32_t pde_create (uint32_t *@var{pt}) +Returns a page directory entry that points to @var{pt}, which should be the +kernel virtual address of a page table page. The PDE's present bit will +be set, it will be marked to allow user-mode access, and it will be +marked read/write.
+@end deftypefun + +@deftypefun {uint32_t *} pde_get_pt (uint32_t @var{pde}) +Returns the kernel virtual address for the page table page that +@var{pde}, which must be marked present, points to. +@end deftypefun + +@node Hash Table +@section Hash Table + +Pintos provides a hash table data structure in @file{lib/kernel/hash.c}. +To use it you will need to include its header file, +@file{lib/kernel/hash.h}, with @code{#include <hash.h>}. +No code provided with Pintos uses the hash table, which means that you +are free to use it as is, modify its implementation for your own +purposes, or ignore it, as you wish. + +Most implementations of the virtual memory task use a hash table to +translate pages to frames. You may find other uses for hash tables as +well. + +@menu +* Hash Data Types:: +* Basic Hash Functions:: +* Hash Search Functions:: +* Hash Iteration Functions:: +* Hash Table Example:: +* Hash Auxiliary Data:: +* Hash Synchronization:: +@end menu + +@node Hash Data Types +@subsection Data Types + +A hash table is represented by @struct{hash}. + +@deftp {Type} {struct hash} +Represents an entire hash table. The actual members of @struct{hash} +are ``opaque.'' That is, code that uses a hash table should not access +@struct{hash} members directly, nor should it need to. Instead, use +hash table functions and macros. +@end deftp + +The hash table operates on elements of type @struct{hash_elem}. + +@deftp {Type} {struct hash_elem} +Embed a @struct{hash_elem} member in the structure you want to include +in a hash table. Like @struct{hash}, @struct{hash_elem} is opaque. +All functions for operating on hash table elements actually take and +return pointers to @struct{hash_elem}, not pointers to your hash table's +real element type. +@end deftp + +You will often need to obtain a @struct{hash_elem} given a real element +of the hash table, and vice versa. 
Given a real element of the hash +table, you may use the @samp{&} operator to obtain a pointer to its +@struct{hash_elem}. Use the @code{hash_entry()} macro to go the other +direction. + +@deftypefn {Macro} {@var{type} *} hash_entry (struct hash_elem *@var{elem}, @var{type}, @var{member}) +Returns a pointer to the structure that @var{elem}, a pointer to a +@struct{hash_elem}, is embedded within. You must provide @var{type}, +the name of the structure that @var{elem} is inside, and @var{member}, +the name of the member in @var{type} that @var{elem} points to. + +For example, suppose @code{h} is a @code{struct hash_elem *} variable +that points to a @struct{thread} member (of type @struct{hash_elem}) +named @code{h_elem}. Then, @code{hash_entry@tie{}(h, struct thread, h_elem)} +yields the address of the @struct{thread} that @code{h} points within. +@end deftypefn + +@xref{Hash Table Example}, for an example. + +Each hash table element must contain a key, that is, data that +identifies and distinguishes elements, which must be unique +among elements in the hash table. (Elements may +also contain non-key data that need not be unique.) While an element is +in a hash table, its key data must not be changed. Instead, if need be, +remove the element from the hash table, modify its key, then reinsert +the element. + +For each hash table, you must write two functions that act on keys: a +hash function and a comparison function. These functions must match the +following prototypes: + +@deftp {Type} {unsigned hash_hash_func (const struct hash_elem *@var{element}, void *@var{aux})} +Returns a hash of @var{element}'s data, as a value anywhere in the range +of @code{unsigned int}. The hash of an element should be a +pseudo-random function of the element's key. It must not depend on +non-key data in the element or on any non-constant data other than the +key. Pintos provides the following functions as a suitable basis for +hash functions. 
+ +@deftypefun unsigned hash_bytes (const void *@var{buf}, size_t @var{size}) +Returns a hash of the @var{size} bytes starting at @var{buf}. The +implementation is the general-purpose +@uref{http://en.wikipedia.org/wiki/Fowler_Noll_Vo_hash, Fowler-Noll-Vo +hash} for 32-bit words. +@end deftypefun + +@deftypefun unsigned hash_string (const char *@var{s}) +Returns a hash of null-terminated string @var{s}. +@end deftypefun + +@deftypefun unsigned hash_int (int @var{i}) +Returns a hash of integer @var{i}. +@end deftypefun + +If your key is a single piece of data of an appropriate type, it is +sensible for your hash function to directly return the output of one of +these functions. For multiple pieces of data, you may wish to combine +the output of more than one call to them using, e.g., the @samp{^} +(exclusive or) +operator. Finally, you may entirely ignore these functions and write +your own hash function from scratch, but remember that your goal is to +build an operating system kernel, not to design a hash function. + +@xref{Hash Auxiliary Data}, for an explanation of @var{aux}. +@end deftp + +@deftp {Type} {bool hash_less_func (const struct hash_elem *@var{a}, const struct hash_elem *@var{b}, void *@var{aux})} +Compares the keys stored in elements @var{a} and @var{b}. Returns +true if @var{a} is less than @var{b}, false if @var{a} is greater than +or equal to @var{b}. + +If two elements compare equal, then they must hash to equal values. + +@xref{Hash Auxiliary Data}, for an explanation of @var{aux}. +@end deftp + +@xref{Hash Table Example}, for hash and comparison function examples. + +A few functions accept a pointer to a third kind of +function as an argument: + +@deftp {Type} {void hash_action_func (struct hash_elem *@var{element}, void *@var{aux})} +Performs some kind of action, chosen by the caller, on @var{element}. + +@xref{Hash Auxiliary Data}, for an explanation of @var{aux}.
+@end deftp + +@node Basic Hash Functions +@subsection Basic Functions + +These functions create, destroy, and inspect hash tables. + +@deftypefun bool hash_init (struct hash *@var{hash}, hash_hash_func *@var{hash_func}, hash_less_func *@var{less_func}, void *@var{aux}) +Initializes @var{hash} as a hash table with @var{hash_func} as hash +function, @var{less_func} as comparison function, and @var{aux} as +auxiliary data. +Returns true if successful, false on failure. @func{hash_init} calls +@func{malloc} and fails if memory cannot be allocated. + +@xref{Hash Auxiliary Data}, for an explanation of @var{aux}, which is +most often a null pointer. +@end deftypefun + +@deftypefun void hash_clear (struct hash *@var{hash}, hash_action_func *@var{action}) +Removes all the elements from @var{hash}, which must have been +previously initialized with @func{hash_init}. + +If @var{action} is non-null, then it is called once for each element in +the hash table, which gives the caller an opportunity to deallocate any +memory or other resources used by the element. For example, if the hash +table elements are dynamically allocated using @func{malloc}, then +@var{action} could @func{free} the element. This is safe because +@func{hash_clear} will not access the memory in a given hash element +after calling @var{action} on it. However, @var{action} must not call +any function that may modify the hash table, such as @func{hash_insert} +or @func{hash_delete}. +@end deftypefun + +@deftypefun void hash_destroy (struct hash *@var{hash}, hash_action_func *@var{action}) +If @var{action} is non-null, calls it for each element in the hash, with +the same semantics as a call to @func{hash_clear}. Then, frees the +memory held by @var{hash}. Afterward, @var{hash} must not be passed to +any hash table function, absent an intervening call to @func{hash_init}. +@end deftypefun + +@deftypefun size_t hash_size (struct hash *@var{hash}) +Returns the number of elements currently stored in @var{hash}. 
+@end deftypefun + +@deftypefun bool hash_empty (struct hash *@var{hash}) +Returns true if @var{hash} currently contains no elements, +false if @var{hash} contains at least one element. +@end deftypefun + +@node Hash Search Functions +@subsection Search Functions + +Each of these functions searches a hash table for an element that +compares equal to one provided. Based on the success of the search, +they perform some action, such as inserting a new element into the hash +table, or simply return the result of the search. + +@deftypefun {struct hash_elem *} hash_insert (struct hash *@var{hash}, struct hash_elem *@var{element}) +Searches @var{hash} for an element equal to @var{element}. If none is +found, inserts @var{element} into @var{hash} and returns a null pointer. +If the table already contains an element equal to @var{element}, it is +returned without modifying @var{hash}. +@end deftypefun + +@deftypefun {struct hash_elem *} hash_replace (struct hash *@var{hash}, struct hash_elem *@var{element}) +Inserts @var{element} into @var{hash}. Any element equal to +@var{element} already in @var{hash} is removed. Returns the element +removed, or a null pointer if @var{hash} did not contain an element +equal to @var{element}. + +The caller is responsible for deallocating any resources associated with +the returned element, as appropriate. For example, if the hash table +elements are dynamically allocated using @func{malloc}, then the caller +must @func{free} the element after it is no longer needed. +@end deftypefun + +The element passed to the following functions is only used for hashing +and comparison purposes. It is never actually inserted into the hash +table. Thus, only key data in the element needs to be initialized, and +other data in the element will not be used. 
It often makes sense to +declare an instance of the element type as a local variable, initialize +the key data, and then pass the address of its @struct{hash_elem} to +@func{hash_find} or @func{hash_delete}. @xref{Hash Table Example}, for +an example. (Large structures should not be +allocated as local variables. @xref{struct thread}, for more +information.) + +@deftypefun {struct hash_elem *} hash_find (struct hash *@var{hash}, struct hash_elem *@var{element}) +Searches @var{hash} for an element equal to @var{element}. Returns the +element found, if any, or a null pointer otherwise. +@end deftypefun + +@deftypefun {struct hash_elem *} hash_delete (struct hash *@var{hash}, struct hash_elem *@var{element}) +Searches @var{hash} for an element equal to @var{element}. If one is +found, it is removed from @var{hash} and returned. Otherwise, a null +pointer is returned and @var{hash} is unchanged. + +The caller is responsible for deallocating any resources associated with +the returned element, as appropriate. For example, if the hash table +elements are dynamically allocated using @func{malloc}, then the caller +must @func{free} the element after it is no longer needed. +@end deftypefun + +@node Hash Iteration Functions +@subsection Iteration Functions + +These functions allow iterating through the elements in a hash table. +Two interfaces are supplied. The first requires writing and supplying a +@var{hash_action_func} to act on each element (@pxref{Hash Data Types}). + +@deftypefun void hash_apply (struct hash *@var{hash}, hash_action_func *@var{action}) +Calls @var{action} once for each element in @var{hash}, in arbitrary +order. @var{action} must not call any function that may modify the hash +table, such as @func{hash_insert} or @func{hash_delete}. @var{action} +must not modify key data in elements, although it may modify any other +data. +@end deftypefun + +The second interface is based on an ``iterator'' data type. 
+Idiomatically, iterators are used as follows: + +@example +struct hash_iterator i; + +hash_first (&i, h); +while (hash_next (&i)) + @{ + struct foo *f = hash_entry (hash_cur (&i), struct foo, elem); + @r{@dots{}do something with @i{f}@dots{}} + @} +@end example + +@deftp {Type} {struct hash_iterator} +Represents a position within a hash table. Calling any function that +may modify a hash table, such as @func{hash_insert} or +@func{hash_delete}, invalidates all iterators within that hash table. + +Like @struct{hash} and @struct{hash_elem}, @struct{hash_iterator} is opaque. +@end deftp + +@deftypefun void hash_first (struct hash_iterator *@var{iterator}, struct hash *@var{hash}) +Initializes @var{iterator} to just before the first element in +@var{hash}. +@end deftypefun + +@deftypefun {struct hash_elem *} hash_next (struct hash_iterator *@var{iterator}) +Advances @var{iterator} to the next element in @var{hash}, and returns +that element. Returns a null pointer if no elements remain. After +@func{hash_next} returns null for @var{iterator}, calling it again +yields undefined behavior. +@end deftypefun + +@deftypefun {struct hash_elem *} hash_cur (struct hash_iterator *@var{iterator}) +Returns the value most recently returned by @func{hash_next} for +@var{iterator}. Yields undefined behavior after @func{hash_first} has +been called on @var{iterator} but before @func{hash_next} has been +called for the first time. +@end deftypefun + +@node Hash Table Example +@subsection Hash Table Example + +Suppose you have a structure, called @struct{page}, that you +want to put into a hash table. First, define @struct{page} to include a +@struct{hash_elem} member: + +@example +struct page + @{ + struct hash_elem hash_elem; /* @r{Hash table element.} */ + void *addr; /* @r{Virtual address.} */ + /* @r{@dots{}other members@dots{}} */ + @}; +@end example + +We write a hash function and a comparison function using @var{addr} as +the key.
A pointer can be hashed based on its bytes, and the @samp{<} +operator works fine for comparing pointers: + +@example +/* @r{Returns a hash value for page @var{p}.} */ +unsigned +page_hash (const struct hash_elem *p_, void *aux UNUSED) +@{ + const struct page *p = hash_entry (p_, struct page, hash_elem); + return hash_bytes (&p->addr, sizeof p->addr); +@} + +/* @r{Returns true if page @var{a} precedes page @var{b}.} */ +bool +page_less (const struct hash_elem *a_, const struct hash_elem *b_, + void *aux UNUSED) +@{ + const struct page *a = hash_entry (a_, struct page, hash_elem); + const struct page *b = hash_entry (b_, struct page, hash_elem); + + return a->addr < b->addr; +@} +@end example + +@noindent +(The use of @code{UNUSED} in these functions' prototypes suppresses a +warning that @var{aux} is unused. @xref{Function and Parameter +Attributes}, for information about @code{UNUSED}. @xref{Hash Auxiliary +Data}, for an explanation of @var{aux}.) + +Then, we can create a hash table like this: + +@example +struct hash pages; + +hash_init (&pages, page_hash, page_less, NULL); +@end example + +Now we can manipulate the hash table we've created. If @code{@var{p}} +is a pointer to a @struct{page}, we can insert it into the hash table +with: + +@example +hash_insert (&pages, &p->hash_elem); +@end example + +@noindent If there's a chance that @var{pages} might already contain a +page with the same @var{addr}, then we should check @func{hash_insert}'s +return value. + +To search for an element in the hash table, use @func{hash_find}. This +takes a little setup, because @func{hash_find} takes an element to +compare against. 
Here's a function that will find and return a page +based on a virtual address, assuming that @var{pages} is defined at file +scope: + +@example +/* @r{Returns the page containing the given virtual @var{address},} + @r{or a null pointer if no such page exists.} */ +struct page * +page_lookup (const void *address) +@{ + struct page p; + struct hash_elem *e; + + p.addr = (void *) address; + e = hash_find (&pages, &p.hash_elem); + return e != NULL ? hash_entry (e, struct page, hash_elem) : NULL; +@} +@end example + +@noindent +@struct{page} is allocated as a local variable here on the assumption +that it is fairly small. Large structures should not be allocated as +local variables. @xref{struct thread}, for more information. + +A similar function could delete a page by address using +@func{hash_delete}. + +@node Hash Auxiliary Data +@subsection Auxiliary Data + +In simple cases like the example above, there's no need for the +@var{aux} parameters. In these cases, just pass a null pointer to +@func{hash_init} for @var{aux} and ignore the values passed to the hash +function and comparison functions. (You'll get a compiler warning if +you don't use the @var{aux} parameter, but you can turn that off with +the @code{UNUSED} macro, as shown in the example, or you can just ignore +it.) + +@var{aux} is useful when some property of the data in the +hash table is both constant and needed for hashing or comparison, +but not stored in the data items themselves. For example, if +the items in a hash table are fixed-length strings, but the items +themselves don't indicate what that fixed length is, you could pass +the length as an @var{aux} parameter. + +@node Hash Synchronization +@subsection Synchronization + +The hash table does not do any internal synchronization. It is the +caller's responsibility to synchronize calls to hash table functions.
+In general, any number of functions that examine but do not modify the +hash table, such as @func{hash_find} or @func{hash_next}, may execute +simultaneously. However, these functions cannot safely execute at the +same time as any function that may modify a given hash table, such as +@func{hash_insert} or @func{hash_delete}, nor may more than one function +that can modify a given hash table execute safely at once. + +It is also the caller's responsibility to synchronize access to data in +hash table elements. How to synchronize access to this data depends on +how it is designed and organized, as with any other data structure. + diff --git a/doc/sample.tmpl b/doc/sample.tmpl new file mode 100644 index 0000000..368d5f0 --- /dev/null +++ b/doc/sample.tmpl @@ -0,0 +1,104 @@ + + +-----------------+ + | CS 140 | + | SAMPLE TASK | + | DESIGN DOCUMENT | + +-----------------+ + +---- GROUP ---- + +Ben Pfaff <blp@stanford.edu> + +---- PRELIMINARIES ---- + +>> If you have any preliminary comments on your submission, notes for +>> the TAs, or extra credit, please give them here. + +(This is a sample design document.) + +>> Please cite any offline or online sources you consulted while +>> preparing your submission, other than the Pintos documentation, +>> course text, and lecture notes. + +None. + + JOIN + ==== + +---- DATA STRUCTURES ---- + +>> Copy here the declaration of each new or changed `struct' or `struct' +>> member, global or static variable, `typedef', or enumeration. +>> Identify the purpose of each in 25 words or less. + +A "latch" is a new synchronization primitive. Acquires block +until the first release. Afterward, all ongoing and future +acquires pass immediately. + + /* Latch. */ + struct latch + { + bool released; /* Released yet? */ + struct lock monitor_lock; /* Monitor lock. */ + struct condition rel_cond; /* Signaled when released. */ + }; + +Added to struct thread: + + /* Members for implementing thread_join().
*/ + struct latch ready_to_die; /* Release when thread about to die. */ + struct semaphore can_die; /* Up when thread allowed to die. */ + struct list children; /* List of child threads. */ + struct list_elem children_elem; /* Element of `children' list. */ + +---- ALGORITHMS ---- + +>> Briefly describe your implementation of thread_join() and how it +>> interacts with thread termination. + +thread_join() finds the joined child on the thread's list of +children and waits for the child to exit by acquiring the child's +ready_to_die latch. When thread_exit() is called, the thread +releases its ready_to_die latch, allowing the parent to continue. + +---- SYNCHRONIZATION ---- + +>> Consider parent thread P with child thread C. How do you ensure +>> proper synchronization and avoid race conditions when P calls wait(C) +>> before C exits? After C exits? How do you ensure that all resources +>> are freed in each case? How about when P terminates without waiting, +>> before C exits? After C exits? Are there any special cases? + +C waits in thread_exit() for P to die before it finishes its own +exit, using the can_die semaphore "down"ed by C and "up"ed by P as +it exits. Regardless of whether C has terminated, there +is no race on wait(C), because C waits for P's permission before +it frees itself. + +Regardless of whether P waits for C, P still "up"s C's can_die +semaphore when P dies, so C will always be freed. (However, +freeing C's resources is delayed until P's death.) + +The initial thread is a special case because it has no parent to +wait for it or to "up" its can_die semaphore. Therefore, its +can_die semaphore is initialized to 1. + +---- RATIONALE ---- + +>> Critique your design, pointing out advantages and disadvantages in +>> your design choices. + +This design has the advantage of simplicity.
Encapsulating most +of the synchronization logic into a new "latch" structure +abstracts what little complexity there is into a separate layer, +making the design easier to reason about. Also, all the new data +members are in `struct thread', with no need for any extra dynamic +allocation, etc., that would require extra management code. + +On the other hand, this design is wasteful in that a child thread +cannot free itself before its parent has terminated. A parent +thread that creates a large number of short-lived child threads +could unnecessarily exhaust kernel memory. This is probably +acceptable for implementing kernel threads, but it may be a bad +idea for use with user processes because of the larger number of +resources that user processes tend to own. diff --git a/doc/standards.texi b/doc/standards.texi new file mode 100644 index 0000000..8e8a6e5 --- /dev/null +++ b/doc/standards.texi @@ -0,0 +1,190 @@ +@node Coding Standards +@appendix Coding Standards + +@localcodingstandards{} + +Our standards for coding are most important for grading. We want to +stress that aside from the fact that we are explicitly basing part of +your grade on these things, good coding practices will improve the +quality of your code. This makes it easier for your partners to +interact with it, and ultimately, will improve your chances of having a +good working program. Having said that once, the rest of this document will +discuss only the ways in which our coding standards will affect our +grading. + +@menu +* Coding Style:: +* C99:: +* Unsafe String Functions:: +@end menu + +@node Coding Style +@section Style + +Style, for the purposes of our grading, refers to how readable your +code is. At minimum, this means that your code is well formatted, your +variable names are descriptive and your functions are decomposed and +well commented. Any other factors which make it hard (or easy) for us +to read or use your code will be reflected in your style grade.
+ +The existing Pintos code is written in the GNU style and largely +follows the @uref{http://www.gnu.org/prep/standards_toc.html, , GNU +Coding Standards}. We encourage you to follow the applicable parts of +them too, especially chapter 5, ``Making the Best Use of C.'' Using a +different style won't cause actual problems, but it's ugly to see +gratuitous differences in style from one function to another. If your +code is too ugly, it will cost you points. + +Please limit C source file lines to at most 79 characters long. + +Pintos comments sometimes refer to external standards or +specifications by writing a name inside square brackets, like this: +@code{[IA32-v3a]}. These names refer to the reference names used in +this documentation (@pxref{Bibliography}). + +If you remove existing Pintos code, please delete it from your source +file entirely. Don't just put it into a comment or a conditional +compilation directive, because that makes the resulting code hard to +read. Version control software will allow you to recover the code if +necessary later. + +We're only going to do a compile in the directory for the task being +submitted. You don't need to make sure that the previous tasks also +compile. + +Task code should be written so that all of the subproblems for the +task function together, that is, without the need to rebuild with +different macros defined, etc. If you do extra credit work that +changes normal Pintos behavior so as to interfere with grading, then +you must implement it so that it only acts that way when given a +special command-line option of the form @option{-@var{name}}, where +@var{name} is a name of your choice. You can add such an option by +modifying @func{parse_options} in @file{threads/init.c}. + +The introduction describes additional coding style requirements +(@pxref{Design}). + +@node C99 +@section C99 + +The Pintos source code uses a few features of the ``C99'' standard +library that were not in the original 1989 standard for C. 
Many +programmers are unaware of these features, so we will describe them. The +new features used in Pintos are +mostly in new headers: + +@table @file +@item <stdbool.h> +Defines macros @code{bool}, a 1-bit type that takes on only the values +0 and 1, @code{true}, which expands to 1, and @code{false}, which +expands to 0. + +@item <stdint.h> +On systems that support them, this header defines types +@code{int@var{n}_t} and @code{uint@var{n}_t} for @var{n} = 8, 16, 32, +64, and possibly other values. These are 2's complement signed and unsigned +types, respectively, with the given number of bits. + +On systems where it is possible, this header also defines types +@code{intptr_t} and @code{uintptr_t}, which are integer types big +enough to hold a pointer. + +On all systems, this header defines types @code{intmax_t} and +@code{uintmax_t}, which are the system's signed and unsigned integer +types with the widest ranges. + +For every signed integer type @code{@var{type}_t} defined here, as well +as for @code{ptrdiff_t} defined in @file{<stddef.h>}, this header also +defines macros @code{@var{TYPE}_MAX} and @code{@var{TYPE}_MIN} that +give the type's range. Similarly, for every unsigned integer type +@code{@var{type}_t} defined here, as well as for @code{size_t} defined +in @file{<stddef.h>}, this header defines a @code{@var{TYPE}_MAX} +macro giving its maximum value. + +@item <inttypes.h> +@file{<stdint.h>} provides no straightforward way to format +the types it defines with @func{printf} and related functions. This +header provides macros to help with that. For every +@code{int@var{n}_t} defined by @file{<stdint.h>}, it provides macros +@code{PRId@var{n}} and @code{PRIi@var{n}} for formatting values of +that type with @code{"%d"} and @code{"%i"}. Similarly, for every +@code{uint@var{n}_t}, it provides @code{PRIo@var{n}}, +@code{PRIu@var{n}}, @code{PRIx@var{n}}, and @code{PRIX@var{n}}.
+ +You use these something like this, taking advantage of the fact that +the C compiler concatenates adjacent string literals: +@example +#include <inttypes.h> +@dots{} +int32_t value = @dots{}; +printf ("value=%08"PRId32"\n", value); +@end example +@noindent +The @samp{%} is not supplied by the @code{PRI} macros. As shown +above, you supply it yourself and follow it by any flags, field +width, etc. + +@item <stdio.h> +The @func{printf} function has some new type modifiers for printing +standard types: + +@table @samp +@item j +For @code{intmax_t} (e.g.@: @samp{%jd}) or @code{uintmax_t} (e.g.@: +@samp{%ju}). + +@item z +For @code{size_t} (e.g.@: @samp{%zu}). + +@item t +For @code{ptrdiff_t} (e.g.@: @samp{%td}). +@end table + +Pintos @func{printf} also implements a nonstandard @samp{'} flag that +groups large numbers with commas to make them easier to read. +@end table + +@node Unsafe String Functions +@section Unsafe String Functions + +A few of the string functions declared in the standard +@file{<string.h>} and @file{<stdio.h>} headers are notoriously unsafe. +The worst offenders are intentionally not included in the Pintos C +library: + +@table @func +@item strcpy +When used carelessly this function can overflow the buffer reserved +for its output string. Use @func{strlcpy} instead. Refer to +comments in its source code in @code{lib/string.c} for documentation. + +@item strncpy +This function can leave its destination buffer without a null string +terminator. It also has performance problems. Again, use +@func{strlcpy}. + +@item strcat +Same issue as @func{strcpy}. Use @func{strlcat} instead. +Again, refer to comments in its source code in @code{lib/string.c} for +documentation. + +@item strncat +The meaning of its buffer size argument is surprising. +Again, use @func{strlcat}. + +@item strtok +Uses global data, so it is unsafe in threaded programs such as +kernels. 
Use @func{strtok_r} instead, and see its source code in +@code{lib/string.c} for documentation and an example. + +@item sprintf +Same issue as @func{strcpy}. Use @func{snprintf} instead. Refer +to comments in @code{lib/stdio.h} for documentation. + +@item vsprintf +Same issue as @func{strcpy}. Use @func{vsnprintf} instead. +@end table + +If you try to use any of these functions, the error message will give +you a hint by referring to an identifier like +@code{dont_use_sprintf_use_snprintf}. diff --git a/doc/task0_questions.texi b/doc/task0_questions.texi new file mode 100644 index 0000000..368f88f --- /dev/null +++ b/doc/task0_questions.texi @@ -0,0 +1,37 @@ +@enumerate + @item Which Git command can be used to retrieve a copy of Pintos to your + group directory? + @item Why is using the @code{strcpy()} function to copy strings usually a + bad idea? + @item Explain how thread scheduling in Pintos currently works in less than + 250 words. Include the chain of execution of function calls. + @item Explain the property of reproducibility and how the lack of + reproducibility will affect debugging. + @item How would you print an unsigned 64 bit @code{int}? (Consider that you + are working with C99) + @item What makes locks and semaphores in Pintos similar? What extra property + do locks have? + @item What are the limitations on the size of the thread struct? How does + Pintos identify stack overflow? + @item If test @file{src/tests/threads/alarm-multiple} fails, where would + you find its output and result logs? You might want to run this test and + find out. + @item Given a struct defined as follows: +@verbatim +struct foo +{ + int bar; + struct list_elem e; +}; +@end verbatim + And a list declaration: +@verbatim +struct list foo_list; +@end verbatim + Give a piece of code that would insert an element of @code{struct foo} + into the list ordered (in ascending order) by the element @code{bar}. + Assume @file{<list.h>} has been included. 
+ @item For a list of @code{struct foo} as above, write a piece of code to + iterate through the list and return a pointer to the struct if element + @code{bar} is equal to @code{int x}. +@end enumerate diff --git a/doc/task0_sheet.texi b/doc/task0_sheet.texi new file mode 100644 index 0000000..e91f08a --- /dev/null +++ b/doc/task0_sheet.texi @@ -0,0 +1,11 @@ +\input texinfo @c -*- texinfo -*- + +@c %**start of header +@setfilename task0_sheet.info +@settitle Pintos Task 0 +@c %**end of header + +@chapter Pintos-IC Task 0 questions +@include task0_questions.texi + +@bye diff --git a/doc/texi2html b/doc/texi2html new file mode 100755 index 0000000..16ecd4a --- /dev/null +++ b/doc/texi2html @@ -0,0 +1,6346 @@ +#! /usr/bin/env perl +'di '; +'ig 00 '; +#+############################################################################## +# +# texi2html: Program to transform Texinfo documents to HTML +# +# Copyright (C) 1999, 2000 Free Software Foundation, Inc. +# +# This program is free software; you can redistribute it and/or modify +# it under the terms of the GNU General Public License as published by +# the Free Software Foundation; either version 2 of the License, or +# (at your option) any later version. +# +# This program is distributed in the hope that it will be useful, +# but WITHOUT ANY WARRANTY; without even the implied warranty of +# MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the +# GNU General Public License for more details. 
+# +# You should have received a copy of the GNU General Public License +# along with this program; if not, write to the Free Software +# Foundation, Inc., 59 Temple Place, Suite 330, Boston, MA 02111-1307 USA +# +#-############################################################################## + +# This requires perl version 5 or higher +require 5.0; + +# Perl pragma to restrict unsafe constructs +use strict; +# +# According to +# larry.jones@sdrc.com (Larry Jones) +# this pragma is not present in perl5.004_02: +# +# Perl pragma to control optional warnings +# use warnings; + +# Declarations +use vars qw( + $ADDRESS + $ANTI_ALIAS + $ANTI_ALIAS_TEXT + $ASCII_MODE + $AUTO_LINK + $AUTO_PREFIX + $BIBRE + $CHAPTEREND + $CHILDLINE + $Configure_failed + $DEBUG + $DEBUG_BIB + $DEBUG_DEF + $DEBUG_GLOSS + $DEBUG_HTML + $DEBUG_INDEX + $DEBUG_L2H + $DEBUG_TOC + $DEBUG_USER + $DESTDIR + $ERROR + $EXTERNAL_FILE + $EXTERNAL_IMAGES + $EXTERNAL_UP_LINK + $EXTERNAL_UP_TITLE + $FH + $FIGURE_SCALE_FACTOR + $FILERE + $HTML_VERSION + $IMAGES_ONLY + $INFO + $LINE_WIDTH + $LOCAL_ICONS + $LONG_TITLES + $MATH_SCALE_FACTOR + $MAX_LINK_DEPTH + $MAX_SPLIT_DEPTH + $NETSCAPE_HTML + $NODERE + $NODESRE + $NOLATEX + $NO_FOOTNODE + $NO_IMAGES + $NO_NAVIGATION + $NO_SIMPLE_MATH + $NO_SUBDIR + $PAPERSIZE + $PREFIX + $PROTECTTAG + $PS_IMAGES + $REUSE + $SCALABLE_FONTS + $SECTIONEND + $SHORTEXTN + $SHORT_INDEX + $SHOW_SECTION_NUMBERS + $SPLIT + $T2H_ADDRESS + $T2H_AFTER_ABOUT + $T2H_AFTER_BODY_OPEN + $T2H_AUTHORS + $T2H_BODYTEXT + $T2H_BUTTONS + $T2H_EXTRA_HEAD + $T2H_FAILURE_TEXT + $T2H_HAS_TOP_HEADING + $T2H_HOMEPAGE + $T2H_ICONS + $T2H_OBSOLETE_OPTIONS + $T2H_OVERVIEW + $T2H_PRE_BODY_CLOSE + $T2H_THIS_SECTION + $T2H_TOC + $T2H_TODAY + $T2H_TOP + $T2H_USAGE_TEXT + $T2H_USER + $TEXDEFS + $THISPROG + $THISVERSION + $TITLE + $TITLES_LANGUAGE + $TMP + $TOPEND + $VARRE + $VERBOSE + $WARN + $WORDS_IN_NAVIGATION_PANEL_TITLES + $WORDS_IN_PAGE + $about_body + $addr + $after + $before + $bib_num + $button + 
$call + $changed + $complex_format_map + $contents + $count + $counter + $curlevel + $d + $default_language + $deferred_ref + $doc_num + $docid + $docu + $docu_about + $docu_about_file + $docu_dir + $docu_doc + $docu_doc_file + $docu_ext + $docu_foot + $docu_foot_file + $docu_frame_file + $docu_name + $docu_rdir + $docu_stoc + $docu_stoc_file + $docu_toc + $docu_toc_file + $docu_toc_frame_file + $docu_top + $docu_top_file + $done + $dont_html + $dotbug + $elt + $end_of_para + $end_tag + $entry + $ext + $extensions + $failed + $fh_name + $file + $first_index_chapter + $first_line + $foot + $foot_num + $footid + $ftype + $full + $gloss_num + $h_content + $h_line + $has_top + $has_top_command + $hcontent + $html_element + $html_num + $i + $id + $idx_num + $in + $in_bibliography + $in_glossary + $in_html + $in_list + $in_pre + $in_table + $in_titlepage + $in_top + $index + $index_properties + $init_file + $int_file + $is_extra + $key + $keys + $l + $l2h_cache_file + $l2h_cached_count + $l2h_extract_error + $l2h_html_count + $l2h_html_file + $l2h_latex_closing + $l2h_latex_count + $l2h_latex_file + $l2h_latex_preample + $l2h_name + $l2h_prefix + $l2h_range_error + $l2h_to_latex_count + $l_l2h + $latex_file + $len + $letter + $level + $macros + $man + $match + $maximage + $next + $nn + $node + $node_next + $node_prev + $node_up + $nodes + $num + $old + $only_text + $options + $post + $pre + $prev_node + $previous + $progdir + $re + $reused + $root + $sec_num + $section + $string + $style + $sub + $subst_code + $table_type + $tag + $texi_style + $to_do + $toc_indent + $tocid + $toplevel + $type + $url + $use_acc + $use_bibliography + $what + %T2H_ACTIVE_ICONS + %T2H_BUTTONS_EXAMPLE + %T2H_BUTTONS_GOTO + %T2H_HREF + %T2H_NAME + %T2H_NAVIGATION_TEXT + %T2H_NODE + %T2H_PASSIVE_ICONS + %T2H_THISDOC + %accent_map + %bib2href + %context + %def_map + %format_map + %gloss2href + %idx2node + %l2h_cache + %l2h_img + %l2h_to_latex + %macros + %node2href + %node2next + %node2prev + 
%node2sec + %node2up + %number2sec + %predefined_index + %sec2level + %sec2node + %sec2seccount + %seccount2sec + %sec2number + %seen + %simple_map + %style_map + %tag2pro + %things_map + %to_skip + %user_sub + %valid_index + %value + @T2H_CHAPTER_BUTTONS + @T2H_MISC_BUTTONS + @T2H_SECTION_BUTTONS + @appendix_sec_num + @args + @doc_lines + @fhs + @foot_lines + @html_stack + @input_spool + @l2h_from_html + @l2h_to_latex + @lines + @lines2 + @lines3 + @normal_sec_num + @sections + @stoc_lines + @tables + @toc_lines + ); + +#++############################################################################## +# +# NOTE FOR DEBUGGING THIS SCRIPT: +# You can run 'perl texi2html.pl' directly, provided you have +# the environment variable T2H_HOME set to the directory containing +# the texi2html.init file +# +#--############################################################################## + +# CVS version: +# $Id: texi2html,v 1.3 2005-06-19 03:20:26 blp Exp $ + +# Homepage: +$T2H_HOMEPAGE = "http://texi2html.cvshome.org"; + +# Authors: +$T2H_AUTHORS = <<EOT; +Written by: Lionel Cons <Lionel.Cons\@cern.ch> (original author) + Karl Berry <karl\@freefriends.org> + Olaf Bachmann <obachman\@mathematik.uni-kl.de> + and many others. +Maintained by: Many creative people <dev\@texi2html.cvshome.org> +Send bugs and suggestions to <users\@texi2html.cvshome.org> +EOT + +# Version: set in configure.in +$THISVERSION = '1.66'; +$THISPROG = "texi2html $THISVERSION"; # program name and version + +# The man page for this program is included at the end of this file and can be +# viewed using the command 'nroff -man texi2html'. 
+ +#+++############################################################################ +# # +# Initialization # +# Pasted content of File $(srcdir)/texi2html.init: Default initializations # +# # +#---############################################################################ + +# leave this within comments, and keep the require statement +# This way, you can directly run texi2html.pl, if $ENV{T2H_HOME}/texi2html.init +# exists. + +# +# -*-perl-*- +###################################################################### +# File: texi2html.init +# +# Sets default values for command-line arguments and for various customizable +# procedures +# +# A copy of this file is pasted into the beginning of texi2html by +# 'make texi2html' +# +# Copy this file and make changes to it, if you like. +# Afterwards, either, load it with command-line option -init_file <your_init_file> +# +# $Id: texi2html,v 1.3 2005-06-19 03:20:26 blp Exp $ + +###################################################################### +# stuff which can also be set by command-line options +# +# +# Note: values set here, overwrite values set by the command-line +# options before -init_file and might still be overwritten by +# command-line arguments following the -init_file option +# + +# T2H_OPTIONS is a hash whose keys are the (long) names of valid +# command-line options and whose values are a hash with the following keys: +# type ==> one of !|=i|:i|=s|:s (see GetOpt::Long for more info) +# linkage ==> ref to scalar, array, or subroutine (see GetOpt::Long for more info) +# verbose ==> short description of option (displayed by -h) +# noHelp ==> if 1 -> for "not so important options": only print description on -h 1 +# 2 -> for obsolete options: only print description on -h 2 + +my $T2H_DEBUG = 0; +my $T2H_OPTIONS; +$T2H_OPTIONS -> {debug} = +{ + type => '=i', + linkage => \$T2H_DEBUG, + verbose => 'output HTML with debuging information', +}; + +# APA: Add SystemLiteral to identify the canonical DTD. 
+# [Definition:] The SystemLiteral is called the entity's system +# identifier. It is a URI, which may be used to retrieve the entity. +# See http://www.xml.com/axml/target.html#NT-ExternalID +my $T2H_DOCTYPE = '<!DOCTYPE html PUBLIC "-//W3C//DTD HTML 4.01 Transitional//EN" + "http://www.w3.org/TR/html40/loose.dtd">'; +$T2H_OPTIONS -> {doctype} = +{ + type => '=s', + linkage => \$T2H_DOCTYPE, + verbose => 'document type which is output in header of HTML files', + noHelp => 1 +}; + +my $T2H_CHECK = 0; +$T2H_OPTIONS -> {check} = +{ + type => '!', + linkage => \$T2H_CHECK, + verbose => 'if set, only check files and output all things that may be Texinfo commands', + noHelp => 1 +}; + +# -expand +# if set to "tex" (or, "info") expand @iftex and @tex (or, @ifinfo) sections +# else, neither expand @iftex, @tex, nor @ifinfo sections +my $T2H_EXPAND = "info"; +$T2H_OPTIONS -> {expand} = +{ + type => '=s', + linkage => \$T2H_EXPAND, + verbose => 'Expand info|tex|none section of texinfo source', +}; + +# - glossary +# If set, uses section named `Footnotes' for glossary. +my $T2H_USE_GLOSSARY = 0; +$T2H_OPTIONS -> {glossary} = +{ + type => '!', + linkage => \$T2H_USE_GLOSSARY, + verbose => "if set, uses section named `Footnotes' for glossary", + noHelp => 1, +}; + + +# -invisible +# $T2H_INVISIBLE_MARK is the text used to create invisible destination +# anchors for index links (you can for instance use the invisible.xbm +# file shipped with this program). This is a workaround for a known +# bug of many WWW browsers, including netscape. 
+# For me, it works fine without it -- on the contrary: if there, it +# inserts space between headers and start of text (obachman 3/99) +my $T2H_INVISIBLE_MARK = ''; +# $T2H_INVISIBLE_MARK = ' '; +$T2H_OPTIONS -> {invisible} = +{ + type => '=s', + linkage => \$T2H_INVISIBLE_MARK, + verbose => 'use text in invisble anchot', + noHelp => 1, +}; + +# -iso +# if set, ISO8859 characters are used for special symbols (like copyright, etc) +my $T2H_USE_ISO = 0; +$T2H_OPTIONS -> {iso} = +{ + type => 'iso', + linkage => \$T2H_USE_ISO, + verbose => 'if set, ISO8859 characters are used for special symbols (like copyright, etc)', + noHelp => 1, +}; + +# -I +# list directories where @include files are searched for (besides the +# directory of the doc file) additional '-I' args add to this list +# APA: Don't implicitely search ., to conform with the docs! +# my @T2H_INCLUDE_DIRS = ("."); +my @T2H_INCLUDE_DIRS = (); +$T2H_OPTIONS -> {I} = +{ + type => '=s', + linkage => \@T2H_INCLUDE_DIRS, + verbose => 'append $s to the @include search path', +}; + +# -top_file +# uses file of this name for top-level file +# extension is manipulated appropriately, if necessary. +# If empty, <basename of document>.html is used. +# Typically, you would set this to "index.html". +my $T2H_TOP_FILE = ''; +$T2H_OPTIONS -> {top_file} = +{ + type => '=s', + linkage => \$T2H_TOP_FILE, + verbose => 'use $s as top file, instead of <docname>.html', +}; + + +# -toc_file +# uses file of this name for table of contents. File +# extension is manipulated appropriately, if necessary. +# If empty, <basename of document>_toc.html is used. +my $T2H_TOC_FILE = ''; +$T2H_OPTIONS -> {toc_file} = +{ + type => '=s', + linkage => \$T2H_TOC_FILE, + verbose => 'use $s as ToC file, instead of <docname>_toc.html', +}; + +# -frames +# if set, output two additional files which use HTML 4.0 "frames". 
+my $T2H_FRAMES = 0; +$T2H_OPTIONS -> {frames} = +{ + type => '!', + linkage => \$T2H_FRAMES, + verbose => 'output files which use HTML 4.0 frames (experimental)', + noHelp => 1, +}; + + +# -menu | -nomenu +# if set, show the Texinfo menus +my $T2H_SHOW_MENU = 1; +$T2H_OPTIONS -> {menu} = +{ + type => '!', + linkage => \$T2H_SHOW_MENU, + verbose => 'ouput Texinfo menus', +}; + +# -number | -nonumber +# if set, number sections and show section names and numbers in references +# and menus +my $T2H_NUMBER_SECTIONS = 1; +$T2H_OPTIONS -> {number} = +{ + type => '!', + linkage => \$T2H_NUMBER_SECTIONS, + verbose => 'use numbered sections' +}; + +# if set, and T2H_NUMBER_SECTIONS is set, then use node names in menu +# entries, instead of section names +my $T2H_NODE_NAME_IN_MENU = 0; + +# if set and menu entry equals menu descr, then do not print menu descr. +# Likewise, if node name equals entry name, do not print entry name. +my $T2H_AVOID_MENU_REDUNDANCY = 1; + +# -split section|chapter|none +# if set to 'section' (resp. 'chapter') create one html file per (sub)section +# (resp. chapter) and separate pages for Top, ToC, Overview, Index, +# Glossary, About. +# Otherwise, create a monolithic html file that contains the whole document. 
+#$T2H_SPLIT = 'section'; +my $T2H_SPLIT = ''; +$T2H_OPTIONS -> {split} = +{ + type => '=s', + linkage => \$T2H_SPLIT, + verbose => 'split document on section|chapter else no splitting', +}; + +# -section_navigation|-no-section_navigation +# if set, then navigation panels are printed at the beginning of each section +# and, possibly at the end (depending on whether or not there were more than +# $T2H_WORDS_IN_PAGE words on page +# This is most useful if you do not want to have section navigation +# on -split chapter +my $T2H_SECTION_NAVIGATION = 1; +$T2H_OPTIONS -> {sec_nav} = +{ + type => '!', + linkage => \$T2H_SECTION_NAVIGATION, + verbose => 'output navigation panels for each section', +}; + +# -subdir +# If set, then put result files into the specified directory. +# If not set, then result files are put into the current directory. +#$T2H_SUBDIR = 'html'; +my $T2H_SUBDIR = ''; +$T2H_OPTIONS -> {subdir} = +{ + type => '=s', + linkage => \$T2H_SUBDIR, + verbose => 'put HTML files in directory $s, instead of $cwd', +}; + +# -short_extn +# If this is set, then all HTML files will have extension ".htm" instead of +# ".html". This is helpful when shipping the document to DOS-based systems. +my $T2H_SHORTEXTN = 0; +$T2H_OPTIONS -> {short_ext} = +{ + type => '!', + linkage => \$T2H_SHORTEXTN, + verbose => 'use "htm" extension for output HTML files', +}; + + +# -prefix +# Set the output file prefix, prepended to all .html, .gif and .pl files. 
+# By default, this is the basename of the document +my $T2H_PREFIX = ''; +$T2H_OPTIONS -> {prefix} = +{ + type => '=s', + linkage => \$T2H_PREFIX, + verbose => 'use as prefix for output files, instead of <docname>', +}; + +# -o filename +# If set, generate monolithic document output html into $filename +my $T2H_OUT = ''; +$T2H_OPTIONS -> {out_file} = +{ + type => '=s', + linkage => sub {$T2H_OUT = $_[1]; $T2H_SPLIT = '';}, + verbose => 'if set, all HTML output goes into file $s', +}; + +# -short_ref +#if set cross-references are given without section numbers +my $T2H_SHORT_REF = ''; +$T2H_OPTIONS -> {short_ref} = +{ + type => '!', + linkage => \$T2H_SHORT_REF, + verbose => 'if set, references are without section numbers', +}; + +# -idx_sum +# if value is set, then for each @prinindex $what +# $docu_name_$what.idx is created which contains lines of the form +# $key\t$ref sorted alphabetically (case matters) +my $T2H_IDX_SUMMARY = 0; +$T2H_OPTIONS -> {idx_sum} = +{ + type => '!', + linkage => \$T2H_IDX_SUMMARY, + verbose => 'if set, also output index summary', + noHelp => 1, +}; + +# -def_table +# Use a table construction for @def .... stuff instead +# New Option: 27.07.2000 Karl Heinz Marbaise +my $T2H_DEF_TABLE = 0; +$T2H_OPTIONS -> {def_table} = +{ + type => '!', + linkage => \$T2H_DEF_TABLE, + verbose => 'if set, \@def.. are converted using tables.', + noHelp => 1, +}; + +# -verbose +# if set, chatter about what we are doing +my $T2H_VERBOSE = ''; +$T2H_OPTIONS -> {Verbose} = +{ + type => '!', + linkage => \$T2H_VERBOSE, + verbose => 'print progress info to stdout', +}; + +# -lang +# For page titles use $T2H_WORDS->{$T2H_LANG}->{...} as title. +# To add a new language, supply list of titles (see $T2H_WORDS below). +# and use ISO 639 language codes (see e.g. 
perl module Locale-Codes-1.02 +# for definitions) +# Default's to 'en' if not set or no @documentlanguage is specified +my $T2H_LANG = ''; +$T2H_OPTIONS -> {lang} = +{ + type => '=s', + linkage => sub {SetDocumentLanguage($_[1])}, + verbose => 'use $s as document language (ISO 639 encoding)', +}; + +# -l2h +# if set, uses latex2html for generation of math content +my $T2H_L2H = ''; +$T2H_OPTIONS -> {l2h} = +{ + type => '!', + linkage => \$T2H_L2H, + verbose => 'if set, uses latex2html for @math and @tex', +}; + +###################### +# The following options are only relevant if $T2H_L2H is set +# +# -l2h_l2h +# name/location of latex2html program +my $T2H_L2H_L2H = "latex2html"; +$T2H_OPTIONS -> {l2h_l2h} = +{ + type => '=s', + linkage => \$T2H_L2H_L2H, + verbose => 'program to use for latex2html translation', + noHelp => 1, +}; + +# -l2h_skip +# If set, skips actual call to latex2html: tries to reuse previously generated +# content, instead. +my $T2H_L2H_SKIP = ''; +$T2H_OPTIONS -> {l2h_skip} = +{ + type => '!', + linkage => \$T2H_L2H_SKIP, + verbose => 'if set, tries to reuse previously latex2html output', + noHelp => 1, +}; + +# -l2h_tmp +# If set, l2h uses the specified directory for temporary files. The path +# leading to this directory may not contain a dot (i.e., a "."); +# otherwise, l2h will fail. 
+my $T2H_L2H_TMP = ''; +$T2H_OPTIONS -> {l2h_tmp} = +{ + type => '=s', + linkage => \$T2H_L2H_TMP, + verbose => 'if set, uses $s as temporary latex2html directory', + noHelp => 1, +}; + +# if set, cleans intermediate files (they all have the prefix $doc_l2h_) +# of l2h +my $T2H_L2H_CLEAN = 1; +$T2H_OPTIONS -> {l2h_clean} = +{ + type => '!', + linkage => \$T2H_L2H_CLEAN, + verbose => 'if set, do not keep intermediate latex2html files for later reuse', + noHelp => 1, +}; + +$T2H_OPTIONS -> {D} = +{ + type => '=s', + linkage => sub {$main::value{$_[1]} = 1;}, + verbose => 'equivalent to Texinfo "@set $s 1"', + noHelp => 1, +}; + +$T2H_OPTIONS -> {init_file} = +{ + type => '=s', + linkage => \&LoadInitFile, + verbose => 'load init file $s' +}; + + +############################################################################## +# +# The following can only be set in the init file +# +############################################################################## + +# if set, center @image by default +# otherwise, do not center by default +my $T2H_CENTER_IMAGE = 1; + +# used as identation for block enclosing command @example, etc +# If not empty, must be enclosed in <td></td> +my $T2H_EXAMPLE_INDENT_CELL = '<td> </td>'; +# same as above, only for @small +my $T2H_SMALL_EXAMPLE_INDENT_CELL = '<td> </td>'; +# font size for @small +my $T2H_SMALL_FONT_SIZE = '-1'; + +# if non-empty, and no @..heading appeared in Top node, then +# use this as header for top node/section, otherwise use value of +# @settitle or @shorttitle (in that order) +my $T2H_TOP_HEADING = ''; + +# if set, use this chapter for 'Index' button, else +# use first chapter whose name matches 'index' (case insensitive) +my $T2H_INDEX_CHAPTER = ''; + +# if set and $T2H_SPLIT is set, then split index pages at the next letter +# after they have more than that many entries +my $T2H_SPLIT_INDEX = 100; + +# if set (e.g., to index.html) replace hrefs to this file +# (i.e., to index.html) by ./ +my 
$T2H_HREF_DIR_INSTEAD_FILE = ''; + +######################################################################## +# Language dependencies: +# To add a new language extend T2H_WORDS hash and create $T2H_<...>_WORDS hash +# To redefine one word, simply do: +# $T2H_WORDS->{<language>}->{<word>} = 'whatever' in your personal init file. +# +my $T2H_WORDS_EN = +{ + # titles of pages + 'ToC_Title' => 'Table of Contents', + 'Overview_Title' => 'Short Table of Contents', + 'Index_Title' => 'Index', + 'About_Title' => 'About this document', + 'Footnotes_Title' => 'Footnotes', + 'See' => 'See', + 'see' => 'see', + 'section' => 'section', + # If necessary, we could extend this as follows: + # # text for buttons + # 'Top_Button' => 'Top', + # 'ToC_Button' => 'Contents', + # 'Overview_Button' => 'Overview', + # 'Index_button' => 'Index', + # 'Back_Button' => 'Back', + # 'FastBack_Button' => 'FastBack', + # 'Prev_Button' => 'Prev', + # 'Up_Button' => 'Up', + # 'Next_Button' => 'Next', + # 'Forward_Button' =>'Forward', + # 'FastWorward_Button' => 'FastForward', + # 'First_Button' => 'First', + # 'Last_Button' => 'Last', + # 'About_Button' => 'About' +}; + +my $T2H_WORDS_DE = +{ + 'ToC_Title' => 'Inhaltsverzeichniss', + 'Overview_Title' => 'Kurzes Inhaltsverzeichniss', + 'Index_Title' => 'Index', + 'About_Title' => 'Über dieses Dokument', + 'Footnotes_Title' => 'Fußnoten', + 'See' => 'Siehe', + 'see' => 'siehe', + 'section' => 'Abschnitt', +}; + +my $T2H_WORDS_NL = +{ + 'ToC_Title' => 'Inhoudsopgave', + 'Overview_Title' => 'Korte inhoudsopgave', + 'Index_Title' => 'Index', #Not sure ;-) + 'About_Title' => 'No translation available!', #No translation available! + 'Footnotes_Title' => 'No translation available!', #No translation available! 
+ 'See' => 'Zie', + 'see' => 'zie', + 'section' => 'sectie', +}; + +my $T2H_WORDS_ES = +{ + 'ToC_Title' => 'índice General', + 'Overview_Title' => 'Resumen del Contenido', + 'Index_Title' => 'Index', #Not sure ;-) + 'About_Title' => 'No translation available!', #No translation available! + 'Footnotes_Title' => 'Fußnoten', + 'See' => 'Véase', + 'see' => 'véase', + 'section' => 'sección', +}; + +my $T2H_WORDS_NO = +{ + 'ToC_Title' => 'Innholdsfortegnelse', + 'Overview_Title' => 'Kort innholdsfortegnelse', + 'Index_Title' => 'Indeks', #Not sure ;-) + 'About_Title' => 'No translation available!', #No translation available! + 'Footnotes_Title' => 'No translation available!', + 'See' => 'Se', + 'see' => 'se', + 'section' => 'avsnitt', +}; + +my $T2H_WORDS_PT = +{ + 'ToC_Title' => 'Sumário', + 'Overview_Title' => 'Breve Sumário', + 'Index_Title' => 'Índice', #Not sure ;-) + 'About_Title' => 'No translation available!', #No translation available! + 'Footnotes_Title' => 'No translation available!', + 'See' => 'Veja', + 'see' => 'veja', + 'section' => 'Seção', +}; + +my $T2H_WORDS_FR = +{ + 'ToC_Title' => 'Table des mati&egrav;res', + 'Overview_Title' => 'Résumée du contenu', + 'Index_Title' => 'Index', + 'About_Title' => 'A propos de ce document', + 'Footnotes_Title' => 'Notes de bas de page', + 'See' => 'Voir', + 'see' => 'voir', + 'section' => 'section', +}; + +my $T2H_WORDS = +{ + 'en' => $T2H_WORDS_EN, + 'de' => $T2H_WORDS_DE, + 'nl' => $T2H_WORDS_NL, + 'es' => $T2H_WORDS_ES, + 'no' => $T2H_WORDS_NO, + 'pt' => $T2H_WORDS_PT, + 'fr' => $T2H_WORDS_FR +}; + +my @MONTH_NAMES_EN = + ( + 'January', 'February', 'March', 'April', 'May', + 'June', 'July', 'August', 'September', 'October', + 'November', 'December' + ); + +my @MONTH_NAMES_DE = + ( + 'Januar', 'Februar', 'März', 'April', 'Mai', + 'Juni', 'Juli', 'August', 'September', 'Oktober', + 'November', 'Dezember' + ); + +my @MONTH_NAMES_NL = + ( + 'Januari', 'Februari', 'Maart', 'April', 'Mei', + 'Juni', 'Juli', 'Augustus', 
'September', 'Oktober', + 'November', 'December' + ); + +my @MONTH_NAMES_ES = + ( + 'enero', 'febrero', 'marzo', 'abril', 'mayo', + 'junio', 'julio', 'agosto', 'septiembre', 'octubre', + 'noviembre', 'diciembre' + ); + +my @MONTH_NAMES_NO = + ( + + 'januar', 'februar', 'mars', 'april', 'mai', + 'juni', 'juli', 'august', 'september', 'oktober', + 'november', 'desember' + ); + +my @MONTH_NAMES_PT = + ( + 'Janeiro', 'Fevereiro', 'Março', 'Abril', 'Maio', + 'Junho', 'Julho', 'Agosto', 'Setembro', 'Outubro', + 'Novembro', 'Dezembro' + ); + +my @MONTH_NAMES_FR = +( + 'Janvier', 'Février', 'Mars', 'Avril', 'Mai', + 'Juin', 'Juillet', 'Août', 'Septembre', 'Octobre', + 'Novembre', 'Décembre' +); + + + +my $MONTH_NAMES = +{ + 'en' => \@MONTH_NAMES_EN, + 'de' => \@MONTH_NAMES_DE, + 'es' => \@MONTH_NAMES_ES, + 'nl' => \@MONTH_NAMES_NL, + 'no' => \@MONTH_NAMES_NO, + 'pt' => \@MONTH_NAMES_PT, + 'fr' => \@MONTH_NAMES_FR +}; + +######################################################################## +# Control of Page layout: +# You can make changes of the Page layout at two levels: +# 1.) For small changes, it is often enough to change the value of +# some global string/hash/array variables +# 2.) For larger changes, reimplement one of the T2H_DEFAULT_<fnc>* routines, +# give them another name, and assign them to the respective +# $T2H_<fnc> variable. + +# As a general interface, the hashes T2H_HREF, T2H_NAME, T2H_NODE hold +# href, html-name, node-name of +# This -- current section (resp. html page) +# Top -- top page ($T2H_TOP_FILE) +# Contents -- Table of contents +# Overview -- Short table of contents +# Index -- Index page +# About -- page which explain "navigation buttons" +# First -- first node +# Last -- last node +# +# Whether or not the following hash values are set, depends on the context +# (all values are w.r.t. 
'This' section) +# Next -- next node of texinfo +# Prev -- previous node of texinfo +# Up -- up node of texinfo +# Forward -- next node in reading order +# Back -- previous node in reading order +# FastForward -- if leaf node, up and next, else next node +# FastBack -- if leaf node, up and prev, else prev node +# +# Furthermore, the following global variables are set: +# $T2H_THISDOC{title} -- title as set by @settitle +# $T2H_THISDOC{fulltitle} -- full title as set by @title... +# $T2H_THISDOC{subtitle} -- subtitle as set by @subtitle +# $T2H_THISDOC{author} -- author as set by @author +# +# and pointers to arrays of lines which need to be printed by t2h_print_lines +# $T2H_OVERVIEW -- lines of short table of contents +# $T2H_TOC -- lines of table of contents +# $T2H_TOP -- lines of Top texinfo node +# $T2H_THIS_SECTION -- lines of 'This' section + +# +# The following subs control the layout: +# +my $T2H_print_section = \&T2H_DEFAULT_print_section; +my $T2H_print_Top_header = \&T2H_DEFAULT_print_Top_header; +my $T2H_print_Top_footer = \&T2H_DEFAULT_print_Top_footer; +my $T2H_print_Top = \&T2H_DEFAULT_print_Top; +my $T2H_print_Toc = \&T2H_DEFAULT_print_Toc; +my $T2H_print_Overview = \&T2H_DEFAULT_print_Overview; +my $T2H_print_Footnotes = \&T2H_DEFAULT_print_Footnotes; +my $T2H_print_About = \&T2H_DEFAULT_print_About; +my $T2H_print_misc_header = \&T2H_DEFAULT_print_misc_header; +my $T2H_print_misc_footer = \&T2H_DEFAULT_print_misc_footer; +my $T2H_print_misc = \&T2H_DEFAULT_print_misc; +my $T2H_print_chapter_header = \&T2H_DEFAULT_print_chapter_header; +my $T2H_print_chapter_footer = \&T2H_DEFAULT_print_chapter_footer; +my $T2H_print_page_head = \&T2H_DEFAULT_print_page_head; +my $T2H_print_page_foot = \&T2H_DEFAULT_print_page_foot; +my $T2H_print_head_navigation = \&T2H_DEFAULT_print_head_navigation; +my $T2H_print_foot_navigation = \&T2H_DEFAULT_print_foot_navigation; +my $T2H_button_icon_img = \&T2H_DEFAULT_button_icon_img; +my 
$T2H_print_navigation = \&T2H_DEFAULT_print_navigation; +my $T2H_about_body = \&T2H_DEFAULT_about_body; +my $T2H_print_frame = \&T2H_DEFAULT_print_frame; +my $T2H_print_toc_frame = \&T2H_DEFAULT_print_toc_frame; + +######################################################################## +# Layout for html for every sections +# +sub T2H_DEFAULT_print_section +{ + my $fh = shift; + local $T2H_BUTTONS = \@T2H_SECTION_BUTTONS; + &$T2H_print_head_navigation($fh) if $T2H_SECTION_NAVIGATION; + my $nw = t2h_print_lines($fh); + if (defined $T2H_SPLIT + and ($T2H_SPLIT eq 'section' && $T2H_SECTION_NAVIGATION)) + { + &$T2H_print_foot_navigation($fh, $nw); + } + else + { + print $fh '<HR SIZE="6">' . "\n"; + } +} + +################################################################### +# Layout of top-page I recommend that you use @ifnothtml, @ifhtml, +# @html within the Top texinfo node to specify content of top-level +# page. +# +# If you enclose everything in @ifnothtml, then title, subtitle, +# author and overview is printed +# T2H_HREF of Next, Prev, Up, Forward, Back are not defined +# if $T2H_SPLIT then Top page is in its own html file +sub T2H_DEFAULT_print_Top_header +{ + &$T2H_print_page_head(@_) if $T2H_SPLIT; + t2h_print_label(@_); # this needs to be called, otherwise no label set + &$T2H_print_head_navigation(@_); +} +sub T2H_DEFAULT_print_Top_footer +{ + &$T2H_print_foot_navigation(@_); + &$T2H_print_page_foot(@_) if $T2H_SPLIT; +} +sub T2H_DEFAULT_print_Top +{ + my $fh = shift; + + # for redefining navigation buttons use: + # local $T2H_BUTTONS = [...]; + # as it is, 'Top', 'Contents', 'Index', 'About' are printed + local $T2H_BUTTONS = \@T2H_MISC_BUTTONS; + &$T2H_print_Top_header($fh); + if ($T2H_THIS_SECTION) + { + # if top-level node has content, then print it with extra header + print $fh "<H1>$T2H_NAME{Top}</H1>\n" + unless ($T2H_HAS_TOP_HEADING); + t2h_print_lines($fh, $T2H_THIS_SECTION) + } + else + { + # top-level node is fully enclosed in @ifnothtml + # 
print fulltitle, subtitle, author, Overview + print $fh + "<CENTER>\n<H1>" . + join("</H1>\n<H1>", split(/\n/, $T2H_THISDOC{fulltitle})) . + "</H1>\n"; + print $fh "<H2>$T2H_THISDOC{subtitle}</H2>\n" if $T2H_THISDOC{subtitle}; + print $fh "$T2H_THISDOC{author}\n" if $T2H_THISDOC{author}; + print $fh <<EOT; +</CENTER> +<HR> +<P></P> +<H2> Overview: </H2> +<BLOCKQUOTE> +EOT + t2h_print_lines($fh, $T2H_OVERVIEW); + print $fh "</BLOCKQUOTE>\n"; + } + &$T2H_print_Top_footer($fh); +} + +################################################################### +# Layout of Toc, Overview, and Footnotes pages +# By default, we use "normal" layout +# T2H_HREF of Next, Prev, Up, Forward, Back, etc are not defined +# use: local $T2H_BUTTONS = [...] to redefine navigation buttons +sub T2H_DEFAULT_print_Toc +{ + return &$T2H_print_misc(@_); +} +sub T2H_DEFAULT_print_Overview +{ + return &$T2H_print_misc(@_); +} +sub T2H_DEFAULT_print_Footnotes +{ + return &$T2H_print_misc(@_); +} +sub T2H_DEFAULT_print_About +{ + return &$T2H_print_misc(@_); +} + +sub T2H_DEFAULT_print_misc_header +{ + &$T2H_print_page_head(@_) if $T2H_SPLIT; + # this needs to be called, otherwise, no labels are set + t2h_print_label(@_); + &$T2H_print_head_navigation(@_); +} +sub T2H_DEFAULT_print_misc_footer +{ + &$T2H_print_foot_navigation(@_); + &$T2H_print_page_foot(@_) if $T2H_SPLIT; +} +sub T2H_DEFAULT_print_misc +{ + my $fh = shift; + local $T2H_BUTTONS = \@T2H_MISC_BUTTONS; + &$T2H_print_misc_header($fh); + print $fh "<H1>$T2H_NAME{This}</H1>\n"; + t2h_print_lines($fh); + &$T2H_print_misc_footer($fh); +} + +################################################################### +# chapter_header and chapter_footer are only called if +# T2H_SPLIT eq 'chapter' +# chapter_header: after print_page_head, before print_section +# chapter_footer: after print_section of last section, before print_page_foot +# +# If you want to get rid of navigation stuff after each section, +# redefine print_section such that it does not 
call print_navigation, +# and put print_navigation into print_chapter_header +@T2H_CHAPTER_BUTTONS = + ( + 'FastBack', 'FastForward', ' ', + ' ', ' ', ' ', ' ', + 'Top', 'Contents', 'Index', 'About', + ); + +sub T2H_DEFAULT_print_chapter_header +{ + # nothing to do there, by default + if (! $T2H_SECTION_NAVIGATION) + { + my $fh = shift; + local $T2H_BUTTONS = \@T2H_CHAPTER_BUTTONS; + &$T2H_print_navigation($fh); + print $fh "\n<HR SIZE=2>\n"; + } +} + +sub T2H_DEFAULT_print_chapter_footer +{ + local $T2H_BUTTONS = \@T2H_CHAPTER_BUTTONS; + &$T2H_print_navigation(@_); +} +################################################################### + +sub pretty_date { + my($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst); + + ($sec, $min, $hour, $mday, $mon, $year, $wday, $yday, $isdst) = localtime(time); + $year += ($year < 70) ? 2000 : 1900; + # obachman: Let's do it as the Americans do + return($MONTH_NAMES->{$T2H_LANG}[$mon] . ", " . $mday . " " . $year); +} + + +################################################################### +# Layout of standard header and footer +# + +# This init routine is called at the beginning of pass5 before first +# output is generated. +sub T2H_InitGlobals +{ + # Set the default body text, inserted between <BODY ... > + $T2H_BODYTEXT = 'LANG="' . $T2H_LANG . 
'" BGCOLOR="#FFFFFF" TEXT="#000000" LINK="#0000FF" VLINK="#800080" ALINK="#FF0000"'; + # text inserted after <BODY ...> + $T2H_AFTER_BODY_OPEN = ''; + #text inserted before </BODY> + $T2H_PRE_BODY_CLOSE = ''; + # this is used in footer + $T2H_ADDRESS = "<I>$T2H_USER</I> " if $T2H_USER; + $T2H_ADDRESS .= "on <I>$T2H_TODAY</I>"; + # this is added inside <HEAD></HEAD> after <TITLE> and some META NAME stuff + # can be used for <style> <script>, <meta> tags + $T2H_EXTRA_HEAD = ''; +} + +sub T2H_DEFAULT_print_page_head +{ + my $fh = shift; + my $longtitle = "$T2H_THISDOC{title}"; + $longtitle .= ": $T2H_NAME{This}" if exists $T2H_NAME{This}; + print $fh <<EOT; +$T2H_DOCTYPE +<HTML> +<!-- Created on $T2H_TODAY by $THISPROG --> +<!-- +$T2H_AUTHORS +--> +<HEAD> +<TITLE>$longtitle + + + + + + +$T2H_EXTRA_HEAD + + + +$T2H_AFTER_BODY_OPEN +EOT +} + +sub T2H_DEFAULT_print_page_foot +{ + my $fh = shift; + print $fh < + +This document was generated +by $T2H_ADDRESS +using texi2html + +$T2H_PRE_BODY_CLOSE + + +EOT +} + +################################################################### +# Layout of navigation panel + +# if this is set, then a vertical navigation panel is used +my $T2H_VERTICAL_HEAD_NAVIGATION = 0; +sub T2H_DEFAULT_print_head_navigation +{ + my $fh = shift; + if ($T2H_VERTICAL_HEAD_NAVIGATION) + { + print $fh < + + +EOT + } + &$T2H_print_navigation($fh, $T2H_VERTICAL_HEAD_NAVIGATION); + if ($T2H_VERTICAL_HEAD_NAVIGATION) + { + print $fh < + +EOT + } + elsif (defined $T2H_SPLIT + and ($T2H_SPLIT eq 'section')) + { + print $fh "
\n"; + } +} + +# Specifies the minimum page length required before a navigation panel +# is placed at the bottom of a page (the default is that of latex2html) +# T2H_THIS_WORDS_IN_PAGE holds number of words of current page +my $T2H_WORDS_IN_PAGE = 300; +sub T2H_DEFAULT_print_foot_navigation +{ + my $fh = shift; + my $nwords = shift; + if ($T2H_VERTICAL_HEAD_NAVIGATION) + { + print $fh < + + +EOT + } + print $fh "
\n"; + &$T2H_print_navigation($fh) if (defined $nwords + and $nwords >= $T2H_WORDS_IN_PAGE) +} + +###################################################################### +# navigation panel +# +# specify in this array which "buttons" should appear in which order +# in the navigation panel for sections; use ' ' for empty buttons (space) +@T2H_SECTION_BUTTONS = + ( + 'Back', 'Forward', ' ', 'FastBack', 'Up', 'FastForward', + ' ', ' ', ' ', ' ', + 'Top', 'Contents', 'Index', 'About', + ); + +# buttons for misc stuff +@T2H_MISC_BUTTONS = ('Top', 'Contents', 'Index', 'About'); + +# insert here name of icon images for buttons +# Icons are used, if $T2H_ICONS and resp. value are set +%T2H_ACTIVE_ICONS = + ( + 'Top', '', + 'Contents', '', + 'Overview', '', + 'Index', '', + 'Back', '', + 'FastBack', '', + 'Prev', '', + 'Up', '', + 'Next', '', + 'Forward', '', + 'FastForward', '', + 'About' , '', + 'First', '', + 'Last', '', + ' ', '' + ); + +# insert here name of icon images for these, if button is inactive +%T2H_PASSIVE_ICONS = + ( + 'Top', '', + 'Contents', '', + 'Overview', '', + 'Index', '', + 'Back', '', + 'FastBack', '', + 'Prev', '', + 'Up', '', + 'Next', '', + 'Forward', '', + 'FastForward', '', + 'About', '', + 'First', '', + 'Last', '', + ); + +# how to create IMG tag +sub T2H_DEFAULT_button_icon_img +{ + my $button = shift; + my $icon = shift; + my $name = shift; + return qq{$button: $name}; +} + +# Names of text as alternative for icons +%T2H_NAVIGATION_TEXT = + ( + 'Top', 'Top', + 'Contents', 'Contents', + 'Overview', 'Overview', + 'Index', 'Index', + ' ', '   ', + 'Back', ' < ', + 'FastBack', ' << ', + 'Prev', 'Prev', + 'Up', ' Up ', + 'Next', 'Next', + 'Forward', ' > ', + 'FastForward', ' >> ', + 'About', ' ? 
', + 'First', ' |< ', + 'Last', ' >| ' + ); + +sub T2H_DEFAULT_print_navigation +{ + my $fh = shift; + my $vertical = shift; + my $spacing = 1; + print $fh "\n"; + + print $fh "" unless $vertical; + for $button (@$T2H_BUTTONS) + { + print $fh qq{\n} if $vertical; + print $fh qq{\n"; + print $fh "\n" if $vertical; + } + print $fh "" unless $vertical; + print $fh "
}; + + if (ref($button) eq 'CODE') + { + &$button($fh, $vertical); + } + elsif ($button eq ' ') + { # handle space button + print $fh + $T2H_ICONS && $T2H_ACTIVE_ICONS{' '} ? + &$T2H_button_icon_img($button, $T2H_ACTIVE_ICONS{' '}) : + $T2H_NAVIGATION_TEXT{' '}; + next; + } + elsif ($T2H_HREF{$button}) + { # button is active + print $fh + $T2H_ICONS && $T2H_ACTIVE_ICONS{$button} ? # use icon ? + t2h_anchor('', $T2H_HREF{$button}, # yes + &$T2H_button_icon_img($button, + $T2H_ACTIVE_ICONS{$button}, + $T2H_NAME{$button})) + : # use text + "[" . + t2h_anchor('', $T2H_HREF{$button}, $T2H_NAVIGATION_TEXT{$button}) . + "]"; + } + else + { # button is passive + print $fh + $T2H_ICONS && $T2H_PASSIVE_ICONS{$button} ? + &$T2H_button_icon_img($button, + $T2H_PASSIVE_ICONS{$button}, + $T2H_NAME{$button}) : + + "[" . $T2H_NAVIGATION_TEXT{$button} . "]"; + } + print $fh "
\n"; +} + +###################################################################### +# Frames: this is from "Richard Y. Kim" +# Should be improved to be more conforming to other _print* functions + +sub T2H_DEFAULT_print_frame +{ + my $fh = shift; + print $fh < +$T2H_THISDOC{title} + + + + + +EOT +} + +sub T2H_DEFAULT_print_toc_frame +{ + my $fh = shift; + &$T2H_print_page_head($fh); + print $fh <Content +EOT + print $fh map {s/HREF=/target=\"main\" HREF=/; $_;} @stoc_lines; + print $fh "\n"; +} + +###################################################################### +# About page +# + +# T2H_PRE_ABOUT might be a function +my $T2H_PRE_ABOUT = <texi2html +

+EOT +my $T2H_AFTER_ABOUT = ''; + +sub T2H_DEFAULT_about_body +{ + my $about; + if (ref($T2H_PRE_ABOUT) eq 'CODE') + { + $about = &$T2H_PRE_ABOUT(); + } + else + { + $about = $T2H_PRE_ABOUT; + } + $about .= <

+ + + + + + + +EOT + + for $button (@T2H_SECTION_BUTTONS) + { + next if $button eq ' ' || ref($button) eq 'CODE'; + $about .= < + + + + +EOT + } + + $about .= < +

+ where the Example assumes that the current position + is at Subsubsection One-Two-Three of a document of + the following structure:

+
    +
  • 1. Section One +
      +
    • 1.1 Subsection One-One +
        +
      • ...
      • +
      +
    • 1.2 Subsection One-Two +
        +
      • 1.2.1 Subsubsection One-Two-One
      • +
      • 1.2.2 Subsubsection One-Two-Two
      • +
      • 1.2.3 Subsubsection One-Two-Three     + <== Current Position
      • +
      • 1.2.4 Subsubsection One-Two-Four
      • +
      +
    • +
    • 1.3 Subsection One-Three +
        +
      • ...
      • +
      +
    • +
    • 1.4 Subsection One-Four
    • +
    +
  • +
+$T2H_AFTER_ABOUT +EOT + return $about; +} + + +%T2H_BUTTONS_GOTO = + ( + 'Top', 'cover (top) of document', + 'Contents', 'table of contents', + 'Overview', 'short table of contents', + 'Index', 'concept index', + 'Back', 'previous section in reading order', + 'FastBack', 'beginning of this chapter or previous chapter', + 'Prev', 'previous section same level', + 'Up', 'up section', + 'Next', 'next section same level', + 'Forward', 'next section in reading order', + 'FastForward', 'next chapter', + 'About' , 'this page', + 'First', 'first section in reading order', + 'Last', 'last section in reading order', + ); + +%T2H_BUTTONS_EXAMPLE = + ( + 'Top', '   ', + 'Contents', '   ', + 'Overview', '   ', + 'Index', '   ', + 'Back', '1.2.2', + 'FastBack', '1', + 'Prev', '1.2.2', + 'Up', '1.2', + 'Next', '1.2.4', + 'Forward', '1.2.4', + 'FastForward', '2', + 'About', '   ', + 'First', '1.', + 'Last', '1.2.4', + ); + + +###################################################################### +# from here on, its l2h init stuff +# + +## initialization for latex2html as for Singular manual generation +## obachman 3/99 + +# +# Options controlling Titles, File-Names, Tracing and Sectioning +# +$TITLE = ''; + +$SHORTEXTN = 0; + +$LONG_TITLES = 0; + +$DESTDIR = ''; # should be overwritten by cmd-line argument + +$NO_SUBDIR = 0; # should be overwritten by cmd-line argument + +$PREFIX = ''; # should be overwritten by cmd-line argument + +$AUTO_PREFIX = 0; # this is needed, so that prefix settings are used + +$AUTO_LINK = 0; + +$SPLIT = 0; + +$MAX_LINK_DEPTH = 0; + +$TMP = ''; # should be overwritten by cmd-line argument + +$DEBUG = 0; + +$VERBOSE = 1; + +# +# Options controlling Extensions and Special Features +# +$HTML_VERSION = "3.2"; + +$TEXDEFS = 1; # we absolutely need that + +$EXTERNAL_FILE = ''; + +$SCALABLE_FONTS = 1; + +$NO_SIMPLE_MATH = 1; + +$LOCAL_ICONS = 1; + +$SHORT_INDEX = 0; + +$NO_FOOTNODE = 1; + +$ADDRESS = ''; + +$INFO = ''; + +# +# Switches controlling Image 
Generation +# +$ASCII_MODE = 0; + +$NOLATEX = 0; + +$EXTERNAL_IMAGES = 0; + +$PS_IMAGES = 0; + +$NO_IMAGES = 0; + +$IMAGES_ONLY = 0; + +$REUSE = 2; + +$ANTI_ALIAS = 1; + +$ANTI_ALIAS_TEXT = 1; + +# +#Switches controlling Navigation Panels +# +$NO_NAVIGATION = 1; +$ADDRESS = ''; +$INFO = 0; # 0 = do not make a "About this document..." section + +# +#Switches for Linking to other documents +# +# currently -- we don't care + +$MAX_SPLIT_DEPTH = 0; # Stop making separate files at this depth + +$MAX_LINK_DEPTH = 0; # Stop showing child nodes at this depth + +$NOLATEX = 0; # 1 = do not pass unknown environments to Latex + +$EXTERNAL_IMAGES = 0; # 1 = leave the images outside the document + +$ASCII_MODE = 0; # 1 = do not use any icons or internal images + +# 1 = use links to external postscript images rather than inlined bitmap +# images. +$PS_IMAGES = 0; +$SHOW_SECTION_NUMBERS = 0; + +### Other global variables ############################################### +$CHILDLINE = ""; + +# This is the line width measured in pixels and it is used to right justify +# equations and equation arrays; +$LINE_WIDTH = 500; + +# Used in conjunction with AUTO_NAVIGATION +$WORDS_IN_PAGE = 300; + +# Affects ONLY the way accents are processed +$default_language = 'english'; + +# The value of this variable determines how many words to use in each +# title that is added to the navigation panel (see below) +# +$WORDS_IN_NAVIGATION_PANEL_TITLES = 0; + +# This number will determine the size of the equations, special characters, +# and anything which will be converted into an inlined image +# *except* "image generating environments" such as "figure", "table" +# or "minipage". +# Effective values are those greater than 0. +# Sensible values are between 0.1 - 4. +$MATH_SCALE_FACTOR = 1.5; + +# This number will determine the size of +# image generating environments such as "figure", "table" or "minipage". +# Effective values are those greater than 0. +# Sensible values are between 0.1 - 4. 
+$FIGURE_SCALE_FACTOR = 1.6; + + +# If both of the following two variables are set then the "Up" button +# of the navigation panel in the first node/page of a converted document +# will point to $EXTERNAL_UP_LINK. $EXTERNAL_UP_TITLE should be set +# to some text which describes this external link. +$EXTERNAL_UP_LINK = ""; +$EXTERNAL_UP_TITLE = ""; + +# If this is set then the resulting HTML will look marginally better if viewed +# with Netscape. +$NETSCAPE_HTML = 1; + +# Valid paper sizes are "letter", "legal", "a4", "a3", "a2" and "a0". +# The paper size has no effect other than on the time it takes to create inlined +# images and on whether large images can be created at all, i.e., +# - larger paper sizes *MAY* help with large image problems +# - smaller paper sizes are quicker to handle +$PAPERSIZE = "a4"; + +# Replace "english" with another language in order to tell LaTeX2HTML that you +# want some generated section titles (e.g., "Table of Contents" or "References") +# to appear in a different language. Currently only "english" and "french" +# are supported, but it is very easy to add your own. See the example in the +# file "latex2html.config" +$TITLES_LANGUAGE = "english"; + +1; # This must be the last non-comment line + +# End File texi2html.init +###################################################################### + + +require "$ENV{T2H_HOME}/texi2html.init" + if ($0 =~ /\.pl$/ && + -e "$ENV{T2H_HOME}/texi2html.init" && -r "$ENV{T2H_HOME}/texi2html.init"); + +#+++############################################################################ +# # +# Initialization # +# Pasted content of File $(srcdir)/MySimple.pm: Command-line processing # +# # +#---############################################################################ + +# leave this within comments, and keep the require statement +# This way, you can directly run texi2html.pl, if $ENV{T2H_HOME}/texi2html.init +# exists. + +# +package Getopt::MySimple; + +# Name: +# Getopt::MySimple. 
+# +# Documentation: +# POD-style (incomplete) documentation is in file MySimple.pod +# +# Tabs: +# 4 spaces || die. +# +# Author: +# Ron Savage rpsavage@ozemail.com.au. +# 1.00 19-Aug-97 Initial version. +# 1.10 13-Oct-97 Add arrays of switches (eg '=s@'). +# 1.20 3-Dec-97 Add 'Help' on a per-switch basis. +# 1.30 11-Dec-97 Change 'Help' to 'verbose'. Make all hash keys lowercase. +# 1.40 10-Nov-98 Change width of help report. Restructure tests. +# 1-Jul-00 Modifications for Texi2html + +# -------------------------------------------------------------------------- +# Locally modified by obachman (Display type instead of env, order by cmp) +# $Id: texi2html,v 1.3 2005-06-19 03:20:26 blp Exp $ + +# use strict; +# no strict 'refs'; + +use vars qw(@EXPORT @EXPORT_OK @ISA); +use vars qw($fieldWidth $opt $VERSION); + +use Exporter(); +use Getopt::Long; + +@ISA = qw(Exporter); +@EXPORT = qw(); +@EXPORT_OK = qw($opt); # An alias for $self -> {'opt'}. + +# -------------------------------------------------------------------------- + +$fieldWidth = 20; +$VERSION = '1.41'; + +# -------------------------------------------------------------------------- + +sub byOrder +{ + my($self) = @_; + + return uc($a) cmp (uc($b)); +} + +# -------------------------------------------------------------------------- + +sub dumpOptions +{ + my($self) = @_; + + print 'Option', ' ' x ($fieldWidth - length('Option') ), "Value\n"; + + for (sort byOrder keys(%{$self -> {'opt'} }) ) + { + print "-$_", ' ' x ($fieldWidth - (1 + length) ), "${$self->{'opt'} }{$_}\n"; + } + + print "\n"; + +} # End of dumpOptions. + +# -------------------------------------------------------------------------- +# Return: +# 0 -> Error. +# 1 -> Ok. + +sub getOptions +{ + push(@_, 0) if ($#_ == 2); # Default for $ignoreCase is 0. + push(@_, 1) if ($#_ == 3); # Default for $helpThenExit is 1. 
+ + my($self, $default, $helpText, $versionText, + $helpThenExit, $versionThenExit, $ignoreCase) = @_; + + $helpThenExit = 1 unless (defined($helpThenExit)); + $versionThenExit = 1 unless (defined($versionThenExit)); + $ignoreCase = 0 unless (defined($ignoreCase)); + + $self -> {'default'} = $default; + $self -> {'helpText'} = $helpText; + $self -> {'versionText'} = $versionText; + $Getopt::Long::ignorecase = $ignoreCase; + + unless (defined($self -> {'default'}{'help'})) + { + $self -> {'default'}{'help'} = + { + type => ':i', + default => '', + linkage => sub {$self->helpOptions($_[1]); exit (0) if $helpThenExit;}, + verbose => "print help and exit" + }; + } + + unless (defined($self -> {'default'}{'version'})) + { + $self -> {'default'}{'version'} = + { + type => '', + default => '', + linkage => sub {print $self->{'versionText'}; exit (0) if $versionThenExit;}, + verbose => "print version and exit" + }; + } + + for (keys(%{$self -> {'default'} }) ) + { + my $type = ${$self -> {'default'} }{$_}{'type'}; + push(@{$self -> {'type'} }, "$_$type"); + $self->{'opt'}->{$_} = ${$self -> {'default'} }{$_}{'linkage'} + if ${$self -> {'default'} }{$_}{'linkage'}; + } + + my($result) = &GetOptions($self -> {'opt'}, @{$self -> {'type'} }); + + return $result unless $result; + + for (keys(%{$self -> {'default'} }) ) + { + if (! defined(${$self -> {'opt'} }{$_})) #{ + { + ${$self -> {'opt'} }{$_} = ${$self -> {'default'} }{$_}{'default'}; + } + } + + $result; +} # End of getOptions. 
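+
+# Usage sketch for getOptions (illustrative only, not part of the original
+# module; the option name 'quiet', the variable $quiet and the usage and
+# version texts are hypothetical placeholders):
+#
+#   my $quiet = 0;
+#   my $spec =
+#   {
+#     quiet => { type => '!', linkage => \$quiet, verbose => 'suppress messages' },
+#   };
+#   my $options = new Getopt::MySimple;
+#   $options -> getOptions($spec, "Usage: prog [options] file\n", "prog 1.0\n")
+#     or die "bad command line\n";
+#
+# Called with these three arguments, $helpThenExit and $versionThenExit
+# default to 1 and $ignoreCase defaults to 0; the 'help' and 'version'
+# entries are added to the option table automatically when absent.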
+ +# -------------------------------------------------------------------------- + +sub helpOptions +{ + my($self) = shift; + my($noHelp) = shift; + $noHelp = 0 unless $noHelp; + my($optwidth, $typewidth, $defaultwidth, $maxlinewidth, $valind, $valwidth) + = (10, 5, 9, 78, 4, 11); + + print "$self->{'helpText'}" if ($self -> {'helpText'}); + + print ' Option', ' ' x ($optwidth - length('Option') -1 ), + 'Type', ' ' x ($typewidth - length('Type') + 1), + 'Default', ' ' x ($defaultwidth - length('Default') ), + "Description\n"; + + for (sort byOrder keys(%{$self -> {'default'} }) ) + { + my($line, $help, $option, $val); + $option = $_; + next if ${$self->{'default'} }{$_}{'noHelp'} && ${$self->{'default'} }{$_}{'noHelp'} > $noHelp; + #$line = " -$_" . ' ' x ($optwidth - (2 + length) ) . + # "${$self->{'default'} }{$_}{'type'} ". + # ' ' x ($typewidth - (1+length(${$self -> {'default'} }{$_}{'type'}) )); + $line = " -$_" . "${$self->{'default'} }{$_}{'type'}". + ' ' x ($typewidth - (1+length(${$self -> {'default'} }{$_}{'type'}) )); + + $val = ${$self->{'default'} }{$_}{'linkage'}; + if ($val) + { + if (ref($val) eq 'SCALAR') + { + $val = $$val; + } + else + { + $val = ''; + } + } + else + { + $val = ${$self->{'default'} }{$_}{'default'}; + } + $line .= "$val "; + $line .= ' ' x ($optwidth + $typewidth + $defaultwidth + 1 - length($line)); + + if (defined(${$self -> {'default'} }{$_}{'verbose'}) && + ${$self -> {'default'} }{$_}{'verbose'} ne '') + { + $help = "${$self->{'default'} }{$_}{'verbose'}"; + } + else + { + $help = ' '; + } + if ((length("$line") + length($help)) < $maxlinewidth) + { + print $line , $help, "\n"; + } + else + { + print $line, "\n", ' ' x $valind, $help, "\n"; + } + for $val (sort byOrder keys(%{${$self->{'default'}}{$option}{'values'}})) + { + print ' ' x ($valind + 2); + print $val, ' ', ' ' x ($valwidth - length($val) - 2); + print ${$self->{'default'}}{$option}{'values'}{$val}, "\n"; + } + } + + print <| ! 
no argument: variable is set to 1 on -foo (or, to 0 on -nofoo) + =s | :s mandatory (or, optional) string argument + =i | :i mandatory (or, optional) integer argument +EOT +} # End of helpOptions. + +#------------------------------------------------------------------- + +sub new +{ + my($class) = @_; + my($self) = {}; + $self -> {'default'} = {}; + $self -> {'helpText'} = ''; + $self -> {'opt'} = {}; + $opt = $self -> {'opt'}; # An alias for $self -> {'opt'}. + $self -> {'type'} = (); + + return bless $self, $class; + +} # End of new. + +# -------------------------------------------------------------------------- + +1; + +# End MySimple.pm + +require "$ENV{T2H_HOME}/MySimple.pm" + if ($0 =~ /\.pl$/ && + -e "$ENV{T2H_HOME}/MySimple.pm" && -r "$ENV{T2H_HOME}/MySimple.pm"); + +package main; + +#+++############################################################################ +# # +# Constants # +# # +#---############################################################################ + +$DEBUG_TOC = 1; +$DEBUG_INDEX = 2; +$DEBUG_BIB = 4; +$DEBUG_GLOSS = 8; +$DEBUG_DEF = 16; +$DEBUG_HTML = 32; +$DEBUG_USER = 64; +$DEBUG_L2H = 128; + + +$BIBRE = '\[[\w\/-]+\]'; # RE for a bibliography reference +$FILERE = '[\/\w.+-]+'; # RE for a file name +$VARRE = '[^\s\{\}]+'; # RE for a variable name +$NODERE = '[^,:]+'; # RE for a node name +$NODESRE = '[^:]+'; # RE for a list of node names + +$ERROR = "***"; # prefix for errors +$WARN = "**"; # prefix for warnings + + # program home page +$PROTECTTAG = "_ThisIsProtected_"; # tag to recognize protected sections + +$CHAPTEREND = "\n"; # to know where a chapter ends +$SECTIONEND = "\n"; # to know where a section ends +$TOPEND = "\n"; # to know where the top node ends + + + +# +# pre-defined indices +# +$index_properties = +{ + 'c' => { name => 'cp'}, + 'f' => { name => 'fn', code => 1}, + 'v' => { name => 'vr', code => 1}, + 'k' => { name => 'ky', code => 1}, + 'p' => { name => 'pg', code => 1}, + 't' => { name => 'tp', code => 1} +}; + + 
+%predefined_index = ( + 'cp', 'c', + 'fn', 'f', + 'vr', 'v', + 'ky', 'k', + 'pg', 'p', + 'tp', 't', + ); + +# +# valid indices +# +%valid_index = ( + 'c', 1, + 'f', 1, + 'v', 1, + 'k', 1, + 'p', 1, + 't', 1, + ); + +# +# texinfo section names to level +# +%sec2level = ( + 'top', 0, + 'chapter', 1, + 'unnumbered', 1, + 'majorheading', 1, + 'chapheading', 1, + 'appendix', 1, + 'section', 2, + 'unnumberedsec', 2, + 'heading', 2, + 'appendixsec', 2, + 'appendixsection', 2, + 'subsection', 3, + 'unnumberedsubsec', 3, + 'subheading', 3, + 'appendixsubsec', 3, + 'subsubsection', 4, + 'unnumberedsubsubsec', 4, + 'subsubheading', 4, + 'appendixsubsubsec', 4, + ); + +# +# accent map, TeX command to ISO name +# +%accent_map = ( + '"', 'uml', + '~', 'tilde', + '^', 'circ', + '`', 'grave', + '\'', 'acute', + ); + +# +# texinfo "simple things" (@foo) to HTML ones +# +%simple_map = ( + # cf. makeinfo.c + "*", "<BR>", # HTML+ + " ", "&nbsp;", + "\t", "&nbsp;", + "-", "&shy;", # soft hyphen + "\n", "&nbsp;", + "|", "", + 'tab', '<\/TD><TD>
Button Name Go to From 1.2.3 go to
+EOT + $about .= + ($T2H_ICONS && $T2H_ACTIVE_ICONS{$button} ? + &$T2H_button_icon_img($button, $T2H_ACTIVE_ICONS{$button}) : + " [" . $T2H_NAVIGATION_TEXT{$button} . "] "); + $about .= < + +$button + +$T2H_BUTTONS_GOTO{$button} + +$T2H_BUTTONS_EXAMPLE{$button} +
', + # spacing commands + ":", "", + "!", "!", + "?", "?", + ".", ".", + "-", "", + "/", "", + ); + +# +# texinfo "things" (@foo{}) to HTML ones +# +%things_map = ( + 'TeX', 'TeX', + 'br', '<P>', # paragraph break + 'bullet', '*', + #'copyright', '(C)', + 'copyright', '&copy;', + 'dots', '<small>...<\/small>', + 'enddots', '<small>....<\/small>', + 'equiv', '==', + 'error', 'error-->', + 'expansion', '==>', + 'minus', '-', + 'point', '-!-', + 'print', '-|', + 'result', '=>', + # APA: &pretty_date requires $MONTH_NAMES and $T2H_LANG + # to be initialized. The latter gets initialized by + # &SetDocumentLanguage in &main. + # We set following hash entry in &main afterwards. + # 'today', &pretty_date, + 'aa', '&aring;', + 'AA', '&Aring;', + 'ae', '&aelig;', + 'oe', 'œ', + 'AE', '&AElig;', + 'OE', 'Œ', + 'o', '&oslash;', + 'O', '&Oslash;', + 'ss', '&szlig;', + 'l', '\/l', + 'L', '\/L', + 'exclamdown', '&iexcl;', + 'questiondown', '&iquest;', + 'pounds', '&pound;' + ); + +# +# texinfo styles (@foo{bar}) to HTML ones +# +%style_map = ( + 'acronym', 'ACRONYM', + 'asis', '', + 'b', 'B', + 'cite', 'CITE', + 'code', 'CODE', + 'command', 'CODE', + 'ctrl', '&do_ctrl', # special case + 'dfn', 'EM', # DFN tag is illegal in the standard + 'dmn', '', # useless + 'email', '&do_email', # insert a clickable email address + 'emph', 'EM', + 'env', 'CODE', + 'file', '"TT', # will put quotes, cf. &apply_style + 'i', 'I', + 'kbd', 'KBD', + 'key', 'KBD', + 'math', '&do_math', + 'option', '"SAMP', # will put quotes, cf. &apply_style + 'r', '', # unsupported + 'samp', '"SAMP', # will put quotes, cf. 
&apply_style + 'sc', '&do_sc', # special case + 'strong', 'STRONG', + 't', 'TT', + 'titlefont', '', # useless + 'uref', '&do_uref', # insert a clickable URL + 'url', '&do_url', # insert a clickable URL + 'var', 'VAR', + 'w', '', # unsupported + 'H', '&do_accent', + 'dotaccent', '&do_accent', + 'ringaccent','&do_accent', + 'tieaccent', '&do_accent', + 'u','&do_accent', + 'ubaraccent','&do_accent', + 'udotaccent','&do_accent', + 'v', '&do_accent', + ',', '&do_accent', + 'dotless', '&do_accent' + ); + +# +# texinfo format (@foo/@end foo) to HTML ones +# +%format_map = ( + 'quotation', 'BLOCKQUOTE', + # lists + 'itemize', 'UL', + 'enumerate', 'OL', + # poorly supported + 'flushleft', 'PRE', + 'flushright', 'PRE', + ); + +# +# an eval of $complex_format_map->{what}->[0] yields the beginning +# an eval of $complex_format_map->{what}->[1] yields the end +$complex_format_map = +{ + verbatim => + [ + q{"$T2H_EXAMPLE_INDENT_CELL
"},
+  q{'
'} + ], + example => + [ + q{"$T2H_EXAMPLE_INDENT_CELL
"},
+  q{'
'} + ], + smallexample => + [ + q{"$T2H_SMALL_EXAMPLE_INDENT_CELL
"},
+  q{'
'} + ], + display => + [ + q{"$T2H_EXAMPLE_INDENT_CELL
'},
+  q{'
'} + ], + smalldisplay => + [ + q{"$T2H_SMALL_EXAMPLE_INDENT_CELL
'},
+  q{'
'} + ] +}; + +$complex_format_map->{lisp} = $complex_format_map->{example}; +$complex_format_map->{smalllisp} = $complex_format_map->{smallexample}; +$complex_format_map->{format} = $complex_format_map->{display}; +$complex_format_map->{smallformat} = $complex_format_map->{smalldisplay}; + +# +# texinfo definition shortcuts to real ones +# +%def_map = ( + # basic commands + 'deffn', 0, + 'defvr', 0, + 'deftypefn', 0, + 'deftypeop', 0, + 'deftypevr', 0, + 'deftypecv', 0, + 'defcv', 0, + 'defop', 0, + 'deftp', 0, + # basic x commands + 'deffnx', 0, + 'defvrx', 0, + 'deftypefnx', 0, + 'deftypeopx', 0, + 'deftypevrx', 0, + 'deftypecvx', 0, + 'defcvx', 0, + 'defopx', 0, + 'deftpx', 0, + # shortcuts + 'defun', 'deffn Function', + 'defmac', 'deffn Macro', + 'defspec', 'deffn {Special Form}', + 'defvar', 'defvr Variable', + 'defopt', 'defvr {User Option}', + 'deftypefun', 'deftypefn Function', + 'deftypevar', 'deftypevr Variable', + 'defivar', 'defcv {Instance Variable}', + 'deftypeivar', 'defcv {Instance Variable}', # NEW: FIXME + 'defmethod', 'defop Method', + 'deftypemethod', 'defop Method', # NEW:FIXME + # x shortcuts + 'defunx', 'deffnx Function', + 'defmacx', 'deffnx Macro', + 'defspecx', 'deffnx {Special Form}', + 'defvarx', 'defvrx Variable', + 'defoptx', 'defvrx {User Option}', + 'deftypefunx', 'deftypefnx Function', + 'deftypevarx', 'deftypevrx Variable', + 'defivarx', 'defcvx {Instance Variable}', + 'defmethodx', 'defopx Method', + ); + +# +# things to skip +# +%to_skip = ( + # comments + 'c', 1, + 'comment', 1, + 'ifnotinfo', 1, + 'ifnottex', 1, + 'ifhtml', 1, + 'end ifhtml', 1, + 'end ifnotinfo', 1, + 'end ifnottex', 1, + # useless + 'detailmenu', 1, + 'direntry', 1, + 'contents', 1, + 'shortcontents', 1, + 'summarycontents', 1, + 'footnotestyle', 1, + 'end ifclear', 1, + 'end ifset', 1, + 'titlepage', 1, + 'end titlepage', 1, + # unsupported commands (formatting) + 'afourpaper', 1, + 'cropmarks', 1, + 'finalout', 1, + 'headings', 1, + 'sp', 1, + 'need', 1, + 
'page', 1, + 'setchapternewpage', 1, + 'everyheading', 1, + 'everyfooting', 1, + 'evenheading', 1, + 'evenfooting', 1, + 'oddheading', 1, + 'oddfooting', 1, + 'smallbook', 1, + 'vskip', 1, + 'filbreak', 1, + 'paragraphindent', 1, + # unsupported formats + 'cartouche', 1, + 'end cartouche', 1, + 'group', 1, + 'end group', 1, + ); + +#+++############################################################################ +# # +# Argument parsing, initialisation # +# # +#---############################################################################ + +# +# flush stdout and stderr after every write +# +select(STDERR); +$| = 1; +select(STDOUT); +$| = 1; + + +%value = (); # hold texinfo variables, see also -D +$use_bibliography = 1; +$use_acc = 1; + +# +# called on -init-file +sub LoadInitFile +{ + my $init_file = shift; + # second argument is value of options + $init_file = shift; + if (-f $init_file) + { + print "# reading initialization file from $init_file\n" + if ($T2H_VERBOSE); + require($init_file); + } + else + { + print "$ERROR Error: can't read init file $init_file\n"; + $init_file = ''; + } +} + +# +# called on -lang +sub SetDocumentLanguage +{ + my $lang = shift; + if (! exists($T2H_WORDS->{$lang})) + { + warn "$ERROR: Language specs for '$lang' do not exist. Reverting to '" . + ($T2H_LANG ? $T2H_LANG : "en") . 
"'\n"; + } + else + { + print "# using '$lang' as document language\n" if ($T2H_VERBOSE); + $T2H_LANG = $lang; + } +} + +## +## obsolete cmd line options +## +$T2H_OBSOLETE_OPTIONS -> {'no-section_navigation'} = +{ + type => '!', + linkage => sub {$T2H_SECTION_NAVIGATION = 0;}, + verbose => 'obsolete, use -nosec_nav', + noHelp => 2, +}; +$T2H_OBSOLETE_OPTIONS -> {use_acc} = +{ + type => '!', + linkage => \$use_acc, + verbose => 'obsolete', + noHelp => 2 +}; +$T2H_OBSOLETE_OPTIONS -> {expandinfo} = +{ + type => '!', + linkage => sub {$T2H_EXPAND = 'info';}, + verbose => 'obsolete, use "-expand info" instead', + noHelp => 2, +}; +$T2H_OBSOLETE_OPTIONS -> {expandtex} = +{ + type => '!', + linkage => sub {$T2H_EXPAND = 'tex';}, + verbose => 'obsolete, use "-expand tex" instead', + noHelp => 2, +}; +$T2H_OBSOLETE_OPTIONS -> {monolithic} = +{ + type => '!', + linkage => sub {$T2H_SPLIT = '';}, + verbose => 'obsolete, use "-split no" instead', + noHelp => 2 +}; +$T2H_OBSOLETE_OPTIONS -> {split_node} = +{ + type => '!', + linkage => sub{$T2H_SPLIT = 'section';}, + verbose => 'obsolete, use "-split section" instead', + noHelp => 2, +}; +$T2H_OBSOLETE_OPTIONS -> {split_chapter} = +{ + type => '!', + linkage => sub{$T2H_SPLIT = 'chapter';}, + verbose => 'obsolete, use "-split chapter" instead', + noHelp => 2, +}; +$T2H_OBSOLETE_OPTIONS -> {no_verbose} = +{ + type => '!', + linkage => sub {$T2H_VERBOSE = 0;}, + verbose => 'obsolete, use -noverbose instead', + noHelp => 2, +}; +$T2H_OBSOLETE_OPTIONS -> {output_file} = +{ + type => '=s', + linkage => sub {$T2H_OUT = $_[1]; $T2H_SPLIT = '';}, + verbose => 'obsolete, use -out_file instead', + noHelp => 2 +}; + +$T2H_OBSOLETE_OPTIONS -> {section_navigation} = +{ + type => '!', + linkage => \$T2H_SECTION_NAVIGATION, + verbose => 'obsolete, use -sec_nav instead', + noHelp => 2, +}; + +$T2H_OBSOLETE_OPTIONS -> {verbose} = +{ + type => '!', + linkage => \$T2H_VERBOSE, + verbose => 'obsolete, use -Verbose instead', + noHelp => 2 +}; + 
+# read initialization from $sysconfdir/texi2htmlrc or $HOME/.texi2htmlrc +my $home = $ENV{HOME}; +defined($home) or $home = ''; +foreach $i ('/etc/texi2htmlrc', "$home/.texi2htmlrc") +{ + if (-f $i) + { + print "# reading initialization file from $i\n" + if ($T2H_VERBOSE); + require($i); + } +} + + +#+++############################################################################ +# # +# parse command-line options +# # +#---############################################################################ +$T2H_USAGE_TEXT = <getOptions($T2H_OPTIONS, $T2H_USAGE_TEXT, "$THISVERSION\n")) +{ + print $Configure_failed if $Configure_failed; + die $T2H_FAILURE_TEXT; +} + +if (@ARGV > 1) +{ + eval {Getopt::Long::Configure("no_pass_through");}; + if (! $options->getOptions($T2H_OBSOLETE_OPTIONS, $T2H_USAGE_TEXT, "$THISVERSION\n")) + { + print $Configure_failed if $Configure_failed; + die $T2H_FAILURE_TEXT; + } +} + +if ($T2H_CHECK) +{ + die "Need file to check\n$T2H_FAILURE_TEXT" unless @ARGV > 0; + &check; + exit; +} + +#+++############################################################################ +# # +# evaluation of cmd line options +# # +#---############################################################################ + +if ($T2H_EXPAND eq 'info') +{ + $to_skip{'ifinfo'} = 1; + $to_skip{'end ifinfo'} = 1; +} +elsif ($T2H_EXPAND eq 'tex') +{ + $to_skip{'iftex'} = 1; + $to_skip{'end iftex'} = 1; + +} + +$T2H_INVISIBLE_MARK = '' if $T2H_INVISIBLE_MARK eq 'xbm'; + +# +# file name business +# +die "Need exactly one file to translate\n$T2H_FAILURE_TEXT" unless @ARGV == 1; +$docu = shift(@ARGV); +if ($docu =~ /.*\//) +{ + chop($docu_dir = $&); + $docu_name = $'; +} +else +{ + $docu_dir = '.'; + $docu_name = $docu; +} +unshift(@T2H_INCLUDE_DIRS, $docu_dir); +$docu_name =~ s/\.te?x(i|info)?$//; # basename of the document +$docu_name = $T2H_PREFIX if ($T2H_PREFIX); + +# subdir +if ($T2H_SUBDIR && ! 
$T2H_OUT) +{ + $T2H_SUBDIR =~ s|/*$||; + unless (-d "$T2H_SUBDIR" && -w "$T2H_SUBDIR") + { + if ( mkdir($T2H_SUBDIR, oct(755))) + { + print "# created directory $T2H_SUBDIR\n" if ($T2H_VERBOSE); + } + else + { + warn "$ERROR can't create directory $T2H_SUBDIR. Put results into current directory\n"; + $T2H_SUBDIR = ''; + } + } +} + +if ($T2H_SUBDIR && ! $T2H_OUT) +{ + $docu_rdir = "$T2H_SUBDIR/"; + print "# putting result files into directory $docu_rdir\n" if ($T2H_VERBOSE); +} +else +{ + if ($T2H_OUT && $T2H_OUT =~ m|(.*)/|) + { + $docu_rdir = "$1/"; + print "# putting result files into directory $docu_rdir\n" if ($T2H_VERBOSE); + } + else + { + print "# putting result files into current directory \n" if ($T2H_VERBOSE); + $docu_rdir = ''; + } +} + +# extension +if ($T2H_SHORTEXTN) +{ + $docu_ext = "htm"; +} +else +{ + $docu_ext = "html"; +} +if ($T2H_TOP_FILE =~ /\..*$/) +{ + $T2H_TOP_FILE = $`.".$docu_ext"; +} + +# result files +if (! $T2H_OUT && ($T2H_SPLIT =~ /section/i || $T2H_SPLIT =~ /node/i)) +{ + $T2H_SPLIT = 'section'; +} +elsif (! 
$T2H_OUT && $T2H_SPLIT =~ /chapter/i) +{ + $T2H_SPLIT = 'chapter' +} +else +{ + undef $T2H_SPLIT; +} + +$docu_doc = "$docu_name.$docu_ext"; # document's contents +$docu_doc_file = "$docu_rdir$docu_doc"; +if ($T2H_SPLIT) +{ + $docu_toc = $T2H_TOC_FILE || "${docu_name}_toc.$docu_ext"; # document's table of contents + $docu_stoc = "${docu_name}_ovr.$docu_ext"; # document's short toc + $docu_foot = "${docu_name}_fot.$docu_ext"; # document's footnotes + $docu_about = "${docu_name}_abt.$docu_ext"; # about this document + $docu_top = $T2H_TOP_FILE || $docu_doc; +} +else +{ + if ($T2H_OUT) + { + $docu_doc = $T2H_OUT; + $docu_doc =~ s|.*/||; + } + $docu_toc = $docu_foot = $docu_stoc = $docu_about = $docu_top = $docu_doc; +} + +$docu_toc_file = "$docu_rdir$docu_toc"; +$docu_stoc_file = "$docu_rdir$docu_stoc"; +$docu_foot_file = "$docu_rdir$docu_foot"; +$docu_about_file = "$docu_rdir$docu_about"; +$docu_top_file = "$docu_rdir$docu_top"; + +$docu_frame_file = "$docu_rdir${docu_name}_frame.$docu_ext"; +$docu_toc_frame_file = "$docu_rdir${docu_name}_toc_frame.$docu_ext"; + +# +# variables +# +$value{'html'} = 1; # predefine html (the output format) +$value{'texi2html'} = $THISVERSION; # predefine texi2html (the translator) +# _foo: internal to track @foo +foreach ('_author', '_title', '_subtitle', + '_settitle', '_setfilename', '_shorttitle') +{ + $value{$_} = ''; # prevent -w warnings +} +%node2sec = (); # node to section name +%sec2node = (); # section to node name +%sec2seccount = (); # section to section count +%seccount2sec = (); # section count to section +%sec2number = (); # section to number + # $number =~ ^[\dA-Z]+\.(\d+(\.\d+)*)?$ +%number2sec = (); # number to section +%idx2node = (); # index keys to node +%node2href = (); # node to HREF +%node2next = (); # node to next +%node2prev = (); # node to prev +%node2up = (); # node to up +%bib2href = (); # bibliography reference to HREF +%gloss2href = (); # glossary term to HREF +@sections = (); # list of sections +%tag2pro 
= (); # protected sections + +# +# initial indexes +# +$bib_num = 0; +$foot_num = 0; +$gloss_num = 0; +$idx_num = 0; +$sec_num = 0; +$doc_num = 0; +$html_num = 0; + +# +# can I use ISO8859 characters? (HTML+) +# +if ($T2H_USE_ISO) +{ + $things_map{'bullet'} = "&bull;"; + $things_map{'copyright'} = "&copy;"; + $things_map{'dots'} = "&hellip;"; + $things_map{'equiv'} = "&equiv;"; + $things_map{'expansion'} = "&rarr;"; + $things_map{'point'} = "&lowast;"; + $things_map{'result'} = "&rArr;"; +} + +# +# read texi2html extensions (if any) +# +$extensions = 'texi2html.ext'; # extensions in working directory +if (-f $extensions) +{ + print "# reading extensions from $extensions\n" if $T2H_VERBOSE; + require($extensions); +} +($progdir = $0) =~ s/[^\/]+$//; +if ($progdir && ($progdir ne './')) +{ + $extensions = "${progdir}texi2html.ext"; # extensions in texi2html directory + if (-f $extensions) + { + print "# reading extensions from $extensions\n" if $T2H_VERBOSE; + require($extensions); + } +} + + +print "# reading from $docu\n" if $T2H_VERBOSE; + +######################################################################### +# +# latex2html stuff +# +# latex2html conversions consist of three stages: +# 1) ToLatex: Put "latex" code into a latex file +# 2) ToHtml: Use latex2html to generate corresponding html code and images +# 3) FromHtml: Extract generated code and images from latex2html run +# + +########################## +# default settings +# + +# defaults for files and names + +sub l2h_Init +{ + local($root) = @_; + return 0 unless ($root); + $l2h_name = "${root}_l2h"; + $l2h_latex_file = "$docu_rdir${l2h_name}.tex"; + $l2h_cache_file = "${docu_rdir}l2h_cache.pm"; + $T2H_L2H_L2H = "latex2html" unless ($T2H_L2H_L2H); + # destination dir -- generated images are put there, should be the same + # as dir of enclosing html document -- + $l2h_html_file = "$docu_rdir${l2h_name}.html"; + $l2h_prefix = "${l2h_name}_"; + return 1; +} + + +########################## +# +# First stage: 
Generation of Latex file +# 
Initialize with: l2h_InitToLatex +# Add content with: l2h_ToLatex($text) --> HTML placeholder comment +# Finish with: l2h_FinishToLatex +# + +$l2h_latex_preample = <$l2h_latex_file")) + { + warn "$ERROR Error l2h: Can't open latex file '$l2h_latex_file' for writing\n"; + return 0; + } + print "# l2h: use ${l2h_latex_file} as latex file\n" if ($T2H_VERBOSE); + print L2H_LATEX $l2h_latex_preample; + } + # open database for caching + l2h_InitCache(); + $l2h_latex_count = 0; + $l2h_to_latex_count = 0; + $l2h_cached_count = 0; + return 1; +} + +# print text (1st arg) into latex file (if not already there), return +# HTML commentary which can be later on replaced by the latex2html +# generated text +sub l2h_ToLatex +{ + my($text) = @_; + my($count); + $l2h_to_latex_count++; + $text =~ s/(\s*)$//; + # try whether we can cache it + my $cached_text = l2h_FromCache($text); + if ($cached_text) + { + $l2h_cached_count++; + return $cached_text; + } + # try whether we have text already on things to do + unless ($count = $l2h_to_latex{$text}) + { + $count = $l2h_latex_count; + $l2h_latex_count++; + $l2h_to_latex{$text} = $count; + $l2h_to_latex[$count] = $text; + unless ($T2H_L2H_SKIP) + { + print L2H_LATEX "\\begin{rawhtml}\n"; + print L2H_LATEX "\n"; + print L2H_LATEX "\\end{rawhtml}\n"; + + print L2H_LATEX "$text\n"; + + print L2H_LATEX "\\begin{rawhtml}\n"; + print L2H_LATEX "\n"; + print L2H_LATEX "\\end{rawhtml}\n"; + } + } + return ""; +} + +# print closing into latex file and close it +sub l2h_FinishToLatex +{ + local ($reused); + $reused = $l2h_to_latex_count - $l2h_latex_count - $l2h_cached_count; + unless ($T2H_L2H_SKIP) + { + print L2H_LATEX $l2h_latex_closing; + close(L2H_LATEX); + } + print "# l2h: finished to latex ($l2h_cached_count cached, $reused reused, $l2h_latex_count contents)\n" if ($T2H_VERBOSE); + unless ($l2h_latex_count) + { + l2h_Finish(); + return 0; + } + return 1; +} + +################################### +# Second stage: Use latex2html to generate 
corresponding html code and images +# +# l2h_ToHtml([$l2h_latex_file, [$l2h_html_dir]]): +# Call latex2html on $l2h_latex_file +# Put images (prefixed with $l2h_name."_") and html file(s) in $l2h_html_dir +# Return 1, on success +# 0, otherwise +# +sub l2h_ToHtml +{ + local($call, $ext, $root, $dotbug); + if ($T2H_L2H_SKIP) + { + print "# l2h: skipping latex2html run\n" if ($T2H_VERBOSE); + return 1; + } + # Check for dot in directory where dvips will work + if ($T2H_L2H_TMP) + { + if ($T2H_L2H_TMP =~ /\./) + { + warn "$ERROR Warning l2h: l2h_tmp dir contains a dot. Use /tmp, instead\n"; + $dotbug = 1; + } + } + else + { + if (&getcwd =~ /\./) + { + warn "$ERROR Warning l2h: current dir contains a dot. Use /tmp as l2h_tmp dir \n"; + $dotbug = 1; + } + } + # fix it, if necessary and hope that it works + $T2H_L2H_TMP = "/tmp" if ($dotbug); + + $call = $T2H_L2H_L2H; + # use init file, if specified + $call = $call . " -init_file " . $init_file if ($init_file && -f $init_file); + # set output dir + $call .= ($docu_rdir ? " -dir $docu_rdir" : " -no_subdir"); + # use l2h_tmp, if specified + $call = $call . " -tmp $T2H_L2H_TMP" if ($T2H_L2H_TMP); + # options we want to be sure of + $call = $call ." -address 0 -info 0 -split 0 -no_navigation -no_auto_link"; + $call = $call ." 
-prefix ${l2h_prefix} $l2h_latex_file"; + + print "# l2h: executing '$call'\n" if ($T2H_VERBOSE); + if (system($call)) + { + warn "l2h ***Error: '${call}' did not succeed\n"; + return 0; + } + else + { + print "# l2h: latex2html finished successfully\n" if ($T2H_VERBOSE); + return 1; + } +} + +# this is directly pasted over from latex2html +sub getcwd +{ + local($_) = `pwd`; + + die "'pwd' failed (out of memory?)\n" + unless length; + chop; + $_; +} + + +########################## +# Third stage: Extract generated contents from latex2html run +# Initialize with: l2h_InitFromHtml +# open $l2h_html_file for reading +# reads in contents into array indexed by numbers +# return 1, on success -- 0, otherwise +# Extract Html code with: l2h_FromHtml($text) +# replaces in $text all previously inserted comments by generated html code +# returns (possibly changed) $text +# Finish with: l2h_FinishFromHtml +# closes $l2h_html_dir/$l2h_name.".$docu_ext" + +sub l2h_InitFromHtml +{ + local($h_line, $h_content, $count, %l2h_img); + + if (! 
open(L2H_HTML, "<${l2h_html_file}")) + { + print "$ERROR Error l2h: Can't open ${l2h_html_file} for reading\n"; + return 0; + } + print "# l2h: use ${l2h_html_file} as html file\n" if ($T2H_VERBOSE); + + $l2h_html_count = 0; + while ($h_line = <L2H_HTML>) + { + if ($h_line =~ /^/) + { + $count = $1; + $h_content = ""; + while ($h_line = <L2H_HTML>) + { + if ($h_line =~ /^/) + { + chomp $h_content; + chomp $h_content; + $l2h_html_count++; + $h_content = l2h_ToCache($count, $h_content); + $l2h_from_html[$count] = $h_content; + $h_content = ''; + last; + } + $h_content = $h_content.$h_line; + } + if ($h_content) + { + print "$ERROR Warning l2h: l2h_end $l2h_name $count not found\n" + if ($T2H_VERBOSE); + close(L2H_HTML); + return 0; + } + } + } + print "# l2h: Got $l2h_html_count of $l2h_latex_count html contents\n" + if ($T2H_VERBOSE); + + close(L2H_HTML); + return 1; +} + +sub l2h_FromHtml +{ + my($text) = @_; + my($done, $to_do, $count); + $to_do = $text; + while ($to_do =~ /([^\000]*)([^\000]*)/) + { + $to_do = $1; + $count = $2; + $done = $3.$done; + $done = "".$done + if ($T2H_DEBUG & $DEBUG_L2H); + + $done = &l2h_ExtractFromHtml($count) . $done; + + $done = "".$done + if ($T2H_DEBUG & $DEBUG_L2H); + } + return $to_do.$done; +} + + +sub l2h_ExtractFromHtml +{ + local($count) = @_; + return $l2h_from_html[$count] if ($l2h_from_html[$count]); + if ($count >= 0 && $count < $l2h_latex_count) + { + # now we are in trouble + local($l_l2h, $_); + + $l2h_extract_error++; + print "$ERROR l2h: can't extract content $count from html\n" + if ($T2H_VERBOSE); + # try simple (ordinary) substitution (without l2h) + $l_l2h = $T2H_L2H; + $T2H_L2H = 0; + # look up the source text by its count (the array is indexed by count; the hash is keyed by text) + $_ = $l2h_to_latex[$count]; + $_ = &substitute_style($_); + &unprotect_texi; + $_ = "" . 
$_ + if ($T2H_DEBUG & $DEBUG_L2H); + $T2H_L2H = $l_l2h; + return $_; + } + else + { + # now we have been incorrectly called + $l2h_range_error++; + print "$ERROR l2h: Request for content $count, which is outside the valid range [0,$l2h_latex_count)\n"; + return "" + if ($T2H_DEBUG & $DEBUG_L2H); + return ""; + } +} + +sub l2h_FinishFromHtml +{ + if ($T2H_VERBOSE) + { + if ($l2h_extract_error + $l2h_range_error) + { + print "# l2h: finished from html ($l2h_extract_error extract and $l2h_range_error range errors)\n"; + } + else + { + print "# l2h: finished from html (no errors)\n"; + } + } +} + +sub l2h_Finish +{ + l2h_StoreCache(); + if ($T2H_L2H_CLEAN) + { + print "# l2h: removing temporary files generated by l2h extension\n" + if $T2H_VERBOSE; + while (<"$docu_rdir$l2h_name"*>) + { + unlink $_; + } + } + print "# l2h: Finished\n" if $T2H_VERBOSE; + return 1; +} + +############################## +# stuff for l2h caching +# + +# I tried doing this with a dbm data base, but it did not store all +# keys/values. 
Hence, I do it the way latex2html does +sub l2h_InitCache +{ + if (-r "$l2h_cache_file") + { + my $rdo = do "$l2h_cache_file"; + # $l2h_cache_file already includes $docu_rdir + warn("$ERROR l2h Error: could not load $l2h_cache_file: $@\n") + unless ($rdo); + } +} + +sub l2h_StoreCache +{ + return unless $l2h_latex_count; + my ($key, $value); + open(FH, ">$l2h_cache_file") || return warn"$ERROR l2h Error: could not open $l2h_cache_file for writing: $!\n"; + while (($key, $value) = each %l2h_cache) + { + # escape stuff + $key =~ s|/|\\/|g; + $key =~ s|\\\\/|\\/|g; + # weird, a \ at the end of the key results in an error + # maybe this also broke the dbm database stuff + $key =~ s|\\$|\\\\|; + $value =~ s/\|/\\\|/go; + $value =~ s/\\\\\|/\\\|/go; + $value =~ s|\\\\|\\\\\\\\|g; + print FH "\n\$l2h_cache_key = q/$key/;\n"; + print FH "\$l2h_cache{\$l2h_cache_key} = q|$value|;\n"; + } + print FH "1;"; + close(FH); +} + +# return cached html, if it exists for text, and if all pictures +# are there, as well +sub l2h_FromCache +{ + my $text = shift; + my $cached = $l2h_cache{$text}; + if ($cached) + { + while ($cached =~ m/SRC="(.*?)"/g) + { + unless (-e "$docu_rdir$1") + { + return undef; + } + } + return $cached; + } + return undef; +} + +# insert generated html into cache, move away images, +# return transformed html +$maximage = 1; +sub l2h_ToCache +{ + my $count = shift; + my $content = shift; + my @images = ($content =~ /SRC="(.*?)"/g); + my ($src, $dest); + + for $src (@images) + { + $dest = $l2h_img{$src}; + unless ($dest) + { + my $ext; + if ($src =~ /.*\.(.*)$/ && $1 ne $docu_ext) + { + $ext = $1; + } + else + { + warn "$ERROR: L2h image $src has invalid extension\n"; + next; + } + while (-e "$docu_rdir${docu_name}_$maximage.$ext") + { + $maximage++; + } + $dest = "${docu_name}_$maximage.$ext"; + system("cp -f $docu_rdir$src $docu_rdir$dest"); + $l2h_img{$src} = $dest; + # keep the original image when debugging l2h + unlink "$docu_rdir$src" unless ($T2H_DEBUG & 
$DEBUG_L2H); + } + # quote $src so that '.' and other metacharacters match literally + $content =~ s/\Q$src\E/$dest/g; + } + 
$l2h_cache{$l2h_to_latex[$count]} = $content; + return $content; +} + + +#+++############################################################################ +# # +# Pass 1: read source, handle command, variable, simple substitution # +# # +#---############################################################################ +sub pass1 +{ + my $name; + my $line; + @lines = (); # whole document + @toc_lines = (); # table of contents + @stoc_lines = (); # table of contents + $curlevel = 0; # current level in TOC + $node = ''; # current node name + $node_next = ''; # current node next name + $node_prev = ''; # current node prev name + $node_up = ''; # current node up name + $in_table = 0; # am I inside a table + $table_type = ''; # type of table ('', 'f', 'v', 'multi') + @tables = (); # nested table support + $in_bibliography = 0; # am I inside a bibliography + $in_glossary = 0; # am I inside a glossary + $in_top = 0; # am I inside the top node + $has_top = 0; # did I see a top node? + $has_top_command = 0; # did I see @top for automatic pointers? + $in_pre = 0; # am I inside a preformatted section + $in_list = 0; # am I inside a list + $in_html = 0; # am I inside an HTML section + $first_line = 1; # is it the first line + $dont_html = 0; # don't protect HTML on this line + $deferred_ref = ''; # deferred reference for indexes + @html_stack = (); # HTML elements stack + $html_element = ''; # current HTML element + &html_reset; + %macros = (); # macros + $toc_indent = # used for identation in TOC's + ($T2H_NUMBER_SECTIONS ? 'BLOCKQUOTE' : 'UL'); + + # init l2h + $T2H_L2H = &l2h_Init($docu_name) if ($T2H_L2H); + $T2H_L2H = &l2h_InitToLatex if ($T2H_L2H); + + # build code for simple substitutions + # the maps used (%simple_map and %things_map) MUST be aware of this + # watch out for regexps, / and escaped characters! 
+ $subst_code = ''; + foreach (keys(%simple_map)) + { + ($re = $_) =~ s/(\W)/\\$1/g; # protect regexp chars + $subst_code .= "s/\\\@$re/$simple_map{$_}/g;\n"; + } + foreach (keys(%things_map)) + { + $subst_code .= "s/\\\@$_\\{\\}/$things_map{$_}/g;\n"; + } + if ($use_acc) + { + # accentuated characters + foreach (keys(%accent_map)) + { + if ($_ eq "`") + { + $subst_code .= "s/$;3"; + } + elsif ($_ eq "'") + { + $subst_code .= "s/$;4"; + } + else + { + $subst_code .= "s/\\\@\\$_"; + } + $subst_code .= "([a-z])/&\${1}$accent_map{$_};/gi;\n"; + } + } + eval("sub simple_substitutions { $subst_code }"); + + &init_input; + INPUT_LINE: while ($_ = &next_line) + { + # + # remove \input on the first lines only + # + if ($first_line) + { + next if /^\\input/; + $first_line = 0; + } + # non-@ substitutions cf. texinfmt.el + # + # parse texinfo tags + # + $tag = ''; + $end_tag = ''; + if (/^\s*\@end\s+(\w+)\b/) + { + $end_tag = $1; + } + elsif (/^\s*\@(\w+)\b/) + { + $tag = $1; + } + # + # handle @html / @end html + # + if ($in_html) + { + if ($end_tag eq 'html') + { + $in_html = 0; + } + else + { + $tag2pro{$in_html} .= $_; + } + next; + } + elsif ($tag eq 'html') + { + $in_html = $PROTECTTAG . ++$html_num; + push(@lines, $in_html); + next; + } + + # + # try to remove inlined comments + # syntax from tex-mode.el comment-start-skip + # + s/((^|[^\@])(\@\@)*)\@(c( |\{)|comment ).*$/$1/; + + # Sometimes I use @c right at the end of a line ( to suppress the line feed ) + # s/((^|[^\@])(\@\@)*)\@c(omment)?$/$1/; + # s/((^|[^\@])(\@\@)*)\@c(omment)? 
.*/$1/; + # s/(.*)\@c{.*?}(.*)/$1$2/; + # s/(.*)\@comment{.*?}(.*)/$1$2/; + # s/^(.*)\@c /$1/; + # s/^(.*)\@comment /$1/; + + ############################################################# + # value substitution before macro expansion, so that + # it works in macro arguments + s/\@value{($VARRE)}/$value{$1}/eg; + + ############################################################# + # macro substitution + while (/\@(\w+)/g) + { + if (exists($macros->{$1})) + { + my $before = $`; + $name = $1; + my $after = $'; + my @args; + my $args; + ##################################################### + # Support for multi-line macro invocations and nested + # '{' and '}' within macro invocations added by + # Eric Sunshine 2000/09/10. + ##################################################### + if ($after =~ /^\s*\{/) # Macro arguments delimited by '{' and '}'? + { + my ($protect, $start, $end, $depth, $c) = (0, 0, 0, 0, 0); + foreach $c (unpack('C*', $after)) + { + if ($protect) + { # Character protected by '\' or '@'; pass through unmolested. + $protect = 0; + } + elsif ($c == ord('\\') || $c == ord('@')) + { # '\' and '@' remove special meaning of next character. + $protect = 1; + } + elsif ($c == ord('{')) # Allow '{' and '}' to nest. + { + $depth++; + } + elsif ($c == ord('}')) + { + $depth--; + last if $depth == 0; + } + $start++ if !$depth; # Position of opening brace. + $end++; # Position of closing brace. + } + + # '{' & '}' did not completely unnest; append next line; try again. + if ($depth > 0) + { + my $paste = &next_line; + die "$ERROR Missing closing brace '}' for invocation of macro " . + "\"\@$name\" on line:\n", substr($_,0,70), "...\n" unless $paste; + s/\n$/ /; + unshift @input_spool, $_ . $paste; + next INPUT_LINE; + } + + # Extract macro arguments from within '{' and '}'. + $len = $end - $start - 1; + $args = ($len > 0) ? 
substr($after, $start + 1, $len) : ''; + $after = substr($after, $end + 1); + } + ############ End Sunshine Modifications ############# + elsif (@{$macros->{$name}->{Args}} == 1) # Macro arg extends to EOL. + { + $args = $after; + $args =~ s/^\s*//; + $args =~ s/\s*$//; + $after = ''; + } + $args =~ s|\\\\|\\|g; + $args =~ s|\\{|{|g; + $args =~ s|\\}|}|g; + if (@{$macros->{$name}->{Args}} > 1) + { + $args =~ s/(^|[^\\]),/$1$;/g ; + $args =~ s|\\,|,|g; + @args = split(/$;\s*/, $args) if (@{$macros->{$name}->{Args}} > 1); + } + else + { + $args =~ s|\\,|,|g; + @args = ($args); + } + my $macrobody = $macros->{$name}->{Body}; + for ($i=0; $i<=$#args; $i++) + { + $macrobody =~ s|\\$macros->{$name}->{Args}->[$i]\\|$args[$i]|g; + } + $macrobody =~ s|\\\\|\\|g; + $_ = $before . $macrobody . $after; + unshift @input_spool, map {$_ = $_."\n"} split(/\n/, $_); + next INPUT_LINE; + } + } + # + # try to skip the line + # + if ($end_tag) + { + $in_titlepage = 0 if $end_tag eq 'titlepage'; + next if $to_skip{"end $end_tag"}; + } + elsif ($tag) + { + $in_titlepage = 1 if $tag eq 'titlepage'; + next if $to_skip{$tag}; + last if $tag eq 'bye'; + } + if ($in_top) + { + # parsing the top node + if ($tag eq 'node' || + ($sec2level{$tag} && $tag !~ /unnumbered/ && $tag !~ /heading/)) + { + # no more in top + $in_top = 0; + push(@lines, $TOPEND); + } + } + unless ($in_pre) + { + s/``/\"/go; + s/''/\"/go; + s/([\w ])---([\w ])/$1--$2/g; + } + # + # analyze the tag + # + if ($tag) + { + # skip lines + &skip_until($tag), next if $tag eq 'ignore'; + &skip_until($tag), next if $tag eq 'ifnothtml'; + if ($tag eq 'ifinfo') + { + &skip_until($tag), next unless $T2H_EXPAND eq 'info'; + } + if ($tag eq 'iftex') + { + &skip_until($tag), next unless $T2H_EXPAND eq 'tex'; + } + if ($tag eq 'tex') + { + # add to latex2html file + if ($T2H_EXPAND eq 'tex' && $T2H_L2H && ! $in_pre) + { + # add space to the end -- tex(i2dvi) does this, as well + push(@lines, &l2h_ToLatex(&string_until($tag) . 
" ")); + } + else + { + &skip_until($tag); + } + next; + } + if ($tag eq 'titlepage') + { + next; + } + # handle special tables + if ($tag =~ /^(|f|v|multi)table$/) + { + $table_type = $1; + $tag = 'table'; + } + # special cases + # APA: Fixed regexp to ONLY match the top node, not any + # node starting with the word top. + if ($tag eq 'top' || ($tag eq 'node' && /^\@node\s+top\s*(,.*)?$/i)) + { + $in_top = 1; + $has_top = 1; + $has_top_command = 1 if $tag eq 'top'; + @lines = (); # ignore all lines before top (title page garbage) + next; + } + elsif ($tag eq 'node') + { + if ($in_top) + { + $in_top = 0; + push(@lines, $TOPEND); + } + warn "$ERROR Bad node line: $_" unless $_ =~ /^\@node\s$NODESRE$/o; + # request of "Richard Y. Kim" + s/^\@node\s+//; + $_ = &protect_html($_); # if node contains '&' for instance + ($node, $node_next, $node_prev, $node_up) = split(/,/); + if ($node) + { + &normalise_node($node); + } + else + { + warn "$ERROR Node is undefined: $_ (eg. \@node NODE-NAME, NEXT, PREVIOUS, UP)"; + } + if ($node_next) + { + &normalise_node($node_next); + } + if ($node_prev) + { + &normalise_node($node_prev); + } + if ($node_up) + { + &normalise_node($node_up); + } + $node =~ /\"/ ? 
+ push @lines, &html_debug("\n", __LINE__) : + push @lines, &html_debug("\n", __LINE__); + next; + } + elsif ($tag eq 'include') + { + if (/^\@include\s+($FILERE)\s*$/o) + { + $file = LocateIncludeFile($1); + if ($file && -e $file) + { + &open($file); + print "# including $file\n" if $T2H_VERBOSE; + } + else + { + warn "$ERROR Can't find $1, skipping"; + } + } + else + { + warn "$ERROR Bad include line: $_"; + } + next; + } + elsif ($tag eq 'ifclear') + { + if (/^\@ifclear\s+($VARRE)\s*$/o) + { + next unless defined($value{$1}); + &skip_until($tag); + } + else + { + warn "$ERROR Bad ifclear line: $_"; + } + next; + } + elsif ($tag eq 'ifset') + { + if (/^\@ifset\s+($VARRE)\s*$/o) + { + next if defined($value{$1}); + &skip_until($tag); + } + else + { + warn "$ERROR Bad ifset line: $_"; + } + next; + } + elsif ($tag eq 'menu') + { + unless ($T2H_SHOW_MENU) + { + &skip_until($tag); + next; + } + &html_push_if($tag); + push(@lines, &html_debug('', __LINE__)); + } + elsif ($format_map{$tag}) + { + $in_pre = 1 if $format_map{$tag} eq 'PRE'; + &html_push_if($format_map{$tag}); + push(@lines, &html_debug('', __LINE__)); + $in_list++ if $format_map{$tag} eq 'UL' || $format_map{$tag} eq 'OL' ; + # push(@lines, &debug("

\n", __LINE__)) + # if $tag =~ /example/i; + # Eric Sunshine :
blah
looks + # better than
\nblah
on OmniWeb2 NextStep browser. + push(@lines, &debug("<$format_map{$tag}>" . + ($in_pre ? '' : "\n"), __LINE__)); + next; + } + elsif (exists $complex_format_map->{$tag}) + { + my $start = eval $complex_format_map->{$tag}->[0]; + # APA: implicitly ends paragraph, so let's do it + # explicitly to keep our HTML stack in sync. + if ($start =~ /\A\s*
/i) + { + if ($html_element eq 'P') + { + push (@lines, &debug("</P>\n", __LINE__)); + &html_pop(); + } + } + if ($@) + { + print "$ERROR: eval of complex_format_map->{$tag}->[0] $complex_format_map->{$tag}->[0]: $@"; + $start = '
'
+                }
+                $in_pre = 1 if $start =~ /
 implicitly ends paragraph, so let's
+                        # do it explicitly to keep our HTML stack in sync.
+                        if ($html_element eq 'P')
+                        {
+                            push (@lines, &debug("</P>\n", __LINE__)); + &html_pop(); + } + # don't use borders -- gets confused by empty cells + push(@lines, &debug("
\n", __LINE__)); + &html_push_if('TABLE'); + } + else + { + # APA:
implicitly ends paragraph, so let's + # do it explicitly to keep our HTML stack in sync. + if ($html_element eq 'P') + { + push (@lines, &debug("</P>\n", __LINE__)); + &html_pop(); + } + push(@lines, &debug("
\n", __LINE__)); + &html_push_if('DL'); + } + push(@lines, &html_debug('', __LINE__)); + } + else + { + warn "$ERROR Bad table line: $_"; + } + next; + } + elsif ($tag eq 'synindex' || $tag eq 'syncodeindex') + { + if (/^\@$tag\s+(\w+)\s+(\w+)\s*$/) + { + my $from = $1; + my $to = $2; + my $prefix_from = IndexName2Prefix($from); + my $prefix_to = IndexName2Prefix($to); + + warn("$ERROR unknown from index name $from in syn*index line: $_"), next + unless $prefix_from; + warn("$ERROR unknown to index name $to in syn*index line: $_"), next + unless $prefix_to; + + if ($tag eq 'syncodeindex') + { + $index_properties->{$prefix_to}->{'from_code'}->{$prefix_from} = 1; + } + else + { + $index_properties->{$prefix_to}->{'from'}->{$prefix_from} = 1; + } + } + else + { + warn "$ERROR Bad syn*index line: $_"; + } + next; + } + elsif ($tag eq 'defindex' || $tag eq 'defcodeindex') + { + if (/^\@$tag\s+(\w+)\s*$/) + { + $name = $1; + $index_properties->{$name}->{name} = $name; + $index_properties->{$name}->{code} = 1 if $tag eq 'defcodeindex'; + } + else + { + warn "$ERROR Bad defindex line: $_"; + } + next; + } + elsif (/^\@printindex/) + { + # APA: HTML generated for @printindex contains
+ # which implicitly ends paragraph, so let's do it + # explicitly to keep our HTML stack in sync. + if ($html_element eq 'P') + { + push(@lines, &debug("</P>\n", __LINE__)); + &html_pop(); + } + push (@lines, "$_"); + next; + } + elsif ($tag eq 'sp') + { + push(@lines, &debug("

\n", __LINE__)); + next; + } + elsif ($tag eq 'center') + { + push(@lines, &debug("
\n", __LINE__)); + s/\@center//; + } + elsif ($tag eq 'setref') + { + my ($setref); + &protect_html; # if setref contains '&' for instance + if (/^\@$tag\s*{($NODERE)}\s*$/) + { + $setref = $1; + $setref =~ s/\s+/ /go; # normalize + $setref =~ s/ $//; + $node2sec{$setref} = $name; + $sec2node{$name} = $setref; + $node2href{$setref} = "$docu_doc#$docid"; + } + else + { + warn "$ERROR Bad setref line: $_"; + } + next; + } + elsif ($tag eq 'lowersections') + { + my ($sec, $level); + while (($sec, $level) = each %sec2level) + { + $sec2level{$sec} = $level + 1; + } + next; + } + elsif ($tag eq 'raisesections') + { + my ($sec, $level); + while (($sec, $level) = each %sec2level) + { + $sec2level{$sec} = $level - 1; + } + next; + } + elsif ($tag eq 'macro' || $tag eq 'rmacro') + { + if (/^\@$tag\s*(\w+)\s*(.*)/) + { + $name = $1; + my @args; + @args = split(/\s*,\s*/ , $1) + if ($2 =~ /^\s*{(.*)}\s*/); + + $macros->{$name}->{Args} = \@args; + $macros->{$name}->{Body} = ''; + while (($_ = &next_line) && $_ !~ /\@end $tag/) + { + $macros->{$name}->{Body} .= $_; + } + die "ERROR: No closing '\@end $tag' found for macro definition of '$name'\n" + unless (/\@end $tag/); + chomp $macros->{$name}->{Body}; + } + else + { + warn "$ERROR: Bad macro definition $_" + } + next; + } + elsif ($tag eq 'unmacro') + { + delete $macros->{$1} if (/^\@unmacro\s*(\w+)/); + next; + } + elsif ($tag eq 'documentlanguage') + { + SetDocumentLanguage($1) if (!$T2H_LANG && /documentlanguage\s*(\w+)/); + } + elsif (defined($def_map{$tag})) + { + if ($def_map{$tag}) + { + s/^\@$tag\s+//; + $tag = $def_map{$tag}; + $_ = "\@$tag $_"; + $tag =~ s/\s.*//; + } + } + elsif (defined($user_sub{$tag})) + { + s/^\@$tag\s+//; + $sub = $user_sub{$tag}; + print "# user $tag = $sub, arg: $_" if $T2H_DEBUG & $DEBUG_USER; + if (defined(&$sub)) + { + chop($_); + &$sub($_); + } + else + { + warn "$ERROR Bad user sub for $tag: $sub\n"; + } + next; + } + if (defined($def_map{$tag})) + { + s/^\@$tag\s+//; + if ($tag =~ /x$/) 
+ { + # extra definition line + $tag = $`; + $is_extra = 1; + } + else + { + $is_extra = 0; + } + while (/\{([^\{\}]*)\}/) + { + # this is a {} construct + ($before, $contents, $after) = ($`, $1, $'); + # protect spaces + $contents =~ s/\s+/$;9/g; + # restore $_ protecting {} + $_ = "$before$;7$contents$;8$after"; + } + @args = split(/\s+/, &protect_html($_)); + foreach (@args) + { + s/$;9/ /g; # unprotect spaces + s/$;7/\{/g; # ... { + s/$;8/\}/g; # ... } + } + $type = shift(@args); + $type =~ s/^\{(.*)\}$/$1/; + print "# def ($tag): {$type} ", join(', ', @args), "\n" + if $T2H_DEBUG & $DEBUG_DEF; + if ($tag eq 'deftypecv') { + my $class = shift (@args); + $class =~ s/^\{(.*)\}$/$1/; + $type .= " of $class"; + } + $type .= ':' if (!$T2H_DEF_TABLE); # it's nicer like this + $name = shift(@args); + $name =~ s/^\{(.*)\}$/$1/; + if ($is_extra) + { + $_ = &debug("
", __LINE__) if (!$T2H_DEF_TABLE); + $_ = &debug("", __LINE__) if ($T2H_DEF_TABLE); + #$_ = &debug("
\n", __LINE__) if ($T2H_DEF_TABLE); + } + else + { + # APA:
implicitly ends paragraph, so let's + # do it explicitly to keep our HTML stack in sync. + if ($html_element eq 'P') + { + $_ = &debug("

\n", __LINE__); + &html_pop(); + } + else + { + $_ = ''; + } + $_ .= &debug("
\n
", __LINE__) if (!$T2H_DEF_TABLE); + $_ .= &debug("
\n", __LINE__) if ($T2H_DEF_TABLE); + } + if ($tag eq 'deffn' || $tag eq 'defvr' || $tag eq 'deftp') + { + if ($T2H_DEF_TABLE) + { + $_ .= "\n\n"; + $_ .= "\n\n"; + } + else + { + $_ .= "$type$name"; + $_ .= " @args" if @args; + } + } + elsif ($tag eq 'deftypefn' || $tag eq 'deftypevr' + || $tag eq 'deftypeop' || $tag eq 'defcv' + || $tag eq 'defop' || $tag eq 'deftypecv') + { + $ftype = $name; + $name = shift(@args); + $name =~ s/^\{(.*)\}$/$1/; + if ($T2H_DEF_TABLE) + { + $_ .= "\n\n"; + $_ .= "\n\n"; + } + else + { + my $sep = $ftype =~ /\*$/ ? '' : ' '; + $_ .= "$type $ftype$sep$name"; + $_ .= " @args" if @args; + } + } + else + { + warn "$ERROR Unknown definition type: $tag\n"; + $_ .= "$type$name"; + $_ .= " @args" if @args; + } + $_ .= &debug("\n
", __LINE__) if (!$T2H_DEF_TABLE); + ########$_ .= &debug("\n
$name\n"; + $_ .= " @args" if @args; + $_ .= ""; + $_ .= "$type
$name"; + $_ .= " @args" if @args; + $_ .= ""; + $_ .= "$type of $ftype
\n\n", __LINE__)) if ($T2H_DEF_TABLE); + $name = &unprotect_html($name); + if ($tag eq 'deffn' || $tag eq 'deftypefn') + { + EnterIndexEntry('f', $name, $docu_doc, $section, \@lines); + # unshift(@input_spool, "\@findex $name\n"); + } + elsif ($tag eq 'defop') + { + EnterIndexEntry('f', "$name on $ftype", $docu_doc, $section, \@lines); + # unshift(@input_spool, "\@findex $name on $ftype\n"); + } + elsif ($tag eq 'defvr' || $tag eq 'deftypevr' || $tag eq 'defcv') + { + EnterIndexEntry('v', $name, $docu_doc, $section, \@lines); + # unshift(@input_spool, "\@vindex $name\n"); + } + else + { + EnterIndexEntry('t', $name, $docu_doc, $section, \@lines); + # unshift(@input_spool, "\@tindex $name\n"); + } + $dont_html = 1; + } + } + elsif ($end_tag) + { + if ($format_map{$end_tag}) + { + $in_pre = 0 if $format_map{$end_tag} eq 'PRE'; + $in_list-- if $format_map{$end_tag} eq 'UL' || $format_map{$end_tag} eq 'OL' ; + &html_pop_if('P'); + &html_pop_if('LI'); + &html_pop_if(); + push(@lines, &debug("\n", __LINE__)); + push(@lines, &html_debug('', __LINE__)); + } + elsif (exists $complex_format_map->{$end_tag}) + { + my $end = eval $complex_format_map->{$end_tag}->[1]; + if ($@) + { + print "$ERROR: eval of complex_format_map->{$end_tag}->[1] $complex_format_map->{$end_tag}->[1]: $@"; + $end = '' + } + $in_pre = 0 if $end =~ m||; + push(@lines, html_debug($end, __LINE__)); + } + elsif ($end_tag =~ /^(|f|v|multi)table$/) + { + unless (@tables) + { + warn "$ERROR \@end $end_tag without \@*table\n"; + next; + } + &html_pop_if('P'); + ($table_type, $in_table) = split($;, shift(@tables)); + unless ($1 eq $table_type) + { + warn "$ERROR \@end $end_tag without matching \@$end_tag\n"; + next; + } + if ($table_type eq "multi") + { + push(@lines, "
\n"); + &html_pop_if('TR'); + } + else + { + # APA: implicitly ends paragraph, so let's + # do it explicitly to keep our HTML stack in sync. + if ($html_element eq 'P') + { + push(@lines, &debug("</P>\n", __LINE__)); + &html_pop(); + } + push(@lines, "\n"); + &html_pop_if('DD'); + } + &html_pop_if(); + if (@tables) + { + ($table_type, $in_table) = split($;, $tables[0]); + } + else + { + $in_table = 0; + } + } + elsif (defined($def_map{$end_tag})) + { + # APA: and
implicitly ends paragraph, + # so let's do it explicitly to keep our HTML stack in + # sync. + if ($html_element eq 'P') + { + push(@lines, &debug("

\n", __LINE__)); + &html_pop(); + } + push(@lines, &debug("\n", __LINE__)) if (!$T2H_DEF_TABLE); + push(@lines, &debug("\n", __LINE__)) if ($T2H_DEF_TABLE); + } + elsif ($end_tag eq 'menu') + { + &html_pop_if(); + push(@lines, $_); # must keep it for pass 2 + } + next; + } + ############################################################# + # anchor insertion + while (/\@anchor\s*\{(.*?)\}/) + { + $_ = $`.$'; + my $anchor = $1; + $anchor = &normalise_node($anchor); + push @lines, &html_debug("\n"); + $node2href{$anchor} = "$docu_doc#$anchor"; + next INPUT_LINE if $_ =~ /^\s*$/; + } + ############################################################# + # index entry generation, after value substitutions + if (/^\@(\w+?)index\s+/) + { + EnterIndexEntry($1, $', $docu_doc, $section, \@lines); + next; + } + # + # protect texi and HTML things + &protect_texi; + $_ = &protect_html($_) unless $dont_html; + $dont_html = 0; + # substitution (unsupported things) + s/^\@exdent\s+//go; + s/\@noindent\s+//go; + s/\@refill\s+//go; + # other substitutions + &simple_substitutions; + s/\@footnote\{/\@footnote$docu_doc\{/g; # mark footnotes, cf. pass 4 + # + # analyze the tag again + # + if ($tag) + { + if (defined($sec2level{$tag}) && $sec2level{$tag} > 0) + { + if (/^\@$tag\s+(.+)$/) + { + $name = $1; + $name = &normalise_node($name); + $level = $sec2level{$tag}; + # check for index + $first_index_chapter = $name + if ($level == 1 && !$first_index_chapter && + $name =~ /index/i); + if ($in_top && /heading/) + { + $T2H_HAS_TOP_HEADING = 1; + $_ = &debug("$name\n", __LINE__); + &html_push_if('body'); + print "# top heading, section $name, level $level\n" + if $T2H_DEBUG & $DEBUG_TOC; + } + else + { + unless (/^\@\w*heading/) + { + unless (/^\@unnumbered/) + { + my $number = &update_sec_num($tag, $level); + $name = $number . ' ' . $name if $T2H_NUMBER_SECTIONS; + $sec2number{$name} = $number; + $number2sec{$number} = $name; + } + if (defined($toplevel)) + { + push @lines, ($level==$toplevel ? 
$CHAPTEREND : $SECTIONEND); + } + else + { + # first time we see a "section" + unless ($level == 1) + { + warn "$WARN The first section found is not of level 1: $_"; + } + $toplevel = $level; + } + push(@sections, $name); + next_doc() if (defined $T2H_SPLIT + and + ($T2H_SPLIT eq 'section' + || + $T2H_SPLIT && $level == $toplevel)); + } + $sec_num++; + $docid = "SEC$sec_num"; + $tocid = (/^\@\w*heading/ ? undef : "TOC$sec_num"); + # check biblio and glossary + $in_bibliography = ($name =~ /^([A-Z]|\d+)?(\.\d+)*\s*bibliography$/i); + $in_glossary = ($name =~ /^([A-Z]|\d+)?(\.\d+)*\s*glossary$/i); + # check node + if ($node) + { + warn "$ERROR Duplicate node found: $node\n" + if ($node2sec{$node}); + } + else + { + $name .= ' ' while ($node2sec{$name}); + $node = $name; + } + $name .= ' ' while ($sec2node{$name}); + $section = $name; + $node2sec{$node} = $name; + $sec2node{$name} = $node; + $sec2seccount{$name} = $sec_num; + $seccount2sec{$sec_num} = $name; + $node2href{$node} = "$docu_doc#$docid"; + $node2next{$node} = $node_next; + $node2prev{$node} = $node_prev; + $node2up{$node} = $node_up; + print "# node $node, section $name, level $level\n" + if $T2H_DEBUG & $DEBUG_TOC; + + $node = ''; + $node_next = ''; + $node_prev = ''; + $node_up = ''; + if ($tocid) + { + # update TOC + while ($level > $curlevel) + { + $curlevel++; + push(@toc_lines, "<$toc_indent>\n"); + } + while ($level < $curlevel) + { + $curlevel--; + push(@toc_lines, "\n"); + } + $_ = &t2h_anchor($tocid, "$docu_doc#$docid", $name, 1); + $_ = &substitute_style($_); + push(@stoc_lines, "$_
\n") if ($level == 1); + if ($T2H_NUMBER_SECTIONS) + { + push(@toc_lines, $_ . "
\n") + } + else + { + push(@toc_lines, "
  • " . $_ ."
  • "); + } + } + else + { + push(@lines, &html_debug("\n", + __LINE__)); + } + # update DOC + push(@lines, &html_debug('', __LINE__)); + &html_reset; + $_ = " $name \n\n"; + $_ = &debug($_, __LINE__); + push(@lines, &html_debug('', __LINE__)); + } + # update DOC + foreach $line (split(/\n+/, $_)) + { + push(@lines, "$line\n"); + } + next; + } + else + { + warn "$ERROR Bad section line: $_"; + } + } + else + { + # track variables + $value{$1} = Unprotect_texi($2), next if /^\@set\s+($VARRE)\s+(.*)$/o; + delete $value{$1}, next if /^\@clear\s+($VARRE)\s*$/o; + # store things + $value{'_shorttitle'} = Unprotect_texi($1), next if /^\@shorttitle\s+(.*)$/; + $value{'_setfilename'} = Unprotect_texi($1), next if /^\@setfilename\s+(.*)$/; + $value{'_settitle'} = Unprotect_texi($1), next if /^\@settitle\s+(.*)$/; + $value{'_author'} .= Unprotect_texi($1)."\n", next if /^\@author\s+(.*)$/; + $value{'_subtitle'} .= Unprotect_texi($1)."\n", next if /^\@subtitle\s+(.*)$/; + $value{'_title'} .= Unprotect_texi($1)."\n", next if /^\@title\s+(.*)$/; + + # list item + if (/^\s*\@itemx?\s+/) + { + $what = $'; + $what =~ s/\s+$//; + if ($in_bibliography && $use_bibliography) + { + if ($what =~ /^$BIBRE$/o) + { + $id = 'BIB' . ++$bib_num; + $bib2href{$what} = "$docu_doc#$id"; + print "# found bibliography for '$what' id $id\n" + if $T2H_DEBUG & $DEBUG_BIB; + $what = &t2h_anchor($id, '', $what); + } + } + elsif ($in_glossary && $T2H_USE_GLOSSARY) + { + $id = 'GLOSS' . ++$gloss_num; + $entry = $what; + $entry =~ tr/A-Z/a-z/ unless $entry =~ /^[A-Z\s]+$/; + $gloss2href{$entry} = "$docu_doc#$id"; + print "# found glossary for '$entry' id $id\n" + if $T2H_DEBUG & $DEBUG_GLOSS; + $what = &t2h_anchor($id, '', $what); + } + elsif ($in_table && ($table_type eq 'f' || $table_type eq 'v')) + { + # APA: Insert
    before index anchor, if + # necessary to produce valid HTML. Close open + # paragraph first. + if ($html_element ne 'DT') + { + # APA: End paragraph, if any. + if ($html_element eq 'P') + { + push(@lines, &debug("</P>\n", __LINE__)); + &html_pop(); + } + push(@lines, &debug("<DT>", __LINE__)); + &html_push('DT'); + } + EnterIndexEntry($table_type, $what, $docu_doc, $section, \@lines); + } + # APA: End paragraph, if any. + if ($html_element eq 'P') + { + push(@lines, &debug("</P>\n", __LINE__)); + &html_pop(); + } + if ($html_element =~ m|^D[DLT]$|) + { + unless ($html_element eq 'DT') + { + push(@lines, &debug("
    ", __LINE__)); + } + if ($things_map{$in_table} && !$what) + { + # special case to allow @table @bullet for instance + push(@lines, &debug("$things_map{$in_table}\n", __LINE__)); + } + else + { + push(@lines, &debug("\@$in_table\{$what\}\n", __LINE__)); + } + push(@lines, "
    "); + &html_push('DD') unless $html_element eq 'DD'; + if ($table_type) + { # add also an index + unshift(@input_spool, "\@${table_type}index $what\n"); + } + } + elsif ($html_element eq 'TABLE') + { + push(@lines, &debug("$what\n", __LINE__)); + &html_push('TR'); + } + elsif ($html_element eq 'TR') + { + push(@lines, &debug("\n", __LINE__)); + push(@lines, &debug("$what\n", __LINE__)); + } + else + { + push(@lines, &debug("
  • $what\n", __LINE__)); + &html_push('LI') unless $html_element eq 'LI'; + } + push(@lines, &html_debug('', __LINE__)); + if ($deferred_ref) + { + push(@lines, &debug("$deferred_ref\n", __LINE__)); + $deferred_ref = ''; + } + next; + } + elsif (/^\@tab\s+(.*)$/) + { + push(@lines, "$1\n"); + next; + } + } + } + # paragraph separator + if ($_ eq "\n" && ! $in_pre) + { + next if $#lines >= 0 && $lines[$#lines] eq "\n"; + if ($html_element eq 'P') + { + push (@lines, &debug("

    \n

    \n", __LINE__)); + } + # else + # { + # push(@lines, "

    \n"); + # $_ = &debug("

    \n", __LINE__); + # } + elsif ($html_element eq 'body' || $html_element eq 'BLOCKQUOTE' || $html_element eq 'DD' || $html_element eq 'LI') + { + &html_push('P'); + push(@lines, &debug("

    \n", __LINE__)); + } + } + # otherwise + push(@lines, $_) unless $in_titlepage; + push(@lines, &debug("\n", __LINE__)) if ($tag eq 'center'); + } + # finish TOC + $level = 0; + while ($level < $curlevel) + { + $curlevel--; + push(@toc_lines, "\n"); + } + print "# end of pass 1\n" if $T2H_VERBOSE; +} + +#+++############################################################################ +# # +# Stuff related to Index generation # +# # +#---############################################################################ + +sub EnterIndexEntry +{ + my $prefix = shift; + my $key = shift; + my $docu_doc = shift; + my $section = shift; + my $lines = shift; + local $_; + + warn "$ERROR Undefined index command: $_", next + unless (exists ($index_properties->{$prefix})); + $key =~ s/\s+$//; + $_ = $key; + &protect_texi; + $key = $_; + $_ = &protect_html($_); + my $html_key = substitute_style($_); + my $id; + $key = remove_style($key); + $key = remove_things($key); + $_ = $key; + &unprotect_texi; + $key = $_; + while (exists $index->{$prefix}->{$key}) + { + $key .= ' '; + } + ; + if ($lines->[$#lines] =~ /^$/) + { + $id = $1; + } + else + { + $id = 'IDX' . 
++$idx_num; + push(@$lines, &t2h_anchor($id, '', $T2H_INVISIBLE_MARK, !$in_pre)); + } + $index->{$prefix}->{$key}->{html_key} = $html_key; + $index->{$prefix}->{$key}->{section} = $section; + $index->{$prefix}->{$key}->{href} = "$docu_doc#$id"; + print "# found ${prefix}index for '$key' with id $id\n" + if $T2H_DEBUG & $DEBUG_INDEX; +} + +sub IndexName2Prefix +{ + my $name = shift; + my $prefix; + + for $prefix (keys %$index_properties) + { + return $prefix if ($index_properties->{$prefix}->{name} eq $name); + } + return undef; +} + +sub GetIndexEntries +{ + my $normal = shift; + my $code = shift; + my ($entries, $prefix, $key) = ({}); + for $prefix (keys %$normal) + { + for $key (keys %{$index->{$prefix}}) + { + $entries->{$key} = {%{$index->{$prefix}->{$key}}}; + } + } + + if (defined($code)) + { + for $prefix (keys %$code) + { + unless (exists $normal->{$prefix}) + { + for $key (keys %{$index->{$prefix}}) + { + $entries->{$key} = {%{$index->{$prefix}->{$key}}}; + $entries->{$key}->{html_key} = "$entries->{$key}->{html_key}"; + } + } + } + } + return $entries; +} + +sub byAlpha +{ + if ($a =~ /^[A-Za-z]/) + { + if ($b =~ /^[A-Za-z]/) + { + return lc($a) cmp lc($b); + } + else + { + return 1; + } + } + elsif ($b =~ /^[A-Za-z]/) + { + return -1; + } + else + { + return lc($a) cmp lc($b); + } +} + +sub GetIndexPages +{ + my $entries = shift; + my (@Letters, $key); + my ($EntriesByLetter, $Pages, $page) = ({}, [], {}); + my @keys = sort byAlpha keys %$entries; + + for $key (@keys) + { + push @{$EntriesByLetter->{uc(substr($key,0, 1))}} , $entries->{$key}; + } + @Letters = sort byAlpha keys %$EntriesByLetter; + $T2H_SPLIT_INDEX = 0 unless $T2H_SPLIT; + + unless ($T2H_SPLIT_INDEX) + { + $page->{First} = $Letters[0]; + $page->{Last} = $Letters[$#Letters]; + $page->{Letters} = \@Letters; + $page->{EntriesByLetter} = $EntriesByLetter; + push @$Pages, $page; + return $Pages; + } + + if ($T2H_SPLIT_INDEX =~ /^\d+$/) + { + my $i = 0; + my ($prev_letter, $letter); + for 
$letter (@Letters) + { + if ($i > $T2H_SPLIT_INDEX) + { + $page->{Last} = $prev_letter; + push @$Pages, $page; + $i=0; + } + if ($i == 0) + { + $page = {}; + $page->{Letters} = []; + $page->{EntriesByLetter} = {}; + $page->{First} = $letter; + } + push @{$page->{Letters}}, $letter; + $page->{EntriesByLetter}->{$letter} = [@{$EntriesByLetter->{$letter}}]; + $i += scalar(@{$EntriesByLetter->{$letter}}); + $prev_letter = $letter; + } + $page->{Last} = $Letters[$#Letters]; + push @$Pages, $page; + } + return $Pages; +} + +sub GetIndexSummary +{ + my $first_page = shift; + my $Pages = shift; + my $name = shift; + my ($page, $letter, $summary, $i, $l1, $l2, $l); + + $i = 0; + $summary = '
    Jump to:   '; + for $page ($first_page, @$Pages) + { + for $letter (@{$page->{Letters}}) + { + $l = t2h_anchor('', "$page->{href}#${name}_$letter", "$letter", + 0, 'style="text-decoration:none"') . "\n   \n"; + if ($letter =~ /^[A-Za-z]/) + { + $l2 .= $l; + } + else + { + $l1 .= $l; + } + } + } + $summary .= $l1 . "
    \n" if ($l1); + $summary .= $l2 . '
    '; + return $summary; +} + +sub PrintIndexPage +{ + my $lines = shift; + my $summary = shift; + my $page = shift; + my $name = shift; + + push @$lines, $summary; + + push @$lines , <

    + + + +EOT + + for $letter (@{$page->{Letters}}) + { + push @$lines, "\n"; + for $entry (@{$page->{EntriesByLetter}->{$letter}}) + { + push @$lines, + "\n"; + } + push @$lines, "\n"; + } + push @$lines, "
    Index Entry Section

    ".protect_html($letter)."
    " . + t2h_anchor('', $entry->{href}, $entry->{html_key}) . + "" . + t2h_anchor('', sec_href($entry->{section}), clean_name($entry->{section})) . + "

    "; + push @$lines, $summary; +} + +sub PrintIndex +{ + my $lines = shift; + my $name = shift; + my $section = shift; + $section = 'Top' unless $section; + my $prefix = IndexName2Prefix($name); + + warn ("$ERROR printindex: bad index name: $name"), return + unless $prefix; + + if ($index_properties->{$prefix}->{code}) + { + $index_properties->{$prefix}->{from_code}->{$prefix} = 1; + } + else + { + $index_properties->{$prefix}->{from}->{$prefix}= 1; + } + + my $Entries = GetIndexEntries($index_properties->{$prefix}->{from}, + $index_properties->{$prefix}->{from_code}); + return unless %$Entries; + + if ($T2H_IDX_SUMMARY) + { + my $key; + open(FHIDX, ">$docu_rdir$docu_name" . "_$name.idx") + || die "Can't open > $docu_rdir$docu_name" . "_$name.idx for writing: $!\n"; + print "# writing $name index summary in $docu_rdir$docu_name" . "_$name.idx...\n" if $T2H_VERBOSE; + + for $key (sort keys %$Entries) + { + print FHIDX "$key\t$Entries->{$key}->{href}\n"; + } + } + + my $Pages = GetIndexPages($Entries); + my $page; + my $first_page = shift @$Pages; + my $sec_name = $section; + + # remove section number + $sec_name =~ s/.*? // if $sec_name =~ /^([A-Z]|\d+)\./; + + ($first_page->{href} = sec_href($section)) =~ s/\#.*$//; + $node2prev{$section} = Sec2PrevNode($node2sec{$section}); + $prev_node = $section; + # Update tree structure of document + if (@$Pages) + { + my $sec; + my @after; + + while (@sections && $sections[$#sections] ne $section) + { + unshift @after, pop @sections; + } + + for $page (@$Pages) + { + my $node = ($page->{First} ne $page->{Last} ? 
+ "$sec_name: $page->{First} -- $page->{Last}" : + "$sec_name: $page->{First}"); + push @sections, $node; + $node2sec{$node} = $node; + $sec2node{$node} = $node; + $node2up{$node} = $section; + $page->{href} = next_doc(); + $page->{name} = $node; + $node2href{$node} = $page->{href}; + if ($prev_node) + { + $node2next{$prev_node} = $node; + $node2prev{$node} = $prev_node; + } + $prev_node = $node; + } + # Full circle - Next on last index page goes to Top + $node2next{$prev_node} = "Top"; + push @sections, @after; + } + + my $summary = GetIndexSummary($first_page, $Pages, $name); + PrintIndexPage($lines, $summary, $first_page, $name); + for $page (@$Pages) + { + push @$lines, ($T2H_SPLIT eq 'chapter' ? $CHAPTEREND : $SECTIONEND); + push @$lines, "

    $page->{name}

    \n"; + PrintIndexPage($lines, $summary, $page, $name); + } +} + + +#+++############################################################################ +# # +# Pass 2/3: handle style, menu, index, cross-reference # +# # +#---############################################################################ +sub pass2 +{ + my $sec; + my $href; + @lines2 = (); # whole document (2nd pass) + @lines3 = (); # whole document (3rd pass) + my $in_menu = 0; # am I inside a menu + my $in_menu_listing; + + while (@lines) + { + $_ = shift(@lines); + # + # special case (protected sections) + # + if (/^$PROTECTTAG/o) + { + push(@lines2, $_); + next; + } + # + # menu + # + if (/^\@menu\b/) + { + $in_menu = 1; + $in_menu_listing = 1; + # APA: implicitly ends paragraph, so let's do it + # explicitly to keep our HTML stack in sync. + if ($html_element eq 'P') + { + push (@lines2, &debug("

    \n", __LINE__)); + &html_pop(); + } + push(@lines2, &debug("
    \n", __LINE__)); + next; + } + if (/^\@end\s+menu\b/) + { + if ($in_menu_listing) + { + push(@lines2, &debug("
    \n", __LINE__)); + } + $in_menu = 0; + $in_menu_listing = 0; + next; + } + if ($in_menu) + { + my ($node, $name, $descr); + if (/^\*\s+($NODERE)::/o) + { + $node = $1; + $descr = $'; + } + elsif (/^\*\s+(.+):\s+([^\t,\.\n]+)[\t,\.\n]/) + { + $name = $1; + $node = $2; + $descr = $'; + } + elsif (/^\*/) + { + warn "$ERROR Bad menu line: $_"; + } + else + { + if ($in_menu_listing) + { + # APA: Handle menu comment lines. These don't end the menu! + # $in_menu_listing = 0; + push(@lines2,&debug('' . $_ . ' +', __LINE__)); + } + } + if ($node) + { + if (! $in_menu_listing) + { + $in_menu_listing = 1; + push(@lines2, &debug("\n", __LINE__)); + } + # look for continuation + while ($lines[0] =~ /^\s+\w+/) + { + $descr .= shift(@lines); + } + &menu_entry($node, $name, $descr); + } + next; + } + # + # printindex + # + PrintIndex(\@lines2, $2, $1), next + if (/^\@printindex\s+(\w+)/); + # + # simple style substitutions + # + $_ = &substitute_style($_); + # + # xref + # + while (/\@(x|px|info|)ref{([^{}]+)(}?)/) + { + # note: Texinfo may accept other characters + ($type, $nodes, $full) = ($1, $2, $3); + ($before, $after) = ($`, $'); + if (! 
$full && $after) + { + warn "$ERROR Bad xref (no ending } on line): $_"; + $_ = "$before$;0${type}ref\{$nodes$after"; + next; # while xref + } + if ($type eq 'x') + { + $type = "$T2H_WORDS->{$T2H_LANG}->{'See'} "; + } + elsif ($type eq 'px') + { + $type = "$T2H_WORDS->{$T2H_LANG}->{'see'} "; + } + elsif ($type eq 'info') + { + $type = "$T2H_WORDS->{$T2H_LANG}->{'See'} Info"; + } + else + { + $type = ''; + } + unless ($full) + { + $next = shift(@lines); + $next = &substitute_style($next); + chop($nodes); # remove final newline + if ($next =~ /\}/) + { # split on 2 lines + $nodes .= " $`"; + $after = $'; + } + else + { + $nodes .= " $next"; + $next = shift(@lines); + $next = &substitute_style($next); + chop($nodes); + if ($next =~ /\}/) + { # split on 3 lines + $nodes .= " $`"; + $after = $'; + } + else + { + warn "$ERROR Bad xref (no ending }): $_"; + $_ = "$before$;0xref\{$nodes$after"; + unshift(@lines, $next); + next; # while xref + } + } + } + $nodes =~ s/\s+/ /go; # remove useless spaces + @args = split(/\s*,\s*/, $nodes); + $node = $args[0]; # the node is always the first arg + $node = &normalise_node($node); + $sec = $args[2] || $args[1] || $node2sec{$node}; + $href = $node2href{$node}; + if (@args == 5) + { # reference to another manual + $sec = $args[2] || $node; + $man = $args[4] || $args[3]; + $_ = "${before}${type}$T2H_WORDS->{$T2H_LANG}->{'section'} `$sec' in \@cite{$man}$after"; + } + elsif ($type =~ /Info/) + { # inforef + warn "$ERROR Wrong number of arguments: $_" unless @args == 3; + ($nn, $_, $in) = @args; + $_ = "${before}${type} file `$in', node `$nn'$after"; + } + elsif ($sec && $href && ! $T2H_SHORT_REF) + { + $_ = "${before}${type}"; + $_ .= "$T2H_WORDS->{$T2H_LANG}->{'section'} " if $type; + $_ .= &t2h_anchor('', $href, $sec) . $after; + } + elsif ($href) + { + $_ = "${before}${type} " . + &t2h_anchor('', $href, $args[2] || $args[1] || $node) . 
+ $after; + } + else + { + warn "$ERROR Undefined node ($node): $_"; + $_ = "$before$;0xref{$nodes}$after"; + } + } + + # replace images + s[\@image\s*{(.+?)}] + { + my @args = split (/\s*,\s*/, $1); + my $base = $args[0]; + my $image = + LocateIncludeFile("$base.png") || + LocateIncludeFile("$base.jpg") || + LocateIncludeFile("$base.gif"); + warn "$ERROR no image file for $base: $_" unless ($image && -e $image); + ($T2H_CENTER_IMAGE ? + "
    \"$base\"
    " : + "\"$base\""); + }eg; + + # + # try to guess bibliography references or glossary terms + # + unless (/^/) + { + $done .= $pre . &t2h_anchor('', $href, $what); + } + else + { + $done .= "$pre$what"; + } + $_ = $post; + } + $_ = $done . $_; + } + if ($T2H_USE_GLOSSARY) + { + $done = ''; + while (/\b\w+\b/) + { + ($pre, $what, $post) = ($`, $&, $'); + $entry = $what; + $entry =~ tr/A-Z/a-z/ unless $entry =~ /^[A-Z\s]+$/; + $href = $gloss2href{$entry}; + if (defined($href) && $post !~ /^[^<]*<\/A>/) + { + $done .= $pre . &t2h_anchor('', $href, $what); + } + else + { + $done .= "$pre$what"; + } + $_ = $post; + } + $_ = $done . $_; + } + } + # otherwise + push(@lines2, $_); + } + print "# end of pass 2\n" if $T2H_VERBOSE; +} + +sub pass3 +{ + # + # split style substitutions + # + my $text; + while (@lines2) + { + $_ = shift(@lines2); + # + # special case (protected sections) + # + if (/^$PROTECTTAG/o) + { + push(@lines3, $_); + next; + } + # + # split style substitutions + # + $old = ''; + while ($old ne $_) + { + $old = $_; + if (/\@(\w+)\{/) + { + ($before, $style, $after) = ($`, $1, $'); + if (defined($style_map{$style})) + { + $_ = $after; + $text = ''; + $after = ''; + $failed = 1; + while (@lines2) + { + if (/\}/) + { + $text .= $`; + $after = $'; + $failed = 0; + last; + } + else + { + $text .= $_; + $_ = shift(@lines2); + } + } + if ($failed) + { + die "* Bad syntax (\@$style) after: $before\n"; + } + else + { + $text = &apply_style($style, $text); + $_ = "$before$text$after"; + } + } + } + } + # otherwise + push(@lines3, $_); + } + print "# end of pass 3\n" if $T2H_VERBOSE; +} + +#+++############################################################################ +# # +# Pass 4: foot notes, final cleanup # +# # +#---############################################################################ +sub pass4 +{ + my $text; + my $name; + @foot_lines = (); # footnotes + @doc_lines = (); # final document + $end_of_para = 0; # true if last line is

    <P> + + # APA: There aint no paragraph before the first one! + # This fixes a HTML validation error. + $lines3[0] =~ s|^<P>
    \n|\n|; + while (@lines3) + { + $_ = shift(@lines3); + # + # special case (protected sections) + # + if (/^$PROTECTTAG/o) + { + push(@doc_lines, $_); + $end_of_para = 0; + next; + } + # + # footnotes + # + while (/\@footnote([^\{\s]+)\{/) + { + ($before, $d, $after) = ($`, $1, $'); + $_ = $after; + $text = ''; + $after = ''; + $failed = 1; + while (@lines3) + { + if (/\}/) + { + $text .= $`; + $after = $'; + $failed = 0; + last; + } + else + { + $text .= $_; + $_ = shift(@lines3); + } + } + if ($failed) + { + die "* Bad syntax (\@footnote) after: $before\n"; + } + else + { + $foot_num++; + $docid = "DOCF$foot_num"; + $footid = "FOOT$foot_num"; + $foot = "($foot_num)"; + push(@foot_lines, "

    " . &t2h_anchor($footid, "$d#$docid", $foot) . "

    \n"); + $text = "

    <P>$text" unless $text =~ /^\s*<P>
    /; + push(@foot_lines, "$text\n"); + $_ = $before . &t2h_anchor($docid, "$docu_foot#$footid", $foot) . $after; + } + } + # + # remove unnecessary

    <P> + # + if (/^\s*<P>
    \s*$/) + { + next if $end_of_para++; + } + else + { + $end_of_para = 0; + } + # otherwise + push(@doc_lines, $_); + } + + print "# end of pass 4\n" if $T2H_VERBOSE; +} + +#+++############################################################################ +# # +# Pass 5: print things # +# # +#---############################################################################ +sub pass5 +{ + $T2H_L2H = &l2h_FinishToLatex if ($T2H_L2H); + $T2H_L2H = &l2h_ToHtml if ($T2H_L2H); + $T2H_L2H = &l2h_InitFromHtml if ($T2H_L2H); + + T2H_InitGlobals(); + + # fix node2up, node2prev, node2next, if desired + if ($has_top_command) + { + for $section (keys %sec2number) + { + $node2href{$sec2node{$section}} =~ /SEC(\d+)$/; + $node = $sec2node{$section}; + $node2up{$node} = Sec2UpNode($section) unless $node2up{$node}; + $node2prev{$node} = Sec2PrevNode($section) unless $node2prev{$node}; + $node2next{$node} = Sec2NextNode($section) unless $node2next{$node}; + } + } + + # prepare %T2H_THISDOC + $T2H_THISDOC{fulltitle} = $value{'_title'} || $value{'_settitle'} || "Untitled Document"; + $T2H_THISDOC{title} = $value{'_settitle'} || $T2H_THISDOC{fulltitle}; + $T2H_THISDOC{author} = $value{'_author'}; + $T2H_THISDOC{subtitle} = $value{'_subtitle'}; + $T2H_THISDOC{shorttitle} = $value{'_shorttitle'}; + for $key (keys %T2H_THISDOC) + { + $_ = &substitute_style($T2H_THISDOC{$key}); + &unprotect_texi; + s/\s*$//; + $T2H_THISDOC{$key} = $_; + } + + # if no sections, then simply print document as is + unless (@sections) + { + print "# Writing content into $docu_top_file \n" if $T2H_VERBOSE; + open(FILE, "> $docu_top_file") + || die "$ERROR: Can't open $docu_top_file for writing: $!\n"; + + &$T2H_print_page_head(\*FILE); + $T2H_THIS_SECTION = \@doc_lines; + t2h_print_lines(\*FILE); + &$T2H_print_foot_navigation(\*FILE); + &$T2H_print_page_foot(\*FILE); + close(FILE); + goto Finish; + } + + # initialize $T2H_HREF, $T2H_NAME + %T2H_HREF = + ( + 'First' , sec_href($sections[0]), + 'Last', 
sec_href($sections[$#sections]), + 'About', $docu_about. '#SEC_About', + ); + + # prepare TOC, OVERVIEW, TOP + $T2H_TOC = \@toc_lines; + $T2H_OVERVIEW = \@stoc_lines; + if ($has_top) + { + while (1) + { + $_ = shift @doc_lines; + last if /$TOPEND/; + push @$T2H_TOP, $_; + } + $T2H_HREF{'Top'} = $docu_top . '#SEC_Top'; + } + else + { + $T2H_HREF{'Top'} = $T2H_HREF{First}; + } + + $node2href{Top} = $T2H_HREF{Top}; + $T2H_HREF{Contents} = $docu_toc.'#SEC_Contents' if @toc_lines; + $T2H_HREF{Overview} = $docu_stoc.'#SEC_OVERVIEW' if @stoc_lines; + + # settle on index + if ($T2H_INDEX_CHAPTER) + { + $T2H_HREF{Index} = $node2href{normalise_node($T2H_INDEX_CHAPTER)}; + warn "$ERROR T2H_INDEX_CHAPTER '$T2H_INDEX_CHAPTER' not found\n" + unless $T2H_HREF{Index}; + } + if (! $T2H_HREF{Index} && $first_index_chapter) + { + $T2H_INDEX_CHAPTER = $first_index_chapter; + $T2H_HREF{Index} = $node2href{$T2H_INDEX_CHAPTER}; + } + + print "# Using '" . clean_name($T2H_INDEX_CHAPTER) . "' as index page\n" + if ($T2H_VERBOSE && $T2H_HREF{Index}); + + %T2H_NAME = + ( + 'First', clean_name($sec2node{$sections[0]}), + 'Last', clean_name($sec2node{$sections[$#sections]}), + 'About', $T2H_WORDS->{$T2H_LANG}->{'About_Title'}, + 'Contents', $T2H_WORDS->{$T2H_LANG}->{'ToC_Title'}, + 'Overview', $T2H_WORDS->{$T2H_LANG}->{'Overview_Title'}, + 'Index' , clean_name($T2H_INDEX_CHAPTER), + 'Top', clean_name($T2H_TOP_HEADING || $T2H_THISDOC{'title'} || $T2H_THISDOC{'shorttitle'}), + ); + + ############################################################################# + # print frame and frame toc file + # + if ( $T2H_FRAMES ) + { + open(FILE, "> $docu_frame_file") + || die "$ERROR: Can't open $docu_frame_file for writing: $!\n"; + print "# Creating frame in $docu_frame_file ...\n" if $T2H_VERBOSE; + &$T2H_print_frame(\*FILE); + close(FILE); + + open(FILE, "> $docu_toc_frame_file") + || die "$ERROR: Can't open $docu_toc_frame_file for writing: $!\n"; + print "# Creating toc frame in $docu_frame_file 
...\n" if $T2H_VERBOSE; + &$T2H_print_toc_frame(\*FILE); + close(FILE); + } + + + ############################################################################# + # Monolithic beginning. + # + unless ($T2H_SPLIT) + { + open(FILE, "> $docu_doc_file") + || die "$ERROR: Can't open $docu_doc_file for writing: $!\n"; + &$T2H_print_page_head(\*FILE); + } + + + ############################################################################# + # print Top + # + if ($has_top) + { + if ($T2H_SPLIT) + { + open(FILE, "> $docu_top_file") + || die "$ERROR: Can't open $docu_top_file for writing: $!\n"; + } + + print "# Creating Top in $docu_top_file ...\n" if $T2H_VERBOSE; + $T2H_THIS_SECTION = $T2H_TOP; + $T2H_HREF{This} = $T2H_HREF{Top}; + $T2H_NAME{This} = $T2H_NAME{Top}; + &$T2H_print_Top(\*FILE); + + if ($T2H_SPLIT) + { + close(FILE) + || die "$ERROR: Error occurred when closing $docu_top_file: $!\n"; + } + } + + + ############################################################################# + # Print sections + # + $T2H_NODE{Forward} = $sec2node{$sections[0]}; + $T2H_NAME{Forward} = &clean_name($sec2node{$sections[0]}); + $T2H_HREF{Forward} = sec_href($sections[0]); + $T2H_NODE{This} = 'Top'; + $T2H_NAME{This} = $T2H_NAME{Top}; + $T2H_HREF{This} = $T2H_HREF{Top}; + if ($T2H_SPLIT) + { + print "# writing " . scalar(@sections) . + " sections into $docu_rdir$docu_name"."_[1..$doc_num].$docu_ext" + if $T2H_VERBOSE; + $previous = ($T2H_SPLIT eq 'chapter' ? $CHAPTEREND : $SECTIONEND); + undef $FH; + $doc_num = 0; + } + else + { + print "# writing " . scalar(@sections) . " sections in $docu_top_file ..." 
+ if $T2H_VERBOSE; + $FH = \*FILE; + $previous = ''; + } + + $counter = 0; + # loop through sections + while ($section = shift(@sections)) + { + if ($T2H_SPLIT && ($T2H_SPLIT eq 'section' || $previous eq $CHAPTEREND)) + { + if ($FH) + { + #close previous page + &$T2H_print_chapter_footer($FH) if $T2H_SPLIT eq 'chapter'; + &$T2H_print_page_foot($FH); + close($FH); + undef $FH; + } + } + $T2H_NAME{Back} = $T2H_NAME{This}; + $T2H_HREF{Back} = $T2H_HREF{This}; + $T2H_NODE{Back} = $T2H_NODE{This}; + $T2H_NAME{This} = $T2H_NAME{Forward}; + $T2H_HREF{This} = $T2H_HREF{Forward}; + $T2H_NODE{This} = $T2H_NODE{Forward}; + if ($sections[0]) + { + $T2H_NODE{Forward} = $sec2node{$sections[0]}; + $T2H_NAME{Forward} = &clean_name($T2H_NODE{Forward}); + $T2H_HREF{Forward} = sec_href($sections[0]); + } + else + { + delete $T2H_HREF{Forward}; + delete $T2H_NODE{Forward}; + delete $T2H_NAME{Forward}; + } + + $node = $node2up{$T2H_NODE{This}}; + $T2H_HREF{Up} = $node2href{$node}; + if ($T2H_HREF{Up} eq $T2H_HREF{This} || ! $T2H_HREF{Up}) + { + $T2H_NAME{Up} = $T2H_NAME{Top}; + $T2H_HREF{Up} = $T2H_HREF{Top}; + $T2H_NODE{Up} = 'Up'; + } + else + { + $T2H_NAME{Up} = &clean_name($node); + $T2H_NODE{Up} = $node; + } + + $node = $node2prev{$T2H_NODE{This}}; + $T2H_NAME{Prev} = &clean_name($node); + $T2H_HREF{Prev} = $node2href{$node}; + $T2H_NODE{Prev} = $node; + + $node = Node2FastBack($T2H_NODE{This}); + $T2H_NAME{FastBack} = &clean_name($node); + $T2H_HREF{FastBack} = $node2href{$node}; + $T2H_NODE{FastBack} = $node; + + $node = $node2next{$T2H_NODE{This}}; + $T2H_NAME{Next} = &clean_name($node); + $T2H_HREF{Next} = $node2href{$node}; + $T2H_NODE{Next} = $node; + + $node = Node2FastForward($T2H_NODE{This}); + $T2H_NAME{FastForward} = &clean_name($node); + $T2H_HREF{FastForward} = $node2href{$node}; + $T2H_NODE{FastForward} = $node; + + if (! 
defined($FH)) + { + my $file = $T2H_HREF{This}; + $file =~ s/\#.*$//; + open(FILE, "> $docu_rdir$file") || + die "$ERROR: Can't open $docu_rdir$file for writing: $!\n"; + $FH = \*FILE; + &$T2H_print_page_head($FH); + t2h_print_label($FH); + &$T2H_print_chapter_header($FH) if $T2H_SPLIT eq 'chapter'; + } + else + { + t2h_print_label($FH); + } + + $T2H_THIS_SECTION = []; + while (@doc_lines) + { + $_ = shift(@doc_lines); + last if ($_ eq $SECTIONEND || $_ eq $CHAPTEREND); + push(@$T2H_THIS_SECTION, $_); + } + $previous = $_; + &$T2H_print_section($FH); + + if ($T2H_VERBOSE) + { + $counter++; + print "." if $counter =~ /00$/; + } + } + if ($T2H_SPLIT) + { + &$T2H_print_chapter_footer($FH) if $T2H_SPLIT eq 'chapter'; + &$T2H_print_page_foot($FH); + close($FH); + } + print "\n" if $T2H_VERBOSE; + + ############################################################################# + # Print ToC, Overview, Footnotes + # + delete $T2H_HREF{Prev}; + delete $T2H_HREF{Next}; + delete $T2H_HREF{Back}; + delete $T2H_HREF{Forward}; + delete $T2H_HREF{Up}; + + if (@foot_lines) + { + print "# writing Footnotes in $docu_foot_file...\n" if $T2H_VERBOSE; + open (FILE, "> $docu_foot_file") || die "$ERROR: Can't open $docu_foot_file for writing: $!\n" + if $T2H_SPLIT; + $T2H_HREF{This} = $docu_foot; + $T2H_NAME{This} = $T2H_WORDS->{$T2H_LANG}->{'Footnotes_Title'}; + $T2H_THIS_SECTION = \@foot_lines; + &$T2H_print_Footnotes(\*FILE); + close(FILE) if $T2H_SPLIT; + } + + if (@toc_lines) + { + print "# writing Toc in $docu_toc_file...\n" if $T2H_VERBOSE; + open (FILE, "> $docu_toc_file") || die "$ERROR: Can't open $docu_toc_file for writing: $!\n" + if $T2H_SPLIT; + $T2H_HREF{This} = $T2H_HREF{Contents}; + $T2H_NAME{This} = $T2H_NAME{Contents}; + $T2H_THIS_SECTION = \@toc_lines; + &$T2H_print_Toc(\*FILE); + close(FILE) if $T2H_SPLIT; + } + + if (@stoc_lines) + { + print "# writing Overview in $docu_stoc_file...\n" if $T2H_VERBOSE; + open (FILE, "> $docu_stoc_file") || die "$ERROR: Can't open 
$docu_stoc_file for writing: $!\n" + if $T2H_SPLIT; + $T2H_HREF{This} = $T2H_HREF{Overview}; + $T2H_NAME{This} = $T2H_NAME{Overview}; + $T2H_THIS_SECTION = \@stoc_lines; + unshift @$T2H_THIS_SECTION, "

    \n"; + push @$T2H_THIS_SECTION, "\n
    \n"; + &$T2H_print_Overview(\*FILE); + close(FILE) if $T2H_SPLIT; + } + + if ($about_body = &$T2H_about_body()) + { + print "# writing About in $docu_about_file...\n" if $T2H_VERBOSE; + open (FILE, "> $docu_about_file") || die "$ERROR: Can't open $docu_about_file for writing: $!\n" + if $T2H_SPLIT; + + $T2H_HREF{This} = $T2H_HREF{About}; + $T2H_NAME{This} = $T2H_NAME{About}; + $T2H_THIS_SECTION = [$about_body]; + &$T2H_print_About(\*FILE); + close(FILE) if $T2H_SPLIT; + } + + unless ($T2H_SPLIT) + { + &$T2H_print_page_foot(\*FILE); + close (FILE); + } + + Finish: + &l2h_FinishFromHtml if ($T2H_L2H); + &l2h_Finish if($T2H_L2H); + print "# that's all folks\n" if $T2H_VERBOSE; + + exit(0); +} + +#+++############################################################################ +# # +# Low level functions # +# # +#---############################################################################ + +sub LocateIncludeFile +{ + my $file = shift; + my $dir; + + # APA: Don't implicitly search ., to conform with the docs! + # return $file if (-e $file && -r $file); + foreach $dir (@T2H_INCLUDE_DIRS) + { + return "$dir/$file" if (-e "$dir/$file" && -r "$dir/$file"); + } + return undef; +} + +sub clean_name +{ + local ($_); + $_ = &remove_style($_[0]); + &unprotect_texi; + return $_; +} + +sub update_sec_num +{ + my($name, $level) = @_; + my $ret; + + $level--; # here we start at 0 + if ($name =~ /^appendix/ || @appendix_sec_num) + { + # appendix style + if (@appendix_sec_num) + { + &incr_sec_num($level, @appendix_sec_num); + } + else + { + @appendix_sec_num = ('A', 0, 0, 0); + } + $ret = join('.', @appendix_sec_num[0..$level]); + } + else + { + # normal style + if (@normal_sec_num) + { + &incr_sec_num($level, @normal_sec_num); + } + else + { + @normal_sec_num = (1, 0, 0, 0); + } + $ret = join('.', @normal_sec_num[0..$level]); + } + $ret .= "." 
if $level == 0; + return $ret; +} + +sub incr_sec_num +{ + local($level, $l); + $level = shift(@_); + $_[$level]++; + foreach $l ($level+1 .. 3) + { + $_[$l] = 0; + } +} + +sub Sec2UpNode +{ + my $sec = shift; + my $num = $sec2number{$sec}; + + return '' unless $num; + return 'Top' unless $num =~ /\.\d+/; + $num =~ s/\.[^\.]*$//; + $num = $num . '.' unless $num =~ /\./; + return $sec2node{$number2sec{$num}}; +} + +# Return previous node or "Top" +sub Sec2PrevNode +{ + my $sec = shift; + my $sec_num = $sec2seccount{$sec} - 1; + return "Top" if !$sec_num || $sec_num < 1; + return $sec2node{$seccount2sec{$sec_num}}; +} + +# Return next node or "Top" +sub Sec2NextNode +{ + my $sec = shift; + my $sec_num = $sec2seccount{$sec} + 1; + return "Top" unless exists $seccount2sec{$sec_num}; + return $sec2node{$seccount2sec{$sec_num}}; +} + +# +# sub Node2FastBack NODE +# +# INPUTS +# NODE A node +# +# RETURNS +# The beginning of this chapter, or if already there, the beginning of the +# previous chapter. +# +sub Node2FastBack +{ + my $node = shift; + my $num = $sec2number{$node2sec{$node}}; + my $n; + + # Index Pages have no section number and 1. should go back to Top + return $node2prev{$node} if !$num or $num eq "1."; + + # Get the current chapter + $num =~ /^([A-Z\d]+)\./; + $n = $1; + + # If the first section of this chapter, decrement chapter + $n = $n eq 'A' ? $normal_sec_num[0] : $n =~ /^\d+$/ ? --$n : chr(ord($n)-1) + if $n . '.' eq $num; + + # Return node name for section number "$n." + return $sec2node{$number2sec{$n . '.'}} || $node2prev{$node}; +} + +# +# sub Node2FastForward NODE +# +# INPUTS +# NODE A node +# +# RETURNS +# The beginning of the next chapter. +# +sub Node2FastForward +{ + my $node = shift; + my $num = $sec2number{$node2sec{$node}}; + my $n; + + # Index pages + return $node2next{$node} if !$num; + + # Get current chapter + $num =~ /^([A-Z\d]+)\./; + $n = $1; + + # Increment chapter + $n = $n eq $normal_sec_num[0] ? 
'A' : ++$n; + + # Return node name + return $sec2node{$number2sec{$n . '.'}} || $node2next{$node}; +} + +sub check +{ + local($_, %seen, %context, $before, $match, $after); + + while (<>) + { + if (/\@(\*|\.|\:|\@|\{|\})/) + { + $seen{$&}++; + $context{$&} .= "> $_" if $T2H_VERBOSE; + $_ = "$`XX$'"; + redo; + } + if (/\@(\w+)/) + { + ($before, $match, $after) = ($`, $&, $'); + if ($before =~ /\b[-\w]+$/ && $after =~ /^[-\w.]*\b/) + { # e-mail address + $seen{'e-mail address'}++; + $context{'e-mail address'} .= "> $_" if $T2H_VERBOSE; + } + else + { + $seen{$match}++; + $context{$match} .= "> $_" if $T2H_VERBOSE; + } + $match =~ s/^\@/X/; + $_ = "$before$match$after"; + redo; + } + } + + foreach (sort(keys(%seen))) + { + if ($T2H_VERBOSE) + { + print "$_\n"; + print $context{$_}; + } + else + { + print "$_ ($seen{$_})\n"; + } + } +} + +sub open +{ + my($name) = @_; + + ++$fh_name; + no strict "refs"; + if (open($fh_name, $name)) + { + unshift(@fhs, $fh_name); + } + else + { + warn "$ERROR Can't read file $name: $!\n"; + } + use strict "refs"; +} + +sub init_input +{ + @fhs = (); # hold the file handles to read + @input_spool = (); # spooled lines to read + $fh_name = 'FH000'; + &open($docu); +} + +sub next_line +{ + my($fh, $line); + + if (@input_spool) + { + $line = shift(@input_spool); + return($line); + } + while (@fhs) + { + $fh = $fhs[0]; + $line = <$fh>; + return($line) if $line; + close($fh); + shift(@fhs); + } + return(undef); +} + +# used in pass 1, use &next_line +sub skip_until +{ + local($tag) = @_; + local($_); + + while ($_ = &next_line) + { + return if /^\@end\s+$tag\s*$/; + } + die "* Failed to find '$tag' after: " . $lines[$#lines]; +} + +# used in pass 1 for l2h use &next_line +sub string_until +{ + local($tag) = @_; + local($_, $string); + + while ($_ = &next_line) + { + return $string if /^\@end\s+$tag\s*$/; + # $_ =~ s/hbox/mbox/g; + $string = $string.$_; + } + die "* Failed to find '$tag' after: " . 
$lines[$#lines]; +} + +# +# HTML stacking to have a better HTML output +# + +sub html_reset +{ + @html_stack = ('html'); + $html_element = 'body'; +} + +sub html_push +{ + local($what) = @_; + push(@html_stack, $html_element); + $html_element = $what; +} + +sub html_push_if +{ + local($what) = @_; + push(@html_stack, $html_element) + if ($html_element && $html_element ne 'P'); + $html_element = $what; +} + +sub html_pop +{ + $html_element = pop(@html_stack); +} + +sub html_pop_if +{ + local($elt); + + if (@_) + { + foreach $elt (@_) + { + if ($elt eq $html_element) + { + $html_element = pop(@html_stack) if @html_stack; + last; + } + } + } + else + { + $html_element = pop(@html_stack) if @html_stack; + } +} + +sub html_debug +{ + my($what, $line) = @_; + if ($T2H_DEBUG & $DEBUG_HTML) + { + $what = "\n" unless $what; + return("$what") + } + return($what); +} + +# to debug the output... +sub debug +{ + my($what, $line) = @_; + return("$what") + if $T2H_DEBUG & $DEBUG_HTML; + return($what); +} + +sub SimpleTexi2Html +{ + local $_ = $_[0]; + &protect_texi; + &protect_html; + $_ = substitute_style($_); + $_[0] = $_; +} + +sub normalise_node +{ + local $_ = $_[0]; + s/\s+/ /go; + s/ $//; + s/^ //; + &protect_texi; + &protect_html; + $_ = substitute_style($_); + $_[0] = $_; +} + +sub menu_entry +{ + my ($node, $name, $descr) = @_; + my ($href, $entry); + + &normalise_node($node); + $href = $node2href{$node}; + if ($href) + { + $descr =~ s/^\s+//; + $descr =~ s/\s*$//; + $descr = SimpleTexi2Html($descr); + if ($T2H_NUMBER_SECTIONS && !$T2H_NODE_NAME_IN_MENU && $node2sec{$node}) + { + $entry = $node2sec{$node}; + $name = ''; + } + else + { + &normalise_node($name); + $entry = ($name && ($name ne $node || ! $T2H_AVOID_MENU_REDUNDANCY) + ? "$name : $node" : $node); + } + + if ($T2H_AVOID_MENU_REDUNDANCY && $descr) + { + my $clean_entry = $entry; + $clean_entry =~ s/^.*? 
// if ($clean_entry =~ /^([A-Z]|\d+)\.[\d\.]* /); + $clean_entry =~ s/[^\w]//g; + my $clean_descr = $descr; + $clean_descr =~ s/[^\w]//g; + $descr = '' if ($clean_entry eq $clean_descr) + } + push(@lines2,&debug('
    \n", __LINE__)); + } + elsif ($node =~ /^\(.*\)\w+/) + { + push(@lines2,&debug('\n", __LINE__)) + } + else + { + warn "$ERROR Undefined node of menu_entry ($node): $_"; + } +} + +sub do_ctrl { "^$_[0]" } + +sub do_email +{ + my($addr, $text) = split(/,\s*/, $_[0]); + + $text = $addr unless $text; + &t2h_anchor('', "mailto:$addr", $text); +} + +sub do_sc +{ + # l2h does this much better + return &l2h_ToLatex("{\\sc ".&unprotect_html($_[0])."}") if ($T2H_L2H); + return "\U$_[0]\E"; +} + +sub do_math +{ + # APA: FIXME + # This sub doesn't seem to be used. + # What are $_[0] and $text? + my $text; + return &l2h_ToLatex("\$".&unprotect_html($_[0])."\$") if ($T2H_L2H); + return "".$text.""; +} + +sub do_uref +{ + my($url, $text, $only_text) = split(/,\s*/, $_[0]); + # APA: Don't markup obviously bad links. + # e.g. texinfo.texi 4.0 has this, which would lead to a broken + # link: + # @section @code{@@uref@{@var{url}[, @var{text}][, @var{replacement}]@}} + return if $url =~ /[<>]/; + $text = $only_text if $only_text; + $text = $url unless $text; + &t2h_anchor('', $url, $text); +} + +sub do_url { &t2h_anchor('', $_[0], $_[0]) } + +sub do_acronym +{ + return '' . $_[0] . ''; +} + +sub do_accent +{ + return "&$_[0]acute;" if $_[1] eq 'H'; + return "$_[0]." 
if $_[1] eq 'dotaccent'; + return "$_[0]*" if $_[1] eq 'ringaccent'; + return "$_[0]".'[' if $_[1] eq 'tieaccent'; + return "$_[0]".'(' if $_[1] eq 'u'; + return "$_[0]_" if $_[1] eq 'ubaraccent'; + return ".$_[0]" if $_[1] eq 'udotaccent'; + return "$_[0]<" if $_[1] eq 'v'; + return "&$_[0]cedil;" if $_[1] eq ','; + return "$_[0]" if $_[1] eq 'dotless'; + return undef; +} + +sub apply_style +{ + my($texi_style, $text) = @_; + my($style); + + $style = $style_map{$texi_style}; + if (defined($style)) + { # known style + my $do_quotes = 0; + if ($style =~ /^\"/) + { # add quotes + $style = $'; + $do_quotes = 1; + } + if ($style =~ /^\&/) + { # custom + $style = $'; + no strict "refs"; + $text = &$style($text, $texi_style); + use strict "refs"; + } + elsif ($style) + { # good style + $text = "<$style>$text"; + } + else + { # no style + } + $text = "$text" if $do_quotes; + } + else + { # unknown style + $text = undef; + } + return($text); +} + +# remove Texinfo styles +sub remove_style +{ + local($_) = @_; + 1 while(s/\@\w+{([^\{\}]+)}/$1/g); + return($_); +} + +sub remove_things +{ + local ($_) = @_; + s|\@(\w+)\{\}|$1|g; + return $_; +} + +sub substitute_style +{ + local($_) = @_; + my($changed, $done, $style, $text); + + &simple_substitutions; + $changed = 1; + while ($changed) + { + $changed = 0; + $done = ''; + while (/\@(\w+){([^\{\}]+)}/ || /\@(,){([^\{\}]+)}/) + { + $text = &apply_style($1, $2); + if ($text) + { + $_ = "$`$text$'"; + $changed = 1; + } + else + { + $done .= "$`\@$1"; + $_ = "{$2}$'"; + } + } + $_ = $done . $_; + } + return($_); +} + +sub t2h_anchor +{ + my($name, $href, $text, $newline, $extra_attribs) = @_; + my($result); + + $result = ". + # APA: Keep it simple. This is what perl's CGI::espaceHTML does. + # We may consider using that instead. + # If raw HTML is used outside @ifhtml or @html it's an error + # anyway. 
+ $what =~ s/\&/\&/go; + $what =~ s/\"/\"/go; + $what =~ s/\/\>/go; + return($what); +} + +sub unprotect_texi +{ + s/$;0/\@/go; + s/$;1/\{/go; + s/$;2/\}/go; + s/$;3/\`/go; + s/$;4/\'/go; +} + +sub Unprotect_texi +{ + local $_ = shift; + &unprotect_texi; + return($_); +} + +sub unprotect_html +{ + local($what) = @_; + # APA: Use + # Character entity references (eg. <) + # instead of + # Numeric character references (eg. <) + $what =~ s/\&/\&/go; + $what =~ s/\"/\"/go; + $what =~ s/\</\/go; + return($what); +} + +sub t2h_print_label +{ + my $fh = shift; + my $href = shift || $T2H_HREF{This}; + $href =~ s/.*#(.*)$/$1/; + print $fh qq{\n}; +} + +sub main +{ + SetDocumentLanguage('en') unless ($T2H_LANG); + # APA: There's got to be a better way: + $things_map{'today'} = &pretty_date; + # Identity: + $T2H_TODAY = &pretty_date; # like "20 September 1993" + # the eval prevents this from breaking on system which do not have + # a proper getpwuid implemented + eval { ($T2H_USER = (getpwuid ($<))[6]) =~ s/,.*//;}; # Who am i + # APA: Provide Windows NT workaround until getpwuid gets + # implemented there. + $T2H_USER = $ENV{'USERNAME'} unless defined $T2H_USER; + &pass1(); + &pass2(); + &pass3(); + &pass4(); + &pass5(); +} + +&main(); + +############################################################################## + +# These next few lines are legal in both Perl and nroff. + +.00 ; # finish .ig + +'di \" finish diversion--previous line must be blank +.nr nl 0-1 \" fake up transition to first page again +.nr % 0 \" start at page 1 +'; __END__ ############# From here on it's a standard manual page ############ + .so /usr/share/man/man1/texi2html.1 diff --git a/doc/texinfo.tex b/doc/texinfo.tex new file mode 100644 index 0000000..d93d432 --- /dev/null +++ b/doc/texinfo.tex @@ -0,0 +1,6976 @@ +% texinfo.tex -- TeX macros to handle Texinfo files. +% +% Load plain if necessary, i.e., if running under initex. 
+\expandafter\ifx\csname fmtname\endcsname\relax\input plain\fi +% +\def\texinfoversion{2004-04-07.08} +% +% Copyright (C) 1985, 1986, 1988, 1990, 1991, 1992, 1993, 1994, 1995, +% 1996, 1997, 1998, 1999, 2000, 2001, 2002, 2003, 2004 Free Software +% Foundation, Inc. +% +% This texinfo.tex file is free software; you can redistribute it and/or +% modify it under the terms of the GNU General Public License as +% published by the Free Software Foundation; either version 2, or (at +% your option) any later version. +% +% This texinfo.tex file is distributed in the hope that it will be +% useful, but WITHOUT ANY WARRANTY; without even the implied warranty +% of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU +% General Public License for more details. +% +% You should have received a copy of the GNU General Public License +% along with this texinfo.tex file; see the file COPYING. If not, write +% to the Free Software Foundation, Inc., 59 Temple Place - Suite 330, +% Boston, MA 02111-1307, USA. +% +% As a special exception, when this file is read by TeX when processing +% a Texinfo source document, you may use the result without +% restriction. (This has been our intent since Texinfo was invented.) +% +% Please try the latest version of texinfo.tex before submitting bug +% reports; you can get the latest version from: +% http://www.gnu.org/software/texinfo/ (the Texinfo home page), or +% ftp://tug.org/tex/texinfo.tex +% (and all CTAN mirrors, see http://www.ctan.org). +% The texinfo.tex in any given distribution could well be out +% of date, so if that's what you're using, please check. +% +% Send bug reports to bug-texinfo@gnu.org. Please include a +% complete document in each bug report with which we can reproduce the +% problem. Patches are, of course, greatly appreciated. +% +% To process a Texinfo manual with TeX, it's most reliable to use the +% texi2dvi shell script that comes with the distribution. 
For a simple +% manual foo.texi, however, you can get away with this: +% tex foo.texi +% texindex foo.?? +% tex foo.texi +% tex foo.texi +% dvips foo.dvi -o # or whatever; this makes foo.ps. +% The extra TeX runs get the cross-reference information correct. +% Sometimes one run after texindex suffices, and sometimes you need more +% than two; texi2dvi does it as many times as necessary. +% +% It is possible to adapt texinfo.tex for other languages, to some +% extent. You can get the existing language-specific files from the +% full Texinfo distribution. +% +% The GNU Texinfo home page is http://www.gnu.org/software/texinfo. + + +\message{Loading texinfo [version \texinfoversion]:} + +% If in a .fmt file, print the version number +% and turn on active characters that we couldn't do earlier because +% they might have appeared in the input file name. +\everyjob{\message{[Texinfo version \texinfoversion]}% + \catcode`+=\active \catcode`\_=\active} + +\message{Basics,} +\chardef\other=12 + +% We never want plain's \outer definition of \+ in Texinfo. +% For @tex, we can use \tabalign. +\let\+ = \relax + +% Save some plain tex macros whose names we will redefine. +\let\ptexb=\b +\let\ptexbullet=\bullet +\let\ptexc=\c +\let\ptexcomma=\, +\let\ptexdot=\. +\let\ptexdots=\dots +\let\ptexend=\end +\let\ptexequiv=\equiv +\let\ptexexclam=\! +\let\ptexfootnote=\footnote +\let\ptexgtr=> +\let\ptexhat=^ +\let\ptexi=\i +\let\ptexindent=\indent +\let\ptexnoindent=\noindent +\let\ptexinsert=\insert +\let\ptexlbrace=\{ +\let\ptexless=< +\let\ptexplus=+ +\let\ptexrbrace=\} +\let\ptexslash=\/ +\let\ptexstar=\* +\let\ptext=\t + +% If this character appears in an error message or help string, it +% starts a new line in the output. +\newlinechar = `^^J + +% Use TeX 3.0's \inputlineno to get the line number, for better error +% messages, but if we're using an old version of TeX, don't do anything. +% +\ifx\inputlineno\thisisundefined + \let\linenumber = \empty % Pre-3.0. 
+\else + \def\linenumber{l.\the\inputlineno:\space} +\fi + +% Set up fixed words for English if not already set. +\ifx\putwordAppendix\undefined \gdef\putwordAppendix{Appendix}\fi +\ifx\putwordChapter\undefined \gdef\putwordChapter{Chapter}\fi +\ifx\putwordfile\undefined \gdef\putwordfile{file}\fi +\ifx\putwordin\undefined \gdef\putwordin{in}\fi +\ifx\putwordIndexIsEmpty\undefined \gdef\putwordIndexIsEmpty{(Index is empty)}\fi +\ifx\putwordIndexNonexistent\undefined \gdef\putwordIndexNonexistent{(Index is nonexistent)}\fi +\ifx\putwordInfo\undefined \gdef\putwordInfo{Info}\fi +\ifx\putwordInstanceVariableof\undefined \gdef\putwordInstanceVariableof{Instance Variable of}\fi +\ifx\putwordMethodon\undefined \gdef\putwordMethodon{Method on}\fi +\ifx\putwordNoTitle\undefined \gdef\putwordNoTitle{No Title}\fi +\ifx\putwordof\undefined \gdef\putwordof{of}\fi +\ifx\putwordon\undefined \gdef\putwordon{on}\fi +\ifx\putwordpage\undefined \gdef\putwordpage{page}\fi +\ifx\putwordsection\undefined \gdef\putwordsection{section}\fi +\ifx\putwordSection\undefined \gdef\putwordSection{Section}\fi +\ifx\putwordsee\undefined \gdef\putwordsee{see}\fi +\ifx\putwordSee\undefined \gdef\putwordSee{See}\fi +\ifx\putwordShortTOC\undefined \gdef\putwordShortTOC{Short Contents}\fi +\ifx\putwordTOC\undefined \gdef\putwordTOC{Table of Contents}\fi +% +\ifx\putwordMJan\undefined \gdef\putwordMJan{January}\fi +\ifx\putwordMFeb\undefined \gdef\putwordMFeb{February}\fi +\ifx\putwordMMar\undefined \gdef\putwordMMar{March}\fi +\ifx\putwordMApr\undefined \gdef\putwordMApr{April}\fi +\ifx\putwordMMay\undefined \gdef\putwordMMay{May}\fi +\ifx\putwordMJun\undefined \gdef\putwordMJun{June}\fi +\ifx\putwordMJul\undefined \gdef\putwordMJul{July}\fi +\ifx\putwordMAug\undefined \gdef\putwordMAug{August}\fi +\ifx\putwordMSep\undefined \gdef\putwordMSep{September}\fi +\ifx\putwordMOct\undefined \gdef\putwordMOct{October}\fi +\ifx\putwordMNov\undefined \gdef\putwordMNov{November}\fi +\ifx\putwordMDec\undefined 
\gdef\putwordMDec{December}\fi +% +\ifx\putwordDefmac\undefined \gdef\putwordDefmac{Macro}\fi +\ifx\putwordDefspec\undefined \gdef\putwordDefspec{Special Form}\fi +\ifx\putwordDefvar\undefined \gdef\putwordDefvar{Variable}\fi +\ifx\putwordDefopt\undefined \gdef\putwordDefopt{User Option}\fi +\ifx\putwordDeffunc\undefined \gdef\putwordDeffunc{Function}\fi + +% In some macros, we cannot use the `\? notation---the left quote is +% in some cases the escape char. +\chardef\colonChar = `\: +\chardef\commaChar = `\, +\chardef\dotChar = `\. +\chardef\exclamChar= `\! +\chardef\questChar = `\? +\chardef\semiChar = `\; +\chardef\underChar = `\_ + +\chardef\spaceChar = `\ % +\chardef\spacecat = 10 +\def\spaceisspace{\catcode\spaceChar=\spacecat} + +% Ignore a token. +% +\def\gobble#1{} + +% The following is used inside several \edef's. +\def\makecsname#1{\expandafter\noexpand\csname#1\endcsname} + +% Hyphenation fixes. +\hyphenation{ + Flor-i-da Ghost-script Ghost-view Mac-OS Post-Script + ap-pen-dix bit-map bit-maps + data-base data-bases eshell fall-ing half-way long-est man-u-script + man-u-scripts mini-buf-fer mini-buf-fers over-view par-a-digm + par-a-digms rath-er rec-tan-gu-lar ro-bot-ics se-vere-ly set-up spa-ces + spell-ing spell-ings + stand-alone strong-est time-stamp time-stamps which-ever white-space + wide-spread wrap-around +} + +% Margin to add to right of even pages, to left of odd pages. +\newdimen\bindingoffset +\newdimen\normaloffset +\newdimen\pagewidth \newdimen\pageheight + +% For a final copy, take out the rectangles +% that mark overfull boxes (in case you have decided +% that the text looks ok even though it passes the margin). +% +\def\finalout{\overfullrule=0pt} + +% @| inserts a changebar to the left of the current line. It should +% surround any changed text. This approach does *not* work if the +% change spans more than two lines of output. 
To handle that, we would +% have to adopt a much more difficult approach (putting marks into the main +% vertical list for the beginning and end of each change). +% +\def\|{% + % \vadjust can only be used in horizontal mode. + \leavevmode + % + % Append this vertical mode material after the current line in the output. + \vadjust{% + % We want to insert a rule with the height and depth of the current + % leading; that is exactly what \strutbox is supposed to record. + \vskip-\baselineskip + % + % \vadjust-items are inserted at the left edge of the type. So + % the \llap here moves out into the left-hand margin. + \llap{% + % + % For a thicker or thinner bar, change the `1pt'. + \vrule height\baselineskip width1pt + % + % This is the space between the bar and the text. + \hskip 12pt + }% + }% +} + +% Sometimes it is convenient to have everything in the transcript file +% and nothing on the terminal. We don't just call \tracingall here, +% since that produces some useless output on the terminal. We also make +% some effort to order the tracing commands to reduce output in the log +% file; cf. trace.sty in LaTeX. +% +\def\gloggingall{\begingroup \globaldefs = 1 \loggingall \endgroup}% +\def\loggingall{% + \tracingstats2 + \tracingpages1 + \tracinglostchars2 % 2 gives us more in etex + \tracingparagraphs1 + \tracingoutput1 + \tracingmacros2 + \tracingrestores1 + \showboxbreadth\maxdimen \showboxdepth\maxdimen + \ifx\eTeXversion\undefined\else % etex gives us more logging + \tracingscantokens1 + \tracingifs1 + \tracinggroups1 + \tracingnesting2 + \tracingassigns1 + \fi + \tracingcommands3 % 3 gives us more in etex + \errorcontextlines16 +}% + +% add check for \lastpenalty to plain's definitions. If the last thing +% we did was a \nobreak, we don't want to insert more space. 
+% +\def\smallbreak{\ifnum\lastpenalty<10000\par\ifdim\lastskip<\smallskipamount + \removelastskip\penalty-50\smallskip\fi\fi} +\def\medbreak{\ifnum\lastpenalty<10000\par\ifdim\lastskip<\medskipamount + \removelastskip\penalty-100\medskip\fi\fi} +\def\bigbreak{\ifnum\lastpenalty<10000\par\ifdim\lastskip<\bigskipamount + \removelastskip\penalty-200\bigskip\fi\fi} + +% For @cropmarks command. +% Do @cropmarks to get crop marks. +% +\newif\ifcropmarks +\let\cropmarks = \cropmarkstrue +% +% Dimensions to add cropmarks at corners. +% Added by P. A. MacKay, 12 Nov. 1986 +% +\newdimen\outerhsize \newdimen\outervsize % set by the paper size routines +\newdimen\cornerlong \cornerlong=1pc +\newdimen\cornerthick \cornerthick=.3pt +\newdimen\topandbottommargin \topandbottommargin=.75in + +% Main output routine. +\chardef\PAGE = 255 +\output = {\onepageout{\pagecontents\PAGE}} + +\newbox\headlinebox +\newbox\footlinebox + +% \onepageout takes a vbox as an argument. Note that \pagecontents +% does insertions, but you have to call it yourself. +\def\onepageout#1{% + \ifcropmarks \hoffset=0pt \else \hoffset=\normaloffset \fi + % + \ifodd\pageno \advance\hoffset by \bindingoffset + \else \advance\hoffset by -\bindingoffset\fi + % + % Do this outside of the \shipout so @code etc. will be expanded in + % the headline as they should be, not taken literally (outputting ''code). + \setbox\headlinebox = \vbox{\let\hsize=\pagewidth \makeheadline}% + \setbox\footlinebox = \vbox{\let\hsize=\pagewidth \makefootline}% + % + {% + % Have to do this stuff outside the \shipout because we want it to + % take effect in \write's, yet the group defined by the \vbox ends + % before the \shipout runs. + % + \escapechar = `\\ % use backslash in output files. + \indexdummies % don't expand commands in the output. + \normalturnoffactive % \ in index entries must not stay \, e.g., if + % the page break happens to be in the middle of an example. 
+ \shipout\vbox{% + % Do this early so pdf references go to the beginning of the page. + \ifpdfmakepagedest \pdfdest name{\the\pageno} xyz\fi + % + \ifcropmarks \vbox to \outervsize\bgroup + \hsize = \outerhsize + \vskip-\topandbottommargin + \vtop to0pt{% + \line{\ewtop\hfil\ewtop}% + \nointerlineskip + \line{% + \vbox{\moveleft\cornerthick\nstop}% + \hfill + \vbox{\moveright\cornerthick\nstop}% + }% + \vss}% + \vskip\topandbottommargin + \line\bgroup + \hfil % center the page within the outer (page) hsize. + \ifodd\pageno\hskip\bindingoffset\fi + \vbox\bgroup + \fi + % + \unvbox\headlinebox + \pagebody{#1}% + \ifdim\ht\footlinebox > 0pt + % Only leave this space if the footline is nonempty. + % (We lessened \vsize for it in \oddfootingxxx.) + % The \baselineskip=24pt in plain's \makefootline has no effect. + \vskip 2\baselineskip + \unvbox\footlinebox + \fi + % + \ifcropmarks + \egroup % end of \vbox\bgroup + \hfil\egroup % end of (centering) \line\bgroup + \vskip\topandbottommargin plus1fill minus1fill + \boxmaxdepth = \cornerthick + \vbox to0pt{\vss + \line{% + \vbox{\moveleft\cornerthick\nsbot}% + \hfill + \vbox{\moveright\cornerthick\nsbot}% + }% + \nointerlineskip + \line{\ewbot\hfil\ewbot}% + }% + \egroup % \vbox from first cropmarks clause + \fi + }% end of \shipout\vbox + }% end of group with \normalturnoffactive + \advancepageno + \ifnum\outputpenalty>-20000 \else\dosupereject\fi +} + +\newinsert\margin \dimen\margin=\maxdimen + +\def\pagebody#1{\vbox to\pageheight{\boxmaxdepth=\maxdepth #1}} +{\catcode`\@ =11 +\gdef\pagecontents#1{\ifvoid\topins\else\unvbox\topins\fi +% marginal hacks, juha@viisa.uucp (Juha Takala) +\ifvoid\margin\else % marginal info is present + \rlap{\kern\hsize\vbox to\z@{\kern1pt\box\margin \vss}}\fi +\dimen@=\dp#1 \unvbox#1 +\ifvoid\footins\else\vskip\skip\footins\footnoterule \unvbox\footins\fi +\ifr@ggedbottom \kern-\dimen@ \vfil \fi} +} + +% Here are the rules for the cropmarks. 
Note that they are +% offset so that the space between them is truly \outerhsize or \outervsize +% (P. A. MacKay, 12 November, 1986) +% +\def\ewtop{\vrule height\cornerthick depth0pt width\cornerlong} +\def\nstop{\vbox + {\hrule height\cornerthick depth\cornerlong width\cornerthick}} +\def\ewbot{\vrule height0pt depth\cornerthick width\cornerlong} +\def\nsbot{\vbox + {\hrule height\cornerlong depth\cornerthick width\cornerthick}} + +% Parse an argument, then pass it to #1. The argument is the rest of +% the input line (except we remove a trailing comment). #1 should be a +% macro which expects an ordinary undelimited TeX argument. +% +\def\parsearg{\parseargusing{}} +\def\parseargusing#1#2{% + \def\next{#2}% + \begingroup + \obeylines + \spaceisspace + #1% + \parseargline\empty% Insert the \empty token, see \finishparsearg below. +} + +{\obeylines % + \gdef\parseargline#1^^M{% + \endgroup % End of the group started in \parsearg. + \argremovecomment #1\comment\ArgTerm% + }% +} + +% First remove any @comment, then any @c comment. +\def\argremovecomment#1\comment#2\ArgTerm{\argremovec #1\c\ArgTerm} +\def\argremovec#1\c#2\ArgTerm{\argcheckspaces#1\^^M\ArgTerm} + +% Each occurrence of `\^^M' or `<space>\^^M' is replaced by a single space. +% +% \argremovec might leave us with trailing space, e.g., +% @end itemize @c foo +% This space token undergoes the same procedure and is eventually removed +% by \finishparsearg. +% +\def\argcheckspaces#1\^^M{\argcheckspacesX#1\^^M \^^M} +\def\argcheckspacesX#1 \^^M{\argcheckspacesY#1\^^M} +\def\argcheckspacesY#1\^^M#2\^^M#3\ArgTerm{% + \def\temp{#3}% + \ifx\temp\empty + % We cannot use \next here, as it holds the macro to run; + % thus we reuse \temp. + \let\temp\finishparsearg + \else + \let\temp\argcheckspaces + \fi + % Put the space token in: + \temp#1 #3\ArgTerm +} + +% If a _delimited_ argument is enclosed in braces, they get stripped; so +% to get _exactly_ the rest of the line, we had to prevent such a situation.
+% We prepended an \empty token at the very beginning and we expand it now, +% just before passing the control to \next. +% (Similarly, we have to think about #3 of \argcheckspacesY above: it is +% either the null string, or it ends with \^^M---thus there is no danger +% that a pair of braces would be stripped.) +% +% But first, we have to remove the trailing space token. +% +\def\finishparsearg#1 \ArgTerm{\expandafter\next\expandafter{#1}} + +% \parseargdef\foo{...} +% is roughly equivalent to +% \def\foo{\parsearg\Xfoo} +% \def\Xfoo#1{...} +% +% Actually, I use \csname\string\foo\endcsname, i.e., \\foo, as it is my +% favourite TeX trick. --kasal, 16nov03 + +\def\parseargdef#1{% + \expandafter \doparseargdef \csname\string#1\endcsname #1% +} +\def\doparseargdef#1#2{% + \def#2{\parsearg#1}% + \def#1##1% +} + +% Several utility definitions with active space: +{ + \obeyspaces + \gdef\obeyedspace{ } + + % Make each space character in the input produce a normal interword + % space in the output. Don't allow a line break at this space, as this + % is used only in environments like @example, where each line of input + % should produce a line of output anyway. + % + \gdef\sepspaces{\obeyspaces\let =\tie} + + % If an index command is used in an @example environment, any spaces + % therein should become regular spaces in the raw index file, not the + % expansion of \tie (\leavevmode \penalty \@M \ ). + \gdef\unsepspaces{\let =\space} +} + + +\def\flushcr{\ifx\par\lisppar \def\next##1{}\else \let\next=\relax \fi \next} + +% Define the framework for environments in texinfo.tex. It's used like this: +% +% \envdef\foo{...} +% \def\Efoo{...} +% +% It's the responsibility of \envdef to insert \begingroup before the +% actual body; @end closes the group after calling \Efoo. \envdef also +% defines \thisenv, so the current environment is known; @end checks +% whether the environment name matches.
The \checkenv macro can also be +% used to check whether the current environment is the one expected. +% +% Non-false conditionals (@iftex, @ifset) don't fit into this, so they +% are not treated as environments; they don't open a group. (The +% implementation of @end takes care not to call \endgroup in this +% special case.) + + +% At runtime, environments start with this: +\def\startenvironment#1{\begingroup\def\thisenv{#1}} +% initialize +\let\thisenv\empty + +% ... but they get defined via ``\envdef\foo{...}'': +\long\def\envdef#1#2{\def#1{\startenvironment#1#2}} +\def\envparseargdef#1#2{\parseargdef#1{\startenvironment#1#2}} + +% Check whether we're in the right environment: +\def\checkenv#1{% + \def\temp{#1}% + \ifx\thisenv\temp + \else + \badenverr + \fi +} + +% Environment mismatch, #1 expected: +\def\badenverr{% + \errhelp = \EMsimple + \errmessage{This command can appear only \inenvironment\temp, + not \inenvironment\thisenv}% +} +\def\inenvironment#1{% + \ifx#1\empty + out of any environment% + \else + in environment \expandafter\string#1% + \fi +} + +% @end foo executes the definition of \Efoo. +% But first, it executes a specialized version of \checkenv +% +\parseargdef\end{% + \if 1\csname iscond.#1\endcsname + \else + % The general wording of \badenverr may not be ideal, but... --kasal, 06nov03 + \expandafter\checkenv\csname#1\endcsname + \csname E#1\endcsname + \endgroup + \fi +} + +\newhelp\EMsimple{Press RETURN to continue.} + + +%% Simple single-character @ commands + +% @@ prints an @ +% Kludge this until the fonts are right (grr). +\def\@{{\tt\char64}} + +% This is turned off because it was never documented +% and you can use @w{...} around a quote to suppress ligatures. +%% Define @` and @' to be the same as ` and ' +%% but suppressing ligatures. +%\def\`{{`}} +%\def\'{{'}} + +% Used to generate quoted braces.
+\def\mylbrace {{\tt\char123}} +\def\myrbrace {{\tt\char125}} +\let\{=\mylbrace +\let\}=\myrbrace +\begingroup + % Definitions to produce \{ and \} commands for indices, + % and @{ and @} for the aux file. + \catcode`\{ = \other \catcode`\} = \other + \catcode`\[ = 1 \catcode`\] = 2 + \catcode`\! = 0 \catcode`\\ = \other + !gdef!lbracecmd[\{]% + !gdef!rbracecmd[\}]% + !gdef!lbraceatcmd[@{]% + !gdef!rbraceatcmd[@}]% +!endgroup + +% @comma{} to avoid , parsing problems. +\let\comma = , + +% Accents: @, @dotaccent @ringaccent @ubaraccent @udotaccent +% Others are defined by plain TeX: @` @' @" @^ @~ @= @u @v @H. +\let\, = \c +\let\dotaccent = \. +\def\ringaccent#1{{\accent23 #1}} +\let\tieaccent = \t +\let\ubaraccent = \b +\let\udotaccent = \d + +% Other special characters: @questiondown @exclamdown @ordf @ordm +% Plain TeX defines: @AA @AE @O @OE @L (plus lowercase versions) @ss. +\def\questiondown{?`} +\def\exclamdown{!`} +\def\ordf{\leavevmode\raise1ex\hbox{\selectfonts\lllsize \underbar{a}}} +\def\ordm{\leavevmode\raise1ex\hbox{\selectfonts\lllsize \underbar{o}}} + +% Dotless i and dotless j, used for accents. +\def\imacro{i} +\def\jmacro{j} +\def\dotless#1{% + \def\temp{#1}% + \ifx\temp\imacro \ptexi + \else\ifx\temp\jmacro \j + \else \errmessage{@dotless can be used only with i or j}% + \fi\fi +} + +% The \TeX{} logo, as in plain, but resetting the spacing so that a +% period following counts as ending a sentence. (Idea found in latex.) +% +\edef\TeX{\TeX \spacefactor=3000 } + +% @LaTeX{} logo. Not quite the same results as the definition in +% latex.ltx, since we use a different font for the raised A; it's most +% convenient for us to use an explicitly smaller font, rather than using +% the \scriptstyle font (since we don't reset \scriptstyle and +% \scriptscriptstyle). 
+% +\def\LaTeX{% + L\kern-.36em + {\setbox0=\hbox{T}% + \vbox to \ht0{\hbox{\selectfonts\lllsize A}\vss}}% + \kern-.15em + \TeX +} + +% Be sure we're in horizontal mode when doing a tie, since we make space +% equivalent to this in @example-like environments. Otherwise, a space +% at the beginning of a line will start with \penalty -- and +% since \penalty is valid in vertical mode, we'd end up putting the +% penalty on the vertical list instead of in the new paragraph. +{\catcode`@ = 11 + % Avoid using \@M directly, because that causes trouble + % if the definition is written into an index file. + \global\let\tiepenalty = \@M + \gdef\tie{\leavevmode\penalty\tiepenalty\ } +} + +% @: forces normal size whitespace following. +\def\:{\spacefactor=1000 } + +% @* forces a line break. +\def\*{\hfil\break\hbox{}\ignorespaces} + +% @/ allows a line break. +\let\/=\allowbreak + +% @. is an end-of-sentence period. +\def\.{.\spacefactor=3000 } + +% @! is an end-of-sentence bang. +\def\!{!\spacefactor=3000 } + +% @? is an end-of-sentence query. +\def\?{?\spacefactor=3000 } + +% @w prevents a word break. Without the \leavevmode, @w at the +% beginning of a paragraph, when TeX is still in vertical mode, would +% produce a whole line of output instead of starting the paragraph. +\def\w#1{\leavevmode\hbox{#1}} + +% @group ... @end group forces ... to be all on one page, by enclosing +% it in a TeX vbox. We use \vtop instead of \vbox to construct the box +% to keep its height that of a normal line. According to the rules for +% \topskip (p.114 of the TeXbook), the glue inserted is +% max (\topskip - \ht (first item), 0). If that height is large, +% therefore, no glue is inserted, and the space between the headline and +% the text is small, which looks bad. +% +% Another complication is that the group might be very large. This can +% cause the glue on the previous page to be unduly stretched, because it +% does not have much material. 
In this case, it's better to add an +% explicit \vfill so that the extra space is at the bottom. The +% threshold for doing this is if the group is more than \vfilllimit +% percent of a page (\vfilllimit can be changed inside of @tex). +% +\newbox\groupbox +\def\vfilllimit{0.7} +% +\envdef\group{% + \ifnum\catcode`\^^M=\active \else + \errhelp = \groupinvalidhelp + \errmessage{@group invalid in context where filling is enabled}% + \fi + \startsavinginserts + % + \setbox\groupbox = \vtop\bgroup + % Do @comment since we are called inside an environment such as + % @example, where each end-of-line in the input causes an + % end-of-line in the output. We don't want the end-of-line after + % the `@group' to put extra space in the output. Since @group + % should appear on a line by itself (according to the Texinfo + % manual), we don't worry about eating any user text. + \comment +} +% +% The \vtop produces a box with normal height and large depth; thus, TeX puts +% \baselineskip glue before it, and (when the next line of text is done) +% \lineskip glue after it. Thus, space below is not quite equal to space +% above. But it's pretty close. +\def\Egroup{% + % To get correct interline space between the last line of the group + % and the first line afterwards, we have to propagate \prevdepth. + \endgraf % Not \par, as it may have been set to \lisppar. + \global\dimen1 = \prevdepth + \egroup % End the \vtop. + % \dimen0 is the vertical size of the group's box. + \dimen0 = \ht\groupbox \advance\dimen0 by \dp\groupbox + % \dimen2 is how much space is left on the page (more or less). + \dimen2 = \pageheight \advance\dimen2 by -\pagetotal + % if the group doesn't fit on the current page, and it's a big big + % group, force a page break. 
+ \ifdim \dimen0 > \dimen2 + \ifdim \pagetotal < \vfilllimit\pageheight + \page + \fi + \fi + \box\groupbox + \prevdepth = \dimen1 + \checkinserts +} +% +% TeX puts in an \escapechar (i.e., `@') at the beginning of the help +% message, so this ends up printing `@group can only ...'. +% +\newhelp\groupinvalidhelp{% +group can only be used in environments such as @example,^^J% +where each line of input produces a line of output.} + +% @need space-in-mils +% forces a page break if there is not space-in-mils remaining. + +\newdimen\mil \mil=0.001in + +% Old definition--didn't work. +%\parseargdef\need{\par % +%% This method tries to make TeX break the page naturally +%% if the depth of the box does not fit. +%{\baselineskip=0pt% +%\vtop to #1\mil{\vfil}\kern -#1\mil\nobreak +%\prevdepth=-1000pt +%}} + +\parseargdef\need{% + % Ensure vertical mode, so we don't make a big box in the middle of a + % paragraph. + \par + % + % If the @need value is less than one line space, it's useless. + \dimen0 = #1\mil + \dimen2 = \ht\strutbox + \advance\dimen2 by \dp\strutbox + \ifdim\dimen0 > \dimen2 + % + % Do a \strut just to make the height of this box be normal, so the + % normal leading is inserted relative to the preceding line. + % And a page break here is fine. + \vtop to #1\mil{\strut\vfil}% + % + % TeX does not even consider page breaks if a penalty added to the + % main vertical list is 10000 or more. But in order to see if the + % empty box we just added fits on the page, we must make it consider + % page breaks. On the other hand, we don't want to actually break the + % page after the empty box. So we use a penalty of 9999. + % + % There is an extremely small chance that TeX will actually break the + % page at this \penalty, if there are no other feasible breakpoints in + % sight. (If the user is using lots of big @group commands, which + % almost-but-not-quite fill up a page, TeX will have a hard time doing + % good page breaking, for example.) 
However, I could not construct an + % example where a page broke at this \penalty; if it happens in a real + % document, then we can reconsider our strategy. + \penalty9999 + % + % Back up by the size of the box, whether we did a page break or not. + \kern -#1\mil + % + % Do not allow a page break right after this kern. + \nobreak + \fi +} + +% @br forces paragraph break (and is undocumented). + +\let\br = \par + +% @page forces the start of a new page. +% +\def\page{\par\vfill\supereject} + +% @exdent text.... +% outputs text on separate line in roman font, starting at standard page margin + +% This records the amount of indent in the innermost environment. +% That's how much \exdent should take out. +\newskip\exdentamount + +% This defn is used inside fill environments such as @defun. +\parseargdef\exdent{\hfil\break\hbox{\kern -\exdentamount{\rm#1}}\hfil\break} + +% This defn is used inside nofill environments such as @example. +\parseargdef\nofillexdent{{\advance \leftskip by -\exdentamount + \leftline{\hskip\leftskip{\rm#1}}}} + +% @inmargin{WHICH}{TEXT} puts TEXT in the WHICH margin next to the current +% paragraph. For more general purposes, use the \margin insertion +% class. WHICH is `l' or `r'. +% +\newskip\inmarginspacing \inmarginspacing=1cm +\def\strutdepth{\dp\strutbox} +% +\def\doinmargin#1#2{\strut\vadjust{% + \nobreak + \kern-\strutdepth + \vtop to \strutdepth{% + \baselineskip=\strutdepth + \vss + % if you have multiple lines of stuff to put here, you'll need to + % make the vbox yourself of the appropriate size. + \ifx#1l% + \llap{\ignorespaces #2\hskip\inmarginspacing}% + \else + \rlap{\hskip\hsize \hskip\inmarginspacing \ignorespaces #2}% + \fi + \null + }% +}} +\def\inleftmargin{\doinmargin l} +\def\inrightmargin{\doinmargin r} +% +% @inmargin{TEXT [, RIGHT-TEXT]} +% (if RIGHT-TEXT is given, use TEXT for left page, RIGHT-TEXT for right; +% else use TEXT for both). 
+% +\def\inmargin#1{\parseinmargin #1,,\finish} +\def\parseinmargin#1,#2,#3\finish{% not perfect, but better than nothing. + \setbox0 = \hbox{\ignorespaces #2}% + \ifdim\wd0 > 0pt + \def\lefttext{#1}% have both texts + \def\righttext{#2}% + \else + \def\lefttext{#1}% have only one text + \def\righttext{#1}% + \fi + % + \ifodd\pageno + \def\temp{\inrightmargin\righttext}% odd page -> outside is right margin + \else + \def\temp{\inleftmargin\lefttext}% + \fi + \temp +} + +% @include file insert text of that file as input. +% +\def\include{\parseargusing\filenamecatcodes\includezzz} +\def\includezzz#1{% + \pushthisfilestack + \def\thisfile{#1}% + {% + \makevalueexpandable + \def\temp{\input #1 }% + \expandafter + }\temp + \popthisfilestack +} +\def\filenamecatcodes{% + \catcode`\\=\other + \catcode`~=\other + \catcode`^=\other + \catcode`_=\other + \catcode`|=\other + \catcode`<=\other + \catcode`>=\other + \catcode`+=\other + \catcode`-=\other +} + +\def\pushthisfilestack{% + \expandafter\pushthisfilestackX\popthisfilestack\StackTerm +} +\def\pushthisfilestackX{% + \expandafter\pushthisfilestackY\thisfile\StackTerm +} +\def\pushthisfilestackY #1\StackTerm #2\StackTerm {% + \gdef\popthisfilestack{\gdef\thisfile{#1}\gdef\popthisfilestack{#2}}% +} + +\def\popthisfilestack{\errthisfilestackempty} +\def\errthisfilestackempty{\errmessage{Internal error: + the stack of filenames is empty.}} + +\def\thisfile{} + +% @center line +% outputs that line, centered. +% +\parseargdef\center{% + \ifhmode + \let\next\centerH + \else + \let\next\centerV + \fi + \next{\hfil \ignorespaces#1\unskip \hfil}% +} +\def\centerH#1{% + {% + \hfil\break + \advance\hsize by -\leftskip + \advance\hsize by -\rightskip + \line{#1}% + \break + }% +} +\def\centerV#1{\line{\kern\leftskip #1\kern\rightskip}} + +% @sp n outputs n lines of vertical space + +\parseargdef\sp{\vskip #1\baselineskip} + +% @comment ...line which is ignored... +% @c is the same as @comment +% @ignore ... 
@end ignore is another way to write a comment + +\def\comment{\begingroup \catcode`\^^M=\other% +\catcode`\@=\other \catcode`\{=\other \catcode`\}=\other% +\commentxxx} +{\catcode`\^^M=\other \gdef\commentxxx#1^^M{\endgroup}} + +\let\c=\comment + +% @paragraphindent NCHARS +% We'll use ems for NCHARS, close enough. +% NCHARS can also be the word `asis' or `none'. +% We cannot feasibly implement @paragraphindent asis, though. +% +\def\asisword{asis} % no translation, these are keywords +\def\noneword{none} +% +\parseargdef\paragraphindent{% + \def\temp{#1}% + \ifx\temp\asisword + \else + \ifx\temp\noneword + \defaultparindent = 0pt + \else + \defaultparindent = #1em + \fi + \fi + \parindent = \defaultparindent +} + +% @exampleindent NCHARS +% We'll use ems for NCHARS like @paragraphindent. +% It seems @exampleindent asis isn't necessary, but +% I preserve it to make it similar to @paragraphindent. +\parseargdef\exampleindent{% + \def\temp{#1}% + \ifx\temp\asisword + \else + \ifx\temp\noneword + \lispnarrowing = 0pt + \else + \lispnarrowing = #1em + \fi + \fi +} + +% @firstparagraphindent WORD +% If WORD is `none', then suppress indentation of the first paragraph +% after a section heading. If WORD is `insert', then do indent at such +% paragraphs. +% +% The paragraph indentation is suppressed or not by calling +% \suppressfirstparagraphindent, which the sectioning commands do. +% We switch the definition of this back and forth according to WORD. +% By default, we suppress indentation. +% +\def\suppressfirstparagraphindent{\dosuppressfirstparagraphindent} +\def\insertword{insert} +% +\parseargdef\firstparagraphindent{% + \def\temp{#1}% + \ifx\temp\noneword + \let\suppressfirstparagraphindent = \dosuppressfirstparagraphindent + \else\ifx\temp\insertword + \let\suppressfirstparagraphindent = \relax + \else + \errhelp = \EMsimple + \errmessage{Unknown @firstparagraphindent option `\temp'}% + \fi\fi +} + +% Here is how we actually suppress indentation. 
Redefine \everypar to +% \kern backwards by \parindent, and then reset itself to empty. +% +% We also make \indent itself not actually do anything until the next +% paragraph. +% +\gdef\dosuppressfirstparagraphindent{% + \gdef\indent{% + \restorefirstparagraphindent + \indent + }% + \gdef\noindent{% + \restorefirstparagraphindent + \noindent + }% + \global\everypar = {% + \kern -\parindent + \restorefirstparagraphindent + }% +} + +\gdef\restorefirstparagraphindent{% + \global \let \indent = \ptexindent + \global \let \noindent = \ptexnoindent + \global \everypar = {}% +} + + +% @asis just yields its argument. Used with @table, for example. +% +\def\asis#1{#1} + +% @math outputs its argument in math mode. +% +% One complication: _ usually means subscripts, but it could also mean +% an actual _ character, as in @math{@var{some_variable} + 1}. So make +% _ active, and distinguish by seeing if the current family is \slfam, +% which is what @var uses. +{ + \catcode\underChar = \active + \gdef\mathunderscore{% + \catcode\underChar=\active + \def_{\ifnum\fam=\slfam \_\else\sb\fi}% + } +} +% Another complication: we want \\ (and @\) to output a \ character. +% FYI, plain.tex uses \\ as a temporary control sequence (why?), but +% this is not advertised and we don't care. Texinfo does not +% otherwise define @\. +% +% The \mathchar is class=0=ordinary, family=7=ttfam, position=5C=\. +\def\mathbackslash{\ifnum\fam=\ttfam \mathchar"075C \else\backslash \fi} +% +\def\math{% + \tex + \mathunderscore + \let\\ = \mathbackslash + \mathactive + $\finishmath +} +\def\finishmath#1{#1$\endgroup} % Close the group opened by \tex. + +% Some active characters (such as <) are spaced differently in math. +% We have to reset their definitions in case the @math was an argument +% to a command which sets the catcodes (such as @item or @section). 
+% +{ + \catcode`^ = \active + \catcode`< = \active + \catcode`> = \active + \catcode`+ = \active + \gdef\mathactive{% + \let^ = \ptexhat + \let< = \ptexless + \let> = \ptexgtr + \let+ = \ptexplus + } +} + +% @bullet and @minus need the same treatment as @math, just above. +\def\bullet{$\ptexbullet$} +\def\minus{$-$} + +% @dots{} outputs an ellipsis using the current font. +% We do .5em per period so that it has the same spacing in a typewriter +% font as three actual period characters. +% +\def\dots{% + \leavevmode + \hbox to 1.5em{% + \hskip 0pt plus 0.25fil + .\hfil.\hfil.% + \hskip 0pt plus 0.5fil + }% +} + +% @enddots{} is an end-of-sentence ellipsis. +% +\def\enddots{% + \dots + \spacefactor=3000 +} + +% @comma{} is so commas can be inserted into text without messing up +% Texinfo's parsing. +% +\let\comma = , + +% @refill is a no-op. +\let\refill=\relax + +% If working on a large document in chapters, it is convenient to +% be able to disable indexing, cross-referencing, and contents, for test runs. +% This is done with @novalidate (before @setfilename). +% +\newif\iflinks \linkstrue % by default we want the aux files. +\let\novalidate = \linksfalse + +% @setfilename is done at the beginning of every texinfo file. +% So open here the files we need to have open while reading the input. +% This makes it possible to make a .fmt file for texinfo. +\def\setfilename{% + \fixbackslash % Turn off hack to swallow `\input texinfo'. + \iflinks + \tryauxfile + % Open the new aux file. TeX will close it automatically at exit. + \immediate\openout\auxfile=\jobname.aux + \fi % \openindices needs to do some work in any case. + \openindices + \let\setfilename=\comment % Ignore extra @setfilename cmds. + % + % If texinfo.cnf is present on the system, read it. + % Useful for site-wide @afourpaper, etc. + \openin 1 texinfo.cnf + \ifeof 1 \else \input texinfo.cnf \fi + \closein 1 + % + \comment % Ignore the actual filename. +} + +% Called from \setfilename. 
+% +\def\openindices{% + \newindex{cp}% + \newcodeindex{fn}% + \newcodeindex{vr}% + \newcodeindex{tp}% + \newcodeindex{ky}% + \newcodeindex{pg}% +} + +% @bye. +\outer\def\bye{\pagealignmacro\tracingstats=1\ptexend} + + +\message{pdf,} +% adobe `portable' document format +\newcount\tempnum +\newcount\lnkcount +\newtoks\filename +\newcount\filenamelength +\newcount\pgn +\newtoks\toksA +\newtoks\toksB +\newtoks\toksC +\newtoks\toksD +\newbox\boxA +\newcount\countA +\newif\ifpdf +\newif\ifpdfmakepagedest + +% when pdftex is run in dvi mode, \pdfoutput is defined (so \pdfoutput=1 +% can be set). So we test for \relax and 0 as well as \undefined, +% borrowed from ifpdf.sty. +\ifx\pdfoutput\undefined +\else + \ifx\pdfoutput\relax + \else + \ifcase\pdfoutput + \else + \pdftrue + \fi + \fi +\fi +% +\ifpdf + \input pdfcolor + \pdfcatalog{/PageMode /UseOutlines}% + \def\dopdfimage#1#2#3{% + \def\imagewidth{#2}% + \def\imageheight{#3}% + % without \immediate, pdftex seg faults when the same image is + % included twice. (Version 3.14159-pre-1.0-unofficial-20010704.) + \ifnum\pdftexversion < 14 + \immediate\pdfimage + \else + \immediate\pdfximage + \fi + \ifx\empty\imagewidth\else width \imagewidth \fi + \ifx\empty\imageheight\else height \imageheight \fi + \ifnum\pdftexversion<13 + #1.pdf% + \else + {#1.pdf}% + \fi + \ifnum\pdftexversion < 14 \else + \pdfrefximage \pdflastximage + \fi} + \def\pdfmkdest#1{{% + % We have to set dummies so commands such as @code in a section title + % aren't expanded. + \atdummies + \normalturnoffactive + \pdfdest name{#1} xyz% + }} + \def\pdfmkpgn#1{#1} + \let\linkcolor = \Blue % was Cyan, but that seems light? 
+ \def\endlink{\Black\pdfendlink} + % Adding outlines to PDF; macros for calculating structure of outlines + % come from Petr Olsak + \def\expnumber#1{\expandafter\ifx\csname#1\endcsname\relax 0% + \else \csname#1\endcsname \fi} + \def\advancenumber#1{\tempnum=\expnumber{#1}\relax + \advance\tempnum by 1 + \expandafter\xdef\csname#1\endcsname{\the\tempnum}} + % + % #1 is the section text. #2 is the pdf expression for the number + % of subentries (or empty, for subsubsections). #3 is the node + % text, which might be empty if this toc entry had no + % corresponding node. #4 is the page number. + % + \def\dopdfoutline#1#2#3#4{% + % Generate a link to the node text if that exists; else, use the + % page number. We could generate a destination for the section + % text in the case where a section has no node, but it doesn't + % seem worthwhile, since most documents are normally structured. + \def\pdfoutlinedest{#3}% + \ifx\pdfoutlinedest\empty \def\pdfoutlinedest{#4}\fi + % + \pdfoutline goto name{\pdfmkpgn{\pdfoutlinedest}}#2{#1}% + } + % + \def\pdfmakeoutlines{% + \begingroup + % Thanh's hack / proper braces in bookmarks + \edef\mylbrace{\iftrue \string{\else}\fi}\let\{=\mylbrace + \edef\myrbrace{\iffalse{\else\string}\fi}\let\}=\myrbrace + % + % Read toc silently, to get counts of subentries for \pdfoutline. + \def\numchapentry##1##2##3##4{% + \def\thischapnum{##2}% + \let\thissecnum\empty + \let\thissubsecnum\empty + }% + \def\numsecentry##1##2##3##4{% + \advancenumber{chap\thischapnum}% + \def\thissecnum{##2}% + \let\thissubsecnum\empty + }% + \def\numsubsecentry##1##2##3##4{% + \advancenumber{sec\thissecnum}% + \def\thissubsecnum{##2}% + }% + \def\numsubsubsecentry##1##2##3##4{% + \advancenumber{subsec\thissubsecnum}% + }% + \let\thischapnum\empty + \let\thissecnum\empty + \let\thissubsecnum\empty + % + % use \def rather than \let here because we redefine \chapentry et + % al. a second time, below. 
+ \def\appentry{\numchapentry}% + \def\appsecentry{\numsecentry}% + \def\appsubsecentry{\numsubsecentry}% + \def\appsubsubsecentry{\numsubsubsecentry}% + \def\unnchapentry{\numchapentry}% + \def\unnsecentry{\numsecentry}% + \def\unnsubsecentry{\numsubsecentry}% + \def\unnsubsubsecentry{\numsubsubsecentry}% + \input \jobname.toc + % + % Read toc second time, this time actually producing the outlines. + % The `-' means take the \expnumber as the absolute number of + % subentries, which we calculated on our first read of the .toc above. + % + % We use the node names as the destinations. + \def\numchapentry##1##2##3##4{% + \dopdfoutline{##1}{count-\expnumber{chap##2}}{##3}{##4}}% + \def\numsecentry##1##2##3##4{% + \dopdfoutline{##1}{count-\expnumber{sec##2}}{##3}{##4}}% + \def\numsubsecentry##1##2##3##4{% + \dopdfoutline{##1}{count-\expnumber{subsec##2}}{##3}{##4}}% + \def\numsubsubsecentry##1##2##3##4{% count is always zero + \dopdfoutline{##1}{}{##3}{##4}}% + % + % PDF outlines are displayed using system fonts, instead of + % document fonts. Therefore we cannot use special characters, + % since the encoding is unknown. For example, the eogonek from + % Latin 2 (0xea) gets translated to a | character. Info from + % Staszek Wawrykiewicz, 19 Jan 2004 04:09:24 +0100. + % + % xx to do this right, we have to translate 8-bit characters to + % their "best" equivalent, based on the @documentencoding. Right + % now, I guess we'll just let the pdf reader have its way. 
+ \indexnofonts + \turnoffactive + \input \jobname.toc + \endgroup + } + % + \def\makelinks #1,{% + \def\params{#1}\def\E{END}% + \ifx\params\E + \let\nextmakelinks=\relax + \else + \let\nextmakelinks=\makelinks + \ifnum\lnkcount>0,\fi + \picknum{#1}% + \startlink attr{/Border [0 0 0]} + goto name{\pdfmkpgn{\the\pgn}}% + \linkcolor #1% + \advance\lnkcount by 1% + \endlink + \fi + \nextmakelinks + } + \def\picknum#1{\expandafter\pn#1} + \def\pn#1{% + \def\p{#1}% + \ifx\p\lbrace + \let\nextpn=\ppn + \else + \let\nextpn=\ppnn + \def\first{#1} + \fi + \nextpn + } + \def\ppn#1{\pgn=#1\gobble} + \def\ppnn{\pgn=\first} + \def\pdfmklnk#1{\lnkcount=0\makelinks #1,END,} + \def\skipspaces#1{\def\PP{#1}\def\D{|}% + \ifx\PP\D\let\nextsp\relax + \else\let\nextsp\skipspaces + \ifx\p\space\else\addtokens{\filename}{\PP}% + \advance\filenamelength by 1 + \fi + \fi + \nextsp} + \def\getfilename#1{\filenamelength=0\expandafter\skipspaces#1|\relax} + \ifnum\pdftexversion < 14 + \let \startlink \pdfannotlink + \else + \let \startlink \pdfstartlink + \fi + \def\pdfurl#1{% + \begingroup + \normalturnoffactive\def\@{@}% + \makevalueexpandable + \leavevmode\Red + \startlink attr{/Border [0 0 0]}% + user{/Subtype /Link /A << /S /URI /URI (#1) >>}% + \endgroup} + \def\pdfgettoks#1.{\setbox\boxA=\hbox{\toksA={#1.}\toksB={}\maketoks}} + \def\addtokens#1#2{\edef\addtoks{\noexpand#1={\the#1#2}}\addtoks} + \def\adn#1{\addtokens{\toksC}{#1}\global\countA=1\let\next=\maketoks} + \def\poptoks#1#2|ENDTOKS|{\let\first=#1\toksD={#1}\toksA={#2}} + \def\maketoks{% + \expandafter\poptoks\the\toksA|ENDTOKS|\relax + \ifx\first0\adn0 + \else\ifx\first1\adn1 \else\ifx\first2\adn2 \else\ifx\first3\adn3 + \else\ifx\first4\adn4 \else\ifx\first5\adn5 \else\ifx\first6\adn6 + \else\ifx\first7\adn7 \else\ifx\first8\adn8 \else\ifx\first9\adn9 + \else + \ifnum0=\countA\else\makelink\fi + \ifx\first.\let\next=\done\else + \let\next=\maketoks + \addtokens{\toksB}{\the\toksD} + \ifx\first,\addtokens{\toksB}{\space}\fi + 
\fi + \fi\fi\fi\fi\fi\fi\fi\fi\fi\fi + \next} + \def\makelink{\addtokens{\toksB}% + {\noexpand\pdflink{\the\toksC}}\toksC={}\global\countA=0} + \def\pdflink#1{% + \startlink attr{/Border [0 0 0]} goto name{\pdfmkpgn{#1}} + \linkcolor #1\endlink} + \def\done{\edef\st{\global\noexpand\toksA={\the\toksB}}\st} +\else + \let\pdfmkdest = \gobble + \let\pdfurl = \gobble + \let\endlink = \relax + \let\linkcolor = \relax + \let\pdfmakeoutlines = \relax +\fi % \ifx\pdfoutput + + +\message{fonts,} + +% Change the current font style to #1, remembering it in \curfontstyle. +% For now, we do not accumulate font styles: @b{@i{foo}} prints foo in +% italics, not bold italics. +% +\def\setfontstyle#1{% + \def\curfontstyle{#1}% not as a control sequence, because we are \edef'd. + \csname ten#1\endcsname % change the current font +} + +% Select #1 fonts with the current style. +% +\def\selectfonts#1{\csname #1fonts\endcsname \csname\curfontstyle\endcsname} + +\def\rm{\fam=0 \setfontstyle{rm}} +\def\it{\fam=\itfam \setfontstyle{it}} +\def\sl{\fam=\slfam \setfontstyle{sl}} +\def\bf{\fam=\bffam \setfontstyle{bf}} +\def\tt{\fam=\ttfam \setfontstyle{tt}} + +% Texinfo sort of supports the sans serif font style, which plain TeX does not. +% So we set up a \sf. +\newfam\sffam +\def\sf{\fam=\sffam \setfontstyle{sf}} +\let\li = \sf % Sometimes we call it \li, not \sf. + +% We don't need math for this font style. +\def\ttsl{\setfontstyle{ttsl}} + +% Default leading. +\newdimen\textleading \textleading = 13.2pt + +% Set the baselineskip to #1, and the lineskip and strut size +% correspondingly. There is no deep meaning behind these magic numbers +% used as factors; they just match (closely enough) what Knuth defined. 
+% +\def\lineskipfactor{.08333} +\def\strutheightpercent{.70833} +\def\strutdepthpercent {.29167} +% +\def\setleading#1{% + \normalbaselineskip = #1\relax + \normallineskip = \lineskipfactor\normalbaselineskip + \normalbaselines + \setbox\strutbox =\hbox{% + \vrule width0pt height\strutheightpercent\baselineskip + depth \strutdepthpercent \baselineskip + }% +} + +% Set the font macro #1 to the font named #2, adding on the +% specified font prefix (normally `cm'). +% #3 is the font's design size, #4 is a scale factor +\def\setfont#1#2#3#4{\font#1=\fontprefix#2#3 scaled #4} + +% Use cm as the default font prefix. +% To specify the font prefix, you must define \fontprefix +% before you read in texinfo.tex. +\ifx\fontprefix\undefined +\def\fontprefix{cm} +\fi +% Support font families that don't use the same naming scheme as CM. +\def\rmshape{r} +\def\rmbshape{bx} %where the normal face is bold +\def\bfshape{b} +\def\bxshape{bx} +\def\ttshape{tt} +\def\ttbshape{tt} +\def\ttslshape{sltt} +\def\itshape{ti} +\def\itbshape{bxti} +\def\slshape{sl} +\def\slbshape{bxsl} +\def\sfshape{ss} +\def\sfbshape{ss} +\def\scshape{csc} +\def\scbshape{csc} + +% Text fonts (11.2pt, magstep1). +\newcount\mainmagstep +\ifx\bigger\relax + % not really supported. + \mainmagstep=\magstep1 + \setfont\textrm\rmshape{12}{1000} + \setfont\texttt\ttshape{12}{1000} +\else + \mainmagstep=\magstephalf + \setfont\textrm\rmshape{10}{\mainmagstep} + \setfont\texttt\ttshape{10}{\mainmagstep} +\fi +\setfont\textbf\bfshape{10}{\mainmagstep} +\setfont\textit\itshape{10}{\mainmagstep} +\setfont\textsl\slshape{10}{\mainmagstep} +\setfont\textsf\sfshape{10}{\mainmagstep} +\setfont\textsc\scshape{10}{\mainmagstep} +\setfont\textttsl\ttslshape{10}{\mainmagstep} +\font\texti=cmmi10 scaled \mainmagstep +\font\textsy=cmsy10 scaled \mainmagstep + +% A few fonts for @defun names and args. 
+\setfont\defbf\bfshape{10}{\magstep1} +\setfont\deftt\ttshape{10}{\magstep1} +\setfont\defttsl\ttslshape{10}{\magstep1} +\def\df{\let\tentt=\deftt \let\tenbf = \defbf \let\tenttsl=\defttsl \bf} + +% Fonts for indices, footnotes, small examples (9pt). +\setfont\smallrm\rmshape{9}{1000} +\setfont\smalltt\ttshape{9}{1000} +\setfont\smallbf\bfshape{10}{900} +\setfont\smallit\itshape{9}{1000} +\setfont\smallsl\slshape{9}{1000} +\setfont\smallsf\sfshape{9}{1000} +\setfont\smallsc\scshape{10}{900} +\setfont\smallttsl\ttslshape{10}{900} +\font\smalli=cmmi9 +\font\smallsy=cmsy9 + +% Fonts for small examples (8pt). +\setfont\smallerrm\rmshape{8}{1000} +\setfont\smallertt\ttshape{8}{1000} +\setfont\smallerbf\bfshape{10}{800} +\setfont\smallerit\itshape{8}{1000} +\setfont\smallersl\slshape{8}{1000} +\setfont\smallersf\sfshape{8}{1000} +\setfont\smallersc\scshape{10}{800} +\setfont\smallerttsl\ttslshape{10}{800} +\font\smalleri=cmmi8 +\font\smallersy=cmsy8 + +% Fonts for title page (20.4pt): +\setfont\titlerm\rmbshape{12}{\magstep3} +\setfont\titleit\itbshape{10}{\magstep4} +\setfont\titlesl\slbshape{10}{\magstep4} +\setfont\titlett\ttbshape{12}{\magstep3} +\setfont\titlettsl\ttslshape{10}{\magstep4} +\setfont\titlesf\sfbshape{17}{\magstep1} +\let\titlebf=\titlerm +\setfont\titlesc\scbshape{10}{\magstep4} +\font\titlei=cmmi12 scaled \magstep3 +\font\titlesy=cmsy10 scaled \magstep4 +\def\authorrm{\secrm} +\def\authortt{\sectt} + +% Chapter (and unnumbered) fonts (17.28pt). +\setfont\chaprm\rmbshape{12}{\magstep2} +\setfont\chapit\itbshape{10}{\magstep3} +\setfont\chapsl\slbshape{10}{\magstep3} +\setfont\chaptt\ttbshape{12}{\magstep2} +\setfont\chapttsl\ttslshape{10}{\magstep3} +\setfont\chapsf\sfbshape{17}{1000} +\let\chapbf=\chaprm +\setfont\chapsc\scbshape{10}{\magstep3} +\font\chapi=cmmi12 scaled \magstep2 +\font\chapsy=cmsy10 scaled \magstep3 + +% Section fonts (14.4pt). 
+\setfont\secrm\rmbshape{12}{\magstep1} +\setfont\secit\itbshape{10}{\magstep2} +\setfont\secsl\slbshape{10}{\magstep2} +\setfont\sectt\ttbshape{12}{\magstep1} +\setfont\secttsl\ttslshape{10}{\magstep2} +\setfont\secsf\sfbshape{12}{\magstep1} +\let\secbf\secrm +\setfont\secsc\scbshape{10}{\magstep2} +\font\seci=cmmi12 scaled \magstep1 +\font\secsy=cmsy10 scaled \magstep2 + +% Subsection fonts (13.15pt). +\setfont\ssecrm\rmbshape{12}{\magstephalf} +\setfont\ssecit\itbshape{10}{1315} +\setfont\ssecsl\slbshape{10}{1315} +\setfont\ssectt\ttbshape{12}{\magstephalf} +\setfont\ssecttsl\ttslshape{10}{1315} +\setfont\ssecsf\sfbshape{12}{\magstephalf} +\let\ssecbf\ssecrm +\setfont\ssecsc\scbshape{10}{1315} +\font\sseci=cmmi12 scaled \magstephalf +\font\ssecsy=cmsy10 scaled 1315 + +% Reduced fonts for @acro in text (10pt). +\setfont\reducedrm\rmshape{10}{1000} +\setfont\reducedtt\ttshape{10}{1000} +\setfont\reducedbf\bfshape{10}{1000} +\setfont\reducedit\itshape{10}{1000} +\setfont\reducedsl\slshape{10}{1000} +\setfont\reducedsf\sfshape{10}{1000} +\setfont\reducedsc\scshape{10}{1000} +\setfont\reducedttsl\ttslshape{10}{1000} +\font\reducedi=cmmi10 +\font\reducedsy=cmsy10 + +% In order for the font changes to affect most math symbols and letters, +% we have to define the \textfont of the standard families. Since +% texinfo doesn't allow for producing subscripts and superscripts except +% in the main text, we don't bother to reset \scriptfont and +% \scriptscriptfont (which would also require loading a lot more fonts). +% +\def\resetmathfonts{% + \textfont0=\tenrm \textfont1=\teni \textfont2=\tensy + \textfont\itfam=\tenit \textfont\slfam=\tensl \textfont\bffam=\tenbf + \textfont\ttfam=\tentt \textfont\sffam=\tensf +} + +% The font-changing commands redefine the meanings of \tenSTYLE, instead +% of just \STYLE. We do this because \STYLE needs to also set the +% current \fam for math mode. Our \STYLE (e.g., \rm) commands hardwire +% \tenSTYLE to set the current font. 
+% +% Each font-changing command also sets the names \lsize (one size lower) +% and \lllsize (three sizes lower). These relative commands are used in +% the LaTeX logo and acronyms. +% +% This all needs generalizing, badly. +% +\def\textfonts{% + \let\tenrm=\textrm \let\tenit=\textit \let\tensl=\textsl + \let\tenbf=\textbf \let\tentt=\texttt \let\smallcaps=\textsc + \let\tensf=\textsf \let\teni=\texti \let\tensy=\textsy + \let\tenttsl=\textttsl + \def\lsize{reduced}\def\lllsize{smaller}% + \resetmathfonts \setleading{\textleading}} +\def\titlefonts{% + \let\tenrm=\titlerm \let\tenit=\titleit \let\tensl=\titlesl + \let\tenbf=\titlebf \let\tentt=\titlett \let\smallcaps=\titlesc + \let\tensf=\titlesf \let\teni=\titlei \let\tensy=\titlesy + \let\tenttsl=\titlettsl + \def\lsize{chap}\def\lllsize{subsec}% + \resetmathfonts \setleading{25pt}} +\def\titlefont#1{{\titlefonts\rm #1}} +\def\chapfonts{% + \let\tenrm=\chaprm \let\tenit=\chapit \let\tensl=\chapsl + \let\tenbf=\chapbf \let\tentt=\chaptt \let\smallcaps=\chapsc + \let\tensf=\chapsf \let\teni=\chapi \let\tensy=\chapsy \let\tenttsl=\chapttsl + \def\lsize{sec}\def\lllsize{text}% + \resetmathfonts \setleading{19pt}} +\def\secfonts{% + \let\tenrm=\secrm \let\tenit=\secit \let\tensl=\secsl + \let\tenbf=\secbf \let\tentt=\sectt \let\smallcaps=\secsc + \let\tensf=\secsf \let\teni=\seci \let\tensy=\secsy + \let\tenttsl=\secttsl + \def\lsize{subsec}\def\lllsize{reduced}% + \resetmathfonts \setleading{16pt}} +\def\subsecfonts{% + \let\tenrm=\ssecrm \let\tenit=\ssecit \let\tensl=\ssecsl + \let\tenbf=\ssecbf \let\tentt=\ssectt \let\smallcaps=\ssecsc + \let\tensf=\ssecsf \let\teni=\sseci \let\tensy=\ssecsy + \let\tenttsl=\ssecttsl + \def\lsize{text}\def\lllsize{small}% + \resetmathfonts \setleading{15pt}} +\let\subsubsecfonts = \subsecfonts +\def\reducedfonts{% + \let\tenrm=\reducedrm \let\tenit=\reducedit \let\tensl=\reducedsl + \let\tenbf=\reducedbf \let\tentt=\reducedtt \let\reducedcaps=\reducedsc + \let\tensf=\reducedsf 
\let\teni=\reducedi \let\tensy=\reducedsy + \let\tenttsl=\reducedttsl + \def\lsize{small}\def\lllsize{smaller}% + \resetmathfonts \setleading{10.5pt}} +\def\smallfonts{% + \let\tenrm=\smallrm \let\tenit=\smallit \let\tensl=\smallsl + \let\tenbf=\smallbf \let\tentt=\smalltt \let\smallcaps=\smallsc + \let\tensf=\smallsf \let\teni=\smalli \let\tensy=\smallsy + \let\tenttsl=\smallttsl + \def\lsize{smaller}\def\lllsize{smaller}% + \resetmathfonts \setleading{10.5pt}} +\def\smallerfonts{% + \let\tenrm=\smallerrm \let\tenit=\smallerit \let\tensl=\smallersl + \let\tenbf=\smallerbf \let\tentt=\smallertt \let\smallcaps=\smallersc + \let\tensf=\smallersf \let\teni=\smalleri \let\tensy=\smallersy + \let\tenttsl=\smallerttsl + \def\lsize{smaller}\def\lllsize{smaller}% + \resetmathfonts \setleading{9.5pt}} + +% Set the fonts to use with the @small... environments. +\let\smallexamplefonts = \smallfonts + +% About \smallexamplefonts. If we use \smallfonts (9pt), @smallexample +% can fit this many characters: +% 8.5x11=86 smallbook=72 a4=90 a5=69 +% If we use \scriptfonts (8pt), then we can fit this many characters: +% 8.5x11=90+ smallbook=80 a4=90+ a5=77 +% For me, subjectively, the few extra characters that fit aren't worth +% the additional smallness of 8pt. So I'm making the default 9pt. +% +% By the way, for comparison, here's what fits with @example (10pt): +% 8.5x11=71 smallbook=60 a4=75 a5=58 +% +% I wish the USA used A4 paper. +% --karl, 24jan03. + + +% Set up the default fonts, so we can use them for creating boxes. +% +\textfonts \rm + +% Define these so they can be easily changed for other fonts. +\def\angleleft{$\langle$} +\def\angleright{$\rangle$} + +% Count depth in font-changes, for error checks +\newcount\fontdepth \fontdepth=0 + +% Fonts for short table of contents. 
+\setfont\shortcontrm\rmshape{12}{1000} +\setfont\shortcontbf\bfshape{10}{\magstep1} % no cmb12 +\setfont\shortcontsl\slshape{12}{1000} +\setfont\shortconttt\ttshape{12}{1000} + +%% Add scribe-like font environments, plus @l for inline lisp (usually sans +%% serif) and @ii for TeX italic + +% \smartitalic{ARG} outputs arg in italics, followed by an italic correction +% unless the following character is such as not to need one. +\def\smartitalicx{\ifx\next,\else\ifx\next-\else\ifx\next.\else + \ptexslash\fi\fi\fi} +\def\smartslanted#1{{\ifusingtt\ttsl\sl #1}\futurelet\next\smartitalicx} +\def\smartitalic#1{{\ifusingtt\ttsl\it #1}\futurelet\next\smartitalicx} + +% like \smartslanted except unconditionally uses \ttsl. +% @var is set to this for defun arguments. +\def\ttslanted#1{{\ttsl #1}\futurelet\next\smartitalicx} + +% like \smartslanted except unconditionally use \sl. We never want +% ttsl for book titles, do we? +\def\cite#1{{\sl #1}\futurelet\next\smartitalicx} + +\let\i=\smartitalic +\let\var=\smartslanted +\let\dfn=\smartslanted +\let\emph=\smartitalic + +\def\b#1{{\bf #1}} +\let\strong=\b + +% We can't just use \exhyphenpenalty, because that only has effect at +% the end of a paragraph. Restore normal hyphenation at the end of the +% group within which \nohyphenation is presumably called. +% +\def\nohyphenation{\hyphenchar\font = -1 \aftergroup\restorehyphenation} +\def\restorehyphenation{\hyphenchar\font = `- } + +% Set sfcode to normal for the chars that usually have another value. +% Can't use plain's \frenchspacing because it uses the `\x notation, and +% sometimes \x has an active definition that messes things up. 
+% +\catcode`@=11 + \def\frenchspacing{% + \sfcode\dotChar =\@m \sfcode\questChar=\@m \sfcode\exclamChar=\@m + \sfcode\colonChar=\@m \sfcode\semiChar =\@m \sfcode\commaChar =\@m + } +\catcode`@=\other + +\def\t#1{% + {\tt \rawbackslash \frenchspacing #1}% + \null +} +\def\samp#1{`\tclose{#1}'\null} +\setfont\keyrm\rmshape{8}{1000} +\font\keysy=cmsy9 +\def\key#1{{\keyrm\textfont2=\keysy \leavevmode\hbox{% + \raise0.4pt\hbox{\angleleft}\kern-.08em\vtop{% + \vbox{\hrule\kern-0.4pt + \hbox{\raise0.4pt\hbox{\vphantom{\angleleft}}#1}}% + \kern-0.4pt\hrule}% + \kern-.06em\raise0.4pt\hbox{\angleright}}}} +% The old definition, with no lozenge: +%\def\key #1{{\ttsl \nohyphenation \uppercase{#1}}\null} +\def\ctrl #1{{\tt \rawbackslash \hat}#1} + +% @file, @option are the same as @samp. +\let\file=\samp +\let\option=\samp + +% @code is a modification of @t, +% which makes spaces the same size as normal in the surrounding text. +\def\tclose#1{% + {% + % Change normal interword space to be same as for the current font. + \spaceskip = \fontdimen2\font + % + % Switch to typewriter. + \tt + % + % But `\ ' produces the large typewriter interword space. + \def\ {{\spaceskip = 0pt{} }}% + % + % Turn off hyphenation. + \nohyphenation + % + \rawbackslash + \frenchspacing + #1% + }% + \null +} + +% We *must* turn on hyphenation at `-' and `_' in @code. +% Otherwise, it is too hard to avoid overfull hboxes +% in the Emacs manual, the Library manual, etc. + +% Unfortunately, TeX uses one parameter (\hyphenchar) to control +% both hyphenation at - and hyphenation within words. +% We must therefore turn them both off (\tclose does that) +% and arrange explicitly to hyphenate at a dash. +% -- rms. 
+{ + \catcode`\-=\active + \catcode`\_=\active + % + \global\def\code{\begingroup + \catcode`\-=\active \let-\codedash + \catcode`\_=\active \let_\codeunder + \codex + } +} + +\def\realdash{-} +\def\codedash{-\discretionary{}{}{}} +\def\codeunder{% + % this is all so @math{@code{var_name}+1} can work. In math mode, _ + % is "active" (mathcode"8000) and \normalunderscore (or \char95, etc.) + % will therefore expand the active definition of _, which is us + % (inside @code that is), therefore an endless loop. + \ifusingtt{\ifmmode + \mathchar"075F % class 0=ordinary, family 7=ttfam, pos 0x5F=_. + \else\normalunderscore \fi + \discretionary{}{}{}}% + {\_}% +} +\def\codex #1{\tclose{#1}\endgroup} + +% @kbd is like @code, except that if the argument is just one @key command, +% then @kbd has no effect. + +% @kbdinputstyle -- arg is `distinct' (@kbd uses slanted tty font always), +% `example' (@kbd uses ttsl only inside of @example and friends), +% or `code' (@kbd uses normal tty font always). +\parseargdef\kbdinputstyle{% + \def\arg{#1}% + \ifx\arg\worddistinct + \gdef\kbdexamplefont{\ttsl}\gdef\kbdfont{\ttsl}% + \else\ifx\arg\wordexample + \gdef\kbdexamplefont{\ttsl}\gdef\kbdfont{\tt}% + \else\ifx\arg\wordcode + \gdef\kbdexamplefont{\tt}\gdef\kbdfont{\tt}% + \else + \errhelp = \EMsimple + \errmessage{Unknown @kbdinputstyle option `\arg'}% + \fi\fi\fi +} +\def\worddistinct{distinct} +\def\wordexample{example} +\def\wordcode{code} + +% Default is `distinct.' +\kbdinputstyle distinct + +\def\xkey{\key} +\def\kbdfoo#1#2#3\par{\def\one{#1}\def\three{#3}\def\threex{??}% +\ifx\one\xkey\ifx\threex\three \key{#2}% +\else{\tclose{\kbdfont\look}}\fi +\else{\tclose{\kbdfont\look}}\fi} + +% For @indicateurl, @env, @command quotes seem unnecessary, so use \code. 
+\let\indicateurl=\code +\let\env=\code +\let\command=\code + +% @uref (abbreviation for `urlref') takes an optional (comma-separated) +% second argument specifying the text to display and an optional third +% arg as text to display instead of (rather than in addition to) the url +% itself. First (mandatory) arg is the url. Perhaps eventually put in +% a hypertex \special here. +% +\def\uref#1{\douref #1,,,\finish} +\def\douref#1,#2,#3,#4\finish{\begingroup + \unsepspaces + \pdfurl{#1}% + \setbox0 = \hbox{\ignorespaces #3}% + \ifdim\wd0 > 0pt + \unhbox0 % third arg given, show only that + \else + \setbox0 = \hbox{\ignorespaces #2}% + \ifdim\wd0 > 0pt + \ifpdf + \unhbox0 % PDF: 2nd arg given, show only it + \else + \unhbox0\ (\code{#1})% DVI: 2nd arg given, show both it and url + \fi + \else + \code{#1}% only url given, so show it + \fi + \fi + \endlink +\endgroup} + +% @url synonym for @uref, since that's how everyone uses it. +% +\let\url=\uref + +% rms does not like angle brackets --karl, 17may97. +% So now @email is just like @uref, unless we are pdf. +% +%\def\email#1{\angleleft{\tt #1}\angleright} +\ifpdf + \def\email#1{\doemail#1,,\finish} + \def\doemail#1,#2,#3\finish{\begingroup + \unsepspaces + \pdfurl{mailto:#1}% + \setbox0 = \hbox{\ignorespaces #2}% + \ifdim\wd0>0pt\unhbox0\else\code{#1}\fi + \endlink + \endgroup} +\else + \let\email=\uref +\fi + +% Check if we are currently using a typewriter font. Since all the +% Computer Modern typewriter fonts have zero interword stretch (and +% shrink), and it is reasonable to expect all typewriter fonts to have +% this property, we can check that font parameter. +% +\def\ifmonospace{\ifdim\fontdimen3\font=0pt } + +% Typeset a dimension, e.g., `in' or `pt'. The only reason for the +% argument is to make the input look right: @dmn{pt} instead of @dmn{}pt. 
+% +\def\dmn#1{\thinspace #1} + +\def\kbd#1{\def\look{#1}\expandafter\kbdfoo\look??\par} + +% @l was never documented to mean ``switch to the Lisp font'', +% and it is not used as such in any manual I can find. We need it for +% Polish suppressed-l. --karl, 22sep96. +%\def\l#1{{\li #1}\null} + +% Explicit font changes: @r, @sc, undocumented @ii. +\def\r#1{{\rm #1}} % roman font +\def\sc#1{{\smallcaps#1}} % smallcaps font +\def\ii#1{{\it #1}} % italic font + +\def\acronym#1{\doacronym #1,,\finish} +\def\doacronym#1,#2,#3\finish{% + {\selectfonts\lsize #1}% + \def\temp{#2}% + \ifx\temp\empty \else + \space ({\unsepspaces \ignorespaces \temp \unskip})% + \fi +} + +% @pounds{} is a sterling sign, which is in the CM italic font. +% +\def\pounds{{\it\$}} + +% @registeredsymbol - R in a circle. The font for the R should really +% be smaller yet, but lllsize is the best we can do for now. +% Adapted from the plain.tex definition of \copyright. +% +\def\registeredsymbol{% + $^{{\ooalign{\hfil\raise.07ex\hbox{\selectfonts\lllsize R}% + \hfil\crcr\Orb}}% + }$% +} + + +\message{page headings,} + +\newskip\titlepagetopglue \titlepagetopglue = 1.5in +\newskip\titlepagebottomglue \titlepagebottomglue = 2pc + +% First the title page. Must do @settitle before @titlepage. +\newif\ifseenauthor +\newif\iffinishedtitlepage + +% Do an implicit @contents or @shortcontents after @end titlepage if the +% user says @setcontentsaftertitlepage or @setshortcontentsaftertitlepage. +% +\newif\ifsetcontentsaftertitlepage + \let\setcontentsaftertitlepage = \setcontentsaftertitlepagetrue +\newif\ifsetshortcontentsaftertitlepage + \let\setshortcontentsaftertitlepage = \setshortcontentsaftertitlepagetrue + +\parseargdef\shorttitlepage{\begingroup\hbox{}\vskip 1.5in \chaprm \centerline{#1}% + \endgroup\page\hbox{}\page} + +\envdef\titlepage{% + % Open one extra group, as we want to close it in the middle of \Etitlepage. 
+ \begingroup + \parindent=0pt \textfonts + % Leave some space at the very top of the page. + \vglue\titlepagetopglue + % No rule at page bottom unless we print one at the top with @title. + \finishedtitlepagetrue + % + % Most title ``pages'' are actually two pages long, with space + % at the top of the second. We don't want the ragged left on the second. + \let\oldpage = \page + \def\page{% + \iffinishedtitlepage\else + \finishtitlepage + \fi + \let\page = \oldpage + \page + \null + }% +} + +\def\Etitlepage{% + \iffinishedtitlepage\else + \finishtitlepage + \fi + % It is important to do the page break before ending the group, + % because the headline and footline are only empty inside the group. + % If we use the new definition of \page, we always get a blank page + % after the title page, which we certainly don't want. + \oldpage + \endgroup + % + % Need this before the \...aftertitlepage checks so that if they are + % in effect the toc pages will come out with page numbers. + \HEADINGSon + % + % If they want short, they certainly want long too. + \ifsetshortcontentsaftertitlepage + \shortcontents + \contents + \global\let\shortcontents = \relax + \global\let\contents = \relax + \fi + % + \ifsetcontentsaftertitlepage + \contents + \global\let\contents = \relax + \global\let\shortcontents = \relax + \fi +} + +\def\finishtitlepage{% + \vskip4pt \hrule height 2pt width \hsize + \vskip\titlepagebottomglue + \finishedtitlepagetrue +} + +%%% Macros to be used within @titlepage: + +\let\subtitlerm=\tenrm +\def\subtitlefont{\subtitlerm \normalbaselineskip = 13pt \normalbaselines} + +\def\authorfont{\authorrm \normalbaselineskip = 16pt \normalbaselines + \let\tt=\authortt} + +\parseargdef\title{% + \checkenv\titlepage + \leftline{\titlefonts\rm #1} + % print a rule at the page bottom also. 
+ \finishedtitlepagefalse + \vskip4pt \hrule height 4pt width \hsize \vskip4pt +} + +\parseargdef\subtitle{% + \checkenv\titlepage + {\subtitlefont \rightline{#1}}% +} + +% @author should come last, but may come many times. +% It can also be used inside @quotation. +% +\parseargdef\author{% + \def\temp{\quotation}% + \ifx\thisenv\temp + \def\quotationauthor{#1}% printed in \Equotation. + \else + \checkenv\titlepage + \ifseenauthor\else \vskip 0pt plus 1filll \seenauthortrue \fi + {\authorfont \leftline{#1}}% + \fi +} + + +%%% Set up page headings and footings. + +\let\thispage=\folio + +\newtoks\evenheadline % headline on even pages +\newtoks\oddheadline % headline on odd pages +\newtoks\evenfootline % footline on even pages +\newtoks\oddfootline % footline on odd pages + +% Now make TeX use those variables +\headline={{\textfonts\rm \ifodd\pageno \the\oddheadline + \else \the\evenheadline \fi}} +\footline={{\textfonts\rm \ifodd\pageno \the\oddfootline + \else \the\evenfootline \fi}\HEADINGShook} +\let\HEADINGShook=\relax + +% Commands to set those variables. 
+% For example, this is what @headings on does +% @evenheading @thistitle|@thispage|@thischapter +% @oddheading @thischapter|@thispage|@thistitle +% @evenfooting @thisfile|| +% @oddfooting ||@thisfile + + +\def\evenheading{\parsearg\evenheadingxxx} +\def\evenheadingxxx #1{\evenheadingyyy #1\|\|\|\|\finish} +\def\evenheadingyyy #1\|#2\|#3\|#4\finish{% +\global\evenheadline={\rlap{\centerline{#2}}\line{#1\hfil#3}}} + +\def\oddheading{\parsearg\oddheadingxxx} +\def\oddheadingxxx #1{\oddheadingyyy #1\|\|\|\|\finish} +\def\oddheadingyyy #1\|#2\|#3\|#4\finish{% +\global\oddheadline={\rlap{\centerline{#2}}\line{#1\hfil#3}}} + +\parseargdef\everyheading{\oddheadingxxx{#1}\evenheadingxxx{#1}}% + +\def\evenfooting{\parsearg\evenfootingxxx} +\def\evenfootingxxx #1{\evenfootingyyy #1\|\|\|\|\finish} +\def\evenfootingyyy #1\|#2\|#3\|#4\finish{% +\global\evenfootline={\rlap{\centerline{#2}}\line{#1\hfil#3}}} + +\def\oddfooting{\parsearg\oddfootingxxx} +\def\oddfootingxxx #1{\oddfootingyyy #1\|\|\|\|\finish} +\def\oddfootingyyy #1\|#2\|#3\|#4\finish{% + \global\oddfootline = {\rlap{\centerline{#2}}\line{#1\hfil#3}}% + % + % Leave some space for the footline. Hopefully ok to assume + % @evenfooting will not be used by itself. + \global\advance\pageheight by -\baselineskip + \global\advance\vsize by -\baselineskip +} + +\parseargdef\everyfooting{\oddfootingxxx{#1}\evenfootingxxx{#1}} + + +% @headings double turns headings on for double-sided printing. +% @headings single turns headings on for single-sided printing. +% @headings off turns them off. +% @headings on same as @headings double, retained for compatibility. +% @headings after turns on double-sided headings after this page. +% @headings doubleafter turns on double-sided headings after this page. +% @headings singleafter turns on single-sided headings after this page. +% By default, they are off at the start of a document, +% and turned `on' after @end titlepage. 
+ +\def\headings #1 {\csname HEADINGS#1\endcsname} + +\def\HEADINGSoff{% +\global\evenheadline={\hfil} \global\evenfootline={\hfil} +\global\oddheadline={\hfil} \global\oddfootline={\hfil}} +\HEADINGSoff +% When we turn headings on, set the page number to 1. +% For double-sided printing, put current file name in lower left corner, +% chapter name on inside top of right hand pages, document +% title on inside top of left hand pages, and page numbers on outside top +% edge of all pages. +\def\HEADINGSdouble{% +\global\pageno=1 +\global\evenfootline={\hfil} +\global\oddfootline={\hfil} +\global\evenheadline={\line{\folio\hfil\thistitle}} +\global\oddheadline={\line{\thischapter\hfil\folio}} +\global\let\contentsalignmacro = \chapoddpage +} +\let\contentsalignmacro = \chappager + +% For single-sided printing, chapter title goes across top left of page, +% page number on top right. +\def\HEADINGSsingle{% +\global\pageno=1 +\global\evenfootline={\hfil} +\global\oddfootline={\hfil} +\global\evenheadline={\line{\thischapter\hfil\folio}} +\global\oddheadline={\line{\thischapter\hfil\folio}} +\global\let\contentsalignmacro = \chappager +} +\def\HEADINGSon{\HEADINGSdouble} + +\def\HEADINGSafter{\let\HEADINGShook=\HEADINGSdoublex} +\let\HEADINGSdoubleafter=\HEADINGSafter +\def\HEADINGSdoublex{% +\global\evenfootline={\hfil} +\global\oddfootline={\hfil} +\global\evenheadline={\line{\folio\hfil\thistitle}} +\global\oddheadline={\line{\thischapter\hfil\folio}} +\global\let\contentsalignmacro = \chapoddpage +} + +\def\HEADINGSsingleafter{\let\HEADINGShook=\HEADINGSsinglex} +\def\HEADINGSsinglex{% +\global\evenfootline={\hfil} +\global\oddfootline={\hfil} +\global\evenheadline={\line{\thischapter\hfil\folio}} +\global\oddheadline={\line{\thischapter\hfil\folio}} +\global\let\contentsalignmacro = \chappager +} + +% Subroutines used in generating headings +% This produces Day Month Year style of output. 
+% Only define if not already defined, in case a txi-??.tex file has set +% up a different format (e.g., txi-cs.tex does this). +\ifx\today\undefined +\def\today{% + \number\day\space + \ifcase\month + \or\putwordMJan\or\putwordMFeb\or\putwordMMar\or\putwordMApr + \or\putwordMMay\or\putwordMJun\or\putwordMJul\or\putwordMAug + \or\putwordMSep\or\putwordMOct\or\putwordMNov\or\putwordMDec + \fi + \space\number\year} +\fi + +% @settitle line... specifies the title of the document, for headings. +% It generates no output of its own. +\def\thistitle{\putwordNoTitle} +\def\settitle{\parsearg{\gdef\thistitle}} + + +\message{tables,} +% Tables -- @table, @ftable, @vtable, @item(x). + +% default indentation of table text +\newdimen\tableindent \tableindent=.8in +% default indentation of @itemize and @enumerate text +\newdimen\itemindent \itemindent=.3in +% margin between end of table item and start of table text. +\newdimen\itemmargin \itemmargin=.1in + +% used internally for \itemindent minus \itemmargin +\newdimen\itemmax + +% Note @table, @ftable, and @vtable define @item, @itemx, etc., with +% these defs. +% They also define \itemindex +% to index the item name in whatever manner is desired (perhaps none). + +\newif\ifitemxneedsnegativevskip + +\def\itemxpar{\par\ifitemxneedsnegativevskip\nobreak\vskip-\parskip\nobreak\fi} + +\def\internalBitem{\smallbreak \parsearg\itemzzz} +\def\internalBitemx{\itemxpar \parsearg\itemzzz} + +\def\itemzzz #1{\begingroup % + \advance\hsize by -\rightskip + \advance\hsize by -\tableindent + \setbox0=\hbox{\itemindicate{#1}}% + \itemindex{#1}% + \nobreak % This prevents a break before @itemx. + % + % If the item text does not fit in the space we have, put it on a line + % by itself, and do not allow a page break either before or after that + % line. 
We do not start a paragraph here because then if the next + % command is, e.g., @kindex, the whatsit would get put into the + % horizontal list on a line by itself, resulting in extra blank space. + \ifdim \wd0>\itemmax + % + % Make this a paragraph so we get the \parskip glue and wrapping, + % but leave it ragged-right. + \begingroup + \advance\leftskip by-\tableindent + \advance\hsize by\tableindent + \advance\rightskip by0pt plus1fil + \leavevmode\unhbox0\par + \endgroup + % + % We're going to be starting a paragraph, but we don't want the + % \parskip glue -- logically it's part of the @item we just started. + \nobreak \vskip-\parskip + % + % Stop a page break at the \parskip glue coming up. (Unfortunately + % we can't prevent a possible page break at the following + % \baselineskip glue.) However, if what follows is an environment + % such as @example, there will be no \parskip glue; then + % the negative vskip we just inserted would cause the example and the item to + % crash together. So we use this bizarre value of 10001 as a signal + % to \aboveenvbreak to insert \parskip glue after all. + % (Possibly there are other commands that could be followed by + % @example which need the same treatment, but not section titles; or + % maybe section titles are the only special case and they should be + % penalty 10001...) + \penalty 10001 + \endgroup + \itemxneedsnegativevskipfalse + \else + % The item text fits into the space. Start a paragraph, so that the + % following text (if any) will end up on the same line. + \noindent + % Do this with kerns and \unhbox so that if there is a footnote in + % the item text, it can migrate to the main vertical list and + % eventually be printed.
+ \nobreak\kern-\tableindent + \dimen0 = \itemmax \advance\dimen0 by \itemmargin \advance\dimen0 by -\wd0 + \unhbox0 + \nobreak\kern\dimen0 + \endgroup + \itemxneedsnegativevskiptrue + \fi +} + +\def\item{\errmessage{@item while not in a list environment}} +\def\itemx{\errmessage{@itemx while not in a list environment}} + +% @table, @ftable, @vtable. +\envdef\table{% + \let\itemindex\gobble + \tablex +} +\envdef\ftable{% + \def\itemindex ##1{\doind {fn}{\code{##1}}}% + \tablex +} +\envdef\vtable{% + \def\itemindex ##1{\doind {vr}{\code{##1}}}% + \tablex +} +\def\tablex#1{% + \def\itemindicate{#1}% + \parsearg\tabley +} +\def\tabley#1{% + {% + \makevalueexpandable + \edef\temp{\noexpand\tablez #1\space\space\space}% + \expandafter + }\temp \endtablez +} +\def\tablez #1 #2 #3 #4\endtablez{% + \aboveenvbreak + \ifnum 0#1>0 \advance \leftskip by #1\mil \fi + \ifnum 0#2>0 \tableindent=#2\mil \fi + \ifnum 0#3>0 \advance \rightskip by #3\mil \fi + \itemmax=\tableindent + \advance \itemmax by -\itemmargin + \advance \leftskip by \tableindent + \exdentamount=\tableindent + \parindent = 0pt + \parskip = \smallskipamount + \ifdim \parskip=0pt \parskip=2pt \fi + \let\item = \internalBitem + \let\itemx = \internalBitemx +} +\def\Etable{\endgraf\afterenvbreak} +\let\Eftable\Etable +\let\Evtable\Etable +\let\Eitemize\Etable +\let\Eenumerate\Etable + +% This is the counter used by @enumerate, which is really @itemize + +\newcount \itemno + +\envdef\itemize{\parsearg\doitemize} + +\def\doitemize#1{% + \aboveenvbreak + \itemmax=\itemindent + \advance\itemmax by -\itemmargin + \advance\leftskip by \itemindent + \exdentamount=\itemindent + \parindent=0pt + \parskip=\smallskipamount + \ifdim\parskip=0pt \parskip=2pt \fi + \def\itemcontents{#1}% + % @itemize with no arg is equivalent to @itemize @bullet. + \ifx\itemcontents\empty\def\itemcontents{\bullet}\fi + \let\item=\itemizeitem +} + +% Definition of @item while inside @itemize and @enumerate. 
+% +\def\itemizeitem{% + \advance\itemno by 1 % for enumerations + {\let\par=\endgraf \smallbreak}% reasonable place to break + {% + % If the document has an @itemize directly after a section title, a + % \nobreak will be last on the list, and \sectionheading will have + % done a \vskip-\parskip. In that case, we don't want to zero + % parskip, or the item text will crash with the heading. On the + % other hand, when there is normal text preceding the item (as there + % usually is), we do want to zero parskip, or there would be too much + % space. In that case, we won't have a \nobreak before. At least + % that's the theory. + \ifnum\lastpenalty<10000 \parskip=0in \fi + \noindent + \hbox to 0pt{\hss \itemcontents \kern\itemmargin}% + \vadjust{\penalty 1200}}% not good to break after first line of item. + \flushcr +} + +% \splitoff TOKENS\endmark defines \first to be the first token in +% TOKENS, and \rest to be the remainder. +% +\def\splitoff#1#2\endmark{\def\first{#1}\def\rest{#2}}% + +% Allow an optional argument of an uppercase letter, lowercase letter, +% or number, to specify the first label in the enumerated list. No +% argument is the same as `1'. +% +\envparseargdef\enumerate{\enumeratey #1 \endenumeratey} +\def\enumeratey #1 #2\endenumeratey{% + % If we were given no argument, pretend we were given `1'. + \def\thearg{#1}% + \ifx\thearg\empty \def\thearg{1}\fi + % + % Detect if the argument is a single token. If so, it might be a + % letter. Otherwise, the only valid thing it can be is a number. + % (We will always have one token, because of the test we just made. + % This is a good thing, since \splitoff doesn't work given nothing at + % all -- the first parameter is undelimited.) + \expandafter\splitoff\thearg\endmark + \ifx\rest\empty + % Only one token in the argument. It could still be anything. + % A ``lowercase letter'' is one whose \lccode is nonzero. + % An ``uppercase letter'' is one whose \lccode is both nonzero, and + % not equal to itself. 
+ % Otherwise, we assume it's a number. + % + % We need the \relax at the end of the \ifnum lines to stop TeX from + % continuing to look for a <number>. + % + \ifnum\lccode\expandafter`\thearg=0\relax + \numericenumerate % a number (we hope) + \else + % It's a letter. + \ifnum\lccode\expandafter`\thearg=\expandafter`\thearg\relax + \lowercaseenumerate % lowercase letter + \else + \uppercaseenumerate % uppercase letter + \fi + \fi + \else + % Multiple tokens in the argument. We hope it's a number. + \numericenumerate + \fi +} + +% An @enumerate whose labels are integers. The starting integer is +% given in \thearg. +% +\def\numericenumerate{% + \itemno = \thearg + \startenumeration{\the\itemno}% +} + +% The starting (lowercase) letter is in \thearg. +\def\lowercaseenumerate{% + \itemno = \expandafter`\thearg + \startenumeration{% + % Be sure we're not beyond the end of the alphabet. + \ifnum\itemno=0 + \errmessage{No more lowercase letters in @enumerate; get a bigger + alphabet}% + \fi + \char\lccode\itemno + }% +} + +% The starting (uppercase) letter is in \thearg. +\def\uppercaseenumerate{% + \itemno = \expandafter`\thearg + \startenumeration{% + % Be sure we're not beyond the end of the alphabet. + \ifnum\itemno=0 + \errmessage{No more uppercase letters in @enumerate; get a bigger + alphabet} + \fi + \char\uccode\itemno + }% +} + +% Call \doitemize, adding a period to the first argument and supplying the +% common last two arguments. Also subtract one from the initial value in +% \itemno, since @item increments \itemno. +% +\def\startenumeration#1{% + \advance\itemno by -1 + \doitemize{#1.}\flushcr +} + +% @alphaenumerate and @capsenumerate are abbreviations for giving an arg +% to @enumerate. +% +\def\alphaenumerate{\enumerate{a}} +\def\capsenumerate{\enumerate{A}} +\def\Ealphaenumerate{\Eenumerate} +\def\Ecapsenumerate{\Eenumerate} + + +% @multitable macros +% Amy Hendrickson, 8/18/94, 3/6/96 +% +% @multitable ... @end multitable will make as many columns as desired.
+% Contents of each column will wrap at width given in preamble. Width +% can be specified either with sample text given in a template line, +% or in percent of \hsize, the current width of text on page. + +% Table can continue over pages but will only break between lines. + +% To make preamble: +% +% Either define widths of columns in terms of percent of \hsize: +% @multitable @columnfractions .25 .3 .45 +% @item ... +% +% Numbers following @columnfractions are the percent of the total +% current hsize to be used for each column. You may use as many +% columns as desired. + + +% Or use a template: +% @multitable {Column 1 template} {Column 2 template} {Column 3 template} +% @item ... +% using the widest term desired in each column. + +% Each new table line starts with @item, each subsequent new column +% starts with @tab. Empty columns may be produced by supplying @tab's +% with nothing between them for as many times as empty columns are needed, +% ie, @tab@tab@tab will produce two empty columns. + +% @item, @tab do not need to be on their own lines, but it will not hurt +% if they are. + +% Sample multitable: + +% @multitable {Column 1 template} {Column 2 template} {Column 3 template} +% @item first col stuff @tab second col stuff @tab third col +% @item +% first col stuff +% @tab +% second col stuff +% @tab +% third col +% @item first col stuff @tab second col stuff +% @tab Many paragraphs of text may be used in any column. +% +% They will wrap at the width determined by the template. +% @item@tab@tab This will be in third column. +% @end multitable + +% Default dimensions may be reset by user. +% @multitableparskip is vertical space between paragraphs in table. +% @multitableparindent is paragraph indent in table. +% @multitablecolmargin is horizontal space to be left between columns. +% @multitablelinespace is space to leave between table items, baseline +% to baseline. +% 0pt means it depends on current normal line spacing. 
+% +\newskip\multitableparskip +\newskip\multitableparindent +\newdimen\multitablecolspace +\newskip\multitablelinespace +\multitableparskip=0pt +\multitableparindent=6pt +\multitablecolspace=12pt +\multitablelinespace=0pt + +% Macros used to set up halign preamble: +% +\let\endsetuptable\relax +\def\xendsetuptable{\endsetuptable} +\let\columnfractions\relax +\def\xcolumnfractions{\columnfractions} +\newif\ifsetpercent + +% #1 is the @columnfraction, usually a decimal number like .5, but might +% be just 1. We just use it, whatever it is. +% +\def\pickupwholefraction#1 {% + \global\advance\colcount by 1 + \expandafter\xdef\csname col\the\colcount\endcsname{#1\hsize}% + \setuptable +} + +\newcount\colcount +\def\setuptable#1{% + \def\firstarg{#1}% + \ifx\firstarg\xendsetuptable + \let\go = \relax + \else + \ifx\firstarg\xcolumnfractions + \global\setpercenttrue + \else + \ifsetpercent + \let\go\pickupwholefraction + \else + \global\advance\colcount by 1 + \setbox0=\hbox{#1\unskip\space}% Add a normal word space as a + % separator; typically that is always in the input, anyway. + \expandafter\xdef\csname col\the\colcount\endcsname{\the\wd0}% + \fi + \fi + \ifx\go\pickupwholefraction + % Put the argument back for the \pickupwholefraction call, so + % we'll always have a period there to be parsed. + \def\go{\pickupwholefraction#1}% + \else + \let\go = \setuptable + \fi% + \fi + \go +} + +% multitable-only commands. +% +% @headitem starts a heading row, which we typeset in bold. +% Assignments have to be global since we are inside the implicit group +% of an alignment entry. Note that \everycr resets \everytab. +\def\headitem{\checkenv\multitable \crcr \global\everytab={\bf}\the\everytab}% +% +% A \tab used to include \hskip1sp. But then the space in a template +% line is not enough. That is bad. So let's go back to just `&' until +% we encounter the problem it was intended to solve again. +% --karl, nathan@acm.org, 20apr99. 
+\def\tab{\checkenv\multitable &\the\everytab}% + +% @multitable ... @end multitable definitions: +% +\newtoks\everytab % insert after every tab. +% +\envdef\multitable{% + \vskip\parskip + \startsavinginserts + % + % @item within a multitable starts a normal row. + \let\item\crcr + % + \tolerance=9500 + \hbadness=9500 + \setmultitablespacing + \parskip=\multitableparskip + \parindent=\multitableparindent + \overfullrule=0pt + \global\colcount=0 + % + \everycr = {% + \noalign{% + \global\everytab={}% + \global\colcount=0 % Reset the column counter. + % Check for saved footnotes, etc. + \checkinserts + % Keeps underfull box messages off when table breaks over pages. + %\filbreak + % Maybe so, but it also creates really weird page breaks when the + % table breaks over pages. Wouldn't \vfil be better? Wait until the + % problem manifests itself, so it can be fixed for real --karl. + }% + }% + % + \parsearg\domultitable +} +\def\domultitable#1{% + % To parse everything between @multitable and @item: + \setuptable#1 \endsetuptable + % + % This preamble sets up a generic column definition, which will + % be used as many times as user calls for columns. + % \vtop will set a single line and will also let text wrap and + % continue for many paragraphs if desired. + \halign\bgroup &% + \global\advance\colcount by 1 + \multistrut + \vtop{% + % Use the current \colcount to find the correct column width: + \hsize=\expandafter\csname col\the\colcount\endcsname + % + % In order to keep entries from bumping into each other + % we will add a \leftskip of \multitablecolspace to all columns after + % the first one. + % + % If a template has been used, we will add \multitablecolspace + % to the width of each template entry. + % + % If the user has set preamble in terms of percent of \hsize we will + % use that dimension as the width of the column, and the \leftskip + % will keep entries from bumping into each other. 
Table will start at + % left margin and final column will justify at right margin. + % + % Make sure we don't inherit \rightskip from the outer environment. + \rightskip=0pt + \ifnum\colcount=1 + % The first column will be indented with the surrounding text. + \advance\hsize by\leftskip + \else + \ifsetpercent \else + % If user has not set preamble in terms of percent of \hsize + % we will advance \hsize by \multitablecolspace. + \advance\hsize by \multitablecolspace + \fi + % In either case we will make \leftskip=\multitablecolspace: + \leftskip=\multitablecolspace + \fi + % Ignoring space at the beginning and end avoids an occasional spurious + % blank line, when TeX decides to break the line at the space before the + % box from the multistrut, so the strut ends up on a line by itself. + % For example: + % @multitable @columnfractions .11 .89 + % @item @code{#} + % @tab Legal holiday which is valid in major parts of the whole country. + % Is automatically provided with highlighting sequences respectively + % marking characters. + \noindent\ignorespaces##\unskip\multistrut + }\cr +} +\def\Emultitable{% + \crcr + \egroup % end the \halign + \global\setpercentfalse +} + +\def\setmultitablespacing{% test to see if user has set \multitablelinespace. +% If so, do nothing. If not, give it an appropriate dimension based on +% current baselineskip. +\ifdim\multitablelinespace=0pt +\setbox0=\vbox{X}\global\multitablelinespace=\the\baselineskip +\global\advance\multitablelinespace by-\ht0 +%% strut to put in table in case some entry doesn't have descenders, +%% to keep lines equally spaced +\let\multistrut = \strut +\else +%% FIXME: what is \box0 supposed to be? +\gdef\multistrut{\vrule height\multitablelinespace depth\dp0 +width0pt\relax} \fi +%% Test to see if parskip is larger than space between lines of +%% table. If not, do nothing. +%% If so, set to same dimension as multitablelinespace. 
+\ifdim\multitableparskip>\multitablelinespace +\global\multitableparskip=\multitablelinespace +\global\advance\multitableparskip-7pt %% to keep parskip somewhat smaller + %% than skip between lines in the table. +\fi% +\ifdim\multitableparskip=0pt +\global\multitableparskip=\multitablelinespace +\global\advance\multitableparskip-7pt %% to keep parskip somewhat smaller + %% than skip between lines in the table. +\fi} + + +\message{conditionals,} + +% @iftex, @ifnotdocbook, @ifnothtml, @ifnotinfo, @ifnotplaintext, +% @ifnotxml always succeed. They currently do nothing; we don't +% attempt to check whether the conditionals are properly nested. But we +% have to remember that they are conditionals, so that @end doesn't +% attempt to close an environment group. +% +\def\makecond#1{% + \expandafter\let\csname #1\endcsname = \relax + \expandafter\let\csname iscond.#1\endcsname = 1 +} +\makecond{iftex} +\makecond{ifnotdocbook} +\makecond{ifnothtml} +\makecond{ifnotinfo} +\makecond{ifnotplaintext} +\makecond{ifnotxml} + +% Ignore @ignore, @ifhtml, @ifinfo, and the like. +% +\def\direntry{\doignore{direntry}} +\def\documentdescription{\doignore{documentdescription}} +\def\docbook{\doignore{docbook}} +\def\html{\doignore{html}} +\def\ifdocbook{\doignore{ifdocbook}} +\def\ifhtml{\doignore{ifhtml}} +\def\ifinfo{\doignore{ifinfo}} +\def\ifnottex{\doignore{ifnottex}} +\def\ifplaintext{\doignore{ifplaintext}} +\def\ifxml{\doignore{ifxml}} +\def\ignore{\doignore{ignore}} +\def\menu{\doignore{menu}} +\def\xml{\doignore{xml}} + +% Ignore text until a line `@end #1', keeping track of nested conditionals. +% +% A count to remember the depth of nesting. +\newcount\doignorecount + +\def\doignore#1{\begingroup + % Scan in ``verbatim'' mode: + \catcode`\@ = \other + \catcode`\{ = \other + \catcode`\} = \other + % + % Make sure that spaces turn into tokens that match what \doignoretext wants. + \spaceisspace + % + % Count number of #1's that we've seen. 
+ \doignorecount = 0 + % + % Swallow text until we reach the matching `@end #1'. + \dodoignore {#1}% +} + +{ \catcode`_=11 % We want to use \_STOP_ which cannot appear in texinfo source. + \obeylines % + % + \gdef\dodoignore#1{% + % #1 contains the string `ifinfo'. + % + % Define a command to find the next `@end #1', which must be on a line + % by itself. + \long\def\doignoretext##1^^M@end #1{\doignoretextyyy##1^^M@#1\_STOP_}% + % And this command to find another #1 command, at the beginning of a + % line. (Otherwise, we would consider a line `@c @ifset', for + % example, to count as an @ifset for nesting.) + \long\def\doignoretextyyy##1^^M@#1##2\_STOP_{\doignoreyyy{##2}\_STOP_}% + % + % And now expand that command. + \obeylines % + \doignoretext ^^M% + }% +} + +\def\doignoreyyy#1{% + \def\temp{#1}% + \ifx\temp\empty % Nothing found. + \let\next\doignoretextzzz + \else % Found a nested condition, ... + \advance\doignorecount by 1 + \let\next\doignoretextyyy % ..., look for another. + % If we're here, #1 ends with ^^M\ifinfo (for example). + \fi + \next #1% the token \_STOP_ is present just after this macro. +} + +% We have to swallow the remaining "\_STOP_". +% +\def\doignoretextzzz#1{% + \ifnum\doignorecount = 0 % We have just found the outermost @end. + \let\next\enddoignore + \else % Still inside a nested condition. + \advance\doignorecount by -1 + \let\next\doignoretext % Look for the next @end. + \fi + \next +} + +% Finish off ignored text. +\def\enddoignore{\endgroup\ignorespaces} + + +% @set VAR sets the variable VAR to an empty value. +% @set VAR REST-OF-LINE sets VAR to the value REST-OF-LINE. +% +% Since we want to separate VAR from REST-OF-LINE (which might be +% empty), we can't just use \parsearg; we have to insert a space of our +% own to delimit the rest of the line, and then take it out again if we +% didn't need it. +% We rely on the fact that \parsearg sets \catcode`\ =10. 
+% +\parseargdef\set{\setyyy#1 \endsetyyy} +\def\setyyy#1 #2\endsetyyy{% + {% + \makevalueexpandable + \def\temp{#2}% + \edef\next{\gdef\makecsname{SET#1}}% + \ifx\temp\empty + \next{}% + \else + \setzzz#2\endsetzzz + \fi + }% +} +% Remove the trailing space that \set inserted. +\def\setzzz#1 \endsetzzz{\next{#1}} + +% @clear VAR clears (i.e., unsets) the variable VAR. +% +\parseargdef\clear{% + {% + \makevalueexpandable + \global\expandafter\let\csname SET#1\endcsname=\relax + }% +} + +% @value{foo} gets the text saved in variable foo. +\def\value{\begingroup\makevalueexpandable\valuexxx} +\def\valuexxx#1{\expandablevalue{#1}\endgroup} +{ + \catcode`\- = \active \catcode`\_ = \active + % + \gdef\makevalueexpandable{% + \let\value = \expandablevalue + % We don't want these characters active, ... + \catcode`\-=\other \catcode`\_=\other + % ..., but we might end up with active ones in the argument if + % we're called from @code, as @code{@value{foo-bar_}}, though. + % So \let them to their normal equivalents. + \let-\realdash \let_\normalunderscore + } +} + +% We have this subroutine so that we can handle at least some @value's +% properly in indexes (we call \makevalueexpandable in \indexdummies). +% The command has to be fully expandable (if the variable is set), since +% the result winds up in the index file. This means that if the +% variable's value contains other Texinfo commands, it's almost certain +% it will fail (although perhaps we could fix that with sufficient work +% to do a one-level expansion on the result, instead of complete). +% +\def\expandablevalue#1{% + \expandafter\ifx\csname SET#1\endcsname\relax + {[No value for ``#1'']}% + \message{Variable `#1', used in @value, is not set.}% + \else + \csname SET#1\endcsname + \fi +} + +% @ifset VAR ... @end ifset reads the `...' iff VAR has been defined +% with @set. +% +% To get special treatment of `@end ifset,' call \makecond and then redefine.
+% +\makecond{ifset} +\def\ifset{\parsearg{\doifset{\let\next=\ifsetfail}}} +\def\doifset#1#2{% + {% + \makevalueexpandable + \let\next=\empty + \expandafter\ifx\csname SET#2\endcsname\relax + #1% If not set, redefine \next. + \fi + \expandafter + }\next +} +\def\ifsetfail{\doignore{ifset}} + +% @ifclear VAR ... @end ifclear reads the `...' iff VAR has never been +% defined with @set, or has been undefined with @clear. +% +% The `\else' inside the `\doifset' parameter is a trick to reuse the +% above code: if the variable is not set, do nothing, if it is set, +% then redefine \next to \ifclearfail. +% +\makecond{ifclear} +\def\ifclear{\parsearg{\doifset{\else \let\next=\ifclearfail}}} +\def\ifclearfail{\doignore{ifclear}} + +% @dircategory CATEGORY -- specify a category of the dir file +% which this file should belong to. Ignore this in TeX. +\let\dircategory=\comment + +% @definfoenclose. +\let\definfoenclose=\comment + + +\message{indexing,} +% Index generation facilities + +% Define \newwrite to be identical to plain tex's \newwrite +% except not \outer, so it can be used within \newindex. +{\catcode`\@=11 +\gdef\newwrite{\alloc@7\write\chardef\sixt@@n}} + +% \newindex {foo} defines an index named foo. +% It automatically defines \fooindex such that +% \fooindex ...rest of line... puts an entry in the index foo. +% It also defines \fooindfile to be the number of the output channel for +% the file that accumulates this index. The file's extension is foo. +% The name of an index should be no more than 2 characters long +% for the sake of vms. +% +\def\newindex#1{% + \iflinks + \expandafter\newwrite \csname#1indfile\endcsname + \openout \csname#1indfile\endcsname \jobname.#1 % Open the file + \fi + \expandafter\xdef\csname#1index\endcsname{% % Define @#1index + \noexpand\doindex{#1}} +} + +% @defindex foo == \newindex{foo} +% +\def\defindex{\parsearg\newindex} + +% Define @defcodeindex, like @defindex except put all entries in @code.
+% +\def\defcodeindex{\parsearg\newcodeindex} +% +\def\newcodeindex#1{% + \iflinks + \expandafter\newwrite \csname#1indfile\endcsname + \openout \csname#1indfile\endcsname \jobname.#1 + \fi + \expandafter\xdef\csname#1index\endcsname{% + \noexpand\docodeindex{#1}}% +} + + +% @synindex foo bar makes index foo feed into index bar. +% Do this instead of @defindex foo if you don't want it as a separate index. +% +% @syncodeindex foo bar similar, but put all entries made for index foo +% inside @code. +% +\def\synindex#1 #2 {\dosynindex\doindex{#1}{#2}} +\def\syncodeindex#1 #2 {\dosynindex\docodeindex{#1}{#2}} + +% #1 is \doindex or \docodeindex, #2 the index getting redefined (foo), +% #3 the target index (bar). +\def\dosynindex#1#2#3{% + % Only do \closeout if we haven't already done it, else we'll end up + % closing the target index. + \expandafter \ifx\csname donesynindex#2\endcsname \undefined + % The \closeout helps reduce unnecessary open files; the limit on the + % Acorn RISC OS is a mere 16 files. + \expandafter\closeout\csname#2indfile\endcsname + \expandafter\let\csname donesynindex#2\endcsname = 1 + \fi + % redefine \fooindfile: + \expandafter\let\expandafter\temp\expandafter=\csname#3indfile\endcsname + \expandafter\let\csname#2indfile\endcsname=\temp + % redefine \fooindex: + \expandafter\xdef\csname#2index\endcsname{\noexpand#1{#3}}% +} + +% Define \doindex, the driver for all \fooindex macros. +% Argument #1 is generated by the calling \fooindex macro, +% and it is "foo", the name of the index. + +% \doindex just uses \parsearg; it calls \doind for the actual work. +% This is because \doind is more useful to call from other macros. + +% There is also \dosubind {index}{topic}{subtopic} +% which makes an entry in a two-level index such as the operation index. + +\def\doindex#1{\edef\indexname{#1}\parsearg\singleindexer} +\def\singleindexer #1{\doind{\indexname}{#1}} + +% like the previous two, but they put @code around the argument.
+\def\docodeindex#1{\edef\indexname{#1}\parsearg\singlecodeindexer} +\def\singlecodeindexer #1{\doind{\indexname}{\code{#1}}} + +% Take care of Texinfo commands that can appear in an index entry. +% Since there are some commands we want to expand, and others we don't, +% we have to laboriously prevent expansion for those that we don't. +% +\def\indexdummies{% + \def\@{@}% change to @@ when we switch to @ as escape char in index files. + \def\ {\realbackslash\space }% + % Need these in case \tex is in effect and \{ is a \delimiter again. + % But can't use \lbracecmd and \rbracecmd because texindex assumes + % braces and backslashes are used only as delimiters. + \let\{ = \mylbrace + \let\} = \myrbrace + % + % \definedummyword defines \#1 as \realbackslash #1\space, thus + % effectively preventing its expansion. This is used only for control + % words, not control letters, because the \space would be incorrect + % for control characters, but is needed to separate the control word + % from whatever follows. + % + % For control letters, we have \definedummyletter, which omits the + % space. + % + % These can be used both for control words that take an argument and + % those that do not. If it is followed by {arg} in the input, then + % that will dutifully get written to the index (or wherever). + % + \def\definedummyword##1{% + \expandafter\def\csname ##1\endcsname{\realbackslash ##1\space}% + }% + \def\definedummyletter##1{% + \expandafter\def\csname ##1\endcsname{\realbackslash ##1}% + }% + % + % Do the redefinitions. + \commondummies +} + +% For the aux file, @ is the escape character. So we want to redefine +% everything using @ instead of \realbackslash. When everything uses +% @, this will be simpler. +% +\def\atdummies{% + \def\@{@@}% + \def\ {@ }% + \let\{ = \lbraceatcmd + \let\} = \rbraceatcmd + % + % (See comments in \indexdummies.) 
+ \def\definedummyword##1{% + \expandafter\def\csname ##1\endcsname{@##1\space}% + }% + \def\definedummyletter##1{% + \expandafter\def\csname ##1\endcsname{@##1}% + }% + % + % Do the redefinitions. + \commondummies +} + +% Called from \indexdummies and \atdummies. \definedummyword and +% \definedummyletter must be defined first. +% +\def\commondummies{% + % + \normalturnoffactive + % + \commondummiesnofonts + % + \definedummyletter{_}% + % + % Non-English letters. + \definedummyword{AA}% + \definedummyword{AE}% + \definedummyword{L}% + \definedummyword{OE}% + \definedummyword{O}% + \definedummyword{aa}% + \definedummyword{ae}% + \definedummyword{l}% + \definedummyword{oe}% + \definedummyword{o}% + \definedummyword{ss}% + \definedummyword{exclamdown}% + \definedummyword{questiondown}% + \definedummyword{ordf}% + \definedummyword{ordm}% + % + % Although these internal commands shouldn't show up, sometimes they do. + \definedummyword{bf}% + \definedummyword{gtr}% + \definedummyword{hat}% + \definedummyword{less}% + \definedummyword{sf}% + \definedummyword{sl}% + \definedummyword{tclose}% + \definedummyword{tt}% + % + \definedummyword{LaTeX}% + \definedummyword{TeX}% + % + % Assorted special characters. + \definedummyword{bullet}% + \definedummyword{copyright}% + \definedummyword{registeredsymbol}% + \definedummyword{dots}% + \definedummyword{enddots}% + \definedummyword{equiv}% + \definedummyword{error}% + \definedummyword{expansion}% + \definedummyword{minus}% + \definedummyword{pounds}% + \definedummyword{point}% + \definedummyword{print}% + \definedummyword{result}% + % + % Handle some cases of @value -- where it does not contain any + % (non-fully-expandable) commands. + \makevalueexpandable + % + % Normal spaces, not active ones. + \unsepspaces + % + % No macro expansion. + \turnoffmacros +} + +% \commondummiesnofonts: common to \commondummies and \indexnofonts. +% +% Better have this without active chars. 
+{ + \catcode`\~=\other + \gdef\commondummiesnofonts{% + % Control letters and accents. + \definedummyletter{!}% + \definedummyletter{"}% + \definedummyletter{'}% + \definedummyletter{*}% + \definedummyletter{,}% + \definedummyletter{.}% + \definedummyletter{/}% + \definedummyletter{:}% + \definedummyletter{=}% + \definedummyletter{?}% + \definedummyletter{^}% + \definedummyletter{`}% + \definedummyletter{~}% + \definedummyword{u}% + \definedummyword{v}% + \definedummyword{H}% + \definedummyword{dotaccent}% + \definedummyword{ringaccent}% + \definedummyword{tieaccent}% + \definedummyword{ubaraccent}% + \definedummyword{udotaccent}% + \definedummyword{dotless}% + % + % Texinfo font commands. + \definedummyword{b}% + \definedummyword{i}% + \definedummyword{r}% + \definedummyword{sc}% + \definedummyword{t}% + % + % Commands that take arguments. + \definedummyword{acronym}% + \definedummyword{cite}% + \definedummyword{code}% + \definedummyword{command}% + \definedummyword{dfn}% + \definedummyword{emph}% + \definedummyword{env}% + \definedummyword{file}% + \definedummyword{kbd}% + \definedummyword{key}% + \definedummyword{math}% + \definedummyword{option}% + \definedummyword{samp}% + \definedummyword{strong}% + \definedummyword{tie}% + \definedummyword{uref}% + \definedummyword{url}% + \definedummyword{var}% + \definedummyword{verb}% + \definedummyword{w}% + } +} + +% \indexnofonts is used when outputting the strings to sort the index +% by, and when constructing control sequence names. It eliminates all +% control sequences and just writes whatever the best ASCII sort string +% would be for a given command (usually its argument). +% +\def\indexnofonts{% + \def\definedummyword##1{% + \expandafter\let\csname ##1\endcsname\asis + }% + % We can just ignore the accent commands and other control letters. 
+ \def\definedummyletter##1{% + \expandafter\def\csname ##1\endcsname{}% + }% + % + \commondummiesnofonts + % + % Don't no-op \tt, since it isn't a user-level command + % and is used in the definitions of the active chars like <, >, |, etc. + % Likewise with the other plain tex font commands. + %\let\tt=\asis + % + \def\ { }% + \def\@{@}% + % how to handle braces? + \def\_{\normalunderscore}% + % + % Non-English letters. + \def\AA{AA}% + \def\AE{AE}% + \def\L{L}% + \def\OE{OE}% + \def\O{O}% + \def\aa{aa}% + \def\ae{ae}% + \def\l{l}% + \def\oe{oe}% + \def\o{o}% + \def\ss{ss}% + \def\exclamdown{!}% + \def\questiondown{?}% + \def\ordf{a}% + \def\ordm{o}% + % + \def\LaTeX{LaTeX}% + \def\TeX{TeX}% + % + % Assorted special characters. + % (The following {} will end up in the sort string, but that's ok.) + \def\bullet{bullet}% + \def\copyright{copyright}% + \def\registeredsymbol{R}% + \def\dots{...}% + \def\enddots{...}% + \def\equiv{==}% + \def\error{error}% + \def\expansion{==>}% + \def\minus{-}% + \def\pounds{pounds}% + \def\point{.}% + \def\print{-|}% + \def\result{=>}% +} + +\let\indexbackslash=0 %overridden during \printindex. +\let\SETmarginindex=\relax % put index entries in margin (undocumented)? + +% Most index entries go through here, but \dosubind is the general case. +% #1 is the index name, #2 is the entry text. +\def\doind#1#2{\dosubind{#1}{#2}{}} + +% Workhorse for all \fooindexes. +% #1 is name of index, #2 is stuff to put there, #3 is subentry -- +% empty if called from \doind, as we usually are (the main exception +% is with most defuns, which call us directly). +% +\def\dosubind#1#2#3{% + \iflinks + {% + % Store the main index entry text (including the third arg). + \toks0 = {#2}% + % If third arg is present, precede it with a space. 
+ \def\thirdarg{#3}% + \ifx\thirdarg\empty \else + \toks0 = \expandafter{\the\toks0 \space #3}% + \fi + % + \edef\writeto{\csname#1indfile\endcsname}% + % + \ifvmode + \dosubindsanitize + \else + \dosubindwrite + \fi + }% + \fi +} + +% Write the entry in \toks0 to the index file: +% +\def\dosubindwrite{% + % Put the index entry in the margin if desired. + \ifx\SETmarginindex\relax\else + \insert\margin{\hbox{\vrule height8pt depth3pt width0pt \the\toks0}}% + \fi + % + % Remember, we are within a group. + \indexdummies % Must do this here, since \bf, etc expand at this stage + \escapechar=`\\ + \def\backslashcurfont{\indexbackslash}% \indexbackslash isn't defined now + % so it will be output as is; and it will print as backslash. + % + % Process the index entry with all font commands turned off, to + % get the string to sort by. + {\indexnofonts + \edef\temp{\the\toks0}% need full expansion + \xdef\indexsorttmp{\temp}% + }% + % + % Set up the complete index entry, with both the sort key and + % the original text, including any font commands. We write + % three arguments to \entry to the .?? file (four in the + % subentry case), texindex reduces to two when writing the .??s + % sorted result. + \edef\temp{% + \write\writeto{% + \string\entry{\indexsorttmp}{\noexpand\folio}{\the\toks0}}% + }% + \temp +} + +% Take care of unwanted page breaks: +% +% If a skip is the last thing on the list now, preserve it +% by backing up by \lastskip, doing the \write, then inserting +% the skip again. Otherwise, the whatsit generated by the +% \write will make \lastskip zero. The result is that sequences +% like this: +% @end defun +% @tindex whatever +% @defun ... +% will have extra space inserted, because the \medbreak in the +% start of the @defun won't see the skip inserted by the @end of +% the previous defun. +% +% But don't do any of this if we're not in vertical mode. We +% don't want to do a \vskip and prematurely end a paragraph. 
+% +% Avoid page breaks due to these extra skips, too. +% +% But wait, there is a catch there: +% We'll have to check whether \lastskip is zero skip. \ifdim is not +% sufficient for this purpose, as it ignores stretch and shrink parts +% of the skip. The only way seems to be to check the textual +% representation of the skip. +% +% The following is almost like \def\zeroskipmacro{0.0pt} except that +% the ``p'' and ``t'' characters have catcode \other, not 11 (letter). +% +\edef\zeroskipmacro{\expandafter\the\csname z@skip\endcsname} +% +% ..., ready, GO: +% +\def\dosubindsanitize{% + % \lastskip and \lastpenalty cannot both be nonzero simultaneously. + \skip0 = \lastskip + \edef\lastskipmacro{\the\lastskip}% + \count255 = \lastpenalty + % + % If \lastskip is nonzero, that means the last item was a + % skip. And since a skip is discardable, that means this + % -\skip0 glue we're inserting is preceded by a + % non-discardable item, therefore it is not a potential + % breakpoint, therefore no \nobreak needed. + \ifx\lastskipmacro\zeroskipmacro + \else + \vskip-\skip0 + \fi + % + \dosubindwrite + % + \ifx\lastskipmacro\zeroskipmacro + % if \lastskip was zero, perhaps the last item was a + % penalty, and perhaps it was >=10000, e.g., a \nobreak. + % In that case, we want to re-insert the penalty; since we + % just inserted a non-discardable item, any following glue + % (such as a \parskip) would be a breakpoint. For example: + % @deffn deffn-whatever + % @vindex index-whatever + % Description. + % would allow a break between the index-whatever whatsit + % and the "Description." paragraph. + \ifnum\count255>9999 \nobreak \fi + \else + % On the other hand, if we had a nonzero \lastskip, + % this make-up glue would be preceded by a non-discardable item + % (the whatsit from the \write), so we must insert a \nobreak. 
+ \nobreak\vskip\skip0 + \fi +} + +% The index entry written in the file actually looks like +% \entry {sortstring}{page}{topic} +% or +% \entry {sortstring}{page}{topic}{subtopic} +% The texindex program reads in these files and writes files +% containing these kinds of lines: +% \initial {c} +% before the first topic whose initial is c +% \entry {topic}{pagelist} +% for a topic that is used without subtopics +% \primary {topic} +% for the beginning of a topic that is used with subtopics +% \secondary {subtopic}{pagelist} +% for each subtopic. + +% Define the user-accessible indexing commands +% @findex, @vindex, @kindex, @cindex. + +\def\findex {\fnindex} +\def\kindex {\kyindex} +\def\cindex {\cpindex} +\def\vindex {\vrindex} +\def\tindex {\tpindex} +\def\pindex {\pgindex} + +\def\cindexsub {\begingroup\obeylines\cindexsub} +{\obeylines % +\gdef\cindexsub "#1" #2^^M{\endgroup % +\dosubind{cp}{#2}{#1}}} + +% Define the macros used in formatting output of the sorted index material. + +% @printindex causes a particular index (the ??s file) to get printed. +% It does not print any chapter heading (usually an @unnumbered). +% +\parseargdef\printindex{\begingroup + \dobreak \chapheadingskip{10000}% + % + \smallfonts \rm + \tolerance = 9500 + \everypar = {}% don't want the \kern\-parindent from indentation suppression. + % + % See if the index file exists and is nonempty. + % Change catcode of @ here so that if the index file contains + % \initial {@} + % as its first line, TeX doesn't complain about mismatched braces + % (because it thinks @} is a control sequence). + \catcode`\@ = 11 + \openin 1 \jobname.#1s + \ifeof 1 + % \enddoublecolumns gets confused if there is no text in the index, + % and it loses the chapter title and the aux file entries for the + % index. The easiest way to prevent this problem is to make sure + % there is some text. + \putwordIndexNonexistent + \else + % + % If the index file exists but is empty, then \openin leaves \ifeof + % false. 
We have to make TeX try to read something from the file, so + % it can discover if there is anything in it. + \read 1 to \temp + \ifeof 1 + \putwordIndexIsEmpty + \else + % Index files are almost Texinfo source, but we use \ as the escape + % character. It would be better to use @, but that's too big a change + % to make right now. + \def\indexbackslash{\backslashcurfont}% + \catcode`\\ = 0 + \escapechar = `\\ + \begindoublecolumns + \input \jobname.#1s + \enddoublecolumns + \fi + \fi + \closein 1 +\endgroup} + +% These macros are used by the sorted index file itself. +% Change them to control the appearance of the index. + +\def\initial#1{{% + % Some minor font changes for the special characters. + \let\tentt=\sectt \let\tt=\sectt \let\sf=\sectt + % + % Remove any glue we may have, we'll be inserting our own. + \removelastskip + % + % We like breaks before the index initials, so insert a bonus. + \penalty -300 + % + % Typeset the initial. Making this add up to a whole number of + % baselineskips increases the chance of the dots lining up from column + % to column. It still won't often be perfect, because of the stretch + % we need before each entry, but it's better. + % + % No shrink because it confuses \balancecolumns. + \vskip 1.67\baselineskip plus .5\baselineskip + \leftline{\secbf #1}% + \vskip .33\baselineskip plus .1\baselineskip + % + % Do our best not to break after the initial. + \nobreak +}} + +% \entry typesets a paragraph consisting of the text (#1), dot leaders, and +% then page number (#2) flushed to the right margin. It is used for index +% and table of contents entries. The paragraph is indented by \leftskip. +% +% A straightforward implementation would start like this: +% \def\entry#1#2{... +% But this freezes the catcodes in the argument, and can cause problems for +% @code, which sets - active. This problem was fixed by a kludge--- +% ``-'' was active throughout the whole index, but this isn't really right.
+% +% The right solution is to prevent \entry from swallowing the whole text. +% --kasal, 21nov03 +\def\entry{% + \begingroup + % + % Start a new paragraph if necessary, so our assignments below can't + % affect previous text. + \par + % + % Do not fill out the last line with white space. + \parfillskip = 0in + % + % No extra space above this paragraph. + \parskip = 0in + % + % Do not prefer a separate line ending with a hyphen to fewer lines. + \finalhyphendemerits = 0 + % + % \hangindent is only relevant when the entry text and page number + % don't both fit on one line. In that case, bob suggests starting the + % dots pretty far over on the line. Unfortunately, a large + % indentation looks wrong when the entry text itself is broken across + % lines. So we use a small indentation and put up with long leaders. + % + % \hangafter is reset to 1 (which is the value we want) at the start + % of each paragraph, so we need not do anything with that. + \hangindent = 2em + % + % When the entry text needs to be broken, just fill out the first line + % with blank space. + \rightskip = 0pt plus1fil + % + % A bit of stretch before each entry for the benefit of balancing + % columns. + \vskip 0pt plus1pt + % + % Swallow the left brace of the text (first parameter): + \afterassignment\doentry + \let\temp = +} +\def\doentry{% + \bgroup % Instead of the swallowed brace. + \noindent + \aftergroup\finishentry + % And now comes the text of the entry. +} +\def\finishentry#1{% + % #1 is the page number. + % + % The following is kludged to not output a line of dots in the index if + % there are no page numbers. The next person who breaks this will be + % cursed by a Unix daemon. + \def\tempa{{\rm }}% + \def\tempb{#1}% + \edef\tempc{\tempa}% + \edef\tempd{\tempb}% + \ifx\tempc\tempd + \ % + \else + % + % If we must, put the page number on a line of its own, and fill out + % this line with blank space. 
(The \hfil is overwhelmed with the + % fill leaders glue in \indexdotfill if the page number does fit.) + \hfil\penalty50 + \null\nobreak\indexdotfill % Have leaders before the page number. + % + % The `\ ' here is removed by the implicit \unskip that TeX does as + % part of (the primitive) \par. Without it, a spurious underfull + % \hbox ensues. + \ifpdf + \pdfgettoks#1.% + \ \the\toksA + \else + \ #1% + \fi + \fi + \par + \endgroup +} + +% Like \dotfill except takes at least 1 em. +\def\indexdotfill{\cleaders + \hbox{$\mathsurround=0pt \mkern1.5mu ${\it .}$ \mkern1.5mu$}\hskip 1em plus 1fill} + +\def\primary #1{\line{#1\hfil}} + +\newskip\secondaryindent \secondaryindent=0.5cm +\def\secondary#1#2{{% + \parfillskip=0in + \parskip=0in + \hangindent=1in + \hangafter=1 + \noindent\hskip\secondaryindent\hbox{#1}\indexdotfill + \ifpdf + \pdfgettoks#2.\ \the\toksA % The page number ends the paragraph. + \else + #2 + \fi + \par +}} + +% Define two-column mode, which we use to typeset indexes. +% Adapted from the TeXbook, page 416, which is to say, +% the manmac.tex format used to print the TeXbook itself. +\catcode`\@=11 + +\newbox\partialpage +\newdimen\doublecolumnhsize + +\def\begindoublecolumns{\begingroup % ended by \enddoublecolumns + % Grab any single-column material above us. + \output = {% + % + % Here is a possibility not foreseen in manmac: if we accumulate a + % whole lot of material, we might end up calling this \output + % routine twice in a row (see the doublecol-lose test, which is + % essentially a couple of indexes with @setchapternewpage off). In + % that case we just ship out what is in \partialpage with the normal + % output routine. Generally, \partialpage will be empty when this + % runs and this will be a no-op. See the indexspread.tex test case. + \ifvoid\partialpage \else + \onepageout{\pagecontents\partialpage}% + \fi + % + \global\setbox\partialpage = \vbox{% + % Unvbox the main output page. 
+ \unvbox\PAGE + \kern-\topskip \kern\baselineskip + }% + }% + \eject % run that output routine to set \partialpage + % + % Use the double-column output routine for subsequent pages. + \output = {\doublecolumnout}% + % + % Change the page size parameters. We could do this once outside this + % routine, in each of @smallbook, @afourpaper, and the default 8.5x11 + % format, but then we repeat the same computation. Repeating a couple + % of assignments once per index is clearly meaningless for the + % execution time, so we may as well do it in one place. + % + % First we halve the line length, less a little for the gutter between + % the columns. We compute the gutter based on the line length, so it + % changes automatically with the paper format. The magic constant + % below is chosen so that the gutter has the same value (well, +-<1pt) + % as it did when we hard-coded it. + % + % We put the result in a separate register, \doublecolumnhsize, so we + % can restore it in \pagesofar, after \hsize itself has (potentially) + % been clobbered. + % + \doublecolumnhsize = \hsize + \advance\doublecolumnhsize by -.04154\hsize + \divide\doublecolumnhsize by 2 + \hsize = \doublecolumnhsize + % + % Double the \vsize as well. (We don't need a separate register here, + % since nobody clobbers \vsize.) + \vsize = 2\vsize +} + +% The double-column output routine for all double-column pages except +% the last. +% +\def\doublecolumnout{% + \splittopskip=\topskip \splitmaxdepth=\maxdepth + % Get the available space for the double columns -- the normal + % (undoubled) page height minus any material left over from the + % previous page. + \dimen@ = \vsize + \divide\dimen@ by 2 + \advance\dimen@ by -\ht\partialpage + % + % box0 will be the left-hand column, box2 the right.
+ \setbox0=\vsplit255 to\dimen@ \setbox2=\vsplit255 to\dimen@ + \onepageout\pagesofar + \unvbox255 + \penalty\outputpenalty +} +% +% Re-output the contents of the output page -- any previous material, +% followed by the two boxes we just split, in box0 and box2. +\def\pagesofar{% + \unvbox\partialpage + % + \hsize = \doublecolumnhsize + \wd0=\hsize \wd2=\hsize + \hbox to\pagewidth{\box0\hfil\box2}% +} +% +% All done with double columns. +\def\enddoublecolumns{% + \output = {% + % Split the last of the double-column material. Leave it on the + % current page, no automatic page break. + \balancecolumns + % + % If we end up splitting too much material for the current page, + % though, there will be another page break right after this \output + % invocation ends. Having called \balancecolumns once, we do not + % want to call it again. Therefore, reset \output to its normal + % definition right away. (We hope \balancecolumns will never be + % called on to balance too much material, but if it is, this makes + % the output somewhat more palatable.) + \global\output = {\onepageout{\pagecontents\PAGE}}% + }% + \eject + \endgroup % started in \begindoublecolumns + % + % \pagegoal was set to the doubled \vsize above, since we restarted + % the current page. We're now back to normal single-column + % typesetting, so reset \pagegoal to the normal \vsize (after the + % \endgroup where \vsize got restored). + \pagegoal = \vsize +} +% +% Called at the end of the double column material. +\def\balancecolumns{% + \setbox0 = \vbox{\unvbox255}% like \box255 but more efficient, see p.120. + \dimen@ = \ht0 + \advance\dimen@ by \topskip + \advance\dimen@ by-\baselineskip + \divide\dimen@ by 2 % target to split to + %debug\message{final 2-column material height=\the\ht0, target=\the\dimen@.}% + \splittopskip = \topskip + % Loop until we get a decent breakpoint. 
+ {% + \vbadness = 10000 + \loop + \global\setbox3 = \copy0 + \global\setbox1 = \vsplit3 to \dimen@ + \ifdim\ht3>\dimen@ + \global\advance\dimen@ by 1pt + \repeat + }% + %debug\message{split to \the\dimen@, column heights: \the\ht1, \the\ht3.}% + \setbox0=\vbox to\dimen@{\unvbox1}% + \setbox2=\vbox to\dimen@{\unvbox3}% + % + \pagesofar +} +\catcode`\@ = \other + + +\message{sectioning,} +% Chapters, sections, etc. + +% \unnumberedno is an oxymoron, of course. But we count the unnumbered +% sections so that we can refer to them unambiguously in the pdf +% outlines by their "section number". We avoid collisions with chapter +% numbers by starting them at 10000. (If a document ever has 10000 +% chapters, we're in trouble anyway, I'm sure.) +\newcount\unnumberedno \unnumberedno = 10000 +\newcount\chapno +\newcount\secno \secno=0 +\newcount\subsecno \subsecno=0 +\newcount\subsubsecno \subsubsecno=0 + +% This counter is funny since it counts through charcodes of letters A, B, ... +\newcount\appendixno \appendixno = `\@ +% +% \def\appendixletter{\char\the\appendixno} +% We do the following ugly conditional instead of the above simple +% construct for the sake of pdftex, which needs the actual +% letter in the expansion, not just typeset. 
+% +\def\appendixletter{% + \ifnum\appendixno=`A A% + \else\ifnum\appendixno=`B B% + \else\ifnum\appendixno=`C C% + \else\ifnum\appendixno=`D D% + \else\ifnum\appendixno=`E E% + \else\ifnum\appendixno=`F F% + \else\ifnum\appendixno=`G G% + \else\ifnum\appendixno=`H H% + \else\ifnum\appendixno=`I I% + \else\ifnum\appendixno=`J J% + \else\ifnum\appendixno=`K K% + \else\ifnum\appendixno=`L L% + \else\ifnum\appendixno=`M M% + \else\ifnum\appendixno=`N N% + \else\ifnum\appendixno=`O O% + \else\ifnum\appendixno=`P P% + \else\ifnum\appendixno=`Q Q% + \else\ifnum\appendixno=`R R% + \else\ifnum\appendixno=`S S% + \else\ifnum\appendixno=`T T% + \else\ifnum\appendixno=`U U% + \else\ifnum\appendixno=`V V% + \else\ifnum\appendixno=`W W% + \else\ifnum\appendixno=`X X% + \else\ifnum\appendixno=`Y Y% + \else\ifnum\appendixno=`Z Z% + % The \the is necessary, despite appearances, because \appendixletter is + % expanded while writing the .toc file. \char\appendixno is not + % expandable, thus it is written literally, thus all appendixes come out + % with the same letter (or @) in the toc without it. + \else\char\the\appendixno + \fi\fi\fi\fi\fi\fi\fi\fi\fi\fi\fi\fi\fi + \fi\fi\fi\fi\fi\fi\fi\fi\fi\fi\fi\fi\fi} + +% Each @chapter defines this as the name of the chapter. +% page headings and footings can use it. @section does likewise. +% However, they are not reliable, because we don't use marks. +\def\thischapter{} +\def\thissection{} + +\newcount\absseclevel % used to calculate proper heading level +\newcount\secbase\secbase=0 % @raisesections/@lowersections modify this count + +% @raisesections: treat @section as chapter, @subsection as section, etc. +\def\raisesections{\global\advance\secbase by -1} +\let\up=\raisesections % original BFox name + +% @lowersections: treat @chapter as section, @section as subsection, etc. +\def\lowersections{\global\advance\secbase by 1} +\let\down=\lowersections % original BFox name + +% we only have subsub. 
+\chardef\maxseclevel = 3 +% +% A numbered section within an unnumbered changes to unnumbered too. +% To achieve this, remember the "biggest" unnum. sec. we are currently in: +\chardef\unmlevel = \maxseclevel +% +% Trace whether the current chapter is an appendix or not: +% \chapheadtype is "N" or "A", unnumbered chapters are ignored. +\def\chapheadtype{N} + +% Choose a heading macro +% #1 is heading type +% #2 is heading level +% #3 is text for heading +\def\genhead#1#2#3{% + % Compute the abs. sec. level: + \absseclevel=#2 + \advance\absseclevel by \secbase + % Make sure \absseclevel doesn't fall outside the range: + \ifnum \absseclevel < 0 + \absseclevel = 0 + \else + \ifnum \absseclevel > 3 + \absseclevel = 3 + \fi + \fi + % The heading type: + \def\headtype{#1}% + \if \headtype U% + \ifnum \absseclevel < \unmlevel + \chardef\unmlevel = \absseclevel + \fi + \else + % Check for appendix sections: + \ifnum \absseclevel = 0 + \edef\chapheadtype{\headtype}% + \else + \if \headtype A\if \chapheadtype N% + \errmessage{@appendix... within a non-appendix chapter}% + \fi\fi + \fi + % Check for numbered within unnumbered: + \ifnum \absseclevel > \unmlevel + \def\headtype{U}% + \else + \chardef\unmlevel = 3 + \fi + \fi + % Now print the heading: + \if \headtype U% + \ifcase\absseclevel + \unnumberedzzz{#3}% + \or \unnumberedseczzz{#3}% + \or \unnumberedsubseczzz{#3}% + \or \unnumberedsubsubseczzz{#3}% + \fi + \else + \if \headtype A% + \ifcase\absseclevel + \appendixzzz{#3}% + \or \appendixsectionzzz{#3}% + \or \appendixsubseczzz{#3}% + \or \appendixsubsubseczzz{#3}% + \fi + \else + \ifcase\absseclevel + \chapterzzz{#3}% + \or \seczzz{#3}% + \or \numberedsubseczzz{#3}% + \or \numberedsubsubseczzz{#3}% + \fi + \fi + \fi + \suppressfirstparagraphindent +} + +% an interface: +\def\numhead{\genhead N} +\def\apphead{\genhead A} +\def\unnmhead{\genhead U} + +% @chapter, @appendix, @unnumbered. Increment top-level counter, reset +% all lower-level sectioning counters to zero.
+% +% Also set \chaplevelprefix, which we prepend to @float sequence numbers +% (e.g., figures), q.v. By default (before any chapter), that is empty. +\let\chaplevelprefix = \empty +% +\outer\parseargdef\chapter{\numhead0{#1}} % normally numhead0 calls chapterzzz +\def\chapterzzz#1{% + % section resetting is \global in case the chapter is in a group, such + % as an @include file. + \global\secno=0 \global\subsecno=0 \global\subsubsecno=0 + \global\advance\chapno by 1 + % + % Used for \float. + \gdef\chaplevelprefix{\the\chapno.}% + \resetallfloatnos + % + \message{\putwordChapter\space \the\chapno}% + % + % Write the actual heading. + \chapmacro{#1}{Ynumbered}{\the\chapno}% + % + % So @section and the like are numbered underneath this chapter. + \global\let\section = \numberedsec + \global\let\subsection = \numberedsubsec + \global\let\subsubsection = \numberedsubsubsec +} + +\outer\parseargdef\appendix{\apphead0{#1}} % normally apphead0 calls appendixzzz +\def\appendixzzz#1{% + \global\secno=0 \global\subsecno=0 \global\subsubsecno=0 + \global\advance\appendixno by 1 + \gdef\chaplevelprefix{\appendixletter.}% + \resetallfloatnos + % + \def\appendixnum{\putwordAppendix\space \appendixletter}% + \message{\appendixnum}% + % + \chapmacro{#1}{Yappendix}{\appendixletter}% + % + \global\let\section = \appendixsec + \global\let\subsection = \appendixsubsec + \global\let\subsubsection = \appendixsubsubsec +} + +\outer\parseargdef\unnumbered{\unnmhead0{#1}} % normally unnmhead0 calls unnumberedzzz +\def\unnumberedzzz#1{% + \global\secno=0 \global\subsecno=0 \global\subsubsecno=0 + \global\advance\unnumberedno by 1 + % + % Since an unnumbered has no number, no prefix for figures. + \global\let\chaplevelprefix = \empty + \resetallfloatnos + % + % This used to be simply \message{#1}, but TeX fully expands the + % argument to \message. Therefore, if #1 contained @-commands, TeX + % expanded them. 
For example, in `@unnumbered The @cite{Book}', TeX + % expanded @cite (which turns out to cause errors because \cite is meant + % to be executed, not expanded). + % + % Anyway, we don't want the fully-expanded definition of @cite to appear + % as a result of the \message, we just want `@cite' itself. We use + % \the<toks register> to achieve this: TeX expands \the<toks> only once, + % simply yielding the contents of <toks register>. (We also do this for + % the toc entries.) + \toks0 = {#1}% + \message{(\the\toks0)}% + % + \chapmacro{#1}{Ynothing}{\the\unnumberedno}% + % + \global\let\section = \unnumberedsec + \global\let\subsection = \unnumberedsubsec + \global\let\subsubsection = \unnumberedsubsubsec +} + +% @centerchap is like @unnumbered, but the heading is centered. +\outer\parseargdef\centerchap{% + % Well, we could do the following in a group, but that would break + % an assumption that \chapmacro is called at the outermost level. + % Thus we are safer this way: --kasal, 24feb04 + \let\centerparametersmaybe = \centerparameters + \unnmhead0{#1}% + \let\centerparametersmaybe = \relax +} + +% @top is like @unnumbered. +\let\top\unnumbered + +% Sections. +\outer\parseargdef\numberedsec{\numhead1{#1}} % normally calls seczzz +\def\seczzz#1{% + \global\subsecno=0 \global\subsubsecno=0 \global\advance\secno by 1 + \sectionheading{#1}{sec}{Ynumbered}{\the\chapno.\the\secno}% +} + +\outer\parseargdef\appendixsection{\apphead1{#1}} % normally calls appendixsectionzzz +\def\appendixsectionzzz#1{% + \global\subsecno=0 \global\subsubsecno=0 \global\advance\secno by 1 + \sectionheading{#1}{sec}{Yappendix}{\appendixletter.\the\secno}% +} +\let\appendixsec\appendixsection + +\outer\parseargdef\unnumberedsec{\unnmhead1{#1}} % normally calls unnumberedseczzz +\def\unnumberedseczzz#1{% + \global\subsecno=0 \global\subsubsecno=0 \global\advance\secno by 1 + \sectionheading{#1}{sec}{Ynothing}{\the\unnumberedno.\the\secno}% +} + +% Subsections.
+\outer\parseargdef\numberedsubsec{\numhead2{#1}} % normally calls numberedsubseczzz +\def\numberedsubseczzz#1{% + \global\subsubsecno=0 \global\advance\subsecno by 1 + \sectionheading{#1}{subsec}{Ynumbered}{\the\chapno.\the\secno.\the\subsecno}% +} + +\outer\parseargdef\appendixsubsec{\apphead2{#1}} % normally calls appendixsubseczzz +\def\appendixsubseczzz#1{% + \global\subsubsecno=0 \global\advance\subsecno by 1 + \sectionheading{#1}{subsec}{Yappendix}% + {\appendixletter.\the\secno.\the\subsecno}% +} + +\outer\parseargdef\unnumberedsubsec{\unnmhead2{#1}} %normally calls unnumberedsubseczzz +\def\unnumberedsubseczzz#1{% + \global\subsubsecno=0 \global\advance\subsecno by 1 + \sectionheading{#1}{subsec}{Ynothing}% + {\the\unnumberedno.\the\secno.\the\subsecno}% +} + +% Subsubsections. +\outer\parseargdef\numberedsubsubsec{\numhead3{#1}} % normally numberedsubsubseczzz +\def\numberedsubsubseczzz#1{% + \global\advance\subsubsecno by 1 + \sectionheading{#1}{subsubsec}{Ynumbered}% + {\the\chapno.\the\secno.\the\subsecno.\the\subsubsecno}% +} + +\outer\parseargdef\appendixsubsubsec{\apphead3{#1}} % normally appendixsubsubseczzz +\def\appendixsubsubseczzz#1{% + \global\advance\subsubsecno by 1 + \sectionheading{#1}{subsubsec}{Yappendix}% + {\appendixletter.\the\secno.\the\subsecno.\the\subsubsecno}% +} + +\outer\parseargdef\unnumberedsubsubsec{\unnmhead3{#1}} %normally unnumberedsubsubseczzz +\def\unnumberedsubsubseczzz#1{% + \global\advance\subsubsecno by 1 + \sectionheading{#1}{subsubsec}{Ynothing}% + {\the\unnumberedno.\the\secno.\the\subsecno.\the\subsubsecno}% +} + +% These macros control what the section commands do, according +% to what kind of chapter we are in (ordinary, appendix, or unnumbered). +% Define them by default for a numbered chapter. 
+\let\section = \numberedsec +\let\subsection = \numberedsubsec +\let\subsubsection = \numberedsubsubsec + +% Define @majorheading, @heading and @subheading + +% NOTE on use of \vbox for chapter headings, section headings, and such: +% 1) We use \vbox rather than the earlier \line to permit +% overlong headings to fold. +% 2) \hyphenpenalty is set to 10000 because hyphenation in a +% heading is obnoxious; this forbids it. +% 3) Likewise, headings look best if no \parindent is used, and +% if justification is not attempted. Hence \raggedright. + + +\def\majorheading{% + {\advance\chapheadingskip by 10pt \chapbreak }% + \parsearg\chapheadingzzz +} + +\def\chapheading{\chapbreak \parsearg\chapheadingzzz} +\def\chapheadingzzz#1{% + {\chapfonts \vbox{\hyphenpenalty=10000\tolerance=5000 + \parindent=0pt\raggedright + \rm #1\hfill}}% + \bigskip \par\penalty 200\relax + \suppressfirstparagraphindent +} + +% @heading, @subheading, @subsubheading. +\parseargdef\heading{\sectionheading{#1}{sec}{Yomitfromtoc}{} + \suppressfirstparagraphindent} +\parseargdef\subheading{\sectionheading{#1}{subsec}{Yomitfromtoc}{} + \suppressfirstparagraphindent} +\parseargdef\subsubheading{\sectionheading{#1}{subsubsec}{Yomitfromtoc}{} + \suppressfirstparagraphindent} + +% These macros generate a chapter, section, etc. heading only +% (including whitespace, linebreaking, etc. around it), +% given all the information in convenient, parsed form. 
+ +%%% Args are the skip and penalty (usually negative) +\def\dobreak#1#2{\par\ifdim\lastskip<#1\removelastskip\penalty#2\vskip#1\fi} + +%%% Define plain chapter starts, and page on/off switching for it +% Parameter controlling skip before chapter headings (if needed) + +\newskip\chapheadingskip + +\def\chapbreak{\dobreak \chapheadingskip {-4000}} +\def\chappager{\par\vfill\supereject} +\def\chapoddpage{\chappager \ifodd\pageno \else \hbox to 0pt{} \chappager\fi} + +\def\setchapternewpage #1 {\csname CHAPPAG#1\endcsname} + +\def\CHAPPAGoff{% +\global\let\contentsalignmacro = \chappager +\global\let\pchapsepmacro=\chapbreak +\global\let\pagealignmacro=\chappager} + +\def\CHAPPAGon{% +\global\let\contentsalignmacro = \chappager +\global\let\pchapsepmacro=\chappager +\global\let\pagealignmacro=\chappager +\global\def\HEADINGSon{\HEADINGSsingle}} + +\def\CHAPPAGodd{% +\global\let\contentsalignmacro = \chapoddpage +\global\let\pchapsepmacro=\chapoddpage +\global\let\pagealignmacro=\chapoddpage +\global\def\HEADINGSon{\HEADINGSdouble}} + +\CHAPPAGon + +% Chapter opening. +% +% #1 is the text, #2 is the section type (Ynumbered, Ynothing, +% Yappendix, Yomitfromtoc), #3 the chapter number. +% +% To test against our argument. +\def\Ynothingkeyword{Ynothing} +\def\Yomitfromtockeyword{Yomitfromtoc} +\def\Yappendixkeyword{Yappendix} +% +\def\chapmacro#1#2#3{% + \pchapsepmacro + {% + \chapfonts \rm + % + % Have to define \thissection before calling \donoderef, because the + % xref code eventually uses it. On the other hand, it has to be called + % after \pchapsepmacro, or the headline will change too soon. + \gdef\thissection{#1}% + \gdef\thischaptername{#1}% + % + % Only insert the separating space if we have a chapter/appendix + % number, and don't print the unnumbered ``number''. 
+ \def\temptype{#2}% + \ifx\temptype\Ynothingkeyword + \setbox0 = \hbox{}% + \def\toctype{unnchap}% + \def\thischapter{#1}% + \else\ifx\temptype\Yomitfromtockeyword + \setbox0 = \hbox{}% contents like unnumbered, but no toc entry + \def\toctype{omit}% + \xdef\thischapter{}% + \else\ifx\temptype\Yappendixkeyword + \setbox0 = \hbox{\putwordAppendix{} #3\enspace}% + \def\toctype{app}% + % We don't substitute the actual chapter name into \thischapter + % because we don't want its macros evaluated now. And we don't + % use \thissection because that changes with each section. + % + \xdef\thischapter{\putwordAppendix{} \appendixletter: + \noexpand\thischaptername}% + \else + \setbox0 = \hbox{#3\enspace}% + \def\toctype{numchap}% + \xdef\thischapter{\putwordChapter{} \the\chapno: + \noexpand\thischaptername}% + \fi\fi\fi + % + % Write the toc entry for this chapter. Must come before the + % \donoderef, because we include the current node name in the toc + % entry, and \donoderef resets it to empty. + \writetocentry{\toctype}{#1}{#3}% + % + % For pdftex, we have to write out the node definition (aka, make + % the pdfdest) after any page break, but before the actual text has + % been typeset. If the destination for the pdf outline is after the + % text, then jumping from the outline may wind up with the text not + % being visible, for instance under high magnification. + \donoderef{#2}% + % + % Typeset the actual heading. + \vbox{\hyphenpenalty=10000 \tolerance=5000 \parindent=0pt \raggedright + \hangindent=\wd0 \centerparametersmaybe + \unhbox0 #1\par}% + }% + \nobreak\bigskip % no page break after a chapter title + \nobreak +} + +% @centerchap -- centered and unnumbered. +\let\centerparametersmaybe = \relax +\def\centerparameters{% + \advance\rightskip by 3\rightskip + \leftskip = \rightskip + \parfillskip = 0pt +} + + +% I don't think this chapter style is supported any more, so I'm not +% updating it with the new noderef stuff. We'll see. --karl, 11aug03. 
+% +\def\setchapterstyle #1 {\csname CHAPF#1\endcsname} +% +\def\unnchfopen #1{% +\chapoddpage {\chapfonts \vbox{\hyphenpenalty=10000\tolerance=5000 + \parindent=0pt\raggedright + \rm #1\hfill}}\bigskip \par\nobreak +} +\def\chfopen #1#2{\chapoddpage {\chapfonts +\vbox to 3in{\vfil \hbox to\hsize{\hfil #2} \hbox to\hsize{\hfil #1} \vfil}}% +\par\penalty 5000 % +} +\def\centerchfopen #1{% +\chapoddpage {\chapfonts \vbox{\hyphenpenalty=10000\tolerance=5000 + \parindent=0pt + \hfill {\rm #1}\hfill}}\bigskip \par\nobreak +} +\def\CHAPFopen{% + \global\let\chapmacro=\chfopen + \global\let\centerchapmacro=\centerchfopen} + + +% Section titles. These macros combine the section number parts and +% call the generic \sectionheading to do the printing. +% +\newskip\secheadingskip +\def\secheadingbreak{\dobreak \secheadingskip{-1000}} + +% Subsection titles. +\newskip\subsecheadingskip +\def\subsecheadingbreak{\dobreak \subsecheadingskip{-500}} + +% Subsubsection titles. +\def\subsubsecheadingskip{\subsecheadingskip} +\def\subsubsecheadingbreak{\subsecheadingbreak} + + +% Print any size, any type, section title. +% +% #1 is the text, #2 is the section level (sec/subsec/subsubsec), #3 is +% the section type for xrefs (Ynumbered, Ynothing, Yappendix), #4 is the +% section number. +% +\def\sectionheading#1#2#3#4{% + {% + % Switch to the right set of fonts. + \csname #2fonts\endcsname \rm + % + % Insert space above the heading. + \csname #2headingbreak\endcsname + % + % Only insert the space after the number if we have a section number. + \def\sectionlevel{#2}% + \def\temptype{#3}% + % + \ifx\temptype\Ynothingkeyword + \setbox0 = \hbox{}% + \def\toctype{unn}% + \gdef\thissection{#1}% + \else\ifx\temptype\Yomitfromtockeyword + % for @headings -- no section number, don't include in toc, + % and don't redefine \thissection. 
+ \setbox0 = \hbox{}% + \def\toctype{omit}% + \let\sectionlevel=\empty + \else\ifx\temptype\Yappendixkeyword + \setbox0 = \hbox{#4\enspace}% + \def\toctype{app}% + \gdef\thissection{#1}% + \else + \setbox0 = \hbox{#4\enspace}% + \def\toctype{num}% + \gdef\thissection{#1}% + \fi\fi\fi + % + % Write the toc entry (before \donoderef). See comments in \chfplain. + \writetocentry{\toctype\sectionlevel}{#1}{#4}% + % + % Write the node reference (= pdf destination for pdftex). + % Again, see comments in \chfplain. + \donoderef{#3}% + % + % Output the actual section heading. + \vbox{\hyphenpenalty=10000 \tolerance=5000 \parindent=0pt \raggedright + \hangindent=\wd0 % zero if no section number + \unhbox0 #1}% + }% + % Add extra space after the heading -- half of whatever came above it. + % Don't allow stretch, though. + \kern .5 \csname #2headingskip\endcsname + % + % Do not let the kern be a potential breakpoint, as it would be if it + % was followed by glue. + \nobreak + % + % We'll almost certainly start a paragraph next, so don't let that + % glue accumulate. (Not a breakpoint because it's preceded by a + % discardable item.) + \vskip-\parskip + % + % This \nobreak is purely so the last item on the list is a \penalty + % of 10000. This is so other code, for instance \parsebodycommon, can + % check for and avoid allowing breakpoints. Otherwise, it would + % insert a valid breakpoint between: + % @section sec-whatever + % @deffn def-whatever + \nobreak +} + + +\message{toc,} +% Table of contents. +\newwrite\tocfile + +% Write an entry to the toc file, opening it if necessary. +% Called from @chapter, etc. +% +% Example usage: \writetocentry{sec}{Section Name}{\the\chapno.\the\secno} +% We append the current node name (if any) and page number as additional +% arguments for the \{chap,sec,...}entry macros which will eventually +% read this. The node name is used in the pdf outlines as the +% destination to jump to. 
+% +% We open the .toc file for writing here instead of at @setfilename (or +% any other fixed time) so that @contents can be anywhere in the document. +% But if #1 is `omit', then we don't do anything. This is used for the +% table of contents chapter openings themselves. +% +\newif\iftocfileopened +\def\omitkeyword{omit}% +% +\def\writetocentry#1#2#3{% + \edef\writetoctype{#1}% + \ifx\writetoctype\omitkeyword \else + \iftocfileopened\else + \immediate\openout\tocfile = \jobname.toc + \global\tocfileopenedtrue + \fi + % + \iflinks + \toks0 = {#2}% + \toks2 = \expandafter{\lastnode}% + \edef\temp{\write\tocfile{\realbackslash #1entry{\the\toks0}{#3}% + {\the\toks2}{\noexpand\folio}}}% + \temp + \fi + \fi + % + % Tell \shipout to create a pdf destination on each page, if we're + % writing pdf. These are used in the table of contents. We can't + % just write one on every page because the title pages are numbered + % 1 and 2 (the page numbers aren't printed), and so are the first + % two pages of the document. Thus, we'd have two destinations named + % `1', and two named `2'. + \ifpdf \global\pdfmakepagedesttrue \fi +} + +\newskip\contentsrightmargin \contentsrightmargin=1in +\newcount\savepageno +\newcount\lastnegativepageno \lastnegativepageno = -1 + +% Prepare to read what we've written to \tocfile. +% +\def\startcontents#1{% + % If @setchapternewpage on, and @headings double, the contents should + % start on an odd page, unlike chapters. Thus, we maintain + % \contentsalignmacro in parallel with \pagealignmacro. + % From: Torbjorn Granlund + \contentsalignmacro + \immediate\closeout\tocfile + % + % Don't need to put `Contents' or `Short Contents' in the headline. + % It is abundantly clear what they are. + \def\thischapter{}% + \chapmacro{#1}{Yomitfromtoc}{}% + % + \savepageno = \pageno + \begingroup % Set up to handle contents files properly. 
+ \catcode`\\=0 \catcode`\{=1 \catcode`\}=2 \catcode`\@=11 + % We can't do this, because then an actual ^ in a section + % title fails, e.g., @chapter ^ -- exponentiation. --karl, 9jul97. + %\catcode`\^=7 % to see ^^e4 as \"a etc. juha@piuha.ydi.vtt.fi + \raggedbottom % Worry more about breakpoints than the bottom. + \advance\hsize by -\contentsrightmargin % Don't use the full line length. + % + % Roman numerals for page numbers. + \ifnum \pageno>0 \global\pageno = \lastnegativepageno \fi +} + + +% Normal (long) toc. +\def\contents{% + \startcontents{\putwordTOC}% + \openin 1 \jobname.toc + \ifeof 1 \else + \input \jobname.toc + \fi + \vfill \eject + \contentsalignmacro % in case @setchapternewpage odd is in effect + \ifeof 1 \else + \pdfmakeoutlines + \fi + \closein 1 + \endgroup + \lastnegativepageno = \pageno + \global\pageno = \savepageno +} + +% And just the chapters. +\def\summarycontents{% + \startcontents{\putwordShortTOC}% + % + \let\numchapentry = \shortchapentry + \let\appentry = \shortchapentry + \let\unnchapentry = \shortunnchapentry + % We want a true roman here for the page numbers. + \secfonts + \let\rm=\shortcontrm \let\bf=\shortcontbf + \let\sl=\shortcontsl \let\tt=\shortconttt + \rm + \hyphenpenalty = 10000 + \advance\baselineskip by 1pt % Open it up a little. + \def\numsecentry##1##2##3##4{} + \let\appsecentry = \numsecentry + \let\unnsecentry = \numsecentry + \let\numsubsecentry = \numsecentry + \let\appsubsecentry = \numsecentry + \let\unnsubsecentry = \numsecentry + \let\numsubsubsecentry = \numsecentry + \let\appsubsubsecentry = \numsecentry + \let\unnsubsubsecentry = \numsecentry + \openin 1 \jobname.toc + \ifeof 1 \else + \input \jobname.toc + \fi + \closein 1 + \vfill \eject + \contentsalignmacro % in case @setchapternewpage odd is in effect + \endgroup + \lastnegativepageno = \pageno + \global\pageno = \savepageno +} +\let\shortcontents = \summarycontents + +% Typeset the label for a chapter or appendix for the short contents. 
+% The arg is, e.g., `A' for an appendix, or `3' for a chapter. +% +\def\shortchaplabel#1{% + % This space should be enough, since a single number is .5em, and the + % widest letter (M) is 1em, at least in the Computer Modern fonts. + % But use \hss just in case. + % (This space doesn't include the extra space that gets added after + % the label; that gets put in by \shortchapentry above.) + % + % We'd like to right-justify chapter numbers, but that looks strange + % with appendix letters. And right-justifying numbers and + % left-justifying letters looks strange when there is less than 10 + % chapters. Have to read the whole toc once to know how many chapters + % there are before deciding ... + \hbox to 1em{#1\hss}% +} + +% These macros generate individual entries in the table of contents. +% The first argument is the chapter or section name. +% The last argument is the page number. +% The arguments in between are the chapter number, section number, ... + +% Chapters, in the main contents. +\def\numchapentry#1#2#3#4{\dochapentry{#2\labelspace#1}{#4}} +% +% Chapters, in the short toc. +% See comments in \dochapentry re vbox and related settings. +\def\shortchapentry#1#2#3#4{% + \tocentry{\shortchaplabel{#2}\labelspace #1}{\doshortpageno\bgroup#4\egroup}% +} + +% Appendices, in the main contents. +% Need the word Appendix, and a fixed-size box. +% +\def\appendixbox#1{% + % We use M since it's probably the widest letter. + \setbox0 = \hbox{\putwordAppendix{} M}% + \hbox to \wd0{\putwordAppendix{} #1\hss}} +% +\def\appentry#1#2#3#4{\dochapentry{\appendixbox{#2}\labelspace#1}{#4}} + +% Unnumbered chapters. +\def\unnchapentry#1#2#3#4{\dochapentry{#1}{#4}} +\def\shortunnchapentry#1#2#3#4{\tocentry{#1}{\doshortpageno\bgroup#4\egroup}} + +% Sections. +\def\numsecentry#1#2#3#4{\dosecentry{#2\labelspace#1}{#4}} +\let\appsecentry=\numsecentry +\def\unnsecentry#1#2#3#4{\dosecentry{#1}{#4}} + +% Subsections. 
+\def\numsubsecentry#1#2#3#4{\dosubsecentry{#2\labelspace#1}{#4}} +\let\appsubsecentry=\numsubsecentry +\def\unnsubsecentry#1#2#3#4{\dosubsecentry{#1}{#4}} + +% And subsubsections. +\def\numsubsubsecentry#1#2#3#4{\dosubsubsecentry{#2\labelspace#1}{#4}} +\let\appsubsubsecentry=\numsubsubsecentry +\def\unnsubsubsecentry#1#2#3#4{\dosubsubsecentry{#1}{#4}} + +% This parameter controls the indentation of the various levels. +% Same as \defaultparindent. +\newdimen\tocindent \tocindent = 15pt + +% Now for the actual typesetting. In all these, #1 is the text and #2 is the +% page number. +% +% If the toc has to be broken over pages, we want it to be at chapters +% if at all possible; hence the \penalty. +\def\dochapentry#1#2{% + \penalty-300 \vskip1\baselineskip plus.33\baselineskip minus.25\baselineskip + \begingroup + \chapentryfonts + \tocentry{#1}{\dopageno\bgroup#2\egroup}% + \endgroup + \nobreak\vskip .25\baselineskip plus.1\baselineskip +} + +\def\dosecentry#1#2{\begingroup + \secentryfonts \leftskip=\tocindent + \tocentry{#1}{\dopageno\bgroup#2\egroup}% +\endgroup} + +\def\dosubsecentry#1#2{\begingroup + \subsecentryfonts \leftskip=2\tocindent + \tocentry{#1}{\dopageno\bgroup#2\egroup}% +\endgroup} + +\def\dosubsubsecentry#1#2{\begingroup + \subsubsecentryfonts \leftskip=3\tocindent + \tocentry{#1}{\dopageno\bgroup#2\egroup}% +\endgroup} + +% We use the same \entry macro as for the index entries. +\let\tocentry = \entry + +% Space between chapter (or whatever) number and the title. +\def\labelspace{\hskip1em \relax} + +\def\dopageno#1{{\rm #1}} +\def\doshortpageno#1{{\rm #1}} + +\def\chapentryfonts{\secfonts \rm} +\def\secentryfonts{\textfonts} +\def\subsecentryfonts{\textfonts} +\def\subsubsecentryfonts{\textfonts} + + +\message{environments,} +% @foo ... @end foo. + +% @point{}, @result{}, @expansion{}, @print{}, @equiv{}. +% +% Since these characters are used in examples, it should be an even number of +% \tt widths. 
Each \tt character is 1en, so two makes it 1em. +% +\def\point{$\star$} +\def\result{\leavevmode\raise.15ex\hbox to 1em{\hfil$\Rightarrow$\hfil}} +\def\expansion{\leavevmode\raise.1ex\hbox to 1em{\hfil$\mapsto$\hfil}} +\def\print{\leavevmode\lower.1ex\hbox to 1em{\hfil$\dashv$\hfil}} +\def\equiv{\leavevmode\lower.1ex\hbox to 1em{\hfil$\ptexequiv$\hfil}} + +% The @error{} command. +% Adapted from the TeXbook's \boxit. +% +\newbox\errorbox +% +{\tentt \global\dimen0 = 3em}% Width of the box. +\dimen2 = .55pt % Thickness of rules +% The text. (`r' is open on the right, `e' somewhat less so on the left.) +\setbox0 = \hbox{\kern-.75pt \tensf error\kern-1.5pt} +% +\setbox\errorbox=\hbox to \dimen0{\hfil + \hsize = \dimen0 \advance\hsize by -5.8pt % Space to left+right. + \advance\hsize by -2\dimen2 % Rules. + \vbox{% + \hrule height\dimen2 + \hbox{\vrule width\dimen2 \kern3pt % Space to left of text. + \vtop{\kern2.4pt \box0 \kern2.4pt}% Space above/below. + \kern3pt\vrule width\dimen2}% Space to right. + \hrule height\dimen2} + \hfil} +% +\def\error{\leavevmode\lower.7ex\copy\errorbox} + +% @tex ... @end tex escapes into raw Tex temporarily. +% One exception: @ is still an escape character, so that @end tex works. +% But \@ or @@ will get a plain tex @ character. 
+ +\envdef\tex{% + \catcode `\\=0 \catcode `\{=1 \catcode `\}=2 + \catcode `\$=3 \catcode `\&=4 \catcode `\#=6 + \catcode `\^=7 \catcode `\_=8 \catcode `\~=\active \let~=\tie + \catcode `\%=14 + \catcode `\+=\other + \catcode `\"=\other + \catcode `\|=\other + \catcode `\<=\other + \catcode `\>=\other + \escapechar=`\\ + % + \let\b=\ptexb + \let\bullet=\ptexbullet + \let\c=\ptexc + \let\,=\ptexcomma + \let\.=\ptexdot + \let\dots=\ptexdots + \let\equiv=\ptexequiv + \let\!=\ptexexclam + \let\i=\ptexi + \let\indent=\ptexindent + \let\noindent=\ptexnoindent + \let\{=\ptexlbrace + \let\+=\tabalign + \let\}=\ptexrbrace + \let\/=\ptexslash + \let\*=\ptexstar + \let\t=\ptext + % + \def\endldots{\mathinner{\ldots\ldots\ldots\ldots}}% + \def\enddots{\relax\ifmmode\endldots\else$\mathsurround=0pt \endldots\,$\fi}% + \def\@{@}% +} +% There is no need to define \Etex. + +% Define @lisp ... @end lisp. +% @lisp environment forms a group so it can rebind things, +% including the definition of @end lisp (which normally is erroneous). + +% Amount to narrow the margins by for @lisp. +\newskip\lispnarrowing \lispnarrowing=0.4in + +% This is the definition that ^^M gets inside @lisp, @example, and other +% such environments. \null is better than a space, since it doesn't +% have any width. +\def\lisppar{\null\endgraf} + +% This space is always present above and below environments. +\newskip\envskipamount \envskipamount = 0pt + +% Make spacing and below environment symmetrical. We use \parskip here +% to help in doing that, since in @example-like environments \parskip +% is reset to zero; thus the \afterenvbreak inserts no space -- but the +% start of the next paragraph will insert \parskip. +% +\def\aboveenvbreak{{% + % =10000 instead of <10000 because of a special case in \itemzzz, q.v. 
+ \ifnum \lastpenalty=10000 \else + \advance\envskipamount by \parskip + \endgraf + \ifdim\lastskip<\envskipamount + \removelastskip + % it's not a good place to break if the last penalty was \nobreak + % or better ... + \ifnum\lastpenalty<10000 \penalty-50 \fi + \vskip\envskipamount + \fi + \fi +}} + +\let\afterenvbreak = \aboveenvbreak + +% \nonarrowing is a flag. If "set", @lisp etc don't narrow margins. +\let\nonarrowing=\relax + +% @cartouche ... @end cartouche: draw rectangle w/rounded corners around +% environment contents. +\font\circle=lcircle10 +\newdimen\circthick +\newdimen\cartouter\newdimen\cartinner +\newskip\normbskip\newskip\normpskip\newskip\normlskip +\circthick=\fontdimen8\circle +% +\def\ctl{{\circle\char'013\hskip -6pt}}% 6pt from pl file: 1/2charwidth +\def\ctr{{\hskip 6pt\circle\char'010}} +\def\cbl{{\circle\char'012\hskip -6pt}} +\def\cbr{{\hskip 6pt\circle\char'011}} +\def\carttop{\hbox to \cartouter{\hskip\lskip + \ctl\leaders\hrule height\circthick\hfil\ctr + \hskip\rskip}} +\def\cartbot{\hbox to \cartouter{\hskip\lskip + \cbl\leaders\hrule height\circthick\hfil\cbr + \hskip\rskip}} +% +\newskip\lskip\newskip\rskip + +\envdef\cartouche{% + \ifhmode\par\fi % can't be in the midst of a paragraph. + \startsavinginserts + \lskip=\leftskip \rskip=\rightskip + \leftskip=0pt\rightskip=0pt % we want these *outside*. + \cartinner=\hsize \advance\cartinner by-\lskip + \advance\cartinner by-\rskip + \cartouter=\hsize + \advance\cartouter by 18.4pt % allow for 3pt kerns on either + % side, and for 6pt waste from + % each corner char, and rule thickness + \normbskip=\baselineskip \normpskip=\parskip \normlskip=\lineskip + % Flag to tell @lisp, etc., not to narrow margin. 
+ \let\nonarrowing=\comment + \vbox\bgroup + \baselineskip=0pt\parskip=0pt\lineskip=0pt + \carttop + \hbox\bgroup + \hskip\lskip + \vrule\kern3pt + \vbox\bgroup + \kern3pt + \hsize=\cartinner + \baselineskip=\normbskip + \lineskip=\normlskip + \parskip=\normpskip + \vskip -\parskip + \comment % For explanation, see the end of \def\group. +} +\def\Ecartouche{% + \ifhmode\par\fi + \kern3pt + \egroup + \kern3pt\vrule + \hskip\rskip + \egroup + \cartbot + \egroup + \checkinserts +} + + +% This macro is called at the beginning of all the @example variants, +% inside a group. +\def\nonfillstart{% + \aboveenvbreak + \hfuzz = 12pt % Don't be fussy + \sepspaces % Make spaces be word-separators rather than space tokens. + \let\par = \lisppar % don't ignore blank lines + \obeylines % each line of input is a line of output + \parskip = 0pt + \parindent = 0pt + \emergencystretch = 0pt % don't try to avoid overfull boxes + % @cartouche defines \nonarrowing to inhibit narrowing + % at next level down. + \ifx\nonarrowing\relax + \advance \leftskip by \lispnarrowing + \exdentamount=\lispnarrowing + \fi + \let\exdent=\nofillexdent +} + +% If you want all examples etc. small: @set dispenvsize small. +% If you want even small examples the full size: @set dispenvsize nosmall. +% This affects the following displayed environments: +% @example, @display, @format, @lisp +% +\def\smallword{small} +\def\nosmallword{nosmall} +\let\SETdispenvsize\relax +\def\setnormaldispenv{% + \ifx\SETdispenvsize\smallword + \smallexamplefonts \rm + \fi +} +\def\setsmalldispenv{% + \ifx\SETdispenvsize\nosmallword + \else + \smallexamplefonts \rm + \fi +} + +% We often define two environments, @foo and @smallfoo. 
+% Let's do it by one command: +\def\makedispenv #1#2{ + \expandafter\envdef\csname#1\endcsname {\setnormaldispenv #2} + \expandafter\envdef\csname small#1\endcsname {\setsmalldispenv #2} + \expandafter\let\csname E#1\endcsname \afterenvbreak + \expandafter\let\csname Esmall#1\endcsname \afterenvbreak +} + +% Define two synonyms: +\def\maketwodispenvs #1#2#3{ + \makedispenv{#1}{#3} + \makedispenv{#2}{#3} +} + +% @lisp: indented, narrowed, typewriter font; @example: same as @lisp. +% +% @smallexample and @smalllisp: use smaller fonts. +% Originally contributed by Pavel@xerox. +% +\maketwodispenvs {lisp}{example}{% + \nonfillstart + \tt + \let\kbdfont = \kbdexamplefont % Allow @kbd to do something special. + \gobble % eat return +} + +% @display/@smalldisplay: same as @lisp except keep current font. +% +\makedispenv {display}{% + \nonfillstart + \gobble +} + +% @format/@smallformat: same as @display except don't narrow margins. +% +\makedispenv{format}{% + \let\nonarrowing = t% + \nonfillstart + \gobble +} + +% @flushleft: same as @format, but doesn't obey \SETdispenvsize. +\envdef\flushleft{% + \let\nonarrowing = t% + \nonfillstart + \gobble +} +\let\Eflushleft = \afterenvbreak + +% @flushright. +% +\envdef\flushright{% + \let\nonarrowing = t% + \nonfillstart + \advance\leftskip by 0pt plus 1fill + \gobble +} +\let\Eflushright = \afterenvbreak + + +% @quotation does normal linebreaking (hence we can't use \nonfillstart) +% and narrows the margins. We keep \parskip nonzero in general, since +% we're doing normal filling. So, when using \aboveenvbreak and +% \afterenvbreak, temporarily make \parskip 0. +% +\envdef\quotation{% + {\parskip=0pt \aboveenvbreak}% because \aboveenvbreak inserts \parskip + \parindent=0pt + % + % @cartouche defines \nonarrowing to inhibit narrowing at next level down. 
+ \ifx\nonarrowing\relax + \advance\leftskip by \lispnarrowing + \advance\rightskip by \lispnarrowing + \exdentamount = \lispnarrowing + \let\nonarrowing = \relax + \fi + \parsearg\quotationlabel +} + +% We have retained a nonzero parskip for the environment, since we're +% doing normal filling. +% +\def\Equotation{% + \par + \ifx\quotationauthor\undefined\else + % indent a bit. + \leftline{\kern 2\leftskip \sl ---\quotationauthor}% + \fi + {\parskip=0pt \afterenvbreak}% +} + +% If we're given an argument, typeset it in bold with a colon after. +\def\quotationlabel#1{% + \def\temp{#1}% + \ifx\temp\empty \else + {\bf #1: }% + \fi +} + + +% LaTeX-like @verbatim...@end verbatim and @verb{...} +% If we want to allow any as delimiter, +% we need the curly braces so that makeinfo sees the @verb command, eg: +% `@verbx...x' would look like the '@verbx' command. --janneke@gnu.org +% +% [Knuth]: Donald Ervin Knuth, 1996. The TeXbook. +% +% [Knuth] p.344; only we need to do the other characters Texinfo sets +% active too. Otherwise, they get lost as the first character on a +% verbatim line. +\def\dospecials{% + \do\ \do\\\do\{\do\}\do\$\do\&% + \do\#\do\^\do\^^K\do\_\do\^^A\do\%\do\~% + \do\<\do\>\do\|\do\@\do+\do\"% +} +% +% [Knuth] p. 380 +\def\uncatcodespecials{% + \def\do##1{\catcode`##1=\other}\dospecials} +% +% [Knuth] pp. 380,381,391 +% Disable Spanish ligatures ?` and !` of \tt font +\begingroup + \catcode`\`=\active\gdef`{\relax\lq} +\endgroup +% +% Setup for the @verb command. 
+% +% Eight spaces for a tab +\begingroup + \catcode`\^^I=\active + \gdef\tabeightspaces{\catcode`\^^I=\active\def^^I{\ \ \ \ \ \ \ \ }} +\endgroup +% +\def\setupverb{% + \tt % easiest (and conventionally used) font for verbatim + \def\par{\leavevmode\endgraf}% + \catcode`\`=\active + \tabeightspaces + % Respect line breaks, + % print special symbols as themselves, and + % make each space count + % must do in this order: + \obeylines \uncatcodespecials \sepspaces +} + +% Setup for the @verbatim environment +% +% Real tab expansion +\newdimen\tabw \setbox0=\hbox{\tt\space} \tabw=8\wd0 % tab amount +% +\def\starttabbox{\setbox0=\hbox\bgroup} +\begingroup + \catcode`\^^I=\active + \gdef\tabexpand{% + \catcode`\^^I=\active + \def^^I{\leavevmode\egroup + \dimen0=\wd0 % the width so far, or since the previous tab + \divide\dimen0 by\tabw + \multiply\dimen0 by\tabw % compute previous multiple of \tabw + \advance\dimen0 by\tabw % advance to next multiple of \tabw + \wd0=\dimen0 \box0 \starttabbox + }% + } +\endgroup +\def\setupverbatim{% + \nonfillstart + \advance\leftskip by -\defbodyindent + % Easiest (and conventionally used) font for verbatim + \tt + \def\par{\leavevmode\egroup\box0\endgraf}% + \catcode`\`=\active + \tabexpand + % Respect line breaks, + % print special symbols as themselves, and + % make each space count + % must do in this order: + \obeylines \uncatcodespecials \sepspaces + \everypar{\starttabbox}% +} + +% Do the @verb magic: verbatim text is quoted by unique +% delimiter characters. Before first delimiter expect a +% right brace, after last delimiter expect closing brace: +% +% \def\doverb'{'#1'}'{#1} +% +% [Knuth] p. 
382; only eat outer {} +\begingroup + \catcode`[=1\catcode`]=2\catcode`\{=\other\catcode`\}=\other + \gdef\doverb{#1[\def\next##1#1}[##1\endgroup]\next] +\endgroup +% +\def\verb{\begingroup\setupverb\doverb} +% +% +% Do the @verbatim magic: define the macro \doverbatim so that +% the (first) argument ends when '@end verbatim' is reached, ie: +% +% \def\doverbatim#1@end verbatim{#1} +% +% For Texinfo it's a lot easier than for LaTeX, +% because texinfo's \verbatim doesn't stop at '\end{verbatim}': +% we need not redefine '\', '{' and '}'. +% +% Inspired by LaTeX's verbatim command set [latex.ltx] +% +\begingroup + \catcode`\ =\active + \obeylines % + % ignore everything up to the first ^^M, that's the newline at the end + % of the @verbatim input line itself. Otherwise we get an extra blank + % line in the output. + \xdef\doverbatim#1^^M#2@end verbatim{#2\noexpand\end\gobble verbatim}% + % We really want {...\end verbatim} in the body of the macro, but + % without the active space; thus we have to use \xdef and \gobble. +\endgroup +% +\envdef\verbatim{% + \setupverbatim\doverbatim +} +\let\Everbatim = \afterenvbreak + + +% @verbatiminclude FILE - insert text of file in verbatim environment. +% +\def\verbatiminclude{\parseargusing\filenamecatcodes\doverbatiminclude} +% +\def\doverbatiminclude#1{% + {% + \makevalueexpandable + \setupverbatim + \input #1 + \afterenvbreak + }% +} + +% @copying ... @end copying. +% Save the text away for @insertcopying later. Many commands won't be +% allowed in this context, but that's ok. +% +% We save the uninterpreted tokens, rather than creating a box. +% Saving the text in a box would be much easier, but then all the +% typesetting commands (@smallbook, font changes, etc.) have to be done +% beforehand -- and a) we want @copying to be done first in the source +% file; b) letting users define the frontmatter in as flexible order as +% possible is very desirable. 
+% +\def\copying{\begingroup + % Define a command to swallow text until we reach `@end copying'. + % \ is the escape char in this texinfo.tex file, so it is the + % delimiter for the command; @ will be the escape char when we read + % it, but that doesn't matter. + \long\def\docopying##1\end copying{\gdef\copyingtext{##1}\enddocopying}% + % + % We must preserve ^^M's in the input file; see \insertcopying below. + \catcode`\^^M = \active + \docopying +} + +% What we do to finish off the copying text. +% +\def\enddocopying{\endgroup\ignorespaces} + +% @insertcopying. Here we must play games with ^^M's. On the one hand, +% we need them to delimit commands such as `@end quotation', so they +% must be active. On the other hand, we certainly don't want every +% end-of-line to be a \par, as would happen with the normal active +% definition of ^^M. On the third hand, two ^^M's in a row should still +% generate a \par. +% +% Our approach is to make ^^M insert a space and a penalty1 normally; +% then it can also check if \lastpenalty=1. If it does, then manually +% do \par. +% +% This messes up the normal definitions of @c[omment], so we redefine +% it. Similarly for @ignore. (These commands are used in the gcc +% manual for man page generation.) +% +% Seems pretty fragile, most line-oriented commands will presumably +% fail, but for the limited use of getting the copying text (which +% should be quite simple) inserted, we can hope it's ok. +% +{\catcode`\^^M=\active % +\gdef\insertcopying{\begingroup % + \parindent = 0pt % looks wrong on title page + \def^^M{% + \ifnum \lastpenalty=1 % + \par % + \else % + \space \penalty 1 % + \fi % + }% + % + % Fix @c[omment] for catcode 13 ^^M's. + \def\c##1^^M{\ignorespaces}% + \let\comment = \c % + % + % Don't bother jumping through all the hoops that \doignore does, it + % would be very hard since the catcodes are already set. 
+ \long\def\ignore##1\end ignore{\ignorespaces}% + % + \copyingtext % +\endgroup}% +} + +\message{defuns,} +% @defun etc. + +\newskip\defbodyindent \defbodyindent=.4in +\newskip\defargsindent \defargsindent=50pt +\newskip\deflastargmargin \deflastargmargin=18pt + +% Start the processing of @deffn: +\def\startdefun{% + \ifnum\lastpenalty<10000 + \medbreak + \else + % If there are two @def commands in a row, we'll have a \nobreak, + % which is there to keep the function description together with its + % header. But if there's nothing but headers, we need to allow a + % break somewhere. Check for penalty 10002 (inserted by + % \defargscommonending) instead of 10000, since the sectioning + % commands insert a \penalty10000, and we don't want to allow a break + % between a section heading and a defun. + \ifnum\lastpenalty=10002 \penalty2000 \fi + % + % Similarly, after a section heading, do not allow a break. + % But do insert the glue. + \medskip % preceded by discardable penalty, so not a breakpoint + \fi + % + \parindent=0in + \advance\leftskip by \defbodyindent + \exdentamount=\defbodyindent +} + +\def\dodefunx#1{% + % First, check whether we are in the right environment: + \checkenv#1% + % + % As above, allow line break if we have multiple x headers in a row. + % It's not a great place, though. + \ifnum\lastpenalty=10002 \penalty3000 \fi + % + % And now, it's time to reuse the body of the original defun: + \expandafter\gobbledefun#1% +} +\def\gobbledefun#1\startdefun{} + +% \printdefunline \deffnheader{text} +% +\def\printdefunline#1#2{% + \begingroup + % call \deffnheader: + #1#2 \endheader + % common ending: + \interlinepenalty = 10000 + \advance\rightskip by 0pt plus 1fil + \endgraf + \nobreak\vskip -\parskip + \penalty 10002 % signal to \startdefun and \dodefunx + % Some of the @defun-type tags do not enable magic parentheses, + % rendering the following check redundant. But we don't optimize. 
+ \checkparencounts + \endgroup +} + +\def\Edefun{\endgraf\medbreak} + +% \makedefun{deffn} creates \deffn, \deffnx and \Edeffn; +% the only thing remaining is to define \deffnheader. +% +\def\makedefun#1{% + \expandafter\let\csname E#1\endcsname = \Edefun + \edef\temp{\noexpand\domakedefun + \makecsname{#1}\makecsname{#1x}\makecsname{#1header}}% + \temp +} + +% \domakedefun \deffn \deffnx \deffnheader +% +% Define \deffn and \deffnx, without parameters. +% \deffnheader has to be defined explicitly. +% +\def\domakedefun#1#2#3{% + \envdef#1{% + \startdefun + \parseargusing\activeparens{\printdefunline#3}% + }% + \def#2{\dodefunx#1}% + \def#3% +} + +%%% Untyped functions: + +% @deffn category name args +\makedefun{deffn}{\deffngeneral{}} + +% @defop category class name args +\makedefun{defop}#1 {\defopon{#1\ \putwordon}} + +% \defopon {category on}class name args +\def\defopon#1#2 {\deffngeneral{\putwordon\ \code{#2}}{#1\ \code{#2}} } + +% \deffngeneral {subind}category name args +% +\def\deffngeneral#1#2 #3 #4\endheader{% + % Remember that \dosubind{fn}{foo}{} is equivalent to \doind{fn}{foo}.
+ \dosubind{fn}{\code{#3}}{#1}% + \defname{#2}{}{#3}\magicamp\defunargs{#4\unskip}% +} + +%%% Typed functions: + +% @deftypefn category type name args +\makedefun{deftypefn}{\deftypefngeneral{}} + +% @deftypeop category class type name args +\makedefun{deftypeop}#1 {\deftypeopon{#1\ \putwordon}} + +% \deftypeopon {category on}class type name args +\def\deftypeopon#1#2 {\deftypefngeneral{\putwordon\ \code{#2}}{#1\ \code{#2}} } + +% \deftypefngeneral {subind}category type name args +% +\def\deftypefngeneral#1#2 #3 #4 #5\endheader{% + \dosubind{fn}{\code{#4}}{#1}% + \defname{#2}{#3}{#4}\defunargs{#5\unskip}% +} + +%%% Typed variables: + +% @deftypevr category type var args +\makedefun{deftypevr}{\deftypecvgeneral{}} + +% @deftypecv category class type var args +\makedefun{deftypecv}#1 {\deftypecvof{#1\ \putwordof}} + +% \deftypecvof {category of}class type var args +\def\deftypecvof#1#2 {\deftypecvgeneral{\putwordof\ \code{#2}}{#1\ \code{#2}} } + +% \deftypecvgeneral {subind}category type var args +% +\def\deftypecvgeneral#1#2 #3 #4 #5\endheader{% + \dosubind{vr}{\code{#4}}{#1}% + \defname{#2}{#3}{#4}\defunargs{#5\unskip}% +} + +%%% Untyped variables: + +% @defvr category var args +\makedefun{defvr}#1 {\deftypevrheader{#1} {} } + +% @defcv category class var args +\makedefun{defcv}#1 {\defcvof{#1\ \putwordof}} + +% \defcvof {category of}class var args +\def\defcvof#1#2 {\deftypecvof{#1}#2 {} } + +%%% Type: +% @deftp category name args +\makedefun{deftp}#1 #2 #3\endheader{% + \doind{tp}{\code{#2}}% + \defname{#1}{}{#2}\defunargs{#3\unskip}% +} + +% Remaining @defun-like shortcuts: +\makedefun{defun}{\deffnheader{\putwordDeffunc} } +\makedefun{defmac}{\deffnheader{\putwordDefmac} } +\makedefun{defspec}{\deffnheader{\putwordDefspec} } +\makedefun{deftypefun}{\deftypefnheader{\putwordDeffunc} } +\makedefun{defvar}{\defvrheader{\putwordDefvar} } +\makedefun{defopt}{\defvrheader{\putwordDefopt} } +\makedefun{deftypevar}{\deftypevrheader{\putwordDefvar} } 
+\makedefun{defmethod}{\defopon\putwordMethodon} +\makedefun{deftypemethod}{\deftypeopon\putwordMethodon} +\makedefun{defivar}{\defcvof\putwordInstanceVariableof} +\makedefun{deftypeivar}{\deftypecvof\putwordInstanceVariableof} + +% \defname, which formats the name of the @def (not the args). +% #1 is the category, such as "Function". +% #2 is the return type, if any. +% #3 is the function name. +% +% We are followed by (but not passed) the arguments, if any. +% +\def\defname#1#2#3{% + % Get the values of \leftskip and \rightskip as they were outside the @def... + \advance\leftskip by -\defbodyindent + % + % How we'll format the type name. Putting it in brackets helps + % distinguish it from the body text that may end up on the next line + % just below it. + \def\temp{#1}% + \setbox0=\hbox{\kern\deflastargmargin \ifx\temp\empty\else [\rm\temp]\fi} + % + % Figure out line sizes for the paragraph shape. + % The first line needs space for \box0; but if \rightskip is nonzero, + % we need only space for the part of \box0 which exceeds it: + \dimen0=\hsize \advance\dimen0 by -\wd0 \advance\dimen0 by \rightskip + % The continuations: + \dimen2=\hsize \advance\dimen2 by -\defargsindent + % (plain.tex says that \dimen1 should be used only as global.) + \parshape 2 0in \dimen0 \defargsindent \dimen2 + % + % Put the type name to the right margin. + \noindent + \hbox to 0pt{% + \hfil\box0 \kern-\hsize + % \hsize has to be shortened this way: + \kern\leftskip + % Intentionally do not respect \rightskip, since we need the space. + }% + % + % Allow all lines to be underfull without complaint: + \tolerance=10000 \hbadness=10000 + \exdentamount=\defbodyindent + {% + % defun fonts. We use typewriter by default (used to be bold) because: + % . we're printing identifiers, they should be in tt in principle. + % . in languages with many accents, such as Czech or French, it's + % common to leave accents off identifiers. The result looks ok in + % tt, but exceedingly strange in rm. + % . 
we don't want -- and --- to be treated as ligatures. + % . this still does not fix the ?` and !` ligatures, but so far no + % one has made identifiers using them :). + \df \tt + \def\temp{#2}% return value type + \ifx\temp\empty\else \tclose{\temp} \fi + #3% output function name + }% + {\rm\enskip}% hskip 0.5 em of \tenrm + % + \boldbrax + % arguments will be output next, if any. +} + +% Print arguments in slanted roman (not ttsl), inconsistently with using +% tt for the name. This is because literal text is sometimes needed in +% the argument list (groff manual), and ttsl and tt are not very +% distinguishable. Prevent hyphenation at `-' chars. +% +\def\defunargs#1{% + % use sl by default (not ttsl), + % tt for the names. + \df \sl \hyphenchar\font=0 + % + % On the other hand, if an argument has two dashes (for instance), we + % want a way to get ttsl. Let's try @var for that. + \let\var=\ttslanted + #1% + \sl\hyphenchar\font=45 +} + +% We want ()&[] to print specially on the defun line. +% +\def\activeparens{% + \catcode`\(=\active \catcode`\)=\active + \catcode`\[=\active \catcode`\]=\active + \catcode`\&=\active +} + +% Make control sequences which act like normal parenthesis chars. +\let\lparen = ( \let\rparen = ) + +% Be sure that we always have a definition for `(', etc. For example, +% if the fn name has parens in it, \boldbrax will not be in effect yet, +% so TeX would otherwise complain about undefined control sequence. +{ + \activeparens + \global\let(=\lparen \global\let)=\rparen + \global\let[=\lbrack \global\let]=\rbrack + \global\let& = \& + + \gdef\boldbrax{\let(=\opnr\let)=\clnr\let[=\lbrb\let]=\rbrb} + \gdef\magicamp{\let&=\amprm} +} + +\newcount\parencount + +% If we encounter &foo, then turn on ()-hacking afterwards +\newif\ifampseen +\def\amprm#1 {\ampseentrue{\bf\ }} + +\def\parenfont{% + \ifampseen + % At the first level, print parens in roman, + % otherwise use the default font. 
+ \ifnum \parencount=1 \rm \fi + \else + % The \sf parens (in \boldbrax) actually are a little bolder than + % the contained text. This is especially needed for [ and ] . + \sf + \fi +} +\def\infirstlevel#1{% + \ifampseen + \ifnum\parencount=1 + #1% + \fi + \fi +} +\def\bfafterword#1 {#1 \bf} + +\def\opnr{% + \global\advance\parencount by 1 + {\parenfont(}% + \infirstlevel \bfafterword +} +\def\clnr{% + {\parenfont)}% + \infirstlevel \sl + \global\advance\parencount by -1 +} + +\newcount\brackcount +\def\lbrb{% + \global\advance\brackcount by 1 + {\bf[}% +} +\def\rbrb{% + {\bf]}% + \global\advance\brackcount by -1 +} + +\def\checkparencounts{% + \ifnum\parencount=0 \else \badparencount \fi + \ifnum\brackcount=0 \else \badbrackcount \fi +} +\def\badparencount{% + \errmessage{Unbalanced parentheses in @def}% + \global\parencount=0 +} +\def\badbrackcount{% + \errmessage{Unbalanced square braces in @def}% + \global\brackcount=0 +} + + +\message{macros,} +% @macro. + +% To do this right we need a feature of e-TeX, \scantokens, +% which we arrange to emulate with a temporary file in ordinary TeX. +\ifx\eTeXversion\undefined + \newwrite\macscribble + \def\scantokens#1{% + \toks0={#1\endinput}% + \immediate\openout\macscribble=\jobname.tmp + \immediate\write\macscribble{\the\toks0}% + \immediate\closeout\macscribble + \input \jobname.tmp + } +\fi + +\def\scanmacro#1{% + \begingroup + \newlinechar`\^^M + \let\xeatspaces\eatspaces + % Undo catcode changes of \startcontents and \doprintindex + \catcode`\@=0 \catcode`\\=\other \escapechar=`\@ + % ... and \example + \spaceisspace + % + % Append \endinput to make sure that TeX does not see the ending newline. + % + % I've verified that it is necessary both for e-TeX and for ordinary TeX + % --kasal, 29nov03 + \scantokens{#1\endinput}% + \endgroup +} + +\newcount\paramno % Count of parameters +\newtoks\macname % Macro name +\newif\ifrecursive % Is it recursive? 
+\def\macrolist{} % List of all defined macros in the form + % \do\macro1\do\macro2... + +% Utility routines. +% This does \let #1 = #2, except with \csnames. +\def\cslet#1#2{% +\expandafter\expandafter +\expandafter\let +\expandafter\expandafter +\csname#1\endcsname +\csname#2\endcsname} + +% Trim leading and trailing spaces off a string. +% Concepts from aro-bend problem 15 (see CTAN). +{\catcode`\@=11 +\gdef\eatspaces #1{\expandafter\trim@\expandafter{#1 }} +\gdef\trim@ #1{\trim@@ @#1 @ #1 @ @@} +\gdef\trim@@ #1@ #2@ #3@@{\trim@@@\empty #2 @} +\def\unbrace#1{#1} +\unbrace{\gdef\trim@@@ #1 } #2@{#1} +} + +% Trim a single trailing ^^M off a string. +{\catcode`\^^M=\other \catcode`\Q=3% +\gdef\eatcr #1{\eatcra #1Q^^MQ}% +\gdef\eatcra#1^^MQ{\eatcrb#1Q}% +\gdef\eatcrb#1Q#2Q{#1}% +} + +% Macro bodies are absorbed as an argument in a context where +% all characters are catcode 10, 11 or 12, except \ which is active +% (as in normal texinfo). It is necessary to change the definition of \. + +% It's necessary to have hard CRs when the macro is executed. This is +% done by making ^^M (\endlinechar) catcode 12 when reading the macro +% body, and then making it the \newlinechar in \scanmacro. + +\def\macrobodyctxt{% + \catcode`\~=\other + \catcode`\^=\other + \catcode`\_=\other + \catcode`\|=\other + \catcode`\<=\other + \catcode`\>=\other + \catcode`\+=\other + \catcode`\{=\other + \catcode`\}=\other + \catcode`\@=\other + \catcode`\^^M=\other + \usembodybackslash} + +\def\macroargctxt{% + \catcode`\~=\other + \catcode`\^=\other + \catcode`\_=\other + \catcode`\|=\other + \catcode`\<=\other + \catcode`\>=\other + \catcode`\+=\other + \catcode`\@=\other + \catcode`\\=\other} + +% \mbodybackslash is the definition of \ in @macro bodies. +% It maps \foo\ => \csname macarg.foo\endcsname => #N +% where N is the macro parameter number. +% We define \csname macarg.\endcsname to be \realbackslash, so +% \\ in macro replacement text gets you a backslash. 
+ +{\catcode`@=0 @catcode`@\=@active + @gdef@usembodybackslash{@let\=@mbodybackslash} + @gdef@mbodybackslash#1\{@csname macarg.#1@endcsname} +} +\expandafter\def\csname macarg.\endcsname{\realbackslash} + +\def\macro{\recursivefalse\parsearg\macroxxx} +\def\rmacro{\recursivetrue\parsearg\macroxxx} + +\def\macroxxx#1{% + \getargs{#1}% now \macname is the macname and \argl the arglist + \ifx\argl\empty % no arguments + \paramno=0% + \else + \expandafter\parsemargdef \argl;% + \fi + \if1\csname ismacro.\the\macname\endcsname + \message{Warning: redefining \the\macname}% + \else + \expandafter\ifx\csname \the\macname\endcsname \relax + \else \errmessage{Macro name \the\macname\space already defined}\fi + \global\cslet{macsave.\the\macname}{\the\macname}% + \global\expandafter\let\csname ismacro.\the\macname\endcsname=1% + % Add the macroname to \macrolist + \toks0 = \expandafter{\macrolist\do}% + \xdef\macrolist{\the\toks0 + \expandafter\noexpand\csname\the\macname\endcsname}% + \fi + \begingroup \macrobodyctxt + \ifrecursive \expandafter\parsermacbody + \else \expandafter\parsemacbody + \fi} + +\parseargdef\unmacro{% + \if1\csname ismacro.#1\endcsname + \global\cslet{#1}{macsave.#1}% + \global\expandafter\let \csname ismacro.#1\endcsname=0% + % Remove the macro name from \macrolist: + \begingroup + \expandafter\let\csname#1\endcsname \relax + \let\do\unmacrodo + \xdef\macrolist{\macrolist}% + \endgroup + \else + \errmessage{Macro #1 not defined}% + \fi +} + +% Called by \do from \dounmacro on each macro. The idea is to omit any +% macro definitions that have been changed to \relax. +% +\def\unmacrodo#1{% + \ifx#1\relax + % remove this + \else + \noexpand\do \noexpand #1% + \fi +} + +% This makes use of the obscure feature that if the last token of a +% <parameter list> is #, then the preceding argument is delimited by +% an opening brace, and that opening brace is not consumed.
+\def\getargs#1{\getargsxxx#1{}} +\def\getargsxxx#1#{\getmacname #1 \relax\getmacargs} +\def\getmacname #1 #2\relax{\macname={#1}} +\def\getmacargs#1{\def\argl{#1}} + +% Parse the optional {params} list. Set up \paramno and \paramlist +% so \defmacro knows what to do. Define \macarg.blah for each blah +% in the params list, to be ##N where N is the position in that list. +% That gets used by \mbodybackslash (above). + +% We need to get `macro parameter char #' into several definitions. +% The technique used is stolen from LaTeX: let \hash be something +% unexpandable, insert that wherever you need a #, and then redefine +% it to # just before using the token list produced. +% +% The same technique is used to protect \eatspaces till just before +% the macro is used. + +\def\parsemargdef#1;{\paramno=0\def\paramlist{}% + \let\hash\relax\let\xeatspaces\relax\parsemargdefxxx#1,;,} +\def\parsemargdefxxx#1,{% + \if#1;\let\next=\relax + \else \let\next=\parsemargdefxxx + \advance\paramno by 1% + \expandafter\edef\csname macarg.\eatspaces{#1}\endcsname + {\xeatspaces{\hash\the\paramno}}% + \edef\paramlist{\paramlist\hash\the\paramno,}% + \fi\next} + +% These two commands read recursive and nonrecursive macro bodies. +% (They're different since rec and nonrec macros end differently.) + +\long\def\parsemacbody#1@end macro% +{\xdef\temp{\eatcr{#1}}\endgroup\defmacro}% +\long\def\parsermacbody#1@end rmacro% +{\xdef\temp{\eatcr{#1}}\endgroup\defmacro}% + +% This defines the macro itself. There are six cases: recursive and +% nonrecursive macros of zero, one, and many arguments. +% Much magic with \expandafter here. +% \xdef is used so that macro definitions will survive the file +% they're defined in; @include reads the file inside a group. 
+\def\defmacro{% + \let\hash=##% convert placeholders to macro parameter chars + \ifrecursive + \ifcase\paramno + % 0 + \expandafter\xdef\csname\the\macname\endcsname{% + \noexpand\scanmacro{\temp}}% + \or % 1 + \expandafter\xdef\csname\the\macname\endcsname{% + \bgroup\noexpand\macroargctxt + \noexpand\braceorline + \expandafter\noexpand\csname\the\macname xxx\endcsname}% + \expandafter\xdef\csname\the\macname xxx\endcsname##1{% + \egroup\noexpand\scanmacro{\temp}}% + \else % many + \expandafter\xdef\csname\the\macname\endcsname{% + \bgroup\noexpand\macroargctxt + \noexpand\csname\the\macname xx\endcsname}% + \expandafter\xdef\csname\the\macname xx\endcsname##1{% + \expandafter\noexpand\csname\the\macname xxx\endcsname ##1,}% + \expandafter\expandafter + \expandafter\xdef + \expandafter\expandafter + \csname\the\macname xxx\endcsname + \paramlist{\egroup\noexpand\scanmacro{\temp}}% + \fi + \else + \ifcase\paramno + % 0 + \expandafter\xdef\csname\the\macname\endcsname{% + \noexpand\norecurse{\the\macname}% + \noexpand\scanmacro{\temp}\egroup}% + \or % 1 + \expandafter\xdef\csname\the\macname\endcsname{% + \bgroup\noexpand\macroargctxt + \noexpand\braceorline + \expandafter\noexpand\csname\the\macname xxx\endcsname}% + \expandafter\xdef\csname\the\macname xxx\endcsname##1{% + \egroup + \noexpand\norecurse{\the\macname}% + \noexpand\scanmacro{\temp}\egroup}% + \else % many + \expandafter\xdef\csname\the\macname\endcsname{% + \bgroup\noexpand\macroargctxt + \expandafter\noexpand\csname\the\macname xx\endcsname}% + \expandafter\xdef\csname\the\macname xx\endcsname##1{% + \expandafter\noexpand\csname\the\macname xxx\endcsname ##1,}% + \expandafter\expandafter + \expandafter\xdef + \expandafter\expandafter + \csname\the\macname xxx\endcsname + \paramlist{% + \egroup + \noexpand\norecurse{\the\macname}% + \noexpand\scanmacro{\temp}\egroup}% + \fi + \fi} + +\def\norecurse#1{\bgroup\cslet{#1}{macsave.#1}} + +% \braceorline decides whether the next nonwhitespace character is 
a +% {. If so it reads up to the closing }, if not, it reads the whole +% line. Whatever was read is then fed to the next control sequence +% as an argument (by \parsebrace or \parsearg) +\def\braceorline#1{\let\next=#1\futurelet\nchar\braceorlinexxx} +\def\braceorlinexxx{% + \ifx\nchar\bgroup\else + \expandafter\parsearg + \fi \next} + +% We want to disable all macros during \shipout so that they are not +% expanded by \write. +\def\turnoffmacros{\begingroup \def\do##1{\let\noexpand##1=\relax}% + \edef\next{\macrolist}\expandafter\endgroup\next} + + +% @alias. +% We need some trickery to remove the optional spaces around the equal +% sign. Just make them active and then expand them all to nothing. +\def\alias{\parseargusing\obeyspaces\aliasxxx} +\def\aliasxxx #1{\aliasyyy#1\relax} +\def\aliasyyy #1=#2\relax{% + {% + \expandafter\let\obeyedspace=\empty + \xdef\next{\global\let\makecsname{#1}=\makecsname{#2}}% + }% + \next +} + + +\message{cross references,} + +\newwrite\auxfile + +\newif\ifhavexrefs % True if xref values are known. +\newif\ifwarnedxrefs % True if we warned once that they aren't known. + +% @inforef is relatively simple. +\def\inforef #1{\inforefzzz #1,,,,**} +\def\inforefzzz #1,#2,#3,#4**{\putwordSee{} \putwordInfo{} \putwordfile{} \file{\ignorespaces #3{}}, + node \samp{\ignorespaces#1{}}} + +% @node's only job in TeX is to define \lastnode, which is used in +% cross-references. The @node line might or might not have commas, and +% might or might not have spaces before the first comma, like: +% @node foo , bar , ... +% We don't want such trailing spaces in the node name.
+% +\parseargdef\node{\checkenv{}\donode #1 ,\finishnodeparse} +% +% also remove a trailing comma, in case of something like this: +% @node Help-Cross, , , Cross-refs +\def\donode#1 ,#2\finishnodeparse{\dodonode #1,\finishnodeparse} +\def\dodonode#1,#2\finishnodeparse{\gdef\lastnode{#1}} + +\let\nwnode=\node +\let\lastnode=\empty + +% Write a cross-reference definition for the current node. #1 is the +% type (Ynumbered, Yappendix, Ynothing). +% +\def\donoderef#1{% + \ifx\lastnode\empty\else + \setref{\lastnode}{#1}% + \global\let\lastnode=\empty + \fi +} + +% @anchor{NAME} -- define xref target at arbitrary point. +% +\newcount\savesfregister +% +\def\savesf{\relax \ifhmode \savesfregister=\spacefactor \fi} +\def\restoresf{\relax \ifhmode \spacefactor=\savesfregister \fi} +\def\anchor#1{\savesf \setref{#1}{Ynothing}\restoresf \ignorespaces} + +% \setref{NAME}{SNT} defines a cross-reference point NAME (a node or an +% anchor), which consists of three parts: +% 1) NAME-title - the current sectioning name taken from \thissection, +% or the anchor name. +% 2) NAME-snt - section number and type, passed as the SNT arg, or +% empty for anchors. +% 3) NAME-pg - the page number. +% +% This is called from \donoderef, \anchor, and \dofloat. In the case of +% floats, there is an additional part, which is not written here: +% 4) NAME-lof - the text as it should appear in a @listoffloats. +% +\def\setref#1#2{% + \pdfmkdest{#1}% + \iflinks + {% + \atdummies % preserve commands, but don't expand them + \turnoffactive + \otherbackslash + \edef\writexrdef##1##2{% + \write\auxfile{@xrdef{#1-% #1 of \setref, expanded by the \edef + ##1}{##2}}% these are parameters of \writexrdef + }% + \toks0 = \expandafter{\thissection}% + \immediate \writexrdef{title}{\the\toks0 }% + \immediate \writexrdef{snt}{\csname #2\endcsname}% \Ynumbered etc. + \writexrdef{pg}{\folio}% will be written later, during \shipout + }% + \fi +} + +% @xref, @pxref, and @ref generate cross-references. 
For \xrefX, #1 is +% the node name, #2 the name of the Info cross-reference, #3 the printed +% node name, #4 the name of the Info file, #5 the name of the printed +% manual. All but the node name can be omitted. +% +\def\pxref#1{\putwordsee{} \xrefX[#1,,,,,,,]} +\def\xref#1{\putwordSee{} \xrefX[#1,,,,,,,]} +\def\ref#1{\xrefX[#1,,,,,,,]} +\def\xrefX[#1,#2,#3,#4,#5,#6]{\begingroup + \unsepspaces + \def\printedmanual{\ignorespaces #5}% + \def\printedrefname{\ignorespaces #3}% + \setbox1=\hbox{\printedmanual\unskip}% + \setbox0=\hbox{\printedrefname\unskip}% + \ifdim \wd0 = 0pt + % No printed node name was explicitly given. + \expandafter\ifx\csname SETxref-automatic-section-title\endcsname\relax + % Use the node name inside the square brackets. + \def\printedrefname{\ignorespaces #1}% + \else + % Make the actual chapter/section title appear inside + % the square brackets. Use the real section title if we have it. + \ifdim \wd1 > 0pt + % It is in another manual, so we don't have it. + \def\printedrefname{\ignorespaces #1}% + \else + \ifhavexrefs + % We know the real title if we have the xref values. + \def\printedrefname{\refx{#1-title}{}}% + \else + % Otherwise just copy the Info node name. + \def\printedrefname{\ignorespaces #1}% + \fi% + \fi + \fi + \fi + % + % Make link in pdf output. + \ifpdf + \leavevmode + \getfilename{#4}% + {\turnoffactive \otherbackslash + \ifnum\filenamelength>0 + \startlink attr{/Border [0 0 0]}% + goto file{\the\filename.pdf} name{#1}% + \else + \startlink attr{/Border [0 0 0]}% + goto name{\pdfmkpgn{#1}}% + \fi + }% + \linkcolor + \fi + % + % Float references are printed completely differently: "Figure 1.2" + % instead of "[somenode], p.3". We distinguish them by the + % LABEL-title being set to a magic string. + {% + % Have to otherify everything special to allow the \csname to + % include an _ in the xref name, etc.
+ \indexnofonts + \turnoffactive + \otherbackslash + \expandafter\global\expandafter\let\expandafter\Xthisreftitle + \csname XR#1-title\endcsname + }% + \iffloat\Xthisreftitle + % If the user specified the print name (third arg) to the ref, + % print it instead of our usual "Figure 1.2". + \ifdim\wd0 = 0pt + \refx{#1-snt}% + \else + \printedrefname + \fi + % + % if the user also gave the printed manual name (fifth arg), append + % "in MANUALNAME". + \ifdim \wd1 > 0pt + \space \putwordin{} \cite{\printedmanual}% + \fi + \else + % node/anchor (non-float) references. + % + % If we use \unhbox0 and \unhbox1 to print the node names, TeX does not + % insert empty discretionaries after hyphens, which means that it will + % not find a line break at a hyphen in a node name. Since some manuals + % are best written with fairly long node names, containing hyphens, this + % is a loss. Therefore, we give the text of the node name again, so it + % is as if TeX is seeing it for the first time. + \ifdim \wd1 > 0pt + \putwordsection{} ``\printedrefname'' \putwordin{} \cite{\printedmanual}% + \else + % _ (for example) has to be the character _ for the purposes of the + % control sequence corresponding to the node, but it has to expand + % into the usual \leavevmode...\vrule stuff for purposes of + % printing. So we \turnoffactive for the \refx-snt, back on for the + % printing, back off for the \refx-pg. + {\turnoffactive \otherbackslash + % Only output a following space if the -snt ref is nonempty; for + % @unnumbered and @anchor, it won't be. + \setbox2 = \hbox{\ignorespaces \refx{#1-snt}{}}% + \ifdim \wd2 > 0pt \refx{#1-snt}\space\fi + }% + % output the `[mynode]' via a macro so it can be overridden. + \xrefprintnodename\printedrefname + % + % But we always want a comma and a space: + ,\space + % + % output the `page 3'.
+ \turnoffactive \otherbackslash \putwordpage\tie\refx{#1-pg}{}% + \fi + \fi + \endlink +\endgroup} + +% This macro is called from \xrefX for the `[nodename]' part of xref +% output. It's a separate macro only so it can be changed more easily, +% since square brackets don't work well in some documents. Particularly +% one that Bob is working on :). +% +\def\xrefprintnodename#1{[#1]} + +% Things referred to by \setref. +% +\def\Ynothing{} +\def\Yomitfromtoc{} +\def\Ynumbered{% + \ifnum\secno=0 + \putwordChapter@tie \the\chapno + \else \ifnum\subsecno=0 + \putwordSection@tie \the\chapno.\the\secno + \else \ifnum\subsubsecno=0 + \putwordSection@tie \the\chapno.\the\secno.\the\subsecno + \else + \putwordSection@tie \the\chapno.\the\secno.\the\subsecno.\the\subsubsecno + \fi\fi\fi +} +\def\Yappendix{% + \ifnum\secno=0 + \putwordAppendix@tie @char\the\appendixno{}% + \else \ifnum\subsecno=0 + \putwordSection@tie @char\the\appendixno.\the\secno + \else \ifnum\subsubsecno=0 + \putwordSection@tie @char\the\appendixno.\the\secno.\the\subsecno + \else + \putwordSection@tie + @char\the\appendixno.\the\secno.\the\subsecno.\the\subsubsecno + \fi\fi\fi +} + +% Define \refx{NAME}{SUFFIX} to reference a cross-reference string named NAME. +% If its value is nonempty, SUFFIX is output afterward. +% +\def\refx#1#2{% + {% + \indexnofonts + \otherbackslash + \expandafter\global\expandafter\let\expandafter\thisrefX + \csname XR#1\endcsname + }% + \ifx\thisrefX\relax + % If not defined, say something at least. + \angleleft un\-de\-fined\angleright + \iflinks + \ifhavexrefs + \message{\linenumber Undefined cross reference `#1'.}% + \else + \ifwarnedxrefs\else + \global\warnedxrefstrue + \message{Cross reference values unknown; you must run TeX again.}% + \fi + \fi + \fi + \else + % It's defined, so just use it. + \thisrefX + \fi + #2% Output the suffix in any case. +} + +% This is the macro invoked by entries in the aux file. 
Usually it's +% just a \def (we prepend XR to the control sequence name to avoid +% collisions). But if this is a float type, we have more work to do. +% +\def\xrdef#1#2{% + \expandafter\gdef\csname XR#1\endcsname{#2}% remember this xref value. + % + % Was that xref control sequence that we just defined for a float? + \expandafter\iffloat\csname XR#1\endcsname + % it was a float, and we have the (safe) float type in \iffloattype. + \expandafter\let\expandafter\floatlist + \csname floatlist\iffloattype\endcsname + % + % Is this the first time we've seen this float type? + \expandafter\ifx\floatlist\relax + \toks0 = {\do}% yes, so just \do + \else + % had it before, so preserve previous elements in list. + \toks0 = \expandafter{\floatlist\do}% + \fi + % + % Remember this xref in the control sequence \floatlistFLOATTYPE, + % for later use in \listoffloats. + \expandafter\xdef\csname floatlist\iffloattype\endcsname{\the\toks0{#1}}% + \fi +} + +% Read the last existing aux file, if any. No error if none exists. +% +\def\tryauxfile{% + \openin 1 \jobname.aux + \ifeof 1 \else + \readauxfile + \global\havexrefstrue + \fi + \closein 1 +} + +\def\readauxfile{\begingroup + \catcode`\^^@=\other + \catcode`\^^A=\other + \catcode`\^^B=\other + \catcode`\^^C=\other + \catcode`\^^D=\other + \catcode`\^^E=\other + \catcode`\^^F=\other + \catcode`\^^G=\other + \catcode`\^^H=\other + \catcode`\^^K=\other + \catcode`\^^L=\other + \catcode`\^^N=\other + \catcode`\^^P=\other + \catcode`\^^Q=\other + \catcode`\^^R=\other + \catcode`\^^S=\other + \catcode`\^^T=\other + \catcode`\^^U=\other + \catcode`\^^V=\other + \catcode`\^^W=\other + \catcode`\^^X=\other + \catcode`\^^Z=\other + \catcode`\^^[=\other + \catcode`\^^\=\other + \catcode`\^^]=\other + \catcode`\^^^=\other + \catcode`\^^_=\other + % It was suggested to set the catcode of ^ to 7, which would allow ^^e4 etc. + % in xref tags, i.e., node names. 
But since ^^e4 notation isn't + % supported in the main text, it doesn't seem desirable. Furthermore, + % that is not enough: for node names that actually contain a ^ + % character, we would end up writing a line like this: 'xrdef {'hat + % b-title}{'hat b} and \xrdef does a \csname...\endcsname on the first + % argument, and \hat is not an expandable control sequence. It could + % all be worked out, but why? Either we support ^^ or we don't. + % + % The other change necessary for this was to define \auxhat: + % \def\auxhat{\def^{'hat }}% extra space so ok if followed by letter + % and then to call \auxhat in \setq. + % + \catcode`\^=\other + % + % Special characters. Should be turned off anyway, but... + \catcode`\~=\other + \catcode`\[=\other + \catcode`\]=\other + \catcode`\"=\other + \catcode`\_=\other + \catcode`\|=\other + \catcode`\<=\other + \catcode`\>=\other + \catcode`\$=\other + \catcode`\#=\other + \catcode`\&=\other + \catcode`\%=\other + \catcode`+=\other % avoid \+ for paranoia even though we've turned it off + % + % This is to support \ in node names and titles, since the \ + % characters end up in a \csname. It's easier than + % leaving it active and making its active definition an actual \ + % character. What I don't understand is why it works in the *value* + % of the xrdef. Seems like it should be a catcode12 \, and that + % should not typeset properly. But it works, so I'm moving on for + % now. --karl, 15jan04. + \catcode`\\=\other + % + % Make the characters 128-255 be printing characters. + {% + \count 1=128 + \def\loop{% + \catcode\count 1=\other + \advance\count 1 by 1 + \ifnum \count 1<256 \loop \fi + }% + }% + % + % @ is our escape character in .aux files, and we need braces. + \catcode`\{=1 + \catcode`\}=2 + \catcode`\@=0 + % + \input \jobname.aux +\endgroup} + + +\message{insertions,} +% including footnotes. 
+ +\newcount \footnoteno + +% The trailing space in the following definition for supereject is +% vital for proper filling; pages come out unaligned when you do a +% pagealignmacro call if that space before the closing brace is +% removed. (Generally, numeric constants should always be followed by a +% space to prevent strange expansion errors.) +\def\supereject{\par\penalty -20000\footnoteno =0 } + +% @footnotestyle is meaningful for info output only. +\let\footnotestyle=\comment + +{\catcode `\@=11 +% +% Auto-number footnotes. Otherwise like plain. +\gdef\footnote{% + \let\indent=\ptexindent + \let\noindent=\ptexnoindent + \global\advance\footnoteno by \@ne + \edef\thisfootno{$^{\the\footnoteno}$}% + % + % In case the footnote comes at the end of a sentence, preserve the + % extra spacing after we do the footnote number. + \let\@sf\empty + \ifhmode\edef\@sf{\spacefactor\the\spacefactor}\ptexslash\fi + % + % Remove inadvertent blank space before typesetting the footnote number. + \unskip + \thisfootno\@sf + \dofootnote +}% + +% Don't bother with the trickery in plain.tex to not require the +% footnote text as a parameter. Our footnotes don't need to be so general. +% +% Oh yes, they do; otherwise, @ifset (and anything else that uses +% \parseargline) fails inside footnotes because the tokens are fixed when +% the footnote is read. --karl, 16nov96. +% +\gdef\dofootnote{% + \insert\footins\bgroup + % We want to typeset this text as a normal paragraph, even if the + % footnote reference occurs in (for example) a display environment. + % So reset some parameters. 
+ \hsize=\pagewidth + \interlinepenalty\interfootnotelinepenalty + \splittopskip\ht\strutbox % top baseline for broken footnotes + \splitmaxdepth\dp\strutbox + \floatingpenalty\@MM + \leftskip\z@skip + \rightskip\z@skip + \spaceskip\z@skip + \xspaceskip\z@skip + \parindent\defaultparindent + % + \smallfonts \rm + % + % Because we use hanging indentation in footnotes, a @noindent appears + % to exdent this text, so make it be a no-op. makeinfo does not use + % hanging indentation so @noindent can still be needed within footnote + % text after an @example or the like (not that this is good style). + \let\noindent = \relax + % + % Hang the footnote text off the number. Use \everypar in case the + % footnote extends for more than one paragraph. + \everypar = {\hang}% + \textindent{\thisfootno}% + % + % Don't crash into the line above the footnote text. Since this + % expands into a box, it must come within the paragraph, lest it + % provide a place where TeX can split the footnote. + \footstrut + \futurelet\next\fo@t +} +}%end \catcode `\@=11 + +% In case a @footnote appears in a vbox, save the footnote text and create +% the real \insert just after the vbox finished. Otherwise, the insertion +% would be lost. +% Similarly, if a @footnote appears inside an alignment, save the footnote +% text to a box and make the \insert when a row of the table is finished. +% And the same can be done for other insert classes. --kasal, 16nov03. + +% Replace the \insert primitive by a cheating macro. +% Deeper inside, just make sure that the saved insertions are not spilled +% out prematurely. +% +\def\startsavinginserts{% + \ifx \insert\ptexinsert + \let\insert\saveinsert + \else + \let\checkinserts\relax + \fi +} + +% This \insert replacement works for both \insert\footins{foo} and +% \insert\footins\bgroup foo\egroup, but it doesn't work for \insert27{foo}.
+% +\def\saveinsert#1{% + \edef\next{\noexpand\savetobox \makeSAVEname#1}% + \afterassignment\next + % swallow the left brace + \let\temp = +} +\def\makeSAVEname#1{\makecsname{SAVE\expandafter\gobble\string#1}} +\def\savetobox#1{\global\setbox#1 = \vbox\bgroup \unvbox#1} + +\def\checksaveins#1{\ifvoid#1\else \placesaveins#1\fi} + +\def\placesaveins#1{% + \ptexinsert \csname\expandafter\gobblesave\string#1\endcsname + {\box#1}% +} + +% eat @SAVE -- beware, all of them have catcode \other: +{ + \def\dospecials{\do S\do A\do V\do E} \uncatcodespecials % ;-) + \gdef\gobblesave @SAVE{} +} + +% initialization: +\def\newsaveins #1{% + \edef\next{\noexpand\newsaveinsX \makeSAVEname#1}% + \next +} +\def\newsaveinsX #1{% + \csname newbox\endcsname #1% + \expandafter\def\expandafter\checkinserts\expandafter{\checkinserts + \checksaveins #1}% +} + +% initialize: +\let\checkinserts\empty +\newsaveins\footins +\newsaveins\margin + + +% @image. We use the macros from epsf.tex to support this. +% If epsf.tex is not installed and @image is used, we complain. +% +% Check for and read epsf.tex up front. If we read it only at @image +% time, we might be inside a group, and then its definitions would get +% undone and the next image would fail. +\openin 1 = epsf.tex +\ifeof 1 \else + % Do not bother showing banner with epsf.tex v2.7k (available in + % doc/epsf.tex and on ctan). + \def\epsfannounce{\toks0 = }% + \input epsf.tex +\fi +\closein 1 +% +% We will only complain once about lack of epsf.tex. +\newif\ifwarnednoepsf +\newhelp\noepsfhelp{epsf.tex must be installed for images to + work. 
It is also included in the Texinfo distribution, or you can get + it from ftp://tug.org/tex/epsf.tex.} +% +\def\image#1{% + \ifx\epsfbox\undefined + \ifwarnednoepsf \else + \errhelp = \noepsfhelp + \errmessage{epsf.tex not found, images will be ignored}% + \global\warnednoepsftrue + \fi + \else + \imagexxx #1,,,,,\finish + \fi +} +% +% Arguments to @image: +% #1 is (mandatory) image filename; we tack on .eps extension. +% #2 is (optional) width, #3 is (optional) height. +% #4 is (ignored optional) html alt text. +% #5 is (ignored optional) extension. +% #6 is just the usual extra ignored arg for parsing this stuff. +\newif\ifimagevmode +\def\imagexxx#1,#2,#3,#4,#5,#6\finish{\begingroup + \catcode`\^^M = 5 % in case we're inside an example + \normalturnoffactive % allow _ et al. in names + % If the image is by itself, center it. + \ifvmode + \imagevmodetrue + \nobreak\bigskip + % Usually we'll have text after the image which will insert + % \parskip glue, so insert it here too to equalize the space + % above and below. + \nobreak\vskip\parskip + \nobreak + \line\bgroup\hss + \fi + % + % Output the image. + \ifpdf + \dopdfimage{#1}{#2}{#3}% + \else + % \epsfbox itself resets \epsf?size at each figure. + \setbox0 = \hbox{\ignorespaces #2}\ifdim\wd0 > 0pt \epsfxsize=#2\relax \fi + \setbox0 = \hbox{\ignorespaces #3}\ifdim\wd0 > 0pt \epsfysize=#3\relax \fi + \epsfbox{#1.eps}% + \fi + % + \ifimagevmode \hss \egroup \bigbreak \fi % space after the image +\endgroup} + + +% @float FLOATTYPE,LOC ... @end float for displayed figures, tables, etc. +% We don't actually implement floating yet, we just plop the float "here". +% But it seemed the best name for the future. +% +\envparseargdef\float{\dofloat #1,,,\finish} + +% #1 is the optional FLOATTYPE, the text label for this float, typically +% "Figure", "Table", "Example", etc. Can't contain commas. If omitted, +% this float will not be numbered and cannot be referred to. +% +% #2 is the optional xref label. 
Also must be present for the float to +% be referable. +% +% #3 is the optional positioning argument; for now, it is ignored. It +% will somehow specify the positions allowed to float to (here, top, bottom). +% +% We keep a separate counter for each FLOATTYPE, which we reset at each +% chapter-level command. +\let\resetallfloatnos=\empty +% +\def\dofloat#1,#2,#3,#4\finish{% + \let\thiscaption=\empty + \let\thisshortcaption=\empty + % + % don't lose footnotes inside @float. + \startsavinginserts + % + % We can't be used inside a paragraph. + \par + % + \vtop\bgroup + \def\floattype{#1}% + \def\floatlabel{#2}% + \def\floatloc{#3}% we do nothing with this yet. + % + \ifx\floattype\empty + \let\safefloattype=\empty + \else + {% + % the floattype might have accents or other special characters, + % but we need to use it in a control sequence name. + \indexnofonts + \turnoffactive + \xdef\safefloattype{\floattype}% + }% + \fi + % + % If label is given but no type, we handle that as the empty type. + \ifx\floatlabel\empty \else + % We want each FLOATTYPE to be numbered separately (Figure 1, + % Table 1, Figure 2, ...). (And if no label, no number.) + % + \expandafter\getfloatno\csname\safefloattype floatno\endcsname + \global\advance\floatno by 1 + % + {% + % This magic value for \thissection is output by \setref as the + % XREFLABEL-title value. \xrefX uses it to distinguish float + % labels (which have a completely different output format) from + % node and anchor labels. And \xrdef uses it to construct the + % lists of floats. + % + \edef\thissection{\floatmagic=\safefloattype}% + \setref{\floatlabel}{Yfloat}% + }% + \fi + % + % start with \parskip glue, I guess. + \vskip\parskip + % + % Don't suppress indentation if a float happens to start a section. 
+ \restorefirstparagraphindent +} + +% we have these possibilities: +% @float Foo,lbl & @caption{Cap}: Foo 1.1: Cap +% @float Foo,lbl & no caption: Foo 1.1 +% @float Foo & @caption{Cap}: Foo: Cap +% @float Foo & no caption: Foo +% @float ,lbl & @caption{Cap}: 1.1: Cap +% @float ,lbl & no caption: 1.1 +% @float & @caption{Cap}: Cap +% @float & no caption: +% +\def\Efloat{% + \let\floatident = \empty + % + % In all cases, if we have a float type, it comes first. + \ifx\floattype\empty \else \def\floatident{\floattype}\fi + % + % If we have an xref label, the number comes next. + \ifx\floatlabel\empty \else + \ifx\floattype\empty \else % if also had float type, need tie first. + \appendtomacro\floatident{\tie}% + \fi + % the number. + \appendtomacro\floatident{\chaplevelprefix\the\floatno}% + \fi + % + % Start the printed caption with what we've constructed in + % \floatident, but keep it separate; we need \floatident again. + \let\captionline = \floatident + % + \ifx\thiscaption\empty \else + \ifx\floatident\empty \else + \appendtomacro\captionline{: }% had ident, so need a colon between + \fi + % + % caption text. + \appendtomacro\captionline\thiscaption + \fi + % + % If we have anything to print, print it, with space before. + % Eventually this needs to become an \insert. + \ifx\captionline\empty \else + \vskip.5\parskip + \captionline + \fi + % + % If have an xref label, write the list of floats info. Do this + % after the caption, to avoid chance of it being a breakpoint. + \ifx\floatlabel\empty \else + % Write the text that goes in the lof to the aux file as + % \floatlabel-lof. Besides \floatident, we include the short + % caption if specified, else the full caption if specified, else nothing.
+ {% + \atdummies \turnoffactive \otherbackslash + \immediate\write\auxfile{@xrdef{\floatlabel-lof}{% + \floatident + \ifx\thisshortcaption\empty + \ifx\thiscaption\empty \else : \thiscaption \fi + \else + : \thisshortcaption + \fi + }}% + }% + \fi + % + % Space below caption, if we printed anything. + \ifx\printedsomething\empty \else \vskip\parskip \fi + \egroup % end of \vtop + \checkinserts +} + +% Append the tokens #2 to the definition of macro #1, not expanding either. +% +\newtoks\appendtomacroAtoks +\newtoks\appendtomacroBtoks +\def\appendtomacro#1#2{% + \appendtomacroAtoks = \expandafter{#1}% + \appendtomacroBtoks = {#2}% + \edef#1{\the\appendtomacroAtoks \the\appendtomacroBtoks}% +} + +% @caption, @shortcaption are easy. +% +\long\def\caption#1{\checkenv\float \def\thiscaption{#1}} +\def\shortcaption#1{\checkenv\float \def\thisshortcaption{#1}} + +% The parameter is the control sequence identifying the counter we are +% going to use. Create it if it doesn't exist and assign it to \floatno. +\def\getfloatno#1{% + \ifx#1\relax + % Haven't seen this figure type before. + \csname newcount\endcsname #1% + % + % Remember to reset this floatno at the next chap. + \expandafter\gdef\expandafter\resetallfloatnos + \expandafter{\resetallfloatnos #1=0 }% + \fi + \let\floatno#1% +} + +% \setref calls this to get the XREFLABEL-snt value. We want an @xref +% to the FLOATLABEL to expand to "Figure 3.1". We call \setref when we +% first read the @float command. +% +\def\Yfloat{\floattype@tie \chaplevelprefix\the\floatno}% + +% Magic string used for the XREFLABEL-title value, so \xrefX can +% distinguish floats from other xref types. +\def\floatmagic{!!float!!} + +% #1 is the control sequence we are passed; we expand into a conditional +% which is true if #1 represents a float ref. That is, the magic +% \thissection value which we \setref above. +% +\def\iffloat#1{\expandafter\doiffloat#1==\finish} +% +% #1 is (maybe) the \floatmagic string. 
If so, #2 will be the +% (safe) float type for this float. We set \iffloattype to #2. +% +\def\doiffloat#1=#2=#3\finish{% + \def\temp{#1}% + \def\iffloattype{#2}% + \ifx\temp\floatmagic +} + +% @listoffloats FLOATTYPE - print a list of floats like a table of contents. +% +\parseargdef\listoffloats{% + \def\floattype{#1}% floattype + {% + % the floattype might have accents or other special characters, + % but we need to use it in a control sequence name. + \indexnofonts + \turnoffactive + \xdef\safefloattype{\floattype}% + }% + % + % \xrdef saves the floats as a \do-list in \floatlistSAFEFLOATTYPE. + \expandafter\ifx\csname floatlist\safefloattype\endcsname \relax + \ifhavexrefs + % if the user said @listoffloats foo but never @float foo. + \message{\linenumber No `\safefloattype' floats to list.}% + \fi + \else + \begingroup + \leftskip=\tocindent % indent these entries like a toc + \let\do=\listoffloatsdo + \csname floatlist\safefloattype\endcsname + \endgroup + \fi +} + +% This is called on each entry in a list of floats. We're passed the +% xref label, in the form LABEL-title, which is how we save it in the +% aux file. We strip off the -title and look up \XRLABEL-lof, which +% has the text we're supposed to typeset here. +% +% Figures without xref labels will not be included in the list (since +% they won't appear in the aux file). +% +\def\listoffloatsdo#1{\listoffloatsdoentry#1\finish} +\def\listoffloatsdoentry#1-title\finish{{% + % Can't fully expand XR#1-lof because it can contain anything. Just + % pass the control sequence. On the other hand, XR#1-pg is just the + % page number, and we want to fully expand that so we can get a link + % in pdf output. + \toksA = \expandafter{\csname XR#1-lof\endcsname}% + % + % use the same \entry macro we use to generate the TOC and index. + \edef\writeentry{\noexpand\entry{\the\toksA}{\csname XR#1-pg\endcsname}}% + \writeentry +}} + +\message{localization,} +% and i18n. 
+ +% @documentlanguage is usually given very early, just after +% @setfilename. If done too late, it may not override everything +% properly. Single argument is the language abbreviation. +% It would be nice if we could set up a hyphenation file here. +% +\parseargdef\documentlanguage{% + \tex % read txi-??.tex file in plain TeX. + % Read the file if it exists. + \openin 1 txi-#1.tex + \ifeof 1 + \errhelp = \nolanghelp + \errmessage{Cannot read language file txi-#1.tex}% + \else + \input txi-#1.tex + \fi + \closein 1 + \endgroup +} +\newhelp\nolanghelp{The given language definition file cannot be found or +is empty. Maybe you need to install it? In the current directory +should work if nowhere else does.} + + +% @documentencoding should change something in TeX eventually, most +% likely, but for now just recognize it. +\let\documentencoding = \comment + + +% Page size parameters. +% +\newdimen\defaultparindent \defaultparindent = 15pt + +\chapheadingskip = 15pt plus 4pt minus 2pt +\secheadingskip = 12pt plus 3pt minus 2pt +\subsecheadingskip = 9pt plus 2pt minus 2pt + +% Prevent underfull vbox error messages. +\vbadness = 10000 + +% Don't be so finicky about underfull hboxes, either. +\hbadness = 2000 + +% Following George Bush, just get rid of widows and orphans. +\widowpenalty=10000 +\clubpenalty=10000 + +% Use TeX 3.0's \emergencystretch to help line breaking, but if we're +% using an old version of TeX, don't do anything. We want the amount of +% stretch added to depend on the line length, hence the dependence on +% \hsize. We call this whenever the paper size is set. +% +\def\setemergencystretch{% + \ifx\emergencystretch\thisisundefined + % Allow us to assign to \emergencystretch anyway. + \def\emergencystretch{\dimen0}% + \else + \emergencystretch = .15\hsize + \fi +} + +% Parameters in order: 1) textheight; 2) textwidth; 3) voffset; +% 4) hoffset; 5) binding offset; 6) topskip; 7) physical page height; 8) +% physical page width. 
+% +% We also call \setleading{\textleading}, so the caller should define +% \textleading. The caller should also set \parskip. +% +\def\internalpagesizes#1#2#3#4#5#6#7#8{% + \voffset = #3\relax + \topskip = #6\relax + \splittopskip = \topskip + % + \vsize = #1\relax + \advance\vsize by \topskip + \outervsize = \vsize + \advance\outervsize by 2\topandbottommargin + \pageheight = \vsize + % + \hsize = #2\relax + \outerhsize = \hsize + \advance\outerhsize by 0.5in + \pagewidth = \hsize + % + \normaloffset = #4\relax + \bindingoffset = #5\relax + % + \ifpdf + \pdfpageheight #7\relax + \pdfpagewidth #8\relax + \fi + % + \setleading{\textleading} + % + \parindent = \defaultparindent + \setemergencystretch +} + +% @letterpaper (the default). +\def\letterpaper{{\globaldefs = 1 + \parskip = 3pt plus 2pt minus 1pt + \textleading = 13.2pt + % + % If page is nothing but text, make it come out even. + \internalpagesizes{46\baselineskip}{6in}% + {\voffset}{.25in}% + {\bindingoffset}{36pt}% + {11in}{8.5in}% +}} + +% Use @smallbook to reset parameters for 7x9.5 (or so) format. +\def\smallbook{{\globaldefs = 1 + \parskip = 2pt plus 1pt + \textleading = 12pt + % + \internalpagesizes{7.5in}{5in}% + {\voffset}{.25in}% + {\bindingoffset}{16pt}% + {9.25in}{7in}% + % + \lispnarrowing = 0.3in + \tolerance = 700 + \hfuzz = 1pt + \contentsrightmargin = 0pt + \defbodyindent = .5cm +}} + +% Use @afourpaper to print on European A4 paper. +\def\afourpaper{{\globaldefs = 1 + \parskip = 3pt plus 2pt minus 1pt + \textleading = 13.2pt + % + % Double-side printing via postscript on Laserjet 4050 + % prints double-sided nicely when \bindingoffset=10mm and \hoffset=-6mm. + % To change the settings for a different printer or situation, adjust + % \normaloffset until the front-side and back-side texts align. Then + % do the same for \bindingoffset. 
You can set these for testing in + % your texinfo source file like this: + % @tex + % \global\normaloffset = -6mm + % \global\bindingoffset = 10mm + % @end tex + \internalpagesizes{51\baselineskip}{160mm} + {\voffset}{\hoffset}% + {\bindingoffset}{44pt}% + {297mm}{210mm}% + % + \tolerance = 700 + \hfuzz = 1pt + \contentsrightmargin = 0pt + \defbodyindent = 5mm +}} + +% Use @afivepaper to print on European A5 paper. +% From romildo@urano.iceb.ufop.br, 2 July 2000. +% He also recommends making @example and @lisp be small. +\def\afivepaper{{\globaldefs = 1 + \parskip = 2pt plus 1pt minus 0.1pt + \textleading = 12.5pt + % + \internalpagesizes{160mm}{120mm}% + {\voffset}{\hoffset}% + {\bindingoffset}{8pt}% + {210mm}{148mm}% + % + \lispnarrowing = 0.2in + \tolerance = 800 + \hfuzz = 1.2pt + \contentsrightmargin = 0pt + \defbodyindent = 2mm + \tableindent = 12mm +}} + +% A specific text layout, 24x15cm overall, intended for A4 paper. +\def\afourlatex{{\globaldefs = 1 + \afourpaper + \internalpagesizes{237mm}{150mm}% + {\voffset}{4.6mm}% + {\bindingoffset}{7mm}% + {297mm}{210mm}% + % + % Must explicitly reset to 0 because we call \afourpaper. + \globaldefs = 0 +}} + +% Use @afourwide to print on A4 paper in landscape format. +\def\afourwide{{\globaldefs = 1 + \afourpaper + \internalpagesizes{241mm}{165mm}% + {\voffset}{-2.95mm}% + {\bindingoffset}{7mm}% + {297mm}{210mm}% + \globaldefs = 0 +}} + +% @pagesizes TEXTHEIGHT[,TEXTWIDTH] +% Perhaps we should allow setting the margins, \topskip, \parskip, +% and/or leading, also. Or perhaps we should compute them somehow. 
+% +\parseargdef\pagesizes{\pagesizesyyy #1,,\finish} +\def\pagesizesyyy#1,#2,#3\finish{{% + \setbox0 = \hbox{\ignorespaces #2}\ifdim\wd0 > 0pt \hsize=#2\relax \fi + \globaldefs = 1 + % + \parskip = 3pt plus 2pt minus 1pt + \setleading{\textleading}% + % + \dimen0 = #1 + \advance\dimen0 by \voffset + % + \dimen2 = \hsize + \advance\dimen2 by \normaloffset + % + \internalpagesizes{#1}{\hsize}% + {\voffset}{\normaloffset}% + {\bindingoffset}{44pt}% + {\dimen0}{\dimen2}% +}} + +% Set default to letter. +% +\letterpaper + + +\message{and turning on texinfo input format.} + +% Define macros to output various characters with catcode for normal text. +\catcode`\"=\other +\catcode`\~=\other +\catcode`\^=\other +\catcode`\_=\other +\catcode`\|=\other +\catcode`\<=\other +\catcode`\>=\other +\catcode`\+=\other +\catcode`\$=\other +\def\normaldoublequote{"} +\def\normaltilde{~} +\def\normalcaret{^} +\def\normalunderscore{_} +\def\normalverticalbar{|} +\def\normalless{<} +\def\normalgreater{>} +\def\normalplus{+} +\def\normaldollar{$}%$ font-lock fix + +% This macro is used to make a character print one way in \tt +% (where it can probably be output as-is), and another way in other fonts, +% where something hairier probably needs to be done. +% +% #1 is what to print if we are indeed using \tt; #2 is what to print +% otherwise. Since all the Computer Modern typewriter fonts have zero +% interword stretch (and shrink), and it is reasonable to expect all +% typewriter fonts to have this, we can check that font parameter. +% +\def\ifusingtt#1#2{\ifdim \fontdimen3\font=0pt #1\else #2\fi} + +% Same as above, but check for italic font. Actually this also catches +% non-italic slanted fonts since it is impossible to distinguish them from +% italic fonts. But since this is only used by $ and it uses \sl anyway +% this is not a problem. 
+\def\ifusingit#1#2{\ifdim \fontdimen1\font>0pt #1\else #2\fi} + +% Turn off all special characters except @ +% (and those which the user can use as if they were ordinary). +% Most of these we simply print from the \tt font, but for some, we can +% use math or other variants that look better in normal text. + +\catcode`\"=\active +\def\activedoublequote{{\tt\char34}} +\let"=\activedoublequote +\catcode`\~=\active +\def~{{\tt\char126}} +\chardef\hat=`\^ +\catcode`\^=\active +\def^{{\tt \hat}} + +\catcode`\_=\active +\def_{\ifusingtt\normalunderscore\_} +% Subroutine for the previous macro. +\def\_{\leavevmode \kern.07em \vbox{\hrule width.3em height.1ex}\kern .07em } + +\catcode`\|=\active +\def|{{\tt\char124}} +\chardef \less=`\< +\catcode`\<=\active +\def<{{\tt \less}} +\chardef \gtr=`\> +\catcode`\>=\active +\def>{{\tt \gtr}} +\catcode`\+=\active +\def+{{\tt \char 43}} +\catcode`\$=\active +\def${\ifusingit{{\sl\$}}\normaldollar}%$ font-lock fix + +% If a .fmt file is being used, characters that might appear in a file +% name cannot be active until we have parsed the command line. +% So turn them off again, and have \everyjob (or @setfilename) turn them on. +% \otherifyactive is called near the end of this file. +\def\otherifyactive{\catcode`+=\other \catcode`\_=\other} + +\catcode`\@=0 + +% \backslashcurfont outputs one backslash character in current font, +% as in \char`\\. +\global\chardef\backslashcurfont=`\\ +\global\let\rawbackslashxx=\backslashcurfont % let existing .??s files work + +% \rawbackslash defines an active \ to do \backslashcurfont. +% \otherbackslash defines an active \ to be a literal `\' character with +% catcode other. +{\catcode`\\=\active + @gdef@rawbackslash{@let\=@backslashcurfont} + @gdef@otherbackslash{@let\=@realbackslash} +} + +% \realbackslash is an actual character `\' with catcode other. +{\catcode`\\=\other @gdef@realbackslash{\}} + +% \normalbackslash outputs one backslash in fixed width font. 
+\def\normalbackslash{{\tt\backslashcurfont}} + +\catcode`\\=\active + +% Used sometimes to turn off (effectively) the active characters +% even after parsing them. +@def@turnoffactive{% + @let"=@normaldoublequote + @let\=@realbackslash + @let~=@normaltilde + @let^=@normalcaret + @let_=@normalunderscore + @let|=@normalverticalbar + @let<=@normalless + @let>=@normalgreater + @let+=@normalplus + @let$=@normaldollar %$ font-lock fix + @unsepspaces +} + +% Same as @turnoffactive except outputs \ as {\tt\char`\\} instead of +% the literal character `\'. (Thus, \ is not expandable when this is in +% effect.) +% +@def@normalturnoffactive{@turnoffactive @let\=@normalbackslash} + +% Make _ and + \other characters, temporarily. +% This is canceled by @fixbackslash. +@otherifyactive + +% If a .fmt file is being used, we don't want the `\input texinfo' to show up. +% That is what \eatinput is for; after that, the `\' should revert to printing +% a backslash. +% +@gdef@eatinput input texinfo{@fixbackslash} +@global@let\ = @eatinput + +% On the other hand, perhaps the file did not have a `\input texinfo'. Then +% the first `\{ in the file would cause an error. This macro tries to fix +% that, assuming it is called before the first `\' could plausibly occur. +% Also back turn on active characters that might appear in the input +% file name, in case not using a pre-dumped format. +% +@gdef@fixbackslash{% + @ifx\@eatinput @let\ = @normalbackslash @fi + @catcode`+=@active + @catcode`@_=@active +} + +% Say @foo, not \foo, in error messages. +@escapechar = `@@ + +% These look ok in all fonts, so just make them not special. 
+@catcode`@& = @other +@catcode`@# = @other +@catcode`@% = @other + + +@c Local variables: +@c eval: (add-hook 'write-file-hooks 'time-stamp) +@c page-delimiter: "^\\\\message" +@c time-stamp-start: "def\\\\texinfoversion{" +@c time-stamp-format: "%:y-%02m-%02d.%02H" +@c time-stamp-end: "}" +@c End: + +@c vim:sw=2: + +@ignore + arch-tag: e1b36e32-c96e-4135-a41a-0b2efa2ea115 +@end ignore diff --git a/doc/threads.texi b/doc/threads.texi new file mode 100644 index 0000000..1deba9f --- /dev/null +++ b/doc/threads.texi @@ -0,0 +1,872 @@ +@node Task 1--Threads +@chapter Task 1: Threads + +In this assignment, we give you a minimally functional thread system. +Your job is to extend the functionality of this system to gain a +better understanding of synchronization problems. + +You will be working primarily in the @file{threads} directory for +this assignment, with some work in the @file{devices} directory on the +side. Compilation should be done in the @file{threads} directory. + +Before you read the description of this task, you should read all of +the following sections: @ref{Introduction}, @ref{Coding Standards}, +@ref{Debugging Tools}, and @ref{Development Tools}. You should at least +skim the material from @ref{Pintos Loading} through @ref{Memory +Allocation}, especially @ref{Synchronization}. To complete this task +you will also need to read @ref{4.4BSD Scheduler}. + +@menu +* Task 1 Background:: +* Task 1 Requirements:: +* Task 1 FAQ:: +@end menu + +@node Task 1 Background +@section Background + + +@menu +* Understanding Threads:: +* Task 1 Source Files:: +* Task 1 Synchronization:: +* Development Suggestions:: +@end menu + +@node Understanding Threads +@subsection Understanding Threads + +The first step is to read and understand the code for the initial thread +system. 
+Pintos already implements thread creation and thread completion, +a simple scheduler to switch between threads, and synchronization +primitives (semaphores, locks, condition variables, and optimization +barriers). + +Some of this code might seem slightly mysterious. If +you haven't already compiled and run the base system, as described in +the introduction (@pxref{Introduction}), you should do so now. You +can read through parts of the source code to see what's going +on. If you like, you can add calls to @func{printf} almost +anywhere, then recompile and run to see what happens and in what +order. You can also run the kernel in a debugger and set breakpoints +at interesting spots, single-step through code and examine data, and +so on. + +When a thread is created, you are creating a new context to be +scheduled. You provide a function to be run in this context as an +argument to @func{thread_create}. The first time the thread is +scheduled and runs, it starts from the beginning of that function +and executes in that context. When the function returns, the thread +terminates. Each thread, therefore, acts like a mini-program running +inside Pintos, with the function passed to @func{thread_create} +acting like @func{main}. + +At any given time, exactly one thread runs and the rest, if any, +become inactive. The scheduler decides which thread to +run next. (If no thread is ready to run +at any given time, then the special ``idle'' thread, implemented in +@func{idle}, runs.) +Synchronization primitives can force context switches when one +thread needs to wait for another thread to do something. + +The mechanics of a context switch are +in @file{threads/switch.S}, which is 80@var{x}86 +assembly code. (You don't have to understand it.) It saves the +state of the currently running thread and restores the state of the +thread we're switching to. + +Using the GDB debugger, slowly trace through a context +switch to see what happens (@pxref{GDB}). 
You can set a +breakpoint on @func{schedule} to start out, and then +single-step from there.@footnote{GDB might tell you that +@func{schedule} doesn't exist, which is arguably a GDB bug. +You can work around this by setting the breakpoint by filename and +line number, e.g.@: @code{break thread.c:@var{ln}} where @var{ln} is +the line number of the first declaration in @func{schedule}.} Be sure +to keep track of each thread's address +and state, and what procedures are on the call stack for each thread. +You will notice that when one thread calls @func{switch_threads}, +another thread starts running, and the first thing the new thread does +is to return from @func{switch_threads}. You will understand the thread +system once you understand why and how the @func{switch_threads} that +gets called is different from the @func{switch_threads} that returns. +@xref{Thread Switching}, for more information. + +@strong{Warning}: In Pintos, each thread is assigned a small, +fixed-size execution stack just under @w{4 kB} in size. The kernel +tries to detect stack overflow, but it cannot do so perfectly. You +may cause bizarre problems, such as mysterious kernel panics, if you +declare large data structures as non-static local variables, +e.g.@: @samp{int buf[1000];}. Alternatives to stack allocation include +the page allocator and the block allocator (@pxref{Memory Allocation}). + +@node Task 1 Source Files +@subsection Source Files + +Here is a brief overview of the files in the @file{threads} +directory. You will not need to modify most of this code, but the +hope is that presenting this overview will give you a start on what +code to look at. + +@table @file +@item loader.S +@itemx loader.h +The kernel loader. Assembles to 512 bytes of code and data that the +PC BIOS loads into memory and which in turn finds the kernel on disk, +loads it into memory, and jumps to @func{start} in @file{start.S}. +@xref{Pintos Loader}, for details.
You should not need to look at +this code or modify it. + +@item start.S +Does basic setup needed for memory protection and 32-bit +operation on 80@var{x}86 CPUs. Unlike the loader, this code is +actually part of the kernel. @xref{Low-Level Kernel Initialization}, +for details. + +@item kernel.lds.S +The linker script used to link the kernel. Sets the load address of +the kernel and arranges for @file{start.S} to be near the beginning +of the kernel image. @xref{Pintos Loader}, for details. Again, you +should not need to look at this code +or modify it, but it's here in case you're curious. + +@item init.c +@itemx init.h +Kernel initialization, including @func{main}, the kernel's ``main +program.'' You should look over @func{main} at least to see what +gets initialized. You might want to add your own initialization code +here. @xref{High-Level Kernel Initialization}, for details. + +@item thread.c +@itemx thread.h +Basic thread support. Much of your work will take place in these files. +@file{thread.h} defines @struct{thread}, which you are likely to modify +in all four tasks. See @ref{struct thread} and @ref{Threads} for +more information. + +@item switch.S +@itemx switch.h +Assembly language routine for switching threads. Already discussed +above. @xref{Thread Functions}, for more information. + +@item palloc.c +@itemx palloc.h +Page allocator, which hands out system memory in multiples of 4 kB +pages. @xref{Page Allocator}, for more information. + +@item malloc.c +@itemx malloc.h +A simple implementation of @func{malloc} and @func{free} for +the kernel. @xref{Block Allocator}, for more information. + +@item interrupt.c +@itemx interrupt.h +Basic interrupt handling and functions for turning interrupts on and +off. @xref{Interrupt Handling}, for more information. + +@item intr-stubs.S +@itemx intr-stubs.h +Assembly code for low-level interrupt handling. @xref{Interrupt +Infrastructure}, for more information. 
+ +@item synch.c +@itemx synch.h +Basic synchronization primitives: semaphores, locks, condition +variables, and optimization barriers. You will need to use these for +synchronization in all +four tasks. @xref{Synchronization}, for more information. + +@item io.h +Functions for I/O port access. This is mostly used by source code in +the @file{devices} directory that you won't have to touch. + +@item vaddr.h +@itemx pte.h +Functions and macros for working with virtual addresses and page table +entries. These will be more important to you in task 3. For now, +you can ignore them. + +@item flags.h +Macros that define a few bits in the 80@var{x}86 ``flags'' register. +Probably of no interest. See @bibref{IA32-v1}, section 3.4.3, ``EFLAGS +Register,'' for more information. +@end table + +@menu +* devices code:: +* lib files:: +@end menu + +@node devices code +@subsubsection @file{devices} code + +The basic threaded kernel also includes these files in the +@file{devices} directory: + +@table @file +@item timer.c +@itemx timer.h +System timer that ticks, by default, 100 times per second. You will +modify this code in this task. + +@item vga.c +@itemx vga.h +VGA display driver. Responsible for writing text to the screen. +You should have no need to look at this code. @func{printf} +calls into the VGA display driver for you, so there's little reason to +call this code yourself. + +@item serial.c +@itemx serial.h +Serial port driver. Again, @func{printf} calls this code for you, +so you don't need to do so yourself. +It handles serial input by passing it to the input layer (see below). + +@item block.c +@itemx block.h +An abstraction layer for @dfn{block devices}, that is, random-access, +disk-like devices that are organized as arrays of fixed-size blocks. +Out of the box, Pintos supports two types of block devices: IDE disks +and partitions. Block devices, regardless of type, won't actually be +used until task 2. 
+ +@item ide.c +@itemx ide.h +Supports reading and writing sectors on up to 4 IDE disks. + +@item partition.c +@itemx partition.h +Understands the structure of partitions on disks, allowing a single +disk to be carved up into multiple regions (partitions) for +independent use. + +@item kbd.c +@itemx kbd.h +Keyboard driver. Handles keystrokes, passing them to the input layer +(see below). + +@item input.c +@itemx input.h +Input layer. Queues input characters passed along by the keyboard or +serial drivers. + +@item intq.c +@itemx intq.h +Interrupt queue, for managing a circular queue that both kernel +threads and interrupt handlers want to access. Used by the keyboard +and serial drivers. + +@item rtc.c +@itemx rtc.h +Real-time clock driver, to enable the kernel to determine the current +date and time. By default, this is only used by @file{threads/init.c} +to choose an initial seed for the random number generator. + +@item speaker.c +@itemx speaker.h +Driver that can produce tones on the PC speaker. + +@item pit.c +@itemx pit.h +Code to configure the 8254 Programmable Interval Timer. This code is +used by both @file{devices/timer.c} and @file{devices/speaker.c} +because each device uses one of the PIT's output channels. +@end table + +@node lib files +@subsubsection @file{lib} files + +Finally, @file{lib} and @file{lib/kernel} contain useful library +routines. (@file{lib/user} will be used by user programs, starting in +task 2, but it is not part of the kernel.) Here are a few more +details: + +@table @file +@item ctype.h +@itemx inttypes.h +@itemx limits.h +@itemx stdarg.h +@itemx stdbool.h +@itemx stddef.h +@itemx stdint.h +@itemx stdio.c +@itemx stdio.h +@itemx stdlib.c +@itemx stdlib.h +@itemx string.c +@itemx string.h +A subset of the standard C library. @xref{C99}, for +information +on a few recently introduced pieces of the C library that you might +not have encountered before.
@xref{Unsafe String Functions}, for +information on what's been intentionally left out for safety. + +@item debug.c +@itemx debug.h +Functions and macros to aid debugging. @xref{Debugging Tools}, for +more information. + +@item random.c +@itemx random.h +Pseudo-random number generator. The actual sequence of random values +may vary from one Pintos run to another. + +@item round.h +Macros for rounding. + +@item syscall-nr.h +System call numbers. Not used until task 2. + +@item kernel/list.c +@itemx kernel/list.h +Doubly linked list implementation. Used all over the Pintos code, and +you'll probably want to use it in a few places yourself in task 1. + +@item kernel/bitmap.c +@itemx kernel/bitmap.h +Bitmap implementation. You can use this in your code if you like, but +you probably won't have any need for it in task 1. + +@item kernel/hash.c +@itemx kernel/hash.h +Hash table implementation. Likely to come in handy for task 3. + +@item kernel/console.c +@itemx kernel/console.h +@itemx kernel/stdio.h +Implements @func{printf} and a few other functions. +@end table + +@node Task 1 Synchronization +@subsection Synchronization + +Proper synchronization is an important part of the solutions to these +problems. Any synchronization problem can be easily solved by turning +interrupts off: while interrupts are off, there is no concurrency, so +there's no possibility of race conditions. Therefore, it's tempting to +solve all synchronization problems this way, but @strong{don't}. +Instead, use semaphores, locks, and condition variables to solve the +bulk of your synchronization problems. Read the tour section on +synchronization (@pxref{Synchronization}) or the comments in +@file{threads/synch.c} if you're unsure what synchronization primitives +may be used in what situations. + +In the Pintos tasks, the only class of problem best solved by +disabling interrupts is coordinating data shared between a kernel thread +and an interrupt handler.
Because interrupt handlers can't sleep, they +can't acquire locks. This means that data shared between kernel threads +and an interrupt handler must be protected within a kernel thread by +turning off interrupts. + +This task only requires accessing a little bit of thread state from +interrupt handlers. For the alarm clock, the timer interrupt needs to +wake up sleeping threads. In the advanced scheduler, the timer +interrupt needs to access a few global and per-thread variables. When +you access these variables from kernel threads, you will need to disable +interrupts to prevent the timer interrupt from interfering. + +When you do turn off interrupts, take care to do so for the least amount +of code possible, or you can end up losing important things such as +timer ticks or input events. Turning off interrupts also increases the +interrupt handling latency, which can make a machine feel sluggish if +taken too far. + +The synchronization primitives themselves in @file{synch.c} are +implemented by disabling interrupts. You may need to increase the +amount of code that runs with interrupts disabled here, but you should +still try to keep it to a minimum. + +Disabling interrupts can be useful for debugging, if you want to make +sure that a section of code is not interrupted. You should remove +debugging code before turning in your task. (Don't just comment it +out, because that can make the code difficult to read.) + +There should be no busy waiting in your submission. A tight loop that +calls @func{thread_yield} is one form of busy waiting. + +@node Development Suggestions +@subsection Development Suggestions + +In the past, many groups divided the assignment into pieces, then each +group member worked on his or her piece until just before the +deadline, at which time the group reconvened to combine their code and +submit. @strong{This is a bad idea. 
We do not recommend this +approach.} Groups that do this often find that two changes conflict +with each other, requiring lots of last-minute debugging. Some groups +who have done this have turned in code that did not even compile or +boot, much less pass any tests. + +@localgitpolicy{} + +You should expect to run into bugs that you simply don't understand +while working on this and subsequent tasks. When you do, +reread the appendix on debugging tools, which is filled with +useful debugging tips that should help you to get back up to speed +(@pxref{Debugging Tools}). Be sure to read the section on backtraces +(@pxref{Backtraces}), which will help you to get the most out of every +kernel panic or assertion failure. + +@node Task 1 Requirements +@section Requirements + +@menu +* Task 1 Design Document:: +* Alarm Clock:: +* Priority Scheduling:: +* Advanced Scheduler:: +@end menu + +@node Task 1 Design Document +@subsection Design Document + +Before you turn in your task, you must copy @uref{threads.tmpl, , the +task 1 design document template} into your source tree under the name +@file{pintos-ic/src/threads/DESIGNDOC} and fill it in. We recommend that +you read the design document template before you start working on the +task. @xref{Task Documentation}, for a sample design document +that goes along with a fictitious task. + +@node Alarm Clock +@subsection Alarm Clock + +Reimplement @func{timer_sleep}, defined in @file{devices/timer.c}. +Although a working implementation is provided, it ``busy waits,'' that +is, it spins in a loop checking the current time and calling +@func{thread_yield} until enough time has gone by. Reimplement it to +avoid busy waiting. + +@deftypefun void timer_sleep (int64_t @var{ticks}) +Suspends execution of the calling thread until time has advanced by at +least @w{@var{ticks} timer ticks}. Unless the system is otherwise idle, the +thread need not wake up after exactly @var{ticks} ticks.
Just put it on +the ready queue after it has waited for the right amount of time. + +@func{timer_sleep} is useful for threads that operate in real time, +e.g.@: for blinking the cursor once per second. + +The argument to @func{timer_sleep} is expressed in timer ticks, not in +milliseconds or any other unit. There are @code{TIMER_FREQ} timer +ticks per second, where @code{TIMER_FREQ} is a macro defined in +@file{devices/timer.h}. The default value is 100. We don't recommend +changing this value, because any change is likely to cause many of +the tests to fail. +@end deftypefun + +Separate functions @func{timer_msleep}, @func{timer_usleep}, and +@func{timer_nsleep} do exist for sleeping a specific number of +milliseconds, microseconds, or nanoseconds, respectively, but these will +call @func{timer_sleep} automatically when necessary. You do not need +to modify them. + +If your delays seem too short or too long, reread the explanation of the +@option{-r} option to @command{pintos} (@pxref{Debugging versus +Testing}). + +The alarm clock implementation is not needed for later tasks. + +@node Priority Scheduling +@subsection Priority Scheduling + +Implement priority scheduling in Pintos. +When a thread is added to the ready list that has a higher priority +than the currently running thread, the current thread should +immediately yield the processor to the new thread. Similarly, when +threads are waiting for a lock, semaphore, or condition variable, the +highest priority waiting thread should be awakened first. A thread +may raise or lower its own priority at any time, but lowering its +priority such that it no longer has the highest priority must cause it +to immediately yield the CPU. In both the priority scheduler and the +advanced scheduler you will write later, the running thread should +be that with the highest priority. + +Thread priorities range from @code{PRI_MIN} (0) to @code{PRI_MAX} (63).
+Lower numbers correspond to lower priorities, so that priority 0 +is the lowest priority and priority 63 is the highest. +The initial thread priority is passed as an argument to +@func{thread_create}. If there's no reason to choose another +priority, use @code{PRI_DEFAULT} (31). The @code{PRI_} macros are +defined in @file{threads/thread.h}, and you should not change their +values. + +@subsection Priority Donation + +One issue with priority scheduling is ``priority inversion''. Consider +high, medium, and low priority threads @var{H}, @var{M}, and @var{L}, +respectively. If @var{H} needs to wait for @var{L} (for instance, for a +lock held by @var{L}), and @var{M} is on the ready list, then @var{H} +will never get the CPU because the low priority thread will not get any +CPU time. A partial fix for this problem is for @var{H} to ``donate'' +its priority to @var{L} while @var{L} is holding the lock, then recall +the donation once @var{L} releases (and thus @var{H} acquires) the lock. + +Implement priority donation. You will need to account for all different +situations in which priority donation is required. Be sure to handle +multiple donations, in which multiple priorities are donated to a single +thread. You must also handle nested donation: if @var{H} is waiting on +a lock that @var{M} holds and @var{M} is waiting on a lock that @var{L} +holds, then both @var{M} and @var{L} should be boosted to @var{H}'s +priority. If necessary, you may impose a reasonable limit on depth of +nested priority donation, such as 8 levels. + +You must implement priority donation for locks. You need not +implement priority donation for the other Pintos synchronization +constructs. You do need to implement priority scheduling in all +cases. + +Finally, implement the following functions that allow a thread to +examine and modify its own priority. Skeletons for these functions are +provided in @file{threads/thread.c}. 
+ +@deftypefun void thread_set_priority (int @var{new_priority}) +Sets the current thread's priority to @var{new_priority}. If the +current thread no longer has the highest priority, yields. +@end deftypefun + +@deftypefun int thread_get_priority (void) +Returns the current thread's priority. In the presence of priority +donation, returns the higher (donated) priority. +@end deftypefun + +You need not provide any interface to allow a thread to directly modify +other threads' priorities. + +The priority scheduler is not used in any later task. + +@node Advanced Scheduler +@subsection Advanced Scheduler + +Implement a multilevel feedback queue scheduler similar to the +4.4@acronym{BSD} scheduler to +reduce the average response time for running jobs on your system. +@xref{4.4BSD Scheduler}, for detailed requirements. + +Like the priority scheduler, the advanced scheduler chooses the thread +to run based on priorities. However, the advanced scheduler does not do +priority donation. Thus, we recommend that you have the priority +scheduler working, except possibly for priority donation, before you +start work on the advanced scheduler. + +You must write your code to allow us to choose a scheduling algorithm +policy at Pintos startup time. By default, the priority scheduler +must be active, but we must be able to choose the 4.4@acronym{BSD} +scheduler +with the @option{-mlfqs} kernel option. Passing this +option sets @code{thread_mlfqs}, declared in @file{threads/thread.h}, to +true when the options are parsed by @func{parse_options}, which happens +early in @func{main}. + +When the 4.4@acronym{BSD} scheduler is enabled, threads no longer +directly control their own priorities. The @var{priority} argument to +@func{thread_create} should be ignored, as well as any calls to +@func{thread_set_priority}, and @func{thread_get_priority} should return +the thread's current priority as set by the scheduler. + +The advanced scheduler is not used in any later task. 
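As a rough illustration of the rule that priority requests are ignored under the 4.4@acronym{BSD} scheduler, the following standalone C sketch shows one way a setter could behave. It is not Pintos code: @code{struct toy_thread}, @code{toy_set_priority}, and @code{toy_get_priority} are hypothetical names invented for this example, and the local @code{thread_mlfqs} variable merely stands in for the flag declared in @file{threads/thread.h}.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define PRI_MIN 0   /* Lowest priority, as described in the text. */
#define PRI_MAX 63  /* Highest priority. */

/* Stand-in for the flag that -mlfqs sets; local to this sketch. */
static bool thread_mlfqs = false;

/* Hypothetical, stripped-down thread record. */
struct toy_thread
  {
    int priority;
  };

/* Sets T's priority unless the advanced scheduler is active,
   in which case the request is silently ignored. */
static void
toy_set_priority (struct toy_thread *t, int new_priority)
{
  assert (t != NULL);
  if (thread_mlfqs)
    return;
  assert (new_priority >= PRI_MIN && new_priority <= PRI_MAX);
  t->priority = new_priority;
}

/* Returns T's current priority, whoever set it. */
static int
toy_get_priority (const struct toy_thread *t)
{
  assert (t != NULL);
  return t->priority;
}
```

In a real kernel the setter would also have to consider yielding the CPU and any donated priorities; the sketch leaves those out on purpose.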
+ +@node Task 1 FAQ +@section FAQ + +@table @b +@item How much code will I need to write? + +Here's a summary of our reference solution, produced by the +@command{diffstat} program. The final row gives total lines inserted +and deleted; a changed line counts as both an insertion and a deletion. + +The reference solution represents just one possible solution. Many +other solutions are also possible and many of those differ greatly from +the reference solution. Some excellent solutions may not modify all the +files modified by the reference solution, and some may modify files not +modified by the reference solution. + +@verbatim + devices/timer.c | 42 +++++- + threads/fixed-point.h | 120 ++++++++++++++++++ + threads/synch.c | 88 ++++++++++++- + threads/thread.c | 196 ++++++++++++++++++++++++++---- + threads/thread.h | 23 +++ + 5 files changed, 440 insertions(+), 29 deletions(-) +@end verbatim + +@file{fixed-point.h} is a new file added by the reference solution. + +@item How do I update the @file{Makefile}s when I add a new source file? + +@anchor{Adding Source Files} +To add a @file{.c} file, edit the top-level @file{Makefile.build}. +Add the new file to variable @samp{@var{dir}_SRC}, where +@var{dir} is the directory where you added the file. For this +task, that means you should add it to @code{threads_SRC} or +@code{devices_SRC}. Then run @code{make}. If your new file +doesn't get +compiled, run @code{make clean} and then try again. + +When you modify the top-level @file{Makefile.build} and re-run +@command{make}, the modified +version should be automatically copied to +@file{threads/build/Makefile}. The converse is +not true, so any changes will be lost the next time you run @code{make +clean} from the @file{threads} directory. Unless your changes are +truly temporary, you should prefer to edit @file{Makefile.build}. + +A new @file{.h} file does not require editing the @file{Makefile}s. 
+ +@item What does @code{warning: no previous prototype for `@var{func}'} mean? + +It means that you defined a non-@code{static} function without +preceding it by a prototype. Because non-@code{static} functions are +intended for use by other @file{.c} files, for safety they should be +prototyped in a header file included before their definition. To fix +the problem, add a prototype in a header file that you include, or, if +the function isn't actually used by other @file{.c} files, make it +@code{static}. + +@item What is the interval between timer interrupts? + +Timer interrupts occur @code{TIMER_FREQ} times per second. You can +adjust this value by editing @file{devices/timer.h}. The default is +100 Hz. + +We don't recommend changing this value, because any changes are likely +to cause many of the tests to fail. + +@item How long is a time slice? + +There are @code{TIME_SLICE} ticks per time slice. This macro is +declared in @file{threads/thread.c}. The default is 4 ticks. + +We don't recommend changing this value, because any changes are likely +to cause many of the tests to fail. + +@item How do I run the tests? + +@xref{Testing}. + +@item Why do I get a test failure in @func{pass}? + +@anchor{The pass function fails} +You are probably looking at a backtrace that looks something like this: + +@example +0xc0108810: debug_panic (lib/kernel/debug.c:32) +0xc010a99f: pass (tests/threads/tests.c:93) +0xc010bdd3: test_mlfqs_load_1 (...threads/mlfqs-load-1.c:33) +0xc010a8cf: run_test (tests/threads/tests.c:51) +0xc0100452: run_task (threads/init.c:283) +0xc0100536: run_actions (threads/init.c:333) +0xc01000bb: main (threads/init.c:137) +@end example + +This is just confusing output from the @command{backtrace} program. It +does not actually mean that @func{pass} called @func{debug_panic}. In +fact, @func{fail} called @func{debug_panic} (via the @func{PANIC} +macro). 
GCC knows that @func{debug_panic} does not return, because it +is declared @code{NO_RETURN} (@pxref{Function and Parameter +Attributes}), so it doesn't include any code in @func{fail} to take +control when @func{debug_panic} returns. This means that the return +address on the stack looks like it is at the beginning of the function +that happens to follow @func{fail} in memory, which in this case happens +to be @func{pass}. + +@xref{Backtraces}, for more information. + +@item How do interrupts get re-enabled in the new thread following @func{schedule}? + +Every path into @func{schedule} disables interrupts. They eventually +get re-enabled by the next thread to be scheduled. Consider the +possibilities: the new thread is running in @func{switch_thread} (but +see below), which is called by @func{schedule}, which is called by one +of a few possible functions: + +@itemize @bullet +@item +@func{thread_exit}, but we'll never switch back into such a thread, so +it's uninteresting. + +@item +@func{thread_yield}, which immediately restores the interrupt level upon +return from @func{schedule}. + +@item +@func{thread_block}, which is called from multiple places: + +@itemize @minus +@item +@func{sema_down}, which restores the interrupt level before returning. + +@item +@func{idle}, which enables interrupts with an explicit assembly STI +instruction. + +@item +@func{wait} in @file{devices/intq.c}, whose callers are responsible for +re-enabling interrupts. +@end itemize +@end itemize + +There is a special case when a newly created thread runs for the first +time. Such a thread calls @func{intr_enable} as the first action in +@func{kernel_thread}, which is at the bottom of the call stack for every +kernel thread but the first. +@end table + +@menu +* Alarm Clock FAQ:: +* Priority Scheduling FAQ:: +* Advanced Scheduler FAQ:: +@end menu + +@node Alarm Clock FAQ +@subsection Alarm Clock FAQ + +@table @b +@item Do I need to account for timer values overflowing? 
+ +Don't worry about the possibility of timer values overflowing. Timer +values are expressed as signed 64-bit numbers, which at 100 ticks per +second should be good for almost 2,924,712,087 years. By then, we +expect Pintos to have been phased out of the @value{coursenumber} curriculum. +@end table + +@node Priority Scheduling FAQ +@subsection Priority Scheduling FAQ + +@table @b +@item Doesn't priority scheduling lead to starvation? + +Yes, strict priority scheduling can lead to starvation +because a thread will not run if any higher-priority thread is runnable. +The advanced scheduler introduces a mechanism for dynamically +changing thread priorities. + +Strict priority scheduling is valuable in real-time systems because it +offers the programmer more control over which jobs get processing +time. High priorities are generally reserved for time-critical +tasks. It's not ``fair,'' but it addresses other concerns not +applicable to a general-purpose operating system. + +@item What thread should run after a lock has been released? + +When a lock is released, the highest priority thread waiting for that +lock should be unblocked and put on the list of ready threads. The +scheduler should then run the highest priority thread on the ready +list. + +@item If the highest-priority thread yields, does it continue running? + +Yes. If there is a single highest-priority thread, it continues +running until it blocks or finishes, even if it calls +@func{thread_yield}. +If multiple threads have the same highest priority, +@func{thread_yield} should switch among them in ``round robin'' order. + +@item What happens to the priority of a donating thread? + +Priority donation only changes the priority of the donee +thread. The donor thread's priority is unchanged. +Priority donation is not additive: if thread @var{A} (with priority 5) donates +to thread @var{B} (with priority 3), then @var{B}'s new priority is 5, not 8. 
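The non-additive rule can be sketched in a few lines of standalone C. This is only an illustration under stated assumptions, not the reference solution: @code{struct toy_thread}, its fields, the @code{MAX_DONORS} bound, and @code{toy_effective_priority} are hypothetical names made up for this example. The effective priority is taken as the maximum of the base priority and any donated priorities, never their sum.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_DONORS 8  /* Hypothetical bound, chosen only for this sketch. */

/* Hypothetical thread record: a base priority plus the priorities
   currently donated to this thread. */
struct toy_thread
  {
    int base_priority;
    int donated[MAX_DONORS];
    size_t donor_count;
  };

/* Returns the larger of T's base priority and its highest donation.
   Donations are compared, not added. */
static int
toy_effective_priority (const struct toy_thread *t)
{
  assert (t != NULL && t->donor_count <= MAX_DONORS);
  int best = t->base_priority;
  for (size_t i = 0; i < t->donor_count; i++)
    if (t->donated[i] > best)
      best = t->donated[i];
  return best;
}
```

With a base priority of 3 and a single donation of 5, the function returns 5; once the donation is withdrawn (@code{donor_count} back to 0), it returns 3 again.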
+ +@item Can a thread's priority change while it is on the ready queue? + +Yes. Consider a ready, low-priority thread @var{L} that holds a lock. +High-priority thread @var{H} attempts to acquire the lock and blocks, +thereby donating its priority to ready thread @var{L}. + +@item Can a thread's priority change while it is blocked? + +Yes. While a thread that has acquired lock @var{L} is blocked for any +reason, its priority can increase by priority donation if a +higher-priority thread attempts to acquire @var{L}. This case is +checked by the @code{priority-donate-sema} test. + +@item Can a thread added to the ready list preempt the processor? + +Yes. If a thread added to the ready list has higher priority than the +running thread, the correct behavior is to immediately yield the +processor. It is not acceptable to wait for the next timer interrupt. +The highest priority thread should run as soon as it is runnable, +preempting whatever thread is currently running. + +@item How does @func{thread_set_priority} affect a thread receiving donations? + +It sets the thread's base priority. The thread's effective priority +becomes the higher of the newly set priority or the highest donated +priority. When the donations are released, the thread's priority +becomes the one set through the function call. This behavior is checked +by the @code{priority-donate-lower} test. + +@item Doubled test names in output make them fail. + +Suppose you are seeing output in which some test names are doubled, +like this: + +@example +(alarm-priority) begin +(alarm-priority) (alarm-priority) Thread priority 30 woke up. +Thread priority 29 woke up. +(alarm-priority) Thread priority 28 woke up. +@end example + +What is happening is that output from two threads is being +interleaved. 
That is, one thread is printing @code{"(alarm-priority) +Thread priority 29 woke up.\n"} and another thread is printing +@code{"(alarm-priority) Thread priority 30 woke up.\n"}, but the first +thread is being preempted by the second in the middle of its output. + +This problem indicates a bug in your priority scheduler. After all, a +thread with priority 29 should not be able to run while a thread with +priority 30 has work to do. + +Normally, the implementation of the @code{printf()} function in the +Pintos kernel attempts to prevent such interleaved output by acquiring +a console lock during the duration of the @code{printf} call and +releasing it afterwards. However, the output of the test name, +e.g., @code{(alarm-priority)}, and the message following it is output +using two calls to @code{printf}, resulting in the console lock being +acquired and released twice. +@end table + +@node Advanced Scheduler FAQ +@subsection Advanced Scheduler FAQ + +@table @b +@item How does priority donation interact with the advanced scheduler? + +It doesn't have to. We won't test priority donation and the advanced +scheduler at the same time. + +@item Can I use one queue instead of 64 queues? + +Yes. In general, your implementation may differ from the description, +as long as its behavior is the same. + +@item Some scheduler tests fail and I don't understand why. Help! + +If your implementation mysteriously fails some of the advanced +scheduler tests, try the following: + +@itemize +@item +Read the source files for the tests that you're failing, to make sure +that you understand what's going on. Each one has a comment at the +top that explains its purpose and expected results. + +@item +Double-check your fixed-point arithmetic routines and your use of them +in the scheduler routines. + +@item +Consider how much work your implementation does in the timer +interrupt. 
If the timer interrupt handler takes too long, then it +will take away most of a timer tick from the thread that the timer +interrupt preempted. When it returns control to that thread, it +therefore won't get to do much work before the next timer interrupt +arrives. That thread will therefore get blamed for a lot more CPU +time than it actually got a chance to use. This raises the +interrupted thread's recent CPU count, thereby lowering its priority. +It can cause scheduling decisions to change. It also raises the load +average. +@end itemize +@end table diff --git a/doc/threads.tmpl b/doc/threads.tmpl new file mode 100644 index 0000000..065894a --- /dev/null +++ b/doc/threads.tmpl @@ -0,0 +1,162 @@ + +-------------------+ + | OS 211 | + | TASK 1: THREADS | + | DESIGN DOCUMENT | + +-------------------+ + +---- GROUP ---- + +>> Fill in the names and email addresses of your group members. + +FirstName LastName +FirstName LastName +FirstName LastName + +---- PRELIMINARIES ---- + +>> If you have any preliminary comments on your submission, notes for the +>> TAs, or extra credit, please give them here. + +>> Please cite any offline or online sources you consulted while +>> preparing your submission, other than the Pintos documentation, course +>> text, lecture notes, and course staff. + + ALARM CLOCK + =========== + +---- DATA STRUCTURES ---- + +>> A1: Copy here the declaration of each new or changed `struct' or +>> `struct' member, global or static variable, `typedef', or +>> enumeration. Identify the purpose of each in 25 words or less. + +---- ALGORITHMS ---- + +>> A2: Briefly describe what happens in a call to timer_sleep(), +>> including the effects of the timer interrupt handler. + +>> A3: What steps are taken to minimize the amount of time spent in +>> the timer interrupt handler? + +---- SYNCHRONIZATION ---- + +>> A4: How are race conditions avoided when multiple threads call +>> timer_sleep() simultaneously? 
+ +>> A5: How are race conditions avoided when a timer interrupt occurs +>> during a call to timer_sleep()? + +---- RATIONALE ---- + +>> A6: Why did you choose this design? In what ways is it superior to +>> another design you considered? + + PRIORITY SCHEDULING + =================== + +---- DATA STRUCTURES ---- + +>> B1: Copy here the declaration of each new or changed `struct' or +>> `struct' member, global or static variable, `typedef', or +>> enumeration. Identify the purpose of each in 25 words or less. + +>> B2: Explain the data structure used to track priority donation. +>> Use ASCII art to diagram a nested donation. (Alternately, submit a +>> .png file.) + +---- ALGORITHMS ---- + +>> B3: How do you ensure that the highest priority thread waiting for +>> a lock, semaphore, or condition variable wakes up first? + +>> B4: Describe the sequence of events when a call to lock_acquire() +>> causes a priority donation. How is nested donation handled? + +>> B5: Describe the sequence of events when lock_release() is called +>> on a lock that a higher-priority thread is waiting for. + +---- SYNCHRONIZATION ---- + +>> B6: Describe a potential race in thread_set_priority() and explain +>> how your implementation avoids it. Can you use a lock to avoid +>> this race? + +---- RATIONALE ---- + +>> B7: Why did you choose this design? In what ways is it superior to +>> another design you considered? + + ADVANCED SCHEDULER + ================== + +---- DATA STRUCTURES ---- + +>> C1: Copy here the declaration of each new or changed `struct' or +>> `struct' member, global or static variable, `typedef', or +>> enumeration. Identify the purpose of each in 25 words or less. + +---- ALGORITHMS ---- + +>> C2: Suppose threads A, B, and C have nice values 0, 1, and 2. Each +>> has a recent_cpu value of 0. 
Fill in the table below showing the +>> scheduling decision and the priority and recent_cpu values for each +>> thread after each given number of timer ticks: + +timer recent_cpu priority thread +ticks A B C A B C to run +----- -- -- -- -- -- -- ------ + 0 + 4 + 8 +12 +16 +20 +24 +28 +32 +36 + +>> C3: Did any ambiguities in the scheduler specification make values +>> in the table uncertain? If so, what rule did you use to resolve +>> them? Does this match the behaviour of your scheduler? + +>> C4: How is the way you divided the cost of scheduling between code +>> inside and outside interrupt context likely to affect performance? + +---- RATIONALE ---- + +>> C5: Briefly critique your design, pointing out advantages and +>> disadvantages in your design choices. If you were to have extra +>> time to work on this part of the task, how might you choose to +>> refine or improve your design? + +>> C6: The assignment explains arithmetic for fixed-point mathematics in +>> detail, but it leaves it open to you to implement it. Why did you +>> decide to implement it the way you did? If you created an +>> abstraction layer for fixed-point mathematics, that is, an abstract +>> data type and/or a set of functions or macros to manipulate +>> fixed-point numbers, why did you do so? If not, why not? + + SURVEY QUESTIONS + ================ + +Answering these questions is optional, but it will help us improve the +course in future quarters. Feel free to tell us anything you +want--these questions are just to spur your thoughts. You may also +choose to respond anonymously in the course evaluations at the end of +the quarter. + +>> In your opinion, was this assignment, or any one of the three problems +>> in it, too easy or too hard? Did it take too long or too little time? + +>> Did you find that working on a particular part of the assignment gave +>> you greater insight into some aspect of OS design? 
+ +>> Is there some particular fact or hint we should give students in +>> future quarters to help them solve the problems? Conversely, did you +>> find any of our guidance to be misleading? + +>> Do you have any suggestions for the TAs to more effectively assist +>> students, either for future quarters or the remaining tasks? + +>> Any other comments? diff --git a/doc/userprog.texi b/doc/userprog.texi new file mode 100644 index 0000000..36c3213 --- /dev/null +++ b/doc/userprog.texi @@ -0,0 +1,1226 @@ +@node Task 2--User Programs +@chapter Task 2: User Programs + +Now that you've worked with Pintos and are becoming familiar with its +infrastructure and thread package, it's time to start working on the +parts of the system that allow running user programs. +The base code already supports loading and +running user programs, but no I/O or interactivity +is possible. In this task, you will enable programs to interact with +the OS via system calls. + +You will be working out of the @file{userprog} directory for this +assignment, but you will also be interacting with almost every +other part of Pintos. We will describe the +relevant parts below. + +You can build task 2 on top of your task 1 submission or you can +start fresh. No code from task 1 is required for this +assignment. The ``alarm clock'' functionality may be useful in +task 3, but it is not strictly required. + +You might find it useful to go back and reread how to run the tests +(@pxref{Testing}). + +@menu +* Task 2 Background:: +* Task 2 Suggested Order of Implementation:: +* Task 2 Requirements:: +* Task 2 FAQ:: +* 80x86 Calling Convention:: +@end menu + +@node Task 2 Background +@section Background + +Up to now, all of the code you have run under Pintos has been part +of the operating system kernel. This means, for example, that all the +test code from the last assignment ran as part of the kernel, with +full access to privileged parts of the system. 
Once we start running +user programs on top of the operating system, this is no longer true. +This task deals with the consequences. + +We allow more than one process to run at a time. Each process has one +thread (multithreaded processes are not supported). User programs are +written under the illusion that they have the entire machine. This +means that when you load and run multiple processes at a time, you must +manage memory, scheduling, and other state correctly to maintain this +illusion. + +In the previous task, we compiled our test code directly into your +kernel, so we had to require certain specific function interfaces within +the kernel. From now on, we will test your operating system by running +user programs. This gives you much greater freedom. You must make sure +that the user program interface meets the specifications described here, +but given that constraint you are free to restructure or rewrite kernel +code however you wish. + +@menu +* Task 2 Source Files:: +* Using the File System:: +* How User Programs Work:: +* Virtual Memory Layout:: +* Accessing User Memory:: +@end menu + +@node Task 2 Source Files +@subsection Source Files + +The easiest way to get an overview of the programming you will be +doing is to simply go over each part you'll be working with. In +@file{userprog}, you'll find a small number of files, but here is +where the bulk of your work will be: + +@table @file +@item process.c +@itemx process.h +Loads ELF binaries and starts processes. + +@item pagedir.c +@itemx pagedir.h +A simple manager for 80@var{x}86 hardware page tables. +Although you probably won't want to modify this code for this task, +you may want to call some of its functions. +@xref{Page Tables}, for more information. + +@item syscall.c +@itemx syscall.h +Whenever a user process wants to access some kernel functionality, it +invokes a system call. This is a skeleton system call +handler. Currently, it just prints a message and terminates the user +process. 
In part 2 of this task you will add code to do everything +else needed by system calls. + +@item exception.c +@itemx exception.h +When a user process performs a privileged or prohibited operation, it +traps into the kernel as an ``exception'' or ``fault.''@footnote{We +will treat these terms as synonyms. There is no standard +distinction between them, although Intel processor manuals make +a minor distinction between them on 80@var{x}86.} These files handle +exceptions. Currently all exceptions simply print a message and +terminate the process. Some, but not all, solutions to task 2 +require modifying @func{page_fault} in this file. + +@item gdt.c +@itemx gdt.h +The 80@var{x}86 is a segmented architecture. The Global Descriptor +Table (GDT) is a table that describes the segments in use. These +files set up the GDT. You should not need to modify these +files for any of the tasks. You can read the code if +you're interested in how the GDT works. + +@item tss.c +@itemx tss.h +The Task-State Segment (TSS) is used for 80@var{x}86 architectural +task switching. Pintos uses the TSS only for switching stacks when a +user process enters an interrupt handler, as does Linux. You +should not need to modify these files for any of the tasks. +You can read the code if you're interested in how the TSS +works. +@end table + +@node Using the File System +@subsection Using the File System + +You will need to interface to the file system code for this task, +because +user programs are loaded from the file system and many of the +system calls you must implement deal with the file system. However, +the focus of this task is not the file system, so we have +provided a simple but complete file system in the @file{filesys} +directory. You +will want to look over the @file{filesys.h} and @file{file.h} +interfaces to understand how to use the file system, and especially +its many limitations. 
+ +There is no need to modify the file system code for this task, and so +we recommend that you do not. Working on the file system is likely to +distract you from this task's focus. + +You will have to tolerate the following limitations of the provided +file system implementation: + +@itemize @bullet +@item +No internal synchronization. Concurrent accesses will interfere with one +another. You should use synchronization to ensure that only one process at a +time is executing file system code. No finer-grained synchronization +(e.g.@: per-file locking) is expected. + +@item +File size is fixed at creation time. The root directory is +represented as a file, so the number of files that may be created is also +limited. + +@item +File data is allocated as a single extent, that is, data in a single +file must occupy a contiguous range of sectors on disk. External +fragmentation can therefore become a serious problem as a file system is +used over time. + +@item +No subdirectories. + +@item +File names are limited to 14 characters. + +@item +A system crash mid-operation may corrupt the disk in a way +that cannot be repaired automatically. There is no file system repair +tool anyway. +@end itemize + +One important feature is included: + +@itemize @bullet +@item +Unix-like semantics for @func{filesys_remove} are implemented. +That is, if a file is open when it is removed, its blocks +are not deallocated and it may still be accessed by any +threads that have it open, until the last one closes it. @xref{Removing +an Open File}, for more information. +@end itemize + +You need to be able to create a simulated disk with a file system +partition. The @command{pintos-mkdisk} program provides this +functionality. From the @file{userprog/build} directory, execute +@code{pintos-mkdisk filesys.dsk --filesys-size=2}. This command +creates a simulated disk named @file{filesys.dsk} that contains a @w{2 +MB} Pintos file system partition.
Then format the file system +partition by passing @option{-f -q} on the kernel's command line: +@code{pintos -f -q}. The @option{-f} option causes the file system to +be formatted, and @option{-q} causes Pintos to exit as soon as the +format is done. + +You'll need a way to copy files in and out of the simulated file system. +The @code{pintos} @option{-p} (``put'') and @option{-g} (``get'') +options do this. To copy @file{@var{file}} into the +Pintos file system, use the command @file{pintos -p @var{file} -- -q}. +(The @samp{--} is needed because @option{-p} is for the @command{pintos} +script, not for the simulated kernel.) To copy it to the Pintos file +system under the name @file{@var{newname}}, add @option{-a +@var{newname}}: @file{pintos -p @var{file} -a @var{newname} -- -q}. The +commands for copying files out of a VM are similar, but substitute +@option{-g} for @option{-p}. + +Incidentally, these commands work by passing special commands +@command{extract} and @command{append} on the kernel's command line and copying +to and from a special simulated ``scratch'' partition. If you're very +curious, you can look at the @command{pintos} script as well as +@file{filesys/fsutil.c} to learn the implementation details. + +Here's a summary of how to create a disk with a file system partition, +format the file system, copy the @command{echo} program into the new +disk, and then run @command{echo}, passing argument @code{x}. +(Argument passing won't work until you have implemented it.)
It assumes +that you've already built the examples in @file{examples} and that the +current directory is @file{userprog/build}: + +@example +pintos-mkdisk filesys.dsk --filesys-size=2 +pintos -f -q +pintos -p ../../examples/echo -a echo -- -q +pintos -q run 'echo x' +@end example + +The three final steps can actually be combined into a single command: + +@example +pintos-mkdisk filesys.dsk --filesys-size=2 +pintos -p ../../examples/echo -a echo -- -f -q run 'echo x' +@end example + +If you don't want to keep the file system disk around for later use or +inspection, you can even combine all four steps into a single command. +The @code{--filesys-size=@var{n}} option creates a temporary file +system partition +approximately @var{n} megabytes in size just for the duration of the +@command{pintos} run. The Pintos automatic test suite makes extensive +use of this syntax: + +@example +pintos --filesys-size=2 -p ../../examples/echo -a echo -- -f -q run 'echo x' +@end example + +You can delete a file from the Pintos file system using the @code{rm +@var{file}} kernel action, e.g.@: @code{pintos -q rm @var{file}}. Also, +@command{ls} lists the files in the file system and @code{cat +@var{file}} prints a file's contents to the display. + +@node How User Programs Work +@subsection How User Programs Work + +Pintos can run normal C programs, as long as they fit into memory and use +only the system calls you implement. Notably, @func{malloc} cannot be +implemented because none of the system calls required for this task +allow for memory allocation. Pintos also can't run programs that use +floating point operations, since the kernel doesn't save and restore the +processor's floating-point unit when switching threads. + +The @file{src/examples} directory contains a few sample user +programs. The @file{Makefile} in this directory +compiles the provided examples, and you can edit it +to compile your own programs as well.
Some of the example programs will +only work once task 3 has been implemented. + +Pintos can load @dfn{ELF} executables with the loader provided for you +in @file{userprog/process.c}. ELF is a file format used by Linux, +Solaris, and many other operating systems for object files, +shared libraries, and executables. You can actually use any compiler +and linker that output 80@var{x}86 ELF executables to produce programs +for Pintos. (We've provided compilers and linkers that should do just +fine.) + +You should realize immediately that, until you copy a +test program to the simulated file system, Pintos will be unable to do +useful work. You won't be able to do +interesting things until you copy a variety of programs to the file system. +You might want to create a clean reference file system disk and copy that +over whenever you trash your @file{filesys.dsk} beyond a useful state, +which may happen occasionally while debugging. + +@node Virtual Memory Layout +@subsection Virtual Memory Layout + +Virtual memory in Pintos is divided into two regions: user virtual +memory and kernel virtual memory. User virtual memory ranges from +virtual address 0 up to @code{PHYS_BASE}, which is defined in +@file{threads/vaddr.h} and defaults to @t{0xc0000000} (3 GB). Kernel +virtual memory occupies the rest of the virtual address space, from +@code{PHYS_BASE} up to 4 GB. + +User virtual memory is per-process. +When the kernel switches from one process to another, it +also switches user virtual address spaces by changing the processor's +page directory base register (see @func{pagedir_activate} in +@file{userprog/pagedir.c}). @struct{thread} contains a pointer to a +process's page table. + +Kernel virtual memory is global. It is always mapped the same way, +regardless of what user process or kernel thread is running. In +Pintos, kernel virtual memory is mapped one-to-one to physical +memory, starting at @code{PHYS_BASE}. 
That is, virtual address +@code{PHYS_BASE} accesses physical +address 0, virtual address @code{PHYS_BASE} + @t{0x1234} accesses +physical address @t{0x1234}, and so on up to the size of the machine's +physical memory. + +A user program can only access its own user virtual memory. An attempt to +access kernel virtual memory causes a page fault, handled by +@func{page_fault} in @file{userprog/exception.c}, and the process +will be terminated. Kernel threads can access both kernel virtual +memory and, if a user process is running, the user virtual memory of +the running process. However, even in the kernel, an attempt to +access memory at an unmapped user virtual address +will cause a page fault. + +@menu +* Typical Memory Layout:: +@end menu + +@node Typical Memory Layout +@subsubsection Typical Memory Layout + +Conceptually, each process is +free to lay out its own user virtual memory however it +chooses. In practice, user virtual memory is laid out like this: + +@html +
    +@end html +@example +@group + PHYS_BASE +----------------------------------+ + | user stack | + | | | + | | | + | V | + | grows downward | + | | + | | + | | + | | + | grows upward | + | ^ | + | | | + | | | + +----------------------------------+ + | uninitialized data segment (BSS) | + +----------------------------------+ + | initialized data segment | + +----------------------------------+ + | code segment | + 0x08048000 +----------------------------------+ + | | + | | + | | + | | + | | + 0 +----------------------------------+ +@end group +@end example +@html +
    +@end html + +In this task, the user stack is fixed in size, but in task 3 it +will be allowed to grow. Traditionally, the size of the uninitialized +data segment can be adjusted with a system call, but you will not have +to implement this. + +The code segment in Pintos starts at user virtual address +@t{0x08048000}, approximately 128 MB from the bottom of the address +space. This value is specified in @bibref{SysV-i386} and has no deep +significance. + +The linker sets the layout of a user program in memory, as directed by a +``linker script'' that tells it the names and locations of the various +program segments. You can learn more about linker scripts by reading +the ``Scripts'' chapter in the linker manual, accessible via @samp{info +ld}. + +To view the layout of a particular executable, run @command{objdump} +(80@var{x}86) with the @option{-p} +option. + +@node Accessing User Memory +@subsection Accessing User Memory + +As part of a system +call, the kernel must often access memory through pointers provided by a user +program. The kernel must be very careful about doing so, because +the user can pass a null pointer, a pointer to +unmapped virtual memory, or a pointer to kernel virtual address space +(above @code{PHYS_BASE}). All of these types of invalid pointers must +be rejected without harm to the kernel or other running processes, by +terminating the offending process and freeing its resources. + +There are at least two reasonable ways to do this correctly. The +first method is to verify +the validity of a user-provided pointer, then dereference it. If you +choose this route, you'll want to look at the functions in +@file{userprog/pagedir.c} and in @file{threads/vaddr.h}, specifically +@func{pagedir_get_page} and @func{is_user_vaddr}. This is the +simplest way to handle user memory access. + +The second method is to check only that a user +pointer points below @code{PHYS_BASE}, then dereference it. 
+An invalid user pointer will cause a ``page fault'' that you can +handle by modifying the code for @func{page_fault} in +@file{userprog/exception.c}. This technique is normally faster +because it takes advantage of the processor's MMU, so it tends to be +used in real kernels (including Linux). + +In either case, you need to make sure not to ``leak'' resources. For +example, suppose that your system call has acquired a lock or +allocated memory with @func{malloc}. If you encounter an invalid user pointer +afterward, you must still be sure to release the lock or free the page +of memory. If you choose to verify user pointers before dereferencing +them, this should be straightforward. It's more difficult to handle +if an invalid pointer causes a page fault, +because there's no way to return an error code from a memory access. +Therefore, for those who want to try the latter technique, we'll +provide a little bit of helpful code: + +@verbatim +/* Reads a byte at user virtual address UADDR. + UADDR must be below PHYS_BASE. + Returns the byte value if successful, -1 if a segfault + occurred. */ +static int +get_user (const uint8_t *uaddr) +{ + int result; + asm ("movl $1f, %0; movzbl %1, %0; 1:" + : "=&a" (result) : "m" (*uaddr)); + return result; +} + +/* Writes BYTE to user address UDST. + UDST must be below PHYS_BASE. + Returns true if successful, false if a segfault occurred. */ +static bool +put_user (uint8_t *udst, uint8_t byte) +{ + int error_code; + asm ("movl $1f, %0; movb %b2, %1; 1:" + : "=&a" (error_code), "=m" (*udst) : "q" (byte)); + return error_code != -1; +} +@end verbatim + +Each of these functions assumes that the user address has already been +verified to be below @code{PHYS_BASE}. They also assume that you've +modified @func{page_fault} so that a page fault in the kernel merely +sets @code{eax} to @t{0xffffffff} and copies its former value +into @code{eip}. 
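Both approaches above start from the same precondition: every byte you touch must lie below @code{PHYS_BASE}. For a multi-byte buffer, that means checking the whole range, not just its first address. The following is a minimal, stand-alone sketch of such a range check. The helper name @code{user_range_below_phys_base} and the constant @code{PHYS_BASE_ADDR} are hypothetical and not part of Pintos; addresses are modeled as 32-bit integers so that the sketch compiles outside the kernel, and the constant simply mirrors the default value mentioned earlier.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption: mirrors the default PHYS_BASE from threads/vaddr.h.
   Modeled as a plain constant so this sketch builds outside Pintos. */
#define PHYS_BASE_ADDR 0xc0000000u

/* Hypothetical helper, not part of Pintos.
   Returns true if UADDR is non-null and the SIZE-byte range starting
   at UADDR lies entirely below PHYS_BASE_ADDR. */
static bool
user_range_below_phys_base (uint32_t uaddr, uint32_t size)
{
  if (uaddr == 0 || uaddr >= PHYS_BASE_ADDR)
    return false;

  /* Compare SIZE against the room left below PHYS_BASE_ADDR instead of
     computing uaddr + size, which could wrap around 32 bits. */
  return size <= PHYS_BASE_ADDR - uaddr;
}
```

A check like this only rules out null and kernel addresses. An unmapped page inside the range still has to be caught, either by an explicit page table lookup (the first method) or by the page fault handler (the second method).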
+ +@node Task 2 Suggested Order of Implementation +@section Suggested Order of Implementation + +We suggest first implementing the following, which can happen in +parallel: + +@itemize +@item +Argument passing (@pxref{Argument Passing}). Every user program will +page fault immediately until argument passing is implemented. + +For now, you may simply wish to change +@example +*esp = PHYS_BASE; +@end example +@noindent to +@example +*esp = PHYS_BASE - 12; +@end example +in @func{setup_stack}. That will work for any test program that doesn't +examine its arguments, although its name will be printed as +@code{(null)}. + +Until you implement argument passing, you should only run programs +without passing command-line arguments. Attempting to pass arguments to +a program will include those arguments in the name of the program, which +will probably fail. + +@item +User memory access (@pxref{Accessing User Memory}). All system calls +need to read user memory. Few system calls need to write to user +memory. + +@item +System call infrastructure (@pxref{System Calls}). Implement enough +code to read the system call number from the user stack and dispatch to +a handler based on it. + +@item +The @code{exit} system call. Every user program that finishes in the +normal way calls @code{exit}. Even a program that returns from +@func{main} calls @code{exit} indirectly (see @func{_start} in +@file{lib/user/entry.c}). + +@item +The @code{write} system call for writing to fd 1, the system console. +All of our test programs write to the console (the user process version +of @func{printf} is implemented this way), so they will all malfunction +until @code{write} is available. + +@item +For now, change @func{process_wait} to an infinite loop (one that waits +forever). 
The purpose of @func{process_wait} is described in more detail +above its function stub in @file{src/userprog/process.c}, and more +information can be found in the description of the @code{wait} system call +later in this document. +The provided implementation returns immediately, so Pintos +will power off before any processes actually get to run. You will +eventually need to provide a correct implementation. +@end itemize + +After the above are implemented, user processes should work minimally. +At the very least, they can write to the console and exit correctly. +You can then refine your implementation so that some of the tests start +to pass (your first step should be to complete @func{process_wait} so +that user programs return correctly). In order to minimise the amount of +time you spend on this exercise, it is vital that you implement the +@code{write}, @code{exit} and @code{wait} system calls before beginning the +others. + +@node Task 2 Requirements +@section Requirements + +@menu +* Task 2 Design Document:: +* Process Termination Messages:: +* Argument Passing:: +* System Calls:: +* Denying Writes to Executables:: +@end menu + +@node Task 2 Design Document +@subsection Design Document + +Before you turn in your task, you must copy @uref{userprog.tmpl, , +the task 2 design document template} into your source tree under the +name @file{pintos-ic/src/userprog/DESIGNDOC} and fill it in. We recommend +that you read the design document template before you start working on +the task. @xref{Task Documentation}, for a sample design document +that goes along with a fictitious task. + +@node Process Termination Messages +@subsection Process Termination Messages + +Whenever a user process terminates, because it called @code{exit} +or for any other reason, print the process's name +and exit code, formatted as if printed by @code{printf ("%s: +exit(%d)\n", @dots{});}. 
The name printed should be the full name +passed to @func{process_execute}, omitting command-line arguments. +Do not print these messages when a kernel thread that is not a user +process terminates, or +when the @code{halt} system call is invoked. The message is optional +when a process fails to load. + +Aside from this, don't print any other +messages that Pintos as provided doesn't already print. You may find +extra messages useful during debugging, but they will confuse the +grading scripts and thus lower your score. + +@node Argument Passing +@subsection Argument Passing + +Currently, @func{process_execute}, found in @file{src/userprog/process.c}, +does not support passing arguments to new processes. Implement this +functionality, by extending @func{process_execute} so that instead of +simply taking a program file name as its argument, it divides it into words +at spaces. The first word is the program name, the second word is the first +argument, and so on. That is, @code{process_execute("grep foo bar")} should +run @command{grep} passing two arguments @code{foo} and @code{bar}. + +Within a command line, multiple spaces are equivalent to a single +space, so that @code{process_execute("grep @w{ }foo @w{ }@w{ }bar")} +is equivalent to our original example. You can impose a reasonable +limit on the length of the command line arguments. For example, you +could limit the arguments to those that will fit in a single page (4 +kB). (There is an unrelated limit of 128 bytes on command-line +arguments that the @command{pintos} utility can pass to the kernel.) + +You can parse argument strings any way you like. If you're lost, +look at @func{strtok_r}, prototyped in @file{lib/string.h} and +implemented with thorough comments in @file{lib/string.c}. You can +find more about it by looking at the man page (run @code{man strtok_r} +at the prompt). + +Virtually all the code you will write relating to argument passing +will be in @func{process_execute} and @func{start_process}. 
+@func{process_execute} creates a new thread, calling @func{start_process} +to load the actual process into the thread and set up the stack and other +related structures. + +@xref{Program Startup Details}, for information on exactly how you +need to set up the stack. + +@node System Calls +@subsection System Calls + +Implement the system call handler in @file{userprog/syscall.c}. The +skeleton implementation we provide ``handles'' system calls by +terminating the process. It will need to retrieve the system call +number, then any system call arguments, and carry out appropriate actions. + +Implement the following system calls. The prototypes listed are those +seen by a user program that includes @file{lib/user/syscall.h}. (This +header, and all others in @file{lib/user}, are for use by user +programs only.) System call numbers for each system call are defined in +@file{lib/syscall-nr.h}: + +@deftypefn {System Call} void halt (void) +Terminates Pintos by calling @func{shutdown_power_off} (declared in +@file{devices/shutdown.h}). This should be seldom used, because you lose +some information about possible deadlock situations, etc. +@var{Warning: The original Pintos documentation on the Stanford website +is outdated and incorrectly places the shutdown function in the wrong +location. It's advisable that you don't use it as a reference +in completing any of the tasks.} +@end deftypefn + +@deftypefn {System Call} void exit (int @var{status}) +Terminates the current user program, returning @var{status} to the +kernel. If the process's parent @code{wait}s for it (see below), this +is the status +that will be returned. Conventionally, a @var{status} of 0 indicates +success and nonzero values indicate errors. +@end deftypefn + +@deftypefn {System Call} pid_t exec (const char *@var{cmd_line}) +Runs the executable whose name is given in @var{cmd_line}, passing any +given arguments, and returns the new process's program id (pid). 
Must +return pid -1, which otherwise should not be a valid pid, if +the program cannot load or run for any reason. +Thus, the parent process cannot return from the @code{exec} until it +knows whether the child process successfully loaded its executable. +You must use appropriate synchronization to ensure this. +@end deftypefn + +@deftypefn {System Call} int wait (pid_t @var{pid}) +Waits for a child process @var{pid} and retrieves the child's exit status. + +If @var{pid} is still alive, waits until it terminates. Then, returns +the status that @var{pid} passed to @code{exit}. If @var{pid} did not +call @code{exit()}, but was terminated by the kernel (e.g.@: killed +due to an exception), @code{wait(pid)} must return -1. It is perfectly +legal for a parent process to wait for child processes that have already +terminated by the time the parent calls @code{wait}, but the kernel must +still allow the parent to retrieve its child's exit status, or learn +that the child was terminated by the kernel. + +@code{wait} must fail and return -1 immediately if any of the +following conditions is true: +@itemize @bullet +@item +@var{pid} does not refer to a direct child of the calling process. +@var{pid} is a direct child of the calling process if and +only if the calling process received @var{pid} as a return value +from a successful call to @code{exec}. + +Note that children are not inherited: if @var{A} spawns child @var{B} +and @var{B} spawns child process @var{C}, then @var{A} cannot wait for +@var{C}, even if @var{B} is dead. A call to @code{wait(C)} by process +@var{A} must fail. Similarly, orphaned processes are not assigned to +a new parent if their parent process exits before they do. + +@item +The process that calls @code{wait} has already called @code{wait} on +@var{pid}. That is, a process may wait for any given child at most +once. 
 +@end itemize + +Processes may spawn any number of children, wait for them in any order, +and may even exit without having waited for some or all of their children. +Your design should consider all the ways in which waits can occur. +All of a process's resources, including its @struct{thread}, must be +freed whether its parent ever waits for it or not, and regardless of +whether the child exits before or after its parent. + +You must ensure that Pintos does not terminate until the initial +process exits. The supplied Pintos code tries to do this by calling +@func{process_wait} (in @file{userprog/process.c}) from @func{main} +(in @file{threads/init.c}). We suggest that you implement +@func{process_wait} according to the comment at the top of the +function and then implement the @code{wait} system call in terms of +@func{process_wait}. + +Implementing this system call requires considerably more work than any +of the rest. +@end deftypefn + +@deftypefn {System Call} bool create (const char *@var{file}, unsigned @var{initial_size}) +Creates a new file called @var{file} initially @var{initial_size} bytes +in size. Returns true if successful, false otherwise. +Creating a new file does not open it: opening the new file is a +separate operation which would require an @code{open} system call. +@end deftypefn + +@deftypefn {System Call} bool remove (const char *@var{file}) +Deletes the file called @var{file}. Returns true if successful, false +otherwise. +A file may be removed regardless of whether it is open or closed, and +removing an open file does not close it. @xref{Removing an Open +File}, for details. +@end deftypefn + +@deftypefn {System Call} int open (const char *@var{file}) +Opens the file called @var{file}. Returns a nonnegative integer handle +called a ``file descriptor'' (fd), or -1 if the file could not be +opened. 
+ +File descriptors numbered 0 and 1 are reserved for the console: fd 0 +(@code{STDIN_FILENO}) is standard input, fd 1 (@code{STDOUT_FILENO}) is +standard output. The @code{open} system call will never return either +of these file descriptors, which are valid as system call arguments only +as explicitly described below. + +Each process has an independent set of file descriptors. File +descriptors are not inherited by child processes. + +When a single file is opened more than once, whether by a single +process or different processes, each @code{open} returns a new file +descriptor. Different file descriptors for a single file are closed +independently in separate calls to @code{close} and they do not share +a file position. +@end deftypefn + +@deftypefn {System Call} int filesize (int @var{fd}) +Returns the size, in bytes, of the file open as @var{fd}. +@end deftypefn + +@deftypefn {System Call} int read (int @var{fd}, void *@var{buffer}, unsigned @var{size}) +Reads @var{size} bytes from the file open as @var{fd} into +@var{buffer}. Returns the number of bytes actually read (0 at end of +file), or -1 if the file could not be read (due to a condition other +than end of file). Fd 0 reads from the keyboard using +@func{input_getc}, which can be found in @file{src/devices/input.h}. +@end deftypefn + +@deftypefn {System Call} int write (int @var{fd}, const void *@var{buffer}, unsigned @var{size}) +Writes @var{size} bytes from @var{buffer} to the open file @var{fd}. +Returns the number of bytes actually written, which may be less than +@var{size} if some bytes could not be written. + +Writing past end-of-file would normally extend the file, but file growth +is not implemented by the basic file system. The expected behavior is +to write as many bytes as possible up to end-of-file and return the +actual number written, or 0 if no bytes could be written at all. + +Fd 1 writes to the console. 
Your code to write to the console should +write all of @var{buffer} in one call to @func{putbuf}, at least as +long as @var{size} is not bigger than a few hundred bytes. (It is +reasonable to break up larger buffers.) Otherwise, +lines of text output by different processes may end up interleaved on +the console, confusing both human readers and our grading scripts. +@end deftypefn + +@deftypefn {System Call} void seek (int @var{fd}, unsigned @var{position}) +Changes the next byte to be read or written in open file @var{fd} to +@var{position}, expressed in bytes from the beginning of the file. +(Thus, a @var{position} of 0 is the file's start.) + +A seek past the current end of a file is not an error. A later read +obtains 0 bytes, indicating end of file. A later write extends the +file, filling any unwritten gap with zeros. (However, in Pintos files +have a fixed length until task 4 is complete, so writes past end of +file will return an error.) These semantics are implemented in the +file system and do not require any special effort in system call +implementation. +@end deftypefn + +@deftypefn {System Call} unsigned tell (int @var{fd}) +Returns the position of the next byte to be read or written in open +file @var{fd}, expressed in bytes from the beginning of the file. +@end deftypefn + +@deftypefn {System Call} void close (int @var{fd}) +Closes file descriptor @var{fd}. +Exiting or terminating a process implicitly closes all its open file +descriptors, as if by calling this function for each one. +@end deftypefn + +The file defines other syscalls. Ignore them for now. You will +implement the rest in task 3, so be +sure to design your system with extensibility in mind. + +To implement syscalls, you need to provide ways to read and write data +in user virtual address space. +You need this ability before you can +even obtain the system call number, because the system call number is +on the user's stack in the user's virtual address space. 
+This can be a bit tricky: what if the user provides an invalid +pointer, a pointer into kernel memory, or a block +partially in one of those regions? You should handle these cases by +terminating the user process. We recommend +writing and testing this code before implementing any other system +call functionality. @xref{Accessing User Memory}, for more information. + +You must synchronize system calls so that +any number of user processes can make them at once. In particular, it +is not safe to call into the file system code provided in the +@file{filesys} directory from multiple threads at once. Your system +call implementation must treat the file system code as a critical +section. Don't forget +that @func{process_execute} also accesses files. For now, we +recommend against modifying code in the @file{filesys} directory. + +We have provided you a user-level function for each system call in +@file{lib/user/syscall.c}. These provide a way for user processes to +invoke each system call from a C program. Each uses a little inline +assembly code to invoke the system call and (if appropriate) returns the +system call's return value. + +When you're done with this part, and forevermore, Pintos should be +bulletproof. Nothing that a user program can do should ever cause the +OS to crash, panic, fail an assertion, or otherwise malfunction. It is +important to emphasize this point: our tests will try to break your +system calls in many, many ways. You need to think of all the corner +cases and handle them. The sole way a user program should be able to +cause the OS to halt is by invoking the @code{halt} system call. + +If a system call is passed an invalid argument, acceptable options +include returning an error value (for those calls that return a +value), returning an undefined value, or terminating the process. + +@xref{System Call Details}, for details on how system calls work. 
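The handler described in this section has to read a system call number, reject numbers it does not know, and hand the arguments to the right routine. One possible shape for that dispatch step is a bounds-checked table of function pointers, sketched below as stand-alone C. Everything prefixed @code{demo_} or @code{DEMO_} is hypothetical and exists only for illustration: the numbers are not the ones in @file{lib/syscall-nr.h}, the stub handlers do no real work, and returning @code{DEMO_BAD_SYSCALL} stands in for terminating the offending process.

```c
#include <stddef.h>
#include <stdint.h>

/* Illustrative numbers only; the real ones are defined in
   lib/syscall-nr.h and are not reproduced here. */
enum { DEMO_SYS_EXIT, DEMO_SYS_WRITE, DEMO_SYS_COUNT };

/* Stand-in for "terminate the process" in this sketch. */
#define DEMO_BAD_SYSCALL (-1)

/* Each handler receives the already-validated argument words. */
typedef int demo_handler (const uint32_t *args);

/* Stub: reports the exit status it was given. */
static int
demo_exit (const uint32_t *args)
{
  return (int) args[0];
}

/* Stub: pretends every requested byte (third argument) was written. */
static int
demo_write (const uint32_t *args)
{
  return (int) args[2];
}

static demo_handler *const demo_table[DEMO_SYS_COUNT] = {
  [DEMO_SYS_EXIT] = demo_exit,
  [DEMO_SYS_WRITE] = demo_write,
};

/* Looks up NUMBER in the table and calls the matching handler.
   Unknown or unimplemented numbers yield DEMO_BAD_SYSCALL. */
static int
demo_dispatch (uint32_t number, const uint32_t *args)
{
  if (number >= DEMO_SYS_COUNT || demo_table[number] == NULL)
    return DEMO_BAD_SYSCALL;
  return demo_table[number] (args);
}
```

A table keeps the bounds check in one place and makes it easy to add the remaining calls later. A @code{switch} statement would work just as well; the choice is a matter of taste.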
+ +@node Denying Writes to Executables +@subsection Denying Writes to Executables + +Add code to deny writes to files in use as executables. Many OSes do +this because of the unpredictable results if a process tried to run code +that was in the midst of being changed on disk. This is especially +important once virtual memory is implemented in task 3, but it can't +hurt even now. + +You can use @func{file_deny_write} to prevent writes to an open file. +Calling @func{file_allow_write} on the file will re-enable them (unless +the file is denied writes by another opener). Closing a file will also +re-enable writes. Thus, to deny writes to a process's executable, you +must keep it open as long as the process is still running. + +@node Task 2 FAQ +@section FAQ + +@table @asis +@item How much code will I need to write? + +Here's a summary of our reference solution, produced by the +@command{diffstat} program. The final row gives total lines inserted +and deleted; a changed line counts as both an insertion and a deletion. + +The reference solution represents just one possible solution. Many +other solutions are also possible and many of those differ greatly from +the reference solution. Some excellent solutions may not modify all the +files modified by the reference solution, and some may modify files not +modified by the reference solution. + +@verbatim + threads/thread.c | 13 + threads/thread.h | 26 + + userprog/exception.c | 8 + userprog/process.c | 247 ++++++++++++++-- + userprog/syscall.c | 468 ++++++++++++++++++++++++++++++- + userprog/syscall.h | 1 + 6 files changed, 725 insertions(+), 38 deletions(-) +@end verbatim + +@item The kernel always panics when I run @code{pintos -p @var{file} -- -q}. + +Did you format the file system (with @samp{pintos -f})? + +Is your file name too long? The file system limits file names to 14 +characters. A command like @samp{pintos -p ../../examples/echo -- -q} +will exceed the limit. 
Use @samp{pintos -p ../../examples/echo -a echo +-- -q} to put the file under the name @file{echo} instead. + +Is the file system full? + +Does the file system already contain 16 files? The base Pintos file +system has a 16-file limit. + +The file system may be so fragmented that there's not enough contiguous +space for your file. + +@item When I run @code{pintos -p ../file --}, @file{file} isn't copied. + +Files are written under the name you refer to them, by default, so in +this case the file copied in would be named @file{../file}. You +probably want to run @code{pintos -p ../file -a file --} instead. + +You can list the files in your file system with @code{pintos -q ls}. + +@item All my user programs die with page faults. + +This will happen if you haven't implemented argument passing +(or haven't done so correctly). The basic C library for user programs tries +to read @var{argc} and @var{argv} off the stack. If the stack +isn't properly set up, this causes a page fault. + +@item All my user programs die with @code{system call!} + +You'll have to implement system calls before you see anything else. +Every reasonable program tries to make at least one system call +(@func{exit}) and most programs make more than that. Notably, +@func{printf} invokes the @code{write} system call. The default system +call handler just prints @samp{system call!} and terminates the program. +Until then, you can use @func{hex_dump} to convince yourself that +argument passing is implemented correctly (@pxref{Program Startup Details}). + +@item How can I disassemble user programs? + +The @command{objdump} (80@var{x}86) utility can disassemble entire user +programs or object files. Invoke it as @code{objdump -d +@var{file}}. You can use GDB's +@code{disassemble} command to disassemble individual functions +(@pxref{GDB}). + +@item Why do many C include files not work in Pintos programs? +@itemx Can I use lib@var{foo} in my Pintos programs? + +The C library we provide is very limited. 
It does not include many of +the features that are expected of a real operating system's C library. +The C library must be built specifically for the operating system (and +architecture), since it must make system calls for I/O and memory +allocation. (Not all functions do, of course, but usually the library +is compiled as a unit.) + +The chances are good that the library you want uses parts of the C library +that Pintos doesn't implement. It will probably take at least some +porting effort to make it work under Pintos. Notably, the Pintos +user program C library does not have a @func{malloc} implementation. + +@item How do I compile new user programs? + +Modify @file{src/examples/Makefile}, then run @command{make}. + +@item Can I run user programs under a debugger? + +Yes, with some limitations. @xref{GDB}. + +@item What's the difference between @code{tid_t} and @code{pid_t}? + +A @code{tid_t} identifies a kernel thread, which may have a user +process running in it (if created with @func{process_execute}) or not +(if created with @func{thread_create}). It is a data type used only +in the kernel. + +A @code{pid_t} identifies a user process. It is used by user +processes and the kernel in the @code{exec} and @code{wait} system +calls. + +You can choose whatever suitable types you like for @code{tid_t} and +@code{pid_t}. By default, they're both @code{int}. You can make them +a one-to-one mapping, so that the same values in both identify the +same process, or you can use a more complex mapping. It's up to you. +@end table + +@menu +* Argument Passing FAQ:: +* System Calls FAQ:: +@end menu + +@node Argument Passing FAQ +@subsection Argument Passing FAQ + +@table @asis +@item Isn't the top of stack in kernel virtual memory? + +The top of stack is at @code{PHYS_BASE}, typically @t{0xc0000000}, which +is also where kernel virtual memory starts. +But before the processor pushes data on the stack, it decrements the stack +pointer. 
Thus, the first (4-byte) value pushed on the stack +will be at address @t{0xbffffffc}. + +@item Is @code{PHYS_BASE} fixed? + +No. You should be able to support @code{PHYS_BASE} values that are +any multiple of @t{0x10000000} from @t{0x80000000} to @t{0xf0000000}, +simply via recompilation. +@end table + +@node System Calls FAQ +@subsection System Calls FAQ + +@table @asis +@item Can I just cast a @code{struct file *} to get a file descriptor? +@itemx Can I just cast a @code{struct thread *} to a @code{pid_t}? + +You will have to make these design decisions yourself. +Most operating systems do distinguish between file +descriptors (or pids) and the addresses of their kernel data +structures. You might want to give some thought as to why they do so +before committing yourself. + +@item Can I set a maximum number of open files per process? + +It is better not to set an arbitrary limit. You may impose a limit of +128 open files per process, if necessary. + +@item What happens when an open file is removed? +@anchor{Removing an Open File} + +You should implement the standard Unix semantics for files. That is, when +a file is removed any process which has a file descriptor for that file +may continue to use that descriptor. This means that +they can read and write from the file. The file will not have a name, +and no other processes will be able to open it, but it will continue +to exist until all file descriptors referring to the file are closed +or the machine shuts down. + +@item How can I run user programs that need more than 4 kB stack space? + +You may modify the stack setup code to allocate more than one page of +stack space for each process. In the next task, you will implement a +better solution. + +@item What should happen if an @code{exec} fails midway through loading? + +@code{exec} should return -1 if the child process fails to load for +any reason. 
This includes the case where the load fails part of the +way through the process (e.g.@: where it runs out of memory in the +@code{multi-oom} test). Therefore, the parent process cannot return +from the @code{exec} system call until it is established whether the +load was successful or not. The child must communicate this +information to its parent using appropriate synchronization, such as a +semaphore (@pxref{Semaphores}), to ensure that the information is +communicated without race conditions. +@end table + +@node 80x86 Calling Convention +@section 80@var{x}86 Calling Convention + +This section summarizes important points of the convention used for +normal function calls on 32-bit 80@var{x}86 implementations of Unix. +Some details are omitted for brevity. If you do want all the details, +refer to @bibref{SysV-i386}. + +The calling convention works like this: + +@enumerate 1 +@item +The caller pushes each of the function's arguments on the stack one by +one, normally using the @code{PUSH} assembly language instruction. +Arguments are pushed in right-to-left order. + +The stack grows downward: each push decrements the stack pointer, then +stores into the location it now points to, like the C expression +@samp{*--sp = @var{value}}. + +@item +The caller pushes the address of its next instruction (the @dfn{return +address}) on the stack and jumps to the first instruction of the callee. +A single 80@var{x}86 instruction, @code{CALL}, does both. + +@item +The callee executes. When it takes control, the stack pointer points to +the return address, the first argument is just above it, the second +argument is just above the first argument, and so on. + +@item +If the callee has a return value, it stores it into register @code{EAX}. + +@item +The callee returns by popping the return address from the stack and +jumping to the location it specifies, using the 80@var{x}86 @code{RET} +instruction. + +@item +The caller pops the arguments off the stack. 
+@end enumerate + +Consider a function @func{f} that takes three @code{int} arguments. +This diagram shows a sample stack frame as seen by the callee at the +beginning of step 3 above, supposing that @func{f} is invoked as +@code{f(1, 2, 3)}. The initial stack address is arbitrary: + +@html +
    +@end html +@example + +----------------+ + 0xbffffe7c | 3 | + 0xbffffe78 | 2 | + 0xbffffe74 | 1 | +stack pointer --> 0xbffffe70 | return address | + +----------------+ +@end example +@html +
    +@end html + +@menu +* Program Startup Details:: +* System Call Details:: +@end menu + +@node Program Startup Details +@subsection Program Startup Details + +The Pintos C library for user programs designates @func{_start}, in +@file{lib/user/entry.c}, as the entry point for user programs. This +function is a wrapper around @func{main} that calls @func{exit} if +@func{main} returns: + +@example +void +_start (int argc, char *argv[]) +@{ + exit (main (argc, argv)); +@} +@end example + +The kernel must put the arguments for the initial function on the stack +before it allows the user program to begin executing. The arguments are +passed in the same way as the normal calling convention (@pxref{80x86 +Calling Convention}). + +Consider how to handle arguments for the following example command: +@samp{/bin/ls -l foo bar}. +First, break the command into words: @samp{/bin/ls}, +@samp{-l}, @samp{foo}, @samp{bar}. Place the words at the top of the +stack. Order doesn't matter, because they will be referenced through +pointers. + +Then, push the address of each string plus a null pointer sentinel, on +the stack, in right-to-left order. These are the elements of +@code{argv}. The null pointer sentinel ensures that @code{argv[argc]} +is a null pointer, as required by the C standard. The order ensures +that @code{argv[0]} is at the lowest virtual address. Word-aligned +accesses are faster than unaligned accesses, so for best performance +round the stack pointer down to a multiple of 4 before the first push. + +Then, push @code{argv} (the address of @code{argv[0]}) and @code{argc}, +in that order. Finally, push a fake ``return address'': although the +entry function will never return, its stack frame must have the same +structure as any other. + +The table below shows the state of the stack and the relevant registers +right before the beginning of the user program, assuming +@code{PHYS_BASE} is @t{0xc0000000}: + +@html +
    +@end html +@multitable {@t{0xbfffffff}} {return address} {@t{/bin/ls\0}} {@code{void (*) ()}} +@item Address @tab Name @tab Data @tab Type +@item @t{0xbffffffc} @tab @code{argv[3][@dots{}]} @tab @samp{bar\0} @tab @code{char[4]} +@item @t{0xbffffff8} @tab @code{argv[2][@dots{}]} @tab @samp{foo\0} @tab @code{char[4]} +@item @t{0xbffffff5} @tab @code{argv[1][@dots{}]} @tab @samp{-l\0} @tab @code{char[3]} +@item @t{0xbfffffed} @tab @code{argv[0][@dots{}]} @tab @samp{/bin/ls\0} @tab @code{char[8]} +@item @t{0xbfffffec} @tab word-align @tab 0 @tab @code{uint8_t} +@item @t{0xbfffffe8} @tab @code{argv[4]} @tab @t{0} @tab @code{char *} +@item @t{0xbfffffe4} @tab @code{argv[3]} @tab @t{0xbffffffc} @tab @code{char *} +@item @t{0xbfffffe0} @tab @code{argv[2]} @tab @t{0xbffffff8} @tab @code{char *} +@item @t{0xbfffffdc} @tab @code{argv[1]} @tab @t{0xbffffff5} @tab @code{char *} +@item @t{0xbfffffd8} @tab @code{argv[0]} @tab @t{0xbfffffed} @tab @code{char *} +@item @t{0xbfffffd4} @tab @code{argv} @tab @t{0xbfffffd8} @tab @code{char **} +@item @t{0xbfffffd0} @tab @code{argc} @tab 4 @tab @code{int} +@item @t{0xbfffffcc} @tab return address @tab 0 @tab @code{void (*) ()} +@end multitable +@html +
    +@end html + +In this example, the stack pointer would be initialized to +@t{0xbfffffcc}. + +As shown above, your code should start the stack at the very top of +the user virtual address space, in the page just below virtual address +@code{PHYS_BASE} (defined in @file{threads/vaddr.h}). + +You may find the non-standard @func{hex_dump} function, declared in +@file{}, useful for debugging your argument passing code. +Here's what it would show in the above example: + +@verbatim +bfffffc0 00 00 00 00 | ....| +bfffffd0 04 00 00 00 d8 ff ff bf-ed ff ff bf f5 ff ff bf |................| +bfffffe0 f8 ff ff bf fc ff ff bf-00 00 00 00 00 2f 62 69 |............./bi| +bffffff0 6e 2f 6c 73 00 2d 6c 00-66 6f 6f 00 62 61 72 00 |n/ls.-l.foo.bar.| +@end verbatim + +@node System Call Details +@subsection System Call Details + +The first task already dealt with one way that the operating system +can regain control from a user program: interrupts from timers and I/O +devices. These are ``external'' interrupts, because they are caused +by entities outside the CPU (@pxref{External Interrupt Handling}). + +The operating system also deals with software exceptions, which are +events that occur in program code (@pxref{Internal Interrupt +Handling}). These can be errors such as a page fault or division by +zero. Exceptions are also the means by which a user program +can request services (``system calls'') from the operating system. + +In the 80@var{x}86 architecture, the @samp{int} instruction is the +most commonly used means for invoking system calls. This instruction +is handled in the same way as other software exceptions. In Pintos, +user programs invoke @samp{int $0x30} to make a system call. The +system call number and any additional arguments are expected to be +pushed on the stack in the normal fashion before invoking the +interrupt (@pxref{80x86 Calling Convention}). 
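The pushing order described above can be modeled in ordinary C. The following is a minimal sketch, not Pintos code: @code{struct sketch_stack}, @code{sketch_push}, and @code{sketch_syscall3} are hypothetical names invented for illustration, and a small array stands in for the user stack.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for a user stack: a small array of 32-bit
   words.  A real user stack lives in user virtual memory. */
#define SKETCH_STACK_WORDS 16

struct sketch_stack
  {
    uint32_t words[SKETCH_STACK_WORDS];
    uint32_t *sp;               /* Most recently pushed word. */
  };

/* Empties the stack: the stack pointer starts one past the top. */
static void
sketch_stack_init (struct sketch_stack *s)
{
  memset (s->words, 0, sizeof s->words);
  s->sp = s->words + SKETCH_STACK_WORDS;
}

/* Pushes VALUE the way PUSH does: decrement, then store. */
static void
sketch_push (struct sketch_stack *s, uint32_t value)
{
  assert (s->sp > s->words);    /* Refuse to overflow the array. */
  *--s->sp = value;
}

/* Lays out a three-argument system call: arguments right to left,
   then the system call number.  Returns the resulting stack
   pointer. */
static uint32_t *
sketch_syscall3 (struct sketch_stack *s, uint32_t number,
                 uint32_t arg0, uint32_t arg1, uint32_t arg2)
{
  sketch_push (s, arg2);
  sketch_push (s, arg1);
  sketch_push (s, arg0);
  sketch_push (s, number);
  return s->sp;
}
```

After a call such as @code{sketch_syscall3 (&s, 9, 1, 0x1000, 42)}, the returned pointer addresses the word holding 9, and the three arguments occupy the next three higher words in order.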
+ +Thus, when the system call handler @func{syscall_handler} gets control, +the system call number is in the 32-bit word at the caller's stack +pointer, the first argument is in the 32-bit word at the next higher +address, and so on. The caller's stack pointer is accessible to +@func{syscall_handler} as the @samp{esp} member of the +@struct{intr_frame} passed to it. (@struct{intr_frame} is on the kernel +stack.) + +The 80@var{x}86 convention for function return values is to place them +in the @code{EAX} register. System calls that return a value can do +so by modifying the @samp{eax} member of @struct{intr_frame}. + +You should try to avoid writing large amounts of repetitive code for +implementing system calls. Each system call argument, whether an +integer or a pointer, takes up 4 bytes on the stack. You should be able +to take advantage of this to avoid writing much near-identical code for +retrieving each system call's arguments from the stack. diff --git a/doc/userprog.tmpl b/doc/userprog.tmpl new file mode 100644 index 0000000..3d77d0f --- /dev/null +++ b/doc/userprog.tmpl @@ -0,0 +1,136 @@ + +-------------------------+ + | OS 211 | + | TASK 2: USER PROGRAMS | + | DESIGN DOCUMENT | + +-------------------------+ + +---- GROUP ---- + +>> Fill in the names and email addresses of your group members. + +FirstName LastName +FirstName LastName +FirstName LastName + +---- PRELIMINARIES ---- + +>> If you have any preliminary comments on your submission, notes for the +>> TAs, or extra credit, please give them here. + +>> Please cite any offline or online sources you consulted while +>> preparing your submission, other than the Pintos documentation, course +>> text, lecture notes, and course staff. + + ARGUMENT PASSING + ================ + +---- DATA STRUCTURES ---- + +>> A1: Copy here the declaration of each new or changed `struct' or +>> `struct' member, global or static variable, `typedef', or +>> enumeration. Identify the purpose of each in 25 words or less. 
+ +---- ALGORITHMS ---- + +>> A2: Briefly describe how you implemented argument parsing. How do +>> you arrange for the elements of argv[] to be in the right order? +>> How do you avoid overflowing the stack page? + +---- RATIONALE ---- + +>> A3: Why does Pintos implement strtok_r() but not strtok()? + +>> A4: In Pintos, the kernel separates commands into an executable name +>> and arguments. In Unix-like systems, the shell does this +>> separation. Identify at least two advantages of the Unix approach. + + SYSTEM CALLS + ============ + +---- DATA STRUCTURES ---- + +>> B1: Copy here the declaration of each new or changed `struct' or +>> `struct' member, global or static variable, `typedef', or +>> enumeration. Identify the purpose of each in 25 words or less. + +>> B2: Describe how file descriptors are associated with open files. +>> Are file descriptors unique within the entire OS or just within a +>> single process? + +---- ALGORITHMS ---- + +>> B3: Describe your code for reading and writing user data from the +>> kernel. + +>> B4: Suppose a system call causes a full page (4,096 bytes) of data +>> to be copied from user space into the kernel. What is the least +>> and the greatest possible number of inspections of the page table +>> (e.g. calls to pagedir_get_page()) that might result? What about +>> for a system call that only copies 2 bytes of data? Is there room +>> for improvement in these numbers, and how much? + +>> B5: Briefly describe your implementation of the "wait" system call +>> and how it interacts with process termination. + +>> B6: Any access to user program memory at a user-specified address +>> can fail due to a bad pointer value. Such accesses must cause the +>> process to be terminated. System calls are fraught with such +>> accesses, e.g.
a "write" system call requires reading the system +>> call number from the user stack, then each of the call's three +>> arguments, then an arbitrary amount of user memory, and any of +>> these can fail at any point. This poses a design and +>> error-handling problem: how do you best avoid obscuring the primary +>> function of code in a morass of error-handling? Furthermore, when +>> an error is detected, how do you ensure that all temporarily +>> allocated resources (locks, buffers, etc.) are freed? In a few +>> paragraphs, describe the strategy or strategies you adopted for +>> managing these issues. Give an example. + +---- SYNCHRONIZATION ---- + +>> B7: The "exec" system call returns -1 if loading the new executable +>> fails, so it cannot return before the new executable has completed +>> loading. How does your code ensure this? How is the load +>> success/failure status passed back to the thread that calls "exec"? + +>> B8: Consider parent process P with child process C. How do you +>> ensure proper synchronization and avoid race conditions when P +>> calls wait(C) before C exits? After C exits? How do you ensure +>> that all resources are freed in each case? How about when P +>> terminates without waiting, before C exits? After C exits? Are +>> there any special cases? + +---- RATIONALE ---- + +>> B9: Why did you choose to implement access to user memory from the +>> kernel in the way that you did? + +>> B10: What advantages or disadvantages can you see to your design +>> for file descriptors? + +>> B11: The default tid_t to pid_t mapping is the identity mapping. +>> If you changed it, what advantages are there to your approach? + + SURVEY QUESTIONS + ================ + +Answering these questions is optional, but it will help us improve the +course in future quarters. Feel free to tell us anything you +want--these questions are just to spur your thoughts. You may also +choose to respond anonymously in the course evaluations at the end of +the quarter. 
+ +>> In your opinion, was this assignment, or any one of the three problems +>> in it, too easy or too hard? Did it take too long or too little time? + +>> Did you find that working on a particular part of the assignment gave +>> you greater insight into some aspect of OS design? + +>> Is there some particular fact or hint we should give students in +>> future quarters to help them solve the problems? Conversely, did you +>> find any of our guidance to be misleading? + +>> Do you have any suggestions for the TAs to more effectively assist +>> students, either for future quarters or the remaining tasks? + +>> Any other comments? diff --git a/doc/vm.texi b/doc/vm.texi new file mode 100644 index 0000000..848a964 --- /dev/null +++ b/doc/vm.texi @@ -0,0 +1,795 @@ +@node Task 3--Virtual Memory +@chapter Task 3: Virtual Memory + +By now you should have some familiarity with the inner workings of +Pintos. Your +OS can properly handle multiple threads of execution with proper +synchronization, and can load multiple user programs at once. However, +the number and size of programs that can run is limited by the machine's +main memory size. In this assignment, you will remove that limitation. + +You will build this assignment on top of the last one. Test programs +from task 2 should also work with task 3. You should take care to +fix any bugs in your task 2 submission before you start work on +task 3, because those bugs will most likely cause the same problems +in task 3. + +You will continue to handle Pintos disks and file systems the same way +you did in the previous assignment (@pxref{Using the File System}). 
+ +@menu +* Task 3 Background:: +* Task 3 Suggested Order of Implementation:: +* Task 3 Requirements:: +* Task 3 FAQ:: +@end menu + +@node Task 3 Background +@section Background + +@menu +* Task 3 Source Files:: +* Memory Terminology:: +* Resource Management Overview:: +* Managing the Supplemental Page Table:: +* Managing the Frame Table:: +* Managing the Swap Table:: +* Managing Memory Mapped Files Back:: +@end menu + +@node Task 3 Source Files +@subsection Source Files + +You will work in the @file{vm} directory for this task. The +@file{vm} directory contains only @file{Makefile}s. The only +change from @file{userprog} is that this new @file{Makefile} turns on +the setting @option{-DVM}. All code you write will be in new +files or in files introduced in earlier tasks. + +You will probably be encountering just a few files for the first time: + +@table @file +@item devices/block.h +@itemx devices/block.c +Provides sector-based read and write access to block devices. You will +use this interface to access the swap partition as a block device. +@end table + +@node Memory Terminology +@subsection Memory Terminology + +Careful definitions are needed to keep discussion of virtual memory from +being confusing. Thus, we begin by presenting some terminology for +memory and storage. Some of these terms should be familiar from task +2 (@pxref{Virtual Memory Layout}), but many of them are new. + +@menu +* Pages:: +* Frames:: +* Page Tables:: +* Swap Slots:: +@end menu + +@node Pages +@subsubsection Pages + +A @dfn{page}, sometimes called a @dfn{virtual page}, is a contiguous +region of virtual memory 4,096 bytes (the @dfn{page size}) in length. A +page must be @dfn{page-aligned}, that is, start on a virtual address +evenly divisible by the page size.
Thus, a 32-bit virtual address can +be divided into a 20-bit @dfn{page number} and a 12-bit @dfn{page +offset} (or just @dfn{offset}), like this: + +@example +@group + 31 12 11 0 + +-------------------+-----------+ + | Page Number | Offset | + +-------------------+-----------+ + Virtual Address +@end group +@end example + +Each process has an independent set of @dfn{user (virtual) pages}, which +are those pages below virtual address @code{PHYS_BASE}, typically +@t{0xc0000000} (3 GB). The set of @dfn{kernel (virtual) pages}, on the +other hand, is global, remaining the same regardless of what thread or +process is active. The kernel may access both user and kernel pages, +but a user process may access only its own user pages. @xref{Virtual +Memory Layout}, for more information. + +Pintos provides several useful functions for working with virtual +addresses. @xref{Virtual Addresses}, for details. + +@node Frames +@subsubsection Frames + +A @dfn{frame}, sometimes called a @dfn{physical frame} or a @dfn{page +frame}, is a continuous region of physical memory. Like pages, frames +must be page-size and page-aligned. Thus, a 32-bit physical address can +be divided into a 20-bit @dfn{frame number} and a 12-bit @dfn{frame +offset} (or just @dfn{offset}), like this: + +@example +@group + 31 12 11 0 + +-------------------+-----------+ + | Frame Number | Offset | + +-------------------+-----------+ + Physical Address +@end group +@end example + +The 80@var{x}86 doesn't provide any way to directly access memory at a +physical address. Pintos works around this by mapping kernel virtual +memory directly to physical memory: the first page of kernel virtual +memory is mapped to the first frame of physical memory, the second page +to the second frame, and so on. Thus, frames can be accessed through +kernel virtual memory. + +Pintos provides functions for translating between physical addresses and +kernel virtual addresses. @xref{Virtual Addresses}, for details. 
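The address arithmetic described for pages and frames can be written out in a few lines of C. This is only a sketch: Pintos supplies its own helpers (@pxref{Virtual Addresses}), and every @code{sketch_}-prefixed name below is a local stand-in invented for illustration. It assumes @code{PHYS_BASE} is @t{0xc0000000} and that addresses fit in a @code{uint32_t}.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PGBITS 12u                       /* Offset bits. */
#define SKETCH_PGSIZE (1u << SKETCH_PGBITS)     /* 4,096 bytes. */
#define SKETCH_PGMASK (SKETCH_PGSIZE - 1u)      /* Low 12 bits. */
#define SKETCH_PHYS_BASE 0xc0000000u            /* Assumed PHYS_BASE. */

/* Returns the upper 20 bits: the page (or frame) number. */
static uint32_t
sketch_page_no (uint32_t addr)
{
  return addr >> SKETCH_PGBITS;
}

/* Returns the lower 12 bits: the offset within the page. */
static uint32_t
sketch_page_ofs (uint32_t addr)
{
  return addr & SKETCH_PGMASK;
}

/* Kernel virtual memory maps one-to-one onto physical memory, so
   translation is a fixed subtraction. */
static uint32_t
sketch_kvaddr_to_paddr (uint32_t kvaddr)
{
  assert (kvaddr >= SKETCH_PHYS_BASE);
  return kvaddr - SKETCH_PHYS_BASE;
}

/* The reverse translation is the matching addition. */
static uint32_t
sketch_paddr_to_kvaddr (uint32_t paddr)
{
  assert (paddr <= UINT32_MAX - SKETCH_PHYS_BASE);
  return paddr + SKETCH_PHYS_BASE;
}
```

For instance, @t{0xbffffe7c} splits into page number @t{0xbffff} and offset @t{0xe7c}, and kernel virtual address @t{0xc0001234} corresponds to physical address @t{0x1234}, which lies in the second frame.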
+ +@node Page Tables +@subsubsection Page Tables + +In Pintos, a @dfn{page table} is a data structure that the CPU uses to +translate a virtual address to a physical address, that is, from a page +to a frame. The page table format is dictated by the 80@var{x}86 +architecture. Pintos provides page table management code in +@file{pagedir.c} (@pxref{Page Table}). + +The diagram below illustrates the relationship between pages and frames. +The virtual address, on the left, consists of a page number and an +offset. The page table translates the page number into a frame number, +which is combined with the unmodified offset to obtain the physical +address, on the right. + +@example +@group + +----------+ + .--------------->|Page Table|-----------. + / +----------+ | + 0 | 12 11 0 0 V 12 11 0 + +---------+----+ +---------+----+ + |Page Nr | Ofs| |Frame Nr | Ofs| + +---------+----+ +---------+----+ + Virt Addr | Phys Addr ^ + \_______________________________________/ +@end group +@end example + +@node Swap Slots +@subsubsection Swap Slots + +A @dfn{swap slot} is a continuous, page-size region of disk space in the +swap partition. Although hardware limitations dictating the placement of +slots are looser than for pages and frames, swap slots should be +page-aligned because there is no downside in doing so. + +@node Resource Management Overview +@subsection Resource Management Overview + +You will need to design the following data structures: + +@table @asis +@item Supplemental page table + +Enables page fault handling by supplementing the page table. +@xref{Managing the Supplemental Page Table}. + +@item Frame table + +Allows efficient implementation of eviction policy. +@xref{Managing the Frame Table}. + +@item Swap table + +Tracks usage of swap slots. +@xref{Managing the Swap Table}. + +@item Table of file mappings + +Processes may map files into their virtual memory space. You need a +table to track which files are mapped into which pages. 
+@end table + +You do not necessarily need to implement four completely distinct data +structures: it may be convenient to wholly or partially merge related +resources into a unified data structure. + +For each data structure, you need to determine what information each +element should contain. You also need to decide on the data structure's +scope, either local (per-process) or global (applying to the whole +system), and how many instances are required within its scope. + +To simplify your design, you may store these data structures in +non-pageable memory. That means that you can be sure that pointers +among them will remain valid. + +Possible choices of data structures include arrays, lists, bitmaps, and +hash tables. An array is often the simplest approach, but a sparsely +populated array wastes memory. Lists are also simple, but traversing a +long list to find a particular position wastes time. Both arrays and +lists can be resized, but lists more efficiently support insertion and +deletion in the middle. + +Pintos includes a bitmap data structure in @file{lib/kernel/bitmap.c} +and @file{lib/kernel/bitmap.h}. A bitmap is an array of bits, each of +which can be true or false. Bitmaps are typically used to track usage +in a set of (identical) resources: if resource @var{n} is in use, then +bit @var{n} of the bitmap is true. Pintos bitmaps are fixed in size, +although you could extend their implementation to support resizing. + +Pintos also includes a hash table data structure (@pxref{Hash Table}). +Pintos hash tables efficiently support insertions and deletions over a +wide range of table sizes. + +Although more complex data structures may yield performance or other +benefits, they may also needlessly complicate your implementation. +Thus, we do not recommend implementing any advanced data structure +(e.g.@: a balanced binary tree) as part of your design. 
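To make the bitmap idea concrete, here is a minimal, standalone sketch of usage tracking with one bit per resource. It deliberately does not use the Pintos bitmap interface; @code{struct sketch_bitmap} and the @code{sketch_} functions are hypothetical names, and the fixed size of 64 resources is an arbitrary assumption.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_BITS 64u                 /* Number of resources tracked. */
#define SKETCH_NONE ((size_t) -1)       /* Returned when all are in use. */

struct sketch_bitmap
  {
    uint32_t words[SKETCH_BITS / 32u];  /* Bit N is 1 if N is in use. */
  };

/* Returns nonzero if resource N is in use. */
static int
sketch_test (const struct sketch_bitmap *b, size_t n)
{
  assert (n < SKETCH_BITS);
  return (b->words[n / 32u] >> (n % 32u)) & 1u;
}

/* Marks resource N as in use (IN_USE nonzero) or free (zero). */
static void
sketch_set (struct sketch_bitmap *b, size_t n, int in_use)
{
  uint32_t mask;

  assert (n < SKETCH_BITS);
  mask = (uint32_t) 1 << (n % 32u);
  if (in_use)
    b->words[n / 32u] |= mask;
  else
    b->words[n / 32u] &= ~mask;
}

/* Finds the lowest-numbered free resource, marks it in use, and
   returns its index, or SKETCH_NONE if none is free. */
static size_t
sketch_acquire (struct sketch_bitmap *b)
{
  size_t n;

  for (n = 0; n < SKETCH_BITS; n++)
    if (!sketch_test (b, n))
      {
        sketch_set (b, n, 1);
        return n;
      }
  return SKETCH_NONE;
}
```

A linear scan is the simplest possible search. A design that scans very large bitmaps often might prefer to skip whole words that are already full, but that refinement is not needed to get started.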
+ +@node Managing the Supplemental Page Table +@subsection Managing the Supplemental Page Table + +The @dfn{supplemental page table} supplements the page table with +additional data about each page. It is needed because of the +limitations imposed by the page table's format. Such a data structure +is often called a ``page table'' also; we add the word ``supplemental'' +to reduce confusion. + +The supplemental page table is used for at least two purposes. Most +importantly, on a page fault, the kernel looks up the virtual page that +faulted in the supplemental page table to find out what data should be +there. Second, the kernel consults the supplemental page table when a +process terminates, to decide what resources to free. + +You may organize the supplemental page table as you wish. There are at +least two basic approaches to its organization: in terms of segments or +in terms of pages. Optionally, you may use the page table itself as an +index to track the members of the supplemental page table. You will +have to modify the Pintos page table implementation in @file{pagedir.c} +to do so. We recommend this approach for advanced students only. +@xref{Page Table Entry Format}, for more information. + +The most important user of the supplemental page table is the page fault +handler. In task 2, a page fault always indicated a bug in the +kernel or a user program. In task 3, this is no longer true. Now, a +page fault might only indicate that the page must be brought in from a +file or swap. You will have to implement a more sophisticated page +fault handler to handle these cases. Your page fault handler, which you +should implement by modifying @func{page_fault} in +@file{userprog/exception.c}, needs to do roughly the following: + +@enumerate 1 +@item +Locate the page that faulted in the supplemental page table. 
If the +memory reference is valid, use the supplemental page table entry to +locate the data that goes in the page, which might be in the file +system, or in a swap slot, or it might simply be an all-zero page. If +you implement sharing, the page's data might even already be in a page +frame, but not in the page table. + +If the supplemental page table indicates that the user process should +not expect any data at the address it was trying to access, or if the +page lies within kernel virtual memory, or if the access is an attempt +to write to a read-only page, then the access is invalid. Any invalid +access terminates the process and thereby frees all of its resources. + +@item +Obtain a frame to store the page. @xref{Managing the Frame Table}, for +details. + +If you implement sharing, the data you need may already be in a frame, +in which case you must be able to locate that frame. + +@item +Fetch the data into the frame, by reading it from the file system or +swap, zeroing it, etc. + +If you implement sharing, the page you need may already be in a frame, +in which case no action is necessary in this step. + +@item +Point the page table entry for the faulting virtual address to the +physical page. You can use the functions in @file{userprog/pagedir.c}. +@end enumerate + +@node Managing the Frame Table +@subsection Managing the Frame Table + +The @dfn{frame table} contains one entry for each frame that contains a +user page. Each entry in the frame table contains a pointer to the +page, if any, that currently occupies it, and other data of your choice. +The frame table allows Pintos to efficiently implement an eviction +policy, by choosing a page to evict when no frames are free. + +The frames used for user pages should be obtained from the ``user +pool,'' by calling @code{palloc_get_page(PAL_USER)}. You must use +@code{PAL_USER} to avoid allocating from the ``kernel pool,'' which +could cause some test cases to fail unexpectedly (@pxref{Why +PAL_USER?}). 
If you modify @file{palloc.c} as part of your frame table +implementation, be sure to retain the distinction between the two pools. + +The most important operation on the frame table is obtaining an unused +frame. This is easy when a frame is free. When none is free, a frame +must be made free by evicting some page from its frame. + +If no frame can be evicted without allocating a swap slot, but swap is +full, panic the kernel. Real OSes apply a wide range of policies to +recover from or prevent such situations, but these policies are beyond +the scope of this task. + +The process of eviction comprises roughly the following steps: + +@enumerate 1 +@item +Choose a frame to evict, using your page replacement algorithm. The +``accessed'' and ``dirty'' bits in the page table, described below, will +come in handy. + +@item +Remove references to the frame from any page table that refers to it. + +Unless you have implemented sharing, only a single page should refer to +a frame at any given time. + +@item +If necessary, write the page to the file system or to swap. +@end enumerate + +The evicted frame may then be used to store a different page. + +@menu +* Accessed and Dirty Bits:: +@end menu + +@node Accessed and Dirty Bits +@subsubsection Accessed and Dirty Bits + +80@var{x}86 hardware provides some assistance for implementing page +replacement algorithms, through a pair of bits in the page table entry +(PTE) for each page. On any read or write to a page, the CPU sets the +@dfn{accessed bit} to 1 in the page's PTE, and on any write, the CPU +sets the @dfn{dirty bit} to 1. The CPU never resets these bits to 0, +but the OS may do so. + +You need to be aware of @dfn{aliases}, that is, two (or more) pages that +refer to the same frame. When an aliased frame is accessed, the +accessed and dirty bits are updated in only one page table entry (the +one for the page used for access). The accessed and dirty bits for the +other aliases are not updated. 
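The effect of aliasing on these bits can be simulated without any hardware. The sketch below is an illustration only: it models page table entries as plain integers, uses the 80@var{x}86 bit positions for the accessed bit (bit 5) and dirty bit (bit 6), and introduces hypothetical helper names (@code{sketch_cpu_access}, @code{sketch_frame_dirty}) that are not part of Pintos.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define SKETCH_PTE_A 0x20u      /* Accessed bit (bit 5). */
#define SKETCH_PTE_D 0x40u      /* Dirty bit (bit 6). */

/* Models what the CPU does when a page is touched through one
   particular page table entry: only that entry is updated. */
static void
sketch_cpu_access (uint32_t *pte, bool is_write)
{
  *pte |= SKETCH_PTE_A;
  if (is_write)
    *pte |= SKETCH_PTE_D;
}

/* Returns true if any of the CNT aliases of a frame records a
   write.  Checking a single alias could miss the write. */
static bool
sketch_frame_dirty (const uint32_t *const *aliases, size_t cnt)
{
  size_t i;

  assert (aliases != NULL || cnt == 0);
  for (i = 0; i < cnt; i++)
    if (*aliases[i] & SKETCH_PTE_D)
      return true;
  return false;
}
```

If a frame has two aliases and a write goes through the second one, the first entry's dirty bit stays clear, so a check of the first entry alone would wrongly conclude that the frame is clean. Combining the bits from every alias, as @code{sketch_frame_dirty} does, is one way to avoid that mistake.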
+ +In Pintos, every user virtual page is aliased to its kernel virtual +page. You must manage these aliases somehow. For example, your code +could check and update the accessed and dirty bits for both addresses. +Alternatively, the kernel could avoid the problem by only accessing user +data through the user virtual address. + +Other aliases should only arise if you implement sharing for extra +credit (@pxref{VM Extra Credit}), or if there is a bug in your code. + +@xref{Page Table Accessed and Dirty Bits}, for details of the functions +to work with accessed and dirty bits. + +@node Managing the Swap Table +@subsection Managing the Swap Table + +The swap table tracks in-use and free swap slots. It should allow +picking an unused swap slot for evicting a page from its frame to the +swap partition. It should allow freeing a swap slot when its page is read +back or the process whose page was swapped is terminated. + +You may use the @code{BLOCK_SWAP} block device for swapping, obtaining +the @struct{block} that represents it by calling @func{block_get_role}. +From the +@file{vm/build} directory, use the command @code{pintos-mkdisk swap.dsk +--swap-size=@var{n}} to create a disk named @file{swap.dsk} that +contains an @var{n}-MB swap partition. +Afterward, @file{swap.dsk} will automatically be attached as an extra disk +when you run @command{pintos}. Alternatively, you can tell +@command{pintos} to use a temporary @var{n}-MB swap disk for a single +run with @option{--swap-size=@var{n}}. + +Swap slots should be allocated lazily, that is, only when they are +actually required by eviction. Reading data pages from the executable +and writing them to swap immediately at process startup is not lazy. +Swap slots should not be reserved to store particular pages. + +Free a swap slot when its contents are read back into a frame.
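Because a swap slot is page-sized while a block device is addressed in sectors, a swap table needs some slot-to-sector arithmetic. The sketch below shows one way to do it. It is hedged in several respects: the 512-byte sector size is an assumption (check the block device interface for the real constant), 1 MB is taken to mean 2^20 bytes with no rounding by the disk tools, and all @code{sketch_} names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PGSIZE 4096u             /* Bytes per swap slot. */
#define SKETCH_SECTOR_SIZE 512u         /* Assumed bytes per sector. */
#define SKETCH_SECTORS_PER_SLOT (SKETCH_PGSIZE / SKETCH_SECTOR_SIZE)

/* Returns how many whole swap slots fit on a device with
   DEVICE_SECTORS sectors.  A trailing partial slot is unusable. */
static uint32_t
sketch_slot_count (uint32_t device_sectors)
{
  return device_sectors / SKETCH_SECTORS_PER_SLOT;
}

/* Returns the first sector of swap slot SLOT.  The slot occupies
   that sector and the SKETCH_SECTORS_PER_SLOT - 1 after it. */
static uint32_t
sketch_slot_first_sector (uint32_t slot)
{
  assert (slot <= UINT32_MAX / SKETCH_SECTORS_PER_SLOT);
  return slot * SKETCH_SECTORS_PER_SLOT;
}

/* Returns the number of slots in an MB-megabyte swap partition,
   assuming 1 MB = 2^20 bytes. */
static uint32_t
sketch_slots_for_mb (uint32_t mb)
{
  uint32_t slots_per_mb = (1024u * 1024u) / SKETCH_PGSIZE;

  assert (mb <= UINT32_MAX / slots_per_mb);
  return mb * slots_per_mb;
}
```

Under these assumptions each slot spans 8 sectors, so slot 3 begins at sector 24, and a 4 MB swap partition holds 1,024 slots. Which slots are free or in use still has to be recorded separately, for example with one bit per slot.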
+ +@node Managing Memory Mapped Files Back +@subsection Managing Memory Mapped Files + +The file system is most commonly accessed with @code{read} and +@code{write} system calls. A secondary interface is to ``map'' the file +into virtual pages, using the @code{mmap} system call. The program can +then use memory instructions directly on the file data. + +Suppose file @file{foo} is @t{0x1000} bytes (4 kB, or one page) long. +If @file{foo} is mapped into memory starting at address @t{0x5000}, then +any memory accesses to locations @t{0x5000}@dots{}@t{0x5fff} will access +the corresponding bytes of @file{foo}. + +Here's a program that uses @code{mmap} to print a file to the console. +It opens the file specified on the command line, maps it at virtual +address @t{0x10000000}, writes the mapped data to the console (fd 1), +and unmaps the file. + +@example +#include +#include +int main (int argc UNUSED, char *argv[]) +@{ + void *data = (void *) 0x10000000; /* @r{Address at which to map.} */ + + int fd = open (argv[1]); /* @r{Open file.} */ + mapid_t map = mmap (fd, data); /* @r{Map file.} */ + write (1, data, filesize (fd)); /* @r{Write file to console.} */ + munmap (map); /* @r{Unmap file (optional).} */ + return 0; +@} +@end example + +A similar program with full error handling is included as @file{mcat.c} +in the @file{examples} directory, which also contains @file{mcp.c} as a +second example of @code{mmap}. + +Your submission must be able to track what memory is used by memory +mapped files. This is necessary to properly handle page faults in the +mapped regions and to ensure that mapped files do not overlap any other +segments within the process. + +@node Task 3 Suggested Order of Implementation +@section Suggested Order of Implementation + +We suggest the following initial order of implementation: + +@enumerate 1 +@item +Frame table (@pxref{Managing the Frame Table}). Change @file{process.c} +to use your frame table allocator. + +Do not implement swapping yet. 
If you run out of frames, fail the +allocator or panic the kernel. + +After this step, your kernel should still pass all the task 2 test +cases. + +@item +Supplemental page table and page fault handler (@pxref{Managing the +Supplemental Page Table}). Change @file{process.c} to record the +necessary information in the supplemental page table when loading an +executable and setting up its stack. Implement loading of code and data +segments in the page fault handler. For now, consider only valid +accesses. + +After this step, your kernel should pass all of the task 2 +functionality test cases, but only some of the robustness tests. + +@item +From here, you can implement stack growth, mapped files, and page +reclamation on process exit in parallel. + +@item +The next step is to implement eviction (@pxref{Managing the Frame +Table}). Initially you could choose the page to evict randomly. At +this point, you need to consider how to manage accessed and dirty bits +and aliasing of user and kernel pages. Synchronization is also a +concern: how do you deal with it if process A faults on a page whose +frame process B is in the process of evicting? +@end enumerate + +@node Task 3 Requirements +@section Requirements + +This assignment is an open-ended design problem. We are going to say as +little as possible about how to do things. Instead we will focus on +what functionality we require your OS to support. We will expect +you to come up with a design that makes sense. You will have the +freedom to choose how to handle page faults, how to organize the swap +partition, how to implement paging, etc. + +@menu +* Task 3 Design Document:: +* Paging:: +* Stack Growth:: +* Memory Mapped Files:: +@end menu + +@node Task 3 Design Document +@subsection Design Document + +Before you turn in your task, you must copy @uref{vm.tmpl, , the +task 3 design document template} into your source tree under the name +@file{pintos-ic/src/vm/DESIGNDOC} and fill it in. 
We recommend that you +read the design document template before you start working on the +task. @xref{Task Documentation}, for a sample design document +that goes along with a fictitious task. + +@node Paging +@subsection Paging + +Implement paging for segments loaded from executables. All of these +pages should be loaded lazily, that is, only as the kernel intercepts +page faults for them. Upon eviction, pages modified since load (e.g.@: +as indicated by the ``dirty bit'') should be written to swap. +Unmodified pages, including read-only pages, should never be written to +swap because they can always be read back from the executable. + +Your design should allow for parallelism. If one page fault requires +I/O, in the meantime processes that do not fault should continue +executing and other page faults that do not require I/O should be able +to complete. This will require some synchronization effort. + +You'll need to modify the core of the program loader, which is the loop +in @func{load_segment} in @file{userprog/process.c}. Each time around +the loop, @code{page_read_bytes} receives the number of bytes to read +from the executable file and @code{page_zero_bytes} receives the number +of bytes to initialize to zero following the bytes read. The two always +sum to @code{PGSIZE} (4,096). The handling of a page depends on these +variables' values: + +@itemize @bullet +@item +If @code{page_read_bytes} equals @code{PGSIZE}, the page should be demand +paged from the underlying file on its first access. + +@item +If @code{page_zero_bytes} equals @code{PGSIZE}, the page does not need to +be read from disk at all because it is all zeroes. You should handle +such pages by creating a new page consisting of all zeroes at the +first page fault. + +@item +Otherwise, neither @code{page_read_bytes} nor @code{page_zero_bytes} +equals @code{PGSIZE}. In this case, an initial part of the page is to +be read from the underlying file and the remainder zeroed. 
+@end itemize + +@node Stack Growth +@subsection Stack Growth + +Implement stack growth. In task 2, the stack was a single page at +the top of the user virtual address space, and programs were limited to +that much stack. Now, if the stack grows past its current size, +allocate additional pages as necessary. + +Allocate additional pages only if they ``appear'' to be stack accesses. +Devise a heuristic that attempts to distinguish stack accesses from +other accesses. + +User programs are buggy if they write to the stack below the stack +pointer, because typical real OSes may interrupt a process at any time +to deliver a ``signal,'' which pushes data on the stack.@footnote{This rule is +common but not universal. One modern exception is the +@uref{http://www.x86-64.org/documentation/abi.pdf, @var{x}86-64 System V +ABI}, which designates 128 bytes below the stack pointer as a ``red +zone'' that may not be modified by signal or interrupt handlers.} +However, the 80@var{x}86 @code{PUSH} instruction checks access +permissions before it adjusts the stack pointer, so it may cause a page +fault 4 bytes below the stack pointer. (Otherwise, @code{PUSH} would +not be restartable in a straightforward fashion.) Similarly, the +@code{PUSHA} instruction pushes 32 bytes at once, so it can fault 32 +bytes below the stack pointer. + +You will need to be able to obtain the current value of the user +program's stack pointer. Within a system call or a page fault generated +by a user program, you can retrieve it from the @code{esp} member of the +@struct{intr_frame} passed to @func{syscall_handler} or +@func{page_fault}, respectively. If you verify user pointers before +accessing them (@pxref{Accessing User Memory}), these are the only cases +you need to handle. On the other hand, if you depend on page faults to +detect invalid memory access, you will need to handle another case, +where a page fault occurs in the kernel. 
Since the processor only +saves the stack pointer when an exception causes a switch from user +to kernel mode, reading @code{esp} out of the @struct{intr_frame} +passed to @func{page_fault} would yield an undefined value, not the +user stack pointer. You will need to arrange another way, such as +saving @code{esp} into @struct{thread} on the initial transition +from user to kernel mode. + +You should impose some absolute limit on stack size, as do most OSes. +Some OSes make the limit user-adjustable, e.g.@: with the +@command{ulimit} command on many Unix systems. On many GNU/Linux systems, +the default limit is 8 MB. + +The first stack page need not be allocated lazily. You can allocate +and initialize it with the command line arguments at load time, with +no need to wait for it to be faulted in. + +All stack pages should be candidates for eviction. An evicted stack +page should be written to swap. + +@node Memory Mapped Files +@subsection Memory Mapped Files + +Implement memory mapped files, including the following system calls. + +@deftypefn {System Call} mapid_t mmap (int @var{fd}, void *@var{addr}) +Maps the file open as @var{fd} into the process's virtual address +space. The entire file is mapped into consecutive virtual pages +starting at @var{addr}. + +Your VM system must lazily load pages in @code{mmap} regions and use the +@code{mmap}ed file itself as backing store for the mapping. That is, +evicting a page mapped by @code{mmap} writes it back to the file it was +mapped from. + +If the file's length is not a multiple of @code{PGSIZE}, then some +bytes in the final mapped page ``stick out'' beyond the end of the +file. Set these bytes to zero when the page is faulted in from the +file system, +and discard them when the page is written back to disk. + +If successful, this function returns a ``mapping ID'' that +uniquely identifies the mapping within the process. 
On failure, +it must return -1, which otherwise should not be a valid mapping id, +and the process's mappings must be unchanged. + +A call to @code{mmap} may fail if the file open as @var{fd} has a +length of zero bytes. It must fail if @var{addr} is not page-aligned +or if the range of pages mapped overlaps any existing set of mapped +pages, including the stack or pages mapped at executable load time. +It must also fail if @var{addr} is 0, because some Pintos code assumes +virtual page 0 is not mapped. Finally, file descriptors 0 and 1, +representing console input and output, are not mappable. +@end deftypefn + +@deftypefn {System Call} void munmap (mapid_t @var{mapping}) +Unmaps the mapping designated by @var{mapping}, which must be a +mapping ID returned by a previous call to @code{mmap} by the same +process that has not yet been unmapped. +@end deftypefn + +All mappings are implicitly unmapped when a process exits, whether via +@code{exit} or by any other means. When a mapping is unmapped, whether +implicitly or explicitly, all pages written to by the process are +written back to the file, and pages not written must not be. The pages +are then removed from the process's list of virtual pages. + +Closing or removing a file does not unmap any of its mappings. Once +created, a mapping is valid until @code{munmap} is called or the process +exits, following the Unix convention. @xref{Removing an Open File}, for +more information. You should use the @code{file_reopen} function to +obtain a separate and independent reference to the file for each of +its mappings. + +If two or more processes map the same file, there is no requirement that +they see consistent data. Unix handles this by making the two mappings +share the same physical page, but the @code{mmap} system call also has +an argument allowing the client to specify whether the page is shared or +private (i.e.@: copy-on-write). 
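The failure conditions that @code{mmap} must check, as listed earlier in this section, can be sketched as one standalone predicate. This is a minimal sketch under stated assumptions, not the reference solution: the function name @code{mmap_args_ok} is hypothetical, the process's existing pages are modelled as a plain array of page-aligned addresses instead of a real supplemental page table, and the sketch chooses to reject zero-length files (which the text permits but does not require). It does not check that the range stays inside user address space.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define PGSIZE 4096   /* Page size in bytes, as in Pintos. */

/* Returns true if a request to map a FILE_LEN-byte file open as FD at
   ADDR passes the checks described for mmap.  USED_PAGES holds the
   page-aligned addresses already occupied in the process (stack,
   executable segments, earlier mappings); it is a hypothetical
   stand-in for a supplemental page table lookup. */
bool
mmap_args_ok (int fd, uintptr_t addr, size_t file_len,
              const uintptr_t *used_pages, size_t n_used)
{
  /* Console input and output are not mappable. */
  if (fd == 0 || fd == 1)
    return false;

  /* Virtual page 0 must stay unmapped, and ADDR must be page-aligned. */
  if (addr == 0 || addr % PGSIZE != 0)
    return false;

  /* This sketch rejects empty files. */
  if (file_len == 0)
    return false;

  /* Round the length up to whole pages and reject any overlap. */
  size_t pages = (file_len + PGSIZE - 1) / PGSIZE;
  for (size_t i = 0; i < pages; i++)
    for (size_t j = 0; j < n_used; j++)
      if (addr + i * PGSIZE == used_pages[j])
        return false;

  return true;
}
```

Because all checks run before any state changes, a failed call leaves the process's mappings untouched, which matches the requirement that they be unchanged on failure.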
+ +@subsection Accessing User Memory +You will need to adapt your code to access user memory (@pxref{Accessing +User Memory}) while handling a system call. Just as user processes may +access pages whose content is currently in a file or in swap space, so +can they pass addresses that refer to such non-resident pages to system +calls. Moreover, unless your kernel takes measures to prevent this, +a page may be evicted from its frame even while it is being accessed +by kernel code. If kernel code accesses such non-resident user pages, +a page fault will result. + +While accessing user memory, your kernel must either be prepared to handle +such page faults, or it must prevent them from occurring. The kernel +must prevent such page faults while it is holding resources it would +need to acquire to handle these faults. In Pintos, such resources include +locks acquired by the device driver(s) that control the device(s) containing +the file system and swap space. As a concrete example, you must not +allow page faults to occur while a device driver accesses a user buffer +passed to @code{file_read}, because you would not be able to invoke +the driver while handling such faults. + +Preventing such page faults requires cooperation between the code within +which the access occurs and your page eviction code. For instance, +you could extend your frame table to record when a page contained in +a frame must not be evicted. (This is also referred to as ``pinning'' +or ``locking'' the page in its frame.) Pinning restricts your page +replacement algorithm's choices when looking for pages to evict, so be +sure to pin pages no longer than necessary, and avoid pinning pages when +it is not necessary. + +@node Task 3 FAQ +@section FAQ + +@table @b +@item How much code will I need to write? + +Here's a summary of our reference solution, produced by the +@command{diffstat} program. 
The final row gives total lines inserted +and deleted; a changed line counts as both an insertion and a deletion. + +This summary is relative to the Pintos base code, but the reference +solution for task 3 starts from the reference solution to task 2. +@xref{Task 2 FAQ}, for the summary of task 2. + +The reference solution represents just one possible solution. Many +other solutions are also possible and many of those differ greatly from +the reference solution. Some excellent solutions may not modify all the +files modified by the reference solution, and some may modify files not +modified by the reference solution. + +@verbatim + Makefile.build | 4 + devices/timer.c | 42 ++ + threads/init.c | 5 + threads/interrupt.c | 2 + threads/thread.c | 31 + + threads/thread.h | 37 +- + userprog/exception.c | 12 + userprog/pagedir.c | 10 + userprog/process.c | 319 +++++++++++++----- + userprog/syscall.c | 545 ++++++++++++++++++++++++++++++- + userprog/syscall.h | 1 + vm/frame.c | 162 +++++++++ + vm/frame.h | 23 + + vm/page.c | 297 ++++++++++++++++ + vm/page.h | 50 ++ + vm/swap.c | 85 ++++ + vm/swap.h | 11 + 17 files changed, 1532 insertions(+), 104 deletions(-) +@end verbatim + +@item Do we need a working Task 2 to implement Task 3? + +Yes. + +@item What extra credit is available? +@anchor{VM Extra Credit} +You may implement an advanced page replacement algorithm such as +the ``second chance'' or the ``clock'' algorithms. + +You may implement sharing: when multiple processes are created that use +the same executable file, share read-only pages among those processes +instead of creating separate copies of read-only segments for each +process. If you carefully designed your data structures, +sharing of read-only pages should not make this part significantly +harder. + +@item How do we resume a process after we have handled a page fault? + +Returning from @func{page_fault} resumes the current user process +(@pxref{Internal Interrupt Handling}). 
+It will then retry the instruction to which the instruction pointer points. + +@item Why do user processes sometimes fault above the stack pointer? + +You might notice that, in the stack growth tests, the user program faults +on an address that is above the user program's current stack pointer, +even though the @code{PUSH} and @code{PUSHA} instructions would cause +faults 4 and 32 bytes below the current stack pointer. + +This is not unusual. The @code{PUSH} and @code{PUSHA} instructions are +not the only instructions that can trigger user stack growth. +For instance, a user program may allocate stack space by decrementing the +stack pointer using a @code{SUB $n, %esp} instruction, and then use a +@code{MOV ..., m(%esp)} instruction to write to a stack location within +the allocated space that is @var{m} bytes above the current stack pointer. +Such accesses are perfectly valid, and your kernel must grow the +user program's stack to allow those accesses to succeed. + +@item Does the virtual memory system need to support data segment growth? + +No. The size of the data segment is determined by the linker. We still +have no dynamic allocation in Pintos (although it is possible to +``fake'' it at the user level by using memory-mapped files). Supporting +data segment growth should add little additional complexity to a +well-designed system. + +@item Why should I use @code{PAL_USER} for allocating page frames? +@anchor{Why PAL_USER?} + +Passing @code{PAL_USER} to @func{palloc_get_page} causes it to allocate +memory from the user pool, instead of the main kernel pool. Running out +of pages in the user pool just causes user programs to page, but running +out of pages in the kernel pool will cause many failures because so many +kernel functions need to obtain memory. +You can layer some other allocator on top of @func{palloc_get_page} if +you like, but it should be the underlying mechanism. 
+ +Also, you can use the @option{-ul} kernel command-line option to limit +the size of the user pool, which makes it easy to test your VM +implementation with various user memory sizes. +@end table diff --git a/doc/vm.tmpl b/doc/vm.tmpl new file mode 100644 index 0000000..2b3bb41 --- /dev/null +++ b/doc/vm.tmpl @@ -0,0 +1,155 @@ + +--------------------------+ + | OS 211 | + | TASK 3: VIRTUAL MEMORY | + | DESIGN DOCUMENT | + +--------------------------+ + +---- GROUP ---- + +>> Fill in the names and email addresses of your group members. + +FirstName LastName +FirstName LastName +FirstName LastName + +---- PRELIMINARIES ---- + +>> If you have any preliminary comments on your submission, notes for the +>> TAs, or extra credit, please give them here. + +>> Please cite any offline or online sources you consulted while +>> preparing your submission, other than the Pintos documentation, course +>> text, lecture notes, and course staff. + + PAGE TABLE MANAGEMENT + ===================== + +---- DATA STRUCTURES ---- + +>> A1: Copy here the declaration of each new or changed `struct' or +>> `struct' member, global or static variable, `typedef', or +>> enumeration. Identify the purpose of each in 25 words or less. + +---- ALGORITHMS ---- + +>> A2: In a few paragraphs, describe your code for locating the frame, +>> if any, that contains the data of a given page. + +>> A3: How does your code coordinate accessed and dirty bits between +>> kernel and user virtual addresses that alias a single frame, or +>> alternatively how do you avoid the issue? + +---- SYNCHRONIZATION ---- + +>> A4: When two user processes both need a new frame at the same time, +>> how are races avoided? + +---- RATIONALE ---- + +>> A5: Why did you choose the data structure(s) that you did for +>> representing virtual-to-physical mappings? 
+ + PAGING TO AND FROM DISK + ======================= + +---- DATA STRUCTURES ---- + +>> B1: Copy here the declaration of each new or changed `struct' or +>> `struct' member, global or static variable, `typedef', or +>> enumeration. Identify the purpose of each in 25 words or less. + +---- ALGORITHMS ---- + +>> B2: When a frame is required but none is free, some frame must be +>> evicted. Describe your code for choosing a frame to evict. + +>> B3: When a process P obtains a frame that was previously used by a +>> process Q, how do you adjust the page table (and any other data +>> structures) to reflect the frame Q no longer has? + +>> B4: Explain your heuristic for deciding whether a page fault for an +>> invalid virtual address should cause the stack to be extended into +>> the page that faulted. + +---- SYNCHRONIZATION ---- + +>> B5: Explain the basics of your VM synchronization design. In +>> particular, explain how it prevents deadlock. (Refer to the +>> textbook for an explanation of the necessary conditions for +>> deadlock.) + +>> B6: A page fault in process P can cause another process Q's frame +>> to be evicted. How do you ensure that Q cannot access or modify +>> the page during the eviction process? How do you avoid a race +>> between P evicting Q's frame and Q faulting the page back in? + +>> B7: Suppose a page fault in process P causes a page to be read from +>> the file system or swap. How do you ensure that a second process Q +>> cannot interfere by e.g. attempting to evict the frame while it is +>> still being read in? + +>> B8: Explain how you handle access to paged-out pages that occur +>> during system calls. Do you use page faults to bring in pages (as +>> in user programs), or do you have a mechanism for "locking" frames +>> into physical memory, or do you use some other design? How do you +>> gracefully handle attempted accesses to invalid virtual addresses? 
+ +---- RATIONALE ---- + +>> B9: A single lock for the whole VM system would make +>> synchronization easy, but limit parallelism. On the other hand, +>> using many locks complicates synchronization and raises the +>> possibility for deadlock but allows for high parallelism. Explain +>> where your design falls along this continuum and why you chose to +>> design it this way. + + MEMORY MAPPED FILES + =================== + +---- DATA STRUCTURES ---- + +>> C1: Copy here the declaration of each new or changed `struct' or +>> `struct' member, global or static variable, `typedef', or +>> enumeration. Identify the purpose of each in 25 words or less. + +---- ALGORITHMS ---- + +>> C2: Describe how memory mapped files integrate into your virtual +>> memory subsystem. Explain how the page fault and eviction +>> processes differ between swap pages and other pages. + +>> C3: Explain how you determine whether a new file mapping overlaps +>> any existing segment. + +---- RATIONALE ---- + +>> C4: Mappings created with "mmap" have similar semantics to those of +>> data demand-paged from executables, except that "mmap" mappings are +>> written back to their original files, not to swap. This implies +>> that much of their implementation can be shared. Explain why your +>> implementation either does or does not share much of the code for +>> the two situations. + + SURVEY QUESTIONS + ================ + +Answering these questions is optional, but it will help us improve the +course in future quarters. Feel free to tell us anything you +want--these questions are just to spur your thoughts. You may also +choose to respond anonymously in the course evaluations at the end of +the quarter. + +>> In your opinion, was this assignment, or any one of the three problems +>> in it, too easy or too hard? Did it take too long or too little time? + +>> Did you find that working on a particular part of the assignment gave +>> you greater insight into some aspect of OS design? 
+ +>> Is there some particular fact or hint we should give students in +>> future quarters to help them solve the problems? Conversely, did you +>> find any of our guidance to be misleading? + +>> Do you have any suggestions for the TAs to more effectively assist +>> students, either for future quarters or the remaining tasks? + +>> Any other comments? diff --git a/specs/8254.pdf b/specs/8254.pdf new file mode 100644 index 0000000..ef00c17 Binary files /dev/null and b/specs/8254.pdf differ diff --git a/specs/8259A.pdf b/specs/8259A.pdf new file mode 100644 index 0000000..baa9258 Binary files /dev/null and b/specs/8259A.pdf differ diff --git a/specs/ata-3-std.pdf b/specs/ata-3-std.pdf new file mode 100644 index 0000000..621d602 Binary files /dev/null and b/specs/ata-3-std.pdf differ diff --git a/specs/elf.pdf b/specs/elf.pdf new file mode 100644 index 0000000..78711dc Binary files /dev/null and b/specs/elf.pdf differ diff --git a/specs/freevga/feedback.htm b/specs/freevga/feedback.htm new file mode 100644 index 0000000..1b60c96 --- /dev/null +++ b/specs/freevga/feedback.htm @@ -0,0 +1,59 @@ + + + + + + + FreeVGA Feedback + + + +
    Home  +
    Hardware Level VGA and SVGA Video Programming Information +Page
    + +
    Feedback Form  +
    +        Note - this form requires that +www.goodnet.com be reachable from your client in order to be sent. If it +does not return with a success message, check the connection and try sending +again. +
    Name: +
    +
    Email: +
    +
    Project/Company/Program: +
    +
    Application of Information: +
    Just Curious +
    Game Programming +
    Demo Programming +
    OS Development +
    Driver Development +
    Other +
    Type of Software: +
    Not Applicable +
    Personal Use +
    Public Domain +
    Freeware +
    Shareware +
    GNU or other +free license +
    Commercial +
    Other +
    Body of Message:
    + +
    + +
    + + +

    Notice: All trademarks used or referred to on this page are the property +of their respective owners. +
    All pages are Copyright © 1997, 1998, J. D. Neal, except where +noted. Permission for utilization and distribution is subject to the terms +of the FreeVGA Project Copyright License. +
      +
      + + diff --git a/specs/freevga/freevga.htm b/specs/freevga/freevga.htm new file mode 100644 index 0000000..7d9c1ed --- /dev/null +++ b/specs/freevga/freevga.htm @@ -0,0 +1,255 @@ + + + + + + + FreeVGA--About the FreeVGA Project + + + +

    Hardware Level VGA and SVGA Video Programming Information +Page
    + +
    About the FreeVGA Project  +
    + + +

    Introduction +
            As a software programmer +for the PC architecture I have noticed that there is plenty of free information +existing on the Internet for programming PC hardware, with the notable +exception of video adapters, MPEG cards and the like. When the VGA was +the standard adapter card in PCs, programming was relatively straightforward. +However, when SVGA adapters appeared on the market, there was little standardization +between separate vendors' products. For this reason, applications and graphical +user interface systems required specialized drivers to utilize the extended +capabilities of the SVGA adapters. Either these specialized drivers were +written by the video card vendor or they had to be written by the application +vendor. Unfortunately, due to the cost of specifications and complexity +of hardware, most free or shareware programs supported only the standard +VGA. The goal of this project, under my direction, is to explore the area +of low-level video programming and provide information for programmers +of free software throughout the world. +
            With the growing popularity +of the free software concept, more and more specialized applications are +written every day. Many of these applications could take advantage of the +specialized features of the video hardware but often the information required +is not available. This project's goal is to make that information available. +By gaining the cooperation of both programmers and hardware manufacturers, +an excellent reference can be developed. Internet technology makes it possible +to provide a free resource updated in a timely fashion as opposed to printed +matter which takes time to print and deliver. + +

    Completeness +
            Why would a low-level programmer +need this reference rather than simply use datasheets and chipset specifications +from the manufacturers themselves? First, the manufacturers may no longer +provide literature on "obsolete" chipsets. Second, such datasheets are +often aimed at hardware designers rather than programmers. Programmers +are trying to implement software on existing hardware rather than design hardware. +Many of the details needed to program the hardware are dependent on the +implementation of the video adapter. Datasheets typically only provide +information on one particular component of the video hardware. It is necessary +to understand how the components in the adapter are "wired" together to +be able to program the adapter. This information is rarely found in datasheets +or manufacturer's documentation. Third, as demonstrated on the VGA and +some specific SVGA hardware, there are always programmers who can find +ways to cleverly program the hardware to provide capabilities unimagined +by the manufacturer. Doing this requires the programmer to have intimate +knowledge of the hardware, as BIOS services provide a "lowest common denominator" +of capability. + +

    Contribution +
            This section is for +those interested in contributing to the FreeVGA Project. The primary objective +of this project is to gather information about video hardware, to verify +this information as best as possible, then to organize the information +in a form usable for any application, and finally to make this information +freely available to all programmers. Because of the non-profit nature of +this project, all information provided is the result of generous contribution +by myself or others. The primary resources required by this project are: +chipset datasheets/documentation, developer kits for video boards, video +adapter boards used for testing and verifying information, and finally +"postcards from the bleeding edge", i.e. information about real-world +problems and their workarounds from video programmers. If you can provide +any of these resources to the project, or any other related assistance, I +and the other programmers who benefit will be grateful for your generosity. Your +name will be forever enshrined on the list of contributors, along with +a link to your homepage if you so desire. +
         I can be reached via the feedback +form. If you wish to donate hardware or documentation, please send +it, along with your name and the link you wish to include in the list of +contributors to: + +

    Joshua Neal +
    FreeVGA Project +
    925 N. Coronado Dr. +
    Gilbert, AZ 85234 + +

    Membership +
            Because of the nature of +this project, any contributor can consider themselves a member if they +wish to do so. As the founder of this project I am willing to donate my +time and resources to ensuring the continuing organization, accuracy, and +usability of the FreeVGA Project's documentation. I will continue to do +this indefinitely, although if the task becomes overwhelming I will solicit +volunteers to assist with the project. There may at some point in the future +be special considerations for vendors that choose to support the project's +goals. + +

    Open-Ended Questions +
            The problem of documenting +video card operation and behavior is difficult because of the large amount +of information available, and because video hardware is constantly evolving, +making documentation a problem of hitting a moving target. Another problem +is that because video cards are products of human endeavor and due to their +complexity, their implementation often differs from published specifications. +Even two manufacturers' products based upon the same chipset can contain +enough variation to make them separate cases from a programmer's viewpoint. +The FreeVGA Project attempts to provide answers to the following questions: +

      +
    • +How does one detect what VGA/SVGA adapter is present even when no access +to BIOS is available?
    +        Because video hardware was developed +by many independent vendors along separate evolutionary paths, there is +very little knowledge about how to identify the particular model of video +card present in a machine. Because identifying the particular model is +crucial to utilizing the advanced features of a specific model, this is +an important task that nearly all software written for the SVGA needs to +perform. In many cases, such as when writing programs under an operating +system other than MS-DOS/Windows, it may not be possible nor is it good +practice to access BIOS for determining the specific model. Furthermore, +the more recent PCI bus design is being incorporated into many systems +with non-80x86 chips, such as the PowerPC, Alpha, and even high end workstations. +However, the manufacturers of the hardware may only support Mac and PC +versions of their cards, considering other platforms too much of a niche +market to support. Since their inception in the PC market, most video cards +have had the capability to work with another card (albeit different) in +the same machine. Some newer PCI cards allow multiple cards to be used +in one machine. Until recently this capability has been unsupported by +operating systems and programs. For debugging video routines there is no +equal to having a second monitor attached--one monitor can display the +program's output while the other provides the debugging interface. Note +that the need for a second monitor could be reduced somewhat if better +virtualization of the display hardware was implemented. This project aims +to give programmers the skills and knowledge to better utilize the video +hardware. +
      +
    • +How does one perform standard video operations on a particular card +without utilizing the video BIOS interface?
    +        BIOS was designed to support +MS-DOS programs on the 8086. Computers have progressed far beyond that +point but BIOS has remained basically the same (with the exception of VESA +BIOS.) Even then, most facilities provided by the video BIOS are not particularly +useful. An example is the BIOS Read and Write Pixel commands. These are +the only BIOS provided method of accessing video memory other than text +functions. Anyone like myself who started learning 8086 assembler to speed +up their graphics discovered this function, and said "Cool. This is going +to be easy." I then ran my first program and discovered that I had just +lost a few orders of magnitude of performance. Thus I started to learn +to interact with the video card hardware directly. VESA BIOS is better, +although I have seen very few on-board implementations that work properly +(if present), usually resulting in the user having to run a TSR program +that provides VESA services in RAM. While VESA BIOS does provide some facilities +for non-real mode problems it still does place a function call penalty +on video code. My biggest complaint is that with VESA BIOS you are restricted +to programming to the lowest common denominator of all video hardware, +and thus to utilize any special features of a chipset, you still have to +learn to program the card directly. Many video cards are now considered +"obsolete" by their manufacturer (or the manufacturer has joined the great +corporation in the sky...) and developer support is no longer available. +The unfortunate problem is that these "obsolete" cards are the same cards +being used by non-profit organizations and schools who could otherwise +reap the benefits of a wide variety of free software. +
      +
    • +What specialized hardware features (2D/3D acceleration, hardware +cursor, video acceleration, etc.) does a particular card have, and how does +one utilize these features?
    • +
    +       This is probably the area where +the documentation is most important. These features are programmed differently +on each vendor's video hardware, but can improve the performance of a particular +application by several orders of magnitude. Programmers such as video game +programmers and assembly demo programmers have demonstrated that by pushing +hardware to its limits, previously inconceivable animation and special +effects are possible. Recent advances in 3D acceleration have made virtual +reality on the desktop a possibility. It is crucial to developers that +they be able to thoroughly understand the hardware's operation to maximize +its performance. +
      +
    • +What are the differences between specific implementations of particular +chipsets, and how does one write software that works with these differences?
    • +
    +      Even programmers writing for +the relatively standard VGA hardware can run into problems in this area. +This is the reason so many packages state support for IBM VGA or 100% compatible. +Frequently programmers encounter slight differences in hardware that, under +specific circumstances, can cause their programs to fail to operate properly. +If programmers had detailed documentation of the hardware differences of +specific implementations, they could, and are generally willing +to, write workarounds in their code in order to provide support for this +hardware. Occasionally subtle hardware problems arise in a particular version +of a board and are corrected in a later revision (possibly by simply revising +the BIOS). It is important to recognize the earlier version and be able +to write software that can deal with its particular problem. In addition, +many chipsets are designed in such a way that they can work with a variety +of support devices such as clock generators and Video DACs. It is important +to know how to detect and control these support devices, which may be (and +usually are) different in every implementation. Some of these devices +could be interchanged with pin-compatible devices which could provide additional +functionality. However, this would require special programming to utilize +the device's features. +
      +
    • +How does one perform diagnostics on a particular video card in order +to identify inoperative, semi-inoperative, and improperly configured hardware?
    • +
    +      While testing and verifying the operation +of various video boards, I discovered some cards that did not respond properly +to my programming. My initial thought was that I was doing something wrong +and tried to figure out what was wrong. However, further testing on another +identical card demonstrated that the first board had simply failed. There +is little diagnostic software for the VGA and SVGA adapters, particularly +when dealing with some of the more esoteric features. This is primarily +because little has been identified about the correct behavior of video +cards. Many manufacturers fail to include a thorough diagnostics utility +with their hardware, and the diagnostics that are provided are usually +specific to one operating system. +
      +
    • +How does one properly emulate a particular VGA/SVGA adapter in order +to properly implement compatibility for legacy full-screen applications?
    • +
    +        This is a particular +area of interest to myself and others. This knowledge can be used to create +a virtual machine capable of multitasking legacy applications. Particular +features that could be provided are the ability to execute a full-screen +program in the background, execute a full-screen program in a virtual window +on a desktop, emulate a particular video adapter and translate its output +to a form compatible with the hardware on the machine, provide the ability +to remotely view an application's screen across a network, and provide the ability +to debug a full-screen application without having to use a dual monitor +system or attached text terminal. Huge benefits can be reaped, but all +of the details of a particular hardware configuration must be known for +proper emulation/virtualization. For example, programs that attempt to +autodetect the hardware often rely on undocumented behaviors of video adapters. +These undocumented behaviors must be emulated properly for the application +to work properly. + +

    Notice: All trademarks used or referred to on this page are the property +of their respective owners. +
    All pages are Copyright © 1997, 1998, J. D. Neal, except where +noted. Permission for utilization and distribution is subject to the terms +of the FreeVGA Project Copyright License + + diff --git a/specs/freevga/glossary.htm b/specs/freevga/glossary.htm new file mode 100644 index 0000000..825f9d6 --- /dev/null +++ b/specs/freevga/glossary.htm @@ -0,0 +1,100 @@ + + + + + + + FreeVGA - Video Programming Glossary + + + +

    + +
    + +
    +
    Hardware Level VGA and SVGA Video Programming Information Page
    + +
    Video Programming Glossary  +
    +Introduction +
            This page is a glossary +covering video programming related terms.  It is extremely difficult +for me to determine which terms should be included in this page, thus if +you have come here looking for a particular term and are dismayed at not +finding it listed here, please send a note with the Feedback +Form including the term in the body of the message, and it will be +added here. +
    ------ C ------
    + + +

    CLUT -- see Color Look-Up Table + +

    Color Look-Up Table -- see +Palette Look-Up Table +
      +
      +

    ------ D ------
    + + +

    DAC -- acronym for Digital to Analog Converter, +which is an integrated circuit that converts intensity values to analog +signal values.  This is used in video chipsets to produce the analog +signals that are sent to the monitor.  Some DACs, known as RAMDACs, +contain a palette look-up table. + +

    Digital to Analog Converter -- see DAC +

    ------ F ------
    + + +

    Frame Buffer -- RAM that is part of the graphics hardware that +stores the pixel values that are converted into color intensity. +

    ------ P ------
    + + +

    Palette Look-Up Table -- a small table of +RAM that contains a set of RGB intensity values for +each index, of which there are 256 locations.  The information in +this table is used by the DAC to generate the analog +signal. Also known as a Color Look-Up Table or CLUT. +

    ------ R ------
    + + +

    RAMDAC -- A DAC that contains a built-in palette +look-up table. + +

    RGB -- acronym for Red, Green & Blue, which +describes the three primary colors of light that a CRT generates to produce +the range of visible colors. +

    ------ S ------
    + + +

    Super VGA -- see SVGA + +

    SVGA -- acronym for Super-VGA, +which is a term applied to chipsets and the advanced functionality of those +chipsets that are above and beyond the capabilities of the original IBM +VGA chipset. +

    ------ V ------
    + + +

    VGA -- acronym for Video Graphics Array, which +is the term for IBM's successor to the EGA graphics chipset.  This +term is also used when describing register compatible functions of other +chipsets such as SVGA chipsets. + +

    Video Graphics Array -- see VGA +
      +
      + +

      +
      + + diff --git a/specs/freevga/hardovr.htm b/specs/freevga/hardovr.htm new file mode 100644 index 0000000..3dc08cd --- /dev/null +++ b/specs/freevga/hardovr.htm @@ -0,0 +1,92 @@ + + + + + + + FreeVGA - Overview of Video Hardware Functionality + + + +

    Hardware Level VGA and SVGA Video Programming Information Page
    + +
    Overview of Video Hardware Functionality  +
    +Introduction +
            This page contains a general +overview of the functionality of VGA and SVGA cards, divided into various sections, +and gives a description of the functions of each section.  This is +intended to be a general description for those unfamiliar with the functionality +and capabilities of graphics hardware.  The basic function of graphics +hardware is to allow the CPU to manipulate memory specific to the graphics +hardware, and to take the information stored in that memory and output +it in a form that a monitor or LCD panel can use. + +

    Frame Buffer +
            This is the component of +the video hardware that stores the pixels and information to be displayed +on the monitor.  This is the center of the video hardware, as nearly +all operations are performed on or using this data.  The frame buffer +is a form of RAM, which is typically located outside the main graphics +chip and is implemented using DRAM chips; however, more sophisticated +forms of RAM that are ideal for video hardware applications, such as VRAM, are also used.  +The amount of video memory that is present determines the maximum resolution +that the hardware can generate.  The frame buffer is usually mapped +into a region of the host CPU's address space allowing it to be accessed +as if it were a portion of the main memory.  For example, in the VGA, +this memory is mapped into the lower 1M of the CPU address space, allowing +it to be directly accessible to real mode applications, which cannot directly +access the remaining memory.  In the VGA, this memory is broken up +into 4 separate color planes, which are recombined to produce the actual +pixel values at the time of display generation. + +
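The relationship between memory size, resolution, and the four color planes can be sketched in C. This is only an illustration under stated assumptions, not code from the FreeVGA pages: the function names `framebuffer_bytes` and `combine_planes` are hypothetical, and the sketch assumes a 16-color planar layout in which each plane contributes one bit per pixel, eight pixels share a byte, and bit 7 is the leftmost pixel.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: bytes of frame buffer needed to hold one screen.
   Rounds up so a partially used final byte is still counted. */
static size_t framebuffer_bytes(size_t width, size_t height,
                                size_t bits_per_pixel)
{
    return (width * height * bits_per_pixel + 7) / 8;
}

/* Recombine one bit from each of the four planes into a 4-bit pixel value.
   plane_bytes[p] is the byte read from plane p at the same offset.
   pixel selects one of the 8 pixels in that byte (0 = leftmost, which
   this sketch assumes is stored in bit 7). Plane p supplies bit p. */
static uint8_t combine_planes(const uint8_t plane_bytes[4], unsigned pixel)
{
    unsigned shift = 7u - (pixel & 7u);
    uint8_t value = 0;

    for (unsigned p = 0; p < 4; p++) {
        value |= (uint8_t)(((plane_bytes[p] >> shift) & 1u) << p);
    }
    return value;
}
```

Under these assumptions a 640x480 screen at 4 bits per pixel needs `framebuffer_bytes(640, 480, 4)` bytes, which is why the installed memory bounds the usable resolution.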

    Graphics Controller +
            This is the video chipset's +host interface to the frame buffer, and is part of the main graphics chip +or chips.  It allows the host CPU to manipulate the frame buffer in +a fashion suited to the task of graphics operations.  It allows certain +methods of access that are designed to reduce the CPU requirements for +performing standard video operations, particularly in accelerated chipsets, +which can have a quite complicated set of access methods that can include +line drawing, area and pattern fill, color conversion/expansion, and even +3D rendering acceleration.  For example, in the VGA the graphics controller +allows one write by the CPU to its mapped memory region below 1M to affect +all four color planes, as well as allowing faster transfers of video data +from one region to another in video memory. + +
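The idea that a single host write can affect all four color planes can be modeled in a few lines of C. This is a simplified software model, not the real register interface: the `planar_memory` structure, its `write_mask` field, and `planar_write` are hypothetical names introduced for illustration, and the tiny plane size is chosen only to keep the example small.

```c
#include <stddef.h>
#include <stdint.h>

#define PLANE_COUNT 4
#define PLANE_SIZE  16   /* tiny size, for illustration only */

/* Simplified model of planar video memory (hypothetical layout). */
struct planar_memory {
    uint8_t plane[PLANE_COUNT][PLANE_SIZE];
    uint8_t write_mask;  /* bit p set = plane p accepts host writes */
};

/* One host write: the same byte lands in every enabled plane.
   Returns the number of planes that were updated (0 if out of range). */
static unsigned planar_write(struct planar_memory *mem, size_t offset,
                             uint8_t value)
{
    unsigned updated = 0;

    if (offset >= PLANE_SIZE) {
        return 0;
    }
    for (unsigned p = 0; p < PLANE_COUNT; p++) {
        if (mem->write_mask & (1u << p)) {
            mem->plane[p][offset] = value;
            updated++;
        }
    }
    return updated;
}
```

With `write_mask` set to `0x0F`, one call to `planar_write` updates four bytes of video memory, which is the kind of CPU saving the paragraph above describes.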

    Display Generation +
            This portion of the graphics +hardware takes the pixel or character information stored in the frame buffer +by the graphics controller and +converts it into the analog signals required by the monitor or LCD display.  +The pixel data is first sequenced, or read serially from the frame buffer, +then converted into analog color information, either by a palette look-up +table, or by directly converting into red, green, and blue components.  +The CRT controller at the same time adds timing signals that allow the +monitor to display the analog color information on the display.  For +example, in the VGA these components are made up of the sequencer, attribute +controller, CRT controller, DAC, and palette table.  The sequencer +reads the information from the frame buffer, and converts it into pixel +color information, as well as sending signals to the CRT controller such +that it can provide the timing signals the monitor requires.  This +color information is formatted by the attribute controller in such a way +that the pixel values can be submitted to the DAC.  The DAC then looks +up these values in its palette table, which contains red, green, and blue +intensities for each of the colors that the attribute controller generates, +then converts them into an analog signal that is output to the VGA connector +along with the timing signals generated by the CRT controller.  If +the display is an LCD panel such as found in laptops, the DAC and associated +support hardware convert the pixel values to signals that the LCD panel +displays directly. + +
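The stages described above (attribute controller, palette look-up, conversion to an analog level) can be sketched as a small C model. Everything here is an illustrative assumption rather than the hardware's actual interface: the names `display_path`, `generate_pixel`, and `level_to_mv` are hypothetical, intensities are assumed to be 6-bit values (0..63), and full-scale output is assumed to be 700 mV.

```c
#include <stdint.h>

#define DAC_MAX_LEVEL 63u    /* assumed 6-bit intensity per component */
#define FULL_SCALE_MV 700u   /* assumed full-scale analog output */

struct dac_entry { uint8_t red, green, blue; };   /* intensities 0..63 */

/* Hypothetical model of the path from pixel value to analog output. */
struct display_path {
    uint8_t attribute_map[16];      /* 4-bit pixel value -> DAC index */
    struct dac_entry palette[256];  /* DAC index -> RGB intensities */
};

struct analog_mv { unsigned red, green, blue; };  /* millivolts */

/* Scale an intensity to millivolts, clamping out-of-range input. */
static unsigned level_to_mv(uint8_t level)
{
    unsigned clamped = level > DAC_MAX_LEVEL ? DAC_MAX_LEVEL : level;
    return clamped * FULL_SCALE_MV / DAC_MAX_LEVEL;
}

/* Attribute stage picks the DAC index; DAC stage looks up the palette
   entry and converts each component to an analog level. */
static struct analog_mv generate_pixel(const struct display_path *path,
                                       uint8_t pixel_value)
{
    uint8_t dac_index = path->attribute_map[pixel_value & 0x0Fu];
    const struct dac_entry *entry = &path->palette[dac_index];
    struct analog_mv out = {
        level_to_mv(entry->red),
        level_to_mv(entry->green),
        level_to_mv(entry->blue)
    };
    return out;
}
```

The timing signals added by the CRT controller are deliberately left out; the sketch covers only the color path for a single pixel.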

      + + diff --git a/specs/freevga/hardrec.htm b/specs/freevga/hardrec.htm new file mode 100644 index 0000000..0f865b3 --- /dev/null +++ b/specs/freevga/hardrec.htm @@ -0,0 +1,127 @@ + + + + + + + FreeVGA - Product Recommendations for Video Developers + + + +

    Hardware Level VGA and SVGA Video Programming Information +Page
    + +
    Product Recommendations for Video Developers  +
    +Introduction +
            This page is to provide +hardware recommendations for those implementing the information on this +site.  There are no recommendations for video cards, as the goal is +to increase programmer support for all cards, existing or otherwise, rather +than try to influence people to buy a specific video card implementation.  +I will, however, recommend hardware other than the video cards themselves +that is helpful in the development of software for video cards in general. +Monitors are a strong issue to me, both for safety concerns and financial +concerns, as it is usually more advantageous to buy a new, indestructible monitor +than to burn through many cheap, expendable monitors. + +

     Monitors Recommended +
            For a monitor to be recommended +it must meet all of the following criteria: +

      +
    • +It should be able to tolerate improperly formed video inputs for an extended +period of time without the possibility of being damaged.  It should +also be tolerant to extremely frequent mode changes.  Damage due to +this kind of operation should not be excluded by the standard manufacturer's +warranty.  This is critical due to the replacement cost of high-performance +monitors, and due to the possible safety and fire hazards failing monitors +may cause.
    • + +
    • +It should be able to synchronize to a wide variety of properly formed signals, +including both standard and custom video timings.  This is important +for developing the modes required for special applications.
    • + +
    • +It should handle the maximum frequencies/resolutions that can be generated +by current and future (to a reasonable extent) video chipsets.
    • + +
    • +It should be compliant with all levels of display to host communication and +power management so code can be developed that implements these features +of the video hardware.
    • + +
    • +If the monitor allows the picture controls to be saved/restored on changes +in mode, it should allow this feature to be defeated so that the generated +video timings can be adjusted to minimize the visible effects of mode change.
    • + +
    • +It should be currently available on the market and covered by the manufacturer's +warranty for the period of time required to develop the desired application.
    • + +
    • +It has been put through my own personal monitor torture tests, as well +as operated for an extended period of time under conditions related to +video software development.
    • +
    +The following monitors have been evaluated by myself personally, and have +been determined to meet all these criteria. +
      +
    • +At this time, there are no monitors that I have determined meet all these +criteria.
    • +
    +Monitor Failures +
            This section lists monitors +that have either died for me while testing, or have died for others in +a fashion that would imply that the programmer was responsible for their +failure.  This does not imply that the programmer was at fault, as +these things naturally happen when developing drivers.  I strongly +recommend the purchase of one of the recommended monitors, to avoid damaging +a valuable monitor. + +

    Compaq VGA (not SVGA) -- I do not know the model specifically.  +It was a fixed frequency model, and the horizontal circuitry was damaged.  +The problem was repeatable after repairs were made, so I believe that the +monitor can be damaged by normal mode testing. I have met others who claim +to have experienced this same problem.  Not recommended. + +

    CTX CMS-1561LR -- The problem with this monitor occurred when +driving the monitor at the high end of its frequency envelope.  The +monitor synced to the frequency, but may have been slightly overdriven.  +The horizontal output transistor and some capacitors were replaced and +the monitor was restored to working order.  The problem has not been +repeated, so ordinary failure is likely. + +

    NCR MBR 2321 -- This one comes from a friend in Fayetteville, +AR, whose monitor blew caps during the writing of an svgalib video driver.  +The explosion from the capacitors shattered the rear of the picture tube, +damaging the monitor beyond repair.  Not recommended due to the catastrophic +nature of the failure.  The operation being performed when the failure +occurred was frequent mode changing. + +

    Test Equipment Recommended +
            There are certain pieces +of test equipment that can come in handy when working with video cards.  +This can be especially important when verifying that the video signal being +generated is, in fact the one intended by the programmer.  Failure +to do this can cause catastrophic failure when the driver is used in conjunction +with a fixed-frequency or other monitor that can be damaged by improper +inputs. +

      +
    • +At this time, I cannot recommend any test equipment other than a good frequency +counter, as this is really not in my area of expertise.  If you can +help me in my research into this, I would be greatly appreciative.
    • +
    +Notice: All trademarks used or referred to on this page are the property +of their respective owners. +
    All pages are Copyright © 1997, 1998, J. D. Neal, except where +noted. Permission for utilization and distribution is subject to the terms +of the FreeVGA Project Copyright License + + diff --git a/specs/freevga/home.htm b/specs/freevga/home.htm new file mode 100644 index 0000000..e27835d --- /dev/null +++ b/specs/freevga/home.htm @@ -0,0 +1,486 @@ + + + + + + + FreeVGA Project Home - Hardware level VGA and SVGA programming info + + + +
    Hardware Level VGA and SVGA Video Programming Information Page
    + +
    Home  +
    + +
     
    + +
    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    76543210
    FREE VGA
    + +
    This page is home to the FreeVGA Project +-- dedicated to providing a totally FREE source of information about video +hardware.  Additional goals/information are +located here.
    + +
    "Keep on rocking in the free world." - Neil Young
    + + +

    Latest News +
            08/01/1998 -- More +information is now up, including a large portion of the "standard" VGA +reference.  Some other minor changes have been made to other information.  +Expect more updates in the not too far future. + +

            06/20/1998 -- The +work continues.  Added three new mirrors.  Some of the information +that was located in the VGA reference, but really applies to video programming +in general has been moved to the new Background Information (formerly Introduction) +section of this page, and has been released.  Also, a glossary has +been added defining terms related to video programming, but is not very +comprehensive at the moment, although this should improve over time. Many +minor corrections have been made to the released material after being pointed +out by the insightful people reading the information.  Thank you! + +

            06/08/1998 -- The +mirror list has been updated with the new entries. Special thanks goes +out to all those who have donated their personal resources to advance the +project's goals.  Also, the first section of *real* information is +online, the low-level programming introduction.  This section has +been relatively stable for quite some time, and seems to be releasable.  +It is my goal to release the information after it stabilizes, and has been +verified for accuracy. + +

            06/04/1998 -- If +you are looking for the current work-in-progress, and have been given the +passwords for the archive for review purposes, it can be downloaded here.  +For those with current problems/questions that this page addresses, please +feel free to use the Feedback Form to contact the +author.  If a link is marked with (WIP), it is not posted online +and at this time is available only for review, upon request, and under +specific limitations. + +

    Mirror Sites +
            At this time, the project +is experimenting with the feasibility of maintaining mirror sites to make +this information more widely available.  The following mirror sites +are provided for your convenience.  If you are interested in hosting +a mirror site of this information, please contact the author for more information. +If you are experiencing problems with any of these mirrors please use the +Feedback Form to contact the author, as it is likely +my fault that the problem has arisen. +

    +Preface +
            This page's purpose is to +provide free low-level programming information to programmers interested +in the low-level details of programming the VGA and Super VGA adapters, +in a format independent of operating environment or programming language. +This page is not intended to be a reference to graphics or game programming +algorithms but rather a reference for those people attempting to implement +such algorithms in whatever environment they are using. This page is not +intended to be a showcase of web technology and thus will use HTML features +and graphics only when it is necessary to convey information. For example, +I have left the colors and fonts set to the default, so you can actually +use the default preferences in your browser. I am continuously adding material +to this page and have tried to incorporate links to other sites with valuable +information. + +Background Information +
            Foremost, this page is meant +to be a place online where one can learn about low-level programming (If +everyone knew all of this information then this page would be redundant!) +This section contains general information that can be very helpful when +attempting to use the programming information located on this site. + +Standard VGA Chipset Reference +
            This section documents +the subset of functionality present on nearly all computers today. The +VGA BIOS utilizes only a fraction of the capability of the VGA hardware. +By programming the hardware at the lowest level, one gains the flexibility +to unleash the hardware's full potential. + +Super VGA Hardware Chipset Reference +
            This section documents the +known functionality of specific VGA/SVGA chipsets. Because developers of +chipsets and video cards face incredible competition, they have added features +and functionality far beyond the standard VGA hardware. Unfortunately for +programmers, these features have been implemented differently in each particular +chipset, and even differently between products by the same manufacturer. +It is difficult to obtain information on these chipsets and their implementations, +particularly so if the chipset is considered "obsolete" by the manufacturer. +Because of the open-ended nature of this topic (chipsets are under constant +development) this page will be updated constantly as new information becomes +available. + +Other Video Hardware Reference +
            This section is for +video hardware that does not specifically fit into the category of VGA +or SVGA, such as MPEG hardware, video capture hardware, non-integrated +3D accelerators, virtual reality gear, digital video cameras, stereoscopic +3D goggles, TV tuner cards, non-VGA compatible video adapters and the like. +This is another open-ended topic but is not the primary focus of this page. + +

    Tricks and Techniques +
            This section contains useful +information on how to utilize the VGA/SVGA hardware to optimize specific +tasks or implement visual effects. Many of these techniques have been utilized +by game and demo programmers alike to push the envelope of the hardware's +capacity. +

    +Other References: +
            This section gives +some pointers to other available VGA hardware information. Note +that they are listed here because +
      +
    • +Online Information
    • + +
        +
      • +The COMP.SYS.IBM.PC.HARDWARE.VIDEO +Frequently Asked Questions (FAQ) document, although not programming +oriented does contain much useful information about video hardware. The +site also includes links to nearly every vendor of video cards and monitors, +as well as links to pages covering monitor specifications. If you are looking +for video hardware related information not covered by the FreeVGA Project's +goals, you will likely find it or a link to it here.
      • + +
      • +Finn Thøgersen's VGADOC +& WHATVGA Homepage -- An excellent collection of information for +programming VGA and SVGA.
      • + +
      • +Boone's Programming +the VGA Registers -- Contains a very sketchy "Documentation Over the +I/O Registers for Standard VGA Cards" by "Shaggy of The Yellow One." It +is free and distributable over the "Feel free to spread this to whoever +wants it....." licensing agreement.
      • + +
      • +Andrew Scott's VGA Programmers Master Reference Manual (click +here to download from ftp.cdrom.com) -- A dated ('91) document that +is interesting if only because it attempts to document the VGA hardware +(actually the Trident TVGA8900 hardware) in a form useful for "writing +an applications specific BIOS." Begins with a very general description +of the topic (a wordy definition of computation in general) and ends with +detailed register descriptions. Unfortunately, it lacks much material between +these areas. Worse, far from being a free resource, it requires shareware +registration fees that must be sent to the U.K. by means of a check drawn +from a U.K. bank only!
      • + +
      • +Richard Wilton's Programmer's Guide to PC and PS/2 Video Systems +(click here +to download from www.dc.ee) An older reference, covers MDA, Hercules, +CGA, MCGA, and VGA. Not much VGA material but does have some register documentation.
      • + +
      • +IBM's RS/6000 CHRP I/O Device Reference Appendix +A: VGA Programming Model -- A good VGA reference from the makers of +the IBM VGA. Better than most on-line references as it contains programming +information in addition to a register description of the hardware; however +it is still vague in many areas. Especially interesting as it begins with +an acknowledgment of the many "clones" of the VGA hardware.
      • + +
      • +Some brief VGA register info is available from the Chip +Directory, also mirrored at other sites (see Chip +Directory home page).
      • + +
      • +Eric S. Raymond's The +XFree86 Video Timings HOWTO -- explains video mode and timing information +used in configuring XFree86 to support a given monitor, intended to be +used by the end user.  Much of the information is not specific to +XFree86, and can be used by a programmer as an example of how a low-level +video routine can allow the end-users to set up video modes that pertain +to their monitors, as well as being useful to an end-user of such a program +attempting to configure such a routine to work with their monitors.
      • + +
      • +Tomi Engdahl's electronics +info page has some information about video and VGA timings, as well +as a section on VGA to TV converters and homemade circuitry.  The +VGA +to TV converter page contains much information that pertains to driving +custom TV and monitors with a VGA or SVGA card that doesn't have the capability +built-in.
      • +
      + +
    • +Offline Information
    • + +
        +
      • +Richard F. Ferraro's Programmer's Guide to the EGA, VGA, and +Super VGA Cards, Third Edition -- A good text, one of the few good +books on a subject as broad and as complicated as low-level I/O.
      • + +
      • +Frank van Gilluwe's The Undocumented PC, Second Edition -- A Programmer's +Guide to I/O, CPUs and Fixed Memory Areas -- An excellent book, which +is likely the most complete PC technical reference ever written, and +includes 100+ pages of video programming information, although very little +VGA register information.
      • + +
      • +Bertelsons, Rasch & Hoffman's PC Underground: Unconventional Programming +Topics -- I bought this book on markdown, due to it having some VGA information +in it.  I was surprised to find that not only did it have a register +description, but it also described some possible effects that can be done +with that register.
      • +
      + +
    • +Miscellaneous Information (Information not specific to video hardware, +but useful to video programmers.)
    • + +
        +
      • +Norman Walsh's The +comp.fonts FAQ -- An excellent resource on fonts, typefaces, and such.  +Particularly helpful is the section on intellectual property protection +for fonts, as the copyright legality of fonts and typefaces is somewhat +confusing.  Note -- Norman Walsh has ceased maintaining the FAQ, however, +this link will remain until a new version of the FAQ is produced.
      • +
      +
    +Product Recommendations +
            The FreeVGA Project does +not make hardware recommendations as pertains to hardware covered by the +documentation, in an attempt to prevent any conflicts of interest.  +However, there are other products that can be extremely helpful when implementing +the information found here, such as monitors, test equipment, and software.  +I will not refuse any request to list a product on this page, however I +will categorize it depending upon its importance and suitability for video +related software development using opinions of myself and others.  +If you disagree with the opinion here, please use the Feedback Form to +voice that opinion, such that it can be taken into account. + +Warnings and Disclaimer +
      +
    • +Danger: Monitors are designed to operate within certain frequency +ranges, or for fixed frequency monitors at certain frequencies. Driving +a monitor at a frequency that it is not designed for is not recommended +and may cause damage to the monitor's circuitry which can result in a fire +and safety risk. It is wise to know and understand the specifications +of the monitor(s) that you will be driving in order to prevent damage. +Consult the manufacturer's documentation for the monitor for the information, +or if not available, contact the manufacturer directly.  If the monitor +makes unusual noises, or the internal temperature exceeds the rated temperature +of its components, the monitor is likely to experience failure.  This +failure may not be immediate, but is under most circumstances inevitable.  +Monitor failures can be violent in nature, and can explode and produce +shrapnel, as well as overheat and catch fire.  In no circumstance +should one leave a monitor unattended in an uncertain state.  Furthermore, +exceeding the rated maximum frequencies of a monitor may cause the phosphors +to age prematurely, as well as increase the amount of harmful radiation +projected towards the viewer beyond the specified maximums.
    • + +
    • +Warning: Clock chips and RAMDACs as well as other components +of the video card are designed with a maximum frequency. Programming +these chips to operate at a frequency greater than they were designed for +causes the chips to run hotter than they were designed to operate, and +may cause the component to fail. It is wise to know and understand +the maximum operating frequency of the components of any video subsystem +you will be programming. Do not assume that the component is safe to operate +at a particular frequency because it can be programmed to operate at that +frequency.  The rated frequencies are determined and verified according +to batch yield.  As clock frequencies increase, the failure rate of +the chips during manufacturing testing increases.  It is impossible +to predict the actual point at which a given semiconductor will fail, thus +manufacturers monitor the failure rate statistically to determine the frequency +that gives an acceptable batch yield.  These failures are typically +unobservable and require a method of testing every gate on the chip, as +many failures may only be observable under very specific circumstances, +typically resulting in intermittent failures, although complete "meltdown" +due to a newly formed short is also possible.   If they +occur, the entire semiconductor must be rejected due to these failures +being irreparable.  As you exceed the rated frequency you are taking +a semiconductor that has passed a thorough test at its rated frequency +and entering the realm of statistical probability.  Attempting to +find the maximum frequency is impossible, as by the time a failure is noticeable +the semiconductor has already been permanently damaged.  
Cooling the +external package by using a heat sink and/or fan may increase the frequency +at which a semiconductor can operate; however, there is still no way to +determine the frequency at which a specific semiconductor will fail, as +failure can only be predicted statistically and is practically undetectable without +being able to determine the proper operation of every gate on the semiconductor.  +Semiconductors such as fast CPUs are rated with the required heat sink +and/or cooling fan in place.  Aftermarket cooling devices are sold +as "performance coolers" due to the inability to determine the statistical +likelihood of failure and the inability of the end user to simply reject +failed semiconductors.  Under no circumstances should a programmer +develop software that overclocks an end-user's hardware without the end +user being warned of the statistical likelihood of failure.  +Making any claims about the safety of the software's operation can leave +the programmer with legal liability that cannot be excluded by disclaimer.
    • + +
    • +Disclaimer: The author presents this information as-is without +any warranty, including suitability for intended purpose. The author is +not responsible for damages resulting by the use of the information, incidental +or otherwise. By utilizing this information, you as the programmer take +full liability for any damages caused by your use of this information. +If you are not satisfied with these terms, then your only recourse is to +not use this information. While every reasonable effort is made to ensure +that this information is correct, the possibility exists for error and +is not guaranteed for accuracy, and disclaims liability for any changes, +errors or omissions and is not responsible for any damages that may arise +from the use or misuse of this information.  License to use this information +is only granted where this disclaimer applies in whole.
    • +
    +Feedback +
            I can be reached online +via the Feedback Form.  Consider it your +moral obligation to send feedback about the page, including inaccuracies, +confusing parts, missing info, questions/answers and other feedback type +thingies. +
      + +

    Notice: All trademarks used or referred to on this page are the property +of their respective owners. +
    All pages are Copyright © 1997, 1998, J. D. Neal, except where +noted. Permission for utilization and distribution is subject to the terms +of the FreeVGA Project Copyright License + + diff --git a/specs/freevga/license.htm b/specs/freevga/license.htm new file mode 100644 index 0000000..3aa676d --- /dev/null +++ b/specs/freevga/license.htm @@ -0,0 +1,124 @@ + + + + + + + FreeVGA Copyright License + + + +

    Home Back  +
    Hardware Level VGA and SVGA Video Programming Information Page
    + +
    FreeVGA Project Copyright License  +
    +Introduction +
            This document contains the +FreeVGA Copyright License which states the conditions under which the FreeVGA +Project's Copyrighted information may be used and distributed.  The +conditions of this license ensure that all parties with a need for this +information have the same availability, to the maximum extent possible +as well as ensure the integrity of the documentation. + +

    Disclaimer +
            The author presents this +information as-is without any warranty, including suitability for intended +purpose. The author is not responsible for damages resulting by the use +of the information, incidental or otherwise. By utilizing this information, +you as the programmer take full liability for any damages caused by your +use of this information. If you are not satisfied with these terms, then +your only recourse is to not use this information. While every reasonable +effort is made to ensure that this information is correct, the possibility +exists for error and is not guaranteed for accuracy, and disclaims liability +for any changes, errors or omissions and is not responsible for any damages +that may arise from the use or misuse of this information.  License +to use this information is only granted where this disclaimer applies in +whole. + +

    License +
            The following copyright +license applies to all works by the FreeVGA Project. All of the FreeVGA +Project's documentation is copyrighted by its author, Joshua Neal. + +

    License to utilize the FreeVGA Project documentation is subject to the +following conditions: +

      +
    • +The copyright notice and this permission notice must be preserved complete +on all copies, complete or partial.
    • + +
    • +Duplication is permitted only for personal purposes.  Reduplication +is permitted only under the FreeVGA Project documentation's redistribution +license.
    • + +
    • +The use of the FreeVGA Project documentation to produce translations or +derivative works must be approved specifically by the author.
    • + +
    • +All warnings and disclaimers present in the complete documentation must +apply to the licensee and may not be restricted by locality.  These +must be read before use, and determined to be applicable to the licensee +before the material may be utilized.
    • + +
    • +It is forbidden to represent the FreeVGA Project or to use the FreeVGA +Project's name to solicit or obtain information, services, product, or +endorsements from another party, commercial or otherwise.
    • +
    +If all of the previous conditions are not met, then permission to utilize +the FreeVGA Project's documentation is not granted, and all rights are +reserved. + +

    License to distribute the FreeVGA Project documentation is subject to +the following conditions: +

      +
    • +The copyright notice and this permission notice must be preserved complete +on all copies, complete or partial.
    • + +
    • +An archive of the FreeVGA Project documentation may be distributed in electronic +form only in its entirety, without adding or removing any material, notices, +advertisement, or other information.  Only exact copies of archives +produced or specifically approved by the author may be distributed, and +at the time of distribution, the most recent archive must be distributed.  +The FreeVGA Project documentation must be excluded from any compilation +copyright or other restrictions.  No fee other than the cost of transmission +or the physical media containing the archive may be charged without prior +approval by the author.  The documentation may not be distributed +electronically in part, which includes mirroring in html format on the +internet, unless specific permission is granted by the author.
    • + +
    • +The FreeVGA Project documentation may be distributed in non-electronic +form to students or members of a programming team subject to the condition +that it be provided free of charge.  The documentation may not be +included with or within other copyrighted works unless the other copyrighted +works are also provided free of charge.
    • + +
    • +Small portions may be reproduced as illustrations for reviews or quotes +in other works without this permission notice if proper citation is given +(including URL if the work is online.)
    • + +
    • +Only the current documentation may be distributed.  The URL of the +FreeVGA project online documentation must be provided.  The author +reserves the right to limit distribution by any parties at any time.
    • +
    +If all of the previous conditions are not met, then permission to redistribute +the FreeVGA Project's documentation is not granted, and all distribution +rights are reserved. + +

    Notice: All trademarks used or referred to on this page are the property +of their respective owners. +
    All pages are Copyright © 1997, 1998, J. D. Neal, except where +noted. Permission for utilization and distribution is subject to the terms +of the FreeVGA Project Copyright License. +
      +
      + + diff --git a/specs/freevga/llintro.htm b/specs/freevga/llintro.htm new file mode 100644 index 0000000..7131b01 --- /dev/null +++ b/specs/freevga/llintro.htm @@ -0,0 +1,246 @@ + + + + + + + FreeVGA -- Introduction to Low-level Programming + + + +

    Home Intro Know +Why Assembly Hex +Conventions Memory Accessing +Back  +
    Hardware Level VGA and SVGA Video Programming Information +Page
    + +
    Introduction to Low-level Programming  +
    + + +

    Introduction +
            This section is intended +to give one a general background in low-level programming, specifically +related to video programming. It assumes that you already know how to program +in your intended programming environment, and answers questions such as: +

    +What do I need to know? +
            To program video +hardware at the lowest level you should have an understanding of hexadecimal +and binary numbers, how the CPU accesses memory and I/O ports, and finally +how to perform these operations in your particular programming environment. +In addition you need detailed descriptions of the particular graphics chipset +you are programming. + +

    Why write hardware-level code? +
            One reason for writing hardware-level +code is to develop drivers for operating systems or applications. Another +reason is when existing drivers do not provide the required performance +or capabilities for your application, such as for programming games or +multimedia. Finally, the most important reason is enjoyment. There is a +whole programming scene dedicated to producing "demos" of what the VGA/SVGA +can do. + +

    Do I need to know assembly language? +
            No, but it helps. Assembly +language is the basis of all CPU operations. All functions of the processor +and computer are potentially accessible from assembly. With crafty use +of assembly language, one can write software that greatly exceeds the potential +of higher level languages. However, any programming environment that provides +direct access to I/O and memory will work. + +

    What are hex and binary numbers? +
            Humans use a number system +using 10 different digits (0-9), probably because that is the number of +fingers we have. Each digit represents part of a number with the rightmost +digit being ones, the second to right being tens, then hundreds, thousands +and so on. These represent the powers of 10, and the system is called "base 10" or +"decimal." +
            Computers are digital (and +don't usually have fingers) and use two states, either on or off to represent +numbers. The off state is represented by 0 and the on state is represented +by 1. Each digit (called a bit, short for Binary digIT) here represents +a power of 2, such as ones, twos, fours, and doubling each subsequent digit. +Thus this number system is called "base 2" or "binary." +
            Computer researchers realized +that binary numbers are unwieldy for humans to deal with; for example, +a 32-bit number would be represented in binary as 11101101010110100100010010001101. +Converting decimal to binary or vice versa requires multiplication or division, +something early computers performed very slowly, and researchers instead +created a system where the numbers were broken into groups of 3 (such as +11 101 101 010 110 100 100 010 010 001 101) and assigned a number from +0-7 based on the triplet's decimal value. This number system is called +"base 8" or "octal." +
            Computers deal with numbers +in groups of bits, usually with a length that is a power of 2; for example, four +bits is called a "nibble", eight bits is called a "byte", 16 bits is a +"word", and 32 bits is a "double word." These powers of two are not evenly +divisible by 3, so as you see in the divided example a "double word" is +represented by 10 complete octal digits plus two-thirds of an octal digit. +It was then realized that by grouping bits into groups of four, a byte +could be accurately represented by two digits. Because a group of four +bits can represent 16 different values, they could not be represented by +simply using 0-9, so they simply created more digits, or rather re-used +the letters A-F (or a-f) to represent 10-15. So for example the rightmost +four bits of our example binary number are 1101, which translates to 13 decimal +or D in this system, which is called "base 16" or "hexadecimal." +
            Computers nowadays usually +speak decimal (multiplication and division are much faster now) but when +it comes to low-level and hardware stuff, hexadecimal or binary is usually +used instead. Programming environments require you to explicitly specify +the base a number is in when overriding the default decimal. In most assembler +packages this is accomplished by appending "h" for hex and "b" for binary +to the end of the number, such as 2A1Eh or 011001b. In addition, if the hex +value starts with a letter, you have to add a "0" to the beginning of the +number to distinguish it from a label or identifier, such as +0BABEh. In C and many other languages the standard is to prefix a hex number +with "0x". Binary prefixes vary: some environments use "%" and some C +compilers accept "0b", but C did not standardize a binary literal until C23. +Consult your programming environment's documentation for +how specifically to do this. +
            Historical Note: Another +possible explanation for the popularity of the octal system is that early +computers used 12 bit addressing instead of the 16 or 32 bit addressing +currently used by modern processors. Four octal digits conveniently cover +this range exactly, thus the historical architecture of early computers +may have been the cause for octal's popularity. + +
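The grouping described above can be sketched in C. This is only an illustration: the function name `format_grouped` is hypothetical, and it simply splits a 32-bit value into digits of 1, 3, or 4 bits each, which yields binary, octal, or hexadecimal.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: write `value` into `out` using digits that are
   `bits_per_digit` bits wide (1 = binary, 3 = octal, 4 = hexadecimal),
   most significant digit first. Leading zero digits are kept so the
   grouping stays visible. Returns the number of digits written. */
size_t format_grouped(uint32_t value, unsigned bits_per_digit,
                      char *out, size_t out_size)
{
    static const char digits[] = "0123456789ABCDEF";
    assert(bits_per_digit >= 1 && bits_per_digit <= 4);

    /* Digits needed to cover 32 bits, rounded up (octal needs 11). */
    size_t count = (32 + bits_per_digit - 1) / bits_per_digit;
    assert(out_size > count);

    uint32_t mask = (1u << bits_per_digit) - 1u;
    for (size_t i = 0; i < count; i++) {
        /* Peel digits off the right and store them from the end. */
        out[count - 1 - i] = digits[value & mask];
        value >>= bits_per_digit;
    }
    out[count] = '\0';
    return count;
}
```

Applied to the 32-bit example from the text, the 3-bit grouping reproduces the eleven octal groups shown earlier, and the 4-bit grouping produces eight hexadecimal digits ending in D.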

    What are the numerical conventions used +in this reference? +
            Decimal is used often +in this reference, as it is conventional to specify certain details about +the VGA's operation in decimal. Decimal numbers have no letter after them. +An example you might see would be a video mode described as 640x480x256. +This means that the video mode has 640 pixels across by 480 pixels down +with 256 possible simultaneous colors. Similarly an 80x25 text mode means +that the text mode is 80 characters wide (columns) by 25 characters high +(rows.) Binary is frequently used to specify fields within a particular +register and also for bitmap patterns, and has a trailing letter b (such +as 10011100b) to distinguish it from a decimal number containing only 0's +and 1's. Octal you are not likely to encounter in this reference, but if +used has a trailing letter o (such as 145o) to distinguish it from a decimal. +Hexadecimal is always used for addressing, such as when describing I/O +ports, memory offsets, or indexes. It is also often used in fields longer +than 3 bits, although in some cases it is conventional to utilize decimal +instead (for example in a hypothetical screen-width field.) +
            Note: Decimal numbers in +the range 0-1 are also binary digits, and if only a single digit is present, +the decimal and binary numbers are equivalent. Similarly, for octal +a single digit between 0-7 is equivalent to the decimal numbers in the +same range. With hexadecimal, the single-digit numbers 0-9 are equivalent +to decimal numbers 0-9. Under these circumstances, the number is often +given as decimal where another format would be conventional, as the number +is equivalent to the decimal value. +
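As a rough sketch of these conventions, the following C function reads a number written with the trailing b, o, or h letters used in this reference and falls back to decimal when no letter is present. The name `parse_suffixed` and the 39-character limit are assumptions made for the example, and overflow is not checked.

```c
#include <assert.h>
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: parse "3C0h", "10011100b", "145o", or "640".
   Returns 1 and stores the result in *value on success, 0 if the text
   is empty, too long, or contains a digit that is invalid for its base. */
int parse_suffixed(const char *text, unsigned long *value)
{
    assert(text != NULL && value != NULL);
    size_t len = strlen(text);
    if (len == 0 || len >= 40)
        return 0;

    /* The trailing letter, if any, selects the base. */
    int base = 10;
    switch (tolower((unsigned char)text[len - 1])) {
    case 'h': base = 16; len--; break;
    case 'b': base = 2;  len--; break;
    case 'o': base = 8;  len--; break;
    default: break;
    }
    if (len == 0)
        return 0;

    /* Copy the digits without the suffix; reject signs, spaces, "0x". */
    char digits[40];
    for (size_t i = 0; i < len; i++) {
        if (!isxdigit((unsigned char)text[i]))
            return 0;
        digits[i] = text[i];
    }
    digits[len] = '\0';

    char *end;
    *value = strtoul(digits, &end, base);
    return *end == '\0'; /* fails if a digit was out of range for the base */
}
```

For example, `parse_suffixed("3C0h", &v)` stores 960 in `v`, while `parse_suffixed("12b", &v)` fails because 2 is not a binary digit.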
      + +

    How do memory and I/O ports work? +
            80x86 machines have both +a memory address space and an input/output (I/O) address space. Most of +the memory is provided as system RAM on the motherboard and most of the +I/O devices are provided by cards (although the motherboard does provide +quite a bit of I/O capability, depending on the motherboard design.) +Some cards also provide memory. The VGA and SVGA display adapters provide +memory in the form of video memory, and they also handle I/O addresses +for controlling the display, so you must learn to deal with both. An adapter +card could perform all of its functions using solely memory or I/O (and +some do), but I/O is usually used because the decoding circuitry is simpler, +and memory is used when higher performance is required. +
            The original PC design was +based upon the capabilities of the 8086/8088, which allowed for only 1 +MB of memory, of which a small range (64K) was allotted for graphics memory. +Designers of high-resolution video cards needed to put more than 64K of +memory on their video adapters to support higher resolution modes, and +used a concept called "banking" which made the 64K available to the processor +into a "window" which shows a 64K chunk of video memory at once. Later +designs used multiple banks and other techniques to simplify programming. +Since modern 32-bit processors have 4 gigabytes of address space, some +designers allow you to map all of the video memory into a "linear frame +buffer" allowing access to the entire video memory at once without having +to change the current window pointer (which can be time consuming), while +still providing support for the windowed method. +
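The arithmetic behind banking can be sketched as follows. This assumes a single 64K window whose position can only be set in 64K steps and a mode with one byte per pixel; real chipsets differ in bank granularity and in how the bank is selected, and that register is not shown here. The names `pixel_offset` and `split_banked` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumed layout: one byte per pixel, rows stored one after another. */
uint32_t pixel_offset(uint32_t x, uint32_t y, uint32_t width)
{
    assert(x < width);
    return y * width + x;
}

/* Position of a byte of video memory as seen through a 64K window. */
struct bank_address {
    uint32_t bank;    /* which 64K chunk the window must show */
    uint16_t offset;  /* displacement inside the window */
};

/* Assumes 64K bank granularity: the upper bits select the bank and
   the lower 16 bits are the offset within the window. */
struct bank_address split_banked(uint32_t video_offset)
{
    struct bank_address result;
    result.bank = video_offset >> 16;
    result.offset = (uint16_t)(video_offset & 0xFFFFu);
    return result;
}
```

In a 640-pixel-wide mode under these assumptions, pixel (255, 102) is the last byte of bank 0 and pixel (256, 102) is the first byte of bank 1, so a bank switch can fall in the middle of a scan line.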
            Memory can be accessed most +flexibly as it can be the source and/or target of almost every machine +language instruction the CPU is capable of executing, as opposed to a very +limited set of I/O instructions. I/O space is divided into 65536 addresses +in the range 0-65535. Most I/O devices are configured to use a limited +set of addresses that cannot conflict with another device. The primary +instructions for accessing I/O are the assembly instructions "IN" and "OUT", +simply enough. Most programming environments provide similarly named instructions, +functions, or procedures for accessing these. +
            Memory can be a bit confusing +though, because the CPU has two memory addressing modes, Real mode and +Protected mode. Real mode was the only method available on the 8086 and +is still the primary addressing mode used in DOS. Unfortunately, real mode +only provides access to the first 1 MB of memory. Protected mode is used +on the 80286 and up to allow access to more memory. (There are also other +details such as protection, virtual memory, and other aspects not particularly +applicable to this discussion.) Memory is accessed by the 80x86 processors +using segments and offsets. Segments tell the memory management unit where +in memory the segment is located, and the offset is the displacement from that address. +In real mode offsets are limited to 64K, because of the 16-bit nature of +the 8086. In protected mode, segments can be any size up to the full address +capability of the machine. Segments are accessed via special segment registers +in the processor. In real mode, the segment address is shifted left four +bits and added to the offset, allowing for a 20 bit address (20 bits = +1 MB); in protected mode segments are offsets into a table in memory which +tells where the segment is located. Your particular programming environment +may create code for real and/or protected mode, and it is important to +know which mode is being used. An added difficulty is the fact that protected +mode provides for I/O and memory protection (hence protected mode), in +order to allow multiple programs to share one processor and prevent them +from corrupting other processes. This means that you may need to interact +with the operating system to gain rights to access the hardware directly. +If you write your own protected mode handler for DOS or are using a DOS +extender, then this should be simple, but it is much more complicated under +multi-tasking operating systems such as Windows or Linux. + +
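The real-mode calculation described above (segment shifted left four bits, plus offset) can be written out as a short C sketch. The function names are hypothetical, and the segment values in the usage note are only illustrative.

```c
#include <stdint.h>

/* Real-mode address formation: (segment << 4) + offset.
   The sum can slightly exceed 20 bits when the segment is near FFFFh. */
uint32_t real_mode_linear(uint16_t segment, uint16_t offset)
{
    return ((uint32_t)segment << 4) + offset;
}

/* The same calculation truncated to 20 bits, as on an 8086, which has
   only 20 address lines and therefore wraps around at 1 MB. */
uint32_t real_mode_linear_8086(uint16_t segment, uint16_t offset)
{
    return real_mode_linear(segment, offset) & 0xFFFFFu;
}
```

For instance, `real_mode_linear(0xB800, 0x0000)` and `real_mode_linear(0xB000, 0x8000)` both give B8000h, which shows that different segment:offset pairs can name the same byte.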

    How do I access these from my programming environment? +
            That is a very important +question, one that is very difficult to answer without knowing all of the +details of your programming environment. The documentation that accompanies +your particular development environment is the best place to look for this +information, in particular the compiler, operating system, and/or the chip +specifications for the platform. + +

    Details for some common programming environments are given in: +

    +Notice: All trademarks used or referred to on this page are the property +of their respective owners. +
    All pages are Copyright © 1997, 1998, J. D. Neal, except where +noted. Permission for utilization and distribution is subject to the terms +of the FreeVGA Project Copyright License. + + diff --git a/specs/freevga/vga/256left.gif b/specs/freevga/vga/256left.gif new file mode 100644 index 0000000..9098b84 Binary files /dev/null and b/specs/freevga/vga/256left.gif differ diff --git a/specs/freevga/vga/256left.txt b/specs/freevga/vga/256left.txt new file mode 100644 index 0000000..3a7630a --- /dev/null +++ b/specs/freevga/vga/256left.txt @@ -0,0 +1,20 @@ + 256-Color Shift Mode Diagram (Left) + ----------------------------------- + + Plane 0 Plane 1 Plane 2 Plane 3 + Carried 7654 3210 7654 3210 7654 3210 7654 3210 +From Prev 0000 1111 0000 1111 0000 1111 0000 1111 + | | | | | | | | | + | | | | | | | | | + XXXX 0000 | | | | | | | + 0 0000 1111 | | | | | | + 1 1111 0000 | | | | | + 2 0000 1111 | | | | + 3 1111 0000 | | | + 4 0000 1111 | | + 5 1111 0000 | +<----- Direction of Shift 6 0000 1111 + 7 | + v + Carried + To Next diff --git a/specs/freevga/vga/256right.gif b/specs/freevga/vga/256right.gif new file mode 100644 index 0000000..11721f6 Binary files /dev/null and b/specs/freevga/vga/256right.gif differ diff --git a/specs/freevga/vga/256right.txt b/specs/freevga/vga/256right.txt new file mode 100644 index 0000000..f213bfe --- /dev/null +++ b/specs/freevga/vga/256right.txt @@ -0,0 +1,18 @@ + 256-Color Shift Mode Diagram (Right) + ----------------------------------- + + Plane 3 Plane 2 Plane 1 Plane 0 + 7654 3210 7654 3210 7654 3210 7654 3210 Carried + 0000 1111 0000 1111 0000 1111 0000 1111 From Prev + | | | | | | | | | + | | | | | | | 1111 XXXX + | | | | | | 0000 1111 0 + | | | | | 1111 0000 1 + | | | | 0000 1111 2 + | | | 1111 0000 3 + | | 0000 1111 4 + | 1111 0000 5 + 0000 1111 6 + | 7 + Carried + To Next diff --git a/specs/freevga/vga/attrreg.htm b/specs/freevga/vga/attrreg.htm new file mode 100644 index 0000000..e1ed97b --- /dev/null +++ 
b/specs/freevga/vga/attrreg.htm @@ -0,0 +1,360 @@ + + + + + + + VGA/SVGA Video Programming--Attribute Controller Registers + + + +
    Home Back  +
    Hardware Level VGA and SVGA Video Programming Information +Page
    + +
    Attribute Controller Registers  +
    + + +

            The Attribute Controller +Registers are accessed via a pair of registers, the Attribute Address/Data +Register and the Attribute Data Read Register. See the Accessing +the VGA Registers section for more details. The Address/Data Register +is located at port 3C0h and the Data Read Register is located at port 3C1h. +
      + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Attribute Address Register(3C0h)
    76543210
    PASAttribute Address
    +  +

      PAS -- Palette Address Source
      +
      "This bit is set to 0 to load color values to the registers in the +internal palette. It is set to 1 for normal operation of the attribute +controller. Note: Do not access the internal palette while this bit is +set to 1. While this bit is 1, the Type 1 video subsystem disables accesses +to the palette; however, the Type 2 does not, and the actual color value +addressed cannot be ensured." +
    • +Attribute Address
      +
      This field specifies the index value of the attribute register to be +read or written.
    • +
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Palette Registers (Index 00-0Fh)
    76543210
    Internal Palette Index
    +  +
      Internal Palette Index
      +
      "These 6-bit registers allow a dynamic mapping between the text +attribute or graphic color input value and the display color on the CRT +screen. When set to 1, this bit selects the appropriate color. The Internal +Palette registers should be modified only during the vertical retrace interval +to avoid problems with the displayed image. These internal palette values +are sent off-chip to the video DAC, where they serve as addresses into +the DAC registers."
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Attribute Mode Control Register +(Index 10h)
    76543210
    P54S8BITPPMBLINKLGEMONOATGE
    +  +
      P54S -- Palette Bits 5-4 Select
      +
      "This bit selects the source for the P5 and P4 video bits that act +as inputs to the video DAC. When this bit is set to 0, P5 and P4 are the +outputs of the Internal Palette registers. When this bit is set to 1, P5 +and P4 are bits 1 and 0 of the Color Select register." +
      8BIT -- 8-bit Color Enable
      +
      "When this bit is set to 1, the video data is sampled so that eight +bits are available to select a color in the 256-color mode (0x13). This +bit is set to 0 in all other modes." +
    • +PPM -- Pixel Panning Mode
    • + +
      This field allows the upper half of the screen to pan independently +of the lower screen. If this field is set to 0 then nothing special occurs +during a successful line compare (see the Line +Compare field.) If this field is set to 1, then upon a successful line +compare, the bottom portion of the screen is displayed as if the Pixel +Shift Count and Byte Panning fields are +set to 0. +
      BLINK - Blink Enable
      +
      "When this bit is set to 0, the most-significant bit of the attribute +selects the background intensity (allows 16 colors for background). When +set to 1, this bit enables blinking." +
    • +LGE - Line Graphics Enable
    • + +
      This field is used in 9 bit wide character modes to provide continuity +for the horizontal line characters in the range C0h-DFh. If this field +is set to 0, then the 9th column of these characters is replicated from +the 8th column of the character. Otherwise, if it is set to 1 then the +9th column is set to the background like the rest of the characters. +
    • +MONO - Monochrome Emulation
    • + +
      This field is used to store your favorite bit. According to IBM, "When +this bit is set to 1, monochrome emulation mode is selected. When this +bit is set to 0, color emulation mode is selected." It is present and +programmable in all of the hardware but it apparently does nothing. The +internal palette is used to provide monochrome emulation instead. +
    • +ATGE - Attribute Controller Graphics Enable
      +
      "When set to 1, this bit selects the graphics mode of operation."
    • +
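The fields of the Attribute Mode Control Register can be picked apart with bit masks, as in this C sketch. The mask names, the struct, and `decode_attr_mode` are hypothetical. The bit positions follow the register diagram above, under the assumption that bit 4 is the one position with no field assigned.

```c
#include <stdbool.h>
#include <stdint.h>

/* Bit masks for the Attribute Mode Control Register (index 10h).
   Bit 4 is assumed to be unused, so it has no mask. */
#define ATTR_MODE_ATGE  0x01u  /* graphics enable */
#define ATTR_MODE_MONO  0x02u  /* monochrome emulation */
#define ATTR_MODE_LGE   0x04u  /* line graphics enable */
#define ATTR_MODE_BLINK 0x08u  /* blink enable */
#define ATTR_MODE_PPM   0x20u  /* pixel panning mode */
#define ATTR_MODE_8BIT  0x40u  /* 8-bit color enable */
#define ATTR_MODE_P54S  0x80u  /* palette bits 5-4 select */

struct attr_mode {
    bool graphics, mono, line_graphics, blink;
    bool pixel_panning, color8, p54s;
};

/* Split a raw register value into its individual one-bit fields. */
struct attr_mode decode_attr_mode(uint8_t value)
{
    struct attr_mode m;
    m.graphics      = (value & ATTR_MODE_ATGE)  != 0;
    m.mono          = (value & ATTR_MODE_MONO)  != 0;
    m.line_graphics = (value & ATTR_MODE_LGE)   != 0;
    m.blink         = (value & ATTR_MODE_BLINK) != 0;
    m.pixel_panning = (value & ATTR_MODE_PPM)   != 0;
    m.color8        = (value & ATTR_MODE_8BIT)  != 0;
    m.p54s          = (value & ATTR_MODE_P54S)  != 0;
    return m;
}
```

As an illustrative value, 0Ch decodes to BLINK and LGE set with everything else clear, and 41h decodes to 8BIT and ATGE set.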
    +  + + + + + + + + + + + + + + + + + + + + + + + + +
    Overscan Color Register (Index 11h)
    76543210
    Overscan Palette Index
    +  +
      Overscan Palette Index
      +
      "These bits select the border color used in the 80-column alphanumeric +modes and in the graphics modes other than modes 4, 5, and D. (Selects +a color from one of the DAC registers.)"
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Color Plane Enable Register (Index +12h)
    76543210
    Color Plane Enable
    +  +
      +
    • +Color Plane Enable
      +
      "Setting a bit to 1, enables the corresponding display-memory color +plane."
    • +
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Horizontal Pixel Panning Register +(Index 13h)
    76543210
    Pixel Shift Count
    +  +
      Pixel Shift Count
      +
      "These bits select the number of pels that the video data is shifted +to the left. PEL panning is available in both alphanumeric and graphics +modes."
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Color Select Register (Index 14h)
    76543210
    Color Select 7-6Color Select 5-4
    +  +
      Color Select 7-6
      +
      "In modes other than mode 13 hex, these are the two most-significant +bits of the 8-bit digital color value to the video DAC. In mode 13 hex, +the 8-bit attribute is the digital color value to the video DAC. These +bits are used to rapidly switch between sets of colors in the video DAC." +
      Color Select 5-4
      +
      "These bits can be used in place of the P4 and P5 bits from the +Internal Palette registers to form the  8-bit digital color value +to the video DAC. Selecting these bits is done in the Attribute Mode Control  +register (index 0x10). These bits are used to rapidly switch between colors +sets within the video DAC."
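Putting the P54S bit and the Color Select fields together, the 8-bit value sent to the video DAC in modes other than mode 13 hex can be sketched as below. The function name `dac_index` is hypothetical, and the sketch assumes that Color Select 7-6 occupies bits 3-2 of the Color Select Register and that Color Select 5-4 occupies bits 1-0, as the IBM text quoted for P54S suggests.

```c
#include <stdint.h>

/* Form the 8-bit DAC index from a 6-bit Internal Palette entry and
   the Color Select Register (not applicable to mode 13 hex).
   p54s is the P54S bit of the Attribute Mode Control Register. */
uint8_t dac_index(uint8_t palette_entry, uint8_t color_select, int p54s)
{
    /* P3-P0 always come from the Internal Palette register. */
    uint8_t low = palette_entry & 0x0Fu;

    /* P5-P4 come from the palette entry, or from Color Select 5-4
       (assumed to be register bits 1-0) when P54S is set. */
    uint8_t mid = p54s ? (uint8_t)((color_select & 0x03u) << 4)
                       : (uint8_t)(palette_entry & 0x30u);

    /* Bits 7-6 always come from Color Select 7-6 (assumed bits 3-2). */
    uint8_t high = (uint8_t)((color_select & 0x0Cu) << 4);

    return (uint8_t)(high | mid | low);
}
```

With P54S set, changing only the Color Select Register moves every palette entry to a different block of 16 DAC registers at once, which is the rapid color-set switching the quoted text refers to.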
    +Notice: All trademarks used or referred to on this page are the property +of their respective owners. +
    All pages are Copyright © 1997, 1998, J. D. Neal, except where +noted. Permission for utilization and distribution is subject to the terms +of the FreeVGA Project Copyright License. + + diff --git a/specs/freevga/vga/char.txt b/specs/freevga/vga/char.txt new file mode 100644 index 0000000..b255efb --- /dev/null +++ b/specs/freevga/vga/char.txt @@ -0,0 +1,22 @@ +_____Examples_of_Text_Mode_Bitmap_Characters_____ + + 7 8x8 0 ___Legend___ 7 8x16 0 7 9x16 0 + 0--XX---- - Background 0-------- 0-------- + -XXXX--- X Foreground -------- -------- + XX--XX-- ? Undisplayed ---X---- XX----XX + XX--XX-- --XXX--- XXX--XXX + XXXXXX-- -XX-XX-- XXXXXXXX + XX--XX-- XX---XX- XXXXXXXX + XX--XX-- XX---XX- XX-XX-XX + 7-------- <------+ XXXXXXX- XX----XX + 8???????? | XX---XX- XX----XX + ???????? XX---XX- XX----XX + ???????? Maximum Scan XX---XX- XX----XX + ???????? Line XX---XX- XX----XX + ???????? -------- -------- + ???????? | -------- -------- + ???????? | -------- -------- + ???????? +------> 15-------- 15-------- + ???????? 16???????? 16???????? + ... ... ... +31???????? 31???????? 31???????? diff --git a/specs/freevga/vga/colorreg.htm b/specs/freevga/vga/colorreg.htm new file mode 100644 index 0000000..c57fdf2 --- /dev/null +++ b/specs/freevga/vga/colorreg.htm @@ -0,0 +1,253 @@ + + + + + + + VGA/SVGA Video Programming--Color Regsters + + + +
      +
      Home Back  +
      Hardware Level VGA and SVGA Video Programming Information +Page
      + +
      Color Registers
      + +
      +
      +
    +        The Color Registers in the standard +VGA provide a mapping from a palette of between 2 and 256 colors to +a larger 18-bit color space. This capability allows for efficient use of +video memory while providing greater flexibility in color choice. The standard +VGA has 256 palette entries containing six bits each of red, green, and +blue values. The palette RAM is accessed via a pair of address registers +and a data register. To write a palette entry, output the palette entry's +index value to the DAC Address Write Mode Register then +perform 3 writes to the DAC Data Register, loading the +red, green, then blue values into the palette RAM. The internal write address +automatically advances allowing the next value's RGB values to be loaded +without having to reprogram the DAC Address Write Mode Register.  +This allows the entire palette to be loaded in one write operation. +To read a palette entry, output the palette entry's index to the DAC +Address Read Mode Register. Then perform 3 reads from the DAC +Data Register, loading the red, green, then blue values from palette +RAM. The internal read address automatically advances allowing the next +RGB values to be read without having to reprogram the DAC +Address Read Mode Register. + +

    Note: I have noticed considerable variance in the actual behavior of these registers on VGA chipsets. The best way to ensure compatibility with the widest range of cards is to start an operation by writing to the appropriate address register and performing reads and writes in groups of 3 color values. While the automatic increment works fine on all cards tested, reading back the value from the DAC Address Write Mode Register may not always produce the expected result. Also, interleaving reads and writes to the DAC Data Register without first writing to the respective address register may produce unexpected results. In addition, writing values in anything other than groups of 3 to the DAC Data Register and then performing reads may produce unexpected results. I have found that some cards fail to perform the desired update until the third value is written.
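The recommended sequence (write the index once, then send color values in groups of 3) can be sketched as a function that only builds the list of port writes, so it can be inspected without touching hardware. This is a minimal sketch, not the author's code: `struct port_write`, `to_6bit`, and `dac_build_writes` are hypothetical names, and the 8-bit to 6-bit scaling by a right shift is an assumption about how the caller stores colors.

```c
#include <stddef.h>
#include <stdint.h>

#define DAC_WRITE_ADDRESS 0x3C8 /* DAC Address Write Mode Register */
#define DAC_DATA          0x3C9 /* DAC Data Register */

/* One port write: the I/O port and the byte to send to it. */
struct port_write {
    uint16_t port;
    uint8_t value;
};

/* Hypothetical helper: scale an 8-bit intensity to the DAC's 6-bit range. */
static uint8_t to_6bit(uint8_t c8)
{
    return (uint8_t)(c8 >> 2);
}

/*
 * Fill `out` with the writes needed to load `count` palette entries
 * starting at index `first`. `rgb` holds count * 3 bytes (8-bit R, G, B).
 * `out` must have room for 1 + 3 * count elements.
 * Returns the number of writes produced.
 */
static size_t dac_build_writes(uint8_t first, const uint8_t *rgb,
                               size_t count, struct port_write *out)
{
    size_t n = 0;

    /* Start the operation by writing the address register once. */
    out[n++] = (struct port_write){ DAC_WRITE_ADDRESS, first };

    /* Data always goes out in complete red, green, blue triplets. */
    for (size_t i = 0; i < count * 3; i++)
        out[n++] = (struct port_write){ DAC_DATA, to_6bit(rgb[i]) };

    return n;
}
```

A caller would then replay the list with whatever port-output primitive its environment provides.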

    +  + + + + + + + + + + + + + + + + + + + + + + + + +
    DAC Address Write Mode Register +(Read/Write at 3C8h)
    76543210
    DAC Write Address
    +  +
      +
    • +DAC Write Address
    • + +
      Writing to this register prepares the DAC hardware to accept writes +of data to the DAC Data Register. The value written +is the index of the first DAC entry to be written (multiple DAC entries +may be written without having to reset the write address due to the auto-increment.) +Reading this register returns the current index, or at least theoretically +it should. However it is likely the value returned is not the one expected, +and is dependent on the particular DAC implementation. (See note +above)
    +  + + + + + + + + + + + + + + + + + + + + + + + + +
    DAC Address Read Mode Register +(Write at 3C7h)
    76543210
    DAC Read Address
    +  +
      +
    • +DAC Read Address
    • + +
      Writing to this register prepares the DAC hardware to accept reads of data from the DAC Data Register. The value written is the index of the first DAC entry to be read (multiple DAC entries may be read without having to reset the read address due to the auto-increment.)
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    DAC Data Register (Read/Write at +3C9h)
    76543210
    DAC Data
    +  +
      +
    • +DAC Data
    • + +
      Reading this register returns a value from the DAC memory, and writing to it stores one. Three successive I/O operations access three intensity values, first the red, then green, then blue intensity values. The index of the DAC entry accessed is initially specified by the DAC Address Read Mode Register or the DAC Address Write Mode Register, depending on the I/O operation performed. After three I/O operations the index automatically increments to allow the next DAC entry to be accessed without having to reload the index. I/O operations to this port should always be performed in sets of three, otherwise the results are dependent on the DAC implementation. (See note above)
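The auto-increment behavior described above can be illustrated with a small software model of the write path. This is a sketch under stated assumptions rather than a description of any real chip: `struct dac_model` and both functions are hypothetical, each value is stored immediately (some cards reportedly defer the update until the third write), and the index is assumed to wrap from 255 to 0.

```c
#include <stdint.h>

/* Hypothetical software model of the DAC's write path. */
struct dac_model {
    uint8_t palette[256][3]; /* [index][0 = red, 1 = green, 2 = blue] */
    uint8_t index;           /* palette entry currently addressed */
    uint8_t component;       /* 0..2: intensity the next write sets */
};

/* Models a write to the DAC Address Write Mode Register (3C8h). */
static void dac_set_write_index(struct dac_model *d, uint8_t index)
{
    d->index = index;
    d->component = 0;
}

/* Models a write to the DAC Data Register (3C9h). */
static void dac_write_data(struct dac_model *d, uint8_t value)
{
    d->palette[d->index][d->component] = value & 0x3F; /* six bits each */

    /* After the third write the index advances to the next entry. */
    if (++d->component == 3) {
        d->component = 0;
        d->index++; /* assumption: wraps from 255 to 0 */
    }
}
```

Writing six values after setting the index to 5 fills entries 5 and 6 and leaves the model pointing at entry 7.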
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    DAC State Register (Read at 3C7h)
    76543210
    DAC State
    +  +
      +
    • +DAC State
    • + +
      This field returns whether the DAC is prepared to accept reads or writes +to the DAC Data Register. In practice, this field is +seldom used due to the DAC state being known after the index has been written. +This field can have the following values: + +
    + + +

    Notice: All trademarks used or referred to on this page are the property +of their respective owners. +
    All pages are Copyright © 1997, 1998, J. D. Neal, except where +noted. Permission for utilization and distribution is subject to the terms +of the FreeVGA Project Copyright License. + + diff --git a/specs/freevga/vga/crtcreg.htm b/specs/freevga/vga/crtcreg.htm new file mode 100644 index 0000000..db4aa6a --- /dev/null +++ b/specs/freevga/vga/crtcreg.htm @@ -0,0 +1,1355 @@ + + + + + + + VGA/SVGA Video Programming--CRT Controller Registers + + + +

    Hardware Level VGA and SVGA Video Programming Information +Page
    + +
    CRT Controller Registers  +
    + + +

            The CRT Controller (CRTC) +Registers are accessed via a pair of registers, the CRTC Address Register +and the CRTC Data Register. See the Accessing the +VGA Registers section for more details. The Address Register is located +at port 3x4h and the Data Register is located at port 3x5h.  The value +of the x in 3x4h and 3x5h is dependent on the state of the Input/Output +Address Select field, which allows these registers to be mapped at +3B4h-3B5h or 3D4h-3D5h.   Note that when the CRTC +Registers Protect Enable field is set to 1, writing to register indexes +00h-07h is prevented, with the exception of the Line Compare +field of the Overflow Register. +

    +  + + + + + + + + + + + + + + + + + + + + + + + + +
    Horizontal Total Register (Index +00h)
    76543210
    Horizontal Total
    +  +
      +
    • +Horizontal Total
    • + +
      This field is used to specify the number of character clocks per scan +line.  This field, along with the dot rate selected, controls the +horizontal refresh rate of the VGA by specifying the amount of time one +scan line takes.  This field is not programmed with the actual number +of character clocks, however.  Due to timing factors of the VGA hardware +(which, for compatibility purposes has been emulated by VGA compatible  +chipsets), the actual horizontal total is 5 character clocks more than +the value stored in this field, thus one needs to subtract 5 from the actual +horizontal total value desired before programming it into this register.
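As a small illustration of the subtract-5 rule, the following sketch converts a desired horizontal total into the value to program. The function name is hypothetical, and the caller is assumed to pass a total between 5 and 260 character clocks so that the result fits in the 8-bit field.

```c
#include <stdint.h>

/*
 * Value for the Horizontal Total field, given the desired number of
 * character clocks per scan line. Assumes 5 <= total_clocks <= 260.
 */
static uint8_t horizontal_total_field(unsigned total_clocks)
{
    return (uint8_t)(total_clocks - 5);
}
```

For example, a scan line of 100 character clocks is programmed as 95 (5Fh).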
    +  + + + + + + + + + + + + + + + + + + + + + + + + +
    End Horizontal Display Register +(Index 01h)
    76543210
    End Horizontal Display
    +  +
      +
    • +End Horizontal Display
    • + +
      This field is used to control the point at which the sequencer stops outputting pixel values from display memory and instead outputs the pixel value specified by the Overscan Palette Index field for the remainder of the scan line. The overscan begins the character clock after the value programmed into this field. This register should be programmed with the number of character clocks in the active display - 1. Note that the active display may be affected by the Display Enable Skew field.
       
    +  + + + + + + + + + + + + + + + + + + + + + + + + +
    Start Horizontal Blanking Register +(Index 02h)
    76543210
    Start Horizontal Blanking
    +  +
      +
    • +Start Horizontal Blanking
    • + +
      This field is used to specify the character clock at which the horizontal +blanking period begins.  During the horizontal blanking period, the +VGA hardware forces the DAC into a blanking state, where all of the intensities +output are at minimum value, no matter what color information the attribute +controller is sending to the DAC.  This field works in conjunction +with the End Horizontal Blanking field to specify the +horizontal blanking period.  Note that the horizontal blanking can +be programmed to appear anywhere within the scan line, as well as being +programmed to a value greater than the Horizontal +Total field preventing the horizontal blanking from occurring at all.
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    End Horizontal Blanking Register +(Index 03h)
    76543210
    EVRADisplay Enable SkewEnd Horizontal Blanking
    +  +
      +
    • +EVRA -- Enable Vertical Retrace Access
    • + +
      This field was used in the IBM EGA to provide access to the light pen input values, as the light pen registers were mapped over CRTC indexes 10h-11h. The VGA lacks capability for light pen input, thus this field is normally forced to 1 (although always writing it as 1 might be a good idea for compatibility), which in the EGA would enable access to the vertical retrace fields instead of the light pen fields.
    • +Display Enable Skew
    • + +
      This field affects the timings of the display enable circuitry in the VGA. The value of this field is the number of character clocks that the display enable "signal" is delayed. In all the VGA/SVGA chipsets I've tested, including a PS/2 VGA, this field is always programmed to 0. Programming it to non-zero values results in the overscan being displayed over the number of characters programmed into this field at the beginning of the scan line, as well as the end of the active display being shifted by the number of characters programmed into this field. The characters that extend past the normal end of the active display can be garbled in certain circumstances that depend on the particular VGA implementation. According to documentation from IBM, "This skew control is needed to provide sufficient time for the CRT controller to read a character and attribute code from the video buffer, to gain access to the character generator, and go through the Horizontal PEL Panning register in the attribute controller. Each access requires the 'display enable' signal to be skewed one character clock so that the video output is synchronized with the horizontal and vertical retrace signals." as well as "Note: Character skew is not adjustable on the Type 2 video and the bits are ignored; however, programs should set these bits for the appropriate skew to maintain compatibility." This may be required for some early IBM VGA implementations or may be simply an unused "feature" carried over along with its register description from the IBM EGA implementations that require the use of this field.
    • +End Horizontal Blanking
    • + +
      This contains bits 4-0 of the End Horizontal Blanking field, which specifies the end of the horizontal blanking period. Bit 5 is located in the End Horizontal Retrace Register. After the period has begun as specified by the Start Horizontal Blanking field, the 6-bit value of this field is compared against the lower 6 bits of the character clock. When a match occurs, the horizontal blanking signal is disabled. This provides from 1 to 64 character clocks of blanking, although some implementations may match in the character clock specified by the Start Horizontal Blanking field, in which case the range is 0 to 63. Note that if blanking extends past the end of the scan line, it will end on the first match of this field on the next scan line.
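Because the 6-bit End Horizontal Blanking field is split across two registers, updating it means touching both while preserving their other fields. The sketch below shows one way to do that. The names are hypothetical, and the bit positions (field bits 4-0 in the low five bits of index 03h, field bit 5 in bit 7 of index 05h) are read off the register layouts on this page.

```c
#include <stdint.h>

struct ehb_regs {
    uint8_t reg03; /* End Horizontal Blanking Register (Index 03h) */
    uint8_t reg05; /* End Horizontal Retrace Register (Index 05h) */
};

/*
 * Return new values for indexes 03h and 05h so that blanking ends at
 * character clock `end_clock`, keeping the other fields of both
 * registers as they were in `old03` and `old05`.
 */
static struct ehb_regs set_end_horizontal_blanking(uint8_t old03,
                                                   uint8_t old05,
                                                   unsigned end_clock)
{
    /* Only the lower 6 bits of the character clock are compared. */
    unsigned field = end_clock & 0x3F;
    struct ehb_regs r;

    r.reg03 = (uint8_t)((old03 & 0xE0) | (field & 0x1F));        /* bits 4-0 */
    r.reg05 = (uint8_t)((old05 & 0x7F) | ((field & 0x20) << 2)); /* bit 5 -> EHB5 */
    return r;
}
```

The read-modify-write style is a design choice: EVRA, the skew fields, and End Horizontal Retrace share these registers and should not be disturbed.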
    +  + + + + + + + + + + + + + + + + + + + + + + + + +
    Start Horizontal Retrace Register +(Index 04h)
    76543210
    Start Horizontal Retrace
    +  +
      +
    • +Start Horizontal Retrace
    • + +
      This field specifies the character clock at which the VGA begins sending +the horizontal synchronization pulse to the display which signals the monitor +to retrace back to the left side of the screen.  The end of this pulse +is controlled by the End Horizontal Retrace field.  +This pulse may appear anywhere in the scan line, as well as set to a position +beyond the Horizontal Total field which effectively +disables the horizontal synchronization pulse.
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    End Horizontal Retrace Register +(Index 05h)
    76543210
    EHB5Horiz. Retrace SkewEnd Horizontal Retrace
    +  +
      +
    • +EHB5 -- End Horizontal Blanking (bit 5)
    • + +
      This contains bit 5 of the End Horizontal Blanking field.  See +the End Horizontal Blanking Register for details. +
    • +Horiz. Retrace Skew -- Horizontal Retrace Skew
    • + +
      This field delays the start of the horizontal retrace period by the +number of character clocks equal to the value of this field.  From +observation, this field is programmed to 0, with the exception of the 40 +column text modes where this field is set to 1.  The VGA hardware +simply acts as if this value is added to the Start Horizontal +Retrace field.  According to IBM documentation, "For certain +modes, the 'horizontal retrace' signal takes up the entire blanking interval. +Some internal timings are generated by the falling edge of the 'horizontal +retrace' signal. To ensure that the signals are latched properly, the 'retrace' +signal is started before the end of the 'display enable' signal and then +skewed several character clock times to provide the proper screen centering."  +This does not appear to be the case, leading me to believe this is yet +another holdout from the IBM EGA implementations that do require the use +of this field. +
    • +End Horizontal Retrace
    • + +
      This field specifies the end of the horizontal retrace period, which +begins at the character clock specified in the Start Horizontal +Retrace field.  The horizontal retrace signal is enabled until +the lower 5 bits of the character counter match the 5 bits of this field.  +This provides for a horizontal retrace period from 1 to 32 character clocks.  +Note that some implementations may match immediately instead of 32 clocks +away, making the effective range 0 to 31 character clocks.
    +  + + + + + + + + + + + + + + + + + + + + + + + + +
    Vertical Total Register (Index 06h)
    76543210
    Vertical Total
    +  +
      Vertical Total +
      This contains the lower 8 bits of the Vertical Total field. Bits 9-8 of this field are located in the Overflow Register. This field determines the total number of scanlines in each frame, including the blanking and retrace periods, and thus the length of each vertical period. This field contains the value of the scanline counter at the beginning of the last scanline in the vertical period.
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Overflow Register (Index 07h)
    76543210
    VRS9VDE9VT9LC8SVB8VRS8VDE8VT8
    +  +
      +
    • +VRS9 -- Vertical Retrace Start (bit 9)
    • + +
      Specifies bit 9 of the Vertical Retrace Start field.  See the +Vertical Retrace Start Register for details. +
    • +VDE9 -- Vertical Display End (bit 9)
    • + +
      Specifies bit 9 of the Vertical Display End field.  See the Vertical +Display End Register for details. +
    • +VT9 -- Vertical Total (bit 9)
    • + +
      Specifies bit 9 of the Vertical Total field.  See the Vertical +Total Register for details. +
    • +LC8 -- Line Compare (bit 8)
    • + +
      Specifies bit 8 of the Line Compare field. See the Line +Compare Register for details. +
    • +SVB8 -- Start Vertical Blanking (bit 8)
    • + +
      Specifies bit 8 of the Start Vertical Blanking field.  See the +Start Vertical Blanking Register for details. +
    • +VRS8 -- Vertical Retrace Start (bit 8)
    • + +
      Specifies bit 8 of the Vertical Retrace Start field.  See the +Vertical Retrace Start Register for details. +
    • +VDE8 -- Vertical Display End (bit 8)
    • + +
      Specifies bit 8 of the Vertical Display End field.  See the Vertical +Display End Register for details. +
    • +VT8 -- Vertical Total (bit 8)
    • + +
      Specifies bit 8 of the Vertical Total field.  See the Vertical +Total Register for details.
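Since the Overflow Register only collects high-order bits of fields defined elsewhere, it is convenient to compute it from the full 10-bit values. The following is a minimal sketch with a hypothetical function name. The bit order (VRS9, VDE9, VT9, LC8, SVB8, VRS8, VDE8, VT8 from bit 7 down to bit 0) is taken from the register layout above.

```c
#include <stdint.h>

/* Extract bit `bit` of `value` and place it at position `pos`. */
static uint8_t move_bit(unsigned value, unsigned bit, unsigned pos)
{
    return (uint8_t)(((value >> bit) & 1u) << pos);
}

/*
 * Build the Overflow Register (Index 07h) from the full values of the
 * fields whose upper bits it holds.
 */
static uint8_t overflow_register(unsigned vertical_total,
                                 unsigned vertical_display_end,
                                 unsigned vertical_retrace_start,
                                 unsigned start_vertical_blanking,
                                 unsigned line_compare)
{
    return (uint8_t)(move_bit(vertical_retrace_start, 9, 7)   /* VRS9 */
                   | move_bit(vertical_display_end, 9, 6)     /* VDE9 */
                   | move_bit(vertical_total, 9, 5)           /* VT9  */
                   | move_bit(line_compare, 8, 4)             /* LC8  */
                   | move_bit(start_vertical_blanking, 8, 3)  /* SVB8 */
                   | move_bit(vertical_retrace_start, 8, 2)   /* VRS8 */
                   | move_bit(vertical_display_end, 8, 1)     /* VDE8 */
                   | move_bit(vertical_total, 8, 0));         /* VT8  */
}
```

The low 8 bits of each field still have to be written to that field's own register.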
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Preset Row Scan Register (Index +08h)
    76543210
    Byte PanningPreset Row Scan
    +  +
      +
    • +Byte Panning
    • + +
      The value of this field is added to the Start +Address Register when calculating the display memory address for the +upper left hand pixel or character of the screen. This allows for a maximum +shift of 15, 31, or 35 pixels without having to reprogram the Start +Address Register. +
    • +Preset Row Scan
    • + +
      This field is used when using text mode or any mode with a non-zero +Maximum Scan Line field to provide for more +precise vertical scrolling than the Start Address +Register provides. The value of this field specifies how many scan +lines to scroll the display upwards. Valid values range from 0 to the value +of the Maximum Scan Line field. Invalid values +may cause undesired effects and seem to be dependent upon the particular +VGA implementation.
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Maximum Scan Line Register (Index +09h)
    76543210
    SDLC9SVB9Maximum Scan Line
    +  +
      SD -- Scan Doubling
      +
      "When this bit is set to 1, 200-scan-line video data is converted +to 400-scan-line output. To do this, the clock in the row scan counter +is divided by 2, which allows the 200-line modes to be displayed as 400 +lines on the display (this is called double scanning; each line is displayed +twice). When this bit is set to 0, the clock to the row scan counter is +equal to the horizontal scan rate." +
    • +LC9 -- Line Compare (bit 9)
    • + +
      Specifies bit 9 of the Line Compare field. See the Line +Compare Register for details. +
    • +SVB9 -- Start Vertical Blanking (bit 9)
    • + +
      Specifies bit 9 of the Start Vertical Blanking field.  See the +Start Vertical Blanking Register for details. +
    • +Maximum Scan Line
    • + +
      In text modes, this field is programmed with the character height - +1 (scan line numbers are zero based.) In graphics modes, a non-zero value +in this field will cause each scan line to be repeated by the value of +this field + 1.
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Cursor Start Register (Index 0Ah)
    76543210
    CDCursor Scan Line Start
    +  +
      +
    • +CD -- Cursor Disable
    • + +
      This field controls whether or not the text-mode cursor is displayed. +Values are: +
        +
      • +0 -- Cursor Enabled
      • + +
      • +1 -- Cursor Disabled
      • +
      + +
    • +Cursor Scan Line Start
    • + +
      This field controls the appearance of the text-mode cursor by specifying the scan line location within a character cell at which the cursor should begin, with the top-most scan line in a character cell being 0 and the bottom-most being the value of the Maximum Scan Line field.
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
     Cursor End Register (Index +0Bh)
    76543210
    Cursor SkewCursor Scan Line End
    +  +
      +
    • +CSK -- Cursor Skew
    • + +
      This field was necessary in the EGA to synchronize the cursor with +internal timing. In the VGA it basically is added to the cursor location. +In some cases when this value is non-zero and the cursor is near the left +or right edge of the screen, the cursor will not appear at all, or a second +cursor above and to the left of the actual one may appear. This behavior +may not be the same on all VGA compatible adapter cards. +
    • +Cursor Scan Line End
    • + +
      This field controls the appearance of the text-mode cursor by specifying the scan line location within a character cell at which the cursor should end, with the top-most scan line in a character cell being 0 and the bottom-most being the value of the Maximum Scan Line field. If this field is less than the Cursor Scan Line Start field, the cursor is not drawn. Some graphics adapters, such as the IBM EGA, display a split-block cursor instead.
    +  + + + + + + + + + + + + + + + + + + + + + + + + +
    Start Address High Register (Index +0Ch)
    76543210
    Start Address High
    +  +
      +
    • +Start Address High
    • + +
      This contains bits 15-8 of the Start Address field. See the Start Address Low Register for details.
    +  + + + + + + + + + + + + + + + + + + + + + + + + +
    Start Address Low Register (Index +0Dh)
    76543210
    Start Address Low
    +  +
      +
    • +Start Address Low
    • + +
      This contains the bits 7-0 of the Start Address field. The upper 8 +bits are specified by the Start Address High Register. +The Start Address field specifies the display memory address of the upper +left pixel or character of the screen. Because the standard VGA has a maximum +of 256K of memory, and memory is accessed 32 bits at a time, this 16-bit +field is sufficient to allow the screen to start at any memory address. +Normally this field is programmed to 0h, except when using virtual resolutions, +paging, and/or split-screen operation. Note that the VGA display will wrap +around in display memory if the starting address is too high. (This may +or may not be desirable, depending on your intentions.)
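Splitting the 16-bit Start Address across the two registers is simple byte extraction. The sketch below uses hypothetical names and only computes the two register values; writing them to indexes 0Ch and 0Dh is left to the caller.

```c
#include <stdint.h>

struct start_address_regs {
    uint8_t high; /* Start Address High Register (Index 0Ch), bits 15-8 */
    uint8_t low;  /* Start Address Low Register (Index 0Dh), bits 7-0 */
};

/* Split a 16-bit display start address into its two register bytes. */
static struct start_address_regs split_start_address(uint16_t address)
{
    struct start_address_regs r;

    r.high = (uint8_t)(address >> 8);
    r.low = (uint8_t)(address & 0xFF);
    return r;
}
```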
    +  + + + + + + + + + + + + + + + + + + + + + + + + +
    Cursor Location High Register (Index 0Eh)
    76543210
    Cursor Location High
    +  + +  + + + + + + + + + + + + + + + + + + + + + + + + +
    Cursor Location Low Register (Index +0Fh)
    76543210
    Cursor Location Low
    +  +
      +
    • +Cursor Location Low
    • + +
      This field specifies bits 7-0 of the Cursor Location field. When the VGA hardware is displaying text mode and the text-mode cursor is enabled, the hardware compares the address of the character currently being displayed with the sum of the Cursor Location field and the Cursor Skew field. If the values are equal, then the scan lines in that character specified by the Cursor Scan Line Start field and the Cursor Scan Line End field are replaced with the foreground color.
    +  + + + + + + + + + + + + + + + + + + + + + + + + +
    Vertical Retrace Start Register +(Index 10h)
    76543210
    Vertical Retrace Start
    +  +
      +
    • +Vertical Retrace Start
    • + +
      This field specifies bits 7-0 of the Vertical Retrace Start field.  +Bits 9-8 are located in the Overflow Register.  +This field controls the start of the vertical retrace pulse which signals +the display to move up to the beginning of the active display.  This +field contains the value of the vertical scanline counter at the beginning +of the first scanline where the vertical retrace signal is asserted.
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Vertical Retrace End Register (Index +11h)
    76543210
    ProtectBandwidthVertical Retrace End
    +  +
      +
    • +Protect -- CRTC Registers Protect Enable
    • + +
      This field is used to protect the video timing registers from being +changed by programs written for earlier graphics chipsets that attempt +to program these registers with values unsuitable for VGA timings.  +When this field is set to 1, the CRTC register indexes 00h-07h ignore write +access, with the exception of bit 4 of the Overflow Register, +which holds bit 8 of the Line Compare field. +
    • +Bandwidth -- Memory Refresh Bandwidth
    • + +
      Nearly all video chipsets include a few registers that control memory, bus, or other timings not directly related to the output of the video card. Most VGA/SVGA implementations ignore the value of this field; however, at the least, IBM VGA adapters do utilize it, and thus for compatibility with these chipsets this field should be programmed. This register is used in the IBM VGA hardware to control the number of DRAM refresh cycles per scan line. The three refresh cycles per scanline is appropriate for the IBM VGA horizontal frequency of approximately 31.5 kHz. For horizontal frequencies greater than this, this setting will work as the DRAM will be refreshed more often. However, not refreshing the DRAM often enough can cause memory loss. Thus at some point slower than 31.5 kHz the five refresh cycle setting should be used. Determining the particular point at which this should occur would require better knowledge of the IBM VGA's schematics than I have available. According to IBM documentation, "Selecting five refresh cycles allows use of the VGA chip with 15.75 kHz displays." which isn't really enough to go by unless the mode you are defining has a 15.75 kHz horizontal frequency.
    • +Vertical Retrace End
    • + +
      This field determines the end of the vertical retrace pulse, and thus +its length.  This field contains the lower four bits of the vertical +scanline counter at the beginning of the scanline immediately after the +last scanline where the vertical retrace signal is asserted.
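Because this register mixes the protect bit with timing bits, code that reprograms CRTC indexes 00h-07h typically clears Protect first and takes care not to disturb it when changing Vertical Retrace End. The following sketch assumes the layout shown above places Protect in bit 7 and Vertical Retrace End in bits 3-0. The function names are hypothetical.

```c
#include <stdint.h>

#define CRTC_PROTECT 0x80 /* Protect bit of the Vertical Retrace End Register */

/* Return the register value with write protection of 00h-07h removed. */
static uint8_t crtc_unprotect(uint8_t reg11)
{
    return (uint8_t)(reg11 & 0x7F);
}

/*
 * Return the register value with Vertical Retrace End set from the
 * scanline counter value `end_scanline`; only its lower four bits are
 * stored. The upper four bits of `old11` are preserved.
 */
static uint8_t set_vertical_retrace_end(uint8_t old11, unsigned end_scanline)
{
    return (uint8_t)((old11 & 0xF0) | (end_scanline & 0x0F));
}
```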
    +  + + + + + + + + + + + + + + + + + + + + + + + + +
    Vertical Display End Register (Index +12h)
    76543210
    Vertical Display End
    +  +
      +
    • +Vertical Display End
    • + +
      This contains bits 7-0 of the Vertical Display End field. Bits 9-8 are located in the Overflow Register. The field contains the value of the vertical scanline counter at the beginning of the scanline immediately after the last scanline of active display.
    +  + + + + + + + + + + + + + + + + + + + + + + + + +
    Offset Register (Index 13h)
    76543210
    Offset
    +  +
      +
    • +Offset
    • + +
      This field specifies the address difference between consecutive scan lines or two lines of characters. Beginning with the second scan line, the starting address of each line is increased by twice the value of this register multiplied by the current memory address size (byte = 1, word = 2, double-word = 4). For text modes the following equation is used:
              Offset = Width / ( MemoryAddressSize * 2 )
      and in graphics mode, the following equation is used:
              Offset = Width / ( PixelsPerAddress * MemoryAddressSize * 2 )
      where Width is the width in pixels of the screen. This register can be modified to provide for a virtual resolution, in which case Width is the width in pixels of the virtual screen. PixelsPerAddress is the number of pixels stored in one display memory address, and MemoryAddressSize is the current memory addressing size.
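The graphics-mode equation translates directly into integer arithmetic. The sketch below uses a hypothetical function name, and the sample inputs in the note that follows are illustrative values rather than settings taken from this page.

```c
/*
 * Offset = Width / (PixelsPerAddress * MemoryAddressSize * 2)
 *
 * width               width of the (possibly virtual) screen in pixels
 * pixels_per_address  pixels stored in one display memory address
 * memory_address_size 1 = byte, 2 = word, 4 = double-word
 */
static unsigned offset_field(unsigned width,
                             unsigned pixels_per_address,
                             unsigned memory_address_size)
{
    return width / (pixels_per_address * memory_address_size * 2);
}
```

With a 640-pixel-wide screen, 8 pixels per address, and byte addressing, the result is 40. Widening the virtual screen to 1024 pixels under the same assumptions gives 64.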
    +  +
      + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Underline Location Register (Index +14h)
    76543210
    DWDIV4Underline Location
    +  +
      DW - Double-Word Addressing
      +
      "When this bit is set to 1, memory addresses are doubleword addresses. +See the description of the word/byte mode bit (bit 6) in the CRT Mode Control +Register" +
      DIV4 - Divide Memory Address Clock by 4
      +
      "When this bit is set to 1, the memory-address counter is clocked +with the character clock divided by 4, which is used when doubleword addresses +are used." +
      Underline Location
      +
      "These bits specify the horizontal scan line of a character row +on which an underline occurs. The value programmed is the scan line desired +minus 1."
    +  + + + + + + + + + + + + + + + + + + + + + + + + +
    Start Vertical Blanking Register +(Index 15h)
    76543210
    Start Vertical Blanking
    +  +
      +
    • +Start Vertical Blanking
    • + +
      This contains bits 7-0 of the Start Vertical Blanking field.  +Bit 8 of this field is located in the Overflow Register, +and bit 9 is located in the Maximum Scan Line Register.  +This field determines when the vertical blanking period begins, and contains +the value of the vertical scanline counter at the beginning of the first +vertical scanline of blanking.
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + +
    End Vertical Blanking Register (Index +16h)
    76543210
    End Vertical Blanking
    +  +
      +
    • +End Vertical Blanking
    • + +
      This field determines when the vertical blanking period ends, and contains +the value of the vertical scanline counter at the beginning of the vertical +scanline immediately after the last scanline of blanking.
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    CRTC Mode Control Register (Index +17h)
    76543210
    SEWord/ByteAWDIV2SLDIVMAP14MAP13
    +  +
      SE -- Sync Enable
      +
      "When set to 0, this bit disables the horizontal and vertical retrace +signals and forces them to an inactive level. When set to 1, this bit enables +the horizontal and vertical retrace signals. This bit does not reset any +other registers or signal outputs." +
      Word/Byte -- Word/Byte Mode Select
      +
      "When this bit is set to 0, the word mode is selected. The word +mode shifts the memory-address counter bits to the left by one bit; the +most-significant bit of the counter appears on the least-significant bit +of the memory address outputs.  The doubleword bit in the Underline +Location register (0x14) also controls the addressing. When the doubleword +bit is 0, the word/byte bit selects the mode. When the doubleword bit is +set to 1, the addressing is shifted by two bits. When set to 1, bit 6 selects +the byte address mode." +
      AW -- Address Wrap Select
      +
      "This bit selects the memory-address bit, bit MA 13 or MA 15, that +appears on the output pin MA 0, in the word address mode. If the VGA is +not in the word address mode, bit 0 from the address counter appears on +the output pin, MA 0. When set to 1, this bit selects MA 15. In odd/even +mode, this bit should be set to 1 because 256KB of video memory is installed +on the system board. (Bit MA 13 is selected in applications where only +64KB is present. This function maintains compatibility with the IBM Color/Graphics +Monitor Adapter.)" +
      DIV2 -- Divide Memory Address clock by 2
      +
      "When this bit is set to 0, the address counter uses the character +clock. When this bit is set to 1, the address counter uses the character +clock input divided by 2. This bit is used to create either a byte or word +refresh address for the display buffer." +
      SLDIV -- Divide Scan Line clock by 2
      +
      "This bit selects the clock that controls the vertical timing counter. The clocking is either the horizontal retrace clock or horizontal retrace clock divided by 2. When this bit is set to 1, the horizontal retrace clock is divided by 2. Dividing the clock effectively doubles the vertical resolution of the CRT controller. The vertical counter has a maximum resolution of 1024 scan lines because the vertical total value is 10-bits wide. If the vertical counter is clocked with the horizontal retrace divided by 2, the vertical resolution is doubled to 2048 scan lines."
      MAP14 -- Map Display Address 14
      +
      "This bit selects the source of bit 14 of the output multiplexer. +When this bit is set to 0, bit 1 of the row scan counter is the source. +When this bit is set to 1, the bit 14 of the address counter is the source." +
      MAP13 -- Map Display Address 13
      +
      "This bit selects the source of bit 13 of the output multiplexer. +When this bit is set to 0, bit 0 of the row scan counter is the source, +and when this bit is set to 1, bit 13 of the address counter is the source. +The CRT controller used on the IBM Color/Graphics Adapter was capable of +using 128 horizontal scan-line addresses. For the VGA to obtain 640-by-200 +graphics resolution, the CRT controller is  programmed for 100 horizontal +scan lines with two scan-line addresses per character row. Row scan  +address bit 0 becomes the most-significant address bit to the display buffer. +Successive scan lines of  the display image are displaced in 8KB of +memory. This bit allows compatibility with the graphics modes of earlier +adapters."
    +  + + + + + + + + + + + + + + + + + + + + + + + + +
    Line Compare Register (Index 18h)
    76543210
    Line Compare Register
    +  +
      +
    • +Line Compare Register
    • + +
      This field specifies bits 7-0 of the Line Compare field. Bit 9 of this +field is located in the Maximum Scan Line Register, and +bit 8 of this field is located in the Overflow Register. +The Line Compare field specifies the scan line at which a horizontal division +can occur, providing for split-screen operation. If no horizontal division +is required, this field should be set to 3FFh. When the scan line counter +reaches the value in the Line Compare field, the current scan line address +is reset to 0 and the Preset Row Scan is presumed to be 0. If the Pixel +Panning Mode field is set to 1 then the Pixel Shift Count and Byte +Panning fields are reset to 0 for the remainder of the display cycle.
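Setting the 10-bit Line Compare field means updating three registers. The sketch below computes all three values at once, using hypothetical names. It assumes bit 8 lives in bit 4 of the Overflow Register and bit 9 in bit 6 of the Maximum Scan Line Register, as read from the register layouts on this page.

```c
#include <stdint.h>

struct line_compare_regs {
    uint8_t reg18; /* Line Compare Register (Index 18h), bits 7-0 */
    uint8_t reg07; /* Overflow Register (Index 07h), LC8 in bit 4 */
    uint8_t reg09; /* Maximum Scan Line Register (Index 09h), LC9 in bit 6 */
};

/*
 * Return new values for the three registers holding Line Compare,
 * preserving the unrelated bits of `old07` and `old09`.
 */
static struct line_compare_regs set_line_compare(uint8_t old07,
                                                 uint8_t old09,
                                                 unsigned line)
{
    struct line_compare_regs r;

    r.reg18 = (uint8_t)(line & 0xFF);
    r.reg07 = (uint8_t)((old07 & 0xEF) | (((line >> 8) & 1u) << 4));
    r.reg09 = (uint8_t)((old09 & 0xBF) | (((line >> 9) & 1u) << 6));
    return r;
}
```

Passing 3FFh as `line` sets every Line Compare bit, which is the value the text recommends when no split screen is wanted.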
    +Notice: All trademarks used or referred to on this page are the property +of their respective owners. +
    All pages are Copyright © 1997, 1998, J. D. Neal, except where +noted. Permission for utilization and distribution +
    is subject to the terms of the FreeVGA Project +Copyright License. + + diff --git a/specs/freevga/vga/extreg.htm b/specs/freevga/vga/extreg.htm new file mode 100644 index 0000000..d8ceb76 --- /dev/null +++ b/specs/freevga/vga/extreg.htm @@ -0,0 +1,282 @@ + + + + + + + VGA/SVGA Video Programming--External Regsters + + + +
      +
      Hardware Level VGA and SVGA Video Programming Information +Page
      + +
      External Registers
      + +
      +
      +
    +        The External Registers (sometimes +called the General Registers) each have their own unique I/O location in +the VGA, although sometimes the Read Port differs from the Write port, +and some are Read-only. See the Accessing the VGA +Registers section for more details. +
      +
    • +Port 3CCh/3C2h -- Miscellaneous Output Register
    • + +
    • +Port 3CAh/3xAh -- Feature Control Register
    • + +
    • +Port 3C2h -- Input Status #0 Register
    • + +
    • +Port 3xAh -- Input Status #1 Register
    • +
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Miscellaneous Output Register +(Read at 3CCh, Write at 3C2h)
    Bit   | 7      | 6      | 5        | 4 | 3-2          | 1       | 0
    Field | VSYNCP | HSYNCP | O/E Page |   | Clock Select | RAM En. | I/OAS
    +  +
      VSYNCP -- Vertical Sync Polarity
      +
      "Determines the polarity of the vertical sync pulse and can be used +(with HSP) to control the vertical size of the display by utilizing the +autosynchronization feature of VGA displays. +
        = 0 selects a positive vertical retrace sync pulse." +
      HSYNCP -- Horizontal Sync Polarity
      +
      "Determines the polarity of the horizontal sync pulse. +
        = 0 selects a positive horizontal retrace sync pulse." +
      O/E Page -- Odd/Even Page Select
      +
      "Selects the upper/lower 64K page of memory when the system is in +an even/odd mode (modes 0,1,2,3,7). +
        = 0 selects the low page +
        = 1 selects the high page" +
    • +Clock Select
    • + +
      This field controls the selection of the dot clocks used in driving +the display timing.  The standard hardware has 2 clocks available +to it, nominally 25 MHz and 28 MHz.  It is possible that there may +be other "external" clocks that can be selected by programming this register +with the undefined values.  The possible values of this register are: +
        +
      • +00 -- select 25 MHz clock (used for 320/640 pixel wide modes)
      • + +
      • +01 -- select 28 MHz clock (used for 360/720 pixel wide modes)
      • + +
      • +10 -- undefined (possible external clock)
      • + +
      • +11 -- undefined (possible external clock)
      • +
      +RAM En. -- RAM Enable
      +
      "Controls system access to the display buffer. +
        = 0 disables address decode for the display buffer from the +system +
        = 1 enables address decode for the display buffer from the +system" +
      I/OAS -- Input/Output Address Select
      +
      "This bit selects the CRT controller addresses. When set to 0, this +bit sets the CRT controller addresses to 0x03Bx and the address for the +Input Status Register 1 to 0x03BA for compatibility with the monochrome +adapter.  When set to 1, this bit sets CRT controller addresses to +0x03Dx and the Input Status Register 1 address to 0x03DA for compatibility +with the color/graphics adapter. The Write addresses to the Feature Control +register are affected in the same manner."
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
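As a rough illustration of the Miscellaneous Output Register fields described above, the following sketch decodes a value read from port 3CCh. It assumes the conventional bit layout (bit 7 VSYNCP, bit 6 HSYNCP, bit 5 O/E Page, bits 3-2 Clock Select, bit 1 RAM En., bit 0 I/OAS). The struct and function names are hypothetical.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical decoded view of the Miscellaneous Output Register. */
struct misc_output {
    bool vsync_negative;         /* VSYNCP: 0 selects a positive pulse */
    bool hsync_negative;         /* HSYNCP: 0 selects a positive pulse */
    bool high_page;              /* O/E Page: 1 selects the high 64K page */
    unsigned clock_select;       /* 2-bit Clock Select field */
    bool ram_enabled;            /* RAM En.: 1 enables display buffer decode */
    uint16_t input_status1_port; /* 0x3BA (mono) or 0x3DA (color) */
};

/* Decode a register value, assuming the bit layout named in the lead-in. */
static struct misc_output
decode_misc_output(uint8_t value)
{
    struct misc_output m;
    m.vsync_negative = (value & 0x80) != 0;
    m.hsync_negative = (value & 0x40) != 0;
    m.high_page = (value & 0x20) != 0;
    m.clock_select = (value >> 2) & 0x3;
    m.ram_enabled = (value & 0x02) != 0;
    m.input_status1_port = (value & 0x01) ? 0x3DA : 0x3BA;
    return m;
}

/* Nominal dot clock in kHz for a Clock Select value.
   Returns 0 for the two undefined (possibly external) selections. */
static unsigned
nominal_dot_clock_khz(unsigned clock_select)
{
    assert(clock_select < 4);   /* the field is two bits wide */
    switch (clock_select) {
    case 0: return 25000;
    case 1: return 28000;
    default: return 0;
    }
}
```

Because the I/OAS bit also moves the Feature Control write port, code that needs that port can reuse `input_status1_port`.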
    Feature Control Register (Read +at 3CAh, Write at 3BAh (mono) or 3DAh (color))
    Bit   | 7-2 | 1   | 0
    Field |     | FC1 | FC0
    +  +
      +
    • +FC1 -- Feature Control bit 1
      +
      "All bits are reserved."
    • + +
    • +FC0 -- Feature Control bit 0
      +
      "All bits are reserved."
    • +
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Input Status #0 Register (Read-only +at 3C2h)
    Bit   | 7-5 | 4  | 3-0
    Field |     | SS |
    +  +
      SS - Switch Sense
      +
      "Returns the status of the four sense switches as selected by the +CS field of the Miscellaneous Output Register."
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Input Status #1 Register (Read +at 3BAh (mono) or 3DAh (color))
    Bit   | 7-4 | 3        | 2-1 | 0
    Field |     | VRetrace |     | DD
    +  +
      VRetrace -- Vertical Retrace
      +
      "When set to 1, this bit indicates that the display is in a vertical +retrace interval. This bit can be programmed, through the Vertical Retrace +End register, to generate an interrupt at the start of the vertical retrace." +
      DD -- Display Disabled
      +
      "When set to 1, this bit indicates a horizontal or vertical retrace +interval. This bit is the real-time status of the inverted 'display enable' +signal. Programs have used this status bit to restrict screen updates to +the inactive display intervals in order to reduce screen flicker. The video +subsystem is designed to eliminate this software requirement; screen updates +may be made at any time without screen degradation."
    +Notice: All trademarks used or referred to on this page are the property +of their respective owners. +
    All pages are Copyright © 1997, 1998, J. D. Neal, except where +noted. Permission for utilization and distribution is subject to the terms +of the FreeVGA Project Copyright License. + + diff --git a/specs/freevga/vga/graphreg.htm b/specs/freevga/vga/graphreg.htm new file mode 100644 index 0000000..e557863 --- /dev/null +++ b/specs/freevga/vga/graphreg.htm @@ -0,0 +1,585 @@ + + + + + + + VGA/SVGA Video Programming--Graphics Registers + + + +
    Hardware Level VGA and SVGA Video Programming Information +Page
    + +
    Graphics Registers  +
    + + +

            The Graphics Registers are +accessed via a pair of registers, the Graphics Address Register and the +Graphics Data Register. See the Accessing the VGA +Registers section for more details. The Address Register is located +at port 3CEh and the Data Register is located at port 3CFh. +
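The address/data pairing can be sketched as follows. To keep the example runnable away from real hardware, it drives a small in-memory stand-in instead of issuing port I/O. The `fake_` names and the `gc_write`/`gc_read` helpers are hypothetical, and on real hardware the two `fake_outb` calls would be replaced by the platform's port-output primitive.

```c
#include <assert.h>
#include <stdint.h>

#define GC_ADDRESS_PORT   0x3CE
#define GC_DATA_PORT      0x3CF
#define GC_REGISTER_COUNT 9      /* indexes 00h-08h */

/* In-memory stand-in for the Graphics Controller (illustration only). */
struct fake_graphics_controller {
    uint8_t address;                  /* last value written to 3CEh */
    uint8_t regs[GC_REGISTER_COUNT];  /* registers selected by the index */
};

/* Simulated port write: 3CEh latches the index, 3CFh stores the data. */
static void
fake_outb(struct fake_graphics_controller *gc, uint16_t port, uint8_t value)
{
    if (port == GC_ADDRESS_PORT)
        gc->address = value;
    else if (port == GC_DATA_PORT && gc->address < GC_REGISTER_COUNT)
        gc->regs[gc->address] = value;
}

/* Simulated port read of either the index or the selected register. */
static uint8_t
fake_inb(const struct fake_graphics_controller *gc, uint16_t port)
{
    if (port == GC_ADDRESS_PORT)
        return gc->address;
    if (port == GC_DATA_PORT && gc->address < GC_REGISTER_COUNT)
        return gc->regs[gc->address];
    return 0xFF;
}

/* Write one indexed register: select the index, then write the data. */
static void
gc_write(struct fake_graphics_controller *gc, uint8_t index, uint8_t value)
{
    assert(index < GC_REGISTER_COUNT);
    fake_outb(gc, GC_ADDRESS_PORT, index);
    fake_outb(gc, GC_DATA_PORT, value);
}

/* Read one indexed register: select the index, then read the data. */
static uint8_t
gc_read(struct fake_graphics_controller *gc, uint8_t index)
{
    assert(index < GC_REGISTER_COUNT);
    fake_outb(gc, GC_ADDRESS_PORT, index);
    return fake_inb(gc, GC_DATA_PORT);
}
```

Note that every access changes the Address Register, so code that shares the controller with an interrupt handler would likely want to save and restore the index.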

      +
    • +Index 00h -- Set/Reset Register
    • + +
    • +Index 01h -- Enable Set/Reset Register
    • + +
    • +Index 02h -- Color Compare Register
    • + +
    • +Index 03h -- Data Rotate Register
    • + +
    • +Index 04h -- Read Map Select Register
    • + +
    • +Index 05h -- Graphics Mode Register
    • + +
    • +Index 06h -- Miscellaneous Graphics Register
    • + +
    • +Index 07h -- Color Don't Care Register
    • + +
    • +Index 08h -- Bit Mask Register
    • +
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Set/Reset Register (Index 00h)
    Bit   | 7-4 | 3-0
    Field |     | Set/Reset
    +  +
      +
    • +Set/Reset
    • + +
      Bits 3-0 of this field represent planes 3-0 of the VGA display memory. +This field is used by Write Mode 0 and Write Mode 3 (See the Write +Mode field.) In Write Mode 0, if the corresponding bit in the Enable +Set/Reset field is set, and in Write Mode 3 regardless of the Enable +Set/Reset field, the value of the bit in this field is expanded to +8 bits and substituted for the data of the respective plane and passed +to the next stage in the graphics pipeline, which for Write Mode 0 is the +Logical Operation unit and for Write Mode 3 is the Bit +Mask unit.
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Enable Set/Reset Register (Index +01h)
    Bit   | 7-4 | 3-0
    Field |     | Enable Set/Reset
    +  +
      +
    • +Enable Set/Reset
    • + +
      Bits 3-0 of this field represent planes 3-0 of the VGA display memory. +This field is used in Write Mode 0 (See the Write Mode +field) to select whether data for each plane is derived from host data +or from expansion of the respective bit in the Set/Reset +field.
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Color Compare Register (Index 02h)
    Bit   | 7-4 | 3-0
    Field |     | Color Compare
    +  +
      +
    • +Color Compare
    • + +
      Bits 3-0 of this field represent planes 3-0 of the VGA display memory. +This field holds a reference color that is used by Read Mode 1 (See the +Read Mode field.) Read Mode 1 returns the result of the +comparison between this value and a location of display memory, modified +by the Color Don't Care field.
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Data Rotate Register (Index 03h)
    Bit   | 7-5 | 4-3               | 2-0
    Field |     | Logical Operation | Rotate Count
    +  +
      +
    • +Logical Operation
    • + +
      This field is used in Write Mode 0 and Write Mode 2 (See the Write +Mode field.) The logical operation stage of the graphics pipeline is +32 bits wide (1 byte * 4 planes) and performs the operations on its inputs +from the previous stage in the graphics pipeline and the latch register. +The latch register remains unchanged and the result is passed on to the +next stage in the pipeline. The results based on the value of this field +are: +
        +
      • +00b - Result is input from previous stage unmodified.
      • + +
      • +01b - Result is input from previous stage logical ANDed with latch register.
      • + +
      • +10b - Result is input from previous stage logical ORed with latch register.
      • + +
      • +11b - Result is input from previous stage logical XORed with latch register.
      • +
      + +
    • +Rotate Count
    • + +
      This field is used in Write Mode 0 and Write Mode 3 (See the Write +Mode field.) In these modes, the host data is rotated to the right +by the number of bit positions given in this field. A single rotation +moves bits 7-1 right one position to bits 6-0 while +wrapping bit 0 around to bit 7, and is repeated the number of times specified +by this field.
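The rotation just described can be written out directly. The function name is hypothetical, and the upper bound of 7 assumes that Rotate Count is a 3-bit field.

```c
#include <assert.h>
#include <stdint.h>

/* Rotate an 8-bit host data byte right, one position per iteration. */
static uint8_t
rotate_right8(uint8_t value, unsigned count)
{
    assert(count <= 7);   /* assumed: Rotate Count is 3 bits wide */
    for (unsigned i = 0; i < count; i++) {
        /* Bits 7-1 move to 6-0; bit 0 wraps around to bit 7. */
        value = (uint8_t)((value >> 1) | ((value & 1) << 7));
    }
    return value;
}
```

For example, rotating 01h by one position yields 80h, and rotating 12h by four positions swaps its nibbles to give 21h.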
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Read Map Select Register (Index +04h)
    Bit   | 7-2 | 1-0
    Field |     | Read Map Select
    +  +
      +
    • +Read Map Select
    • + +
      The value of this field is used in Read Mode 0 (see the Read +Mode field) to specify the display memory plane to transfer data from. +Due to the arrangement of video memory, this field must be modified four +times to read one or more pixel values in the planar video modes.
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Graphics Mode Register (Index 05h)
    Bit   | 7 | 6        | 5          | 4        | 3         | 2 | 1-0
    Field |   | Shift256 | Shift Reg. | Host O/E | Read Mode |   | Write Mode
    +  +
      +
    • +Shift256 -- 256-Color Shift Mode
      +
      "When set to 0, this bit allows bit 5 to control the loading of +the shift registers. When set to 1, this bit causes the shift registers +to be loaded in a manner that supports the 256-color mode."
    • + +
      Shift Reg. -- Shift Register Interleave Mode
      +
      "When set to 1, this bit directs the shift registers in the graphics +controller to format the serial data stream with even-numbered bits from +both maps on even-numbered maps, and odd-numbered bits from both maps on +the odd-numbered maps. This bit is used for modes 4 and 5." +
      Host O/E -- Host Odd/Even Memory Read Addressing Enable
      +
      "When set to 1, this bit selects the odd/even addressing mode used +by the IBM Color/Graphics Monitor Adapter. Normally, the value here follows +the value of Memory Mode register bit 2 in the sequencer." +
    • +Read Mode
    • + +
      This field selects between two read modes, simply known as Read Mode +0, and Read Mode 1, based upon the value of this field: +
        +
      • +0b -- Read Mode 0: In this mode, a byte from one of the four planes is +returned on read operations. The plane from which the data is returned +is determined by the value of the Read Map Select field.
      • +
      + +
    • +1b -- Read Mode 1: In this mode, a comparison is made between display memory +and a reference color defined by the Color Compare field. +If a bit plane is not set in the Color Don't Care field, then +the corresponding color plane is not considered in the comparison. Each +bit in the returned result represents one comparison between a pixel and the reference +color, with the bit being set if the comparison is true.
    • + +
    • +Write Mode
    • + +
      This field selects between four write modes, simply known as Write +Modes 0-3, based upon the value of this field: +
        +
      • +00b -- Write Mode 0: In this mode, the host data is first rotated as per +the Rotate Count field, then the Enable +Set/Reset mechanism selects data from this or the Set/Reset +field. Then the selected Logical Operation is performed +on the resulting data and the data in the latch register. Then the Bit +Mask field is used to select which bits come from the resulting data +and which come from the latch register. Finally, only the bit planes enabled +by the Memory Plane Write Enable field are +written to memory.
      • + +
      • +01b -- Write Mode 1: In this mode, data is transferred directly from the +32 bit latch register to display memory, affected only by the Memory +Plane Write Enable field. The host data is not used in this mode.
      • + +
      • +10b -- Write Mode 2: In this mode, the bits 3-0 of the host data are replicated +across all 8 bits of their respective planes. Then the selected Logical +Operation is performed on the resulting data and the data in the latch +register. Then the Bit Mask field is used to select which +bits come from the resulting data and which come from the latch register. +Finally, only the bit planes enabled by the Memory +Plane Write Enable field are written to memory.
      • + +
      • +11b -- Write Mode 3: In this mode, the data in the Set/Reset +field is used as if the Enable Set/Reset field were set +to 1111b. The host data is first rotated as per the Rotate +Count field, then logical ANDed with the value of the Bit +Mask field. The resulting value is used on the data obtained from the +Set/Reset field in the same way that the Bit Mask field +would ordinarily be used, to select which bits come from the expansion +of the Set/Reset field and which come from the latch +register. Finally, only the bit planes enabled by the Memory +Plane Write Enable field are written to memory.
      • +
      +
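The four write modes listed above can be modeled in software, which may help when reasoning about what a given register setup will store. The following is a sketch of the pipeline exactly as this page describes it, not a verified model of the hardware. The struct, field and function names are hypothetical, and the latch contents are supplied by the caller rather than loaded by a preceding read.

```c
#include <assert.h>
#include <stdint.h>

#define PLANES 4

/* Hypothetical snapshot of the registers that take part in a write. */
struct write_state {
    uint8_t set_reset;         /* Set/Reset, bits 3-0 */
    uint8_t enable_set_reset;  /* Enable Set/Reset, bits 3-0 */
    uint8_t logical_op;        /* 0 = pass, 1 = AND, 2 = OR, 3 = XOR */
    uint8_t rotate_count;      /* Rotate Count */
    uint8_t write_mode;        /* Write Mode 0-3 */
    uint8_t bit_mask;          /* Bit Mask */
    uint8_t map_mask;          /* Memory Plane Write Enable, bits 3-0 */
    uint8_t latch[PLANES];     /* one latched byte per plane */
};

/* Expand one plane's bit of a 4-bit field to a full byte. */
static uint8_t
expand_bit(uint8_t field, int plane)
{
    return ((field >> plane) & 1) ? 0xFF : 0x00;
}

static uint8_t
rotate_right(uint8_t v, unsigned n)
{
    n &= 7;
    return n ? (uint8_t)((v >> n) | (v << (8 - n))) : v;
}

/* Logical Operation stage: combine pipeline data with the latch. */
static uint8_t
logical_stage(uint8_t op, uint8_t in, uint8_t latch)
{
    switch (op & 3) {
    case 1: return in & latch;
    case 2: return in | latch;
    case 3: return in ^ latch;
    default: return in;
    }
}

/* Apply one host byte write to the four plane bytes at one address. */
static void
vga_write(const struct write_state *s, uint8_t host, uint8_t mem[PLANES])
{
    uint8_t rotated = rotate_right(host, s->rotate_count);

    assert(s->write_mode <= 3);
    for (int p = 0; p < PLANES; p++) {
        uint8_t data;
        uint8_t mask = s->bit_mask;

        switch (s->write_mode) {
        case 0:  /* rotate, optional Set/Reset, then logical operation */
            data = ((s->enable_set_reset >> p) & 1)
                   ? expand_bit(s->set_reset, p) : rotated;
            data = logical_stage(s->logical_op, data, s->latch[p]);
            break;
        case 1:  /* latch copied straight through; host data unused */
            data = s->latch[p];
            mask = 0xFF;
            break;
        case 2:  /* host bits 3-0 expanded per plane, then logical op */
            data = expand_bit(host, p);
            data = logical_stage(s->logical_op, data, s->latch[p]);
            break;
        default: /* mode 3: Set/Reset data, rotated host ANDed into mask */
            data = expand_bit(s->set_reset, p);
            mask = rotated & s->bit_mask;
            break;
        }

        /* Bit Mask stage: set bits take new data, clear bits take latch. */
        data = (uint8_t)((data & mask) | (s->latch[p] & ~mask));

        /* Only planes enabled in the Map Mask are written. */
        if ((s->map_mask >> p) & 1)
            mem[p] = data;
    }
}
```

With all registers zero except a Bit Mask of FFh and a Map Mask of 0Fh, a Write Mode 0 write of AAh stores AAh in all four planes. Switching to Write Mode 2 and writing 05h stores FFh in planes 0 and 2 and 00h in planes 1 and 3.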
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Miscellaneous Graphics Register +(Index 06h)
    Bit   | 7-4 | 3-2               | 1         | 0
    Field |     | Memory Map Select | Chain O/E | Alpha Dis.
    +  +
      +
    • +Memory Map Select
      +
      This field specifies the range of host memory addresses that is decoded +by the VGA hardware and mapped into display memory accesses.  The +values of this field and their corresponding host memory ranges are:
    • + +
        +
      • +00b -- A0000h-BFFFFh (128K region)
      • + +
      • +01b -- A0000h-AFFFFh (64K region)
      • + +
      • +10b -- B0000h-B7FFFh (32K region)
      • + +
      • +11b -- B8000h-BFFFFh (32K region)
      • +
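The four Memory Map Select ranges lend themselves to a small lookup table. The sketch below uses hypothetical names and checks whether a host address falls inside the window chosen by a given field value.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical description of one host address window. */
struct host_window {
    uint32_t base;  /* first decoded host address */
    uint32_t size;  /* window length in bytes */
};

/* Return the host window for a 2-bit Memory Map Select value. */
static struct host_window
memory_map_window(unsigned select)
{
    static const struct host_window windows[4] = {
        { 0xA0000, 0x20000 },  /* 00b: A0000h-BFFFFh (128K) */
        { 0xA0000, 0x10000 },  /* 01b: A0000h-AFFFFh (64K)  */
        { 0xB0000, 0x08000 },  /* 10b: B0000h-B7FFFh (32K)  */
        { 0xB8000, 0x08000 },  /* 11b: B8000h-BFFFFh (32K)  */
    };
    assert(select < 4);
    return windows[select];
}

/* Nonzero if the VGA decodes the given host address for this setting. */
static int
address_is_decoded(unsigned select, uint32_t addr)
{
    struct host_window w = memory_map_window(select);
    return addr >= w.base && addr - w.base < w.size;
}
```

With the field set to 11b, address B8000h is decoded but A0000h is not.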
      +Chain O/E -- Chain Odd/Even Enable
      +
      "When set to 1, this bit directs the system address bit, A0, to +be replaced by a higher-order bit. The odd map is then selected when A0 +is 1, and the even map when A0 is 0." +
      Alpha Dis. -- Alphanumeric Mode Disable
      +
      "This bit controls alphanumeric mode addressing. When set to 1, +this bit selects graphics modes, which also disables the character generator +latches."
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Color Don't Care Register (Index +07h)
    Bit   | 7-4 | 3-0
    Field |     | Color Don't Care
    +  +
      +
    • +Color Don't Care
    • + +
      Bits 3-0 of this field represent planes 3-0 of the VGA display memory. +This field selects the planes that are used in the comparisons made by +Read Mode 1 (See the Read Mode field.) Read Mode 1 returns +the result of the comparison between the value of the Color +Compare field and a location of display memory. If a bit in this field +is set, then the corresponding display plane is considered in the comparison. +If it is not set, then that plane is ignored for the results of the comparison.
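The Read Mode 1 comparison can be modeled in a few lines. This is a sketch under the description given on this page, with a hypothetical function name. It takes the four plane bytes at one address and returns one result bit per pixel.

```c
#include <assert.h>
#include <stdint.h>

/* Model of a Read Mode 1 access: bit n of the result is set when pixel n
   matches color_compare on every plane selected in color_dont_care. */
static uint8_t
read_mode1(const uint8_t planes[4], uint8_t color_compare,
           uint8_t color_dont_care)
{
    uint8_t result = 0xFF;

    assert(planes != 0);
    for (int p = 0; p < 4; p++) {
        if (!((color_dont_care >> p) & 1))
            continue;  /* plane not selected: ignored in the comparison */

        /* Expand this plane's reference bit to all 8 pixel positions. */
        uint8_t expected = ((color_compare >> p) & 1) ? 0xFF : 0x00;

        /* Keep only the pixels whose bit in this plane matches. */
        result &= (uint8_t)~(planes[p] ^ expected);
    }
    return result;
}
```

If all eight pixels hold color 0101b, comparing against 0101b with all planes selected returns FFh. Comparing against 0100b returns 00h, unless plane 0 is cleared in the Color Don't Care field, in which case the result is FFh again.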
    +  + + + + + + + + + + + + + + + + + + + + + + + + +
    Bit Mask Register (Index 08h)
    Bit   | 7-0
    Field | Bit Mask
    +  +
      +
    • +Bit Mask
    • + +
      This field is used in Write Modes 0, 2, and 3 (See the Write +Mode field.) It is applied to one byte of data in all four display +planes. If a bit is set, then the value of the corresponding bit from the previous +stage in the graphics pipeline is selected; otherwise the value of the +corresponding bit in the latch register is used instead. In Write Mode +3, the incoming data byte, after being rotated, is logical ANDed with this +byte and the resulting value is used in the same way this field would normally +be used by itself.
    +Notice: All trademarks used or referred to on this page are the property +of their respective owners. +
    All pages are Copyright © 1997, 1998, J. D. Neal, except where +noted. Permission for utilization and distribution is subject to the terms +of the FreeVGA Project Copyright License. + + diff --git a/specs/freevga/vga/license.htm b/specs/freevga/vga/license.htm new file mode 100644 index 0000000..18b2b53 --- /dev/null +++ b/specs/freevga/vga/license.htm @@ -0,0 +1,124 @@ + + + + + + + FreeVGA Copyright License + + + +
    Hardware Level VGA and SVGA Video Programming Information Page
    + +
    FreeVGA Project Copyright License  +
    +Introduction +
            This document contains the +FreeVGA Copyright License which states the conditions under which the FreeVGA +Project's Copyrighted information may be used and distributed.  The +conditions of this license ensure that all parties with a need for this +information have the same availability, to the maximum extent possible +as well as ensure the integrity of the documentation. + +

    Disclaimer +
            The author presents this +information as-is without any warranty, including suitability for intended +purpose. The author is not responsible for damages resulting by the use +of the information, incidental or otherwise. By utilizing this information, +you as the programmer take full liability for any damages caused by your +use of this information. If you are not satisfied with these terms, then +your only recourse is to not use this information. While every reasonable +effort is made to ensure that this information is correct, the possibility +exists for error and is not guaranteed for accuracy, and disclaims liability +for any changes, errors or omissions and is not responsible for any damages +that may arise from the use or misuse of this information.  License +to use this information is only granted where this disclaimer applies in +whole. + +

    License +
            The following copyright +license applies to all works by the FreeVGA Project. All of the FreeVGA +Project's documentation is copyrighted by its author, Joshua Neal. + +

    License to utilize the FreeVGA Project documentation is subject to the +following conditions: +

      +
    • +The copyright notice and this permission notice must be preserved complete +on all copies, complete or partial.
    • + +
    • +Duplication is permitted only for personal purposes.  Reduplication +is permitted only under the FreeVGA Project documentation's redistribution +license.
    • + +
    • +The use of the FreeVGA Project documentation to produce translations or +derivative works must be approved specifically by the author.
    • + +
    • +All warnings and disclaimers present in the complete documentation must +apply to the licensee and may not be restricted by locality.  These +must be read before use, and determined to be applicable to the licensee +before the material may be utilized.
    • + +
    • +It is forbidden to represent the FreeVGA Project or to use the FreeVGA +Project's name to solicit or obtain information, services, product, or +endorsements from another party, commercial or otherwise.
    • +
    +If all of the previous conditions are not met, then permission to utilize +the FreeVGA Project's documentation is not granted, and all rights are +reserved. + +

    License to distribute the FreeVGA Project documentation is subject to +the following conditions: +

      +
    • +The copyright notice and this permission notice must be preserved complete +on all copies, complete or partial.
    • + +
    • +An archive of the FreeVGA Project documentation may be distributed in electronic +form only in its entirety, without adding or removing any material, notices, +advertisement, or other information.  Only exact copies of archives +produced or specifically approved by the author may be distributed, and +at the time of distribution, the most recent archive must be distributed.  +The FreeVGA Project documentation must be excluded from any compilation +copyright or other restrictions.  No fee other than the cost of transmission +or the physical media containing the archive may be charged without prior +approval by the author.  The documentation may not be distributed +electronically in part, which includes mirroring in html format on the +internet, unless specific permission is granted by the author.
    • + +
    • +The FreeVGA Project documentation may be distributed in non-electronic +form to students or members of a programming team subject to the condition +that it be provided free of charge.  The documentation may not be +included with or within other copyrighted works unless the other copyrighted +works are also provided free of charge.
    • + +
    • +Small portions may be reproduced as illustrations for reviews or quotes +in other works without this permission notice if proper citation is given +(including URL if the work is online.)
    • + +
    • +Only the current documentation may be distributed.  The URL of the +FreeVGA project online documentation must be provided.  The author +reserves the right to limit distribution by any parties at any time.
    • +
    +If all of the previous conditions are not met, then permission to redistribute +the FreeVGA Project's documentation is not granted, and all distribution +rights are reserved. + +

    Notice: All trademarks used or referred to on this page are the property +of their respective owners. +
    All pages are Copyright © 1997, 1998, J. D. Neal, except where +noted. Permission for utilization and distribution is subject to the terms +of the FreeVGA Project Copyright License. +
      +
      + + diff --git a/specs/freevga/vga/paging.gif b/specs/freevga/vga/paging.gif new file mode 100644 index 0000000..74605a6 Binary files /dev/null and b/specs/freevga/vga/paging.gif differ diff --git a/specs/freevga/vga/paging.txt b/specs/freevga/vga/paging.txt new file mode 100644 index 0000000..f33472c --- /dev/null +++ b/specs/freevga/vga/paging.txt @@ -0,0 +1,29 @@ + Paging Memory Utilization Example + --------------------------------- + + 0 +-------------------------+ 79 + | | + | | + | | + | PAGE 0 | + | | + | | + | | + 1920 | | 1999 + +-------------------------+ + | 48 bytes unused | + +-------------------------+ + 2048 | | 2127 + | | + | | + | PAGE 1 | + | | + | | + | | + 3968 | | 4047 + +-------------------------+ + | | + +-------------------------+ + | | + |_ __ __ __ _ _| + -- --_- - --___-- -- diff --git a/specs/freevga/vga/portidx.htm b/specs/freevga/vga/portidx.htm new file mode 100644 index 0000000..45de68d --- /dev/null +++ b/specs/freevga/vga/portidx.htm @@ -0,0 +1,99 @@ + + + + + + + FreeVGA - VGA I/O Port Index + + + +

    Hardware Level VGA and SVGA Video Programming Information Page
    + +
    VGA I/O Port Index  +
    +Introduction +
            This index lists the VGA's +I/O ports in numerical order, which makes it simpler to look up a specific +I/O port. +
      + + + +

    Notice: All trademarks used or referred to on this page are the property +of their respective owners. +
    All pages are Copyright © 1997, 1998, J. D. Neal, except where +noted. Permission for utilization and distribution is subject to the terms +of the FreeVGA Project Copyright License. +
      +
      +
      + + diff --git a/specs/freevga/vga/seqpack.gif b/specs/freevga/vga/seqpack.gif new file mode 100644 index 0000000..dac6336 Binary files /dev/null and b/specs/freevga/vga/seqpack.gif differ diff --git a/specs/freevga/vga/seqpack.txt b/specs/freevga/vga/seqpack.txt new file mode 100644 index 0000000..8400edc --- /dev/null +++ b/specs/freevga/vga/seqpack.txt @@ -0,0 +1,25 @@ + Packed Shift Mode Diagram + ------------------------- + + Plane 0 Plane 1 + /-----------------^-----------------\ /-----------------^-----------------\ + 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 ++---+---+ +---+---+ +---+---+ +---+---+ +---+---+ +---+---+ +---+---+ +---+---+ +| | | | | | | | | | | | | | | | | | | | | | | | ++---+---+ +---+---+ +---+---+ +---+---+ +---+---+ +---+---+ +---+---+ +---+---+ + | | | | | | | | | | | | | | | | + \ | \ | \ | \ | \ | \ | \ | \ | +3 2 | | 3 2 | | 3 2 | | 3 2 | | 3 2 | | 3 2 | | 3 2 | | 3 2 | | +[][][][] [][][][] [][][][] [][][][] [][][][] [][][][] [][][][] [][][][] + | | 1 0 | | 1 0 | | 1 0 | | 1 0 | | 1 0 | | 1 0 | | 1 0 | | 1 0 + | \ | \ | \ | \ | \ | \ | \ | \ + | | | | | | | | | | | | | | | | ++---+---+ +---+---+ +---+---+ +---+---+ +---+---+ +---+---+ +---+---+ +---+---+ +| | | | | | | | | | | | | | | | | | | | | | | | ++---+---+ +---+---+ +---+---+ +---+---+ +---+---+ +---+---+ +---+---+ +---+---+ + 7 6 5 4 3 2 1 0 7 6 5 4 3 2 1 0 + \-----------------v-----------------/ \-----------------v-----------------/ + Plane 2 Plane 3 + + <------- Direction of Shift + diff --git a/specs/freevga/vga/seqplanr.gif b/specs/freevga/vga/seqplanr.gif new file mode 100644 index 0000000..d8d8270 Binary files /dev/null and b/specs/freevga/vga/seqplanr.gif differ diff --git a/specs/freevga/vga/seqplanr.txt b/specs/freevga/vga/seqplanr.txt new file mode 100644 index 0000000..ad14c27 --- /dev/null +++ b/specs/freevga/vga/seqplanr.txt @@ -0,0 +1,31 @@ + Planar Shift Mode Diagram + ------------------------- + + Pixel Value Display Memory + + 3 2 1 0 7 6 5 4 3 2 1 0 
++-----+-----+-----+-----+ +-----+-----+-----+-----+-----+-----+-----+-----+ +|.....|.....|.....|.....|___|.....|%%%%%|:::::|%%%%%|:::::|%%%%%|:::::|%%%%%| +|.....|.....|.....|.....| |.....|%%%%%|:::::|%%%%%|:::::|%%%%%|:::::|%%%%%| ++-----+-----+-----+-----+ +-----+-----+-----+-----+-----+-----+-----+-----+ + | | | Plane 0 | | | | | | | | + | | | +-----+-----+-----+-----+-----+-----+-----+-----+ + | | |____________|.....|%%%%%|:::::|%%%%%|:::::|%%%%%|:::::|%%%%%| + | | Plane 1|.....|%%%%%|:::::|%%%%%|:::::|%%%%%|:::::|%%%%%| + | | +-----+-----+-----+-----+-----+-----+-----+-----+ + | | | | | | | | | | + | | +-----+-----+-----+-----+-----+-----+-----+-----+ + | |__________________|.....|%%%%%|:::::|%%%%%|:::::|%%%%%|:::::|%%%%%| + | Plane 2|.....|%%%%%|:::::|%%%%%|:::::|%%%%%|:::::|%%%%%| + | +-----+-----+-----+-----+-----+-----+-----+-----+ + | | | | | | | | | + | +-----+-----+-----+-----+-----+-----+-----+-----+ + |________________________|.....|%%%%%|:::::|%%%%%|:::::|%%%%%|:::::|%%%%%| + Plane 3|.....|%%%%%|:::::|%%%%%|:::::|%%%%%|:::::|%%%%%| + +-----+-----+-----+-----+-----+-----+-----+-----+ + | | | | | | | | + + Pixel: 0 1 2 3 4 5 6 7 + + <-------- Direction of Shift + diff --git a/specs/freevga/vga/seqreg.htm b/specs/freevga/vga/seqreg.htm new file mode 100644 index 0000000..ab9569c --- /dev/null +++ b/specs/freevga/vga/seqreg.htm @@ -0,0 +1,381 @@ + + + + + + + VGA/SVGA Video Programming--Sequencer Registers + + + +

    Hardware Level VGA and SVGA Video Programming Information +Page
    + +
    Sequencer Registers  +
    + + +

            The Sequencer Registers are +accessed via a pair of registers, the Sequencer Address Register and the +Sequencer Data Register. See the Accessing the VGA +Registers section for more details. The Address Register is located +at port 3C4h and the Data Register is located at port 3C5h. +

      +
    • +Index 00h -- Reset Register
    • + +
    • +Index 01h -- Clocking Mode Register
    • + +
    • +Index 02h -- Map Mask Register
    • + +
    • +Index 03h -- Character Map Select Register
    • + +
    • +Index 04h -- Sequencer Memory Mode Register
    • +
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Reset Register (Index 00h)
    Bit   | 7-2 | 1  | 0
    Field |     | SR | AR
    +  +
      SR -- Synchronous Reset +
      "When set to 0, this bit commands the sequencer to synchronously +clear and halt. Bits 1 and 0 must be 1 to allow the sequencer to operate. +To prevent the loss of data, bit 1 must be set to 0 during the active display +interval before changing the clock selection. The clock is changed through +the Clocking Mode register or the Miscellaneous Output register." +
      AR -- Asynchronous Reset +
      "When set to 0, this bit commands the sequencer to asynchronously +clear and halt. Resetting the sequencer with this bit can cause loss of +video data"
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Clocking Mode Register (Index 01h)
    Bit   | 7-6 | 5  | 4  | 3   | 2   | 1 | 0
    Field |     | SD | S4 | DCR | SLR |   | 9/8DM
    +  +
      SD -- Screen Disable +
      "When set to 1, this bit turns off the display and assigns maximum +memory bandwidth to the system. Although the display is blanked, the synchronization +pulses are maintained. This bit can be used for rapid full-screen updates." +
      S4 -- Shift Four Enable +
      "When the Shift 4 field and the Shift Load Field are set to 0, the +video serializers are loaded every character clock. When the Shift 4 field +is set to 1, the video serializers are loaded every fourth character clock, +which is useful when 32 bits are fetched per cycle and chained together +in the shift registers." +
      DCR -- Dot Clock Rate +
      "When set to 0, this bit selects the normal dot clocks derived from +the sequencer master clock input. When this bit is set to 1, the master +clock will be divided by 2 to generate the dot clock. All other timings +are affected because they are derived from the dot clock. The dot clock +divided by 2 is used for 320 and 360 horizontal PEL modes." +
      SLR -- Shift/Load Rate +
      "When this bit and bit 4 are set to 0, the video serializers are +loaded every character clock. When this bit is set to 1, the video serializers +are loaded every other character clock, which is useful when 16 bits are +fetched per cycle and chained together in the shift registers. The Type +2 video behaves as if this bit is set to 0; therefore, programs should +set it to 0." +
    • +9/8DM -- 9/8 Dot Mode
    • + +
      This field is used to select whether a character is 8 or 9 dots wide. +This can be used to select between 720 and 640 pixel modes (or 360 and +320) and also is used to provide 9 bit wide character fonts in text mode. +The possible values for this field are: +
        +
      • +0 - Selects 9 dots per character.
      • + +
      • +1 - Selects 8 dots per character.
      • +
      +
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Map Mask Register (Index 02h)
    Bit   | 7-4 | 3-0
    Field |     | Memory Plane Write Enable
    +  +
      +
    • +Memory Plane Write Enable
    • + +
      Bits 3-0 of this field correspond to planes 3-0 of the VGA display +memory. If a bit is set, then write operations will modify the respective +plane of display memory. If a bit is not set then write operations will +not affect the respective plane of display memory.
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Character Map Select Register (Index +03h)
    Bit   | 7-6 | 5     | 4     | 3-2                    | 1-0
    Field |     | CSAS2 | CSBS2 | Character Set A Select | Character Set B Select
    +  +
      +
    • +CSAS2 -- Bit 2 of Character Set A Select
    • + +
      This is bit 2 of the Character Set A Select field. See Character +Set A Select below. +
    • +CSBS2 -- Bit 2 of Character Set B Select
    • + +
      This is bit 2 of the Character Set B Select field. See Character +Set B Select below. +
    • +Character Set A Select
    • + +
      This field is used to select the font that is used in text mode when +bit 3 of the attribute byte for a character is set to 1. Note that this +field is not contiguous in order to provide EGA compatibility. The font +selected resides in plane 2 of display memory at the address specified +by this field, as follows: +
        +
      • +000b -- Select font residing at 0000h - 1FFFh
      • +
      + +
        +
      • +001b -- Select font residing at 4000h - 5FFFh
      • + +
      • +010b -- Select font residing at 8000h - 9FFFh
      • + +
      • +011b -- Select font residing at C000h - DFFFh
      • + +
      • +100b -- Select font residing at 2000h - 3FFFh
      • + +
      • +101b -- Select font residing at 6000h - 7FFFh
      • + +
      • +110b -- Select font residing at A000h - BFFFh
      • + +
      • +111b -- Select font residing at E000h - FFFFh
      • +
      + +
    • +Character Set B Select
    • + +
      This field is used to select the font that is used in text mode when +bit 3 of the attribute byte for a character is set to 0. Note that this +field is not contiguous in order to provide EGA compatibility. The font +selected resides in plane 2 of display memory at the address specified +by this field, identical to the mapping used by Character +Set A Select above.
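The address table above follows a simple pattern that a short C sketch can make explicit: the low two bits of the 3-bit value pick a 16 KB slot, and bit 2 adds 8 KB within it. The function names are hypothetical, and the bit positions used to reassemble the fields from the register byte (bit 5 = CSAS2, bit 4 = CSBS2, bits 3-2 = Character Set A Select, bits 1-0 = Character Set B Select) are an assumption based on the usual VGA layout rather than something stated in the text.

```c
#include <stdint.h>

/* Convert a 3-bit Character Set Select value into the offset of the
   font within plane 2, following the table above. */
static uint16_t font_offset(uint8_t select3)
{
    return (uint16_t)(((select3 & 0x03u) << 14) |   /* 16 KB slot   */
                      ((select3 & 0x04u) << 11));   /* +8 KB if set */
}

/* Reassemble the non-contiguous 3-bit fields from the register byte.
   Assumed layout: bit 5 = CSAS2, bit 4 = CSBS2, bits 3-2 = A select,
   bits 1-0 = B select. */
static uint8_t charset_a_select(uint8_t reg)
{
    return (uint8_t)(((reg >> 2) & 0x03u) | ((reg >> 3) & 0x04u));
}

static uint8_t charset_b_select(uint8_t reg)
{
    return (uint8_t)((reg & 0x03u) | ((reg >> 2) & 0x04u));
}
```

Under these assumptions, `font_offset(charset_a_select(reg))` gives the plane 2 offset of the font used when attribute bit 3 is 1.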
    +  + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Sequencer Memory Mode Register (Index +04h)
    7 6 5 4 3 2 1 0
    Chain 4 | O/E Dis. | Ext. Mem
    +  +
      Chain 4 -- Chain 4 Enable +
      "This bit controls the map selected during system read operations. +When set to 0, this bit enables system addresses to sequentially access +data within a bit map by using the Map Mask register. When set to 1, this +bit causes the two low-order bits to select the map accessed as shown below. +
      Address Bits +
        A1 A0            +Map Selected +
         0   0              +0 +
         0   1              +1 +
         1   0              +2 +
         1   1              +3" +
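Read as a 2-bit number, the two low-order address bits in the table above are simply the map number. A minimal C sketch of that selection follows; `chain4_plane` is a hypothetical helper name, and the sketch says nothing about how the remaining address bits are used inside the selected map.

```c
#include <stdint.h>

/* Map (plane) selected by a host address when Chain 4 is set to 1:
   the two low-order address bits, taken together as a number 0-3. */
static unsigned chain4_plane(uint32_t host_addr)
{
    return host_addr & 0x03u;
}
```

Four consecutive host addresses therefore touch maps 0, 1, 2, and 3 in turn.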
      O/E Dis. -- Odd/Even Host Memory Write Addressing Disable
      +
      "When this bit is set to 0, even system addresses access maps 0 +and 2, while odd system addresses access maps 1 and 3. When this bit is +set to 1, system addresses sequentially access data within a bit map, and +the maps are accessed according to the value in the Map Mask register (index +0x02)." +
      Ext. Mem -- Extended Memory
      +
      "When set to 1, this bit enables the video memory from 64KB to 256KB. +This bit must be set to 1 to enable the character map selection described +for the previous register."
    +Notice: All trademarks used or referred to on this page are the property +of their respective owners. +
    All pages are Copyright © 1997, 1998, J. D. Neal, except where +noted. Permission for utilization and distribution is subject to the terms +of the FreeVGA Project Copyright License. + + diff --git a/specs/freevga/vga/textcur.htm b/specs/freevga/vga/textcur.htm new file mode 100644 index 0000000..3f1fa5f --- /dev/null +++ b/specs/freevga/vga/textcur.htm @@ -0,0 +1,137 @@ + + + + + + + VGA/SVGA Video Programming--Manipulating the Text-mode Cursor + + + +
    Hardware Level VGA and SVGA Video Programming Information +Page
    + +
    Manipulating the Text-mode Cursor  +
    + + +Introduction +
            When dealing with +the cursor in most high-level languages, the cursor is defined as the place +where the next text output will appear on the display. When dealing directly +with the display, the cursor is simply a blinking area of a particular +character cell. A program may write text directly to the display independent +of the current location of the cursor. The VGA provides facilities for +specifying whether a cursor is to be displayed, where the cursor is to +appear, and the shape of the cursor itself. Note that this cursor is only +used in the text modes of the standard VGA and is not to be confused with +the graphics cursor capabilities of particular SVGA chipsets. + +

    Enabling/Disabling the Cursor +
            On the VGA there are three +main ways of disabling the cursor. The most straightforward is to set the +Cursor Disable field to 1. Another way is +to set the Cursor Scan Line End field to a +value less than that of the Cursor Scan Line Start +field. On some adapters such as the IBM EGA, this will result instead in +a split block cursor. The third way is to set the cursor location to a +location off-screen. The first two methods are specific to VGA and compatible +adapters and are not guaranteed to work on non-VGA adapters, while the +third method should. + +

    Manipulating the Cursor Position +
            When dealing with +the cursor in standard BIOS text modes, the cursor position is specified +by row and column. The VGA hardware, due to its flexibility to display +many different text modes, specifies cursor position as a 16-bit address. +The upper byte of this address is specified by the Cursor +Location High Register, and the lower by the Cursor +Location Low Register. In addition, this value is affected by the Cursor +Skew field. When the hardware fetches a character from display memory +it compares the address of the character fetched to that of the cursor +location added to the Cursor Skew field. If +they are equal and the cursor is enabled, then the character is written +with the current cursor pattern superimposed. Note that the address compared +to the cursor location is the address in display memory, not the address +in host memory. Characters and their attributes are stored at the same +address in display memory in different planes, and it is the odd/even addressing +mode usually used in text modes that makes the interleaved character/attribute +pairs in host memory possible. Note that it is possible to set the cursor +location to an address not displayed, effectively disabling the cursor. +
            The Cursor +Skew field was used on the EGA to synchronize the cursor with internal +timing. On the VGA this is not necessary, and setting this field to any +value other than 0 may produce undesired effects. For example, on one +particular card, setting the cursor position to the rightmost column and +setting the skew to 1 made the cursor disappear entirely. On the same card, +setting the cursor position to the leftmost column and setting the skew +to 1 made an additional cursor appear above and to the left of the correct +cursor. At any other position, setting the skew to 1 simply moved the cursor +right one position. Other than these undesired effects, there is no function +that this register can provide that could not be obtained by simply increasing +the cursor location. + +
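The conversion from a row and column to the 16-bit cursor address can be sketched as follows. This is a minimal illustration, not a driver: `cursor_location` and `cursor_from_row_col` are hypothetical names, and the `columns` and `start` parameters are assumptions about the current mode (80 cells per row and a display start of 0 in the common 80x25 text mode). The result is an address in display memory, so it counts character cells, not the interleaved character/attribute bytes seen in host memory.

```c
#include <stdint.h>

typedef struct {
    uint8_t high;  /* value for the Cursor Location High Register */
    uint8_t low;   /* value for the Cursor Location Low Register  */
} cursor_location;

/* Convert a row/column pair to the display-memory address that the
   VGA compares against, split into its two register bytes.
   `columns` = character cells per row, `start` = display-memory
   address of the top-left cell (both assumed, mode dependent). */
static cursor_location cursor_from_row_col(unsigned row, unsigned col,
                                           unsigned columns, uint16_t start)
{
    uint16_t addr = (uint16_t)(start + row * columns + col);
    cursor_location loc = { (uint8_t)(addr >> 8), (uint8_t)(addr & 0xFFu) };
    return loc;
}
```

For row 12, column 40 of an 80-column screen starting at 0, the address is 1000 (03E8h), giving a high byte of 03h and a low byte of E8h.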

    Manipulating the Cursor Shape +
           On the VGA, the text-mode +cursor consists of a line or block of lines that extend horizontally across +the entire scan line of a character cell. The first, topmost line is specified +by the Cursor Scan Line Start field. The last, +bottommost line is specified by the Cursor Scan +Line End field. The scan lines in a character cell are numbered from +0 up to the value of the Maximum Scan Line +field. On the VGA, if the Cursor Scan Line End +field is less than the Cursor Scan Line Start +field, no cursor will be displayed. Some adapters, such as the IBM EGA, +may display a split-block cursor instead. + +

    Cursor Blink Rate +
            On the standard VGA, the +blink rate is dependent on the vertical frame rate. The on/off state of +the cursor changes every 16 vertical frames, which amounts to 1.875 blinks +per second at 60 vertical frames per second. The cursor blink rate is thus +fixed and cannot be software controlled on the standard VGA. Some SVGA +chipsets provide non-standard means for changing the blink rate of the +text-mode cursor. + +
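The 1.875 figure follows from one full blink being an "on" period plus an "off" period, or 32 frames in total. A small C sketch of that arithmetic, using a hypothetical function name:

```c
/* Full on/off blink cycles per second for a given vertical frame
   rate. The cursor state flips every 16 frames, so one complete
   blink (on then off) takes 2 * 16 = 32 frames. */
static double cursor_blinks_per_second(double frames_per_second)
{
    const double frames_per_toggle = 16.0;
    return frames_per_second / (2.0 * frames_per_toggle);
}
```

At 60 frames per second this gives 60 / 32 = 1.875; at 70 frames per second it gives 2.1875.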

    Cursor Color +
            On the standard VGA, the +cursor color is obtained from the foreground color of the character on +which the cursor is superimposed. There is no way to modify +this behavior on the standard VGA. +
      + +

    Notice: All trademarks used or referred to on this page are the property +of their respective owners. +
    All pages are Copyright © 1997, 1998, J. D. Neal, except where +noted. Permission for utilization and distribution is subject to the terms +of the FreeVGA Project Copyright License. +
      + + diff --git a/specs/freevga/vga/vga.htm b/specs/freevga/vga/vga.htm new file mode 100644 index 0000000..39f95a9 --- /dev/null +++ b/specs/freevga/vga/vga.htm @@ -0,0 +1,226 @@ + + + + + + + VGA/SVGA Video Programming--Standard VGA Chipset Reference + + + +

    Hardware Level VGA and SVGA Video Programming Information +Page
    + +
    VGA Chipset Reference  +
    + + +Introduction +
            This section is intended +to be a reference to the common functionality of the original IBM VGA and +compatible adapters. If you are writing directly to hardware then this +is the lowest common denominator of nearly all video cards in use today. +Nearly all programs requiring the performance of low-level hardware access +resort to this baseline capacity, so this information is still valuable +to programmers. In addition most of the VGA functions apply to SVGA cards +when operating in SVGA modes, so it is best to know how to use them even +when programming more advanced hardware. +
            Most VGA references I have +seen document the VGA by describing its operation in the various BIOS modes. +However, because BIOS was designed for use in MS-DOS real mode applications, +its functionality is limited in other environments. This document is structured +in a way that explains the VGA hardware and its operation independent of +the VGA BIOS modes, which will allow for better understanding of the capabilities +of the VGA hardware. +
            This reference has grown +out of my own notes and experimentation while learning to program the VGA +hardware. During this process I have identified errors in various references +that I have used and have attempted to document the VGA hardware's actual +behavior as best as possible. If in your experience you find any of this +information to be inaccurate, or even merely misleading, +please let me know! +
            One of the reasons I started +this reference was that I was using existing references and found myself +wishing for a hypertext reference, as almost every register is affected +by the operation of another and I was constantly flipping pages. Here I +simply use links for the register references, such as Offset +Register, rather than stating something like: Offset Register (CRTC: +Offset = 13h, bits 7-0). While the second method is more informative, using +it for every reference to the register makes the text somewhat bogged +down. HTML allows simply clicking on the register name and all of the details +are provided. Another reason is that no single reference had all of the information +I was looking for, and that I had penciled many corrections and clarifications +into the references themselves. This makes it difficult to switch to a +newer version of a book when another edition comes out -- I still use my +heavily annotated second edition of Ferraro's book, rather than the more +up-to-date third edition. + +

    General Programming Information +
            This section is intended +to provide functional information on various aspects of the VGA. If you +are looking simply for VGA register descriptions look in the next section. +The VGA hardware is complex and can be confusing to program. Rather than +attempt to document the VGA better than existing references by using more +words to describe the registers, this section breaks down the functionality +of the VGA into specific categories of similar functions or by detailing +procedures for performing certain operations. +

    +Input/Output Register Information +
            This section is intended +to provide a detailed reference of the VGA's internal registers. It attempts +to combine information from a variety of sources, including the references +listed in the reference section of the home page; however, rather than +attempting to condense this information into one reference, leaving out +significant detail, I have attempted to expand upon the information available +and provide an accurate, detailed reference that should be useful to any +programmer of the VGA and SVGA. Only those registers that are present and +functional on the VGA are given, so if you are seeking information specific +to the CGA, EGA, MCGA, or MGA adapters try the Other References section +on the home page. +
            In some cases I have changed +the name of the register, not to protect the innocent but simply to make +it clearer to understand. One clarification is the use of "Enable" and +"Disable". The function of a field with a name ending in "Enable" +is enabled when it is 1, and likewise a field with a name ending in "Disable" +is disabled when it is 1. Another case is when two fields have similar +or identical names; I have added more description to the name to differentiate +them. +
            It can be difficult to understand +how to manipulate the VGA registers as many registers have been packed +into a small number of I/O ports and accessing them can be non-intuitive, +especially the Attribute Controller Registers, so I have provided a tutorial +for doing this. + +        In order to facilitate understanding +of the registers, one should view them as groups of similar registers, +based upon how they are accessed, as the VGA uses indexed registers to +access most parameters. This also roughly places them in groups of similar +functionality; however, in many cases the fields do not fit neatly into +their category. In certain cases I have utilized quotes from the IBM VGA +Programmer's Reference; this information is given in "italic."  +This is meant to be a temporary placeholder until a better description +can be written, and it may not be applicable to a standard VGA implementation.  +Presented roughly in order of their place in the graphics pipeline between +the CPU and the video outputs are the: + +Indices +
            In order to locate a particular +register quickly, the following indexes are provided. The first is a listing +of all of the register fields of the VGA hardware. This is especially useful +for fields that are split among multiple registers, or for finding the +location of a field that is packed in with other fields in one register. +The second is indexed by function groups each pertaining to a particular +part of the VGA hardware. This makes understanding and programming the +VGA hardware easier by listing the fields by subsystem, as the VGA's fields +are grouped in a somewhat haphazard fashion. The third is intended for +matching a read or write to a particular I/O port address to the section +where it is described. +
      +
    • +VGA Field Index -- An alphabetical listing of +all fields and links to their location.
    • + +
    • +VGA Functional Index -- A listing of all fields +and links to their location grouped by function.
    • + +
    • +VGA I/O Port Index -- A listing of VGA I/O ports +in numerical order.
    • +
    +Notice: All trademarks used or referred to on this page are the property +of their respective owners. +
    All pages are Copyright © 1997, 1998, J. D. Neal, except where +noted. Permission for utilization and distribution is subject to the terms +of the FreeVGA Project Copyright License. + +

      + + diff --git a/specs/freevga/vga/vgacrtc.htm b/specs/freevga/vga/vgacrtc.htm new file mode 100644 index 0000000..71c5b7a --- /dev/null +++ b/specs/freevga/vga/vgacrtc.htm @@ -0,0 +1,210 @@ + + + + + + + FreeVGA - VGA Display Generation + + + +

    Hardware Level VGA and SVGA Video Programming Information Page
    + +
    VGA Display Generation  +
    +Introduction +
            This page documents the +configuration of the VGA's CRTC registers which control the framing and +timing of video signals sent to the display device, usually a monitor. + +

    Dot Clocks +
            The standard VGA has two +"standard" dot clock frequencies available to it, as well as a possible +"external" clock source, which is implementation dependent.  The two +standard clock frequencies are nominally 25 MHz and 28 MHz.  Some +chipsets use 25.000 MHz and 28.000 MHz, while others use slightly greater +clock frequencies.  The IBM VGA chipset I have uses 25.1750 MHz +and 28.3220 MHz crystals.  Some newer cards use the closest generated +frequency produced by their clock chip.  In most circumstances the +IBM VGA timings can be assumed as the monitor should allow an amount of +variance; however, if you know the actual frequencies used you should use +them in your timing calculations. +
            The dot clock source in +the VGA hardware is selected using the Clock +Select field.  For the VGA, two of the values are undefined; some +SVGA chipsets use the undefined values for clock frequencies used for 132 +column mode and such.  The 25 MHz clock is designed for 320 and 640 +pixel modes and the 28 MHz is designed for 360 and 720 pixel modes. The +Dot Clock Rate field specifies whether to use +the dot clock source directly or to divide it in half before using it as +the actual dot clock rate. + +

    Horizontal Timing +
            The VGA measures horizontal +timing periods in terms of character clocks, which can either be 8 or 9 +dot clocks, as specified by the 9/8 Dot Mode +field.  The 9 dot clock mode was included for monochrome emulation +and 9-dot wide character modes, and can be used to provide 360 and 720 +pixel wide modes that work on all standard VGA monitors, when combined +with a 28 MHz dot clock. The VGA uses a horizontal character counter which +is incremented at each character clock, which the horizontal timing circuitry +compares against the values of the horizontal timing fields to control +the horizontal state. The horizontal periods that are controlled are the +active display, overscan, blanking, and refresh periods. +
            The start of the active +display period coincides with the resetting of the horizontal character +counter, and is thus fixed at zero.  The value at which the horizontal +character counter is reset is controlled by the Horizontal +Total field. Note, however, that the value programmed into the Horizontal +Total field is actually 5 less than the actual value due to timing +concerns. +
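As a worked example of the "5 less" rule, the following C sketch computes the scan line rate from a dot clock, a Horizontal Total field value, and the character width. The function name is hypothetical, and the sample values mentioned afterwards (a field value of 5Fh with a 25.175 MHz clock and 8-dot characters) are an assumption about a typical 640-pixel mode rather than something stated above. The dot clock passed in is taken to be the rate after any divide-by-two has been applied.

```c
/* Scan lines per second. The Horizontal Total field holds the real
   number of character clocks per scan line minus 5, so 5 is added
   back before multiplying by the character width in dot clocks. */
static double scanline_rate_hz(double dot_clock_hz,
                               unsigned horizontal_total_field,
                               unsigned dots_per_char)
{
    unsigned chars_per_line = horizontal_total_field + 5u;
    return dot_clock_hz / ((double)chars_per_line * dots_per_char);
}
```

With the assumed values, 95 + 5 = 100 character clocks of 8 dots each is 800 dot clocks per line, and 25,175,000 / 800 = 31,468.75 lines per second.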
            The end of the active display +period is controlled by the End Horizontal Display +field.  When the horizontal character counter is equal to the value +of this field, the sequencer begins outputting the color specified by the +Overscan Palette Index field.  This continues +until the active display begins again at the start of the next scan line.  +Note that the horizontal blanking +takes precedence over the sequencer and attribute controller. +
            The horizontal blanking +period begins when the character clock equals the value of the Start +Horizontal Blanking field.  During the horizontal blanking period, +the output voltages of the DAC signal the monitor to turn off the guns.   +Under normal conditions, this prevents the overscan color from being displayed +during the horizontal retrace period.  This period extends until the +lower 6 bits of the End Horizontal Blanking +field match the lower 6 bits of the horizontal character counter.  +This allows for a blanking period from 1 to 64 character clocks, although +some implementations may treat 64 as 0 character clocks in length.  +The blanking period may occur anywhere in the scan line, active display +or otherwise, even though it is meant to appear outside the active display +period.  It takes precedence over all other VGA output.  There +is also no requirement that blanking occur at all.  If the Start +Horizontal Blanking field falls outside the maximum value of the character +clock determined by the Horizontal Total field, +then no blanking will occur at all.  Note that due to the setting +of the Horizontal Total field, the first match +for the End Horizontal Blanking field may +be on the following scan line. +
            Similar to the horizontal +blanking period, the horizontal retrace period is specified by the Start +Horizontal Retrace and End Horizontal Retrace +fields. The horizontal retrace period begins when the character clock equals +the value stored in the Start Horizontal Retrace +field.  The horizontal retrace ends when the lower 5 bits of the character +clock match the bit pattern stored in the End +Horizontal Retrace field, allowing a retrace period from 1 to 32 clocks; +however, a particular implementation may treat 32 clocks as zero clocks +in length.  The operation of this is identical to that of the horizontal +blanking mechanism with the exception of being a 5 bit comparison instead +of 6, and affecting the horizontal retrace signal instead of the horizontal +blanking. +
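The partial-comparison scheme used for the end of blanking (6 bits) and the end of retrace (5 bits) can be sketched in C as below. The function names are hypothetical, and the sketch assumes the counter runs freely from the start value: it does not model the counter being reset at Horizontal Total, so it does not cover the case where the first match falls on the following scan line.

```c
/* Length, in character clocks, of a period that starts when the
   counter equals `start` and ends when the low `bits` bits of the
   counter next equal `end_field`. A match on the low bits of the
   starting count is treated as a full wrap (64 or 32 clocks); as
   noted above, some implementations may treat it as zero. */
static unsigned period_length(unsigned start, unsigned end_field,
                              unsigned bits)
{
    unsigned mask = (1u << bits) - 1u;
    unsigned len = (end_field - start) & mask;
    return len ? len : mask + 1u;
}

/* Inverse: the value to program into the end field so that the
   period lasts `length` character clocks. */
static unsigned end_field_for(unsigned start, unsigned length,
                              unsigned bits)
{
    return (start + length) & ((1u << bits) - 1u);
}
```

For example, blanking that starts at character 80 and should last 18 character clocks needs an end field of (80 + 18) mod 64 = 34.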
            There are two horizontal +timing fields that are described as being related to internal timings of +the VGA, the Display Enable Skew and Horizontal +Retrace Skew fields.  In the VGA they do seem to affect the timing, +but also do not seem to be necessary for the operation of the VGA and are +pretty much unused.  These registers were required by the IBM VGA +implementations, so I'm assuming they were added in the early stages of +the VGA design for EGA compatibility, but the internal timings were changed +to more friendly ones making the use of these fields unnecessary.  +It seems to be totally safe to set these fields to 0 and ignore them.  +See the register descriptions for more details, if you have to deal with +software that programs them. + +

    Vertical Timing +
            The VGA maintains a scanline +counter which is used to measure vertical timing periods.  This counter +is set to zero before the beginning of the first scanline +of the active display, so a count of zero coincides with that scanline.  +Depending on the setting of the Divide +Scan Line Clock by 2 field, this counter is incremented either every +scanline, or every second scanline.  The vertical scanline counter +is incremented before the beginning of each horizontal scan line, as all +of the VGA's vertical timing values are measured at the beginning of the +scan line, after the counter has been set/incremented.  The maximum +value of the scanline counter is specified by the Vertical +Total field.  Note that, like the rest of the vertical timing +values that "overflow" an 8-bit register, the most significant bits are +located in the Overflow Register.  The +Vertical Total field is programmed with the +value of the scanline counter at the beginning of the last scanline. +
            The vertical active display +period begins when the scanline counter is at zero, and extends up to the +value specified by the Vertical Display End +field.  This field is set with the value of the scanline counter at +the beginning of the first inactive scanline, telling the video hardware +when to stop outputting scanlines of sequenced pixel data and instead output the +attribute specified by the Overscan Palette Index +field in the horizontal active display period of those scanlines.  +This continues until the start of the next frame when the active display +begins again. +
            The Start +Vertical Blanking and End Vertical Blanking +fields control the vertical blanking interval.  The Start +Vertical Blanking field is programmed with the value of the scanline +counter at the beginning of the scanline to begin blanking at.  The +value of the End Vertical Blanking field is +set to the lower eight bits of the scanline counter at the beginning of +the scanline after the last scanline of vertical blanking. +
            The Vertical +Retrace Start and Vertical Retrace End +fields determine the length of the vertical retrace interval.  The +Vertical Retrace Start field contains the +value of the scanline counter at the beginning of the first scanline where +the vertical retrace signal is asserted.  The Vertical +Retrace End field is programmed with the value of the lower four bits +of the scanline counter at the beginning of the scanline after the last +scanline where the vertical retrace signal is asserted. + +

    Monitoring Timing +
            There are certain operations +that should be performed during certain periods of the display cycle to +minimize visual artifacts, such as attribute and DAC writes.  There +are two bit fields that return the current state of the VGA, the Display +Disabled and Vertical Retrace fields. +The Display Disabled field is set to 1 when +the display enable signal is not asserted, providing the programmer with +a means to determine whether the video hardware is currently refreshing the +active display or outputting blanking. +
            The Vertical +Retrace field signals whether or not the VGA is in a vertical retrace +period.  This is useful for determining the end of a display period, +which can be used by applications that need to update the display every +period such as when doing animation.  Under normal conditions, when +the blanking signal is asserted during the entire vertical retrace, this +can also be used to detect this period of blanking, such that a large number +of register accesses can be performed, such as reloading the complete set +of DAC entries. + +
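A common way to use the Vertical Retrace field is to poll until a new retrace period begins. The C sketch below shows the shape of such a loop under stated assumptions: the bit positions (bit 0 for Display Disabled, bit 3 for Vertical Retrace) are taken from the usual VGA register map and are not given in the text above, and the status byte is obtained through a caller-supplied function rather than a real port read so that the sketch can run anywhere. All names are hypothetical.

```c
#include <stdint.h>

#define STATUS_DISPLAY_DISABLED 0x01u  /* assumed: bit 0 */
#define STATUS_VERTICAL_RETRACE 0x08u  /* assumed: bit 3 */

typedef uint8_t (*status_reader)(void *ctx);

/* Wait for the start of the next vertical retrace: first wait out
   any retrace already in progress, then wait for the bit to become
   set. Returns 0 on success, -1 if `max_polls` reads were used up. */
static int wait_for_retrace_start(status_reader read_status, void *ctx,
                                  long max_polls)
{
    long polls = 0;
    while (read_status(ctx) & STATUS_VERTICAL_RETRACE) {
        if (++polls >= max_polls)
            return -1;
    }
    while (!(read_status(ctx) & STATUS_VERTICAL_RETRACE)) {
        if (++polls >= max_polls)
            return -1;
    }
    return 0;
}

/* Stand-in for the real port read: replays a recorded sequence of
   status bytes and repeats the last one once the sequence ends. */
typedef struct {
    const uint8_t *samples;
    unsigned count;
    unsigned next;
} fake_status;

static uint8_t fake_read(void *ctx)
{
    fake_status *f = ctx;
    uint8_t value = f->samples[f->next];
    if (f->next + 1 < f->count)
        f->next++;
    return value;
}
```

On real hardware the reader function would wrap a port input instruction, and the poll limit guards against hanging if the display is not running.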

    Miscellaneous +
            There are a few registers +that affect display generation, but don't fit neatly into the horizontal +or vertical timing categories.  The first is the Sync +Enable field which controls whether the horizontal and vertical sync +signals are sent to the display or masked off.  The sync signals should +be disabled while setting up a new mode to ensure that an improper signal +that could damage the display is not being output.  Keeping the sync +disabled for a period of one or more frames helps the display determine +that a mode change has occurred as well. +
        The Memory Refresh Bandwidth +field is used by the original IBM VGA hardware and some compatible VGA/SVGA +chipsets to control how often the display memory is refreshed.  This +field controls whether the VGA hardware provides 3 or 5 memory refresh +cycles per scanline.  At or above VGA horizontal refresh rates, this +field should be programmed for 3 memory refresh cycles per scanline.  +Below this rate, for compatibility's sake the 5 memory refresh cycles per +scanline setting might be safer; see the Memory +Refresh Bandwidth field for (slightly) more information. + +

      +
    Notice: All trademarks used or referred to on this page are the property +of their respective owners. +
    All pages are Copyright © 1997, 1998, J. D. Neal, except where +noted. Permission for utilization and distribution is subject to the terms +of the FreeVGA Project Copyright License. + + diff --git a/specs/freevga/vga/vgadac.htm b/specs/freevga/vga/vgadac.htm new file mode 100644 index 0000000..4a7c05f --- /dev/null +++ b/specs/freevga/vga/vgadac.htm @@ -0,0 +1,190 @@ + + + + + + + VGA/SVGA Video Programming--DAC Operation + + + +

    Hardware Level VGA and SVGA Video Programming Information +Page
    + +
    DAC Operation  +
    + + +Introduction +
            One of the improvements +the VGA has over the EGA hardware is in the number of possible colors that +can be generated, in addition to an increase in the number of colors that +can be displayed at once. The VGA hardware has provisions for up to 256 +colors to be displayed at once, selected from a range of 262,144 (256K) +possible colors. This capability is provided by the DAC subsystem, which +accepts attribute information for each pixel and converts it into an analog +signal usable by VGA displays. + +

    DAC Subsystem +
            The VGA's DAC subsystem +accepts an 8 bit input from the attribute subsystem and outputs an analog +signal that is presented to the display circuitry. Internally it contains +256 18-bit memory locations that store 6 bits each of red, green, and blue +signal levels which have values ranging from 0 (minimum intensity) to 63 +(maximum intensity). The DAC hardware takes the 8-bit value from the attribute +subsystem and uses it as an index into the 256 memory locations and obtains +a red, green, and blue triad and produces the necessary output. +
            Note -- the DAC subsystem +can be implemented in a number of ways, including discrete components, +in a DAC chip which may or may not contain internal RAM, or even integrated +into the main chipset ASIC itself. Many modern DAC chipsets include additional +functionality such as hardware cursor support, extended color mapping, +video overlay, gamma correction, and other functions. Partly because of +this it is difficult to generalize the DAC subsystem's exact behavior. +This document focuses on the common functionality of all VGA DACs; functionality +specific to a particular chipset is described elsewhere. + +

    Programming the DAC +
            The DAC's primary host interface +(there may be a secondary non-VGA compatible access method) is through +a set of four external registers containing the DAC +Write Address, the DAC Read Address, +the DAC Data, and the DAC +State fields. The DAC memory is accessed by writing an index value +to the DAC Write Address field for write +operations, and to the DAC Read Address +field for read operations. Then reading or writing the DAC +Data field, depending on the selected operation, three times in succession +transfers 3 bytes, each containing 6 bits of red, green, and blue intensity +values, with red being the first value and blue being the last value read/written. +The read or write index then automatically increments such that the next +entry can be read or written without having to reprogram the address. In this +way, the entire DAC memory can be read or written in 768 consecutive I/O +cycles to/from the DAC Data field. The DAC +State field reports whether the DAC is set up to accept reads or writes +next. + +

    Programming Precautions +
            Due to the variances in +the different implementations, programming the DAC takes extra care to +ensure proper operation across the range of possible implementations. There +are a number of things that can cause undesired effects, but the simplest way +to avoid problems is to ensure that you program the DAC +Read Address field or the DAC Write Address +field before each read or write operation (note that a single operation may include +reads/writes to multiple DAC memory entries), and that you always perform writes +and reads in groups of 3 color values. The DAC memory may not be updated +properly otherwise. Reading the value of the DAC +Write Address field may not produce the expected result, as some implementations +may return the current index and some may return the next index. This operation +may even be dependent on whether a read or write operation is being performed. +While it may seem that the DAC implements 2 separate indexes for read and +write, this is often not the case, and interleaving read and write operations +may not work properly without reprogramming the appropriate index. +

      +
    • +Read Operation
    • + +
        +
      • +Disable interrupts (this will ensure that an interrupt service routine will +not change the DAC's state)
      • + +
      • +Output beginning DAC memory index to the DAC +Read Address register.
      • + +
      • +Input red, green, and blue values from the DAC +Data register, repeating for the desired number of entries to be read.
      • + +
      • +Enable interrupts
      • +
      + +
    • +Write Operation
    • + +
        +
      • +Disable interrupts (this will ensure that an interrupt service routine will +not change the DAC's state)
      • + +
      • +Output beginning DAC memory index to the DAC +Write Address register.
      • + +
      • +Output red, green, and blue values to the DAC +Data register, repeating for the desired number of entries to be written.
      • + +
      • +Enable interrupts
      • +
      +
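The two sequences above can be sketched as a small Python model. This is only an illustration: `FakeDac` is a hypothetical stand-in for real port I/O, the port numbers 3C7h, 3C8h, and 3C9h are the usual VGA DAC ports rather than something stated on this page, and the model deliberately uses a single shared index, which is one of the implementation behaviors described above.

```python
# Usual VGA DAC port numbers (an assumption; not given in the text above).
DAC_READ_ADDRESS = 0x3C7   # index for read operations
DAC_WRITE_ADDRESS = 0x3C8  # index for write operations
DAC_DATA = 0x3C9           # accessed three times (R, G, B) per entry


class FakeDac:
    """Hypothetical stand-in for port I/O so the sketch runs anywhere."""

    def __init__(self):
        self.memory = [[0, 0, 0] for _ in range(256)]
        self.index = 0      # current DAC memory entry
        self.component = 0  # 0 = red, 1 = green, 2 = blue

    def outb(self, port, value):
        if port in (DAC_READ_ADDRESS, DAC_WRITE_ADDRESS):
            self.index, self.component = value & 0xFF, 0
        elif port == DAC_DATA:
            self.memory[self.index][self.component] = value & 0x3F
            self._advance()

    def inb(self, port):
        value = self.memory[self.index][self.component]
        self._advance()
        return value

    def _advance(self):
        self.component += 1
        if self.component == 3:  # index auto-increments after a full triple
            self.component = 0
            self.index = (self.index + 1) & 0xFF


def dac_write(io, first_index, colors):
    """Write (red, green, blue) 6-bit triples starting at first_index."""
    # Real code would disable interrupts here and re-enable them at the end.
    io.outb(DAC_WRITE_ADDRESS, first_index)
    for red, green, blue in colors:
        io.outb(DAC_DATA, red)
        io.outb(DAC_DATA, green)
        io.outb(DAC_DATA, blue)


def dac_read(io, first_index, count):
    """Read count (red, green, blue) triples starting at first_index."""
    io.outb(DAC_READ_ADDRESS, first_index)
    return [(io.inb(DAC_DATA), io.inb(DAC_DATA), io.inb(DAC_DATA))
            for _ in range(count)]
```

Note that both helpers program the address before touching the data register and always move whole triples, which is the precaution recommended above.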
    +Eliminating Flicker +
            An important consideration when programming the DAC memory is the possible effects on the display generation. If the DAC memory is accessed by the host CPU at the same time the DAC memory is being used by the DAC hardware, the resulting display output may experience side effects such as flicker or "snow". Note that both reading and writing to the DAC memory have the possibility of causing these effects. The exact effects, if any, are dependent on the specific DAC implementation. Unfortunately, it is not possible to detect when side effects will occur in all circumstances. The best measure is to only access the DAC memory during periods of horizontal or vertical blanking. However, this puts a needless burden on programs run on chipsets that are not affected. If performance is an issue, then allowing the user to select between flicker-prone and flicker-free access methods could possibly improve performance.

    The DAC State +
            The DAC State field seems to be totally useless, as the DAC state is usually known by the programmer and it does not give enough information (about whether a red, green, or blue value is expected next) for an interrupt routine or such to determine the DAC state. However, I can think of one possible use for it. You can use the DAC state to allow an interrupt driven routine to access the palette (like for palette rotation effects or such) while still allowing the main thread to write to the DAC memory. When the interrupt routine executes it should check the DAC state. If the DAC state is in a write state, it should not access the DAC memory. If it is in a read state, the routine should perform the necessary DAC accesses then return the DAC to a read state. This means that the main thread can use the DAC state to control the execution of the ISR. Also it means that it can perform writes to the DAC without having to disable interrupts or otherwise inhibit the ISR.
      + +

    Notice: All trademarks used or referred to on this page are the property +of their respective owners. +
    All pages are Copyright © 1997, 1998, J. D. Neal, except where +noted. Permission for utilization and distribution is subject to the terms +of the FreeVGA Project Copyright License. +
      + + diff --git a/specs/freevga/vga/vgafunc.htm b/specs/freevga/vga/vgafunc.htm new file mode 100644 index 0000000..6d09947 --- /dev/null +++ b/specs/freevga/vga/vgafunc.htm @@ -0,0 +1,402 @@ + + + + + + + VGA/SVGA Video Programming--VGA Functional Index + + + +

    Home Register +Memory Sequencing Cursor +Attribute DAC Display +Misc Back  +
    Hardware Level VGA and SVGA Video Programming Information +Page
    + +
    VGA Functional Index  +
    + + +

    Register Access Functions +
            These fields control the accessibility/inaccessibility of the VGA registers. These registers are used for compatibility with older programs that may attempt to program the VGA in a fashion suited only to an EGA, CGA, or monochrome card.

    +Display Memory Access Functions +
            These fields control the +way the video RAM is mapped into the host CPU's address space and how memory +reads/writes affect the display memory. + +Display Sequencing Functions +
            These fields affect the +way the video memory is serialized for display. + +Cursor Functions +
            These fields affect the +operation of the cursor displayed while the VGA hardware is in text mode. + +Attribute Functions +
            These fields control the +way the video data is submitted to the RAMDAC, providing color/blinking +capability in text mode and facilitating the mapping of colors in graphics +mode. + +DAC Functions +
            These fields allow control +of the VGA's 256-color palette that is part of the RAMDAC. + +Display Generation Functions +
            These fields control the +formatting and timing of the VGA's video signal output. + +Miscellaneous Functions +
            These fields are used to +detect the state of possible VGA hardware such as configuration switches/jumpers +and feature connector inputs. + +Notice: All trademarks used or referred to on this page are the property +of their respective owners. +
    All pages are Copyright © 1997, 1998, J. D. Neal, except where +noted. Permission for utilization and distribution is subject to the terms +of the FreeVGA Project Copyright License. +
      +
      + + diff --git a/specs/freevga/vga/vgafx.htm b/specs/freevga/vga/vgafx.htm new file mode 100644 index 0000000..f7e058f --- /dev/null +++ b/specs/freevga/vga/vgafx.htm @@ -0,0 +1,295 @@ + + + + + + + FreeVGA--Special Effects Hardware + + + +
    Home Intro Windowing +Paging Smooth Scrolling Split-Screen +Back +
    Hardware Level VGA and SVGA Video Programming Information +Page
    + +
    Special Effects Hardware  +
    + +
      +
    • +Introduction -- describes the capabilities of the +VGA special effects hardware.
    • + +
    • +Windowing -- provides rough panning and scrolling +of a larger virtual image.
    • + +
    • +Paging -- provides the ability to switch between +multiple screens rapidly.
    • + +
    • +Smooth Panning and Scrolling -- provides more precise +control when panning and scrolling.
    • + +
    • +Split-Screen Operation -- provides a horizontal division +which allows independent scrolling and panning of the top window.
    • +
    +Introduction +
            This section describes the capabilities of the VGA hardware that can be used to implement special effects such as windowing, paging, smooth panning and scrolling, and split screen operation. These functions are probably the least utilized of all of the VGA's capabilities, possibly because most texts devoted to video hardware provide only brief documentation. Also, the video BIOS provides no support for these capabilities, so the VGA card must be programmed at the hardware level in order to utilize them. Windowing allows a program to view a portion of an image in display memory larger than the current display resolution, providing rough panning and scrolling. Paging allows multiple display screens to be stored in the display memory, allowing rapid switching between them. Smooth panning and scrolling works in conjunction with windowing to provide more precise control of window position. Split-screen operation allows the creation of a horizontal division on the screen that creates a window below that remains fixed in place independent of the panning and scrolling of the window above. These features can be combined to provide powerful control of the display with minimal demand on the host CPU.

    Windowing +
            The VGA hardware has the ability to treat the display as a window which can pan and/or scroll across an image larger than the screen, which is used by some windowing systems to provide a virtual scrolling desktop, and by some games and assembly demos to provide scrolling. Some image viewers use this to allow viewing of images larger than the screen. This capability is not limited to graphics mode; some terminal programs use this capability to provide a scroll-back buffer, and some editors use this to provide an editing screen wider than 80 columns.
            This feature can be implemented by brute force by simply copying the portion of the image to be displayed to the screen. Doing this, however, takes significant processor horsepower. For example, scrolling a 256 color 320x200 display at 30 frames per second by brute force requires a data transfer rate of 1.92 megabytes/second. However, using the hardware capability of the VGA the same operation would require a data transfer rate of only 120 bytes/second. Obviously there is an advantage to using the VGA hardware. However, there are some limitations--one being that the entire screen must scroll (or the top portion of the screen if split-screen mode is used) and the other being that the maximum size of the virtual image is limited to the amount of video memory accessible, although it is possible to redraw portions of the display memory to display larger virtual images.
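The two transfer rates quoted above can be checked with a short calculation. The brute-force figure follows directly from the mode's dimensions. The 120 bytes/second figure works out to four bytes per frame; reading that as an index byte plus a data byte for each of the two Start Address registers is an assumption of this sketch, not something the text states.

```python
def brute_force_bytes_per_second(width, height, bytes_per_pixel, fps):
    """Bytes copied per second when every frame is redrawn in full."""
    return width * height * bytes_per_pixel * fps


def register_bytes_per_second(bytes_per_update, fps):
    """Bytes written per second when only registers change each frame."""
    return bytes_per_update * fps


# 320x200, 256 colors (1 byte per pixel), 30 frames per second.
copy_rate = brute_force_bytes_per_second(320, 200, 1, 30)

# Assumed: 2 registers x (index byte + data byte) = 4 bytes per frame.
hardware_rate = register_bytes_per_second(4, 30)
```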
            In text mode, windowing allows panning at the character resolution. In graphics mode, windowing allows panning at 8-bit resolution and scrolling at scan-line resolution. For more precise control, see Smooth Panning and Scrolling below. Because the VGA BIOS and most programming environments' graphics libraries do not support windowing, you must modify or write your own routines to write to the display for functions such as writing text or graphics. This section assumes that you have the ability to work with the custom resolutions possible when windowing is used.
            In order to understand virtual resolutions it is necessary to understand how the VGA's Start Address High Register, Start Address Low Register, and Offset field work. Because display memory in the VGA is accessed by a 32-bit bus, a 16-bit address is sufficient to uniquely identify any location in the VGA's 256K address space. The Start Address High Register and Start Address Low Register provide such an address. This address is used to specify either the location of the first character in text mode or the position of the first byte of pixels in graphics mode. At the end of the vertical retrace, the current line start address is loaded with this value. This causes one scan line of pixels or characters to be output starting at this address. At the beginning of the next scan-line (or character row in text mode) the value of the Offset field multiplied by the current memory address size * 2 is added to the current line start address. The Double-Word Addressing field and the Word/Byte field specify the current memory address size. If the value of the Double-Word Addressing field is 1, then the current memory address size is 4 (double-word). Otherwise, the Word/Byte field specifies the current memory address size: if the value of the Word/Byte field is 0 then the current memory address size is 2 (word); otherwise, the current memory address size is 1 (byte).
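The address-size rule and the per-line stepping described above can be written out as a minimal sketch. The function names are illustrative only, and the model ignores wrap-around at the end of display memory.

```python
def memory_address_size(double_word_addressing, word_byte):
    """Return the current memory address size: 4, 2, or 1."""
    if double_word_addressing == 1:
        return 4                       # double-word
    return 2 if word_byte == 0 else 1  # word if 0, byte otherwise


def line_start_addresses(start_address, offset, address_size, rows):
    """Line start address for each of the first `rows` scan lines or rows."""
    step = offset * address_size * 2   # added at the start of each new line
    return [start_address + row * step for row in range(rows)]
```

For instance, with an Offset of 32 in byte mode the line start address advances by 64 on each scan line.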
            Normally in graphics modes, the offset register is programmed to represent (after multiplication) the number of bytes in a scan line. This means that (unless a CGA/MDA emulation mode is in effect) scan lines will be arranged sequentially in memory with no space in between, allowing for the most compact representation in display memory. However, this does not have to be the case--in fact, by increasing the value of the offset register we can leave "extra space" between lines. This is what provides for virtual widths. To set a virtual width, program the offset register with the value of the equation:

            Offset = VirtualWidth / ( PixelsPerAddress * MemoryAddressSize * 2 )

    VirtualWidth is the width of the virtual resolution in pixels, and PixelsPerAddress +is the number of pixels per display memory address (1, 2, 4 or 8) depending +on the current video mode. For virtual text modes, the offset register +is programmed with the value of the equation: + +

            Offset = VirtualWidth / ( MemoryAddressSize * 2 )

    In text mode, there is always one character per display memory address. +In standard CGA compatible text modes, MemoryAddressSize is 2 (word). +
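The two Offset equations above translate directly into code. This sketch simply evaluates the formulas as written; rejecting widths that do not divide evenly is a choice made here, since the text does not say what to do with a remainder.

```python
def graphics_offset(virtual_width, pixels_per_address, memory_address_size):
    """Offset value for a virtual width given in pixels."""
    divisor = pixels_per_address * memory_address_size * 2
    if virtual_width % divisor:
        raise ValueError("virtual width is not a multiple of %d" % divisor)
    return virtual_width // divisor


def text_offset(virtual_width, memory_address_size):
    """Offset value for a virtual width given in characters."""
    # One character per display memory address, so PixelsPerAddress drops out.
    return graphics_offset(virtual_width, 1, memory_address_size)
```

A 512-pixel virtual width at 8 pixels per address in byte mode gives `graphics_offset(512, 8, 1)`, which evaluates to 32 (20h).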
            After you have programmed the new offset, the screen will display only a portion of a virtual display. The screen will display the number of scan-lines specified by the current mode. If the display reaches the last byte of memory, it wraps around to the first byte of memory; thus the maximum height of a virtual screen depends on the width of the virtual screen. Remember that the Start Address specifies the display memory address of the upper-left hand character or pixel. By increasing the Start Address by the number of bytes in a virtual scan-line (or character row), the display will scroll one scan-line or character row vertically downwards. By increasing the Start Address by less than the number of bytes in a scan line, you can move the virtual window horizontally to the right. If the virtual width is the same as the actual width, one can create a vertical scrolling mode. This is used sometimes as an "elevator" mode or to provide rapid scrollback capability in text mode. If the virtual height is the same as the actual height, then only horizontal panning is possible, sometimes called "panoramic" mode. In any case, the equation for calculating the Start Address is:

            Start Address = StartingOffset + Y * BytesPerVirtualRow + X

    Y is the vertical position, from 0 to the value of VirtualHeight - ActualHeight. X is the horizontal position, from 0 to the value of BytesPerVirtualRow - BytesPerActualRow. These ranges prevent wrapping around to the left side of the screen, although you may find it useful to use the wrap-around for whatever your purpose. Note that the wrap-around simply starts displaying the next row/scan-line rather than the current one, so it is not that useful (except when using programming techniques that take this factor into account). Normally StartingOffset is 0, but if paging or split-screen mode is being used, or even if you simply want to relocate the screen, you must change the starting offset to the address of the upper-left hand pixel of the virtual screen.
            For example, a 512x300 virtual +screen in a 320x200 16-color 1 bit/pixel planar display would require 512 +pixels / 8 pixels/byte = 64 bytes per row and 64 bytes/row * 300 lines += 19200 bytes per screen. Assuming the VGA is in byte addressing mode, +this means that we need to program the offset register Offset +field with 512 pixels / (8 pixels/byte * 1 * 2) = 32 (20h). Adding one +to the start address will move the display screen to the right eight pixels. +More precise control is provided by the smooth scrolling mechanism. Adding +64 to the start address will move the virtual screen down one scan line. +See the following chart which shows the virtual screen when the start address +is calculated with an X and Y of 0: +

    Click for Textified Virtual Screen Mode Example
    + + +
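The Start Address equation and its X/Y ranges can be sketched as follows, using the 512x300 virtual screen from the example above (64 bytes per virtual row, 40 bytes per actual 320-pixel row). The helper names are hypothetical, and the split into high and low bytes assumes the High register holds bits 15-8 and the Low register bits 7-0.

```python
def start_address(x, y, bytes_per_virtual_row, bytes_per_actual_row,
                  virtual_height, actual_height, starting_offset=0):
    """Start Address for a window whose top-left corner is at (x, y)."""
    if not 0 <= y <= virtual_height - actual_height:
        raise ValueError("y would scroll past the bottom of the virtual screen")
    if not 0 <= x <= bytes_per_virtual_row - bytes_per_actual_row:
        raise ValueError("x would wrap around onto the next row")
    return starting_offset + y * bytes_per_virtual_row + x


def split_start_address(address):
    """Split a 16-bit Start Address into (high byte, low byte)."""
    return (address >> 8) & 0xFF, address & 0xFF


# Bottom-right corner of the 512x300 virtual screen on a 320x200 display.
corner = start_address(24, 100, 64, 40, 300, 200)
high, low = split_start_address(corner)
```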

    Paging +
            The video display memory may be able to hold more than one screen of data (or virtual screen if virtual resolutions are used). These multiple screens, called pages, allow rapid switching between them. As long as the pages have the same actual (and virtual if applicable) resolution, switching requires only changing the Start Address, as given by the Start Address High Register and Start Address Low Register pair, to point to the memory address of the first byte of the page (or setting the StartingOffset term in the equation for virtual resolutions to the first memory address of the page). If they have different virtual widths, then the Offset field must be reprogrammed. It is possible to store both graphics and text pages simultaneously in memory, in addition to different graphics mode pages. In this case, the video mode must be changed when changing pages. In addition, in text mode the Cursor Location must be reprogrammed for each page if it is to be displayed. Paging also allows for double buffering of the display -- the CPU can write to one page while the VGA hardware is displaying another. By switching between pages during the vertical retrace period, flicker free screen updates can be implemented.
            An example of paging is that used by the VGA BIOS in the 80x25 text mode. Each page of text takes up 2000 memory address locations, and the VGA uses a 32K memory aperture, with Odd/Even addressing enabled. Because Odd/Even addressing is enabled, each page of text takes up 4000 bytes in host memory, thus 32768 / 4000 = 8 (rounded down) pages can be provided and can be accessed at one time by the CPU. Each page starts at a multiple of 4096 (1000h) in host memory. Because the display controller circuitry works independently of the host memory access mode, each page starts at a display address that is a multiple of 2048 (800h), thus the Start Address is programmed to the value obtained by multiplying the page to be displayed by 2048 (800h). See the following chart which shows the arrangement of these pages in display memory:
      +

    Click here to display a textified Paging Memory Utilization Example
    + + +
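The page arithmetic for the 80x25 text mode example above is small enough to write out. The constants come straight from the text; the function names are illustrative.

```python
APERTURE_BYTES = 32 * 1024   # 32K memory aperture
PAGE_HOST_BYTES = 4000       # 2000 address locations, 2 host bytes each
HOST_PAGE_STRIDE = 0x1000    # pages start at multiples of 4096 in host memory
DISPLAY_PAGE_STRIDE = 0x800  # and at multiples of 2048 in display addresses


def page_count():
    """Number of whole text pages that fit in the aperture."""
    return APERTURE_BYTES // PAGE_HOST_BYTES


def page_host_offset(page):
    """Offset of a page from the start of the aperture, as seen by the CPU."""
    return page * HOST_PAGE_STRIDE


def page_start_address(page):
    """Value to program into the Start Address to display a page."""
    return page * DISPLAY_PAGE_STRIDE
```

Displaying page 3, for example, means programming a Start Address of 1800h, while the CPU finds that page 3000h bytes into the aperture.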

    Smooth Panning and Scrolling +
            Because the Start Address field only provides for scrolling and panning at the memory address level, more precise panning and scrolling capability is needed to scroll at the pixel level, as multiple pixels may reside at the same memory address. This is especially true in text mode, where the Start Address field only allows panning and scrolling at the character level.
            Pixel level panning is controlled by the Pixel Shift Count and Byte Panning fields. The Pixel Shift Count field specifies the number of pixels to shift left. In all graphics modes and text modes except 9 dot text modes and 256 color graphics modes, the Pixel Shift Count is defined for values 0-7. This provides the pixel level control not provided by the Start Address Register or the Byte Panning fields. In 9 dot text modes the Pixel Shift Count field is defined for values 8 and 0-7, with 8 being the minimum shift amount and 7 being the maximum. In 256 color graphics modes, due to the way the hardware makes a 256 color value by combining two 4-bit values, the Pixel Shift Count field is only defined for values 0, 2, 4, and 6. Values 1, 3, 5, and 7 cause the screen to be distorted due to the hardware combining 4 bits from each of 2 adjacent pixels. The Byte Panning field is added to the Start Address Register when determining the address of the top-left hand corner of the screen, and has a value from 0-3. Combined, both panning fields allow a shift of 15, 31, or 35 pixels, dependent upon the video mode. Note that programming the Pixel Shift Count field to an undefined value may cause undesired effects and these effects are not guaranteed to be identical on all chipsets, so it is best avoided.
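The defined Pixel Shift Count values and the combined maximum shifts of 15, 31, and 35 pixels can be reproduced with a small table. The mode labels are invented for this sketch, and the number of pixels covered by one Byte Panning step (8, 9, or 4) is an inference from those maximums rather than something stated above.

```python
# Defined Pixel Shift Count values, listed from smallest to largest shift.
VALID_PIXEL_SHIFT_COUNTS = {
    "8-dot": (0, 1, 2, 3, 4, 5, 6, 7),
    "9-dot text": (8, 0, 1, 2, 3, 4, 5, 6, 7),  # 8 is the minimum shift
    "256-color": (0, 2, 4, 6),
}

# Assumed pixels per Byte Panning step (inferred from the 31/35/15 totals).
PIXELS_PER_BYTE_PAN = {"8-dot": 8, "9-dot text": 9, "256-color": 4}


def total_shift_pixels(mode, byte_panning, pixel_shift_count):
    """Total left shift in pixels for a pair of panning field values."""
    values = VALID_PIXEL_SHIFT_COUNTS[mode]
    if pixel_shift_count not in values:
        raise ValueError("Pixel Shift Count is undefined in this mode")
    if not 0 <= byte_panning <= 3:
        raise ValueError("Byte Panning must be 0-3")
    # The position in the tuple is the shift, in pixels, for that value.
    return byte_panning * PIXELS_PER_BYTE_PAN[mode] + values.index(pixel_shift_count)
```

Rejecting undefined values up front mirrors the advice above to avoid them, since their effect differs between chipsets.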
            Pixel level scrolling is +controlled by the Preset Row Scan field. This +field may take any value from 0 up to the value of the Maximum +Scan Line field; anything greater causes interesting artifacts (there +is no guarantee that the result will be the same for all VGA chipsets.) +Incrementing this value will shift the screen upwards by one scan line, +allowing for smooth scrolling in modes where the Offset field does not +provide precise control. + +

    Split-screen Operation +
            The VGA hardware provides +the ability to specify a horizontal division which divides the screen into +two windows which can start at separate display memory addresses. In addition, +it provides the facility for panning the top window independent of the +bottom window. The hardware does not provide for split-screen modes where +multiple video modes are possible in one display screen as provided by +some non-VGA graphics controllers. In addition, there are some limitations, +the first being that the bottom window's starting display memory address +is fixed at 0. This means that (unless you are using split screen mode +to duplicate memory on purpose) the bottom screen must be located first +in memory and followed by the top. The second limitation is that either +both windows are panned by the same amount, or only the top window pans, +in which case, the bottom window's panning values are fixed at 0. Another +limitation is that the Preset Row Scan field +only applies to the top window -- the bottom window has an effective Preset +Row Scan value of 0. +
            The Line Compare field in +the VGA, of which bit 9 is in the Maximum Scan +Line Register, bit 8 is in the Overflow Register, +and bits 7-0 are in the Line Compare Register, +specifies the scan line address of the horizontal division. When the line +counter reaches the value in the Line Compare Register, the current scan +line start address is reset to 0. If the Pixel +Panning Mode field is set to 1 then the Pixel +Shift Count and Byte Panning fields are +reset to 0 for the remainder of the display cycle allowing the top window +to pan while the bottom window remains fixed. Otherwise, both windows pan +by the same amount. +
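Because the Line Compare field is spread over three registers, a program has to split the 10-bit scan line number before writing it. A minimal sketch of that split follows; the dictionary keys are made-up labels, and where each bit sits inside the Overflow and Maximum Scan Line registers is not covered here.

```python
def split_line_compare(scan_line):
    """Split a 10-bit Line Compare value into its three register parts."""
    if not 0 <= scan_line <= 0x3FF:
        raise ValueError("Line Compare is a 10-bit field")
    return {
        "line_compare_register": scan_line & 0xFF,      # bits 7-0
        "overflow_bit": (scan_line >> 8) & 1,           # bit 8
        "maximum_scan_line_bit": (scan_line >> 9) & 1,  # bit 9
    }


def join_line_compare(parts):
    """Rebuild the 10-bit Line Compare value from its three parts."""
    return (parts["maximum_scan_line_bit"] << 9
            | parts["overflow_bit"] << 8
            | parts["line_compare_register"])
```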
      + +

    Notice: All trademarks used or referred to on this page are the property +of their respective owners. +
    All pages are Copyright © 1997, 1998, J. D. Neal, except where +noted. Permission for utilization and distribution is subject to the terms +of the FreeVGA Project Copyright License. + + diff --git a/specs/freevga/vga/vgamem.htm b/specs/freevga/vga/vgamem.htm new file mode 100644 index 0000000..bef7f09 --- /dev/null +++ b/specs/freevga/vga/vgamem.htm @@ -0,0 +1,334 @@ + + + + + + + VGA/SVGA Video Programming--Accessing the VGA Display Memory + + + +

    Home Intro Detecting +Mapping Addressing Manipulating +Reading Writing Back  +
    Hardware Level VGA and SVGA Video Programming Information +Page
    + +
    Accessing the VGA Display Memory  +
    + + +Introduction +
            The standard VGA hardware contains up to 256K of onboard display memory. While it would seem logical that this memory would be directly available to the processor, this is not the case. The host CPU accesses the display memory through a window of up to 128K located in the high memory area. (Note that many SVGA chipsets provide an alternate method of accessing video memory directly, called a Linear Frame Buffer.) Thus in order to be able to access display memory you must deal with registers that control the mapping into host address space. To further complicate things, the VGA hardware provides support for memory models similar to those used by the monochrome, CGA, EGA, and MCGA adapters. In addition, due to the way the VGA handles 16 color modes, additional hardware is included that can speed access immensely. Also, hardware is present that allows the programmer to rapidly copy data from one area of display memory to another. While it is quite complicated to understand, learning to utilize the VGA's hardware at a low level can vastly improve performance. Many game programmers utilize the BIOS mode 13h, simply because it offers the simplest memory model and doesn't require having to deal with the VGA's registers to draw pixels. However, this same decision prevents them from using the infamous X modes, or higher resolution modes.

    Detecting the Amount of Display Memory on the +Adapter +
            Most VGA cards in existence have 256K on board; however there is the possibility that some VGA boards have less. To determine whether the card actually has 256K, one must write to display memory and read back values. If RAM is not present in a location, then the value read back will not equal the value written. It is wise to utilize multiple values when doing this, as the undefined result may equal the value written. Also, the card may alias addresses, causing say the same 64K of RAM to appear 4 times in the 256K address space, thus it is wise to change an address and see if the change is reflected anywhere else in display memory. In addition, the card may buffer one location of video memory in the chipset, making it appear that there is RAM at an address where there is none present, so you may have to read or write to a second location to clear the buffer. Note that if the Extended Memory field is not set to 1, the adapter appears to only have 64K onboard, thus this bit should be set to 1 before attempting to determine the memory size.

    Mapping of Display Memory into CPU Address +Space +
            The first element that defines +this mapping is whether or not the VGA decodes accesses from the CPU. This +is controlled by the RAM Enable field. +If display memory decoding is disabled, then the VGA hardware ignores writes +to its address space. The address range that the VGA hardware decodes is +based upon the Memory Map Select field. The +following table shows the address ranges in absolute 32-bit form decoded +for each value of this field: +

      +
    • +00 -- A0000h-BFFFFh -- 128K
    • + +
    • +01 -- A0000h-AFFFFh -- 64K
    • + +
    • +10 -- B0000h-B7FFFh -- 32K
    • + +
    • +11 -- B8000h-BFFFFh -- 32K
    • +
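The table above maps naturally onto a lookup. The sketch below only models which host addresses the VGA would decode; the function names and the boolean `ram_enable` parameter are illustrative stand-ins for the Memory Map Select and RAM Enable fields.

```python
# Memory Map Select value -> (first, last) host address decoded.
MEMORY_MAP_RANGES = {
    0b00: (0xA0000, 0xBFFFF),  # 128K
    0b01: (0xA0000, 0xAFFFF),  # 64K
    0b10: (0xB0000, 0xB7FFF),  # 32K
    0b11: (0xB8000, 0xBFFFF),  # 32K
}


def window_size(memory_map_select):
    """Size in bytes of the decoded host address window."""
    first, last = MEMORY_MAP_RANGES[memory_map_select]
    return last - first + 1


def decodes(memory_map_select, address, ram_enable=True):
    """True if the VGA would respond to a CPU access at this address."""
    if not ram_enable:
        return False  # display memory decoding disabled
    first, last = MEMORY_MAP_RANGES[memory_map_select]
    return first <= address <= last
```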
    +Note -- It would seem that by setting the Memory +Map Select field to 00 and then using planar memory access that you +could gain access to more than 256K of memory on an SVGA card. However, +I have found that some cards simply mirror the first 64K twice within the +128K address space. This memory map is intended for use in the Chain Odd/Even +modes, eliminating the need to use the Odd/Even Page Select field. Also +I have found that MS-DOS memory managers don't like this very much and +are likely to lock up the system if configured to use the area from B0000h-B7FFFh +for loading device drivers high. + +

    Host Address to Display Address Translation +
            The most complicated part of accessing display memory involves the translation between a host address and a display memory address. Internally, the VGA has 64K 32-bit memory locations. These are divided into four 64K bit planes. Because the VGA was designed for 8 and 16 bit bus systems, and due to the way the Intel chips handle memory accesses, it is impossible for the host CPU to access the bit planes directly, instead relying on I/O registers to make part of the memory accessible. The most straightforward display translation is where a host access translates directly to a display memory address. Which part of the particular 32-bit memory location is accessed depends on certain registers and is discussed in more detail in Manipulating Display Memory below. The VGA has three modes for addressing, Chain 4, Odd/Even mode, and normal mode:

      +
    • +Chain 4: This mode is used for MCGA emulation in the 320x200 256-color +mode. The address is mapped to memory MOD 4 (shifted right 2 places.)
    • +
    +<More to be added here.> + +

    Manipulating Display Memory +
            The VGA hardware contains hardware that can perform bit manipulation on data and allow the host to operate on all four display planes in a single operation. These features are fairly straightforward, yet complicated enough that most VGA programmers choose to ignore them. This is unfortunate, as proper utilization of these registers is crucial to programming the VGA's 16 color modes. Also, knowledge of this functionality can in many cases enhance performance in other modes including text and 256 color modes. In addition to normal read and write operations the VGA hardware provides enhanced operations such as the ability to perform rapid comparisons, to write to multiple planes simultaneously, and to rapidly move data from one area of display memory to another, as well as faster logical operations (AND/OR/XOR), bit rotation, and masking.

    Reading from Display Memory +
            The VGA hardware has two read modes, selected by the Read Mode field. The first is a straightforward read of one or more consecutive bytes (depending on whether a byte, word or dword operation is used) from one bit plane. The value of the Read Map Select field is the plane that will be read from. The second read mode returns the result of a comparison of the display memory and the Color Compare field, masked by the Color Don't Care field. This mode can be used to rapidly perform up to 32 pixel comparisons in one operation in the planar video modes, which is helpful for the implementation of fast flood-fill routines. A read from display memory also loads a 32 bit latch register, one byte from each plane. This latch register is not directly accessible from the host CPU; rather it can be used as data for the various write operations. The latch register retains its value until the next read and thus may be used with more than one write operation.
           The two read modes, simply called +Read Mode 0-1 based on the value of the Read +Mode field are: +

      +
    • +Read Mode 0:
    • + +
             Read Mode 0 is used to read one byte from a single plane of display memory. The plane read is the value of the Read Map Select field. In order to read a single pixel's value in planar modes, four read operations must be performed, one for each plane. If more than one byte's worth of data is being read from the screen it is recommended that you read it a plane at a time instead of having to perform four I/O operations to the Read Map Select field for each byte, as this will allow the use of faster string copy instructions and reduce the number of I/O operations performed.
    • +Read Mode 1:
    • + +
              Read Mode 1 is used to perform comparisons against a reference color, specified by the Color Compare field. If a bit is set in the Color Don't Care field then the corresponding color plane is considered by the comparison, otherwise it is ignored. Each bit in the returned result represents one comparison between a pixel and the reference color from the Color Compare field, with the bit being set if the comparison is true. This mode is mainly used by flood fill algorithms that fill an area of a specific color, as it requires 1/4 the number of reads to determine the area that needs to be filled, and the hardware also does the work of the comparison. Also an efficient "search and replace" operation that replaces one color with another can be performed when this mode is combined with Write Mode 3.
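The two read modes can be modeled in a few lines of Python operating on the four plane bytes found at one display memory address. This is a sketch of the behavior described above, not hardware-exact code; it assumes that plane n supplies bit n of each pixel's color and of the Color Compare and Color Don't Care fields.

```python
def read_mode_0(planes, read_map_select):
    """Return the byte from the plane chosen by Read Map Select."""
    return planes[read_map_select]


def read_mode_1(planes, color_compare, color_dont_care):
    """Return one result bit per pixel: 1 where the pixel matches."""
    result = 0
    for bit in range(8):                      # 8 pixels per address
        match = True
        for plane in range(4):
            if (color_dont_care >> plane) & 1:  # plane takes part in compare
                pixel_bit = (planes[plane] >> bit) & 1
                if pixel_bit != (color_compare >> plane) & 1:
                    match = False
        if match:
            result |= 1 << bit
    return result


# Eight pixels with colors 7, 3, 5, 1, 6, 2, 4, 0 (bit 7 down to bit 0).
planes = [0b11110000, 0b11001100, 0b10101010, 0b00000000]
```

With these planes, comparing against color 5 on all four planes sets only bit 5 of the result, while comparing plane 0 alone against a set bit returns the plane 0 byte unchanged.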
    +Writing to Display Memory +
            The VGA has four write modes, selected by the Write Mode field. This controls how the write operation and host data affect the display memory. The VGA, depending on the Write Mode field, performs up to five distinct operations before the write affects display memory. Note that not all write modes use all of the pipelined stages in the write hardware, and others use some of the pipelined stages in different ways.
            The first of these allows the VGA hardware to perform a bitwise rotation on the data written from the host. This is accomplished via a barrel rotator that rotates the bits to the right by the number of positions specified by the Rotate Count field. This performs the same operation as the 8086 ROR instruction, shifting bits to the right (from bit 7 towards bit 0), with the bit shifted out of position 0 being "rolled" into position 7. Note that if the Rotate Count field is zero then no rotation is performed.
            The second uses the Enable +Set/Reset and Set/Reset fields. These +fields can provide an additional data source in addition to the data written +and the latched value from the last read operation performed. Normally, +data from the host is replicated four times, one for each plane. In this +stage, a 1 bit in the Enable Set/Reset field +will cause the corresponding bit plane to be replaced by the bit value +in the corresponding Set/Reset field location, +replicated 8 times to fill the byte, giving it either the value 00000000b +or 11111111b. If the Enable Set/Reset field +for a given plane is 0 then the host data byte is used instead. Note that +in some write modes, the host data byte is used for other purposes, and +the set/reset register is always used as data, and in other modes the set/reset +mechanism is not used at all. +
          The third stage performs logical operations between the host data, which has been split into four planes and is now 32 bits wide, and the latch register, which provides a second 32-bit operand. The Logical Operation field selects the operation that this stage performs. The four possibilities are: NOP (the host data is passed directly through, performing no operation), AND (the data is logically ANDed with the latched data), OR (the data is logically ORed with the latched data), and XOR (the data is logically XORed with the latched data). The result of this operation is then passed on, whilst the latched data remains unchanged, available for use in successive operations.
            In the fourth stage, individual +bits may be selected from the result or copied from the latch register. +Each bit of the Bit Mask field determines +whether the corresponding bits in each plane are the result of the previous +step or are copied directly from the latch register. This allows the host +CPU to modify only a single bit, by first performing a dummy read to fill +the latch register +
            The fifth stage allows specification +of what planes, if any a write operation affects, via the Memory +Plane Write Enable field. The four bits in this field determine whether +or not the write affects the corresponding plane If the a planes bit is +1 then the data from the previous step will be written to display memory, +otherwise the display buffer location in that plane will remain unchanged. +
            The four write modes, of +which the current one is set by writing to the Write +Mode field The four write modes, simply called write modes 0-3, based +on the value of the Write Mode field are: +
      +
    • +Write Mode 0:
    • + +
              Write Mode 0 is the standard +and most general write mode. While the other write modes are designed to +perform a specific task, this mode can be used to perform most tasks as +all five operations are performed on the data. The data byte from the host +is first rotated as specified by the Rotate Count +field, then is replicated across all four planes. Then the Enable +Set/Reset field selects which planes will receive their values from +the host data and which will receive their data from that plane's Set/Reset +field location. Then the operation specified by the Logical +Operation field is performed on the resulting data and the data in +the read latches. The Bit Mask field is then +used to select between the resulting data and data from the latch register. +Finally, the resulting data is written to the display memory planes enabled +in the Memory Plane Write Enable field. +
    • +Write Mode 1:
    • + +
              Write Mode 1 is used to +transfer the data in the latch register directly to the screen, affected +only by the Memory Plane Write Enable field. +This can facilitate rapid transfer of data on byte boundaries from one +area of video memory to another or filling areas of the display with a +pattern of 8 pixels. When Write Mode 0 is used with the Bit +Mask field set to 00000000b the operation of the hardware is identical +to this mode, although it is entirely possible that this mode is faster +on some cards. +
    • +Write Mode 2:
    • + +
              Write Mode 2 is used to +unpack a pixel value packed into the lower 4 bits of the host data byte +into the 4 display planes. In the byte from the host, the bit representing +each plane will be replicated across all 8 bits of the corresponding planes. +Then the operation specified by the Logical Operation +field is performed on the resulting data and the data in the read latches. +The Bit Mask field is then used to select +between the resulting data and data from the latch register. Finally, the +resulting data is written to the display memory planes enabled in the Memory +Plane Write Enable field. +
    • +Write Mode 3:
    • + +
              Write Mode 3 is used +when the color written is fairly constant but the Bit +Mask field needs to be changed frequently, such as when drawing single +color lines or text. The value of the Set/Reset +field is expanded as if the Enable Set/Reset +field were set to 1111b, regardless of its actual value. The host data +is first rotated as specified by the Rotate Count +field, then is ANDed with the Bit Mask field. +The resulting value is used where the Bit Mask +field normally would be used, selecting data from either the expansion +of the Set/Reset field or the latch register. +Finally, the resulting data is written to the display memory planes enabled +in the Memory Plane Write Enable field.
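The four write modes described above can be modeled in software. The following is a minimal sketch in C, not a definitive implementation: the `vga_write_state` structure, its member names, and the helper functions are hypothetical and exist only for illustration. The sketch follows the description on this page literally (for example, write mode 3 applies no logical operation here because the text above does not mention one), and it does not model register I/O or memory addressing.

```c
#include <stdint.h>

/* Hypothetical software model of the write path; names are illustrative. */
typedef struct {
    uint8_t rotate_count;      /* Rotate Count field (0-7) */
    uint8_t enable_set_reset;  /* Enable Set/Reset field, one bit per plane */
    uint8_t set_reset;         /* Set/Reset field, one bit per plane */
    uint8_t logical_op;        /* 0 = NOP, 1 = AND, 2 = OR, 3 = XOR */
    uint8_t bit_mask;          /* Bit Mask field */
    uint8_t plane_enable;      /* Memory Plane Write Enable field */
    uint8_t write_mode;        /* Write Mode field (0-3) */
    uint8_t latch[4];          /* latch register, one byte per plane */
} vga_write_state;

/* Rotate right like the 8086 ROR instruction; a count of 0 changes nothing. */
static uint8_t ror8(uint8_t v, unsigned n)
{
    n &= 7;
    return (uint8_t)((v >> n) | (v << ((8 - n) & 7)));
}

/* Replicate one bit of a 4-bit field 8 times: 00000000b or 11111111b. */
static uint8_t expand_bit(uint8_t field, int plane)
{
    return ((field >> plane) & 1) ? 0xFF : 0x00;
}

/* Logical operation stage: combine the data with the latched byte. */
static uint8_t alu(const vga_write_state *s, uint8_t data, int plane)
{
    switch (s->logical_op & 3) {
    case 1:  return data & s->latch[plane];
    case 2:  return data | s->latch[plane];
    case 3:  return data ^ s->latch[plane];
    default: return data;
    }
}

/* Model one host byte write; mem[p] is the addressed byte in plane p. */
static void vga_write(const vga_write_state *s, uint8_t host, uint8_t mem[4])
{
    for (int p = 0; p < 4; p++) {
        uint8_t mask = s->bit_mask;
        uint8_t data;

        if (!((s->plane_enable >> p) & 1))
            continue;                    /* plane is not write-enabled */

        switch (s->write_mode & 3) {
        case 0:   /* rotate, set/reset, logical operation, bit mask */
            data = ((s->enable_set_reset >> p) & 1)
                       ? expand_bit(s->set_reset, p)
                       : ror8(host, s->rotate_count);
            data = alu(s, data, p);
            break;
        case 1:   /* latch contents go straight to memory */
            mem[p] = s->latch[p];
            continue;
        case 2:   /* low 4 host bits are a pixel value, one bit per plane */
            data = alu(s, expand_bit(host, p), p);
            break;
        default:  /* mode 3: rotated host data ANDed with Bit Mask is the mask */
            mask = ror8(host, s->rotate_count) & s->bit_mask;
            data = expand_bit(s->set_reset, p);
            break;
        }
        /* Bit mask stage: 1 bits take the new data, 0 bits take the latch. */
        mem[p] = (uint8_t)((data & mask) | (s->latch[p] & ~mask));
    }
}
```

A model like this can be handy for checking what a given register setup should produce before trying it on real hardware.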
    +Notice: All trademarks used or referred to on this page are the property +of their respective owners. +
    All pages are Copyright © 1997, 1998, J. D. Neal, except where +noted. Permission for utilization and distribution is subject to the terms +of the FreeVGA Project Copyright License. +
      +
      + + diff --git a/specs/freevga/vga/vgareg.htm b/specs/freevga/vga/vgareg.htm new file mode 100644 index 0000000..45d3162 --- /dev/null +++ b/specs/freevga/vga/vgareg.htm @@ -0,0 +1,508 @@ + + + + + + + VGA/SVGA Video Programming--Accessing the VGA Registers + + + +
    Hardware Level VGA and SVGA Video Programming Information +Page
    + +
    Accessing the VGA Registers  +
    + + +Introduction +
            This section discusses methods +of manipulating the particular registers present in VGA hardware. The method +of access depends upon which register one is accessing, and is sometimes +difficult to understand. The VGA has many more registers +than it has I/O ports, thus it must provide a way to re-use or multiplex +many registers onto a relatively small number of ports. All of the VGA +registers are accessed by inputting and outputting bytes to I/O ports; however, +in many cases it is necessary to perform additional steps to ready the +VGA adapter for reading and writing data. Port addresses are given in +hexadecimal, such as 3C2h. + +

    General Advice +
            If a program takes +control of the video card and changes its state, it is considered good +programming practice to keep track of the original values of any register +it changes such that upon termination (normal or abnormal) it can write +them back to the hardware to restore the state. Anyone who has seen a graphics +application abort in the middle of a graphics screen knows how annoying +this can be. Almost all of the VGA registers can be saved and restored +in this fashion. In addition when changing only a particular field of a +register, the value of the register should be read and the byte should +be masked so that only the field one is trying to change is actually changed. + +

    I/O Fudge Factor +
            Often a hardware device +is not capable of handling I/O accesses as fast as the processor can issue +them. In this case, a program must provide adequate delay between I/O accesses +to the same device. While many modern chipsets provide this delay in hardware, +there are still many implementations in existence that do not provide this +delay. If you are attempting to write programs for the largest possible +variety of hardware configurations, then it is necessary to know the amount +of delay necessary. Unfortunately, this delay is not often specified, and +varies from one VGA implementation to another. In the interest of performance +it is ideal to keep this delay to the minimum necessary. In the interest +of compatibility it is necessary to implement a delay independent of clock +speed. (Faster processors are continuously being developed, and also a +user may change clock speed dynamically via the Turbo button on their case.) + +

    Paranoia +
            If one wishes to be extra +cautious when writing to registers, after writing to a register one can +read the value back and compare it with the value written. If they differ +it may mean that the VGA hardware has a stuck bit in one of its registers, +that you are attempting to modify a locked or unsupported register, or +that you are not providing enough delay between I/O accesses. As long as +reading the register twice doesn't have any unintended side effects, when +reading a register's value, one can read the register twice and compare +the values read, after masking out any fields that may change without CPU +intervention. If the values read back are different it may mean that you +are not providing enough delay between I/O accesses, that the hardware +is malfunctioning, or that you are reading the wrong register or field. Other problems +that these techniques can address are noise on the I/O bus due to faulty +hardware, dirty contacts, or even sunspots! When performing I/O operations, +if these checks fail, try repeating the operation, possibly with increased +I/O delay time. By providing extra robustness, I have found that my own +programs will work properly on hardware that causes less robust programs +to fail. + +
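The write-then-verify idea above might be sketched in C as follows. This is only an illustration under stated assumptions: `port_in`, `port_out`, and `io_delay` are hypothetical stand-ins (backed here by a plain array so the sketch runs anywhere) for whatever port I/O and delay primitives your environment provides, and the retry policy of doubling the delay is an arbitrary choice, not something this page prescribes.

```c
#include <stdint.h>
#include <stdbool.h>

/* Simulated I/O space (assumption): replace with real port I/O primitives. */
static uint8_t fake_ports[0x400];

static uint8_t port_in(uint16_t port)             { return fake_ports[port & 0x3FF]; }
static void    port_out(uint16_t port, uint8_t v) { fake_ports[port & 0x3FF] = v; }

/* Crude placeholder delay; a real one should not depend on CPU clock speed. */
static void io_delay(unsigned loops)
{
    volatile unsigned i;
    for (i = 0; i < loops; i++) { }
}

/*
 * Write `value` to `write_port`, read it back from `read_port`, and compare
 * only the bits selected by `mask` (bits that can change without CPU
 * intervention should be left out of the mask). Retries with a longer delay.
 * Returns true once the read-back matches, false after `max_tries` failures.
 */
static bool write_verified(uint16_t write_port, uint16_t read_port,
                           uint8_t value, uint8_t mask, int max_tries)
{
    unsigned delay = 1;

    for (int attempt = 0; attempt < max_tries; attempt++) {
        port_out(write_port, value);
        io_delay(delay);
        if (((port_in(read_port) ^ value) & mask) == 0)
            return true;
        delay *= 2;   /* possibly not enough delay: back off and retry */
    }
    return false;
}
```

The separate read and write ports allow for registers that are written at one address and read at another.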

    Accessing the External Registers +
            The external registers are +the easiest to program, because they each have their own separate I/O address. +Reading and writing to them is as simple as inputting and outputting bytes +to their respective port address. Note, however, that some are read and written +at different ports; for example, the Miscellaneous +Output Register is written at port 3C2h, but is read at port 3CCh. The +reason for this is backwards compatibility with the EGA and previous +adapters. Many registers in the EGA were write only, and thus the designers +placed read-only registers at the same location as write-only ones. However, +the biggest complaint programmers had with the EGA was the inability to +read the EGA's video state and thus in the design of the VGA most of these +write-only registers were changed to read/write registers. However, for +backwards compatibility, the read-only register had to remain at 3C2h, +so they used a different port for reading the formerly write-only register. + +

    Accessing the Sequencer, Graphics, and CRT +Controller Registers +
            These registers are accessed +in an indexed fashion. Each of the three has two unique read/write ports +assigned to it. The first port is the Address Register for the group. +The other is the Data Register for the group. By writing a byte to the +Address Register equal to the index of the particular sub-register you +wish to access, one can address the data pointed to by that index by reading +and writing the Data Register. The current value of the index can be read +by reading the Address Register. It is best to save this value and restore +it after writing data, particularly so in an interrupt routine because +the interrupted process may have been in the middle of writing to the same register +when the interrupt occurred. To read and write a data register in one of +these register groups perform the following procedure: +

      +
    1. +Input the value of the Address Register and save it for step 6.
    2. +Output the index of the desired Data Register to the Address Register.
    3. +Read the value of the Data Register and save it for later restoration upon +termination, if needed.
    4. +If writing, modify the value read in step 3, making sure to mask off bits +not being modified.
    5. +If writing, write the new value from step 4 to the Data Register.
    6. +Write the value of the Address Register saved in step 1 to the Address Register.
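The six-step procedure above can be sketched in C as shown below. This is a hedged illustration, not the author's code: `port_in` and `port_out` are hypothetical stand-ins that simulate a single register group so the example is self-contained, and the port numbers 3C4h/3C5h (the Sequencer Address and Data Registers) are an assumption brought in from outside this section. The function name `indexed_modify` is likewise made up for this example.

```c
#include <stdint.h>

#define SEQ_ADDR 0x3C4   /* assumed Sequencer Address Register port */
#define SEQ_DATA 0x3C5   /* assumed Sequencer Data Register port */

/* Simulated register group (assumption): replace with real port I/O. */
static uint8_t sim_index;
static uint8_t sim_regs[256];

static uint8_t port_in(uint16_t port)
{
    if (port == SEQ_ADDR) return sim_index;
    if (port == SEQ_DATA) return sim_regs[sim_index];
    return 0xFF;
}

static void port_out(uint16_t port, uint8_t value)
{
    if (port == SEQ_ADDR)      sim_index = value;
    else if (port == SEQ_DATA) sim_regs[sim_index] = value;
}

/*
 * Read the register at `index`; if `mask` is non-zero, also replace the
 * masked bits with the corresponding bits of `bits`. Returns the value
 * read in step 3 so the caller can save it for later restoration.
 */
static uint8_t indexed_modify(uint16_t addr_port, uint16_t data_port,
                              uint8_t index, uint8_t mask, uint8_t bits)
{
    uint8_t saved_addr = port_in(addr_port);          /* step 1 */
    port_out(addr_port, index);                       /* step 2 */
    uint8_t old = port_in(data_port);                 /* step 3 */
    if (mask != 0) {
        uint8_t updated = (uint8_t)((old & ~mask) | (bits & mask)); /* step 4 */
        port_out(data_port, updated);                 /* step 5 */
    }
    port_out(addr_port, saved_addr);                  /* step 6 */
    return old;
}
```

Passing a `mask` of 0 performs a pure read while still saving and restoring the Address Register.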
    +        If you are paranoid, then you +might want to read back and compare the bytes written in step 2, 5, and +6 as in the Paranoia section above. Note that certain +CRTC registers can be protected from read or write access for compatibility +with programs written prior to the VGA's existence. This protection is +controlled via the Enable Vertical Retrace Access +and CRTC Registers Protect Enable fields. +Ensuring that access is not prevented even if your card does not normally +protect these registers makes your + +

    Accessing the Attribute Registers +
             The attribute registers +are also accessed in an indexed fashion, albeit in a more confusing way. +The address register is read and written via port 3C0h. The data register +is written to port 3C0h and read from port 3C1h. The index and the data +are written to the same port, one after another. A flip-flop inside the +card keeps track of whether the next write will be handled as an index +or data. Because there is no standard method of determining the state of +this flip-flop, the ability to reset the flip-flop such that the next write +will be handled as an index is provided. This is accomplished by reading +the Input Status #1 Register (normally port 3DAh); the data received is +not important. This can cause problems with interrupts because there is +no standard way to find out what the state of the flip-flop is; therefore +interrupt routines require special care when reading this register. (Especially +since the Input Status #1 Register's purpose is to determine whether a +horizontal or vertical retrace is in progress, something likely to be read +by an interrupt routine that deals with the display.) If an interrupt were +to read 3DAh in the middle of writing to an address/data pair, then the +flip-flop would be reset and the data would be written to the address register +instead. Any further writes would also be handled incorrectly and thus +major corruption of the registers could occur. To read and write a data +register in the attribute register group, perform the following procedure: +

      +
    1. +Input a value from the Input Status #1 Register (normally port 3DAh) and +discard it.
    2. +Read the value of the Address/Data Register and save it for step 7.
    3. +Output the index of the desired Data Register to the Address/Data Register.
    4. +Read the value of the Data Register and save it for later restoration upon +termination, if needed.
    5. +If writing, modify the value read in step 4, making sure to mask off bits +not being modified.
    6. +If writing, write the new value from step 5 to the Address/Data Register.
    7. +Write the value of the Address Register saved in step 2 to the Address/Data +Register.
    8. +If you wish to leave the register waiting for an index, input a value from +the Input Status #1 Register (normally port 3DAh) and discard it.
    +        If you have control over interrupts, +then you can disable interrupts while in the middle of writing to the register. +If not, then you may be able to implement a critical section where you +use a byte in memory as a flag whether it is safe to modify the attribute +registers and have your interrupt routine honor this. And again, it pays +to be paranoid. Resetting the flip-flop even though it should be +in the reset state already helps prevent catastrophic problems. Also, you +might want to read back and compare the bytes written in step 3, 6, and +7 as in the Paranoia section above. +
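A minimal C sketch of the attribute register procedure follows. It rests on several assumptions that are labeled in the code: `port_in` and `port_out` are hypothetical stand-ins that simulate the flip-flop behavior described above, the 32-entry register array is a simplification, and the function always performs the data write (writing the old value back when `mask` is 0) so that the flip-flop is back in the index state before the saved address is restored. Interrupt handling is not modeled.

```c
#include <stdint.h>

#define ATTR_ADDR_DATA 0x3C0   /* address (read/write) and data (write) */
#define ATTR_DATA_READ 0x3C1   /* data (read) */
#define INPUT_STATUS_1 0x3DA   /* reading this resets the flip-flop */

/* Simulated attribute controller (assumption): replace with real port I/O. */
static uint8_t sim_attr_index;
static uint8_t sim_attr_regs[32];
static int     sim_flip_flop;  /* 0 = next write is an index, 1 = data */

static uint8_t port_in(uint16_t port)
{
    switch (port) {
    case ATTR_ADDR_DATA: return sim_attr_index;
    case ATTR_DATA_READ: return sim_attr_regs[sim_attr_index & 0x1F];
    case INPUT_STATUS_1: sim_flip_flop = 0; return 0x00;
    default:             return 0xFF;
    }
}

static void port_out(uint16_t port, uint8_t value)
{
    if (port != ATTR_ADDR_DATA)
        return;
    if (sim_flip_flop == 0)
        sim_attr_index = value;
    else
        sim_attr_regs[sim_attr_index & 0x1F] = value;
    sim_flip_flop ^= 1;        /* every write to 3C0h toggles the flip-flop */
}

/* Replace the bits selected by `mask` in attribute register `index`. */
static uint8_t attr_modify(uint8_t index, uint8_t mask, uint8_t bits)
{
    (void)port_in(INPUT_STATUS_1);                    /* step 1: reset */
    uint8_t saved_addr = port_in(ATTR_ADDR_DATA);     /* step 2 */
    port_out(ATTR_ADDR_DATA, index);                  /* step 3 */
    uint8_t old = port_in(ATTR_DATA_READ);            /* step 4 */
    uint8_t updated = (uint8_t)((old & ~mask) | (bits & mask)); /* step 5 */
    port_out(ATTR_ADDR_DATA, updated);                /* step 6 */
    port_out(ATTR_ADDR_DATA, saved_addr);             /* step 7 */
    (void)port_in(INPUT_STATUS_1);                    /* step 8 */
    return old;
}
```

Because step 1 always resets the flip-flop, the sketch behaves the same even if the flip-flop was left in the data state by earlier code.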
            On the IBM VGA implementation, +an undocumented register (CRTC Index=24h, bit 7) can be read to determine +the status of the flip-flop (0=address,1=data) and many VGA compatible +chipsets duplicate this behavior, but it is not guaranteed. However, it +is a simple matter to determine if this is the case. Also, some SVGA chipsets +provide the ability to access the attribute registers in the same fashion +as the CRT, Sequencer, and Graphics controllers. Because this functionality +is vendor specific it is really only useful when programming for that particular +chipset. To determine if this undocumented bit is supported, perform the +following procedure: +
      +
    1. +Input a value from the Input Status #1 Register (normally port 3DAh) and +discard it.
    2. +Verify that the flip-flop status bit (CRTC Index 24h, bit 7) is 0. If bit=1 +then feature is not supported, else continue to step 3.
    3. +Output an address value to the Attribute Address/Data register.
    4. +Verify that the flip-flop status bit (CRTC Index 24h, bit 7) is 1. If bit=0 +then feature is not supported, else continue to step 5.
    5. +Input a value from the Input Status #1 Register (normally port 3DAh) and +discard it.
    6. +Verify that the flip-flop status bit (CRTC Index 24h, bit 7) is 0. If bit=1 +then feature is not supported, else feature is supported.
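The detection procedure above could be sketched in C along these lines. Everything hardware-facing here is an assumption for illustration: `port_in` and `port_out` simulate a card (with `sim_has_status_bit` selecting whether the undocumented bit exists), and the CRTC ports 3D4h/3D5h are the color-mode addresses, which this section does not itself state. The sketch re-outputs the current attribute address in step 3 so that the address register is not changed, and it resets the flip-flop and restores the CRTC index before returning.

```c
#include <stdint.h>
#include <stdbool.h>

#define ATTR_ADDR_DATA 0x3C0
#define CRTC_ADDR      0x3D4   /* assumed color-mode CRTC Address Register */
#define CRTC_DATA      0x3D5   /* assumed color-mode CRTC Data Register */
#define INPUT_STATUS_1 0x3DA

/* Simulated hardware (assumption): replace with real port I/O. */
static bool    sim_has_status_bit = true;
static int     sim_flip_flop;          /* 0 = address, 1 = data */
static uint8_t sim_attr_index;
static uint8_t sim_crtc_index;

static uint8_t port_in(uint16_t port)
{
    switch (port) {
    case ATTR_ADDR_DATA: return sim_attr_index;
    case CRTC_ADDR:      return sim_crtc_index;
    case CRTC_DATA:
        if (sim_crtc_index == 0x24 && sim_has_status_bit)
            return (uint8_t)(sim_flip_flop << 7);
        return 0x00;
    case INPUT_STATUS_1: sim_flip_flop = 0; return 0x00;
    default:             return 0xFF;
    }
}

static void port_out(uint16_t port, uint8_t value)
{
    if (port == CRTC_ADDR) {
        sim_crtc_index = value;
    } else if (port == ATTR_ADDR_DATA) {
        if (sim_flip_flop == 0)
            sim_attr_index = value;
        sim_flip_flop ^= 1;
    }
}

/* Read bit 7 of CRTC index 24h. */
static int flip_flop_bit(void)
{
    port_out(CRTC_ADDR, 0x24);
    return (port_in(CRTC_DATA) >> 7) & 1;
}

static bool flip_flop_status_supported(void)
{
    uint8_t saved_crtc = port_in(CRTC_ADDR);
    bool supported = false;

    (void)port_in(INPUT_STATUS_1);                          /* step 1 */
    if (flip_flop_bit() == 0) {                             /* step 2 */
        port_out(ATTR_ADDR_DATA, port_in(ATTR_ADDR_DATA));  /* step 3 */
        if (flip_flop_bit() == 1) {                         /* step 4 */
            (void)port_in(INPUT_STATUS_1);                  /* step 5 */
            supported = (flip_flop_bit() == 0);             /* step 6 */
        }
    }
    (void)port_in(INPUT_STATUS_1);   /* leave the flip-flop awaiting an index */
    port_out(CRTC_ADDR, saved_crtc);
    return supported;
}
```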
    +Accessing the Color Registers +
         The color registers require an altogether +different technique; this is because the 256-color palette requires 3 bytes +per entry to store 18-bit color values. In addition the hardware supports the capability +to load all or portions of the palette rapidly. To write to the palette, +first you must output the index of the palette entry to the PEL Address +Write Mode Register (port 3C8h). Then you should output the component values +to the PEL Data Register (port 3C9h), in the order red, green, then blue. +The PEL Address Write Mode Register will then automatically increment, +allowing the component values of the next palette entry to be written to the +PEL Data Register. Reading is performed similarly, except that the PEL +Address Read Mode Register (port 3C7h) is used to specify the palette entry +to be read, and the values are read from the PEL Data Register. Again, +the PEL Address Read Mode Register auto-increments after each triplet is +read. The current index for the current operation can be read from the +PEL Address Write Mode Register. Reading port 3C7h gives the DAC State +Register, which specifies whether a read operation or a write operation +is in effect. As with the attribute registers, there is no guaranteed way for +an interrupt routine to access the color registers and return the color +registers to the state they were in prior to access without some communication +between the ISR and the main program. For some workarounds see the Accessing +the Attribute Registers section above. To read the color registers: +
      +
    1. +Read the DAC State Register and save the value for use in step 8.
    2. +Read the PEL Address Write Mode Register for use in step 8.
    3. +Output the value of the first color entry to be read to the PEL Address +Read Mode Register.
    4. +Read the PEL Data Register to obtain the red component value.
    5. +Read the PEL Data Register to obtain the green component value.
    6. +Read the PEL Data Register to obtain the blue component value.
    7. +If more colors are to be read, repeat steps 4-6.
    8. +Based upon the DAC State from step 1, write the value saved in step 2 to +either the PEL Address Write Mode Register or the PEL Address Read Mode +Register.
    +Note: Steps 1, 2, and 8 are hopelessly optimistic. This in no way guarantees +that the state is preserved, and with some DAC implementations this may +actually guarantee that the state is never preserved. See the DAC +Operation page for more details. + +
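The core of the palette access described above (set the index, then transfer red, green, and blue triplets with auto-increment) can be sketched in C as follows. This is an illustrative sketch only: `port_in` and `port_out` are hypothetical stand-ins that simulate a DAC with 6-bit components, the function names are made up, and the state-saving steps 1, 2, and 8 are deliberately left out given the caveat in the note above.

```c
#include <stdint.h>
#include <stddef.h>

#define PEL_ADDR_READ  0x3C7   /* PEL Address Read Mode Register (write) */
#define PEL_ADDR_WRITE 0x3C8   /* PEL Address Write Mode Register */
#define PEL_DATA       0x3C9   /* PEL Data Register */

/* Simulated DAC (assumption): replace with real port I/O. */
static uint8_t sim_palette[256][3];
static uint8_t sim_entry;      /* current palette entry, wraps at 256 */
static int     sim_component;  /* 0 = red, 1 = green, 2 = blue */

static void sim_advance(void)
{
    if (++sim_component == 3) {
        sim_component = 0;
        sim_entry++;           /* auto-increment after each triplet */
    }
}

static void port_out(uint16_t port, uint8_t value)
{
    if (port == PEL_ADDR_READ || port == PEL_ADDR_WRITE) {
        sim_entry = value;
        sim_component = 0;
    } else if (port == PEL_DATA) {
        sim_palette[sim_entry][sim_component] = value & 0x3F;  /* 6 bits */
        sim_advance();
    }
}

static uint8_t port_in(uint16_t port)
{
    if (port != PEL_DATA)
        return 0xFF;
    uint8_t value = sim_palette[sim_entry][sim_component];
    sim_advance();
    return value;
}

/* Write `count` entries starting at `first`; rgb holds 3 bytes per entry. */
static void palette_write(uint8_t first, size_t count, const uint8_t *rgb)
{
    port_out(PEL_ADDR_WRITE, first);
    for (size_t i = 0; i < count * 3; i++)
        port_out(PEL_DATA, rgb[i]);        /* red, green, blue, red, ... */
}

/* Read `count` entries starting at `first` into rgb (3 bytes per entry). */
static void palette_read(uint8_t first, size_t count, uint8_t *rgb)
{
    port_out(PEL_ADDR_READ, first);
    for (size_t i = 0; i < count * 3; i++)
        rgb[i] = port_in(PEL_DATA);
}
```

Because the index auto-increments, a whole run of entries needs only one write to the address register.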

    Binary Operations +
            In order to better +understand dealing with bit fields it is necessary to know a little bit +about logical operations such as logical-and (AND), logical-or (OR), and +exclusive-or (XOR). These operations are performed on a bit by bit basis +using the truth tables below. All of these operations are commutative, +i.e. A OR B = B OR A, so you look up one bit in the left column and the +other in the top row and consult the intersecting row and column for the +answer. +
      +

    AND | 0  1        OR | 0  1       XOR | 0  1
    ----+------      ----+------      ----+------
     0  | 0  0        0  | 0  1        0  | 0  1
     1  | 0  1        1  | 1  1        1  | 1  0
    +
    +Example Register +
            The following table is an +example of one particular register, the Mode Register of the Graphics Register. +Each number from 7-0 represents the bit position in the byte. Many registers +contain more than one field, each of which performs a different function. +This particular chart contains four fields, two of which are two bits in +length. It also contains two bits which are not implemented (to the best +of my knowledge) by the standard VGA hardware. +
    Mode Register (Index 05h)
    | 7 | 6  5           | 4        | 3  | 2 | 1  0       |
    |   | Shift Register | Odd/Even | RM |   | Write Mode |
    +
    +Masking Bit-Fields +
            Your development environment +may provide some assistance in dealing with bit fields. Consult your documentation +for this. In addition it can be performed using the logical operators AND, +OR, and XOR (for details on these operators see the Binary +Operations section above). To change the value of the Shift Register +field of the example register above, we would first mask out the bits we +do not wish to change. This is accomplished by performing a logical AND +of the value read from the register and a binary value in which all of +the bits we wish to leave alone are set to 1, which would be 10011111b +for our example. This leaves all of the bits except the Shift Register +field alone and sets the Shift Register field to zero. If this was our goal, +then we would stop here and write the value back to the register. Otherwise, we then +OR the value with a binary number in which the new field bits are shifted into position. +To set this field to 10b we would OR the result of the AND with 01000000b. +The resulting byte would then be written to the register. To set a bitfield +to all ones the AND step is not necessary, similar to setting the bitfield +to all zeros using AND. To toggle a bitfield you can XOR a value with a +byte with ones in the positions to toggle. For example XORing the value +read with 01100000b would toggle the value of the Shift Register bitfield. +By using these techniques you can ensure that you do not cause any unwanted +"side-effects" when modifying registers. + +
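The masking steps above translate directly into C's bitwise operators. The sketch below is illustrative only; the macro and function names are hypothetical, and the mask and shift values are the ones for the Shift Register field of the example register (bits 6-5).

```c
#include <stdint.h>

/* Shift Register field of the example register: bits 6-5. */
#define SHIFT_REG_MASK  0x60u   /* 01100000b */
#define SHIFT_REG_SHIFT 5

/* AND away the old field, then OR in the new value shifted into position. */
static uint8_t field_set(uint8_t reg, uint8_t mask, unsigned shift, uint8_t value)
{
    return (uint8_t)((reg & ~mask) | ((value << shift) & mask));
}

/* Setting a field to all zeros needs only the AND step. */
static uint8_t field_clear(uint8_t reg, uint8_t mask)
{
    return (uint8_t)(reg & ~mask);
}

/* Toggling a field is an XOR with ones in the field's positions. */
static uint8_t field_toggle(uint8_t reg, uint8_t mask)
{
    return (uint8_t)(reg ^ mask);
}

/* Extract a field's value, shifted down to bit 0. */
static uint8_t field_get(uint8_t reg, uint8_t mask, unsigned shift)
{
    return (uint8_t)((reg & mask) >> shift);
}
```

For example, `field_set(value, SHIFT_REG_MASK, SHIFT_REG_SHIFT, 2)` ANDs the value read with 10011111b and then ORs in 01000000b, exactly as in the walkthrough above.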

      +
      + + diff --git a/specs/freevga/vga/vgargidx.htm b/specs/freevga/vga/vgargidx.htm new file mode 100644 index 0000000..6fc4712 --- /dev/null +++ b/specs/freevga/vga/vgargidx.htm @@ -0,0 +1,385 @@ + + + + + + + VGA/SVGA Video Programming--VGA Field Index + + + +

    Hardware Level VGA and SVGA Video Programming Information +Page
    + +
    VGA Field Index  +
    + +
    A | B | C |  +D | E | F | G | H +| I | J | K | L | M +| N | O | P | Q | R +| S | T | U | V | W +| X | Y | Z
    + + + + +

      +
      + + diff --git a/specs/freevga/vga/vgaseq.htm b/specs/freevga/vga/vgaseq.htm new file mode 100644 index 0000000..9818979 --- /dev/null +++ b/specs/freevga/vga/vgaseq.htm @@ -0,0 +1,206 @@ + + + + + + + FreeVGA - VGA Sequencer Operation + + + +

    Hardware Level VGA and SVGA Video Programming Information Page
    + +
    VGA Sequencer Operation  +
    +Introduction +
            The sequencer portion of +the VGA hardware reads the display memory and converts it into data that +is sent to the attribute controller.  This would normally be a simple +part of the video hardware, but the VGA hardware was designed to provide +a degree of software compatibility with monochrome, CGA, EGA, and MCGA +adapters.  For this reason, the sequencer has quite a few different +modes of operation.  Further complicating programming, the sequencer +has been poorly documented, resulting in many variances between various +VGA/SVGA implementations. +
      +
    Sequencer Memory Addressing +
            The sequencer operates by +loading a display address into memory, then shifting it out pixel by pixel.   +The memory is organized internally as 64K addresses, 32 bits wide.  +The sequencer maintains an internal 16-bit counter that is used to calculate +the actual index of the 32-bit location to be loaded and shifted out.  +There are several different mappings from this counter to actual memory +addressing, some of which use other bits from other counters, as required +to provide compatibility with older hardware that uses those addressing +schemes. + +

    <More to be added here> +
      +
    Graphics Shifting Modes +
            When the Alphanumeric +Mode Disable field is set to 1, the sequencer operates in graphics +mode where data in memory references pixel values, as opposed to the character +map based operation used for alphanumeric mode. +
            The sequencer has three +methods of taking the 32-bit memory location loaded and shifting it into +4-bit pixel values suitable for graphics modes, one of which combines 2 +pixel values to form 8-bit pixel values.  The first method is the +one used for the VGA's 16 color modes.  This mode is selected when +both the 256-Color Shift Mode and Shift +Register Interleave Mode fields are set to 0.  In this mode, one +bit from each of the four 8-bit planes in the 32-bit memory is used to +form a 16 color value. This is shown in the diagram below, where the most +significant bit of each of the four planes is shifted out into a pixel +value, which is then sent to the attribute controller to be converted into +an index into the DAC palette.  Following this, the remaining bits +will be shifted out one bit at a time, from most to least significant bit, +with the bits from planes 0-3 going to pixel bits 0-3. +
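The planar shift mode described above can be expressed as a short C function. This is a minimal sketch with a hypothetical function name; it models only the shifting of one 32-bit memory location (given as four plane bytes) into eight 4-bit pixel values, not the attribute controller lookup that follows.

```c
#include <stdint.h>

/*
 * Planar (16-color) shift mode: each pixel takes one bit from each of the
 * four planes, most significant bit first, with the bit from plane p
 * becoming bit p of the pixel value.
 */
static void planar_shift(const uint8_t plane[4], uint8_t pixels[8])
{
    for (int i = 0; i < 8; i++) {
        int bit = 7 - i;                   /* bit 7 is shifted out first */
        uint8_t pixel = 0;
        for (int p = 0; p < 4; p++)
            pixel |= (uint8_t)(((plane[p] >> bit) & 1) << p);
        pixels[i] = pixel;
    }
}
```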
      +
      +

    Click here for Textified Planar Shift Mode Diagram
    +  + +

            The second shift mode is +the packed shift mode, which is selected when the 256-Color +Shift Mode field is set to 0 and the Shift +Register Interleave Mode field is set to 1. This is used by the VGA +BIOS to support video modes compatible with CGA video modes.  The +CGA uses only planes 0 and 1, providing for a 4 color packed mode; +the VGA hardware, however, actually uses bits from two different bit planes, providing +for 16 color modes.  The bits for the first four pixels shifted out +for a given address are stored in planes 0 and 2.  The second four +are stored in planes 1 and 3.  For each pixel, bits 3-2 are shifted +out of the higher numbered plane and bits 1-0 are shifted out of the lower +numbered plane.  For example, bits 3-2 of the first pixel shifted +out are located in bits 7-6 of plane 2; likewise, bits 1-0 of the same +pixel are located in bits 7-6 of plane 0. +
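A C sketch of the packed shift mode follows, with a hypothetical function name. The text above gives the bit positions only for the first pixel (bits 7-6 of planes 2 and 0); the sketch assumes that the following pixels continue downward through bits 5-4, 3-2, and 1-0 of the same planes, and that the second group of four pixels uses planes 3 and 1 in the same way.

```c
#include <stdint.h>

/*
 * Packed shift mode: pixels 0-3 come from planes 2 (bits 3-2 of the pixel)
 * and 0 (bits 1-0); pixels 4-7 come from planes 3 and 1. Within a plane,
 * pixel pairs are assumed to be taken from bits 7-6 down to bits 1-0.
 */
static void packed_shift(const uint8_t plane[4], uint8_t pixels[8])
{
    for (int i = 0; i < 8; i++) {
        int low_plane  = (i < 4) ? 0 : 1;
        int high_plane = low_plane + 2;
        int shift      = 6 - 2 * (i % 4);  /* 6, 4, 2, 0 */
        uint8_t low  = (plane[low_plane]  >> shift) & 3;
        uint8_t high = (plane[high_plane] >> shift) & 3;
        pixels[i] = (uint8_t)((high << 2) | low);
    }
}
```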
      +
      +

    Click for Textified Packed Shift Mode Diagram
    +  + +

           The third shift mode is used for +256-color modes, which is selected when the 256-Color +Shift Mode field is set to 1 (this field takes precedence over the +Shift Register Interleave Mode field).  +The behavior of this shift mode varies among VGA implementations, due +to it normally being used in combination with the 8-bit +Color Enable field of the attribute controller.  Thus certain +variances in the sequencing operations can be masked by similar variances +in the attribute controller.  However, the implementations I have +experimented with seem to fall into one of two similar behaviors, and thus +it is possible to describe both here.  Note that one is essentially +a mirror image of the other, leading me to believe that the designers knew +how it should work to be 100% IBM VGA compatible, but managed to get it +backwards in the actual implementation. Due to being very poorly documented +and understood, it is very possible that there are other implementations +that vary significantly from these two cases.  I do, however, feel +that attempting to specify each field's function as accurately as possible +can allow more powerful utilization of the hardware. +
            When this shift mode is +enabled, the VGA hardware shifts 4-bit pixel values out of the 32-bit memory +location each dot clock.  This 4-bit value is processed by the attribute +controller, and the lower 4 bits of the resulting DAC index are combined +with the lower 4 bits of the previous attribute lookup to produce an 8-bit +index into the DAC palette.  This is why, for example, a 320 pixel +wide 256 color mode needs to be programmed with timing values for a 640 +pixel wide normal mode.  In 256-color mode, each plane holds an 8-bit +value which is intended to be the DAC palette index for that pixel.  +Every second 8-bit index generated should correspond to the values in planes +0-3, appearing left to right on the display.  This is masked by the +attribute controller, which in 256 color mode latches every second 8-bit +value as well.  This means that the intermediate 8-bit values are +not normally seen, and is where implementations can vary.  Another +variance is whether the even or odd pixel values generated are the intended +data bytes.  This also is masked by the attribute controller, which +latches the appropriate even or odd pixel values. +
            The first case is where +the 8-bit values are formed by shifting the 4 8-bit planes left.  +This is shown in the diagram below.  The first pixel value generated +will be the value held in bits 7-4 of plane 0, which is then followed by +bits 3-0 of plane 0.  This continues, shifting out the upper four +bits of each plane in sequence before the lower four bits, ending up with +bits 3-0 of plane 3.  Each pixel value is fed to the attribute controller, +where a lookup operation is performed using the attribute table.  +The previous 8-bit DAC index is shifted left by four, moving from the lower +four bits to the upper four bits of the DAC index, and the lower 4 bits +of the attribute table entry for the current pixel is shifted into the +lower 4 bits of the 8-bit value, producing a new 8-bit DAC index.  +Note how one 4-bit result carries over into the next display memory location +sequenced. +
            For example, assume planes +0-3 hold 01h, 23h, 45h, and 67h respectively, and the lower 4 bits of +the attribute table entries hold the value of the index itself, essentially +using the index value as the result, and the last 8-bit DAC index generated +was FEh. The first cycle, the pixel value generated is 0h, which is fed +to the attribute controller and looked up, producing the table entry 0h +(surprise!) The previous DAC index, FEh, is shifted left by four bits, +while the new value, 0h, is shifted into the lower four bits.  Thus, +the new DAC index output for this pixel is E0h.  The next pixel is +1h, which produces 1h at the other end of the attribute controller.  +The previous DAC index, E0h, is shifted again producing 01h.  This +process continues, producing the DAC indexes, in order, 12h, 23h, 34h, +45h, 56h, and 67h.  Note that every second DAC index is the appropriate +8-bit value for a 256-color mode, while the values in between contain four +bits of the previous and four bits of the next DAC index. +
      +
      +

    Click for Textified 256-Color Shift Mode Diagram (Left)
    +  + +

           The second case is where the 8-bit +values are formed by shifting the 8-bit values right, as depicted in the +diagram below.  The first pixel value generated is the lower four +bits of plane 0, followed by the upper four bits.  This continues +for planes 1-3 until the last pixel value produced, which is the upper +four bits of Plane 3.  These pixel values are fed to the attribute +controller, where the corresponding entry in the attribute table is looked +up.  The previous 8-bit DAC index is shifted right 4 places, and the +lower four bits of the attribute table entry generated are used as the upper +four bits of the new DAC index. +
            For example, assume planes +0-3 hold 01h, 23h, 45h, and 67h respectively, and the lower 4 bits of +the attribute table entries hold the value of the index itself, essentially +using the index value as the result, and the last 8-bit DAC index generated +was FEh. The first cycle, the pixel value generated is 1h, which is fed +to the attribute controller and looked up, producing the table entry 1h. +The previous DAC index, FEh, is shifted right by four bits, while the new +value, 1h, is shifted into the upper four bits.  Thus, the new DAC +index output for this pixel is 1Fh.  The next pixel is 0h, which produces +0h at the other end of the attribute controller.  The previous DAC +index, 1Fh, is shifted again producing 01h.  This process continues, +producing the DAC indexes, in order, 30h, 23h, 52h, 45h, 74h, and 67h.  +Again, note that every second DAC index is the appropriate 8-bit value +for a 256-color mode, while the values in between contain four bits of +the previous and four bits of the next DAC index. +
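Both 256-color shifting behaviors, and the two worked examples, can be reproduced with a short C model. This is a sketch under the same assumption the examples make, namely that the attribute table maps each 4-bit pixel value to itself; the function name and its parameters are hypothetical.

```c
#include <stdint.h>

/*
 * Model one 32-bit memory location (four plane bytes) passing through the
 * 256-color shift mode. `shift_left` selects the first behavior (non-zero)
 * or the second (zero). `prev` is the last DAC index generated before this
 * location. The attribute lookup is assumed to be the identity mapping.
 */
static void shift_256(const uint8_t plane[4], int shift_left,
                      uint8_t prev, uint8_t dac_index[8])
{
    for (int i = 0; i < 8; i++) {
        uint8_t byte = plane[i / 2];
        int first_half = (i % 2 == 0);
        uint8_t nibble;

        if (shift_left) {
            /* upper four bits of each plane come out before the lower four */
            nibble = first_half ? (uint8_t)(byte >> 4) : (uint8_t)(byte & 0x0F);
            prev = (uint8_t)((prev << 4) | nibble);
        } else {
            /* lower four bits of each plane come out before the upper four */
            nibble = first_half ? (uint8_t)(byte & 0x0F) : (uint8_t)(byte >> 4);
            prev = (uint8_t)((prev >> 4) | (nibble << 4));
        }
        dac_index[i] = prev;
    }
}
```

With planes 01h, 23h, 45h, 67h and a previous index of FEh, the left-shifting case yields E0h, 01h, 12h, 23h, 34h, 45h, 56h, 67h and the right-shifting case yields 1Fh, 01h, 30h, 23h, 52h, 45h, 74h, 67h, matching the two examples in the text.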
      +
      +

    Click for Textified 256-Color Shift Mode Diagram (Right)
    +  +
      +
            Another variation that can +exist is whether the first or second DAC index generated at the beginning +of a scan line is the appropriate 8-bit value.  If it is the second, +the first DAC index contains 4 bits from the contents of the DAC index +prior to the start of the scan line.  This could conceivably contain +any value, as it is normally masked by the attribute controller when in +256-color mode, which would latch the odd pixel values.  Likely this +value will be either 00h or whatever the contents were at the end of the +previous scan line.  A similar circumstance arises where the last +pixel value generated falls on a boundary between memory addresses.  +In this circumstance, however, the value generated is produced by sequencing +the next display memory address as if the line continued, and is thus more +predictable. +
      + +

    Notice: All trademarks used or referred to on this page are the property +of their respective owners. +
    All pages are Copyright © 1997, 1998, J. D. Neal, except where +noted. Permission for utilization and distribution is subject to the terms +of the FreeVGA Project Copyright License. +
      +
      +
      + + diff --git a/specs/freevga/vga/vgatext.htm b/specs/freevga/vga/vgatext.htm new file mode 100644 index 0000000..a2e34a0 --- /dev/null +++ b/specs/freevga/vga/vgatext.htm @@ -0,0 +1,185 @@ + + + + + + + VGA/SVGA Video Programming--VGA Text Mode Operation + + + +

    Hardware Level VGA and SVGA Video Programming Information +Page
    + +
    VGA Text Mode Operation  +
    + +
      +
    • +Introduction -- gives scope of this page.
    • +Display Memory Organization -- details how the VGA's +planes are utilized when in text mode.
    • +Attributes -- details the fields of the attribute +byte.
    • +Fonts -- details the operation of the character generation +hardware.
    • +Cursor -- details on manipulating the text-mode cursor.
    +Introduction +
            This section is intended +to document the VGA's operation when it is in the text modes, including +attributes and fonts. While it would seem that the text modes are adequately +supported by the VGA BIOS, there is actually much that can be done with +the VGA text modes that can only be accomplished by going directly to the +hardware. Furthermore, I have found no good reference on the VGA text modes; +most VGA references take them for granted without delving into their operation. + +

    Display Memory Organization +
            The four display memory +planes are used for different purposes when the VGA is in text mode. Each +byte in plane 0 is used to store an index into the character font map. +The corresponding byte in plane 1 is used to specify the attributes of +the character possibly including color, font select, blink, underline and +reverse. For more details on attribute operation see the Attributes section +below. Display plane 2 is used to store the bitmaps for the characters +themselves. This is discussed in the Fonts section below. Normally, the +odd/even read and write addressing mode is used to make planes 0 and 1 +accessible at interleaved host memory addresses. + +
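    As a rough illustration of the interleaving, the sketch below treats the host view of text memory as a flat byte buffer in which each character cell occupies two consecutive bytes: the character code (plane 0) at the even offset and the attribute (plane 1) at the odd offset. The 80-column, 25-row geometry and the helper names are assumptions made for the example.

```python
COLUMNS = 80  # assumed width of the text mode
ROWS = 25     # assumed height of the text mode


def cell_offsets(row, col, columns=COLUMNS):
    """Return (character offset, attribute offset) for a text cell."""
    cell = row * columns + col
    return 2 * cell, 2 * cell + 1


def write_cell(buffer, row, col, char_code, attribute):
    """Store a character/attribute pair at interleaved offsets."""
    char_off, attr_off = cell_offsets(row, col)
    buffer[char_off] = char_code   # even offset: plane 0, font index
    buffer[attr_off] = attribute   # odd offset: plane 1, attribute


screen = bytearray(COLUMNS * ROWS * 2)
write_cell(screen, 1, 2, ord("A"), 0x1F)
print(cell_offsets(1, 2))
```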

    Attributes +
            The attribute byte is divided +into two four-bit fields. The field in bits 7-4 is used as an index into the +palette registers for the background color, which is used when a font bit is +0. The field in bits 3-0 is used as an index into the palette registers for +the foreground color, which is used when a font bit is 1. The attribute can also +control several other aspects of the way the character is +displayed. +
            If the Blink +Enable field is set to 1, character blinking is enabled. When blinking +is enabled, bit 3 of the background color is forced to 0 for attribute +generation purposes, and if bit 7 of the attribute byte for a character +is set to 1, the foreground color alternates between the foreground and +background, causing the character to blink. The blink rate is determined +by the vertical sync rate divided by 32. +
            If bits 2-0 of the attribute +byte are equal to 001b and bits 6-4 of the attribute byte are equal to 000b, +then the line of the character specified by the Underline +Location field is replaced with the foreground color. Note that if the line +specified by the Underline Location field +is not normally displayed because it is greater than the maximum scan line +of the characters displayed, then the underline capability is effectively +disabled. +
            Bit 3 of the attribute byte, +as well as selecting the foreground color for its corresponding character, +also is used to select between the two possible character sets (see Fonts +below.) If both character sets are the same, then the bit effectively functions +only to select the foreground color. + +
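    The field layout described in this section can be summarized with a small decoder. It is a sketch of the rules as stated above, not of any particular driver; the function name and the dictionary keys are invented for the example, and the Blink Enable state is passed in as a parameter because it lives in a register rather than in the attribute byte.

```python
def decode_attribute(attr, blink_enable=True):
    """Split a text-mode attribute byte into the fields described above."""
    foreground = attr & 0x0F
    background = (attr >> 4) & 0x0F
    blinking = False
    if blink_enable:
        # Bit 7 requests blinking, so bit 3 of the background index
        # is forced to 0.
        blinking = bool(attr & 0x80)
        background &= 0x07
    # Underline applies when bits 2-0 are 001b and bits 6-4 are 000b.
    underline = (attr & 0x07) == 0x01 and (attr & 0x70) == 0x00
    # Bit 3 doubles as the character set selector (1 selects set A).
    uses_set_a = bool(attr & 0x08)
    return {
        "foreground": foreground,
        "background": background,
        "blinking": blinking,
        "underline": underline,
        "uses_set_a": uses_set_a,
    }


print(decode_attribute(0x9F))
print(decode_attribute(0x01))
```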

    Fonts +
            The VGA's text-mode hardware +provides for a very fast text mode. While this mode is not used as often +these days, it used to be the predominant mode of operation for applications. +The reason that the text mode was fast, much faster than a graphics mode +at the same resolution, was that in text mode the screen is partitioned +into characters. A single character/attribute pair is written to the screen, +and the hardware uses a font table in video memory to map those character +and attribute pairs into video output, as opposed to having to write all +of the bits in a character, which could take over 16 operations to write +to the screen. As CPU display memory bandwidth is somewhat limited (particularly +on older cards), this made text mode the mode of choice for applications +which did not require graphics. + +

             For each character +position, bit 3 of the attribute byte selects which character set is used, +and the character byte selects which of the 256 characters in that font +is used. Up to eight sets of font bitmaps can be stored simultaneously +in display memory plane 2. The VGA's hardware provides for two banks of +256 character bitmaps to be displayed simultaneously. Two fields, the Character +Set A Select and Character Set B Select +fields, are used to determine which of the eight font bitmaps are currently +displayed. If bit 3 of a character's attribute byte is set to 1, then the +character set selected by the Character Set A Select +field is used; otherwise the character set specified by the Character +Set B Select field is used. Ordinarily, both character sets use the +same map in memory, because bit 3 is also part of the foreground color index: utilizing 2 different character sets causes character +set A to be limited to foreground colors 8-15, and character set B to be limited to +foreground colors 0-7. +
            Fonts are either 8 or 9 +pixels wide and can be from 1 to 32 pixels high. The width is determined +by the 9/8 Dot Mode field. Characters normally +have a line of blank pixels to the right and bottom of the character to +separate the character from its neighbor. Normally this is included in +the character's bitmap, leaving only 7 bit columns for the character. Characters +such as the capital M have to be squished to fit this, and would look better +if all 8 pixels in the bitmap could be used, as in 9 Dot mode where the +characters have an extra ninth bit in width, which is displayed in the +text background color. However, this causes the line drawing characters +to be discontinuous due to the blank column. Fortunately, the Line +Graphics Enable field can be set to allow character codes C0h-DFh to +have their ninth column be identical to their eighth column, providing +for continuity between line drawing characters. The height is determined +by the Maximum Scan Line field, which is set +to one less than the number of scan lines in the character. +
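    The ninth-column rule can be sketched as follows. The function is hypothetical and models only the behavior described in the paragraph above; it assumes that the most significant bit of a font byte is the leftmost pixel.

```python
def expand_font_row(char_code, row_bits, nine_dot=True, line_graphics=True):
    """Turn one font byte into a list of 8 or 9 pixels (1 = foreground).

    Assumes bit 7 of the font byte is the leftmost pixel.
    """
    pixels = [(row_bits >> (7 - i)) & 1 for i in range(8)]
    if nine_dot:
        if line_graphics and 0xC0 <= char_code <= 0xDF:
            # Line drawing characters repeat their eighth column.
            pixels.append(pixels[7])
        else:
            # All other characters get a background-colored ninth column.
            pixels.append(0)
    return pixels


print(expand_font_row(0xC4, 0xFF))  # a code in the C0h-DFh range
print(expand_font_row(0x4D, 0xFF))  # capital M
```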
            Display memory plane 2 is +divided up into eight 8K banks of characters, each of which holds 256 character +bitmaps. Each character is on a 32 byte boundary and is 32 bytes long. +The offset in plane 2 of a character within a bank is determined by taking +the character's value and multiplying it by 32. The first byte at this +offset contains the 8 pixels of the top scan line of the characters. Each +successive byte contains another scan line's worth of pixels. The best +way to read and write fonts to display memory, assuming familiarity with +the information from the Accessing the Display Memory +page, is to use standard (not Odd/Even) addressing and Read Mode 0 and +Write Mode 0 with plane 2 selected for read or write. +
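    The offset arithmetic can be written out directly. In this sketch, `bank` is simply the position of an 8K bank within plane 2 (0-7); how the Character Set Select field values correspond to bank positions is not covered here and should be taken from the register documentation. The names are made up for illustration.

```python
BANK_SIZE = 8 * 1024  # each bank holds 256 characters of 32 bytes
CHAR_STRIDE = 32      # characters start on 32-byte boundaries


def glyph_offset(bank, char_code, scan_line=0):
    """Offset in plane 2 of one scan line of a character bitmap."""
    if not 0 <= bank < 8:
        raise ValueError("bank must be in 0..7")
    if not 0 <= char_code < 256:
        raise ValueError("char_code must be in 0..255")
    if not 0 <= scan_line < CHAR_STRIDE:
        raise ValueError("scan_line must be in 0..31")
    return bank * BANK_SIZE + char_code * CHAR_STRIDE + scan_line


print(hex(glyph_offset(0, 0x41)))     # top scan line of character 41h
print(hex(glyph_offset(1, 0x41, 5)))  # sixth scan line, second bank
```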
            The following example shows +three possible bitmap representations of text characters. In the left example +an 8x8 character box is used. In this case, the Maximum +Scan Line field is programmed to 7 and the 9/8 +Dot Mode field is programmed to 0. Note that the bottom row and right-most +column are blank. This is used to provide inter-character spacing. The middle +example shows an 8x16 character. In this case the Maximum +Scan Line field is programmed to 15 and the 9/8 +Dot Mode field is programmed to 0. Note that the character has extra +space at the bottom below the baseline of the character. This is used by +characters with parts descending below the baseline, such as the lowercase +letter "g". The right example shows a 9x16 character. In this case the +Maximum Scan Line field is programmed to 15 +and the 9/8 Dot Mode field is programmed to +1. Note that the rightmost column is used by the character, as the ninth +column for 9-bit wide characters is assumed blank (except for the behavior +of the Line Graphics Enable field), allowing +all eight bits of width to be used to specify the character, instead of +having to devote an entire column to inter-character spacing. +

    Click for Textified Examples of Text Mode Bitmap Characters
    +  + +

      +
    Cursor +
          The VGA has the hardware capability +to display a cursor in the text modes. Further details on the text-mode +cursor's operation can be found in the following section: +

    +Notice: All trademarks used or referred to on this page are the property +of their respective owners. +
    All pages are Copyright © 1997, 1998, J. D. Neal, except where +noted. Permission for utilization and distribution is subject to the terms +of the FreeVGA Project Copyright License. +
      +
      + + diff --git a/specs/freevga/vga/virtual.gif b/specs/freevga/vga/virtual.gif new file mode 100644 index 0000000..334d3fc Binary files /dev/null and b/specs/freevga/vga/virtual.gif differ diff --git a/specs/freevga/vga/virtual.txt b/specs/freevga/vga/virtual.txt new file mode 100644 index 0000000..055f36a --- /dev/null +++ b/specs/freevga/vga/virtual.txt @@ -0,0 +1,22 @@ + Virtual Screeen Mode Example + ---------------------------- + + 0 319 320 512 + 0 +----------------------------------------+---------------------+ + | | | + | | | + | | | + | | | + | Actual Resolution (Displayed) | | + | 320x200 | | + | | | + | | | + | | | +199 +----------------------------------------+ | +200 | | + | | + | Virtual Resolution (Not Displayed) | + | 512x300 | + | | +299 +--------------------------------------------------------------+ + diff --git a/specs/freevga/vtiming.htm b/specs/freevga/vtiming.htm new file mode 100644 index 0000000..fbbbedd --- /dev/null +++ b/specs/freevga/vtiming.htm @@ -0,0 +1,226 @@ + + + + + + + FreeVGA - Video Timing Information + + + +
    Hardware Level VGA and SVGA Video Programming Information Page
    + +
    Video Timing Information  +
    +Introduction +
           This page is written to give the +necessary background on video timing that is useful for video programming.  +This is not a comprehensive reference on the subject, rather it just gives +the minimum information needed to know to perform mode setting and the +creation of custom video modes.  It includes a small bit of information +about the messy side of video adapters, the electrical output and how that +is interpreted by the monitor.  Much of this information pertains +both to monitors and other CRT devices such as television displays, and +is less applicable to LCD displays as they have different timing requirements. + +

    Basic Description +
            The video hardware produces +a continuous signal on its output connector, except when it is in reset +mode, where the video outputs are held in a single state.  The continuous +signal is required because the pixel information is only displayed for +a short period of time, and relies on the persistence of the phosphor glow +on the monitor as well as the ability of eyesight to perform automatic +averaging to appear to be a steady image.  That signal is usually +output on multiple pins of the monitor connector, although it could also +be a TV compatible output.  LCD displays use a similar technique, +although the timing is more advanced and depends on the specific type of +panel and its driver circuitry.  The signal includes both the pixel +data that the monitor displays, as well as timing and "framing" information +that the video display uses to drive its internal circuitry. +
            The image is "scanned" +onto the screen from left to right, top to bottom, and is broken up into +"dot periods" or pixels, whose duration is controlled by the "dot clock" for the +mode, which is the basis for all the other video timings.  Each horizontal +"scan line" of dot periods is called a horizontal refresh as it "refreshes" +the information on the display in a horizontal line.  Many of these +scan lines (the number depending on the video mode), scanning from top +to bottom, make up a vertical refresh, also known as a "frame".  There +are many vertical refreshes per second, where a higher refresh rate produces +an image with less flicker. + +

    Timing Measurements +
            One of the important pieces +of terminology to understand is how timing is measured.  This includes +terms such as megahertz, kilohertz, and hertz.  All three are +measures of frequency based on the unit hertz (abbreviated hz), +which can be replaced by the term "cycles per second."  In video timing, +hertz is used to describe the frequencies of the timing signals, such as +when saying that the vertical refresh frequency is 60 hertz (or 60hz).  +This means that there are 60 cycles per second, which means that there +are 60 vertical refreshes per second.  Another case where hertz is +used is when stating the horizontal refresh rate, such as 31500 +hz, which means that there are 31,500 horizontal refresh cycles per second.  +One abbreviation frequently found is the term kilohertz (abbreviated Khz), +which means 1,000 cycles per second.  For example, 31.5 kilohertz +means 31.5 x 1000 hertz, or 31500 hz.  This is used to save writing +a few zeros and is a bit more concise.  Similarly the term megahertz +(abbreviated Mhz) is used, which means 1,000,000 cycles per second.  +For example, instead of saying that a certain mode uses a 25,000,000 hz +dot clock, or saying that it uses a 25,000 Khz clock, it can be concisely +stated by saying that it uses a 25 Mhz dot clock. +
            Similarly, the periods of +time involved in video timing are very short as they are typically small +fractions of a second.  The terms millisecond, microsecond, and nanosecond +are useful for expressing these small periods of time.  The term millisecond +(abbreviated ms) means one thousandth of a second, or 0.001 seconds.  +In one second, there are 1,000 milliseconds. This is used to describe things +such as the length of time a vertical refresh takes; for example, a 16.6 +millisecond vertical refresh means 16.6 thousandths of a second, or 0.0166 +seconds.  In one second, there are 1,000,000 microseconds.  The +term microsecond (abbreviated us) is used to describe something in terms +of millionths of a second, or 0.000001 second.  For example the length +of a horizontal refresh could be 31.7 microseconds, or 31.7 millionths +of a second, 0.0000317 second, or 0.0317 ms.  The term nanosecond +(abbreviated ns) is used to describe one billionth of a second, or 0.000000001 +seconds.  There are 1,000,000,000 nanoseconds in one second.  +One circumstance where this is used is to describe the period of time +one dot period takes.  For example, one dot period could be stated +as 40 nanoseconds, 0.04 us, 0.00004 ms, or 0.00000004 seconds.  In +each case, the unit that gives the shortest +description is used. +
            Because the unit hertz is +defined using a unit of time (second), the period of one cycle can be determined +by division.  The simplest example is 1 hz, where the length of the +cycle, by definition would be 1 second. For other values, it can be calculated +according to the following formula: +

      +
    • +Period (in seconds) = 1 / frequency (in hertz)
+        For example, a 60 hertz vertical +refresh would last 1 / 60 second, which is approximately 0.0166 seconds, +or 16.6 ms.  Similarly, a 31.5 Khz horizontal refresh would be 1 / +31500 second, which is approximately 0.0000317 seconds, or 31.7 us.  +A 25 Mhz dot clock would produce a dot period of 1 / 25000000 second, which +is 0.00000004 seconds, or 40 ns.  If the period of a cycle is known, +then the frequency can be calculated as: +
      +
    • +Frequency (in hertz) = 1 / Period (in seconds).
    +        For example, a 16.6 ms period +would equate to 1 / 0.0166, which produces a frequency of approximately +60 hz.  Similarly a 31.7 us period would produce approximately a 31.5 +Khz frequency, and a 40 ns period would produce a 25 Mhz frequency. + +
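    The two formulas are reciprocals of each other, which a short script makes easy to check against the figures quoted above. The function names are chosen for this example only.

```python
def period_from_frequency(hertz):
    """Period (in seconds) = 1 / frequency (in hertz)."""
    return 1.0 / hertz


def frequency_from_period(seconds):
    """Frequency (in hertz) = 1 / period (in seconds)."""
    return 1.0 / seconds


vertical = period_from_frequency(60)             # 60 hz vertical refresh
horizontal = period_from_frequency(31_500)       # 31.5 Khz horizontal refresh
dot_period = period_from_frequency(25_000_000)   # 25 Mhz dot clock

print(f"vertical refresh:   {vertical * 1e3:.3f} ms")
print(f"horizontal refresh: {horizontal * 1e6:.3f} us")
print(f"dot period:         {dot_period * 1e9:.3f} ns")
```

    Note that the text's figure of 16.6 ms is 1/60 second truncated; rounding gives 16.7 ms instead.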

    Horizontal Timing +
            From a monitor's standpoint, +the timing is fairly simple.  It detects the horizontal sync pulses +on the hsync line, then based on the polarity, frequency, and/or duration +of those pulses sets up its horizontal scan circuitry to scan across the +screen at the desired rate.  During this period it continuously displays +the signal input on the analog RGB pins/connectors.  It is important +to know the horizontal sync frequency ranges of the monitor, as well +as the acceptable width of the sync pulse for those sync ranges.  +If the width of the sync pulse is incorrect, it can make the displayed +information too large or too small, as well as possibly preventing the +monitor from synchronizing to the generated signal.  The acceptable +range of sync pulse width and polarity for a given frequency should be +given in the specifications for the monitor; however, this is frequently +overlooked by the manufacturer.  It is recommended to contact the +manufacturer; otherwise the result has to be determined by trial and error, +which can be damaging to the monitor's circuitry. +
            In addition to those that +control horizontal sync frequency and width, there are other horizontal +timing registers which tell the display generation hardware when to output +the active display, when to output the overscan, and when to perform blanking.  +The active display is when pixel data from the frame buffer are being output +and displayed.  This could also be overlaid by data from another source, +such as a TV or MPEG decoder, or a hardware cursor.  The overscan +is the border to the left and right of the screen.  This was more +important on older video hardware such as those monitors lacking horizontal +and vertical picture controls, and is provided for compatibility reasons, +although current hardware typically removes the need for this portion completely.  +The blanking period is used during the retrace portion of the horizontal +display cycle, which is the period in which the horizontal scan sweeps from the +right of the screen back to the left.  Non-zero intensities output +during this period would end up being stretched, in reverse, across the +end of the current scan line to the beginning of the next scan line, which, +while interesting and possibly useful in a small number of circumstances, +would add a bit of blurring to the image.  Blanking is signaled to +the monitor by sending zero intensities of the red, green, and blue components +on the analog lines. +
            In the display generator,  +horizontal timings are specified by the number of dot periods they take.  +The dot period is controlled by selecting the desired dot clock frequency +by programming certain registers. + +

    Vertical Timing +
            Vertical timing is nearly +the same as the horizontal timing, except that it controls the vertical +movement of the display scan, spacing the scan lines the right distance apart +so that they seem to form a rectangular image.  The monitor detects +the vertical sync pulses on the vsync line, then based on the polarity, +frequency, and/or duration of those pulses sets up its vertical circuitry +to scan down the screen at the desired rate.  It is necessary to know +the vertical sync frequency ranges for a given monitor, and the range of +acceptable vertical sync widths and polarities for those ranges.  +The range of vertical sync frequencies supported by the monitor is nearly +always given by the monitor's specifications, but like the horizontal sync +widths, the vertical sync widths are not commonly specified.  Contact +the manufacturer, as attempting to guess the correct vertical sync width +can possibly cause the monitor to fail. +
            As well as being programmed +with the vertical sync frequency and pulse width, the display generation +hardware has other registers which control when to output the active display, +when to output the overscan, and when to perform blanking. In vertical +terms, the active display is the scan lines which contain horizontal active +display periods.  The overscan is the border on top and bottom of +the screen and, if present, consists of one or more entire scan lines in +which the active display period is replaced with the overscan color.  +The blanking is used during the vertical retrace, and consists of one or +more (usually more) scan lines in which the active display and overscan +periods are replaced with blanking period, making the entire line effectively +blanking.  This prevents intensity from overlaying the screen during +the vertical retrace where the monitor sweeps the vertical back to the +top of the screen.  Non-zero intensities output during this period +would be written in a zig-zag pattern from the bottom to the top of the +screen.  In the display generator, the vertical timings are specified +in terms of how many horizontal sync periods they take. +
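    Because horizontal timings are counted in dot periods and vertical timings in horizontal periods, both refresh rates follow from the dot clock and two totals. The sketch below uses the 25 Mhz dot clock mentioned earlier; the totals of 800 dot periods per line and 525 lines per frame (active display plus overscan, blanking, and sync) are assumptions chosen for illustration, not values taken from this page.

```python
def refresh_rates(dot_clock_hz, dots_per_line, lines_per_frame):
    """Return (horizontal, vertical) refresh frequencies in hertz.

    dots_per_line and lines_per_frame are the totals for a full cycle,
    including overscan, blanking, and sync, not just the active display.
    """
    horizontal_hz = dot_clock_hz / dots_per_line
    vertical_hz = horizontal_hz / lines_per_frame
    return horizontal_hz, vertical_hz


horizontal_hz, vertical_hz = refresh_rates(25_000_000, 800, 525)
print(f"horizontal: {horizontal_hz / 1000:.2f} Khz")
print(f"vertical:   {vertical_hz:.2f} hz")
```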
      +
    Programming Considerations +
            For maximum flexibility, +video timings should be configurable by the end users to allow for the +specifications of their monitor.  However, it is probably a wise idea +to maintain a table of monitors and their rated specifications, to allow +the users to select their monitor, determine whether or not the configured +video timings are within the rated specifications of their monitor, and +warn the user if they are not.  There is a distinct need for a comprehensive +and accurate software-usable database of monitor specifications in a platform +and video hardware independent form with sufficient information for a program +to both select timings for a particular video mode, as well as verify that +a given set of timings will function properly on the end-user's hardware.  +This database should contain a human readable description of the monitor +make and model, as well as software parsable fields giving corresponding +ranges of horizontal and vertical frequencies and sync polarities for those +ranges if applicable, as well as a method of determining the acceptable +widths of horizontal and vertical sync pulses for a given frequency in +the corresponding ranges.  Framing information could be included in +this table in a frequency independent fashion, although this is something +that can be safely adjusted by the end user without risk of damage to the +monitor; thus it is preferable to provide a method or interface for the +end-user to adjust these parameters to their preference. +
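    One possible shape for such a table entry, and for the check that produces the warning, is sketched below. Everything here is hypothetical: the record layout, the field names, and the example monitor with its frequency ranges are invented to illustrate the idea, and sync widths and polarities are left out for brevity.

```python
from dataclasses import dataclass


@dataclass
class MonitorSpec:
    """Hypothetical database record for one monitor."""
    description: str       # human readable make and model
    hsync_range_hz: tuple  # (minimum, maximum) horizontal frequency
    vsync_range_hz: tuple  # (minimum, maximum) vertical frequency


def check_timings(spec, horizontal_hz, vertical_hz):
    """Return a list of warnings; an empty list means within spec."""
    warnings = []
    low, high = spec.hsync_range_hz
    if not low <= horizontal_hz <= high:
        warnings.append(f"horizontal {horizontal_hz} hz outside {low}-{high} hz")
    low, high = spec.vsync_range_hz
    if not low <= vertical_hz <= high:
        warnings.append(f"vertical {vertical_hz} hz outside {low}-{high} hz")
    return warnings


# Made-up ranges, not the specifications of any real monitor.
example = MonitorSpec("Example multisync monitor", (30_000, 50_000), (50, 90))
print(check_timings(example, 31_500, 60))
print(check_timings(example, 80_000, 60))
```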
      + +

    Notice: All trademarks used or referred to on this page are the property +of their respective owners. +
    All pages are Copyright © 1997, 1998, J. D. Neal, except where +noted. Permission for utilization and distribution is subject to the terms +of the FreeVGA Project Copyright License. +
      +
      + + diff --git a/specs/kbd/abnt-keypad.html b/specs/kbd/abnt-keypad.html new file mode 100644 index 0000000..a0224bf --- /dev/null +++ b/specs/kbd/abnt-keypad.html @@ -0,0 +1,88 @@ + + + + ABNT keypad + + +

    ABNT keyboard keypad layout

    +
    Key label and scancode (hex) and Linux keycode (decimal) +for a Brazilian ABNT keyboard keypad.
    +

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Key         Scancode (hex)   Linux keycode (decimal)
    Num Lock    45               69
    /           e0 45            98
    *           37               55
    -           4a               74
    7 Home      47               71
    8 Up        48               72
    9 PgUp      49               73
    +           4e               78
    4 Left      4b               75
    5           4c               76
    6 Right     4d               77
    .           7e               121
    1 End       4f               79
    2 Down      50               80
    3 PgDn      51               81
    Enter       e0 1c            96
    0 Ins       52               82
    , Del       53               83
    + + diff --git a/specs/kbd/amstrad-s.jpg b/specs/kbd/amstrad-s.jpg new file mode 100644 index 0000000..cd30a3d Binary files /dev/null and b/specs/kbd/amstrad-s.jpg differ diff --git a/specs/kbd/amstrad.jpg b/specs/kbd/amstrad.jpg new file mode 100644 index 0000000..b9dedc0 Binary files /dev/null and b/specs/kbd/amstrad.jpg differ diff --git a/specs/kbd/compaq_easy_access.jpg b/specs/kbd/compaq_easy_access.jpg new file mode 100644 index 0000000..cf6fb37 Binary files /dev/null and b/specs/kbd/compaq_easy_access.jpg differ diff --git a/specs/kbd/compaq_unkn-s.jpg b/specs/kbd/compaq_unkn-s.jpg new file mode 100644 index 0000000..43028b1 Binary files /dev/null and b/specs/kbd/compaq_unkn-s.jpg differ diff --git a/specs/kbd/compaq_unkn.jpg b/specs/kbd/compaq_unkn.jpg new file mode 100644 index 0000000..ea8e074 Binary files /dev/null and b/specs/kbd/compaq_unkn.jpg differ diff --git a/specs/kbd/ibm_rapid_access.jpg b/specs/kbd/ibm_rapid_access.jpg new file mode 100644 index 0000000..7b3bd3d Binary files /dev/null and b/specs/kbd/ibm_rapid_access.jpg differ diff --git a/specs/kbd/ibm_rapid_access_II.jpg b/specs/kbd/ibm_rapid_access_II.jpg new file mode 100644 index 0000000..467b8ee Binary files /dev/null and b/specs/kbd/ibm_rapid_access_II.jpg differ diff --git a/specs/kbd/imb5576-2.jpg b/specs/kbd/imb5576-2.jpg new file mode 100644 index 0000000..9e50ef4 Binary files /dev/null and b/specs/kbd/imb5576-2.jpg differ diff --git a/specs/kbd/jp106-with-scancodes.jpg b/specs/kbd/jp106-with-scancodes.jpg new file mode 100644 index 0000000..73aebfc Binary files /dev/null and b/specs/kbd/jp106-with-scancodes.jpg differ diff --git a/specs/kbd/jp106.jpg b/specs/kbd/jp106.jpg new file mode 100644 index 0000000..b4feb26 Binary files /dev/null and b/specs/kbd/jp106.jpg differ diff --git a/specs/kbd/jplaptop.jpg b/specs/kbd/jplaptop.jpg new file mode 100644 index 0000000..f4147da Binary files /dev/null and b/specs/kbd/jplaptop.jpg differ diff --git a/specs/kbd/lk201-k.gif 
b/specs/kbd/lk201-k.gif new file mode 100644 index 0000000..584faeb Binary files /dev/null and b/specs/kbd/lk201-k.gif differ diff --git a/specs/kbd/lk411-left.jpg b/specs/kbd/lk411-left.jpg new file mode 100644 index 0000000..4289e69 Binary files /dev/null and b/specs/kbd/lk411-left.jpg differ diff --git a/specs/kbd/lk411-right.jpg b/specs/kbd/lk411-right.jpg new file mode 100644 index 0000000..96366a2 Binary files /dev/null and b/specs/kbd/lk411-right.jpg differ diff --git a/specs/kbd/lk411-s.jpg b/specs/kbd/lk411-s.jpg new file mode 100644 index 0000000..be3595f Binary files /dev/null and b/specs/kbd/lk411-s.jpg differ diff --git a/specs/kbd/lk411.jpg b/specs/kbd/lk411.jpg new file mode 100644 index 0000000..fbf7bb2 Binary files /dev/null and b/specs/kbd/lk411.jpg differ diff --git a/specs/kbd/logitech-access.jpg b/specs/kbd/logitech-access.jpg new file mode 100644 index 0000000..94f6cef Binary files /dev/null and b/specs/kbd/logitech-access.jpg differ diff --git a/specs/kbd/logitech-internet-s.jpg b/specs/kbd/logitech-internet-s.jpg new file mode 100644 index 0000000..0e19a3b Binary files /dev/null and b/specs/kbd/logitech-internet-s.jpg differ diff --git a/specs/kbd/logitech-internet.jpg b/specs/kbd/logitech-internet.jpg new file mode 100644 index 0000000..5eb8588 Binary files /dev/null and b/specs/kbd/logitech-internet.jpg differ diff --git a/specs/kbd/m24.jpg b/specs/kbd/m24.jpg new file mode 100644 index 0000000..e808879 Binary files /dev/null and b/specs/kbd/m24.jpg differ diff --git a/specs/kbd/m24kbd.png b/specs/kbd/m24kbd.png new file mode 100644 index 0000000..d23719d Binary files /dev/null and b/specs/kbd/m24kbd.png differ diff --git a/specs/kbd/ms_office.jpg b/specs/kbd/ms_office.jpg new file mode 100644 index 0000000..d0f2b4a Binary files /dev/null and b/specs/kbd/ms_office.jpg differ diff --git a/specs/kbd/ncr-s.jpg b/specs/kbd/ncr-s.jpg new file mode 100644 index 0000000..5bcf80e Binary files /dev/null and b/specs/kbd/ncr-s.jpg differ diff --git 
a/specs/kbd/nokia-left.jpg b/specs/kbd/nokia-left.jpg new file mode 100644 index 0000000..21590fb Binary files /dev/null and b/specs/kbd/nokia-left.jpg differ diff --git a/specs/kbd/nokia-right.jpg b/specs/kbd/nokia-right.jpg new file mode 100644 index 0000000..2b26ed9 Binary files /dev/null and b/specs/kbd/nokia-right.jpg differ diff --git a/specs/kbd/nokia-s.jpg b/specs/kbd/nokia-s.jpg new file mode 100644 index 0000000..ec4b134 Binary files /dev/null and b/specs/kbd/nokia-s.jpg differ diff --git a/specs/kbd/nokia-top.jpg b/specs/kbd/nokia-top.jpg new file mode 100644 index 0000000..678df28 Binary files /dev/null and b/specs/kbd/nokia-top.jpg differ diff --git a/specs/kbd/nokia.jpg b/specs/kbd/nokia.jpg new file mode 100644 index 0000000..bcff1f6 Binary files /dev/null and b/specs/kbd/nokia.jpg differ diff --git a/specs/kbd/samsung-s.jpg b/specs/kbd/samsung-s.jpg new file mode 100644 index 0000000..b505fc0 Binary files /dev/null and b/specs/kbd/samsung-s.jpg differ diff --git a/specs/kbd/samsung.jpg b/specs/kbd/samsung.jpg new file mode 100644 index 0000000..bf25640 Binary files /dev/null and b/specs/kbd/samsung.jpg differ diff --git a/specs/kbd/scancodes-1.html b/specs/kbd/scancodes-1.html new file mode 100644 index 0000000..dff2811 --- /dev/null +++ b/specs/kbd/scancodes-1.html @@ -0,0 +1,418 @@ + + + + + Keyboard scancodes: Keyboard scancodes + + + + + +Next +Previous +Contents +


    +

    1. Keyboard scancodes

    + +

    The data from a keyboard comes mainly in the form of scancodes, +produced by key presses or used in the protocol with the computer. +( +Different codes are used by the keyboard +firmware internally, and there also exist several +sets of scancodes. +Here in this section we only talk about the default codes - those from +translated scancode set 2. Less common modes are discussed +below.) +Each key press and key release produces between 0 and 6 scancodes. +

    +

    1.1 Key release +

    + +

    Below I'll only mention the scancode for key press (`make'). +The scancode for key release (`break') is obtained from it +by setting the high order bit (adding 0x80 = 128). +Thus, Esc press produces scancode 01, Esc release +scancode 81 (hex). +For sequences things are similar: Keypad-/ gives e0 35 +when pressed, e0 b5 when released. Most keyboards will +repeat the make code (key down code) when the key repeats. Some will also +fake Shift down and Shift up events during the repeat. +
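    The make/break relationship is a single bit operation. A minimal sketch, with helper names invented for this example; for prefixed sequences it sets the high bit on the final byte only, which is how the Keypad-/ example behaves.

```python
def break_code(make_code):
    """Break code for a single-byte make code: set the high order bit."""
    return make_code | 0x80


def is_break(code):
    """True if a (non-prefix) scancode has the high order bit set."""
    return bool(code & 0x80)


def break_sequence(make_sequence):
    """Break sequence for a prefixed make sequence such as e0 35."""
    *prefix, last = make_sequence
    return [*prefix, break_code(last)]


print(f"{break_code(0x01):02x}")
print(" ".join(f"{b:02x}" for b in break_sequence([0xE0, 0x35])))
```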

    The keys PrtSc/SysRq and Pause/Break are special. +The former produces scancode e0 2a e0 37 +when no modifier key is pressed simultaneously, e0 37 +together with Shift or Ctrl, but 54 together with (left or right) Alt. +(And one gets the expected sequences upon release. But see +below.) +The latter produces scancode sequence +e1 1d 45 e1 9d c5 +when pressed (without modifier) and nothing at all upon release. +However, together with (left or right) Ctrl, one gets +e0 46 e0 c6, +and again nothing at release. It does not repeat. +

    See +below for a report on keys +with a different behaviour. +

    There are many reports of laptops with badly debounced key-up events. +Thus, unexpected key-up events should probably be regarded as not +unusual, and be ignored. Another source of key-up events without +preceding key-down can be the +fake shift. +

    +

    1.2 Protocol scancodes +

    + +

    Most scancodes indicate a key press or release. +Some are used in the communication protocol. +

    +

    +

    +00 Keyboard error - see ff
    +aa BAT (Basic Assurance Test) OK
    +ee Result of echo command
    +f1 Some keyboards, as reply to command a4:Password not installed
    +fa Acknowledge from kbd
    +fc BAT error or Mouse transmit error
    +fd Internal failure
    +fe Keyboard fails to ack, please resend
    +ff Keyboard error
    + +
    +

    Three common causes for keyboard error are: +(i) several keys pressed simultaneously, +(ii) keyboard buffer overflow, +(iii) parity error on the serial line used by keyboard +and keyboard controller for communication. +The error reported is ff in +scancode mode 1, +and 00 in scancode modes 2 and 3. +If translation is on, both 00 and ff +are translated as ff. +

    Usually these codes have the protocol meaning. However, +they also occur as actual scancodes, especially when +prefixed by e0. +

    +

    1.3 Escape scancodes +

    + +

    The codes e0 and e1 introduce scancode sequences, +and are not usually used as isolated scancodes themselves +(but see +below). +

    (The prefix e0 was originally used for the grey duplicates +of keys on the original PC/XT keyboard. These days e0 is +just used to expand code space. The prefix e1 used for +Pause/Break indicated that this key sends the make/break sequence +at make time, and does nothing upon release.) +

    This, and the above, means that the values +00, 60, 61, 6e, 71, +7a, 7c, 7e, 7f +are unavailable to signify key presses (on a default keyboard). +Nevertheless they also occur as scancodes, see for example the +Telerate and +Safeway SW23 keyboards below. +

    Also other prefixes occur, see +below. +

    +Logitech uses an e2 prefix +for the codes sent by a pointing device integrated on the keyboard. +

    +

    +

    1.4 Ordinary scancodes +

    + +

    The scancodes in translated scancode set 2 are given in hex. +Between parentheses the keycap on a US keyboard. +The scancodes are given in order, grouped according +to groups of keys that are usually found next to each other. +

    00 is normally an error code +

    01 (Esc) +

    02 (1!), 03 (2@), 04 (3#), 05 (4$), +06 (5%E), 07 (6^), 08 (7&), +09 (8*), 0a (9(), 0b (0)), 0c (-_), +0d (=+), 0e (Backspace) +

    0f (Tab), 10 (Q), 11 (W), 12 (E), +13 (R), 14 (T), 15 (Y), +16 (U), 17 (I), 18 (O), +19 (P), 1a ([{), 1b (]}) +

    1c (Enter) +

    1d (LCtrl) +

    1e (A), 1f (S), 20 (D), 21 (F), +22 (G), 23 (H), 24 (J), 25 (K), +26 (L), 27 (;:), 28 ('") +

    29 (`~) +

    2a (LShift) +

    2b (\|), on a 102-key keyboard +

    2c (Z), 2d (X), 2e (C), 2f (V), +30 (B), 31 (N), 32 (M), 33 (,<), +34 (.>), 35 (/?), 36 (RShift) +

    37 (Keypad-*) or (*/PrtScn) on a 83/84-key keyboard +

    38 (LAlt), 39 (Space bar), +

    3a (CapsLock) +

    3b (F1), 3c (F2), 3d (F3), 3e (F4), +3f (F5), 40 (F6), 41 (F7), +42 (F8), 43 (F9), 44 (F10) +

    45 (NumLock) +

    46 (ScrollLock) +

    47 (Keypad-7/Home), 48 (Keypad-8/Up), +49 (Keypad-9/PgUp) +

    4a (Keypad--) +

    4b (Keypad-4/Left), 4c (Keypad-5), +4d (Keypad-6/Right), 4e (Keypad-+) +

    4f (Keypad-1/End), 50 (Keypad-2/Down), +51 (Keypad-3/PgDn) +

    52 (Keypad-0/Ins), 53 (Keypad-./Del) +

    54 (Alt-SysRq) on a 84+ key keyboard +

    55 is less common; occurs e.g. as F11 on a Cherry G80-0777 keyboard, +as F12 on a Telerate keyboard, +as PF1 on a Focus 9000 keyboard, and as FN on an IBM ThinkPad. +

    56 mostly on non-US keyboards. It is often an unlabelled key +to the left +or +to the right +of the left Alt key.
    +

    + + +
    + +
    + + +
    +

    57 (F11), 58 (F12) both on a 101+ key keyboard +

    59-5a-...-7f are less common. +Assignment is essentially random. +Scancodes 55-59 occur as F11-F15 on the +Cherry G80-0777 keyboard. +Scancodes 59-5c occur on the +RC930 keyboard. +X calls 5d `KEY_Begin'. +Scancodes 61-64 occur on a +Telerate keyboard. +Scancodes 55, 6d, 6f, 73, 74, +77, 78, 79, 7a, 7b, +7c, 7e occur on the +Focus 9000 keyboard. +Scancodes 65, 67, 69, 6b +occur on a +Compaq Armada keyboard. +Scancodes 66-68, 73 occur on the +Cherry G81-3000 keyboard. +Scancodes 70, 73, 79, 7b, 7d +occur on a +Japanese 86/106 keyboard. +

    Scancodes f1 and f2 occur on +Korean keyboards. +

    +

    1.5 Escaped scancodes +

    + +

    Apart from the Pause/Break key, that has an escaped sequence starting +with e1, the escape used is e0. Often, the codes +are chosen in such a way that something meaningful happens when +the receiver just discards the e0. +

    +

    +e0 1c (Keypad Enter)    1c (Enter)
    +e0 1d (RCtrl)           1d (LCtrl)
    +e0 2a (fake LShift)     2a (LShift)
    +e0 35 (Keypad-/)        35 (/?)
    +e0 36 (fake RShift)     36 (RShift)
    +e0 37 (Ctrl-PrtScn)     37 (*/PrtScn)
    +e0 38 (RAlt)            38 (LAlt)
    +e0 46 (Ctrl-Break)      46 (ScrollLock)
    +e0 47 (Grey Home)       47 (Keypad-7/Home)
    +e0 48 (Grey Up)         48 (Keypad-8/UpArrow)
    +e0 49 (Grey PgUp)       49 (Keypad-9/PgUp)
    +e0 4b (Grey Left)       4b (Keypad-4/Left)
    +e0 4d (Grey Right)      4d (Keypad-6/Right)
    +e0 4f (Grey End)        4f (Keypad-1/End)
    +e0 50 (Grey Down)       50 (Keypad-2/DownArrow)
    +e0 51 (Grey PgDn)       51 (Keypad-3/PgDn)
    +e0 52 (Grey Insert)     52 (Keypad-0/Ins)
    +e0 53 (Grey Delete)     53 (Keypad-./Del)
    + +
    +

    These escaped scancodes occur only on 101+ key keyboards. +The +Microsoft keyboard adds +

    +

    +e0 5b (LeftWindow)
    +e0 5c (RightWindow)
    +e0 5d (Menu)
    + +
    +

    Other escaped scancodes occur - see below under the individual keyboards. +
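A receiver that wants to tell RCtrl from LCtrl has to remember the e0 prefix instead of discarding it. The following C sketch shows one way to do that; the types `struct key_event` and `struct sc_decoder` and the function `sc_decode` are hypothetical names, and the sketch assumes that e1 sequences (Pause/Break) and protocol bytes are handled elsewhere.

```c
#include <stdbool.h>
#include <stdint.h>

/* One decoded key event. */
struct key_event {
    uint8_t code;    /* make code with the high bit stripped */
    bool extended;   /* true if the code arrived with an e0 prefix */
    bool released;   /* true for a break code */
};

/* Decoder state: remembers whether the previous byte was e0. */
struct sc_decoder {
    bool saw_e0;
};

/* Feed one byte from the keyboard.  Returns true and fills *EV when
   the byte completes an event, false when it was only a prefix. */
bool sc_decode(struct sc_decoder *d, uint8_t byte, struct key_event *ev)
{
    if (byte == 0xe0) {
        d->saw_e0 = true;
        return false;
    }
    ev->code = byte & 0x7f;
    ev->released = (byte & 0x80) != 0;
    ev->extended = d->saw_e0;
    d->saw_e0 = false;
    return true;
}
```

A caller that ignores `extended` gets exactly the "discard the e0" behaviour described above: e0 1d is then seen as LCtrl.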

    +

    1.6 Fake shifts +

    + +

    The ten grey keys Insert, Home, PgUp, Delete, End, PgDn, +Up, Left, Down, Right are supposed to function regardless +of the state of Shift and NumLock keys. But for an old AT keyboard +the keypad keys would produce digits when Numlock was on or Shift +was down. Therefore, in order to fool old programs, +fake scancodes are sent: when LShift is down, and Insert is +pressed, e0 aa e0 52 is sent; +upon release of Insert e0 d2 e0 2a +is sent. In other words, a fake LShift-up and +fake LShift-down are inserted. +

    If the Shift key is released earlier than the repeated key, +then a real Shift-up code occurs (without preceding fake Shift-down) +so that a program ignoring e0 would see one more Shift-up +than Shift-down. +

    When NumLock is on, no fake Shifts are sent when Shift was down, +but fake Shifts are sent when Shift was not down. Thus, +with Numlock, if Insert is pressed, +e0 2a e0 52 is sent +and upon release e0 d2 e0 aa is sent. +The keyboard maintains a private NumLock mode, toggled when +NumLock is pressed, and set when the NumLock LED is set. +

    In the same way, when Shift is down, the Grey-/ key produces +fake Shift-up and fake Shift-down sequences. However, it does +not react to the state of NumLock. The purpose of course is to +fool programs that identify Grey-/ with ordinary /, so that they +don't treat Shift-Grey-/ like Shift-/, i.e., ?. +

    On a Toshiba notebook, the three Windows keys are treated like +the group of ten keys mentioned, and get fake shifts when +(left or right) Shift is down. They do not react to NumLock. +
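A driver that tracks the real Shift state may prefer to drop the fake shifts rather than count them. A minimal C sketch, assuming the caller already knows whether the byte was preceded by e0; the function name `sc_is_fake_shift` is hypothetical.

```c
#include <stdbool.h>
#include <stdint.h>

/* True if BYTE, received after an e0 prefix (EXTENDED), is one of the
   fake shift codes: e0 2a / e0 aa (fake LShift down/up) or
   e0 36 / e0 b6 (fake RShift down/up). */
bool sc_is_fake_shift(bool extended, uint8_t byte)
{
    uint8_t make = byte & 0x7f;   /* ignore the make/break bit */
    return extended && (make == 0x2a || make == 0x36);
}
```

With this filter the sequence e0 aa e0 52 for Shift+Insert reduces to the single event e0 52, and the real Shift-up that may follow is no longer unbalanced by a fake one.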

    +

    1.7 Added non-fake shifts +

    + +

    On my 121-key +Nokia Data keyboard there are +function keys F1, ..., F24, where F1, ..., F12 send the expected codes +3b, ..., 58, and F13, ..., F24 send the same codes +together with the LShift code 2a. +Thus, F13 gives 2a 3b on press, +and bb aa on release. +Similarly, there are keys with added LCtrl code 1d. +But there are also keys with added fake shifts e0 2a. +

    +Delorie +reports that the "Preh Commander AT" keyboard with additional F11-F22 keys +treats F11-F20 as Shift-F1..Shift-F10 and F21/F22 as Ctrl-F1/Ctrl-F2; the +Eagle PC-2 keyboard with F11-F24 keys treats those additional keys +in the same way. +

    +

    1.8 Turbo Mode +

    + +

    On some motherboards the LCtrl-LAlt-GreyPlus and LCtrl-LAlt-GreyMinus +switch Turbo mode on/off, respectively. For these, the motherboard +may generate the same scancode sequence when the Turbo button is +pushed: Turbo Switch (High->Low): +1d 38 4a ce b8 9d +and Turbo Switch (Low->High): +1d 38 4e ce b8 9d. +

    Other peculiar combinations in this style include +LCtrl-LAlt-LShift-GreyMinus and LCtrl-LAlt-LShift-GreyPlus to turn +system cache off/on. +

    If Green PC system power saving mode is enabled in AMIBIOS Setup, +the AMI MegaKey keyboard controller recognizes the combinations +Ctrl-Alt-\ (put the system into immediate power down mode), +Ctrl-Alt-[ (disable the Green PC power savings mode temporarily), +Ctrl-Alt-] (enable the Green PC power down mode). +

    Thio Yu Jin <jin@singmail.com> complains that on his Toshiba 4010CDS +the Ctrl-Alt-Shift-T key combination brings up the Toshiba user manual. +(04 Mar 1999 - not April 1.) +

    +

    +

    1.9 Power Saving +

    + +

    +Microsoft recommends: "i8042-based keyboards should deploy the +following scan codes for power management buttons, i.e., POWER and SLEEP +buttons: +

    +

    +        Set-1 make/break    Set-2 make/break
    +
    +Power   e0 5e / e0 de       e0 37 / e0 f0 37
    +Sleep   e0 5f / e0 df       e0 3f / e0 f0 3f
    +Wake    e0 63 / e0 e3       e0 5e / e0 f0 5e
    + +
    +

    The Power, Sleep, and Wake event scan codes are the i8042 equivalents +to the System Power Down, System Sleep, and System Wake Up HID usages". +

    Many keyboards have Power/Sleep/Wake keys that have to be +activated by a fourth key (unlabeled, or labeled FN): pressing +one of these four keys does not produce any scancodes, but +when the FN key is pressed simultaneously, the Power/Sleep/Wake +keys give the codes listed above. +

    +

    +

    1.10 Initializing special keyboards +

    + +

    Many keyboards have more keys and buttons than the standard ones. +Sometimes these additional keys produce scancode combinations +that were unused before. But on other keyboards such additional +keys do not produce any code at all, until some initializing +action is taken. +

    Sometimes that action consists of writing some bytes to keyboard +registers. See, for example, the +IBM Rapid Access keyboard, and the +Omnibook keyboard. +

    +

    1.11 Manipulating extra LEDs +

    + +

    Some keyboards have additional LEDs, and in a few cases we know +how to manipulate those. +

    The +Chicony keyboard needs command sequences +eb 00 xy, with +xy = 01 for the Moon LED and +xy = 02 for the zzZ LED. +

    The +IBM EZ Button keyboard needs +command sequences eb 00 xy, with +xy = 01 for the Msg LED, +xy = 02 for the CD LED, +xy = 04 for the Power LED, +xy = 10 for the Talk LED, and +xy = 20 for the Message Waiting LED. +

    The +IBM Rapid Access keyboard needs +command sequences eb 00 xy, with +xy = 04 for the Suspend LED and +xy = 20 for the Mute LED. +

    The +IBM Rapid Access keyboard II needs +the command sequences eb 71 and eb 70 +to switch the Standby LED on and off. +

    The +Logitech Internet Keyboard +has an additional amber LED. It is turned on by sending eb, +and then blinks about once a second. It is turned off again by ec. +

    +

    1.12 The laptop FN key +

    + +

    Laptops have no room for all the nonsensical keys one usually finds +on a regular keyboard. So, the number pad and other keys are +folded into the main part of the keyboard. A key without a label, +or labelled FN, is often used to modify the meaning of other keys. +This FN key does not produce scancodes itself; it only modifies the +scancodes produced by other keys. +

    + +Neil Brown reports about his Dell Latitude D800 laptop that it has +five key combinations that do not produce proper break codes. +The five combinations FN+F2, FN+F3, FN+F10, FN+Down, FN+Up +(labelled Wireless, Brighter, Darker, Battery, CDEject) +produce make codes e0 08, e0 07, +e0 09, e0 05, e0 06, +respectively. The first three do not produce any break code. +The last two have a break code that is identical to the make code. +

    +


    diff --git a/specs/kbd/scancodes-10.html b/specs/kbd/scancodes-10.html new file mode 100644 index 0000000..b3916b8 --- /dev/null +++ b/specs/kbd/scancodes-10.html @@ -0,0 +1,805 @@ + + + + + Keyboard scancodes: The AT keyboard controller + + + + +
    +

    10. The AT keyboard controller

    + +

    A user program can talk to the keyboard controller on the motherboard. +The keyboard controller can in turn talk to the keyboard. +

    When a key is pressed the keyboard sends the corresponding +keyboard scancode to the keyboard controller, and the keyboard controller +translates that and interrupts the CPU, allowing the CPU to read the result. +

    In more detail: when a key is pressed, the keyboard sends +a start bit (low), followed by 8 data bits for the keyboard scancode +of the key (least significant first), followed by an odd parity bit, +followed by a stop bit (high). +The keyboard controller reads the data and checks the parity. +If it is incorrect, retransmission is requested. If it is incorrect again, +a parity error is reported. +If the time between request to send and start of transmission is greater +than 15 ms, or if the eleven bits are not received within 2 ms, +a timeout is reported. +In both cases (parity error or timeout), the data byte is set to 0xff. +
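The eleven-bit frame can be modelled in C as follows. This is only a software model of the framing and the odd-parity check, not controller firmware; the function names and the choice to store the first transmitted bit in bit 0 of a `uint16_t` are assumptions made for the illustration.

```c
#include <stdint.h>

/* Odd parity bit for DATA: chosen so that the eight data bits plus
   the parity bit contain an odd number of ones. */
unsigned kbd_odd_parity(uint8_t data)
{
    unsigned ones = 0;
    for (unsigned i = 0; i < 8; i++)
        ones += (data >> i) & 1u;
    return (ones & 1u) ^ 1u;
}

/* Build the 11-bit frame.  Bit 0 is the start bit (0), bits 1-8 are
   the data bits, least significant first, bit 9 is the parity bit and
   bit 10 is the stop bit (1). */
uint16_t kbd_frame_encode(uint8_t data)
{
    return (uint16_t)(((unsigned)data << 1)
                      | (kbd_odd_parity(data) << 9)
                      | (1u << 10));
}

/* Check a received frame.  Returns 0 and stores the data byte in
   *DATA on success, -1 on a framing or parity error. */
int kbd_frame_decode(uint16_t frame, uint8_t *data)
{
    unsigned start = frame & 1u;
    unsigned parity = (frame >> 9) & 1u;
    unsigned stop = (frame >> 10) & 1u;
    uint8_t d = (uint8_t)(frame >> 1);

    if (start != 0 || stop != 1 || parity != kbd_odd_parity(d))
        return -1;
    *data = d;
    return 0;
}
```

Scancode 01 has a single one bit, so its parity bit is 0; flipping the parity bit of an encoded frame makes `kbd_frame_decode` reject it.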

    The keyboard controller has three 8-bit registers involved in +communication with the CPU: its input buffer, that can be written +by the CPU by writing port 0x60 or port 0x64; its output buffer, +that can be read by the CPU by reading from port 0x60; and the +status register, that can be read by the CPU by reading from port 0x64. +

    If the CPU writes to port 0x64, the byte is interpreted as a command byte. +If the CPU writes to port 0x60, the byte is interpreted as a data byte. +

    The keyboard controller has two 8-bit I/O ports involved in +communication with the keyboard: the +input port P1 (receiving input from the keyboard) +and the +output port P2 (for sending output +to the keyboard). +

    +

    10.1 The keyboard controller status register +

    + +

    The keyboard controller has an 8-bit status register. +It can be inspected by the CPU by reading port 0x64. +

    (Typically, it has the value 0x14: keyboard not locked, self-test completed.) +

    +

    +PARE +TIM +AUXB +KEYL +C/D +SYSF +INPB +OUTB
    + +
    +

    Bit 7: + Parity error +

    +

    +0: OK. +1: Parity error with last byte. +
    +

    Bit 6: + Timeout +

    +

    +0: OK. +1: Timeout. +On PS/2 systems: General timeout. +On AT systems: Timeout on transmission from keyboard to keyboard controller. +Possibly parity error (in which case both bits 6 and 7 are set). +
    +

    Bit 5: + Auxiliary output buffer full +

    +

    +On PS/2 systems: +Bit 0 tells whether a read from port 0x60 will be valid. +If it is valid, this bit 5 tells what data will be read from port 0x60. +0: Keyboard data. 1: Mouse data. +

    On AT systems: +0: OK. +1: Timeout on transmission from keyboard controller to keyboard. +This may indicate that no keyboard is present. +

    +

    Bit 4: + Keyboard lock +

    +

    +0: Locked. +1: Not locked. +
    +

    Bit 3: + Command/Data +

    +

    +0: Last write to input buffer was data (written via port 0x60). +1: Last write to input buffer was a command (written via port 0x64). +(This bit is also referred to as Address Line A2.) +
    +

    Bit 2: + System flag +

    +

    +Set to 0 after power on reset. +Set to 1 after successful completion of the keyboard controller self-test +(Basic Assurance Test, BAT). +Can also be set by command (see +below). +
    +

    Bit 1: + Input buffer status +

    +

    +0: Input buffer empty, can be written. +1: Input buffer full, don't write yet. +
    +

    Bit 0: + Output buffer status +

    +

    +0: Output buffer empty, don't read yet. +1: Output buffer full, can be read. +(In the PS/2 situation bit 5 tells whether the available data is +from keyboard or mouse.) +This bit is cleared when port 0x60 is read. +
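The bit layout above translates directly into C masks. The sketch below is an illustration under the PS/2 interpretation of bit 5; the enumerator names follow the abbreviations in the layout row, while `kbc_pending_source` is a hypothetical helper.

```c
#include <stdint.h>

/* Status register bits, read from port 0x64. */
enum kbc_status_bits {
    KBC_ST_OUTB = 1 << 0,   /* output buffer full, port 0x60 can be read */
    KBC_ST_INPB = 1 << 1,   /* input buffer full, do not write yet */
    KBC_ST_SYSF = 1 << 2,   /* system flag, set after self-test */
    KBC_ST_CD   = 1 << 3,   /* last write was a command (port 0x64) */
    KBC_ST_KEYL = 1 << 4,   /* 1: keyboard not locked */
    KBC_ST_AUXB = 1 << 5,   /* PS/2: pending data is from the mouse */
    KBC_ST_TIM  = 1 << 6,   /* timeout */
    KBC_ST_PARE = 1 << 7    /* parity error */
};

enum kbc_source { KBC_SRC_NONE, KBC_SRC_KEYBOARD, KBC_SRC_MOUSE };

/* Tell where the byte waiting at port 0x60 comes from, if any. */
enum kbc_source kbc_pending_source(uint8_t status)
{
    if (!(status & KBC_ST_OUTB))
        return KBC_SRC_NONE;
    return (status & KBC_ST_AUXB) ? KBC_SRC_MOUSE : KBC_SRC_KEYBOARD;
}
```

The typical value 0x14 decodes as KEYL and SYSF set with nothing waiting to be read.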
    +

    +

    10.2 The keyboard controller command byte +

    + +

    The keyboard controller is provided with some RAM, for example 32 bytes, +that can be accessed by the CPU. The most important part of this RAM is +byte 0, the Controller Command Byte (CCB). It can be read/written by +writing 0x20/0x60 to port 0x64 and then reading/writing a data byte +from/to port 0x60. +
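The read/write handshake can be sketched in C with the port accesses hidden behind callbacks, so that the logic can be exercised without hardware. Everything named here is hypothetical: `struct kbc_io` stands in for whatever `inb`/`outb` primitives a kernel provides, the spin limit is arbitrary, and `struct fake_kbc` is a toy model that only understands commands 0x20 and 0x60.

```c
#include <stdint.h>

enum { KBC_DATA_PORT = 0x60, KBC_CMD_PORT = 0x64 };
enum { KBC_OUTB = 0x01, KBC_INPB = 0x02 };   /* status register bits */

/* Hypothetical port-I/O hooks. */
struct kbc_io {
    uint8_t (*in)(void *ctx, uint16_t port);
    void (*out)(void *ctx, uint16_t port, uint8_t value);
    void *ctx;
};

/* Poll the status register until (status & MASK) == WANT. */
int kbc_wait(const struct kbc_io *io, uint8_t mask, uint8_t want)
{
    for (long spins = 0; spins < 100000; spins++)
        if ((io->in(io->ctx, KBC_CMD_PORT) & mask) == want)
            return 0;
    return -1;   /* gave up */
}

/* Command 0x20, then read the command byte from port 0x60. */
int kbc_read_ccb(const struct kbc_io *io, uint8_t *ccb)
{
    if (kbc_wait(io, KBC_INPB, 0) < 0)
        return -1;
    io->out(io->ctx, KBC_CMD_PORT, 0x20);
    if (kbc_wait(io, KBC_OUTB, KBC_OUTB) < 0)
        return -1;
    *ccb = io->in(io->ctx, KBC_DATA_PORT);
    return 0;
}

/* Command 0x60, then write the new command byte to port 0x60. */
int kbc_write_ccb(const struct kbc_io *io, uint8_t ccb)
{
    if (kbc_wait(io, KBC_INPB, 0) < 0)
        return -1;
    io->out(io->ctx, KBC_CMD_PORT, 0x60);
    if (kbc_wait(io, KBC_INPB, 0) < 0)
        return -1;
    io->out(io->ctx, KBC_DATA_PORT, ccb);
    return 0;
}

/* Toy controller model used only to try the functions above. */
struct fake_kbc { uint8_t ccb, status, outbuf, pending; };

uint8_t fake_in(void *ctx, uint16_t port)
{
    struct fake_kbc *k = ctx;
    if (port == KBC_CMD_PORT)
        return k->status;
    k->status &= (uint8_t)~KBC_OUTB;   /* reading port 0x60 clears OUTB */
    return k->outbuf;
}

void fake_out(void *ctx, uint16_t port, uint8_t value)
{
    struct fake_kbc *k = ctx;
    if (port == KBC_CMD_PORT) {
        k->pending = value;
        if (value == 0x20) {           /* read command byte */
            k->outbuf = k->ccb;
            k->status |= KBC_OUTB;
        }
    } else if (k->pending == 0x60) {   /* data byte for command 0x60 */
        k->ccb = value;
        k->pending = 0;
    }
}
```

On real hardware the two callbacks would wrap the port instructions, and it would probably be wise to disable the keyboard first so that a scancode cannot slip into the output buffer between the command and the read.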

    This byte has the following layout. +

    +

    +0 +XLATE +ME +KE +IGNLK +SYSF +MIE +KIE
    + +
    +

    Bit 7: + Unused +

    +

    +Always 0. +
    +

    Bit 6: + Translate +

    +

    +0: No translation. +1: Translate keyboard scancodes, using the +translation table given above. +MCA type 2 controllers cannot set this bit to 1. In this case +scan code conversion is set using keyboard command 0xf0 to port 0x60. +
    +

    Bit 5: + Mouse enable +

    +

    +On an EISA or PS/2 system: 0: Enable mouse. 1: Disable mouse +by driving the clock line low. +On an ISA system: "PC Mode": 0: use 11-bit codes, check parity and do +scan conversion. +1: use 8086 codes, don't check parity and don't do scan conversion. +
    +

    Bit 4: + Keyboard enable +

    +

    +0: Enable keyboard. 1: Disable keyboard +by driving the clock line low. +
    +

    Bit 3: + Ignore keyboard lock +

    +

    +For PS/2: Unused, always 0. +For AT: +0: No action. 1: Force +bit 4 of the status register +to 1, "not locked". This is used for keyboard testing after power on. +Maybe only on older motherboards. +
    +

    Bit 2: + System flag +

    +

    +This bit is shown in +bit 2 of the status register. +A "cold reboot" is one with this bit set to zero. +A "warm reboot" is one with this bit set to one (BAT already completed). +This will influence the tests and initializations done by the POST. +
    +

    Bit 1: + Mouse interrupt enable +

    +

    +On an ISA system: unused, always 0. On an EISA or PS/2 system: +0: Do not use mouse interrupts. +1: Send interrupt request IRQ12 when the mouse output buffer is full. +
    +

    Bit 0: + Keyboard interrupt enable +

    +

    +0: Do not use keyboard interrupts. +1: Send interrupt request IRQ1 when the keyboard output buffer is full. +

    When no interrupts are used, the CPU has to poll bits 0 (and 5) +of the status register. +
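As an illustration of the layout, the following C sketch composes a command byte for a keyboard-only setup. It assumes the EISA/PS/2 meaning of bit 5 (mouse disable); the mask names and the function `ccb_keyboard_only` are hypothetical.

```c
#include <stdint.h>

/* Controller Command Byte bits. */
enum ccb_bits {
    CCB_KIE   = 1 << 0,   /* raise IRQ1 when keyboard data is available */
    CCB_MIE   = 1 << 1,   /* raise IRQ12 when mouse data is available */
    CCB_SYSF  = 1 << 2,   /* system flag, mirrored in the status register */
    CCB_IGNLK = 1 << 3,   /* AT only: ignore keyboard lock */
    CCB_KDIS  = 1 << 4,   /* 1: keyboard disabled */
    CCB_MDIS  = 1 << 5,   /* 1: mouse disabled (EISA, PS/2) */
    CCB_XLATE = 1 << 6    /* translate keyboard scancodes */
};

/* Derive a command byte from OLD with the keyboard enabled, its
   interrupt on, translation on and the mouse disabled.  The system
   flag is preserved so the cold/warm reboot distinction survives. */
uint8_t ccb_keyboard_only(uint8_t old)
{
    uint8_t ccb = old & CCB_SYSF;
    ccb |= CCB_KIE | CCB_XLATE | CCB_MDIS;
    return ccb;
}
```

Starting from a command byte with only the system flag set (0x04), the result is 0x65.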

    +

    +

    10.3 Keyboard controller commands +

    + +

    The CPU can command the keyboard controller by writing port 0x64. +Useful, generally available, keyboard commands are: +

    +

    +

    + +20 Read keyboard controller command byte
    + +60 Write keyboard controller command byte
    + +aa Self test
    + +ab Interface test
    + +ad Disable keyboard
    + +ae Enable keyboard
    + +c0 Read input port
    + +d0 Read output port
    + +d1 Write output port
    + +e0 Read test inputs
    + +fe System reset
    + +
    +

    Useful, generally available, mouse commands are: +

    +

    +

    + +a7 Disable mouse port
    + +a8 Enable mouse port
    + +a9 Test mouse port
    + +d4 Write to mouse
    + +
    +

    Obscure, probably obsolete, commands: +

    +

    +

    + +00-1f Read keyboard controller RAM
    + +20-3f Read keyboard controller RAM
    + +40-5f Write keyboard controller RAM
    + +60-7f Write keyboard controller RAM
    + +90-93 Synaptics multiplexer prefix
    + +90-9f Write Port13-Port10
    + +a0 Read copyright
    + +a1 Read firmware version
    + +a2 Switch speed
    + +a3 Switch speed
    + +a4 Check if password installed
    + +a5 Load password
    + +a6 Check password
    + +ac Diagnostic dump
    + +af Read keyboard version
    + +b0-b5 Reset keyboard controller line
    + +b8-bd Set keyboard controller line
    + +c1 Continuous input port poll, low
    + +c2 Continuous input port poll, high
    + +c8 Unblock lines P22 and P23
    + +c9 Block lines P22 and P23
    + +ca Read keyboard controller mode
    + +cb Write keyboard controller mode
    + +d2 Write keyboard output buffer
    + +d3 Write mouse output buffer
    + +dd Disable A20 address line
    + +df Enable A20 address line
    + +f0-ff Pulse output bit
    + +
    +

    Command 0x00-0x1f: + Read keyboard controller RAM +

    +

    +(AMIBIOS only) Aliases for 0x20-0x3f. +
    +

    Command 0x20-0x3f: + Read keyboard controller RAM +

    +

    +The last six bits of the command specify the RAM address to read. +The read data is placed into the output buffer, and can be read +by reading port 0x60. +On MCA systems, type 1 controllers can access all 32 locations; +type 2 controllers can only access locations 0, 0x13-0x17, 0x1d, 0x1f. +

    Location 0 is the +Command byte, see above. +

    Location 0x13 (on MCA) is nonzero when a password is enabled. +

    Location 0x14 (on MCA) is nonzero when the password was matched. +

    Locations 0x16-0x17 (on MCA) give two make codes to be discarded +during password matching. +

    +

    Command 0x40-0x5f: + Write keyboard controller RAM +

    +

    +(AMIBIOS only) Aliases for 0x60-0x7f. +
    +

    Command 0x60-0x7f: + Write keyboard controller RAM +

    +

    Command 0x90-0x93: + Synaptics routing prefixes +

    +

    +Prefix a PS/2 mouse command with one of these to talk to one of at most four +multiplexed devices. See also the +multiplexing handshake below. +

    Unfortunately, VIA also uses this command: +

    +

    Command 0x90-0x9f: + Write Port13-Port10 +

    +(VIA VT82C42) Write low nibble to Port13-Port10. +
    +

    Command 0xa0: + Read copyright +

    +

    +On some keyboard controllers: an ASCIZ copyright string +(possibly just NUL) is made available for reading via port 0x60. +On other systems: no effect, the command is ignored. +
    +

    Command 0xa1: + Read controller firmware version +

    +

    +On some keyboard controllers: a single ASCII byte is made available +for reading via port 0x60. +On other systems: no effect, the command is ignored. +
    +

    Command 0xa2: + Switch speed +

    +

    +(On ISA/EISA systems with AMI BIOS) +Reset keyboard controller lines P22 and P23 low. +These lines can be used for speed switching via the keyboard controller. +When done, the keyboard controller sends one garbage byte to the system. +
    +

    Command 0xa3: + Switch speed +

    +

    +(On ISA/EISA systems with AMI BIOS) +Set keyboard controller lines P22 and P23 high. +These lines can be used for speed switching via the keyboard controller. +When done, the keyboard controller sends one garbage byte to the system. +

    (Compaq BIOS: Enable system speed control.) +

    +

    Command 0xa4: + Check if password installed +

    +

    +On MCA systems: +Return 0xf1 (via port 0x60) when no password is installed, +return 0xfa when a password has been installed. +Some systems without password facility always return 0xf1. +

    (On ISA/EISA systems with AMI BIOS) +Write Clock = Low. +

    (Compaq BIOS: toggle speed.) +

    +

    Command 0xa5: + Load password +

    +

    +On MCA systems: +Load a password by writing a NUL-terminated string to port 0x60. +The string is in scancode format. +

    (On ISA/EISA systems with AMI BIOS) +Write Clock = High. +

    (Compaq BIOS: special read of P2, with bits 4 and 5 replaced: +Bit 5: 0: 9-bit keyboard, 1: 11-bit keyboard. +Bit 4: 0: outp-buff-full interrupt disabled, 1: enabled.) +

    +

    Command 0xa6: + Check password +

    +

    +On MCA systems: +When a password is installed: +Check password by matching keystrokes with the stored password. +Enable keyboard upon successful match. +

    (On ISA/EISA systems with AMI BIOS) +Read Clock. 0: Low. 1: High. +

    +

    Command 0xa7: + Disable mouse port +

    +

    +On MCA systems: disable the mouse (auxiliary device) +by setting its clock line low, and set +bit 5 +of the +Command byte. Now P23 = 1. +

    (On ISA/EISA systems with AMI BIOS) +Write Cache Bad. +

    +

    Command 0xa8: + Enable mouse port +

    +

    +On MCA systems: enable the mouse (auxiliary device), +clear +bit 5 of the +Command byte. Now P23 = 0. +

    (On ISA/EISA systems with AMI BIOS) +Write Cache Good. +

    +

    Command 0xa9: + Test mouse port +

    +

    +On MCA and other systems: test the serial link between +keyboard controller and mouse. The result can be read from port 0x60. +0: OK. +1: Mouse clock line stuck low. +2: Mouse clock line stuck high. +3: Mouse data line stuck low. +4: Mouse data line stuck high. +0xff: No mouse. +

    (On ISA/EISA systems with AMI BIOS) +Read Cache Bad or Good. 0: Bad. 1: Good. +

    +

    Command 0xaa: + Self test +

    +

    +Perform self-test. Return 0x55 if OK, 0xfc if NOK. +
    +

    Command 0xab: + Interface test +

    +

    +Test the serial link between keyboard controller and keyboard. +The result can be read from port 0x60. +0: OK. +1: Keyboard clock line stuck low. +2: Keyboard clock line stuck high. +3: Keyboard data line stuck low. +4: Keyboard data line stuck high. +0xff: General error. +
    +

    Command 0xac: + Diagnostic dump +

    +

    +(On some systems) +Read from port 0x60 sixteen bytes of keyboard controller RAM, +and the output and input ports and the controller's program status word. +
    +

    Command 0xad: + Disable keyboard +

    +

    +Disable the keyboard clock line and set +bit 4 +of the +Command byte. +Any keyboard command enables the keyboard again. +
    +

    Command 0xae: + Enable keyboard +

    +

    +Enable the keyboard clock line and clear +bit 4 +of the +Command byte. +
    +

    Command 0xaf: + Read keyboard version +

    +

    +(Award BIOS, VIA) +
    +

    Command 0xb0-0xb5,0xb8-0xbd: + Reset/set keyboard controller line +

    +

    +AMI BIOS: +Commands 0xb0-0xb5 reset a keyboard controller line low. +Commands 0xb8-0xbd set the corresponding keyboard controller line high. +The lines are P10, P11, P12, P13, P22 and P23, respectively. +(In the case of the lines P10, P11, P22, P23 this is on ISA/EISA systems only.) +When done, the keyboard controller sends one garbage byte to the system. +

    VIA BIOS: +Commands 0xb0-0xb7 write 0 to lines P10, P11, P12, P13, P22, P23, P14, P15. +Commands 0xb8-0xbf write 1 to lines P10, P11, P12, P13, P22, P23, P14, P15. +

    +

    Command 0xc0: + Read input port +

    +

    +Read the +input port (P1), +and make the resulting byte available to be read from port 0x60. +
    +

    Command 0xc1: + Continuous input port poll, low +

    +

    +(MCA systems with type 1 controller only) +Continuously copy bits 3-0 of the input port to be read from bits 7-4 +of port 0x64, until another keyboard controller command is received. +
    +

    Command 0xc2: + Continuous input port poll, high +

    +

    +(MCA systems with type 1 controller only) +Continuously copy bits 7-4 of the input port to be read from bits 7-4 +of port 0x64, until another keyboard controller command is received. +
    +

    Command 0xc8: + Unblock keyboard controller lines P22 and P23 +

    +

    +(On ISA/EISA systems with AMI BIOS) +After this command, the system can make lines P22 and P23 low/high +using +command 0xd1. +
    +

    Command 0xc9: + Block keyboard controller lines P22 and P23 +

    +

    +(On ISA/EISA systems with AMI BIOS) +After this command, the system cannot make lines P22 and P23 low/high +using +command 0xd1. +
    +

    Command 0xca: + Read keyboard controller mode +

    +

    +(AMI BIOS, VIA) +Read keyboard controller mode to bit 0 of port 0x60. +0: ISA (AT) interface. +1: PS/2 (MCA) interface. +
    +

    Command 0xcb: + Write keyboard controller mode +

    +

    +(AMI BIOS) +Write keyboard controller mode to bit 0 of port 0x60. +0: ISA (AT) interface. +1: PS/2 (MCA) interface. +(First read the mode using command 0xca, then modify only +the last bit, then write the mode using this command.) +
    +

    Command 0xd0: + Read output port +

    +

    +Read the +output port (P2) +and place the result in the output buffer. +Use only when output buffer is empty. +
    +

    Command 0xd1: + Write output port +

    +

    +Write the +output port (P2). +Note that writing a 0 in bit 0 will cause a hardware reset. +

    (Compaq: the system speed bits are not set. Use commands 0xa1-0xa6 for that.) +

    +

    Command 0xd2: + Write keyboard output buffer +

    +

    +(MCA) +Write the keyboard controller's output buffer with the byte +next written to port 0x60, and act as if this were keyboard data. +(In particular, raise IRQ1 when +bit 0 +of the +Command byte says so.) +
    +

    Command 0xd3: + Write mouse output buffer +

    +

    +(MCA) +Write the keyboard controller's output buffer with the byte +next written to port 0x60, and act as if this were mouse data. +(In particular, raise IRQ12 when +bit 1 +of the +Command byte says so.) +

    Not all systems support this. +

    + Synaptics multiplexing +On the other hand, Synaptics (see +ps2-mux.PDF) +uses this command as a handshake between driver and controller: +if the driver gives this command three times, with data bytes +0xf0, 0x56, 0xa4 respectively, and reads 0xf0, 0x56, but not 0xa4 +back from the mouse output buffer, then the driver knows that the +controller supports Synaptics AUX port multiplexing, and the controller +knows that it does not have to do the usual data faking and goes +into multiplexed mode. The third byte read is the version of the +Synaptics standard. +

    There is a corresponding deactivation sequence, namely +0xf0, 0x56, 0xa5. (And again the last byte is changed to the +version number of the standard supported.) +This latter sequence works both in multiplexed mode and in legacy mode +and can thus be used to determine whether this feature is present +without activating it. +

    See also the multiplexer commands +0x90-0x93. +

    For some laptops it has been reported that bit 3 of every third +mouse byte is forced to 1 (as it would be with the standard +3-byte mouse packets). This may turn 0xf0, 0x56, 0xa4 into +0xf0, 0x56, 0xac and cause misdetection of Synaptics multiplexing +(for version 10.12). +
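The decision a driver makes after the three 0xd3 round trips can be sketched in C. This covers only the comparison of the bytes read back, not the port I/O; the function name `synaptics_mux_version` and its return convention are assumptions made for the illustration.

```c
#include <stdint.h>

/* ECHOED holds the three bytes read back after sending 0xf0, 0x56 and
   LAST (0xa4 to activate, 0xa5 to deactivate) through command 0xd3.
   Returns the version byte reported by the controller if the reply
   indicates multiplexing support, or -1 if the bytes were merely
   echoed or the reply does not match the handshake. */
int synaptics_mux_version(const uint8_t echoed[3], uint8_t last)
{
    if (echoed[0] != 0xf0 || echoed[1] != 0x56)
        return -1;
    if (echoed[2] == last)
        return -1;          /* plain echo: no multiplexing support */
    return echoed[2];       /* changed third byte is the version */
}
```

Note that the quirk mentioned above defeats this test: a controller that forces bit 3 turns the echoed 0xa4 into 0xac, which this function would report as a version byte, so a careful driver may want to treat 0xac as suspicious.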

    +

    Command 0xd4: + Write to mouse +

    +

    +(MCA) +The byte next written to port 0x60 is transmitted to the mouse. +
    +

    Command 0xdd: + Disable A20 address line +

    +

    +(HP Vectra) +
    +

    Command 0xdf: + Enable A20 address line +

    +

    +(HP Vectra) +
    +

    Command 0xe0: + Read test inputs +

    +

    +This command makes the status of the +Test inputs T0 and T1 available +to be read via port 0x60 in bits 0 and 1, respectively. +Use only when the output port is empty. +
    +

    +

    Command 0xf0-0xff: + Pulse output bit +

    +

    +Bits 3-0 of the +output port P2 +of the keyboard controller may be pulsed low for approximately 6 µseconds. +Bits 3-0 of this command specify the output port bits to be pulsed. +0: Bit should be pulsed. +1: Bit should not be modified. +The only useful version of this command is Command 0xfe. +(For MCA, replace 3-0 by 1-0 in the above.) +
    +

    Command 0xfe: + System reset +

    +

    +Pulse bit 0 of the +output port P2 +of the keyboard controller. This will reset the CPU. +
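The encoding of the pulse commands is easy to get backwards, because a 0 bit selects a line and a 1 bit leaves it alone. A small C sketch, with the hypothetical helper `kbc_pulse_cmd` taking the more natural "1 means pulse" mask:

```c
#include <stdint.h>

/* PULSE_MASK has a 1 for each of output-port bits 3-0 that should be
   pulsed low.  The command byte carries the inverted mask in its low
   nibble, under a fixed high nibble of 0xf. */
uint8_t kbc_pulse_cmd(uint8_t pulse_mask)
{
    return (uint8_t)(0xf0u | (~(unsigned)pulse_mask & 0x0fu));
}
```

Pulsing only bit 0 (the reset line) yields 0xfe, the system reset command; a mask of 0 yields 0xff, which pulses nothing.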
    +

    +

    10.4 The input port P1 +

    + +

    This has the following layout. +

    +

    +bit 7   Keyboard lock          0: locked, 1: not locked
    +bit 6   Display                0: CGA, 1: MDA
    +bit 5   Manufacturing jumper   0: installed, 1: not installed
    +                               with jumper the BIOS runs an infinite diagnostic loop
    +bit 4   RAM on motherboard     0: 512 KB, 1: 256 KB
    +bit 3                          Unused in ISA, EISA, PS/2 systems
    +                               Can be configured for clock switching
    +bit 2                          Unused in ISA, EISA, PS/2 systems
    +                               Can be configured for clock switching
    +        Keyboard power         PS/2 MCA: 0: keyboard power normal, 1: no power
    +bit 1   Mouse data in          Unused in ISA
    +bit 0   Keyboard data in       Unused in ISA
    + +
    +

    Clearly only bits 1-0 are input bits. +Of the above, the original IBM AT used bits 7-4, while PS/2 MCA systems +use only bits 2-0. +

    Where in the above lines P10, P11, etc are used, these refer to the pins +corresponding to bit 0, bit 1, etc of port P1. +

    +

    10.5 The output port P2 +

    + +

    This has the following layout. +

    +

    +bit 7 Keyboard data data to keyboard
    +bit 6 Keyboard clock
    +bit 5 IRQ12 0: IRQ12 not active, 1: active
    +bit 4 IRQ1 0: IRQ1 not active, 1: active
    +bit 3 Mouse clock Unused in ISA
    +bit 2 Mouse data Unused in ISA. Data to mouse
    +bit 1 A20 0: A20 line is forced 0, 1: A20 enabled
    +bit 0 Reset 0: reset CPU, 1: normal
    + +
    +

    Where in the above lines P20, P21, etc are used, these refer to the pins +corresponding to bit 0, bit 1, etc of port P2. +
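The P2 layout above maps naturally onto bit masks. The following is a minimal sketch; the constant and function names are assumptions made for illustration, and the code only manipulates a byte value, it does not read or write the port itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Bit masks for the output port P2, taken from the table above. */
enum
  {
    P2_RESET       = 1 << 0,    /* 0: reset CPU, 1: normal. */
    P2_A20         = 1 << 1,    /* 0: A20 forced 0, 1: A20 enabled. */
    P2_MOUSE_DATA  = 1 << 2,
    P2_MOUSE_CLOCK = 1 << 3,
    P2_IRQ1        = 1 << 4,
    P2_IRQ12       = 1 << 5,
    P2_KBD_CLOCK   = 1 << 6,
    P2_KBD_DATA    = 1 << 7
  };

/* Returns true if the A20 line is enabled in the P2 value. */
bool
p2_a20_enabled (uint8_t p2)
{
  return (p2 & P2_A20) != 0;
}

/* Returns P2 with only the A20 bit changed. */
uint8_t
p2_set_a20 (uint8_t p2, bool enable)
{
  uint8_t value = (uint8_t) (enable ? (p2 | P2_A20) : (p2 & ~P2_A20));
  assert ((value & P2_RESET) != 0);     /* A 0 here would reset the CPU. */
  return value;
}
```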

    +

    10.6 The test port T +

    + +

    bit 0 +

    +

    +Keyboard clock (input). +
    +

    bit 1 +

    +

    +(AT) Keyboard data (input). +(PS/2) Mouse clock (input). +
    +

    +


    +Next +Previous +Contents + + diff --git a/specs/kbd/scancodes-11.html b/specs/kbd/scancodes-11.html new file mode 100644 index 0000000..8942da9 --- /dev/null +++ b/specs/kbd/scancodes-11.html @@ -0,0 +1,280 @@ + + + + + Keyboard scancodes: Keyboard commands + + + + + +Next +Previous +Contents +
    +

    11. Keyboard commands

    + +

    One can not only talk to the keyboard controller (by writing to +port 0x64), but also to the keyboard (by writing to port 0x60). +

    In order to avoid interference between scancode sequences or +mouse packets and the responses given to commands, the keyboard +or mouse should always be disabled before giving a command that +requires a response, and probably enabled afterwards. +Some keyboards or mice do the disable automatically in this +situation, but still require an explicit enable afterwards. +

    Each command (other than 0xfe) is ACKed by 0xfa. +Each unknown command is NACKed by 0xfe. +Some mice expect a corrected byte as reply to the 0xfe, +and will double-NACK with 0xfc when that is also wrong. +

    Here is a list of the common commands. +

    +

    +

    +0xed Write LEDs
    +0xee Diagnostic echo
    +0xf0 Set/Get scancode set
    +0xf2 Read keyboard ID
    +0xf3 Set repeat rate and delay
    +0xf4 Keyboard enable
    +0xf5 Set defaults and disable keyboard
    +0xf6 Set defaults
    +0xf7 Set all keys to repeat
    +0xf8 Set all keys to give make/break codes
    +0xf9 Set all keys to give make codes only
    +0xfa Set all keys to repeat and give make/break codes
    +0xfb Set a single key to repeat
    +0xfc Set a single key to give make/break codes
    +0xfd Set a single key to give make codes only
    +0xfe Resend
    +0xff Keyboard reset
    + +
    +

    If the command is preceded by writing 0xd4 to port 0x64, then +it goes to the mouse instead of the keyboard. Common commands: +

    +

    +

    +0xe6 Set mouse scaling to 1:1
    +0xe7 Set mouse scaling to 2:1
    +0xe8 Set mouse resolution
    +0xe9 Get mouse information
    +0xf2 Read mouse ID
    +0xf3 Set mouse sample rate
    +0xf4 Mouse enable
    +0xf5 Mouse disable
    +0xf6 Set defaults
    +0xff Mouse reset
    + +
    +

    +

    11.1 Keyboard command details +

    + +

    +

    Command e8: Nonstandard. Reported to give a +2-byte ID on an +OmniKey keyboard. +

    Command ea: Nonstandard. The sequences +ea 70 and ea 71 are +used by some IBM keyboards to disable and enable extra keys. +

    Command eb: Nonstandard. Sequences involving eb +are often used for +manipulating extra LEDs. +

    Command ec: Nonstandard. On the +IBM Rapid Access keyboard +this command yields a 2-byte ID. +

    Command ed: + Write LEDs +

    +

    +This command is followed by a byte indicating the desired LED setting. +Bits 7-3: unused, 0. +Bit 2: 1: CapsLock LED on. +Bit 1: 1: NumLock LED on. +Bit 0: 1: ScrollLock LED on. +When OK, both bytes are ACKed. If the second byte is recognized as a +command, that command is ACKed and done instead. Otherwise a NACK is +returned (and a keyboard enable may be needed). +
    +
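As an illustration of the LED byte layout, a small C sketch follows. The function name `kbd_led_byte` is an assumption; the byte it returns is the one that would follow the ed command.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Builds the data byte that follows command ed. */
uint8_t
kbd_led_byte (bool scroll_lock, bool num_lock, bool caps_lock)
{
  uint8_t leds = (uint8_t) ((caps_lock ? 0x04 : 0)
                            | (num_lock ? 0x02 : 0)
                            | (scroll_lock ? 0x01 : 0));
  assert ((leds & 0xf8) == 0);          /* Bits 7-3 are unused and stay 0. */
  return leds;
}
```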

    Command ee: + Diagnostic echo +

    +

    +This command returns a single byte, again ee. +
    +

    Command f0: + Set/Get scancode set +

    +

    +Many, but not all, keyboards can be switched to three different +scancode sets. +This command, followed by a byte 01, 02, or 03 +selects the corresponding scancode set. This command, followed by +a zero byte, reads the current scancode set. The reply (translated) +is 43, 41 or 3f, from untranslated 1, 2 or 3. +Note that scancode set 1 should not be translated, while sets +2 and 3 should be translated. +

    Set 2 was introduced by the AT. Set 3 by the PS/2. +
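The translated and untranslated replies to "f0 00" can be tabulated in code. This is a sketch under the values quoted above; `scancode_set_reply` is a hypothetical helper name.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Returns the byte one would expect in reply to "f0 00" when scancode
   set SET (1, 2 or 3) is active, with or without controller
   translation. */
uint8_t
scancode_set_reply (int set, bool translated)
{
  static const uint8_t translated_reply[3] = { 0x43, 0x41, 0x3f };

  assert (set >= 1 && set <= 3);
  return translated ? translated_reply[set - 1] : (uint8_t) set;
}
```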

    +

    Command f2: + Read keyboard ID +

    +

    +This command reads a 2-byte +keyboard ID. +XT keyboards do not answer at all (of course), +AT keyboards reply with an ACK (fa) only, +MF2 and other keyboards reply with a 2-byte ID. +Wait at least 10ms after issuing this command. +

    For the mouse reply, see +below. +

    +

    Command f3: + Set repeat rate and delay +

    +

    +A following byte gives the desired delay before a pressed key +starts repeating, and the repeat rate. +

    Bit 7: unused, 0. +

    Bits 6-5: 0, 1, 2, 3: 250, 500, 750, 1000 ms delay. +Default after reset is 500 ms. +

    Bits 4-0: inter-character delay. The number of characters per second +is given by +

    +

    + 0 1 2 3 4 5 6 7
    +0 30.0 26.7 24.0 21.8 20.0 18.5 17.1 16.0
    +8 15.0 13.3 12.0 10.9 10.0 9.2 8.6 8.0
    +16 7.5 6.7 6.0 5.5 5.0 4.6 4.3 4.0
    +24 3.7 3.3 3.0 2.7 2.5 2.3 2.1 2.0
    + +
    +

    (that is, the inter-character delay is (2 ^ B) * (D + 8) / 240 sec, +where B gives Bits 4-3 and D gives Bits 2-0). +

    Default after reset is 10.9 characters per second. +
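The table and formula above can be reproduced with a few lines of C. The function names below are assumptions chosen for this sketch. Note that the stated defaults (500 ms, 10.9 characters per second) correspond to the parameter byte 0x2b.

```c
#include <assert.h>
#include <stdint.h>

/* Delay before a held key starts repeating, in ms (bits 6-5). */
int
typematic_delay_ms (uint8_t param)
{
  assert ((param & 0x80) == 0);         /* Bit 7 is unused and must be 0. */
  return 250 * (((param >> 5) & 0x03) + 1);
}

/* Repeat rate in characters per second (bits 4-0), the inverse of
   the inter-character delay (2 ^ B) * (D + 8) / 240 sec. */
double
typematic_rate_cps (uint8_t param)
{
  int b = (param >> 3) & 0x03;          /* Bits 4-3. */
  int d = param & 0x07;                 /* Bits 2-0. */

  assert ((param & 0x80) == 0);
  return 240.0 / ((1 << b) * (d + 8));
}
```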

    Logitech extended commands +Logitech uses escape sequences involving f3 for extended commands. +A Logitech extended command looks like +f3 7f f3 00 f3 xx +(for varying 7-bit values of xx). For example: +

    xx = 01: SendStatus: send the E1 XX codes for SubDeviceType, +BatteryStatus, (Channel if relevant) KbdStatus (=wireless status). +

    xx = 02: OpenLocking +

    xx = 03: CloseLocking +

    xx = 06 f3 aa: +Read byte at address aa (in 0x01-0x1e). +

    xx = 07 f3 aa f3 dd: +Write dd at address aa (in 0x01-0x1e). +

    xx = 10 or 11: Clear all device-related data +in EEPROM and RAM. Now device is disconnected. +

    +

    Command f4: + Keyboard enable +

    +

    +If a transmit error occurs, the keyboard is automatically disabled. +This command re-enables the keyboard and clears its internal 16-byte +buffer. +
    +

    Command f5: + Set defaults and +disable keyboard +

    +

    +Reset keyboard, clear output buffer, switch off LEDs, reset +repeat rate and delay to defaults. Disable the keyboard scan. +
    +

    Command f6: + Set defaults +

    +

    +Reset keyboard, clear output buffer, switch off LEDs, reset +repeat rate and delay to defaults. +
    +

    Command f7: + Set all keys to repeat +

    +

    +Keyboards that support scancode Set 3 keep for each key two bits: +does it repeat? does it generate a break code? +This command sets the "repeat" bit for all keys. +It does not influence keyboard operation when the scancode set is not Set 3. +
    +

    Command f8: + Set all keys to give make/break +codes +

    +

    +This command sets the "generate break code" bit for all keys. +It does not influence keyboard operation when the scancode set is not Set 3. +
    +

    Command f9: + Set all keys to give +make codes only +

    +

    +This command clears the "generate break code" bit for all keys. +It does not influence keyboard operation when the scancode set is not Set 3. +
    +

    Command fa: + Set all keys to repeat +and give make/break codes +

    +

    +This command sets the "repeat" and "generate break code" bits for all keys. +It does not influence keyboard operation when the scancode set is not Set 3. +
    +

    Command fb: + Set some keys to repeat +

    +

    +This command sets the "repeat" bits for the indicated keys. +It is followed by the untranslated Set 3 scancodes of the keys +for which this bit must be set. The sequence is ended by a command +code (ed, ee, f0, f2-ff). +Afterwards, a "keyboard enable" f4 is required. +
    +

    Command fc: + Set some keys to give make/break +codes +

    +

    +This command sets the "generate break code" bits for the indicated keys. +It is followed by the untranslated Set 3 scancodes of the keys +for which this bit must be set. The sequence is ended by a command +code (ed, ee, f0, f2-ff). +Afterwards, a "keyboard enable" f4 is required. +
    +

    Command fd: + Set some keys to give make codes +only +

    +

    +This command clears the "generate break code" bits for the indicated keys. +It is followed by the untranslated Set 3 scancodes of the keys for which +this bit must be cleared. The sequence is ended by a recognized command code +(such as ed, ee, f0, f2-ff). +Afterwards, a "keyboard enable" f4 is required. +
    +

    Command fe: + Resend +

    +

    +Meant for use by the keyboard controller after a transmission error. +Not for use by the CPU. +
    +

    Command ff: + Keyboard reset +

    +

    +Reset and self-test. +The self-test (BAT) will return aa when OK, and fc otherwise. +As part of the self-test, all LEDs are flashed. +
    +

    +


    +Next +Previous +Contents + + diff --git a/specs/kbd/scancodes-12.html b/specs/kbd/scancodes-12.html new file mode 100644 index 0000000..4530a44 --- /dev/null +++ b/specs/kbd/scancodes-12.html @@ -0,0 +1,502 @@ + + + + + Keyboard scancodes: The PS/2 Mouse + + + + + +Next +Previous +Contents +
    +

    12. The PS/2 Mouse

    + +

    +

    Mice come in various flavours - serial mice, PS/2 mice, busmice, USB mice. +Below is a little about mice using the PS/2 protocol, since these also use +the keyboard controller. +

    A mouse has a number of buttons (1-5 is common) and must report +button presses. It has some way of detecting motion, and must report +the amount of movement in the X and Y direction, usually as differences +with the previously reported position, in a (dx,dy) pair. +Touchpads can also report absolute position. +

    Reports come in the form of mouse packets of between 1 and 8 bytes. +Various protocols are in use. +

    +

    12.1 Modes +

    + +

    A PS/2 mouse can be in stream mode (the default). +In this mode it produces a stream of packets indicating mouse movements +and button presses. Or it can be in remote mode. +In this mode the mouse only sends a packet when the host +requests one, using the +eb command. +Finally, it can be in echo ("wrap") mode, +in which everything the host sends is echoed back, until +either a reset (ff) or clear echo mode (ec) +is received. +

    +

    12.2 Scaling +

    + +

    Scaling can be set to 1:1 or 2:1. This affects stream mode only. +In 2:1 scaling: If the unscaled absolute value of dx or dy is 6 or more, +it is doubled. Otherwise, for the unscaled value 0,1,2,3,4,5,6, the +scaled value 0,1,1,3,6,9,12 is sent. +
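The 2:1 scaling rule can be written as a small lookup plus a doubling step. This is a sketch only: the name `ps2_scale_2to1` is hypothetical, and the code assumes the sign of the movement is preserved, since the text speaks of absolute values.

```c
#include <assert.h>

/* Applies 2:1 scaling to one movement value (dx or dy). */
int
ps2_scale_2to1 (int delta)
{
  /* Scaled values for unscaled magnitudes 0..5. */
  static const int small[6] = { 0, 1, 1, 3, 6, 9 };
  int magnitude = delta < 0 ? -delta : delta;
  int scaled;

  assert (delta >= -256 && delta <= 255);       /* 9-bit range. */
  scaled = magnitude < 6 ? small[magnitude] : 2 * magnitude;
  return delta < 0 ? -scaled : scaled;
}
```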

    +

    12.3 PS/2 mouse protocol +

    + +

    +

    The default protocol

    + +

    The standard PS/2 protocol uses 3-byte packets, as follows: +

    +

    +Yovfl Xovfl dy8 dx8 1 Middle Btn Right Btn Left Btn
    +dx7 dx6 dx5 dx4 dx3 dx2 dx1 dx0
    +dy7 dy6 dy5 dy4 dy3 dy2 dy1 dy0
    + +
    +

    It gives the movement in the X and Y direction in 9-bit two's complement +notation (range -256 to +255) and an overflow indicator. +It also gives the status of the three mouse buttons. +When this protocol is used, the f2 Read mouse ID command +is answered by 00. +
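A decoder for the default 3-byte packet might look as follows. The struct and function names are assumptions made for this sketch. Rejecting a packet whose first byte has bit 3 clear is one possible way to notice a stream that is out of sync, not something the protocol prescribes.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct ps2_mouse_report
  {
    int dx, dy;                         /* Movement, -256..+255. */
    bool left, right, middle;           /* Button states. */
    bool x_overflow, y_overflow;
  };

/* Decodes one standard 3-byte packet into R.  Returns false if bit 3
   of the first byte is not 1. */
bool
ps2_decode_packet (const uint8_t pkt[3], struct ps2_mouse_report *r)
{
  assert (pkt != NULL && r != NULL);
  if ((pkt[0] & 0x08) == 0)
    return false;

  r->left = (pkt[0] & 0x01) != 0;
  r->right = (pkt[0] & 0x02) != 0;
  r->middle = (pkt[0] & 0x04) != 0;
  /* dx8 and dy8 are the sign bits of the 9-bit values. */
  r->dx = (int) pkt[1] - ((pkt[0] & 0x10) ? 256 : 0);
  r->dy = (int) pkt[2] - ((pkt[0] & 0x20) ? 256 : 0);
  r->x_overflow = (pkt[0] & 0x40) != 0;
  r->y_overflow = (pkt[0] & 0x80) != 0;
  return true;
}
```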

    +

    Intellimouse

    + +

    The Microsoft Intellimouse uses the above protocol until scrolling wheel +mode is activated by sending the magic sequence +f3 c8 f3 64 f3 50 +(set sample rate 200, 100, 80). In this mode, the Read mouse ID command +returns 03, and 4-byte packets are used: +

    +

    +Yovfl Xovfl dy8 dx8 1 Middle Btn Right Btn Left Btn
    +dx7 dx6 dx5 dx4 dx3 dx2 dx1 dx0
    +dy7 dy6 dy5 dy4 dy3 dy2 dy1 dy0
    +dz3 dz3 dz3 dz3 dz3 dz2 dz1 dz0
    + +
    +

    Here the last byte gives the movement of the scrolling wheel in +4-bit two's complement notation (range -8 to +7) and the leading +four bits are just copies of the sign bit. +
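Because the upper bits are copies of the sign bit, the fourth byte can be read as an ordinary 8-bit two's complement number. A minimal sketch, with a hypothetical function name:

```c
#include <assert.h>
#include <stdint.h>

/* Returns the wheel movement (-8..+7) from the fourth Intellimouse
   packet byte. */
int
intellimouse_wheel_delta (uint8_t byte4)
{
  /* Bits 7-3 must all equal the sign bit dz3. */
  assert ((byte4 & 0xf8) == 0x00 || (byte4 & 0xf8) == 0xf8);
  return byte4 >= 0x80 ? (int) byte4 - 256 : (int) byte4;
}
```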

    +

    Intellimouse Explorer mouse

    + +

    The Explorer mouse protocol allows for scrolling wheel and five buttons. +It is activated by first sending the magic sequence for Intellimouse, +and then, when the Intellimouse ID has been seen, sending the magic sequence +f3 c8 f3 c8 f3 50 +(set sample rate 200, 200, 80). In this mode, the Read mouse ID command +returns 04, and 4-byte packets are used: +

    +

    +Yovfl Xovfl dy8 dx8 1 Middle Btn Right Btn Left Btn
    +dx7 dx6 dx5 dx4 dx3 dx2 dx1 dx0
    +dy7 dy6 dy5 dy4 dy3 dy2 dy1 dy0
    +0 0 5th Btn 4th Btn dz3 dz2 dz1 dz0
    + +
    +
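In Explorer mode the upper bits of the fourth byte carry buttons instead of sign copies, so the wheel value has to be sign-extended from the low nibble only. The sketch below uses assumed names (`explorer_extra`, `explorer_decode_byte4`).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct explorer_extra
  {
    int dz;                     /* Wheel movement, -8..+7. */
    bool button4, button5;
  };

/* Decodes the fourth byte of an Explorer mouse packet. */
struct explorer_extra
explorer_decode_byte4 (uint8_t byte4)
{
  struct explorer_extra e;
  int nibble = byte4 & 0x0f;

  assert ((byte4 & 0xc0) == 0);         /* Bits 7-6 are always 0. */
  e.dz = (nibble & 0x08) ? nibble - 16 : nibble;
  e.button4 = (byte4 & 0x10) != 0;
  e.button5 = (byte4 & 0x20) != 0;
  return e;
}
```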

    Lots of other protocols occur, and only incomplete data is known +about most of them. Some examples. +

    +

    Typhoon mouse

    + +

    The Typhoon optical mouse is reported to send 6-byte packets. +Bytes 1-3 are as for the default PS/2 protocol. +Byte 4 equals byte 1. Byte 5 gives the Z axis movement, one of +ff, 00, 01. Byte 6 is 0. +Of course the idea is that this packet looks like two ordinary packets +and ordinary PS/2 mouse drivers will handle it. +The 6-byte mode is activated by sending the magic sequence +f3 c8 f3 64 f3 50 +f3 3c f3 28 f3 14 +(set sample rate 200, 100, 80, 60, 40, 20). +It is recognized by the ID 08. +

    +

    12.4 Mouse Commands +

    + +

    Every command or data byte sent to the mouse (except for the +resend command fe) is ACKed with fa. +If the command or data is invalid, it is NACKed with fe. +If the next byte is again invalid, the reply is ERROR: fc. +

    +

    Command d0: Read extended ID +

    Read up to 256 bytes. +

    Commands d1-df: Vendor unique commands +

    +

    Command d1: Logitech PS/2++ command +

    This command was to be used, followed by an arbitrary data sequence. +Now replaced by the +sliced commands using +e8. +

    Command e1: Read secondary ID +

    +

    +Replies with two bytes. +An IBM TrackPoint returns 01 as first byte, +and a second byte depending on the model. +
    +

    Command e2: IBM TrackPoint command +

    +

    +Followed by several parameter bytes. For details, see +ykt3dext.pdf. +
    +

    Command e6: + Set mouse scaling to 1:1 +

    +

    +Often an ingredient in magic sequences. +
    +

    Command e7: + Set mouse scaling to 2:1 +

    +

    +Often an ingredient in magic sequences. +
    +

    Command e8: + Set mouse resolution +

    +

    +This command is followed by a byte indicating the resolution +(0, 1, 2, 3: 1, 2, 4, 8 units per mm, respectively). +It is used in magic sequences to transport two bits, +so that four of these are needed to send a byte to the mouse. +See +below. +
    +

    Command e9: + Status request +

    +

    +This command returns three bytes: +

    First a status byte: +Bit 7: unused, 0. +Bit 6: 0: +stream mode, +1: +remote mode. +Bit 5: 0: disabled, 1: enabled. +Bit 4: 0: scaling set to 1:1, 1: scaling set to 2:1. +Bit 3: unused, 0. +Bit 2: 1: left button pressed. +Bit 1: 1: middle button pressed. +Bit 0: 1: right button pressed. +

    Then a resolution byte: +0, 1, 2, 3: 1, 2, 4, 8 units per mm, respectively. +

    Finally a sample rate (in Hz). +

    See below for special +Synaptics Touchpad handling. +
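The three status bytes of an ordinary mouse can be unpacked as in the following sketch. The struct and function names are assumptions, and the Synaptics special case is not handled here.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct ps2_mouse_status
  {
    bool remote_mode;           /* false: stream mode. */
    bool enabled;
    bool scaling_2to1;          /* false: 1:1 scaling. */
    bool left, middle, right;
    int units_per_mm;           /* 1, 2, 4 or 8. */
    int sample_rate_hz;
  };

/* Decodes the 3-byte reply to the e9 status request. */
struct ps2_mouse_status
ps2_decode_status (const uint8_t reply[3])
{
  struct ps2_mouse_status s;

  assert (reply != NULL);
  s.remote_mode = (reply[0] & 0x40) != 0;
  s.enabled = (reply[0] & 0x20) != 0;
  s.scaling_2to1 = (reply[0] & 0x10) != 0;
  s.left = (reply[0] & 0x04) != 0;
  s.middle = (reply[0] & 0x02) != 0;
  s.right = (reply[0] & 0x01) != 0;
  s.units_per_mm = 1 << (reply[1] & 0x03);
  s.sample_rate_hz = reply[2];
  return s;
}
```

With the f6 defaults (100 Hz, 4 counts/mm, stream mode, disabled, scaling 1:1) the reply would be 00 02 64.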

    +

    Command ea: Set +stream mode +

    +

    Command eb: + Read data +

    +

    +Read a mouse packet. +Needed in +remote mode to ask the mouse for data. +Also functions in +stream mode. +
    +

    Command ec: Clear +echo mode +

    +

    Command ee: Set +echo mode +

    +

    Command f0: Set +remote mode +

    +

    Command f2: + Read mouse ID +

    +

    +(Only supported on some systems.) +This command reads a 1-byte mouse ID. The reply is a single byte 00. +Wait at least 10ms after issuing this command. +

    For the keyboard reply, see +above. +

    BallPoint (trackball) devices return a single byte 02, +Intellimouse returns 03, +Explorer Mouse returns 04, +4d Mouse returns 06, +4dplus Mouse returns 08, as does the Typhoon mouse. +

    +

    Command f3: + Set mouse sample rate +

    +

    +(Only supported on some systems.) +Set mouse sample rate in Hz. +If the given sampling rate is acceptable the ACK is fa. +Otherwise the NACK is fe, and the host can correct. +If it is incorrect again fc is sent. +Correct values are, e.g., 10, 20, 40, 60, 80, 100, 200. +
    +

    Command f4: + Mouse enable +

    +

    +The stream mode mouse data reporting is disabled after a reset and after +the +disable command. This command enables it again. +
    +

    Command f5: + Mouse disable +

    +

    +This stops mouse data reporting in +stream mode. +In stream mode, this command should be sent before sending any other commands. +
    +

    Command f6: + Set defaults +

    +

    +If this command is recognized, a reset is done (set sampling rate 100 Hz, +resolution 4 counts/mm, +stream mode, +disabled, scaling 1:1), but no diagnostics are performed. +For some enhanced mice that require a magic sequence to get into +enhanced mode, this command will reset them to default PS/2 mode. +
    +

    Command fe: Resend +

    +

    +If this command is recognized, the last mouse packet (possibly several bytes) +is resent. There is no ACK to this command, but if the last reply was ACK, +it is sent. +
    +

    Command ff: + Mouse reset +

    +

    +A self-test is performed. When OK, the response is aa 00. +On error, the response is fc 00. +The mouse is reset to default PS/2 mode. +
    +

    +

    12.5 Sliced parameters +

    + +

    For more advanced mouse modes it is necessary to send data to the mouse. +There is now a commonly accepted way. +

    First Logitech tried to use the d1 command followed by an +arbitrary data sequence. +While the IBM specs reserve d1-df for vendor unique commands, +it turns out that not all BIOSes will transmit such codes. +So Logitech drops the d1 and uses the sequence +e8 aa e8 bb e8 cc +e8 dd to transmit the byte aabbccdd, where +aa, bb, cc, dd are 2-bit quantities. +In this way an arbitrarily long sequence of bytes can be transmitted. +

    For synchronization purposes it is possible to separate such groups +of four e8 commands by an e6 command. +Indeed, such separation may be required: Synaptics Touchpads react to +e9 or f3 commands preceded by precisely four +e8 commands. +

    +

    Magic knock

    + +

    For example, the "magic knock" d1 39 db +that sets a device that understands it in PS/2++ mode, +becomes e8 00 e8 03 +e8 02 e8 01 e6 +e8 03 e8 01 +e8 02 e8 03, +abbreviated {E8}0321 {E6} {E8}3123. +Note that 0321 and 3123 do not have repeated symbols. If they had, +too intelligent intermediate hardware transmitting these sequences +might see a superfluous command and suppress it. +
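The slicing of a byte into four e8 commands can be sketched as below. The function name `ps2_slice_byte` is hypothetical. Applied to 39 and db it reproduces the two halves of the magic knock given above.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Writes the sequence e8 aa e8 bb e8 cc e8 dd that transmits VALUE
   into OUT, most significant bit pair first.  Returns the number of
   bytes written (always 8). */
size_t
ps2_slice_byte (uint8_t value, uint8_t out[8])
{
  size_t n = 0;
  int shift;

  assert (out != NULL);
  for (shift = 6; shift >= 0; shift -= 2)
    {
      out[n++] = 0xe8;                          /* Set resolution... */
      out[n++] = (value >> shift) & 0x03;       /* ...to a 2-bit slice. */
    }
  return n;
}
```

A caller that wants the {E6} separator would emit it between two such groups.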

    +

    Magic unknock

    + +

    PS/2++ mode is cleared again by the "magic unknock" +{E8} 0323 or D1 3B from an external device, and +{E8} 0321 or D1 39 from an internal device. +(These commands differ so that in setups where the same commands are +sent to internal and external devices, they can be commanded separately.) +

    For a description of the PS/2++ format, see +ps2ppspec.htm. +

    +

    12.6 Synaptics Touchpad +

    + +

    A few sketchy details. For nice precise information, get +the +Synaptics interfacing guide. +

    +

    Status request

    + +

    When preceded by an 8-bit request number encoded via four + +e8 +commands, the +e9 status request +returns modified output, somewhat dependent on the Touchpad model. +

    +

    Request 00: Identify Touchpad +

    This request returns three bytes, of which the middle one +is the constant 47. This is the way to recognize +a Touchpad. The low order four bits of the third byte contain +the major model version number, the first byte contains the +minor version number, and the high order four bits of the +third byte contain the (obsolete) model code. +
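Parsing the identify reply could look like this. It is a sketch with assumed names (`synaptics_id`, `synaptics_parse_identify`), based only on the field layout described above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

struct synaptics_id
  {
    int major, minor;           /* Version numbers. */
    int model_code;             /* Obsolete model code. */
  };

/* Parses the 3-byte reply to request 00.  Returns false if the
   middle byte is not the constant 47, i.e. this is no Touchpad. */
bool
synaptics_parse_identify (const uint8_t reply[3], struct synaptics_id *id)
{
  assert (reply != NULL && id != NULL);
  if (reply[1] != 0x47)
    return false;
  id->minor = reply[0];
  id->major = reply[2] & 0x0f;
  id->model_code = reply[2] >> 4;
  return true;
}
```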

    +

    Request 01: Read Touchpad Modes +

    This request returns three bytes, of which the first two +are the constants 3b and 47. +The last byte is the mode byte: +

    +

    +ABS Rate - - Baud/Sleep DisGest PackSize Wmode
    + +
    +

    Here ABS indicates absolute mode (instead of the default +relative mode). +

    Rate is 0 for 40 packets/sec, 1 for 80 packets/sec. +The PS/2 sampling rate value is ignored. +

    Baud/Sleep indicates the baud rate when used with a serial protocol +(0: 1200 baud, 1: 9600 baud). It must be set whenever ABS or Rate is set. +When used with the PS/2 protocol this bit indicates sleep mode - +a low power mode in which finger activity is ignored and only button +presses are reported. +

    DisGest is the "disable gestures" bit. When set, we have classical +mouse behaviour. When cleared, "tap" and "drag" processing is enabled. +

    PackSize is used for the serial protocol only (and then chooses between +6-, 7- and 8-byte packets, also depending on the Wmode bit). +

    Wmode is used in absolute mode only. When set the packets also +contain the W value. (This value indicates the amount of contact: +0: two-finger contact, 1: three-finger contact, 2: pen contact, +3: reserved, 4-7: ordinary finger contact, 8-15: wide finger or palm contact.) +
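The mode byte and the W value can be captured with a few constants. This sketch assumes that the mode byte table lists bit 7 first, like the packet tables elsewhere in this document; all names are hypothetical.

```c
#include <assert.h>

/* Mode byte bits, assuming the table runs from bit 7 down to bit 0. */
enum
  {
    SYN_MODE_ABS        = 0x80,
    SYN_MODE_RATE       = 0x40,
    SYN_MODE_BAUD_SLEEP = 0x08,
    SYN_MODE_DISGEST    = 0x04,
    SYN_MODE_PACKSIZE   = 0x02,
    SYN_MODE_WMODE      = 0x01
  };

enum synaptics_contact
  {
    SYN_TWO_FINGERS, SYN_THREE_FINGERS, SYN_PEN,
    SYN_RESERVED, SYN_FINGER, SYN_WIDE_OR_PALM
  };

/* Classifies the 4-bit W value reported in absolute mode. */
enum synaptics_contact
synaptics_classify_w (unsigned w)
{
  assert (w <= 15);
  switch (w)
    {
    case 0: return SYN_TWO_FINGERS;
    case 1: return SYN_THREE_FINGERS;
    case 2: return SYN_PEN;
    case 3: return SYN_RESERVED;
    default: return w <= 7 ? SYN_FINGER : SYN_WIDE_OR_PALM;
    }
}
```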

    This describes Touchpad 4.x. Earlier models had up to four mode bytes. +This request would return mode bytes 1 and 2 in the first and last result byte, +and request 02 would return mode bytes 3 and 4. +

    +

    Request 02: Read Capabilities +

    This request returns three bytes, of which the middle one is +the constant 47. The first and third byte are the high-order +and low-order parts of the capability word. +(Thus on Touchpad 4.x. On earlier models mode bytes 3 and 4 are returned.) +

    This capability word has 16 bits. Bit 15 indicates that capabilities +are supported. Bit 4 indicates that Sleep is supported (for the PS/2 +protocol). Bit 3 indicates that four buttons (Left, Right, Up, Down) +are supported. Bit 1 indicates that multi-finger detection is supported. +Bit 0 indicates that palm detection is supported. +

    +

    Request 03: Read Model ID +

    +

    Request 06: Read Serial Number Prefix +

    +

    Request 07: Read Serial Number Suffix +

    +

    Request 08: Read Resolution +

    +

    +

    Mode setting

    + +

    When preceded by an 8-bit request number encoded via four e8 +commands, the +f3 14 +(set sample rate 20) command sets the mode byte to the +encoded number. (Thus on Touchpads 4.x. Older models have more mode +bytes and several such commands.) +

    +

    +

    12.7 Vendor extensions +

    + +

    There is a complicated forest of "magic sequences" that enable +vendor extensions. Recognizing all of these is a very obscure activity. +

    (Moreover, recognizing these may be counterproductive: +if the mouse has special capabilities which are activated +by a special sequence, and it is connected to the computer +via a KVM switch that does not know about this special protocol, +then switching away and back will leave the mouse in the non-special +state. This leads to non-functioning mice.) +

    A 2002 Logitech file describes the following procedure for recognizing +the mouse type: +

    Stage 1: Send ff: reset. +The reply is ignored. (Most common is aa 00.) +

    Stage 2: Send f3 0a f2: set sample rate +and ask for ID. If the reply is 02, we have a trackball - +it has its own protocol. (The usual reply is 00.) +

    Stage 3: Send e8 00 e6 e6 e6 +e9: set resolution and scaling (three times), and request status. +The reply consists of three bytes s1 s2 s3. +An old-fashioned mouse would report 0 in the second status byte s2 +(since that is the resolution and we just set it). +

    If s2 is nonzero then: s2 is the number of buttons, +s3 is the firmware revision, +s1 has the firmware ID (device type) bits 6-0 in bits 3-0,6-4, +while bit 7 of s1 indicates support for the +e7 e7 e7 e9 command. +

    If s1=d0 and s2=03 and +s3=c8, suspect Synaptics. +

    If s1 and s2 are zero but s3 equals 0a, +suspect Alps. (s3=0a is as expected, but s1=0 +is not) +

    Stage 4: If bit 7 of s1 is set, or if we suspect Alps, +send e8 00 e7 e7 e7 e9: +set resolution and scaling (three times), and request status. +The reply consists of three bytes t1 t2 t3. +Of course, we already know that this is not an old-fashioned mouse. +

    If t2=01 and FirmwareID < 0x10 and +t1 >> 6 = 1, then conclude that we have a +Cordless MouseMan (RA12). +

    If t2=01 and FirmwareID < 0x10 and +t1 >> 6 = 3, then conclude that we have a +Cordless MouseMan (RB24). +

    Other cases with t2=01 are for new cordless mice. +

    If we suspect Synaptics and t2=0 and t3=0a, +then conclude that we have a Synaptics touchpad. +

    If we suspect Alps and t1=33, then conclude that +we have an Alps touchpad. +

    Stage 5: If we don't know the type yet, send f3 c8 +f3 64 f3 50 f2: +Set sampling rate to 200, 100, 80 Hz, and ask for ID. +The reply is a single byte. +If we get 3, conclude that we have an IntelliMouse. +(And this sequence is the initialization sequence for the IntelliMouse.) +

    Stage 6: Send ff: reset. Now the device is no longer in any +special state. +

    Stage 7: If we don't know the type yet, send e8 00 +e8 00 e8 00 e8 00 +e9: set resolution to 0 (four times), and ask for status. +The reply consists of three bytes u1 u2 u3. +If u2=47 and u3=13, then conclude +that we have a new Synaptics touchpad. +

    Stage 7a: At this point we can narrow down to model type. +If the thing is Synaptics or Alps, then Logitech is no longer interested. +If it has 3 buttons, FirmwareID 1 and firmware revision 50, +then conclude that it is a Logitech Mouseman. +

    Stage 8: If we think it is a touchpad, detect whether it has programmable RAM. +Send e6 e8 00 e8 00 e8 +00 e8 00 e9. The reply consists of three +bytes v1 v2 v3. +If v1=06 and v2=00, then conclude +that we have a Touchpad TP3 with programmable RAM. +

    Stage 9: Test whether the device understands the Logitech PS/2++ protocol. +Send the "magic knock" f5 e8 00 e8 +03 e8 02 e8 01 e6 +e8 03 e8 01 e8 02 +e8 03 f4. +Check whether the device replies with an extended report. +
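The bit shuffling in Stage 3 is easy to get wrong, so here is a sketch of how the status bytes s1 s2 s3 could be unpacked. The struct and function names are assumptions and are not taken from the Logitech file.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

struct logitech_stage3
  {
    unsigned firmware_id;       /* 7-bit device type. */
    unsigned buttons;
    unsigned revision;
    bool has_e7_status;         /* e7 e7 e7 e9 is supported. */
  };

/* Unpacks the Stage 3 reply.  Only meaningful when S2 is nonzero. */
struct logitech_stage3
logitech_decode_stage3 (uint8_t s1, uint8_t s2, uint8_t s3)
{
  struct logitech_stage3 info;

  assert (s2 != 0);
  /* ID bits 6-3 sit in s1 bits 3-0, ID bits 2-0 in s1 bits 6-4. */
  info.firmware_id = ((unsigned) (s1 & 0x0f) << 3) | ((s1 >> 4) & 0x07);
  info.buttons = s2;
  info.revision = s3;
  info.has_e7_status = (s1 & 0x80) != 0;
  return info;
}
```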

    +


    +Next +Previous +Contents + + diff --git a/specs/kbd/scancodes-13.html b/specs/kbd/scancodes-13.html new file mode 100644 index 0000000..fe45f4c --- /dev/null +++ b/specs/kbd/scancodes-13.html @@ -0,0 +1,79 @@ + + + + + Keyboard scancodes: USB + + + + + +Next +Previous +Contents +
    +

    13. USB

    + +

    The USB specification prescribes 16-bit keycodes for keyboard positions, +identified with key captions for the usual US layout. +Below the values are given in decimal. 0-3 are protocol values, +namely NoEvent, ErrorRollOver, POSTFail, ErrorUndefined, respectively. +The values 224-231 are for modifier keys. +

    +

    +

    +0 1 2 3 4 5 6 7 8 9 10 11
    +- err err err A B C D E F G H
    +
    +12 13 14 15 16 17 18 19 20 21 22 23
    +I J K L M N O P Q R S T
    +
    +24 25 26 27 28 29 30 31 32 33 34 35
    +U V W X Y Z 1 2 3 4 5 6
    +
    +36 37 38 39 40 41 42 43 44 45 46 47
    +7 8 9 0 Enter Esc BSp Tab Space - / _ = / + [ / {
    +
    +48 49 50 51 52 53 54 55 56 57 58 59
    +] / } \ / | ... ; / : ' / " ` / ~ , / < . / > / / ? Caps Lock F1 F2
    +
    +60 61 62 63 64 65 66 67 68 69 70 71
    +F3 F4 F5 F6 F7 F8 F9 F10 F11 F12 PrtScr Scroll Lock
    +
    +72 73 74 75 76 77 78 79 80 81 82 83
    +Pause Insert Home PgUp Delete End PgDn Right Left Down Up Num Lock
    +
    +84 85 86 87 88 89 90 91 92 93 94 95
    +KP / KP * KP - KP + KP Enter KP 1 / End KP 2 / Down KP 3 / PgDn KP 4 / Left KP 5 KP 6 / Right KP 7 / Home
    +
    +96 97 98 99 100 101 102 103 104 105 106 107
    +KP 8 / Up KP 9 / PgUp KP 0 / Ins KP . / Del ... Applic Power KP = F13 F14 F15 F16
    +
    +108 109 110 111 112 113 114 115 116 117 118 119
    +F17 F18 F19 F20 F21 F22 F23 F24 Execute Help Menu Select
    +
    +120 121 122 123 124 125 126 127 128 129 130 131
    +Stop Again Undo Cut Copy Paste Find Mute Volume Up Volume Down Locking Caps Lock Locking Num Lock
    +
    +132 133 134 135 136 137 138 139 140 141 142 143
    +Locking Scroll Lock KP , KP = Internat Internat Internat Internat Internat Internat Internat Internat Internat
    +
    +144 145 146 147 148 149 150 151 152 153 154 155
    +LANG LANG LANG LANG LANG LANG LANG LANG LANG Alt Erase SysRq Cancel
    +
    +156 157 158 159 160 161 162 163 164 165 166 167
    +Clear Prior Return Separ Out Oper Clear / Again CrSel / Props ExSel
    +
    +
    +224 225 226 227 228 229 230 231
    +LCtrl LShift LAlt LGUI RCtrl RShift RAlt RGUI
    +
    + +
    +
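For letters and digits the table is regular enough to compute the code directly. The helper names below are hypothetical, and the sketch assumes an ASCII-compatible character set and the default "C" locale.

```c
#include <assert.h>
#include <ctype.h>

/* Returns the USB keycode (4-29) for the key carrying letter C. */
int
usb_usage_for_letter (char c)
{
  assert (isalpha ((unsigned char) c));
  return 4 + (toupper ((unsigned char) c) - 'A');
}

/* Returns the USB keycode (30-39) for the top-row digit key C.
   The digits run 1..9 and then 0. */
int
usb_usage_for_digit (char c)
{
  assert (isdigit ((unsigned char) c));
  return c == '0' ? 39 : 30 + (c - '1');
}
```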

    +


    +Next +Previous +Contents + + diff --git a/specs/kbd/scancodes-14.html b/specs/kbd/scancodes-14.html new file mode 100644 index 0000000..ff895cf --- /dev/null +++ b/specs/kbd/scancodes-14.html @@ -0,0 +1,26 @@ + + + + + Keyboard scancodes: Reporting + + + + +Next +Previous +Contents +
    +

    14. Reporting

    + +

    Additions and corrections are welcome. +Use showkey -s to get the scancodes. +Mention keyboard manufacturer and type, and the keycaps. +

    Andries Brouwer - aeb@cwi.nl +

    +


    +Next +Previous +Contents + + diff --git a/specs/kbd/scancodes-2.html b/specs/kbd/scancodes-2.html new file mode 100644 index 0000000..d921c87 --- /dev/null +++ b/specs/kbd/scancodes-2.html @@ -0,0 +1,182 @@ + + + + + Keyboard scancodes: Special keyboards - XT keyboards + + + + + +Next +Previous +Contents +
    +

    2. Special keyboards - XT keyboards

    + +

    First keyboards with an XT interface. +There is no keyboard controller, no commands to the keyboard. +On a modern computer these will usually yield "keyboard error" +or "KB/interface error" or some such, but sometimes they can be +used nevertheless. +

    The IBM PC (all models) and the IBM XT (models 68, 78, 86, 87, 88, +267, 277) came with this 83-key keyboard. +The IBM AT (models 68, 99, 239, 319) came with an 84-key keyboard. +The IBM XT (models 89, 268, 278, 286) and the IBM AT model 339 +came with a 101-key keyboard. +

    The original IBM 83-key PC/XT keyboard did not have LEDs. +The original IBM 84-key AT keyboard has LEDs, separates the +keypad from the main area, moves the Esc key to the right, +and adds the SysReq key. +The original IBM 101-key keyboard moves the ten function keys +from the left to the top row and adds two more. The Esc key is moved +in front of this row of function keys. The "number" and "cursor" +functions of the keypad are separated. There are duplicate Ctrl and Alt +keys. +

    +

    2.1 XT keyboard +

    + +

    The +XT keyboard +has 83 keys, nicely numbered 1-83, that is, with scancodes +01-53. No escaped scancodes. +

    +

    + + +
    +

    +

    2.2 Victor keyboard +

    + +

    This +Victor keyboard +is very similar. The keypad is separated here, and the Esc key +has been moved to the keypad. The frontside of the ScrollLock key +says Break. It resembles an AT keyboard but has only 83 keys, +the SysRq is still missing. +

    +

    + + +
    +

    +

    2.3 Olivetti M24 keyboard +

    + +

    +

    + + +
    + +The Olivetti M24 (also sold under the names Logabax 1600 and +ATT PC-6300) was an IBM compatible manufactured in 1984. +

    John Elliott writes: +The Olivetti M24 is an XT sort-of clone. It +has two possible keyboards - the normal (83-key) IBM one, +and a "deluxe" one (102 keys) with 18 function keys. +

    Unlike a normal XT keyboard, it is possible to send commands to it. +The BIOS does this twice: +(1) Command 01h makes the keyboard perform a self-test. +(2) Command 05h makes the keyboard return a 1-byte ID. The least significant +bit is set for a "deluxe" layout. +

    The keyboard connector is DE-9 rather than DIN. Pins are: +

    +1   KBDATA
    +2   KBCLOCK
    +3   GND
    +4   GND
    +5  +12V
    +6   -RESET1
    +7   Keyboard/-Typewriter
    +8   TEST0
    +9   +5V
    +
    + +(pins 6-9 are not used by the supplied keyboards). +

    Attached +the diagram +of the 'deluxe' keyboard, which shows its scancodes in decimal. +

    A mouse can be attached to the keyboard. The following is based +on disassembling attmouse.drv from Windows 1.0. +

    Windows initialises the mouse by sending the following bytes to the +keyboard: 0x12, 0x77, 0x78, 0x79, 0x00. +The 0x12 is almost certainly a command byte; 0x77, 0x78 and 0x79 are the +scancodes to be returned by the three mouse buttons. I don't know what the +0x00 is for. +

    It then handles the following scancodes: +0xFE -- mouse movement. The next two scancodes are delta X, then delta Y, +in ones' complement. +0x77, 0x78, 0x79 (and 0xF7, 0xF8, 0xF9) -- button presses / releases. +
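The scancodes listed above could be decoded along the following lines. This is a minimal sketch, not code taken from attmouse.drv: the names `m24_delta` and `m24_button` are invented for illustration, and it assumes that each delta is a single byte in 8-bit ones' complement and that the high bit of a button scancode marks a release.

```c
#include <stdint.h>

/* Hypothetical helper: convert one 8-bit ones' complement byte to a
   signed value. A negative number is stored as the bitwise inverse of
   its magnitude, so 0xFE is -1 and 0xFF is "minus zero". */
int m24_delta(uint8_t b)
{
    if (b & 0x80)
        return -(int)(uint8_t)~b;
    return (int)b;
}

/* Hypothetical helper: classify a mouse button scancode.
   Returns 1..3 for the buttons programmed as 0x77..0x79 and 0 for any
   other code. *released is set to 1 when the high bit is set
   (0xF7..0xF9), which is assumed here to mean "button released". */
int m24_button(uint8_t code, int *released)
{
    uint8_t base = code & 0x7F;

    *released = (code & 0x80) != 0;
    if (base >= 0x77 && base <= 0x79)
        return base - 0x77 + 1;
    return 0;
}
```

A driver would call `m24_delta` on each of the two bytes that follow 0xFE, and `m24_button` on every other incoming scancode.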

    When shutting down the mouse, it sends these bytes to the keyboard: +0x11, 0x1C, 0x53, 0x01, 0x4B, 0x4D, 0x48, 0x50, 0x02, 0x04. +My guesses here are: +0x11: Mouse movement becomes simulated keypresses. +0x1C, 0x53, 0x01: Scancodes to be returned by mouse button presses. +0x4B, 0x4D, 0x48, 0x50: Scancodes to be returned by mouse movement. +0x02, 0x04: Don't know. +

    +

    2.4 Telerate keyboard +

    + +

    The +Telerate keyboard was used +for financial applications, as is clear from the keycaps. +This keyboard (in the old XT version, without e0 prefixes) +has four additional keys, with scancodes 61, +62, 63, 64. The F11 and F12 keys have +scancodes 54 and 55 (instead of the common 57 +and 58). There are two LEDs (for CapsLock and NumLock). +

    +

    + + +
    +

    +

    2.5 NCR keyboard +

    + +

    Also with an XT interface this +NCR keyboard, +still with ten function keys on the left, but already with a separate +block of keys between the ordinary keys and the numeric keypad. +This middle block has on top five keys +Ctrl (1d, same as the Ctrl on the left), +Del (53, same as Keypad-Del/.), +PgUp (49, same as Keypad-9/PgUp), +End (4f, same as Keypad-1/End), +PgDn (51, same as Keypad-3/PgDn), and below five cursor keys +(48, same as Keypad-8/Up; +4b, same as Keypad-4/Left; +47, same as Keypad-7/Home; +4d, same as Keypad-6/Right; +50, same as Keypad-2/Down). +Enter and Keypad-enter are both 1c. +Below the Enter key PrtScn/* (37), and below that again +Ins (52, same as Keypad-0/Ins). +CapsLock and NumLock have a built-in LED. +

    +

    + + +
    +

    +

    +

    2.6 Cherry G80-0777 +

    + +

    According to +FreeKEYB/kbdinfo.html +this keyboard has five additional keys with scancodes +55 (F11), 56 (F12), +57 (F13), 58 (F14), 59 (F15). +

    +

    +


    +Next +Previous +Contents + + diff --git a/specs/kbd/scancodes-3.html b/specs/kbd/scancodes-3.html new file mode 100644 index 0000000..fdc5f60 --- /dev/null +++ b/specs/kbd/scancodes-3.html @@ -0,0 +1,64 @@ + + + + + Keyboard scancodes: Special keyboards - Amstrad/Schneider keyboards + + + + + +Next +Previous +Contents +
    +

    3. Special keyboards - Amstrad/Schneider keyboards

    + +

    Since IBM had patented their keyboard design, +Amstrad developed an entirely different keyboard. +

    +

    3.1 Amstrad/Schneider PC1512 +

    + +

    The +Amstrad keyboard +is entirely incompatible with XT and AT keyboards, and can be used only +on an Amstrad; conversely, no other keyboard will work on an older Amstrad. +This keyboard has a Del key on the keypad, and both Del-> and Del<- keys +above the Enter key. The Del-> key has scancode 70. +Left of the Enter key a PrtSc/* key. +There is an additional Enter key with scancode 74. +It is possible to connect a mouse and/or joystick to the keyboard, +and then these devices also yield scancodes: +77 (joystick button 1), 78 (joystick button 2), +79 (joystick right), 7a (joystick left), +7b (joystick up), 7c (joystick down), +7d (mouse right), 7e (mouse left). +

    +

    + + +
    +

    +

    3.2 Amstrad/Schneider other models +

    + +

    John Elliott adds: +

    The above only mentions the PC1512/PC1640 style of keyboard. +Later Amstrad XTs (PPC512, PPC640, PC20, PC200, PC2086, PC3086) used a +102-key keyboard with the same layout and scancodes as a normal 102-key XT +keyboard. This design is not only incompatible with normal XT and AT +keyboards, it's also incompatible with the PC1512 keyboard. The joystick +socket is no longer present, but mouse button clicks are still handled by +the keyboard, with the same scancodes 7d (right button) and +7e (left button). +

    On the PPC512, PPC640, PC20 and PC200, the keyboard is in the same box as +the motherboard, and is connected directly to it by ribbon cable. +

    +

    +


    +Next +Previous +Contents + + diff --git a/specs/kbd/scancodes-4.html b/specs/kbd/scancodes-4.html new file mode 100644 index 0000000..7c058e6 --- /dev/null +++ b/specs/kbd/scancodes-4.html @@ -0,0 +1,41 @@ + + + + + Keyboard scancodes: Special keyboards - AT keyboards + + + + + +Next +Previous +Contents +
    +

    4. Special keyboards - AT keyboards

    + +

    The AT keyboard adds a keyboard controller. +The numeric keypad is now separated from the main keyboard. +There is a single new key, with scancode 84 = 54, +namely SysRq. +

    The protocol for AT and later keyboards differs from that for +XT keyboards. Some old keyboards have an XT/AT switch on the +backside that selects the appropriate protocol. +Other keyboards autodetect XT or AT mode. +

    +

    + + +
    +

    +

    The KeyTronic KB101-1 keyboard has four switches of which the first two +indicate the desired behaviour (00 - autodetect, 01 - unused, +10 - PC/XT, 11 - AT). Autodetect does not always work. +

    +

    +


    +Next +Previous +Contents + + diff --git a/specs/kbd/scancodes-5.html b/specs/kbd/scancodes-5.html new file mode 100644 index 0000000..ed71bbc --- /dev/null +++ b/specs/kbd/scancodes-5.html @@ -0,0 +1,1411 @@ + + + + + Keyboard scancodes: Special keyboards - MF II keyboards + + + + + +Next +Previous +Contents +
    +

    5. Special keyboards - MF II keyboards

    + +

    Next the modern keyboards. (MF stands for MultiFunctional.) +The layout has changed: the function keys now form a top row. +Function keys F11 and F12 were added. The ten keypad digit keys +that served dual purposes (depending on NumLock and Shift) +were duplicated so that digits and cursor movements could be +produced without help from the Shift or Numlock keys. +Also the Alt and Ctrl keys were duplicated. +Prefixes e0 and e1 were introduced +to distinguish old and new versions of the same old key. +All modern keyboards follow this scheme, but many add a messy +collection of "internet buttons" and "CD keys". +
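To illustrate how the e0 prefix separates the old and new versions of a key, here is a small sketch of a byte-stream decoder. The structure and function names are hypothetical, and the sketch makes two simplifying assumptions: the stream is in translated Set 2 form (release codes have the high bit set), and e1 sequences are not handled.

```c
#include <stdint.h>

/* One decoded key event (names are illustrative only). */
struct key_event {
    int extended;   /* 1 if the code was preceded by e0 */
    int released;   /* 1 if the high bit was set (key release) */
    uint8_t code;   /* scancode with the high bit cleared */
};

/* Decoder state: remembers whether the previous byte was e0. */
struct sc_decoder {
    int pending_e0;
};

/* Feed one byte from the keyboard. Returns 1 and fills *ev when a
   complete event is available, 0 when the byte was only a prefix.
   Assumption: e1 sequences are outside the scope of this sketch. */
int sc_feed(struct sc_decoder *d, uint8_t byte, struct key_event *ev)
{
    if (byte == 0xe0) {
        d->pending_e0 = 1;
        return 0;
    }
    ev->extended = d->pending_e0;
    ev->released = (byte & 0x80) != 0;
    ev->code = byte & 0x7f;
    d->pending_e0 = 0;
    return 1;
}
```

With this decoder the left Ctrl (1d) and the duplicated right Ctrl (e0 1d) yield the same `code` but differ in `extended`, which is all a driver needs to tell them apart.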

    Let us group keyboards according to manufacturer. +

    +

    5.1 Compaq keyboards +

    + +

    +

    +

    Compaq Armada laptop keyboard

    + +

    Christian Gennerat <christian.gennerat@vz.cit.alcatel.fr> writes: +There are 4 extra keys on the Compaq Armada laptops. +The four keys are located over the Esc-F1..F12, and are labelled *1-*4. +Scancodes: 65, 67, 69, 6b. +

    +

    Compaq Easy Access Internet Keyboard

    + +

    Petr Slansky <slansky@usa.net> writes: +

    Internet buttons: +e0 13 online community button (people icon), +e0 14 online Compaq button (Q icon), +e0 15 online services button (bulb icon), +e0 1e online e-mail button (envelope icon), +e0 21 online Search button (magnifier icon), +e0 23 online start button (i icon), +e0 32 online commerce button (shopping basket icon), +

    e0 68 Quick Print button (printer icon), +e0 1f Favorite Application Launch button (racket icon), +

    e0 5f Sleep button, +

    CD/DVD player buttons: +e0 22 Play/Pause, +e0 24 Stop, +e0 19 Next Track, +e0 10 Previous Track, +e0 2c Eject, +

    Volume Control buttons: +e0 30 Volume increase (+), +e0 2e Volume decrease (-), +e0 20 Mute. +

    +

    + + +
    +

    +

    Compaq Eight-Button Easy Access Keyboard

    + +

    A +Compaq keyboard that I have here, +has the usual setup (with Windows keys) plus a top row of eight buttons, +that produce scancodes +e0 23, +e0 1f, +e0 1a, +e0 1e, +e0 13, +e0 14, +e0 15, +e0 1b. +These keys do not produce any codes in scan code Set 3. +

    +

    + + +
    +

    +

    5.2 IBM keyboards +

    + +

    +

    IBM Rapid Access keyboard

    + +

    + +(Information from Dennis Bjorklund <dennisb@cs.chalmers.se> +and others.) +

    The IBM Rapid Access keyboard has 14 extra buttons and two more LEDs +than a normal PC keyboard. By default, these buttons do not generate +any scancodes. To activate them one has to send the sequence +ea 71 to the keyboard. +Once that is done the extra keys generate normal e0xx sequences. +To turn off the extra keys you send ea 70. +

    These 14 keys send the following scancodes (when activated): +

    e0 25 (Suspend), +e0 26 (Help), +e0 32 (Prg 1), +e0 17 (Prg 2), +e0 30 (Prg 3), +e0 2e (Prg 4), +e0 19 (Play CD), +e0 24 (CD Stop), +e0 22 (CD Pause), +e0 1e (Vol -), +e0 20 (Vol +), +e0 23 (Prev song), +e0 21 (Next song), +e0 12 (Mute). +

    +

    The Suspend and Mute buttons have extra LEDs on them. +Sending the sequence + +eb 00 ff +to the keyboard makes all five LEDs light up for a moment. +The sequence eb 00 04 lights the Suspend LED +(behind a waning moon). +The sequence eb 00 20 makes the Mute LED blink. +The sequence eb 00 80 locks the keyboard; +if the Mute LED was blinking it now is lit permanently. +Sending eb 00 ff unlocks the keyboard again. +

    The command ec returns 0c 01 (untranslated) +which becomes 3e 43 in translated scancode Set 2. +(Possibly an ID?) +

    +

    + +Dennis Bjorklund writes: +Here is the hack I use to send commands to the keyboard. After you have +compiled it you can do things like send_to_keyboard ea 71, +but don't run two of these at the exact same moment, and don't send +strange codes because the keyboard might lock up. +

    My computer runs this at every startup. After that the extra buttons on +the rapid access work just fine in XFree86. +

    +
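    +/* gcc -O2 -s -Wall -osend_to_keyboard main.c */
    +#include <stdio.h>
    +#include <stdlib.h>
    +#include <unistd.h>
    +#include <sys/io.h>
    +
    +int main(int argc, char *argv[]) {
    +  int i;
    +
    +  /* Request access to I/O ports 0x60-0x62; this normally needs root. */
    +  if (ioperm(0x60, 3, 1) != 0) {
    +    perror("ioperm");
    +    return 1;
    +  }
    +
    +  /* Each argument is one command byte, written in hexadecimal. */
    +  for (i = 1; i < argc; i++) {
    +    int x = strtol(argv[i], 0, 16);
    +
    +    usleep(300);    /* short pause before each byte */
    +    outb(x, 0x60);  /* write the byte to the keyboard data port */
    +  }
    +
    +  return 0;
    +}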
    +
    +
    +

    +

    + + +
    +

    +

    IBM Rapid Access II keyboard

    + +

    +

    This keyboard has a top row of seven color-coded buttons. +On the upper right a "wheel" composite button with six parts. +Below it a blue button ("mute"). +Finally, the usual block with four arrow keys has been enlarged +by two more keys ("page left" and "page right"). +

    Keys: +

    e0 25 (Green, "Internet"), +e0 26 (Blue, "Internet shopping"), +e0 32 (Yellow, "IBM Web support"), +e0 17 (Purple), +e0 30 (Red), +e0 2e (Cyan, "Help"), +e0 5f (White, "Standby" - has a LED), +e0 20 (CD stop), +e0 22 (CD play), +e0 21 (Volume D), +e0 23 (Volume U), +e0 24 (CD back), +e0 12 (CD fwd), +e0 1e (Mute - no LED). +

    (In translated scancode Set 3, these become +41, 3f, 3d, 3b, 3c, +66, --, 69, 6a, 6b, +6c, 6d, 44, 68, respectively.) +

    The "back" ("page left") and "forward" ("page right") keys +generate ALT+left and ALT+right respectively: +38 e0 4b (release sequence +b8 e0 cb) and +38 e0 4d. +

    The commands ea 70 and ea 71 +serve to switch off (resp. on) the special keys. +(These are on by default, but can be switched off.) +However, the white Standby key is always on. +

    The white Standby button has a LED (that is flashed during a reset). +It is set by the command + +eb 71 +and cleared by the command eb 70. +

    +

    +

    + + +
    +

    +

    IBM ThinkPad

    + +

    George Staikos <staikos@0wned.org> writes: +

    I have an IBM ThinkPad i1460. It has the IBM EasyLaunch<tm> keys. +These are four multicoloured keys up at the top of the keyboard +for "Home Page", "Search", "Shop", "Mail". They don't seem to create +any keyboard events at all. The keyboard interrupt doesn't trigger, +showkeys doesn't see them do anything, and in DOS, a simple +sequence of BIOS calls doesn't see them either. +Also, being a laptop, it has an FN key. This key generates 55. +

    +

    5.3 Logitech keyboards +

    + +

    +

    Logitech Internet keyboard

    + +

    Jonathan DeBoer <deboer@ugrad.cs.ualberta.ca> reports: +This keyboard has 18 unusual keys. +

    e0 7a (WWW), +e0 32 (History), +e0 21 (Open URL), +e0 23 (Home), +38 2a 0f 8f (key press) +8f b8 aa (key release) (Send To Back) - +this sequence simulates Alt+Shift+Tab, but contains two Tab releases, +e0 17 (Print), +e0 10 (Back), +e0 22 (Forward), +e0 24 (Stop), +e0 19 (Refresh), +e0 1e (Search), +e0 12 (Find), +e0 26 (Add Favourite), +e0 18 (Open Favourites), +e0 20 (Hot Links), +e0 30 (Scroll Up), +e0 2e (Scroll Down), +e0 25 (Logitech). +

    +

    Ryan Lortie <desertangel@globalserve.net> writes: +The "Logitech" key is used as a modifier. +In Windows, Logitech-Keypad+ increases volume, Logitech-Keypad- decreases. +There is a conjoined dual-button key for "scroll". +You press the top part to scroll up, the bottom to scroll down. +

    +

    Graham Hay adds: The extra LED is an amber colour, placed above +the www key with a recessed line linking them. Sending eb alone +turns it on. It will flash on/off about once per second after that. +A single ec will turn it off. +

    +

    + + +

    +

    +

    Logitech Cordless Desktop Pro keyboard

    + +

    Nick Rusnov <nick@grawk.net> reports: +

    The special buttons on a Logitech Cordless Desktop Pro keyboard +produce the following scancodes: +

    e0 5f (Moon (sleep)), +e0 32 (Homepage), +e0 6c (Mail), +e0 65 (Search), +e0 66 (runningguuy), +e0 20 (Mute), +e0 2e (VolDown), +e0 30 (VolUp), +e0 22 (Play/Pause), +e0 24 (Stop), +e0 10 (Rewind), +e0 19 (ff), +e0 21 (Logitech). +

    +

    Logitech Access keyboard

    + +

    Denis Kosygin <kosygin@math.princeton.edu> reports: +

    In addition to usual 104 keys in the usual PC layout this keyboard +has 11 extra keys. Ten of them produce the following escape scancodes: +e0 5f (User (moon)), +e0 6c (E-mail), +e0 11 (Messenger/SMS), +e0 12 (Webcam), +e0 20 (Mute (crossed speaker)), +e0 30 (VolUp (triangle up with + sign in it)), +e0 2e (VolDown (triangle down with - sign in it)), +e0 6d (Media), +e0 32 (My Home), +e0 65 (Search). +

    The eleventh key (with keycap "F lock") is a switch between two sets +of scancodes for function keys F1-F12. When "F lock" is pressed, then +F1-F12 act as function keys and produce usual keyscans for these keys. +When "F lock" is depressed, F1-F12 generate the following keyscans: +

    e0 3b (new [F1]), +e0 3c (reply [F2]), +e0 3d (forward [F3]), +e0 3e (send [F4]), +e0 10 (rewind [F5]), +e0 19 (fast forward [F6]), +e0 22 (play/pause [F7]), +e0 24 (stop [F8]), +e0 43 (my com [F9]), +e0 44 (my doc [F10]), +e0 57 (my pic [F11]), +e0 58 (my music [F12]). +

    +

    + + +
    +

    +

    +

    Logitech Cordless Desktop Optical keyboard

    + +

    Stefan reports: +

    The special buttons on a Logitech Cordless Desktop Optical keyboard +produce the following scancodes: +

    e0 69 (Go), +e0 6a (Back), +e0 5f (Sleep), +e0 66 (Favorites), +e0 24 (SeekBack), +e0 22 (SeekForward), +e0 01 (Media), +e0 1e (VolUp), +e0 25 (VolDown), +e0 26 (Mute), +e0 1f (PlayPause), +e0 17 (Stop), +e0 6c (Email), +e0 65 (Search), +e0 02 (Homepage). +

    Some other keys behave differently. +

    +

    +

    5.4 Microsoft keyboards +

    + +

    +

    Some common scancodes found on some Microsoft keyboards. +

    +

    +

    +e0 05 Messenger or Files        e0 07 Redo (on F3 or not)        e0 08 Undo (on F2 or not)  e0 09 Application Left
    +e0 0a Paste                     e0 0b/8b Scroll Up/Down Normal   e0 10 Prev Track, |<<      e0 11/91 Scroll Up/Down Fast
    +e0 12/92 Scroll Up/Down Faster  e0 13 Word                       e0 14 Excel                e0 15 Calendar
    +e0 16 Log Off                   e0 17 Cut                        e0 18 Copy                 e0 19 Next Track, >>|
    +e0 1e Application Right         e0 1f/9f Scroll Up/Down Fastest  e0 20 Mute                 e0 21 Calculator
    +e0 22 Play/Pause                e0 23 Spell (on F10)             e0 24 Stop (cf e0 68)      e0 2e Volume -
    +e0 30 Volume +                  e0 32 Web/Home                   e0 3b Help (on F1)         e0 3c My Music or Office Home (on F2)
    +e0 3d Task Pane (on F3)         e0 3e New (on F4)                e0 3f Open (on F5)         e0 40 Close (on F6)
    +e0 41 Reply (on F7)             e0 42 Fwd (on F8)                e0 43 Send (on F9)         e0 57 Save (on F11)
    +e0 58 Print (on F12)            e0 5b LeftWindows                e0 5c RightWindows         e0 5d Application (Menu)
    +e0 5e Power                     e0 5f Sleep                      e0 63 Wake                 e0 64 My Pictures
    +e0 65 Search                    e0 66 Favorites                  e0 67 Refresh              e0 68 Stop (cf e0 24)
    +e0 69 Forward                   e0 6a Back                       e0 6b My Computer          e0 6c Mail
    +e0 6d Media
    + +
    +

    +

    +

    Microsoft Natural keyboard

    + +

    This keyboard has three additional keys, with escaped scancodes +e0 5b (LeftWindow), +e0 5c (RightWindow), +e0 5d (Menu). +The untranslated Set 2 scancodes (see +below) +are e0 1f, e0 27 and +e0 2f, respectively. +The USB key codes are usage page 0x07, usage index 227, 231, 101 +(decimal), respectively. +Microsoft +describes +the intended use in detail. Both Windows keys are intended to be +used as modifier keys, like both shift and control and alt keys. +The Menu key may be modified by shift etc. +

    +

    Microsoft Internet keyboard

    + +

    In addition to the three extra keys on the Microsoft Natural keyboard, +this keyboard has ten keys, with escaped scancodes +e0 6a (Back), +e0 69 (Forward), +e0 68 (Stop), +e0 6c (Mail), +e0 65 (Search), +e0 66 (Favorites), +e0 32 (Web/Home), +e0 6b (My Computer), +e0 21 (Calculator), +e0 5f (Sleep). +The untranslated Set 1 codes are as expected (make codes identical to +the above translated Set 2 ones). The translated Set 3 codes are +6a, 69, 68, 6c, 65, +66, 97, 6b, 99, 54, +respectively. +

    +

    Microsoft Natural keyboard pro

    + +

    Marco Melgazzi <marco@techie.com> reports: +The Microsoft Natural keyboard pro has 19 additional keys, +with escaped scancodes +e0 6a (Back), +e0 69 (Forward), +e0 68 (Stop), +e0 67 (Refresh), +e0 65 (Search), +e0 66 (Favorites), +e0 32 (Web/Home), +e0 6c (Mail), +e0 20 (Mute), +e0 2e (Volume -), +e0 30 (Volume +), +e0 22 (Play/Pause), +e0 24 (Stop), +e0 10 (Prev Track), +e0 19 (Next Track), +e0 6d (Media), +e0 6b (My Computer), +e0 21 (Calculator), +e0 5f (Sleep). +(That is, we have the ten extra keys of the Microsoft Internet keyboard, +with the same scancodes, and also Refresh, Mute, Volume -, Volume +, +Play/Pause, Stop, Prev Track, Next Track, Media.) +

    +

    Microsoft Natural Multimedia Keyboard

    + +

    Jeremy Brand <jeremy@nirvani.net> reports: +The Microsoft Natural Multimedia Keyboard has 17 additional keys. +Scancodes are +

    ? (My Documents), +e0 64 (My Pictures), +e0 3c (My Music), +e0 20 (Mute), +e0 22 (Play/Pause), +e0 24 (Stop), +e0 30 (Volume +), +e0 2e (Volume -), +e0 10 (|<<), +e0 19 (>>|), +e0 6d (Media), +e0 6c (Mail), +e0 32 (Web/Home), +e0 05 (Messenger), +e0 21 (Calculator), +e0 16 (Log Off), +e0 5f (Sleep). +

    Moreover, the function keys are dual purpose. +There is a "function lock" key. +By default the function keys are not function keys, they are +"Help", "Undo", etc. You have to press the function lock key +and then the function keys act like the usual function keys. +In the default state the scancodes are +

    e0 3b (Help) on F1 key, +e0 08 (Undo) on F2 key, +e0 07 (Redo) on F3 key, +? (New) on F4 key, +? (Open) on F5 key, +? (Close) on F6 key, +? (Replay) on F7 key, +e0 42 (Fwd) on F8 key, +e0 43 (Send) on F9 key, +e0 23 (Spell) on F10 key, +e0 57 (Save) on F11 key, +e0 58 (Print) on F12 key. +

    +

    +

    Microsoft Office keyboard

    + +

    Christian Hammond +reports +about the keyboard Scroll Wheel: +The following is my interpretation of the results of +showkey -s. I had read that the wheel has 3 speeds, +normal, fast, and faster. However, my results show 4. +

    Scroll Up: Normal e0 0b, +Fast e0 11, +Faster e0 12, +Fastest e0 1f. +

    Scroll Down: Normal e0 8b, +Fast e0 91, +Faster e0 92, +Fastest e0 9f. +
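In the two lists above, each Scroll Down code is the matching Scroll Up code with the high bit set, so direction and speed can be separated with a mask. The following is a rough sketch based only on the eight codes reported here; the function name `wheel_speed` and the numbering of speeds from 1 to 4 are assumptions made for illustration.

```c
#include <stdint.h>

/* Decode the byte that follows e0 for the scroll wheel.
   Returns 0 if the byte is not one of the reported wheel codes.
   Otherwise returns the speed (1 = Normal, 2 = Fast, 3 = Faster,
   4 = Fastest) and sets *down to 0 for Scroll Up, 1 for Scroll Down. */
int wheel_speed(uint8_t code, int *down)
{
    int speed;

    switch (code & 0x7f) {
    case 0x0b: speed = 1; break;  /* Normal */
    case 0x11: speed = 2; break;  /* Fast */
    case 0x12: speed = 3; break;  /* Faster */
    case 0x1f: speed = 4; break;  /* Fastest */
    default:   return 0;          /* not a wheel code */
    }
    *down = (code & 0x80) != 0;   /* high bit distinguishes Down from Up */
    return speed;
}
```

Because the high bit normally marks a key release, a generic keyboard driver would see Scroll Down as the "release" of the Scroll Up key, so the wheel needs this kind of special handling.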

    Wouter van Wijk <woutervanwijk@netscape.net> reported the scancodes +given below. +

    On the left touchpad above the scroll wheel: +e0 6a (Back), +e0 69 (Forward). +On the left touchpad below the scroll wheel: +e0 17 (Cut), +e0 18 (Copy), +e0 0a (Paste), +e0 09 (Application Left), +e0 1e (Application Right), +

    Buttons on the top row: +No scancode (F Lock), +e0 13 (Word), +e0 14 (Excel), +e0 32 (Web/Home), +e0 6c (Mail), +e0 15 (Calendar), +e0 05 (Files), +e0 21 (Calculator), +e0 20 (Mute), +e0 2e (Volume -), +e0 30 (Volume +), +e0 16 (Log Off), +e0 5f (Sleep). +This is the expected code for Sleep. However, there do not seem to be +Power and WakeUp keys. +

    The twelve function keys can be in two states. In the default state +they produce the (new) codes below. The FLock toggle switches them +back to good old function key state. +e0 3b (Help [F1]), +e0 3c (Office Home [F2]), +e0 3d (Task Pane [F3]), +e0 3e (New [F4]), +e0 3f (Open [F5]), +e0 40 (Close [F6]), +e0 41 (Reply [F7]), +e0 42 (Fwd [F8]), +e0 43 (Send [F9]), +e0 23 (Spell [F10]), +e0 57 (Save [F11]), +e0 58 (Print [F12]). +Note that each of these codes is just the e0 variation +of the ordinary function key code, except for that for Spell [F10]. +When the FLock light is off (default) the e0-version +is activated. +

    Above the 5-key block with Insert, Home, Delete, PgUp, PgDown: +e0 08 (Undo), +e0 07 (Redo). +

    Above the number pad: +59 (=), +e0 4c (( [PrintScreen]), +e0 64 () [ScrollLock]), +0e (Backspace), +0f (Tab). +These are the usual codes for Backspace and Tab but new codes +for (, ), =. PrintScreen and ScrollLock have the usual codes. +

    +

    + + +
    + +See the +Microsoft ad. +

    +

    5.5 Safeway keyboards +

    + +

    +

    +

    Safeway SW10 keyboard

    + +

    The Safeway SW10 keyboard has the usual keys, including the three +Windows keys, and including Power, Sleep, Wake keys +(below Delete, End, PageDown) that do not produce scancodes +unless the Fn key (above Keypad-Minus) is pressed simultaneously. +This Fn key is used together with 11 keys: F1-F7, F11, Power, Sleep, Wake. +Fn-F11 disables the keyboard and another Fn-F11 enables it again. +Fn-F1/F2/F3/F4/F5/F6/F7 sets the repeat rate +(on my keyboard I measured 2.0/4.0/6.7/12/26/32/32 chars/sec respectively). +

    +

    Safeway SW23 keyboard

    + +

    The Safeway SW23 keyboard has 132 keys: the usual 104 keys +(101 plus three Windows keys), five more keys called Turbo +(below Enter, right of RShift), and Power, Sleep, Wake +(below Delete, End, PageDown), and Ez (above Keypad-Minus), +and 23 buttons in two rows above the row of function keys. +By default, the five extra keys do not produce scancodes. +(The Ez is a mode toggle. The Turbo key is used to enable +the Power, Sleep, Wake keys.) +

    First row of buttons: three Volume buttons: +e0 58 (Mute), +e0 5a (Vol -), +e0 70 (Vol +), +five CD Player buttons: +e0 59 (Prev), +e0 42 (Play), +e0 69 (Next), +e0 64 (Stop), +e0 71 (Eject), +two Recorder buttons: +e0 40 (Rew/Play), +e0 29 (Rec/Stop). +

    Second row of buttons: +e0 23 (Sleep), +e0 7d (Cut), +e0 7e (Copy), +e0 7f (Paste), +e0 20 (Rotate), +e0 43 (Close), +e0 30 (My Doc), +e0 44 (DOS), +e0 79 (Game), +e0 77 (WWW), +e0 6e (Calc), +e0 3e (X'fer), +e0 6a (Menu/?). +

    The Ez key does not produce scancodes, but toggles a +M/Mode LED, the fourth next to the Num, Caps, Scroll LEDs. +When that LED is set, the 17 keypad keys give different +scancodes: +e0 3c (N/Lock), +e0 7b (/), +e0 22 (*), +e0 61 (-), +e0 0f (7), +e0 21 (8), +e0 6b (9), +e0 3d (+), +e0 04 (4), +e0 62 (5), +e0 39 (6), +e0 10 (1), +e0 24 (2), +e0 05 (3), +e0 02 (0), +e0 41 (.), +e0 3f (Enter). +

    The Turbo key does not produce scancodes, and neither do +Power, Sleep, Wake. However, when Turbo is pressed simultaneously, +the Power, Sleep, Wake keys yield e0 5e, +e0 5f, e0 63 as +they should. +

    In untranslated scancode mode 3, the multimedia and power keys +do not yield any code. In untranslated scancode mode 1 they +yield the same code as in untranslated scancode mode 2. +(This is a design bug: untranslated scancode mode 1 should be the same +as translated scancode mode 2 (see +below), +and this is true for the ordinary keys, but fails here for the +"multimedia" keys. For example, the keys End and Keypad-Minus +(in M/Mode) yield the same e0 4f in +untranslated scancode mode 1.) +

    Note that some "protocol keycodes" occur here with e0 prefix. +Indeed, we see e1, ee, f1, fe, ff +in the key up sequence for the multimedia keys Keypad-Minus +(e0 e1), Calc (e0 ee), +Eject (e0 f1), Copy (e0 fe), +Paste (e0 ff). +

    +

    5.6 Internet Wireless Keyboard +

    + +

    This keyboard (nameless, made in China) has 9+1+9 buttons, +nine on each side of the Sleep button. +Buttons: +e0 6a (Web Backward), +e0 69 (Web Forward), +e0 68 (Web Stop), +e0 67 (Web Refresh), +e0 65 (Web Search), +e0 66 (Web Favorites), +e0 32 (Web Home), +e0 6c (E-mail), +e0 20 (Mute), +e0 5f (Sleep), +e0 2e (Volume Down), +e0 30 (Volume Up), +e0 22 (Play/Pause), +e0 24 (Stop), +e0 10 (Fast Backward), +e0 19 (Fast Forward), +e0 6d (Media Player), +e0 6b (My Computer), +e0 21 (Calculator). +

    This keyboard reports +keyboard ID +ab 83 (translated ab c1). +Scancode sets 1 and 2 are reported as 01 and 02 +(translated c3 and c1). +These translations are bugs, but otherwise all seems to function +as expected, except that this keyboard does not recognize +scancode set 3 and returns fe for an attempt to set Set 3. +Every command ed xx is accepted, but there are no LEDs, +there is only a battery indicator. +

    The mouse that accompanies the keyboard shows no reactions. +It may need a special driver. +

    +

    5.7 Nokia keyboard +

    + +

    This 121-key +Nokia keyboard +has ten function keys on the left and twenty-four on +the two top rows. On the right a block with cursor keys +and a block with numeric keys. There are three LEDs. +The keys have brown markings, and sometimes also blue ones. +Where both occur, the blue markings describe the usual PC keytops. +

    Roughly speaking, the scancodes are as expected. +The +function keys F1-F10,F11,F12 +have scan codes 3b-44, 57, 58 as usual. +The keys on the upper row, labeled F13-F24, yield the same codes as +shifted F1-F12. E.g., F13 gives 2a 3b on press, +and bb aa on release. +The function keys F4,F11,F13-F19,F21,F24 have front labels +CrSel, AltCr, Red, Pink, Green, Yellow, Blue, Turq, White, +Col, USM. +
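The F13-F24 rule can be written down as a short sketch. Only the F13 sequences are given explicitly in the text; the sequences this code produces for F14-F24 are extrapolated from the statement that these keys behave like shifted F1-F12, and the function names are hypothetical.

```c
#include <stdint.h>

/* Scancode of function key n (1..12): F1-F10 are 3b-44,
   F11 is 57 and F12 is 58. Returns 0 for any other n. */
static uint8_t fkey_code(int n)
{
    if (n >= 1 && n <= 10)
        return (uint8_t)(0x3b + n - 1);
    if (n == 11)
        return 0x57;
    if (n == 12)
        return 0x58;
    return 0;
}

/* Fill in the press and release sequences for F13..F24.
   Returns 1 on success, 0 if n is out of range.
   Assumption: every key follows the pattern reported for F13. */
int nokia_fkey_sequences(int n, uint8_t press[2], uint8_t release[2])
{
    uint8_t base;

    if (n < 13 || n > 24)
        return 0;
    base = fkey_code(n - 12);
    press[0] = 0x2a;                     /* Shift pressed */
    press[1] = base;                     /* function key pressed */
    release[0] = (uint8_t)(base | 0x80); /* function key released */
    release[1] = 0xaa;                   /* Shift released */
    return 1;
}
```

For n = 13 this reproduces the 2a 3b and bb aa sequences quoted above.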

    The +ten keys on the left +have the following scancodes. +First column of five: +01 (Attn/Esc/NxtTsk), as expected for Esc; +1d 3b (Quit/Reset), as expected for Ctrl F1; +1d 3c (ExSel), as expected for Ctrl F2; +1d 3d (Ident/Print), as expected for Ctrl F3; +1d 3e (Help/EnlW), as expected for Ctrl F4. +For these last four keys (and the ChgSc/WSCtrl below) the code +becomes 3b-3e (and 3f) when left or right +Ctrl is pressed already. +Second column of five: +e1 1d 45 ((Break)/Clear/Pause/Test), and e0 46 +with Ctrl, as expected for Pause/Break; +46 (ScrLock), as expected for ScrLock; +e0 2a e0 37 (PrtSc/SysRq), and e0 37 with +left or right Ctrl or left or right Shift, and 54 +with left or right Alt, as expected for PrtSc; +1d 3f (ChgSc/WSCtrl), as expected for Ctrl F5; +38 e0 49 (Jump), as expected for Alt PgUp. +

    On +the right a cursor key +section and a number pad. +The cursor key section has the expected block of six: +e0 52 (Dup/Insert/PA1); +e0 47 (Field Mark/Home/PA2); +e0 49 (PA3/PgUp); +e0 53 (Delete/DelWd); +e0 4f (ErEOF/End/ErInp); +e0 51 (PgDn). +Next four arrow keys: +e0 48 (Up); +e0 4b (Left); +e0 4d (Right); +e0 50 (Down). +And in the middle 1d 40 (Home), with code as expected for Ctrl F6. +

    Finally the numeric keypad, with the usual keys that generate the +usual codes, and a single additional key, a Tab, with 0f +like the ordinary tab. +

    +

    + + +
    +

    +

    5.8 Focus KeyPro FK-9000 keyboard +

    + +

    Raul D. Miller <rockwell@nova.umd.edu> +and Timothy C. Hagman <hagmanti@cps.msu.edu> +report: +

    The keyboard is a KeyPro FK-9000. The FCC label says it's made in +Taiwan by Focus Electronic Co, Ltd. It has a built-in calculator. +

    This keyboard has twelve additional keys, with scancodes +55 (PF1), +6d (PF11), +6f (PF12), +73 (PF2), +74 (PF9), +77 (PF3), +78 (PF4), +79 (PF5), +7a* (PF6), +7b (PF7), +7c (PF8), +7e* (PF10). +

    The break codes equal the make codes ORed with 0x80, as always, +but the Linux kernel eats fa and fe as +protocol bytes. +
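The two starred codes in the list above are exactly the ones whose break codes collide with those protocol bytes: 7a ORed with 0x80 is fa, and 7e ORed with 0x80 is fe. A small sketch of the check follows; the function names are made up for illustration, and the set of protocol bytes is limited to the two named in the text.

```c
#include <stdint.h>

/* Break (release) code for a given make code. */
uint8_t break_code(uint8_t make)
{
    return (uint8_t)(make | 0x80);
}

/* The two bytes the text says are consumed as protocol bytes. */
int eaten_as_protocol_byte(uint8_t b)
{
    return b == 0xfa || b == 0xfe;
}

/* 1 if the release of this key would never reach the keyboard driver's
   key handling, because its break code looks like a protocol byte. */
int release_is_lost(uint8_t make)
{
    return eaten_as_protocol_byte(break_code(make));
}
```

Of the twelve PF codes listed, `release_is_lost` is true only for 7a (PF6) and 7e (PF10).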

    The behavior of these keys is different from that of normal keys-- +they generate nothing when pressed; then generate the above scancodes +at the normal repeat time and rate, and then generate (except for the +starred ones) their scancode ORed with 0x80 when released... +

    These PF keys are reprogrammable -- and programming occurs as a sequence +of keyboard actions. Therefore, the PF keys duplicate whatever +keyboard actions occurred during their programming. +You hit the "Prog" key, then the PF key you want to program; type the +string you want to store in the key (it's limited to 14 keypresses), +and then hit the PF key again. After that, when you hit the PF key, +it sends the string, and generates its own abnormal scancode upon +release. When the key is held down, it generates the scancode repeatedly, +but does not generate the string stored in it repeatedly. +

    When you go to program a key, the scancodes for "PF##-" are sent +to the computer, then the scancodes for each key you hit as you +hit it (the shift, etc. keys are an exception-- they send "s-" +and such :), and then, when you hit the PF## key again to end the +programming, it sends a sequence of (at least) 18 "0e 8e"s -- +Backspaces... +

    The program key itself doesn't generate a scancode at any time. +The same applies to the CE and AC/ON keys (part of the calculator). +There is a switch to change between calculator and keyboard mode +which generates no scancodes. +

    When the keyboard is in calculator mode, the entire numeric +keypad (and everything else on the right side) generates no +scancodes. +

    When the keyboard is not in calculator mode, the %, MC, MR, M-, +M+, and Square Root keys all generate ff when pressed, +ff to repeat, and ff on release. +

    The little unlabeled key between the right Ctrl and right Alt +generates 56 when hit, repeats that, and then d6 +when released, just like a normal key. +

    +

    +

    5.9 BTC keyboard +

    + +

    This keyboard has one additional key, with escaped scancode +e0 6f (Macro). (Funny enough it does this +in all modes, each of the three scancode sets, translated or not. +In particular, this Macro key is the only key that generates +two bytes in scancode mode 3.) +

    +

    5.10 LK411 and LK450 keyboards +

    + +

    These keyboards have seven additional keys, with escaped scancodes +e0 0f (LeftCompose), +e0 3d (F13), +e0 3e (F14), +e0 3f (Help), +e0 40 (Do), +e0 41 (F17), +e0 4e (Keypad-minplus). +(LK411 has all seven. LK450 has the last six - the report did not +mention a Compose key.) +There are only two LEDs. The keycaps are unusual. +

    In (translated) scancode Set 3 these keys give codes +68, 44, 42, 40, +3e, 65, 70. +In untranslated Set 2, the F17 key gives e0 83. +

    An +LK411 keyboard, +with +left +and +right hand side enlarged. +

    The keys labeled F18, F19, F20 produce the codes expected for +PrtSc, ScrollLock, Pause. +The keys labelled PF1, PF2, PF3, PF4 produce the codes expected for +NumLock, Keypad-/, Keypad-*, Keypad--. +The Keypad-, key produces the code 4e expected for Keypad-+. +The Right ComposeCharacter key produces the code expected for RCtrl. +The key labelled </> produces the code 29 +expected for `/~. The key labelled with `/~/(Esc) produces +the code expected for Esc. +

    +

    + + +
    +

    +

    5.11 An OmniKey keyboard +

    + +

    This keyboard has one additional key, with escaped scancode +e0 4c (Omni). +

    For the Northgate OmniKey 101 keyboard it is said that the command +e8 reads a 2-byte ID. +

    +

    5.12 GRiD 2260 keyboard +

    + +

    The GRiD 2260 notebook has a key producing the +6c scancode; I do not know the keycap. +

    +

    5.13 An old Olivetti keyboard +

    + +

    Kasper Dupont <kasperd@daimi.au.dk> writes: +My 10 year old 102-key keyboard that came with an "Olivetti PCS 286" +actually has connectors for three additional keys just below Delete, End, +and PgDn. There are no keys on the connectors; I only found them because I +opened the keyboard for cleaning. The scancodes are from left to right +65, 66, 67. +

    +

    5.14 Cherry G81-3000 +

    + +

    According to +Delorie +the "Cherry G81-3000 SAx/04" keyboard has four additional keys, +which can be made available by a user modification; +the three new keys located directly below the cursor pad's +Delete, End, and PgDn keys send make codes 66-68 (F19-F21); +the fourth new key, labeled (delta), sends make code 73. +

    +

    5.15 Accord keyboard +

    + +

    According to +Delorie +the "Accord" +ergonomic keyboard +with optional touchpad has an additional key above the Grey-Minus key +marked with a left-pointing triangle and labeled "Fn" in the owner's +booklet which sends make code e0 68. +

    +

    5.16 Trust Ergonomic keyboard +

    + +

    Frank v Waveren <fvw@var.cx> reports: +The Trust Ergo Track keyboard has one additional key (`application key'), with +escaped scancode e0 68. The keycap is a triangle pointing left. +

    +

    5.17 Brazilian keyboards +

    + +

    ABNT (Associação Brasileira de Normas Técnicas) and ABNT2 +are Brazilian keyboard layout standards. The plain Brazilian +keyboard has 103 keys. +

    The Brazilian ABNT keyboard has two unusual keys, +with scancodes 73 (/?) and 7e (Keypad-.). +The former is located to the left of the RShift (which +key therefore is less wide than usually), the latter below +the Keypad-Plus (reducing the Keypad-Plus to single height). +

    Under Linux, the corresponding key codes are 89 and 121, respectively. +These keys do not function with Windows NT 4.0. +

    Antonio Dias <accdias@sst.com.br> provided the +keypad layout +and writes: Brazilian ABNT2 keyboards come with two layouts. +In MSDOS they call them ID 274 and ID 275. +

    +

    5.18 RC930 keyboard +

    + +

    Torben Fjerdingstad <tfj@olivia.ping.dk> reports: +

    It's an rc930 keyboard, from Regnecentralen/RC International, now ICL. +This keyboard has four additional keys, with scancodes +59 (A1), +5a (A2), +5b (A3), +5c (A4). +

    The rc930/rc931 keyboards are not made anymore, because they had a +problem with fast typists, writing over 400 chars/minute. +Writing 'af<space>', very, very fast, did a PgUp. +

    +

    5.19 Tandberg Data keyboard +

    + +

    Kjetil Torgrim Homme <kjetilho@ifi.uio.no> reports: +

    My Tandberg Data keyboard uses the prefix 80 for +its numerous (20) extra keys. The 80 scancodes are: +

    11, 12, 13, 14, 16, +17, 18, 19, 1e, 1f, +20, 21, 22, 23, 25, +26, 2f, 30, 32, 56. +

    For completeness, the e0 scancodes: +

    1c, 2a, 35, 37, 47, +48, 49, 4b, 4d, 4f, +50, 51, 52, 53. +

    The e1 scancode: 1d. +As you can see, there is no overlap on this keyboard. +

    Harald Arnesen <gurre@start.no> gives the keycaps +for these for the Tandberg TDV5020 keyboard. +All use prefix 80 on both press and release. +

    Thirteen keys have (Norwegian) text: +11 HJELP (help), 14 STRYK (cut), +16 KOPI (copy), 17 FLYTT (move), +19 JUST (justify), 21 MERK (mark), +22 ANGRE (undo), 23 SKRIV (print), +25 SLUTT (exit), 26 FELT (field), +2f AVSN (paragraph), 30 SETN (sentence), +and 32 ORD (word). +

    Seven keys have symbols: 12 /\/\/\ (insert soft hyphen), +13 [Crossed down-arrow] (move down five lines), +18 >> << (justify left/right), +1e <> >< (justify full/center), +1f |<- (backtab), +20 ->| (tab), and +56 [Back/down arrow] (start new paragraph). +

    Other keycaps also occur. Those given above were meant +for use with the Notis WP word processor. +

    +

    5.20 Host Connected keyboard +

    + +

    IBM makes the "Host Connected Keyboard" for PS/2 machines used as +3270 emulators. +Delorie +reports on the 122-key "Host Connected" keyboard. +It may have 5b (F13), 5c (F14), 5d (F15), +63 (F16), 64 (F17), 65 (F18), +66 (F19), 67 (F20), 68 (F21), +69 (F22), 6a (F23), 6b (F24). +

    +

    5.21 A nameless USB keyboard +

    + +

    This keyboard has four additional keys: Power (rose), Sleep (blue), +WakeUp (green) and FN (yellow). +In legacy mode these keys give the expected keycodes +(e0 5e, e0 5f, e0 63, +and none, respectively), but the interaction is funny. +The four keys act as radiobuttons. Pressing one yields its key down code, +but releasing it does not produce any scancodes. Now pressing another +yields the down code for the other followed by the up code for the +previous one. The FN key follows this pattern, only its scancode sequence +is empty. Thus, pressing it causes the release code for a previous key +to be emitted. Pressing a key a second time gives no reaction: the radiobutton +was down already. +
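    The radio-button behaviour described above can be modelled as a small state machine. The following is only a sketch: the names `radio_key`, `radio_make` and `radio_press` are invented for illustration, and it assumes that the "up code" is the make code with bit 7 set behind the same e0 prefix, as for other translated codes.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical key indices for this sketch; not taken from any driver. */
enum radio_key { RK_NONE = -1, RK_POWER, RK_SLEEP, RK_WAKEUP, RK_FN };

/* Second byte of each key's e0-prefixed make code; 0 means "no code" (FN). */
static const unsigned char radio_make[] = { 0x5e, 0x5f, 0x63, 0x00 };

/* Appends to OUT the bytes emitted when KEY is pressed while *HELD is
   the key currently latched down.  LEN is the number of bytes already
   in OUT, CAP its capacity.  Returns the new length. */
static size_t
radio_press (enum radio_key key, enum radio_key *held,
             unsigned char *out, size_t len, size_t cap)
{
  assert (key >= RK_POWER && key <= RK_FN);
  if (key == *held)
    return len;                 /* Already down: no reaction. */

  if (radio_make[key] != 0)
    {
      assert (len + 2 <= cap);
      out[len++] = 0xe0;
      out[len++] = radio_make[key];             /* Down code, new key. */
    }
  if (*held != RK_NONE && radio_make[*held] != 0)
    {
      assert (len + 2 <= cap);
      out[len++] = 0xe0;
      out[len++] = radio_make[*held] | 0x80;    /* Up code, previous key
                                                   (assumed make | 0x80). */
    }
  *held = key;
  return len;
}
```

    Under these assumptions, pressing Power and then Sleep yields e0 5e, then e0 5f e0 de; pressing FN afterwards yields only e0 df, the release of Sleep.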

    +

    5.22 Omnibook keyboard +

    + +

    + +The HP Omnibook XE3 laptop has special multimedia keys (aka OneTouch buttons) +disabled by default. It is enabled by writing 0x59 to port 0x64 +and then 0x90 to port 0x60 (as was found by Pavel Mihaylov). +Various kernel patches can be found on the net. See, for example, +this one. +

    Keys (on a GF model): +

    e0 32 (WWW), +e0 6c (Mail), +e0 74 (Demo), +e0 73 (Help), +e0 10 (Previous Track), +e0 22 (Play / Pause), +e0 24 (Stop / Eject), +e0 19 (Next Track), +e0 2e (Volume Down), +e0 30 (Volume Up), +e0 20 (Mute / Unmute). +

    +

    +

    +

    5.23 EZ Button keyboard +

    + +

    Eric Schott <eric@morningjoy.com> writes: +

    I have an IBM EZ Button keyboard (US layout), which seems to +generate codes that are similar - but not identical - to the +Rapid Access keycodes listed above. +

    There are 14 additional keys: +

    e0 25 ("Power" moon - has an LED), +e0 26 ("Help"), +e0 32 ("Internet"), +e0 17 ("Lotus Word Pro"), +e0 30 ("Lotus Organizer"), +e0 2e ("Aptiva Installer"), +e0 19 ("Delete Message"), +e0 24 (Stop), +e0 22 (Pause), +e0 1e ("Msg" - has an LED), +e0 20 ("CD" - has an LED), +e0 23 (Rewind), +e0 21 (Fast Forward), and +e0 12 ("Talk" - has an LED). +

    +

    The LEDs in the buttons are controlled by the sequence + +eb 00 xx +where the xx controls the LEDs. Bit 0 controls the "Msg" LED, +1 the CD LED, 2 the Power LED, 4 the Talk LED, and 5 the Message +Waiting LED. +
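    As an illustration of the LED byte layout just described, here is a minimal sketch that builds the three-byte eb 00 xx sequence. The enum and function names are made up for this example, and how the bytes reach the keyboard is left out entirely.

```c
#include <assert.h>

/* LED bit positions as listed in the text (bit 3 is not mentioned there). */
enum ez_led
  {
    EZ_LED_MSG         = 1 << 0,
    EZ_LED_CD          = 1 << 1,
    EZ_LED_POWER       = 1 << 2,
    EZ_LED_TALK        = 1 << 4,
    EZ_LED_MSG_WAITING = 1 << 5
  };

/* Fills CMD with the sequence eb 00 xx for the given LED mask. */
static void
ez_led_command (unsigned leds, unsigned char cmd[3])
{
  assert ((leds & ~0x37u) == 0);   /* Only the documented bits. */
  cmd[0] = 0xeb;
  cmd[1] = 0x00;
  cmd[2] = (unsigned char) leds;
}
```

    For example, lighting the "Msg" and "Talk" LEDs together would use the mask EZ_LED_MSG | EZ_LED_TALK, giving the bytes eb 00 11.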

    +

    +

    5.24 Chicony KBP-8993 keyboard +

    + +

    Matthijs Melchior <mmelchio@xs4all.nl> reports: +

    The Chicony KBP-8993 keyboard is similar. It has 14 additional +keys, enabled by sending ea 71 and disabled +by sending ea 70. +

    These keys generate the following scan codes: +

    e0 25 (Moon), +e0 32 (WWW), +e0 30 (DOS), +e0 17 (MyDoc), +e0 26 (Menu), +e0 1e (zzZ), +e0 2e (Close), +e0 24 (Stop), +e0 23 (Back), +e0 22 (Play), +e0 21 (Forward), +e0 20 (Mute), +e0 12 (VolDown), +e0 19 (VolUp). +

    The two extra LEDs, above the Moon key, and next to the zzZ key +are manipulated by sending: + +eb 00 0x, +where bit 0 is the Moon LED and bit 1 is the zzZ LED. +

    +

    5.25 Keyboards for HP Kayak and Vectra +

    + +

    Fons Rademakers <Fons.Rademakers@cern.ch> writes: +

    The electronics for this keyboard was first developed by HP's +Home Products Division (HPD). +They now make improved versions, which I don't know much about. +We (HP Corporate PC Divisions, in Grenoble) reused the electronics, +and changed the serigraphy printed on the keys. +

    +

    +
    +Msg TTl WWW ?  Lck   Msg Phn WWW xxx Slp   133 134 135 136 137
    +Phn S3  S4  S5 i     <<  >|| []  >>  HP    138 139 140 141 142
    + Mut Mut 143
    + Vl+ Vl+ 144
    + VL- VL- 145
    + +
    Grenoble keyboard ------- Old HPD keyboard -------- key numbers
    +

    +

    +

    + Key# Scancode Gren. Name HPD name ASCII
    +
    + 133 e0 1e Message/SC1 Message a
    + 134 e0 12 Top Tools Phone e
    + 135 e0 32 Web Browser Internet m
    + 136 e0 17 Reminder Shortcut i
    + 137 e0 25 Lock Suspend k
    + 138 e0 23 Phone/SC2 << h
    + 139 e0 22 ShortCut 3 >|| g
    + 140 e0 24 ShortCut 4 [] j
    + 141 e0 21 ShortCut 5 >> f
    + 142 e0 26 Information Information l
    + 143 e0 20 Mute Mute d
    + 144 e0 30 Volume + Volume + b
    + 145 e0 2e Volume - Volume - c
    + +
    +

    Note the scancodes above are those read by x86 software in port 0x60. +This is also called Scancode Set 1. +Break codes are the same, with bit 7 of the second scancode set. +Example: e0 9e for the Message key. +

    +

    <spikboll@gmx.net> adds: +These keyboards have a "mail LED" (it's positioned above the Message +button) that can be controlled by the Rapid Access hack: +'send_to_keyboard eb' makes the LED blink and +'send_to_keyboard ec' turns the LED off. +'send_to_keyboard ed' makes the LED light steadily +and locks up the keys. +

    +

    5.26 A keyboard +

    + +

    Jon Masters <jonathan@easypenguin.co.uk> writes: +

    My new 121 key keyboard has 105 keys + 16 multimedia keys +(including cool stuff like a volume jog dial that sends one scancode +when turned one way and another when turned the opposite way). +

    e0 5e (Power Off), +e0 5f (Sleep), +e0 63 (Resume), +e0 2e (Help), +e0 20 (My Favourite), +e0 30 (Browser), +e0 32 (WWW Search), +e0 26 (Shortcut), +e0 25 (Volume Down), +e0 1e (Volume Up), +e0 12 (Mute), +e0 22 (Previous), +e0 10 (Stop), +e0 24 (Next), +e0 21 (Eject), +e0 19 (Play). +

    +

    5.27 Yahoo! keyboard +

    + +

    Bernhard Polzin <B.Polzin@web.de> writes: +

    I have a transparent violet colored "Yahoo!" Keyboard with extra keys +for Internet and Audio. Unusual scancodes (untranslated/translated): +

    e0 37 / e0 5e (Power), +e0 3f / e0 5f (Sleep), +e0 5e / e0 63 (Wake), +e0 21 / e0 2e (Y!), +e0 4b / e0 26 (Short Cut), +e0 3a / e0 32 (E-Mail), +e0 23 / e0 20 (My Doc), +e0 32 / e0 30 (WWW), +e0 1c / e0 1e (Volume +), +e0 42 / e0 25 (Volume -), +e0 24 / e0 12 (Mute), +e0 15 / e0 10 (Stop), +e0 4e / e0 0c (Play/Pause), +e0 34 / e0 22 (Prev Track), +e0 3d / e0 08 (Next Track), +e0 4d / e0 19 (Eject). +(Volume +), (Volume -), (Prev Track) and (Next Track) are typematic. +

    Note that this is very similar to the previous one. +

    +

    + + +
    +

    +

    5.28 Honeywell Multimedia Keyboard +

    + +

    Eric Yeo reports that his Honeywell Multimedia Keyboard has the following +additional keys: +e0 25 (Screen saver), +e0 24 (Mail), +e0 32 (WWW), +e0 10 (Game), +e0 26 (Calc), +e0 1e (Shortcut 1), +e0 18 (Shortcut 2), +e0 12 (Prev), +e0 22 (Next), +e0 19 (Play), +e0 23 (Stop), +e0 30 (Vol up), +e0 2e (Vol down), +e0 17 (Eject), +e0 20 (Mute). +

    +

    5.29 Samsung Ergonomics Keyboard +

    + +

    Miguel Costa reports that his +Samsung Ergonomics Keyboard has the following additional keys: +e0 2e (Vol down), +e0 30 (Vol up), +e0 20 (Mute), +e0 18 (Eject), +e0 22 (PlayPause), +e0 24 (Stop), +e0 10 (Rewind), +e0 19 (Forward), +e0 26 (Help), +e0 59 (Favorites), +e0 09 (Exit), +e0 0a (Address book), +e0 02 (Action 1), +e0 03 (Action 2), +e0 04 (Action 3), +e0 05 (Action 4), +e0 06 (Action 5), +e0 32 (Internet), +e0 6c (Email), +e0 5f (Standby), +e0 5b (Windows left), +e0 5c (Windows right), +e0 5d (Windows task). +

    +

    + + +
    +

    +

    5.30 The "LiteOn MediaTouch Keyboard" type SK-2500 +

    + +

    Serge van den Boom reports that his LiteOn MediaTouch Keyboard +(a Trust "Direct Access Keyboard"), has 18 additional keys: +e0 25 (Suspend), +e0 7a (Coffee), +e0 32 (WWW), +e0 21 (Calculator), +e0 23 (Xfer), +38 2a 0f 8f / 8f b8 +aa (Switch window), +e0 17 (Close), +e0 10 (|<<), +e0 22 (>| / []), +e0 24 ([]), +e0 19 (>>|), +e0 1e (Record), +e0 12 (Rewind), +e0 26 (Menu/?), +e0 18 (Eject), +e0 20 (Mute), +e0 30 (Volume +), +e0 2e (Volume -). +Of these, the keys (|<<), (>>|), (Volume +), (Volume -) repeat. +The others do not, except for the rather special (Switch window) +key. Upon press it produces the LAlt-down, LShift-down, Tab-down, +Tab-up sequence; it repeats 0f, that is, Tab-down; +and upon release it produces the Tab-up, LAlt-up, LShift-up sequence. +

    +

    + + +
    +

    +

    +

    5.31 The Acer Aspire 1310LC laptop +

    + +

    Pau Aliagas reports that his Acer Aspire 1310LC laptop has 4 +additional keys: +e0 6c (Mail), +e0 32 (WWW), +e0 74 (P1), +e0 73 (P2). +

    +

    +

    5.32 The Emachines eKB-5190(A) keyboard +

    + +

    This keyboard has 19 additional keys, with translated Set 2 scancodes: +e0 1e (Banking), +e0 25 (Brokerage), +e0 26 (Pay Bills), +e0 24 (News), +e0 21 (Sports), +e0 22 (Travel), +e0 32 (Shopping), +e0 23 (Tickets), +e0 31 (Music), +e0 18 (Health), +e0 30 (Greetings), +e0 1f (Games), +e0 13 (Auctions), +e0 2e (MySite), +e0 20 (Telephone), +e0 12 (Surf), +e0 19 (Search), +e0 10 (Vol -), +e0 17 (Vol +). +The respective untranslated Set 3 codes are +95, 9d, 9c, 94, 99, +93, 97, 9a, 9e, 9f, +91, a3, a2, 92, 9b, +96, a0, a1, 98 (equal to the +translated Set 3 codes). +

    Unusual commands are e4 0b, which returns +bc 1c (untranslated 06 f0 5a), +and e4 0c, which returns +ff (untranslated 00), +and ec 0c, which returns 06 regardless of +translation. I do not know the meaning or function of these. +

    +

    5.33 Keyboards with many keys +

    + +

    The current mechanism is unable to handle keyboards with more than +127 keys. But such keyboards seem to exist. Indeed, I now have a +Safeway SW23 that has 132 keys. +

    Mark Hatle <fray@kernel.crashing.org> wrote: +

    On some ADB keyboards there are actually 128 distinct keys. +They use scancodes 0-127. +

    ADB is Apple Desktop Bus. The way that ADB works is similar to SCSI but +on a much slower level. Specifically there is a communications chip in +the computer, ADB controller, and the same chip in the keyboard. The +keyboard sends the scancode to its internal ADB controller, the internal +ADB controller then does any key mapping needed (not used under linux +from my understanding) and passes the data to the computer. +

    The ADB controller is capable of sending 256 distinct keys, but to my +knowledge only 128 are sent. The key 0 is the 'a' and key 127 is the +"power button". +

    Also some of the Apple ADB keyboards have special "sound" and "function" +keys. These keys (used in MacOS for volume up and down, screen contrast +changing, etc) also show up on the ADB scancodes. +

    ADB is used for both m68k and PPC Linux. The m68k Macintosh port, and +the PPC - Power Macintosh and CHRP ports. +

    and later: +

    Basically the scancode sequences for ADB are 16 bit, so there can actually +be 65536 scancodes; currently, though, only 128 are defined. +

    +

    5.34 A keyboard treating PrtSc/SysRq like Pause/Break +

    + +

    + +Mike A. Harris <mharris@meteng.on.ca> +reports a keyboard (an "Mtek" keyboard, model "K208") +where PrtSc/SysRq behaves like Pause/Break and also sends both make +and break sequences when pressed and nothing when released. +It does not repeat. +(Thus, he gets e0 2a e0 37 +e0 b7 e0 aa for PrtSc press, +and 54 d4 for SysRq (i.e., Alt+PrtSc).) +Others have reported the same (for an unspecified type of keyboard). +

    +


    +Next +Previous +Contents + + diff --git a/specs/kbd/scancodes-6.html b/specs/kbd/scancodes-6.html new file mode 100644 index 0000000..32f8c4c --- /dev/null +++ b/specs/kbd/scancodes-6.html @@ -0,0 +1,304 @@ + + + + + Keyboard scancodes: NCD keyboards + + + + + +Next +Previous +Contents +
    +

    6. NCD keyboards

    + +

    Some keyboards natively produce +Set 3 scancodes. +When connected to a PC one will by default see translated Set 3 scancodes. +This means that the F9 and F10 keys have make codes 60 and 61 +and break codes e0 and e1. Thus, these latter codes are +ordinary key release codes here, not protocol codes. +
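    The clash described above follows directly from the rule that a translated release code is the make code with bit 7 set. The sketch below shows this; `set3_keyboard` is a hypothetical flag that a driver would have to obtain some other way (for instance from the keyboard ID), and both function names are invented for illustration.

```c
#include <assert.h>
#include <stdbool.h>

/* In translated output a release code is the make code with bit 7 set. */
static unsigned char
translated_break (unsigned char make)
{
  assert (make < 0x80);
  return make | 0x80;
}

/* True if BYTE should be read as an escape prefix rather than as an
   ordinary key release.  SET3_KEYBOARD is a hypothetical flag. */
static bool
is_escape_prefix (unsigned char byte, bool set3_keyboard)
{
  if (set3_keyboard)
    return false;               /* e0 and e1 are the F9/F10 releases. */
  return byte == 0xe0 || byte == 0xe1;
}
```

    With make codes 60 and 61, `translated_break` returns e0 and e1, which is exactly what a driver expecting escaped scancodes would misread as prefixes.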

    The N-nnn type numbers indicate the number nnn of keys the keyboard has. +

    +

    6.1 A Japanese keyboard using e0 as ordinary scancode +

    + +

    Benjamin Carter <bcarter@ultra5.cs.umr.edu> reports: +

    I recently came into possession of a 97-key keyboard with Japanese +markings on the keys. (The keys also have the standard +qwerty-characters on them, with the exception of some of the meta-keys: +there are 3 keys near the Alt keys on either side of the spacebar with +only Japanese characters on them, so I don't know what they are.) +In any case, the keyboard sends out scancodes that work for all the main +keys (backspace, letters and numbers, enter, shift), but the numeric +keypad, Alt keys, and function keys don't work. +I have run the board through showkey -s, so I know what +scancodes this keyboard sends out. +However, the F9 and F10 keys send out 60 and 61, +respectively, so their key release events send out e0 +and e1, confusing the keyboard driver. +

    (Compare this with the +table +giving the translated Set 3 scancodes. The reported codes are +almost identical.) +

    # These are across the top of the keyboard. +

    58 (F1), 59 (F2), 5a (F3), +5b (F4), 5c (F5), 5d (F6), +5e (F7), 5f (F8), 60 (F9), +61 (F10), 62 (F11), 63 (F12) +

    +76 (Break), 77 (Setup). +

    +# top row +

    64 (Esc), +02 (1), 03 (2), 04 (3), +05 (4), 06 (5), 07 (6), +08 (7), 09 (8), 0a (9), +0b (0), 0c (-), 0d (=), +29 (`), 0e (Backspace) +

    +

    # 2nd row +

    0f (Tab), +10 (Q), 11 (W), 12 (E), +13 (R), 14 (T), 15 (Y), +16 (U), 17 (I), 18 (O), +19 (P), 1a ([), 1b (]), +79 (Del), 6e (Line Feed) +

    +

    # 3rd row +

    38 (Ctrl), +1e (A), 1f (S), 20 (D), +21 (F), 22 (G), 23 (H), +24 (J), 25 (K), 26 (L), +27 (;), 28 ('), 75 (\), +1c (Return) +

    +

    # 4th row +

    2a (Shift_L), +2c (Z), 2d (X), 2e (C), +2f (V), 30 (B), 31 (N), +32 (M), 33 (,), 34 (.), +35 (/), +3a ((unknown)), +36 (Shift_R) +

    +

    # bottom row +

    1d (Caps Lock), 71 (Alt_L), +01 ((unknown)), +39 (Space), +45 ((unknown)), +72 (Alt_R), +46 ((unknown)) +

    +

    # numeric keypad. No "grey" section on the keyboard. +

    47 (7), 48 (8), 49 (9), +54 (Keypad -), +4b (4), 4c (5), 4d (6), +37 (Keypad +), +4f (1), 50 (2), 51 (3), +4e (Keypad Enter), +52 (0), +78 (Up), +53 (Keypad .), +56 (Left), +55 (Down), +7d (Right), +7e (Keypad ,). +

    +

    +

    6.2 The NCD N-123NA keyboard +

    + +

    +

    + + +
    +

    There are more keyboards that do not use e0 as escape code. +For example, Paul Schulz <pauls@caemrad.com.au> +reports the same for Sun Type 5 Keyboard with PS/2 connector, +NCD model N-123NA. The scancodes are very similar to those given above: +

    # Sun Keys (far left) +

    44 (Help), +42 (Stop), +40 (Again), +3e (Props), +65 (Undo), +70 (Front), +66 (Copy), +67 (Open), +68 (Paste), +69 (Find), +6a (Cut), +

    # Top row +

    64 (ESC), +58 (F1), +59 (F2), +5a (F3), +5b (F4), +5c (F5), +5d (F6), +5e (F7), +5f (F8), +60 (F9), +61 (F10), +62 (F11), +63 (F12), +

    # 1st row +

    29 (~/`), +02 (!/1), +03 (@/2), +04 (#/3), +05 ($/4), +06 (%/5), +07 (^/6), +08 (&/7), +09 (*/8), +0a ((/9), +0b ()/0), +0c (_/-), +0d (+/=), +0e (BS), +

    # 2nd row +

    0f (TAB), +10 (Q), +11 (W), +12 (E), +13 (R), +14 (T), +15 (Y), +16 (U), +17 (I), +18 (O), +19 (P), +1a ({/[), +1b (}/]), +75 (|/\), +

    # 3rd row +

    29 (CAPS), +30 (A), +31 (S), +32 (D), +33 (F), +34 (G), +35 (H), +36 (J), +37 (K), +38 (L), +39 (:/;), +40 ("/'), +28 (Enter), +

    # 4th row +

    2a (Shift), +2c (Z), +2d (X), +2e (C), +2f (V), +30 (B), +31 (N), +32 (M), +33 (</,), +34 (>/.), +35 (?//), +36 (Shift), +

    # Bottom row +

    38 (Ctrl), +71 (Alt), +66 (Meta), +39 (Space), +6c (Meta), +72 (Compose), +3a (Alt), +

    # To the right +

    6e (PrintScreen/SysRq), +76 (ScrollLock), +77 (Pause/Break), +

    76 (Insert), +7f (Home), +6f (PageUp), +

    79 (Del), +7a (End), +7e (PageDown), +

    80 (.), +81 (.), +82 (.), +

    d4 (.), +78 (Up), +41 (.), +

    56 (Left), +55 (Down), +7d (Right), +

    # Keypad +

    6d (Mute), +73 (Brightness/Vol Down), +74 (Brightness/Vol Up), +53 (Setup), +

    01 (NumLock), +45 (/), +46 (*), +54 (-), +

    47 (7/Home), +48 (8/Up), +4d (9/PgUp), +37 (+), +

    4b (4/Left), +4c (5), +4d (6/Right), +

    4f (1/End), +50 (2/Down), +51 (3/PgDn), +4e (Enter), +

    52 (0/Ins), +53 (./Del). +

    +

    6.3 The NCD N-123UX keyboard +

    + +

    Don Christensen reports that his NCD N-123UX keyboard +returns scancode Set 3. +

    +

    6.4 The NCD N-97 keyboard +

    + +

    David Monro reports: I have a PS/2 keyboard, an NCD N-97, +which shipped with some NCD X terminals and also with some Mips +workstations IIRC. This keyboard returns Set 3 keycodes +even when it's told to be in Set 2. In particular, the release +codes for F9 and F10 are e0 and e1. +The +keyboard ID is ab 85. +

    +

    +

    6.5 NCD X terminals +

    + +

    NCD keyboards are often used with NCD X terminals. +Here are the key combinations to get into the boot monitor. +

    +

    +

    +N-101 LCtrl-LAlt-Setup
    +N-102 or Windows compatible LAlt-CapsLock-Setup
    +VT220-compatible Ctrl-Compose-F3
    +N-108LK Ctrl-LAlt-F3
    +N-97 LAlt-CapsLock-Setup
    +N-97 Kana and Hitachi Kana LAlt-CapsLock-Setup
    +N-107 Sun type 4 compatible Stop-A (L1-A)
    +N-123 Sun type 5 compatible Stop-A (L1-A)
    +Nokia 122
    +3270 (122-key Lexmark) LShift LAlt Setup
    + (on the left keypad)
    + +
    +

    +

    +

    +

    +


    +Next +Previous +Contents + + diff --git a/specs/kbd/scancodes-7.html b/specs/kbd/scancodes-7.html new file mode 100644 index 0000000..0191d4a --- /dev/null +++ b/specs/kbd/scancodes-7.html @@ -0,0 +1,164 @@ + + + + + + Keyboard scancodes: Japanese keyboards + + + + + +Next +Previous +Contents +
    +

    7. Japanese keyboards

    + +

    +

    +

    +

    + + +
    + +
    + + +
    +

    +

    7.1 Japanese 86/106 keyboards +

    + +

    (Information from Barry Yip <g609296@cc.win.or.jp>, +Norman Diamond, NIIBE Yutaka and H. Peter Anvin, who +contributed the photographs of his +JP106 keyboard above and of his +Japanese laptop.) +

    Common Japanese keyboards have five additional keys +(106-key, or 86-key for a notebook; these days there may also +be 3 extra Windows keys). These keys have scancodes +70 (hiragana/katakana), +73 (backslash/underscore), +79 (henkan/zenkouho), +7b (muhenkan), +7d (yen/vertical bar). +

    + +Different keycaps: +

    +

    +USB  Scancode  Japanese           US        USB  Scancode  Japanese             US
    +53   29        (hankaku/zenkaku)  (` / ~)   47   1a        (@ / `)              ([ / {)
    +31   03        (2 / ")            (2 / @)   48   1b        ([ / {)              (] / })
    +35   07        (6 / &)            (6 / ^)   51   27        (; / +)              (; / :)
    +36   08        (7 / ')            (7 / &)   52   28        (: / *)              (' / ")
    +37   09        (8 / ()            (8 / *)   29   2b        (] / })              (backslash / |)
    +38   0a        (9 / ))            (9 / ()   135  73        (backslash / _)
    +39   0b        (0 / ~)            (0 / ))   139  7b        (muhenkan)
    +45   0c        (- / =)            (- / _)   138  79        (henkan/zenkouho)
    +46   0d        (^ / overbar)      (= / +)   136  70        (hiragana/katakana)
    +                                            137  7d        (\ / |)
    + +
    +

    ASCII and JIS-Roman differ in two or three points: the code positions +where ASCII has backslash, tilde, broken bar, +JIS-Roman uses yen, overbar and vertical bar, respectively. +

    Some keyboards have the tilde printed on the keycap for the 0 key, some don't. +Similarly, some keyboards have the backslash printed on the keycap for the _ key +and some don't, but in all cases you need Shift to get _. +

    +

    7.2 Description of the all-Japanese keys +

    + +

    Norman Diamond adds to the previous section: +

    To the left of the spacebar, +無変換 +(muhenkan) means no conversion +from kana to kanji. To the right of the spacebar, +変換 +(henkan) means conversion from kana to kanji. In Microsoft systems +it converts the most recently input sequence of kana to the system's +first guess at a string of kanji/kana/etc. with the correct pronunciation +and a guess at the meaning. Repeated keypresses change it to other +possible guesses which are either less common or less recently used, +depending on the situation. The shifted version of this key is +前候補 +(zenkouho) which means "previous candidate" -- "zen" means "previous", +while "kouho" means "candidate" (explanation courtesy of NIIBE Yutaka) +-- it rotates back to earlier guesses for kanji conversion. +The alt version of this key is +全候補 +also pronounced (zenkouho), which means "all candidates" -- here, +"zen" means "all" -- it displays a menu of all known guesses. +I never use the latter two functions of the key, because after +pushing the henkan key about three times and not getting the desired guess, +it displays a menu of all known guesses anyway. +

    Next on the right, +ひらがな +(hiragana) means that +phonetic input uses one conventional Japanese phonetic alphabet, +which of course can be converted to kanji by pressing the henkan key later. +The shifted version is +カタカナ +(katakana) which means the other Japanese phonetic alphabet, +and the alt version is +ローマ字 +(ro-maji) which means the Roman alphabet. +

    Near the upper left, +半/全 +(han/zen) means switch between hankaku +(half-size, the same size as an ASCII character) and zenkaku +(full-size, since the amount of space occupied by a kanji +is approximately a square, twice as fat as an ASCII character). +It only affects katakana and a few other characters +(for example there's a full-width copy of each ASCII character +in addition to the single-byte half-width encodings). +The alt version of this is +漢字 +(kanji) which +actually causes typed Roman phonetic keys to be displayed as Japanese +phonetic kana (either hiragana or katakana depending on one of the other +keys described above) and doesn't cause conversion to kanji. +

    +

    7.3 A Japanese keyboard that imitates a US one +

    + +

    John Bradford reports that he has a Japanese keyboard +(an IBM 5576 KEYBOARD-2, part number 94X1110) that by default +simulates US key layout. Thus, pressing the @ key yields scancodes +2a 03 (fake shift followed by digit 2), +pressing Shift - yields scancodes b6 0d +(fake shift down, =) with release 8d 36, etc. +

    Thus, the (translated Set 2) scancodes can be read off the +table with differences between the +Japanese and the US layout. +

    In this state the non-ASCII keys (Yen and overline) yield an error +(ff). The Japanese keys hankaku, kanji/katakana, muhenkan, +zenkouho/henkan, hiragana, zenmen ki, yield the codes expected from +keys in that position on a US keyboard: 29 (`/~), +38 (LAlt), 39 (space), 39 (space), +39 (space), e0 38 (RAlt), respectively. +

    Switching the keyboard to Set 3 enables the Japanese keys. +In untranslated Set 3 these give codes: hankaku 0e, +Yen 13, overline (shift ^), kanji/katakana 19, +muhenkan 85, zenkouho/henkan 86, +hiragana 87, zenmen ki 39. +(Also: backslash/underscore 5c, bracketright/braceright 53.) +

    This is the only keyboard I know that gives more information in Set 3 +than in Set 2. It reports +keyboard ID +ab 90. +

    +

    + + +
    +

    +


    +Next +Previous +Contents + + diff --git a/specs/kbd/scancodes-8.html b/specs/kbd/scancodes-8.html new file mode 100644 index 0000000..5a5bbc9 --- /dev/null +++ b/specs/kbd/scancodes-8.html @@ -0,0 +1,75 @@ + + + + + Keyboard scancodes: Korean keyboards + + + + + +Next +Previous +Contents +
    +

    8. Korean keyboards

    + +

    The Korean keyboard has two keys, the Korean/Chinese +and the Korean/English toggles, that generate scancodes +f1 and f2 (respectively) when pressed, +and nothing when released. They do not repeat. +The keycaps are "hancha" and "han/yong" (written in Hangul). +Hancha (hanja) means Chinese character, and Han/Yong is short for +Hangul/Yongcha (Korean/English). +They are located left and right of the space bar. +

    +

    +

    8.1 An A4tech keyboard +

    + +

    Dave Willis reports on his A4tech keyboard: +

    Apart from the Korean Hancha and Han/Yong keys, there are on the top row: +

    e0 5f (Moon), +e0 6c (Mail), +e0 6b (Computer), +e0 21 (Calculator), +e0 6d (Notes), +e0 10 (Previous), +e0 19 (Next), +e0 2e (Minus), +e0 20 (Mute), +e0 30 (Plus), +e0 22 (Play/Pause), +e0 24 (Stop), +e0 65 (Magnifier), +e0 32 (Home), +e0 66 (Folder), +e0 67 (recycle-style arrows), +e0 68 (x). +

    Below mute: +e0 62 (Office). +

    On the right hand side: +e0 6a (arrow up left), +e0 69 (arrow down right), +e0 0b (wheel up), +e0 2c (wheel down), +e0 64 (wheel in). +

    Wheel up and wheel down have no release code. Only the plus and minus keys +repeat when held down. +

    +

    8.2 The DEC LK201-K +

    + +

    +

    + + +
    +

    +

    +


    +Next +Previous +Contents + + diff --git a/specs/kbd/scancodes-9.html b/specs/kbd/scancodes-9.html new file mode 100644 index 0000000..44451c6 --- /dev/null +++ b/specs/kbd/scancodes-9.html @@ -0,0 +1,403 @@ + + + + + Keyboard scancodes: Keyboard-internal scancodes + + + + + +Next +Previous +Contents +
    +

    9. Keyboard-internal scancodes

    + +

    +

    9.1 Three scancode sets +

    + +

    The usual PC keyboards are capable of producing three sets of scancodes. +Writing 0xf0 followed by 1, 2 or 3 to port 0x60 will put the keyboard +in scancode mode 1, 2 or 3. Writing 0xf0 followed by 0 queries the mode, +resulting in a scancode byte 43, 41 or 3f +from the keyboard. +
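    The command and reply values in this paragraph can be captured in two small helpers. This is only a sketch with invented function names; it builds and decodes the bytes but performs no port I/O. Note that 43, 41 and 3f are what the translation table in section 9.3 gives for 01, 02 and 03, so the reply is presumably the set number after translation.

```c
#include <assert.h>

/* Fills CMD with the two bytes that select scancode set SET (1, 2 or 3),
   or that query the current set when SET is 0. */
static void
scancode_set_command (int set, unsigned char cmd[2])
{
  assert (set >= 0 && set <= 3);
  cmd[0] = 0xf0;
  cmd[1] = (unsigned char) set;
}

/* Maps the byte returned after "f0 00" to the scancode set number,
   or 0 if it is not one of the three values given in the text. */
static int
scancode_set_from_reply (unsigned char reply)
{
  switch (reply)
    {
    case 0x43: return 1;
    case 0x41: return 2;
    case 0x3f: return 3;
    default:   return 0;
    }
}
```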

    Set 1 contains the values that the XT keyboard (with only one set +of scancodes) produced, with extensions for new keys. Someone +decided that another numbering was more logical and invented +scancode Set 2. However, it was realized that new scancodes +would break old programs, so the keyboard output was fed to an +8042 microprocessor on the motherboard that could translate Set 2 +back into Set 1. Indeed a smart construction. This is the default today. +Finally there is the PS/2 version, Set 3, more regular, but used by +almost nobody. +

    (I wrote this long ago. Nowadays Linux 2.5 may try to use Set 3. +Also certain HP machines, like the PS/2 version of the HP9000 +workstation, have used Set 3.) +

    Sets 2 and 3 are designed to be translated by the 8042. +Set 1 should not be translated. +

    Not all keyboards support all scancode sets. For example, my MyCom +laptop only supports scancode Set 2, and its keyboard does not react +at all when in mode 1 or 3. +

    +

    9.2 Make and Break codes +

    + +

    The key press / key release is coded as follows: +

    For Set 1, if the make code of a key is c, the break code +will be c+0x80. If the make code is e0 c, +the break code will be e0 c+0x80. +The Pause key has make code e1 1d 45 +e1 9d c5 and does not generate a break code. +

    For Set 2, if the make code of a key is c, the break code +will be f0 c. If the make code is e0 c, +the break code will be e0 f0 c. +The Pause key has the 8-byte make code e1 14 77 +e1 f0 14 f0 77. +

    For Set 3, by default most keys do not generate a break code - only CapsLock, +LShift, RShift, LCtrl and LAlt do. However, by default all non-traditional +keys do generate a break code - thus, LWin, RWin, Menu do, and for example +on the Microsoft Internet keyboard, so do Back, Forward, Stop, +Mail, Search, Favorites, Web/Home, MyComputer, Calculator, Sleep. +On my BTC keyboard, also the Macro key does. +

    In Scancode Mode 3 it is possible to enable or disable key repeat +and the production of break codes either on a key-by-key basis +or for all keys at once. +And just like for Set 2, key release is indicated by a f0 prefix +in those cases where it is indicated. +There is nothing special with the Pause key in scancode mode 3. +
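    The Set 1 and Set 2 rules above translate into very little code. The sketch below uses invented function names and covers only plain codes (c) and e0-escaped codes (e0 c); the Pause key and the per-key Set 3 settings are deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Writes the Set 1 break sequence for the make sequence MAKE[0..LEN)
   into OUT (at least 2 bytes) and returns its length. */
static size_t
set1_break (const unsigned char *make, size_t len, unsigned char *out)
{
  size_t n = 0;

  assert (len == 1 || (len == 2 && make[0] == 0xe0));
  if (len == 2)
    out[n++] = 0xe0;
  out[n++] = make[len - 1] | 0x80;     /* c + 0x80 */
  return n;
}

/* Same for Set 2; OUT needs room for at least 3 bytes. */
static size_t
set2_break (const unsigned char *make, size_t len, unsigned char *out)
{
  size_t n = 0;

  assert (len == 1 || (len == 2 && make[0] == 0xe0));
  if (len == 2)
    out[n++] = 0xe0;
  out[n++] = 0xf0;                     /* Release prefix. */
  out[n++] = make[len - 1];
  return n;
}
```

    For the Message key of section 5.25 (make code e0 1e in Set 1), `set1_break` gives e0 9e, the value quoted there.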

    +

    9.3 Translation +

    + +

    The 8042 microprocessor translates the incoming byte stream produced +by the keyboard, and turns an f0 prefix into an OR with +80 for the next byte. + +(Some implementations do this for the next byte that does not have +this bit set already. A consequence is that in Set 3 the keys with Set-3 +value 0x80 or more are broken in a peculiar way: hitting such a key and +then some other key turns the make code for this last key into a break code. +For example the Sleep key on a Microsoft Internet keyboard generates +54 / d4 for press/release. But pressing and +releasing first Menu and then Sleep produces +8d 8d d4 d4 as translation of +8d f0 8d 54 f0 54. +Other implementations are OK.) +
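    The difference between the two behaviours is easiest to see in code. The following sketch models only the f0 handling and applies it to the byte values exactly as quoted above; the table lookup is skipped on purpose, and the function name and the `deferred` flag are invented for this illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Applies the f0 handling to IN[0..LEN) and writes the result to OUT,
   returning the output length.  With DEFERRED set, a pending f0 is
   kept until a byte without bit 7 arrives (the peculiar variant);
   otherwise it is applied to the very next byte. */
static size_t
apply_f0 (const unsigned char *in, size_t len, bool deferred,
          unsigned char *out)
{
  bool pending = false;
  size_t n = 0;
  size_t i;

  assert (in != NULL && out != NULL);
  for (i = 0; i < len; i++)
    {
      unsigned char b = in[i];

      if (b == 0xf0)
        {
          pending = true;       /* Consumed, not forwarded. */
          continue;
        }
      if (pending && !(deferred && (b & 0x80)))
        {
          b |= 0x80;
          pending = false;
        }
      out[n++] = b;
    }
  return n;
}
```

    On the input 8d f0 8d 54 f0 54 the deferred variant produces 8d 8d d4 d4, as reported, while the other variant produces 8d 8d 54 d4.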

    + +Unless told not to translate, the keyboard controller translates +keyboard scancodes into the scancodes it returns to the CPU +using the following table (in hex): +

    +

    +

    + 00 01 02 03 04 05 06 07 08 09 0a 0b 0c 0d 0e 0f
    +
    +00 ff 43 41 3f 3d 3b 3c 58 64 44 42 40 3e 0f 29 59
    +10 65 38 2a 70 1d 10 02 5a 66 71 2c 1f 1e 11 03 5b
    +20 67 2e 2d 20 12 05 04 5c 68 39 2f 21 14 13 06 5d
    +30 69 31 30 23 22 15 07 5e 6a 72 32 24 16 08 09 5f
    +40 6b 33 25 17 18 0b 0a 60 6c 34 35 26 27 19 0c 61
    +50 6d 73 28 74 1a 0d 62 6e 3a 36 1c 1b 75 2b 63 76
    +60 55 56 77 78 79 7a 0e 7b 7c 4f 7d 4b 47 7e 7f 6f
    +70 52 53 50 4c 4d 48 01 45 57 4e 51 4a 37 49 46 54
    +80 80? 81 82 41 54 85 86 87 88 89 8a 8b 8c 8d 8e 8f
    +90 90 91 92 93 94 95 96 97 98 99 9a 9b 9c 9d 9e 9f
    +a0 a0 a1 a2 a3 a4 a5 a6 a7 a8 a9 aa ab ac ad ae af
    +b0 b0 b1 b2 b3 b4 b5 b6 b7 b8 b9 ba bb bc bd be bf
    +c0 c0 c1 c2 c3 c4 c5 c6 c7 c8 c9 ca cb cc cd ce cf
    +d0 d0 d1 d2 d3 d4 d5? d6 d7 d8 d9? da? db dc dd de df
    +e0 e0 e1 e2 e3 e4 e5 e6 e7 e8 e9 ea eb ec ed ee ef?
    +f0 - f1? f2? f3? f4? f5? f6? f7? f8? f9? fa? fb? fc? fd? fe? ff
    + +
    +

    A reference for the first half of this table is the book by Gary J Konzak +PC 8042 Controller, ISBN 0-929392-21-3. +(Report by vojtech@suse.cz.) +

    A way to check this table is: (i) put the keyboard in untranslated modes +1, 2, 3 and look at the +resulting values, +and (ii) put the keyboard in translated scancode modes 1, 2, 3. Now compare +the values. The entries with question marks were not checked in this way. +

    Note that the range 01-7f of this table is 1-1. +In the second half of the table, translated and untranslated values +are equal in all known cases, with the two exceptions 83 and 84. +

    One asks the controller to transmit untranslated scancodes by writing +a keyboard controller command byte with bit 5 set and bit 6 cleared. +E.g., use the command byte 45 to get translated codes, +and 24 to get untranslated codes that do not cause interrupts. +
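The translation described in this section can be modelled in a few lines of C. The sketch below is an illustration only: the names `low_half`, `translate_byte`, `translate_stream` and the `quirky` flag are hypothetical. The table values are copied from the first half of the table above, and the second half is treated as the identity except for 83 and 84. With `quirky` set, the f0 prefix is held until the next byte that does not already have bit 7 set, which is the behaviour attributed above to some implementations.

```c
#include <assert.h>
#include <stddef.h>

/* Rows 00-70 of the 8042 translation table (untranslated -> translated). */
static const unsigned char low_half[128] = {
    0xff,0x43,0x41,0x3f,0x3d,0x3b,0x3c,0x58,0x64,0x44,0x42,0x40,0x3e,0x0f,0x29,0x59,
    0x65,0x38,0x2a,0x70,0x1d,0x10,0x02,0x5a,0x66,0x71,0x2c,0x1f,0x1e,0x11,0x03,0x5b,
    0x67,0x2e,0x2d,0x20,0x12,0x05,0x04,0x5c,0x68,0x39,0x2f,0x21,0x14,0x13,0x06,0x5d,
    0x69,0x31,0x30,0x23,0x22,0x15,0x07,0x5e,0x6a,0x72,0x32,0x24,0x16,0x08,0x09,0x5f,
    0x6b,0x33,0x25,0x17,0x18,0x0b,0x0a,0x60,0x6c,0x34,0x35,0x26,0x27,0x19,0x0c,0x61,
    0x6d,0x73,0x28,0x74,0x1a,0x0d,0x62,0x6e,0x3a,0x36,0x1c,0x1b,0x75,0x2b,0x63,0x76,
    0x55,0x56,0x77,0x78,0x79,0x7a,0x0e,0x7b,0x7c,0x4f,0x7d,0x4b,0x47,0x7e,0x7f,0x6f,
    0x52,0x53,0x50,0x4c,0x4d,0x48,0x01,0x45,0x57,0x4e,0x51,0x4a,0x37,0x49,0x46,0x54
};

/* Translate a single byte that is not the f0 prefix. */
unsigned char translate_byte(unsigned char c)
{
    if (c < 0x80)
        return low_half[c];
    if (c == 0x83)
        return 0x41;
    if (c == 0x84)
        return 0x54;
    return c;                          /* rest of the second half: unchanged */
}

/* Translate a byte stream; f0 becomes an OR with 0x80 on a later byte.
   out must have room for n bytes. Returns the number of bytes written. */
size_t translate_stream(const unsigned char *in, size_t n,
                        unsigned char *out, int quirky)
{
    size_t produced = 0;
    int pending = 0;                   /* an f0 prefix is waiting to be applied */

    assert(in != NULL && out != NULL);
    for (size_t i = 0; i < n; i++) {
        unsigned char t;

        if (in[i] == 0xf0) {
            pending = 1;
            continue;
        }
        t = translate_byte(in[i]);
        /* Well-behaved: apply the prefix to the very next byte.
           Quirky: skip bytes that already have bit 7 set. */
        if (pending && (!quirky || !(t & 0x80))) {
            t |= 0x80;
            pending = 0;
        }
        out[produced++] = t;
    }
    return produced;
}
```

Feeding the untranslated Set 3 stream `8d f0 8d 7f f0 7f` (Menu, then Sleep) through this model gives `8d 8d 54 d4` in the well-behaved case and `8d 8d d4 d4` with `quirky` set, matching the broken output described above.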

    +

    Effects of translation

    + +

    +

    Origin of strange scan code set values

    + +

    The keyboard command f0 with argument 1, 2 or 3 +sets the current scancode set, and this same command +with argument 0 asks for the current scancode set. +The reply is 43, 41 or 3f +for sets 1, 2, 3. Why? Because in reality the reply is 1, 2 or 3, +and that is what one sees when translation is off. But translation +turns these into 43, 41, 3f. +

    +

    Keyboard IDs

    + +

    Keyboards do report an ID as a reply to the command + +f2. +(An XT keyboard does not reply, an AT keyboard only replies with an ACK.) +An MF2 AT keyboard reports ID ab 83. +Translation turns this into ab 41. +

    Many short keyboards, like IBM ThinkPads, and Spacesaver keyboards, +send ab 84 untranslated, +which becomes ab 54 translated. +(The netbsd source has a misunderstanding here, and seems to associate +the 54 and 84 to the ThinkPad model - cf. the defines +KEYB_R_MF2ID2TP75X, KEYB_R_MF2ID2TP76X.) +

    Several 122-key keyboards are reported to send ab 86. +Here translated and untranslated values coincide. +(Reports mention "122-Key Enhanced Keyboard", "standard 122-key keyboard", +"122 Key Mainframe Interactive (MFI) Keyboard".) +

    David Monro reports ab 85 for a +NCD N-97 keyboard. +Tim Clarke reports ab 85 for the +"122-Key Host Connect(ed) Keyboard". +

    He also reports: +"Also, when playing with my KVM problems Belkin gave me a +105-key Windows keyboard which Id.s itself as 18ABh." +

    Linux 2.5.25 kernel source has 0xaca1 for a +"NCD Sun layout keyboard". It also mentions 0xab02 and 0xab7f, +but these arise as (mistaken) back translations from +ab 41 and ab 54. +

    Ralph Brown's Interrupt list mentions "old Japanese 'G', 'P', 'A' keyboards", +with keyboard IDs ab 90, ab 91, +ab 92. Here translated and untranslated versions +coincide. ID ab 90 was also mentioned +above. +

    +

    +

    9.4 Correspondence +

    + +

    For the traditional keys the correspondence is fairly clear: +above we saw the +translation table, +and Set 1 equals translated Set 2, and Set 3 equals Set 2 in most cases +where Set 2 has a single (non-escaped) scancode, +and in any case the correspondence is constant (and given +below). +

    On the other hand, modern keyboards have all kinds of multimedia +and other additional keys, and what happens for them is completely +random, and varies from keyboard to keyboard. +

    Let us look at an example. +

    The +Microsoft Internet keyboard +has keys Search, Favorites, Stop, Forward, Back, My Computer, +Mail, Web / Home, Calculator with translated Set 3 scancodes +65, 66, 68, 69, 6a, +6b, 6c, 97, 99, respectively, +and translated Set 2 scancodes e0 xx, with +xx = 65, 66, 68, 69, +6a, 6b, 6c, 32, 21. +

    On the other hand, the +IBM Rapid Access II keyboard +has keys CD stop, CD play, Volume D, Volume U, CD back, CD fwd +with translated Set 3 scancodes +69, 6a, 6b, 6c, 6d, 44, +and translated Set 2 scancodes e0 xx, with +xx = 20, 22, 21, 23, +24, 12. +

    Thus, different keyboards have different mappings between Set 2 +and Set 3 codes. +

    +

    9.5 Use +

    + +

    Can these other scancode sets be used? Probably not. +

    () Translated scancode Set 1 has weird codes that nobody wants to use. +

    (i) My MyCom laptop does not support scancode sets 1 and 3 at all. +

    (ii) Some laptops have special key combinations that bring one +into a setup or configuration utility. It is impossible to do +anything useful, or to get out of it again, when the scancode mode +is not translated Set 2. +

    (iii) Many keyboards have bugs in scancode sets 1 and/or 3 but +are fine in scancode Set 2. +(Vojtech Pavlik reports that his BTC keyboard has the same codes +for the '1' and '2' keys in Set 3, both having the code for '1'.) +On my BTC keyboard the key up values for Esc and 1 are both ff +in scancode Set 1. My Safeway keyboard has untranslated Set 1 equal +to translated Set 2, except for the multimedia keys, where +untranslated Set 1 equals untranslated Set 2. +

    (iv) A big advantage of Set 3 is that each key generates a unique code +so that one does not need to parse sequences. However, the BTC keyboard +mentioned +above generates e0 6f +for its Macro key also in scancode mode 3. The Safeway keyboard mentioned +above does not generate any codes +for its multimedia keys in scancode mode 3. +

    (v) Some keyboard controllers cannot handle Set 3 values that are +larger than 0x7f, and give +peculiar results +for e.g. the Windows keys in translated scancode mode 3. +The result is that the following key is "eaten": the key down action +turns into a key up. +

    (vi) The USB legacy support only supports translated Set 2. +

    (vii) The +Microsoft Keyboard Scan Code Specification writes: +"In the very early days of Windows NT, an attempt was made +to use the much more orthogonal Scan Code Set 3, but due to bugs +in the implementation of this Scan Code Set on numerous OEM +keyboards, the idea was abandoned." +And also: "Scan Code Set 3 is not used or required for operation +of Microsoft operating systems." +

    (viii) Others also tried Set 3. The PS/2 version of the HP9000 +workstation uses it. This is fine with HP's keyboards but causes +some problems with foreign keyboards. +

    (ix) It is said that Hal Snyder of Mark Williams, Co remarked: +"We find that about 10% of cheap no-name keyboards do not work +in scan code set 3". +

    (x) These days Linux probes the keyboard, and may try to enable Set 3. +This is good for learning a lot about strange keyboards. +It is bad for having a stable system that just works. +

    +

    9.6 A table +

    + +

    (USB codes in decimal, scancodes in hex.) +

    +

    +

    +# USB Set 1 X(Set 1) Set 2 X(Set 2) Set 3 X(Set 3) keycap
    +1 53 29 39 0e 29 0e 29 ` ~
    +2 30 02 41 16 02 16 02 1 !
    +3 31 03 3f 1e 03 1e 03 2 @
    +4 32 04 3d 26 04 26 04 3 #
    +5 33 05 3b 25 05 25 05 4 $
    +6 34 06 3c 2e 06 2e 06 5 % E
    +7 35 07 58 36 07 36 07 6 ^
    +8 36 08 64 3d 08 3d 08 7 &
    +9 37 09 44 3e 09 3e 09 8 *
    +10 38 0a 42 46 0a 46 0a 9 (
    +11 39 0b 40 45 0b 45 0b 0 )
    +12 45 0c 3e 4e 0c 4e 0c - _
    +13 46 0d 0f 55 0d 55 0d = +
    +15 42 0e 29 66 0e 66 0e Backspace
    +16 43 0f 59 0d 0f 0d 0f Tab
    +17 20 10 65 15 10 15 10 Q
    +18 26 11 38 1d 11 1d 11 W
    +19 8 12 2a 24 12 24 12 E
    +20 21 13 70 2d 13 2d 13 R
    +21 23 14 1d 2c 14 2c 14 T
    +22 28 15 10 35 15 35 15 Y
    +23 24 16 02 3c 16 3c 16 U
    +24 12 17 5a 43 17 43 17 I
    +25 18 18 66 44 18 44 18 O
    +26 19 19 71 4d 19 4d 19 P
    +27 47 1a 2c 54 1a 54 1a [ {
    +28 48 1b 1f 5b 1b 5b 1b ] }
    +29 49 2b 21 5d 2b 5c 75 \ |
    +30 57 3a 32 58 3a 14 1d CapsLock
    +31 4 1e 03 1c 1e 1c 1e A
    +32 22 1f 5b 1b 1f 1b 1f S
    +33 7 20 67 23 20 23 20 D
    +34 9 21 2e 2b 21 2b 21 F
    +35 10 22 2d 34 22 34 22 G
    +36 11 23 20 33 23 33 23 H
    +37 13 24 12 3b 24 3b 24 J
    +38 14 25 05 42 25 42 25 K
    +39 15 26 04 4b 26 4b 26 L
    +40 51 27 5c 4c 27 4c 27 ; :
    +41 52 28 68 52 28 52 28 ' "
    +42 50 00 ff 00 ff 00 ff non-US-1
    +43 40 1c 1e 5a 1c 5a 1c Enter
    +44 225 2a 2f 12 2a 12 2a LShift
    +46 29 2c 14 1a 2c 1a 2c Z
    +47 27 2d 13 22 2d 22 2d X
    +48 6 2e 06 21 2e 21 2e C
    +49 25 2f 5d 2a 2f 2a 2f V
    +50 5 30 69 32 30 32 30 B
    +51 17 31 31 31 31 31 31 N
    +52 16 32 30 3a 32 3a 32 M
    +53 54 33 23 41 33 41 33 , <
    +54 55 34 22 49 34 49 34 . >
    +55 56 35 15 4a 35 4a 35 / ?
    +57 229 36 07 59 36 59 36 RShift
    +58 224 1d 11 14 1d 11 38 LCtrl
    +60 226 38 6a 11 38 19 71 LAlt
    +61 44 39 72 29 39 29 39 space
    +62 230 e0-38 e0-6a e0-11 e0-38 39 72 RAlt
    +64 228 e0-1d e0-11 e0-14 e0-1d 58 3a RCtrl
    +75 73 e0-52 e0-28 e0-70 e0-52 67 7b Insert
    +76 76 e0-53 e0-74 e0-71 e0-53 64 79 Delete
    +80 74 e0-47 e0-60 e0-6c e0-47 6e 7f Home
    +81 77 e0-4f e0-61 e0-69 e0-4f 65 7a End
    +85 75 e0-49 e0-34 e0-7d e0-49 6f 6f PgUp
    +86 78 e0-51 e0-73 e0-7a e0-51 6d 7e PgDn
    +79 80 e0-4b e0-26 e0-6b e0-4b 61 56 Left
    +83 82 e0-48 e0-6c e0-75 e0-48 63 78 Up
    +84 81 e0-50 e0-6d e0-72 e0-50 60 55 Down
    +89 79 e0-4d e0-19 e0-74 e0-4d 6a 7d Right
    +90 83 45 0b 77 45 76 01 NumLock
    +91 95 47 60 6c 47 6c 47 KP-7 / Home
    +92 92 4b 26 6b 4b 6b 4b KP-4 / Left
    +93 89 4f 61 69 4f 69 4f KP-1 / End
    +95 84 e0-35 e0-15 e0-4a e0-35 77 45 KP-/
    +96 96 48 6c 75 48 75 48 KP-8 / Up
    +97 93 4c 27 73 4c 73 4c KP-5
    +98 90 50 6d 72 50 72 50 KP-2 / Down
    +99 98 52 28 70 52 70 52 KP-0 / Ins
    +100 85 37 5e 7c 37 7e 46 KP-*
    +101 97 49 34 7d 49 7d 49 KP-9 / PgUp
    +102 94 4d 19 74 4d 74 4d KP-6 / Right
    +103 91 51 73 7a 51 7a 51 KP-3 / PgDn
    +104 99 53 74 71 53 71 53 KP-. / Del
    +105 86 4a 35 7b 4a 84 54 KP--
    +106 87 4e 0c 79 4e 7c 37 KP-+
    +108 88 e0-1c e0-1e e0-5a e0-1c 79 4e KP-Enter
    +110 41 01 43 76 01 08 64 Esc
    +112 58 3b 24 05 3b 07 58 F1
    +113 59 3c 16 06 3c 0f 59 F2
    +114 60 3d 08 04 3d 17 5a F3
    +115 61 3e 09 0c 3e 1f 5b F4
    +116 62 3f 5f 03 3f 27 5c F5
    +117 63 40 6b 0b 40 2f 5d F6
    +118 64 41 33 83 41 37 5e F7
    +119 65 42 25 0a 42 3f 5f F8
    +120 66 43 17 01 43 47 60 F9
    +121 67 44 18 09 44 4f 61 F10
    +122 68 57 6e 78 57 56 62 F11
    +123 69 58 3a 07 58 5e 63 F12
    +124 70 e0-37 e0-5e e0-7c e0-37 57 6e PrtScr
    +0 154 54 1a 84 54 57 6e Alt+SysRq
    +125 71 46 0a 7e 46 5f 76 ScrollLock
    +126 72 e1-1d-45 e1-11-0b e1-14-77 e1-1d-45 62 77 Pause
    +0 0 e0-46 e0-0a e0-7e e0-46 62 77 Ctrl+Break
    +0 227 e0-5b e0-1b e0-1f e0-5b 8b 8b LWin (USB: LGUI)
    +0 231 e0-5c e0-75 e0-27 e0-5c 8c 8c RWin (USB: RGUI)
    +0 0 e0-5d e0-2b e0-2f e0-5d 8d 8d Menu
    +0 0 e0-5f e0-76 e0-3f e0-5f 7f 54 Sleep
    +0 0 e0-5e e0-63 e0-37 e0-5e 00 ff Power
    +0 0 e0-63 e0-78 e0-5e e0-63 00 ff Wake
    + +
    +

    +

    +

    9.7 Vendor extensions +

    + +

    + +Logitech uses an e2 prefix for the codes sent by a +pointing device integrated on the keyboard. +

    +

    +

    +

    +


    +Next +Previous +Contents + + diff --git a/specs/kbd/scancodes.html b/specs/kbd/scancodes.html new file mode 100644 index 0000000..f25761f --- /dev/null +++ b/specs/kbd/scancodes.html @@ -0,0 +1,175 @@ + + + + + Keyboard scancodes + + + + + +Next +Previous +Contents +
    +

    Keyboard scancodes

    + +

    Andries Brouwer, aeb@cwi.nl

    v1.2e, 2004-05-20 +


    +This note contains some information about PC keyboard scancodes. +
    +

    +

    1. Keyboard scancodes

    + + +

    +

    2. Special keyboards - XT keyboards

    + + +

    +

    3. Special keyboards - Amstrad/Schneider keyboards

    + + +

    +

    4. Special keyboards - AT keyboards

    + +

    +

    5. Special keyboards - MF II keyboards

    + + +

    +

    6. NCD keyboards

    + + +

    +

    7. Japanese keyboards

    + + +

    +

    8. Korean keyboards

    + + +

    +

    9. Keyboard-internal scancodes

    + + +

    +

    10. The AT keyboard controller

    + + +

    +

    11. Keyboard commands

    + + +

    +

    12. The PS/2 Mouse

    + + +

    +

    13. USB

    + +

    +

    14. Reporting

    + +
    +Next +Previous +Contents + + diff --git a/specs/kbd/sk2500.jpg b/specs/kbd/sk2500.jpg new file mode 100644 index 0000000..b74122a Binary files /dev/null and b/specs/kbd/sk2500.jpg differ diff --git a/specs/kbd/table.h b/specs/kbd/table.h new file mode 100644 index 0000000..5f7c56f --- /dev/null +++ b/specs/kbd/table.h @@ -0,0 +1,170 @@ +/* Scancode stuff - aeb, 991216 */ + +/* translation from keyboard to scancode - the 8042 table */ + +unsigned char ttable[256] = { +0xff,0x43,0x41,0x3f,0x3d,0x3b,0x3c,0x58,0x64,0x44,0x42,0x40,0x3e,0x0f,0x29,0x59, +0x65,0x38,0x2a,0x70,0x1d,0x10,0x02,0x5a,0x66,0x71,0x2c,0x1f,0x1e,0x11,0x03,0x5b, +0x67,0x2e,0x2d,0x20,0x12,0x05,0x04,0x5c,0x68,0x39,0x2f,0x21,0x14,0x13,0x06,0x5d, +0x69,0x31,0x30,0x23,0x22,0x15,0x07,0x5e,0x6a,0x72,0x32,0x24,0x16,0x08,0x09,0x5f, +0x6b,0x33,0x25,0x17,0x18,0x0b,0x0a,0x60,0x6c,0x34,0x35,0x26,0x27,0x19,0x0c,0x61, +0x6d,0x73,0x28,0x74,0x1a,0x0d,0x62,0x6e,0x3a,0x36,0x1c,0x1b,0x75,0x2b,0x63,0x76, +0x55,0x56,0x77,0x78,0x79,0x7a,0x0e,0x7b,0x7c,0x4f,0x7d,0x4b,0x47,0x7e,0x7f,0x6f, +0x52,0x53,0x50,0x4c,0x4d,0x48,0x01,0x45,0x57,0x4e,0x51,0x4a,0x37,0x49,0x46,0x54, +0x80,0x81,0x82,0x41,0x54,0x85,0x86,0x87,0x88,0x89,0x8a,0x8b,0x8c,0x8d,0x8e,0x8f, +0x90,0x91,0x92,0x93,0x94,0x95,0x96,0x97,0x98,0x99,0x9a,0x9b,0x9c,0x9d,0x9e,0x9f, +0xa0,0xa1,0xa2,0xa3,0xa4,0xa5,0xa6,0xa7,0xa8,0xa9,0xaa,0xab,0xac,0xad,0xae,0xaf, +0xb0,0xb1,0xb2,0xb3,0xb4,0xb5,0xb6,0xb7,0xb8,0xb9,0xba,0xbb,0xbc,0xbd,0xbe,0xbf, +0xc0,0xc1,0xc2,0xc3,0xc4,0xc5,0xc6,0xc7,0xc8,0xc9,0xca,0xcb,0xcc,0xcd,0xce,0xcf, +0xd0,0xd1,0xd2,0xd3,0xd4,0xd5,0xd6,0xd7,0xd8,0xd9,0xda,0xdb,0xdc,0xdd,0xde,0xdf, +0xe0,0xe1,0xe2,0xe3,0xe4,0xe5,0xe6,0xe7,0xe8,0xe9,0xea,0xeb,0xec,0xed,0xee,0xef, +0xf0,0xf1,0xf2,0xf3,0xf4,0xf5,0xf6,0xf7,0xf8,0xf9,0xfa,0xfb,0xfc,0xfd,0xfe,0xff +}; + +/* some entries guessed - see scancodes.sgml */ + + +/* Untranslated scancodes, and USB key values. + For translated values, feed through ttable[]. 
+ + I also included Vojtech Pavlik's scancodes.h in this directory. + It mostly agrees with this table, but lacks + Microsoft Internet keys, and misses some set1 values. */ + +struct keycode { + unsigned int position, usb, set1, set2, set3; + char *name; /* keycap on a standard US keyboard */ +} keycodes[] = { + 1, 53, 0x29, 0x0e, 0x0e, "`~", + 2, 30, 0x02, 0x16, 0x16, "1!", + 3, 31, 0x03, 0x1e, 0x1e, "2@", + 4, 32, 0x04, 0x26, 0x26, "3#", + 5, 33, 0x05, 0x25, 0x25, "4$", + 6, 34, 0x06, 0x2e, 0x2e, "5%E", + 7, 35, 0x07, 0x36, 0x36, "6^", + 8, 36, 0x08, 0x3d, 0x3d, "7&", + 9, 37, 0x09, 0x3e, 0x3e, "8*", + 10, 38, 0x0a, 0x46, 0x46, "9(", + 11, 39, 0x0b, 0x45, 0x45, "0)", + 12, 45, 0x0c, 0x4e, 0x4e, "-_", + 13, 46, 0x0d, 0x55, 0x55, "=+", + 15, 42, 0x0e, 0x66, 0x66, "Backspace", + + 16, 43, 0x0f, 0x0d, 0x0d, "Tab", + 17, 20, 0x10, 0x15, 0x15, "Q", + 18, 26, 0x11, 0x1d, 0x1d, "W", + 19, 8, 0x12, 0x24, 0x24, "E", + 20, 21, 0x13, 0x2d, 0x2d, "R", + 21, 23, 0x14, 0x2c, 0x2c, "T", + 22, 28, 0x15, 0x35, 0x35, "Y", + 23, 24, 0x16, 0x3c, 0x3c, "U", + 24, 12, 0x17, 0x43, 0x43, "I", + 25, 18, 0x18, 0x44, 0x44, "O", + 26, 19, 0x19, 0x4d, 0x4d, "P", + 27, 47, 0x1a, 0x54, 0x54, "[{", + 28, 48, 0x1b, 0x5b, 0x5b, "]}", + 29, 49, 0x2b, 0x5d, 0x5c, "\\|", + + 30, 57, 0x3a, 0x58, 0x14, "CapsLock", + 31, 04, 0x1e, 0x1c, 0x1c, "A", + 32, 22, 0x1f, 0x1b, 0x1b, "S", + 33, 7, 0x20, 0x23, 0x23, "D", + 34, 9, 0x21, 0x2b, 0x2b, "F", + 35, 10, 0x22, 0x34, 0x34, "G", + 36, 11, 0x23, 0x33, 0x33, "H", + 37, 13, 0x24, 0x3b, 0x3b, "J", + 38, 14, 0x25, 0x42, 0x42, "K", + 39, 15, 0x26, 0x4b, 0x4b, "L", + 40, 51, 0x27, 0x4c, 0x4c, ";:", + 41, 52, 0x28, 0x52, 0x52, "'\"", + 42, 50, 0, 0, 0, "non-US-1", + 43, 40, 0x1c, 0x5a, 0x5a, "Enter", + + 44, 225, 0x2a, 0x12, 0x12, "LShift", + 46, 29, 0x2c, 0x1a, 0x1a, "Z", + 47, 27, 0x2d, 0x22, 0x22, "X", + 48, 6, 0x2e, 0x21, 0x21, "C", + 49, 25, 0x2f, 0x2a, 0x2a, "V", + 50, 5, 0x30, 0x32, 0x32, "B", + 51, 17, 0x31, 0x31, 0x31, "N", + 52, 16, 0x32, 0x3a, 0x3a, "M", 
+ 53, 54, 0x33, 0x41, 0x41, ",<", + 54, 55, 0x34, 0x49, 0x49, ".>", + 55, 56, 0x35, 0x4a, 0x4a, "/?", + 57, 229, 0x36, 0x59, 0x59, "RShift", + + 58, 224, 0x1d, 0x14, 0x11, "LCtrl", + 60, 226, 0x38, 0x11, 0x19, "LAlt", + 61, 44, 0x39, 0x29, 0x29, "space", + 62, 230, 0xe038, 0xe011, 0x39, "RAlt", + 64, 228, 0xe01d, 0xe014, 0x58, "RCtrl", + + 75, 73, 0xe052, 0xe070, 0x67, "Insert", + 76, 76, 0xe053, 0xe071, 0x64, "Delete", + 80, 74, 0xe047, 0xe06c, 0x6e, "Home", + 81, 77, 0xe04f, 0xe069, 0x65, "End", + 85, 75, 0xe049, 0xe07d, 0x6f, "PgUp", + 86, 78, 0xe051, 0xe07a, 0x6d, "PgDn", + + 79, 80, 0xe04b, 0xe06b, 0x61, "Left", + 83, 82, 0xe048, 0xe075, 0x63, "Up", + 84, 81, 0xe050, 0xe072, 0x60, "Down", + 89, 79, 0xe04d, 0xe074, 0x6a, "Right", + + 90, 83, 0x45, 0x77, 0x76, "NumLock", + 91, 95, 0x47, 0x6c, 0x6c, "KP-7 / Home", + 92, 92, 0x4b, 0x6b, 0x6b, "KP-4 / Left", + 93, 89, 0x4f, 0x69, 0x69, "KP-1 / End", + 95, 84, 0xe035, 0xe04a, 0x77, "KP-/", + 96, 96, 0x48, 0x75, 0x75, "KP-8 / Up", + 97, 93, 0x4c, 0x73, 0x73, "KP-5", + 98, 90, 0x50, 0x72, 0x72, "KP-2", + 99, 98, 0x52, 0x70, 0x70, "KP-0 / Ins", + 100, 85, 0x37, 0x7c, 0x7e, "KP-*", + 101, 97, 0x49, 0x7d, 0x7d, "KP-9", + 102, 94, 0x4d, 0x74, 0x74, "KP-6 / Right", + 103, 91, 0x51, 0x7a, 0x7a, "KP-3 / PgDn", + 104, 99, 0x53, 0x71, 0x71, "KP-. 
/ Del", + 105, 86, 0x4a, 0x7b, 0x84, "KP--", + 106, 87, 0x4e, 0x79, 0x7c, "KP-+", + 108, 88, 0xe01c, 0xe05a, 0x79, "KP-Enter", + + 110, 41, 0x01, 0x76, 0x08, "Esc", + 112, 58, 0x3b, 0x05, 0x07, "F1", + 113, 59, 0x3c, 0x06, 0x0f, "F2", + 114, 60, 0x3d, 0x04, 0x17, "F3", + 115, 61, 0x3e, 0x0c, 0x1f, "F4", + 116, 62, 0x3f, 0x03, 0x27, "F5", + 117, 63, 0x40, 0x0b, 0x2f, "F6", + 118, 64, 0x41, 0x83, 0x37, "F7", /* Vojtech has 0x02 in set2 */ + 119, 65, 0x42, 0x0a, 0x3f, "F8", + 120, 66, 0x43, 0x01, 0x47, "F9", + 121, 67, 0x44, 0x09, 0x4f, "F10", + 122, 68, 0x57, 0x78, 0x56, "F11", + 123, 69, 0x58, 0x07, 0x5e, "F12", + + 124, 70, 0xe037, 0xe07c, 0x57, "PrtScr", + 0, 154, 0x54, 0x84, 0x57, "Alt+SysRq", + 125, 71, 0x46, 0x7e, 0x5f, "ScrollLock", + 126, 72, 0xe11d45, 0xe11477, 0x62, "Pause", + 0, 0, 0xe046, 0xe07e, 0x62, "Ctrl+Break", + + /* Microsoft Windows and Internet keys and Power keys */ + 0, 227, 0xe05b, 0xe01f, 0x8b, "LWin (USB: LGUI)", + 0, 231, 0xe05c, 0xe027, 0x8c, "RWin (USB: RGUI)", + 0, 0, 0xe05d, 0xe02f, 0x8d, "Menu", + + 0, 0, 0xe06a, 0xe038, 0x38, "Back", + 0, 0, 0xe069, 0xe030, 0x30, "Forward", + 0, 0, 0xe068, 0xe028, 0x28, "Stop", + 0, 0, 0xe06c, 0xe048, 0x48, "Mail", + 0, 0, 0xe065, 0xe010, 0x10, "Search", + 0, 0, 0xe066, 0xe018, 0x18, "Favorites", + 0, 0, 0xe032, 0xe03a, 0x97, "Web / Home", + + 0, 0, 0xe06b, 0xe040, 0x40, "My Computer", + 0, 0, 0xe021, 0xe02b, 0x99, "Calculator", + 0, 0, 0xe05f, 0xe03f, 0x7f, "Sleep", + 0, 0, 0xe05e, 0xe037, 0, "Power", + 0, 0, 0xe063, 0xe05e, 0, "Wake", +}; diff --git a/specs/kbd/telerate-s.jpg b/specs/kbd/telerate-s.jpg new file mode 100644 index 0000000..5082aa3 Binary files /dev/null and b/specs/kbd/telerate-s.jpg differ diff --git a/specs/kbd/telerate.jpg b/specs/kbd/telerate.jpg new file mode 100644 index 0000000..02b5564 Binary files /dev/null and b/specs/kbd/telerate.jpg differ diff --git a/specs/kbd/victor-s.jpg b/specs/kbd/victor-s.jpg new file mode 100644 index 0000000..301f639 Binary files /dev/null and 
b/specs/kbd/victor-s.jpg differ diff --git a/specs/kbd/victor.jpg b/specs/kbd/victor.jpg new file mode 100644 index 0000000..b05e03c Binary files /dev/null and b/specs/kbd/victor.jpg differ diff --git a/specs/kbd/xt-at-switch.jpg b/specs/kbd/xt-at-switch.jpg new file mode 100644 index 0000000..a7ca378 Binary files /dev/null and b/specs/kbd/xt-at-switch.jpg differ diff --git a/specs/kbd/xtkbd-s.jpg b/specs/kbd/xtkbd-s.jpg new file mode 100644 index 0000000..3a97834 Binary files /dev/null and b/specs/kbd/xtkbd-s.jpg differ diff --git a/specs/kbd/xtkbd.jpg b/specs/kbd/xtkbd.jpg new file mode 100644 index 0000000..220c848 Binary files /dev/null and b/specs/kbd/xtkbd.jpg differ diff --git a/specs/kbd/yahoo912.jpg b/specs/kbd/yahoo912.jpg new file mode 100644 index 0000000..0150d67 Binary files /dev/null and b/specs/kbd/yahoo912.jpg differ diff --git a/specs/mc146818a.pdf b/specs/mc146818a.pdf new file mode 100644 index 0000000..0675424 Binary files /dev/null and b/specs/mc146818a.pdf differ diff --git a/specs/pc16550d.pdf b/specs/pc16550d.pdf new file mode 100644 index 0000000..4cec8ac Binary files /dev/null and b/specs/pc16550d.pdf differ diff --git a/specs/sysv-abi-4.1.pdf b/specs/sysv-abi-4.1.pdf new file mode 100644 index 0000000..1605f1d Binary files /dev/null and b/specs/sysv-abi-4.1.pdf differ diff --git a/specs/sysv-abi-i386-4.pdf b/specs/sysv-abi-i386-4.pdf new file mode 100644 index 0000000..28ce48c Binary files /dev/null and b/specs/sysv-abi-i386-4.pdf differ diff --git a/specs/sysv-abi-update.html/ch4.eheader.html b/specs/sysv-abi-update.html/ch4.eheader.html new file mode 100644 index 0000000..5ddd45b --- /dev/null +++ b/specs/sysv-abi-update.html/ch4.eheader.html @@ -0,0 +1,1184 @@ + +ELF Header +

    ELF Header

    +

    +Some object file control structures can grow, because the ELF header +contains their actual sizes. If the object file format changes, a program +may encounter control structures that are larger or smaller than expected. +Programs might therefore ignore ``extra'' information. The treatment of +``missing'' information depends on context and will be specified when and +if extensions are defined. +


    +Figure 4-3: ELF Header +

    +

    +#define EI_NIDENT 16
    +
    +typedef struct {
    +        unsigned char   e_ident[EI_NIDENT];
    +        Elf32_Half      e_type;
    +        Elf32_Half      e_machine;
    +        Elf32_Word      e_version;
    +        Elf32_Addr      e_entry;
    +        Elf32_Off       e_phoff;
    +        Elf32_Off       e_shoff;
    +        Elf32_Word      e_flags;
    +        Elf32_Half      e_ehsize;
    +        Elf32_Half      e_phentsize;
    +        Elf32_Half      e_phnum;
    +        Elf32_Half      e_shentsize;
    +        Elf32_Half      e_shnum;
    +        Elf32_Half      e_shstrndx;
    +} Elf32_Ehdr;
    +
    +typedef struct {
    +        unsigned char   e_ident[EI_NIDENT];
    +        Elf64_Half      e_type;
    +        Elf64_Half      e_machine;
    +        Elf64_Word      e_version;
    +        Elf64_Addr      e_entry;
    +        Elf64_Off       e_phoff;
    +        Elf64_Off       e_shoff;
    +        Elf64_Word      e_flags;
    +        Elf64_Half      e_ehsize;
    +        Elf64_Half      e_phentsize;
    +        Elf64_Half      e_phnum;
    +        Elf64_Half      e_shentsize;
    +        Elf64_Half      e_shnum;
    +        Elf64_Half      e_shstrndx;
    +} Elf64_Ehdr;
    +
    +
    +
    +
    e_ident
    +
    The initial bytes mark the file as an object file and +provide machine-independent +data with which to decode and interpret the file's contents. +Complete descriptions +appear below in ``ELF Identification''.
    +
    e_type
    +
    This member identifies the object file type.
    +

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Name       Value   Meaning
    ET_NONE    0       No file type
    ET_REL     1       Relocatable file
    ET_EXEC    2       Executable file
    ET_DYN     3       Shared object file
    ET_CORE    4       Core file
    ET_LOOS    0xfe00  Operating system-specific
    ET_HIOS    0xfeff  Operating system-specific
    ET_LOPROC  0xff00  Processor-specific
    ET_HIPROC  0xffff  Processor-specific
    +

    +Although the core file contents are unspecified, +type ET_CORE +is reserved to mark the file. +Values from ET_LOOS +through ET_HIOS +(inclusive) are reserved for operating system-specific semantics. +Values from ET_LOPROC +through ET_HIPROC +(inclusive) are reserved for processor-specific semantics. If meanings +are specified, the processor supplement explains them. Other values are +reserved and will be assigned to new object file types as necessary. +
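The reserved ranges for `e_type` can be checked with a small classifier. This is a sketch under stated assumptions: `elf_type_class` and `classify_e_type` are hypothetical names, and the `ET_*` values are simply those from the table above.

```c
#include <stdint.h>

#define ET_CORE   4
#define ET_LOOS   0xfe00
#define ET_HIOS   0xfeff
#define ET_LOPROC 0xff00
#define ET_HIPROC 0xffff

enum elf_type_class {
    ELF_TYPE_STANDARD,   /* ET_NONE through ET_CORE */
    ELF_TYPE_OS,         /* ET_LOOS through ET_HIOS, inclusive */
    ELF_TYPE_PROC,       /* ET_LOPROC through ET_HIPROC, inclusive */
    ELF_TYPE_RESERVED    /* everything else */
};

enum elf_type_class classify_e_type(uint16_t e_type)
{
    if (e_type <= ET_CORE)
        return ELF_TYPE_STANDARD;
    if (e_type >= ET_LOOS && e_type <= ET_HIOS)
        return ELF_TYPE_OS;
    /* ET_HIPROC is the largest 16-bit value, so only the lower
       bound needs to be tested here. */
    if (e_type >= ET_LOPROC)
        return ELF_TYPE_PROC;
    return ELF_TYPE_RESERVED;
}
```

A loader would typically reject `ELF_TYPE_RESERVED` values outright and defer the two specific ranges to operating system or processor supplement code.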

    +

    e_machine
    +
    This member's value specifies the required architecture for +an individual file.
    +

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Name            Value  Meaning
    EM_NONE         0      No machine
    EM_M32          1      AT&T WE 32100
    EM_SPARC        2      SPARC
    EM_386          3      Intel 80386
    EM_68K          4      Motorola 68000
    EM_88K          5      Motorola 88000
    reserved        6      Reserved for future use (was EM_486)
    EM_860          7      Intel 80860
    EM_MIPS         8      MIPS I Architecture
    EM_S370         9      IBM System/370 Processor
    EM_MIPS_RS3_LE  10     MIPS RS3000 Little-endian
    reserved        11-14  Reserved for future use
    EM_PARISC       15     Hewlett-Packard PA-RISC
    reserved        16     Reserved for future use
    EM_VPP500       17     Fujitsu VPP500
    EM_SPARC32PLUS  18     Enhanced instruction set SPARC
    EM_960          19     Intel 80960
    EM_PPC          20     PowerPC
    EM_PPC64        21     64-bit PowerPC
    EM_S390         22     IBM System/390 Processor
    reserved        23-35  Reserved for future use
    EM_V800         36     NEC V800
    EM_FR20         37     Fujitsu FR20
    EM_RH32         38     TRW RH-32
    EM_RCE          39     Motorola RCE
    EM_ARM          40     Advanced RISC Machines ARM
    EM_ALPHA        41     Digital Alpha
    EM_SH           42     Hitachi SH
    EM_SPARCV9      43     SPARC Version 9
    EM_TRICORE      44     Siemens TriCore embedded processor
    EM_ARC          45     Argonaut RISC Core, Argonaut Technologies Inc.
    EM_H8_300       46     Hitachi H8/300
    EM_H8_300H      47     Hitachi H8/300H
    EM_H8S          48     Hitachi H8S
    EM_H8_500       49     Hitachi H8/500
    EM_IA_64        50     Intel IA-64 processor architecture
    EM_MIPS_X       51     Stanford MIPS-X
    EM_COLDFIRE     52     Motorola ColdFire
    EM_68HC12       53     Motorola M68HC12
    EM_MMA          54     Fujitsu MMA Multimedia Accelerator
    EM_PCP          55     Siemens PCP
    EM_NCPU         56     Sony nCPU embedded RISC processor
    EM_NDR1         57     Denso NDR1 microprocessor
    EM_STARCORE     58     Motorola Star*Core processor
    EM_ME16         59     Toyota ME16 processor
    EM_ST100        60     STMicroelectronics ST100 processor
    EM_TINYJ        61     Advanced Logic Corp. TinyJ embedded processor family
    EM_X86_64       62     AMD x86-64 architecture
    EM_PDSP         63     Sony DSP Processor
    EM_PDP10        64     Digital Equipment Corp. PDP-10
    EM_PDP11        65     Digital Equipment Corp. PDP-11
    EM_FX66         66     Siemens FX66 microcontroller
    EM_ST9PLUS      67     STMicroelectronics ST9+ 8/16 bit microcontroller
    EM_ST7          68     STMicroelectronics ST7 8-bit microcontroller
    EM_68HC16       69     Motorola MC68HC16 Microcontroller
    EM_68HC11       70     Motorola MC68HC11 Microcontroller
    EM_68HC08       71     Motorola MC68HC08 Microcontroller
    EM_68HC05       72     Motorola MC68HC05 Microcontroller
    EM_SVX          73     Silicon Graphics SVx
    EM_ST19         74     STMicroelectronics ST19 8-bit microcontroller
    EM_VAX          75     Digital VAX
    EM_CRIS         76     Axis Communications 32-bit embedded processor
    EM_JAVELIN      77     Infineon Technologies 32-bit embedded processor
    EM_FIREPATH     78     Element 14 64-bit DSP Processor
    EM_ZSP          79     LSI Logic 16-bit DSP Processor
    EM_MMIX         80     Donald Knuth's educational 64-bit processor
    EM_HUANY        81     Harvard University machine-independent object files
    EM_PRISM        82     SiTera Prism
    EM_AVR          83     Atmel AVR 8-bit microcontroller
    EM_FR30         84     Fujitsu FR30
    EM_D10V         85     Mitsubishi D10V
    EM_D30V         86     Mitsubishi D30V
    EM_V850         87     NEC v850
    EM_M32R         88     Mitsubishi M32R
    EM_MN10300      89     Matsushita MN10300
    EM_MN10200      90     Matsushita MN10200
    EM_PJ           91     picoJava
    EM_OPENRISC     92     OpenRISC 32-bit embedded processor
    EM_ARC_A5       93     ARC Cores Tangent-A5
    EM_XTENSA       94     Tensilica Xtensa Architecture
    EM_VIDEOCORE    95     Alphamosaic VideoCore processor
    EM_TMM_GPP      96     Thompson Multimedia General Purpose Processor
    EM_NS32K        97     National Semiconductor 32000 series
    EM_TPC          98     Tenor Network TPC processor
    EM_SNP1K        99     Trebia SNP 1000 processor
    EM_ST200        100    STMicroelectronics (www.st.com) ST200 microcontroller
    +

    +Other values are reserved and will be assigned to new machines +as necessary. +Processor-specific ELF names use the machine name to distinguish them. +For example, the flags mentioned below use the +prefix EF_; +a flag named WIDGET for the EM_XYZ +machine would be called EF_XYZ_WIDGET. +

    e_version
    +
    This member identifies the object file version.
    +

    + + + + + + + + + + + + + + + + +
    Name        Value  Meaning
    EV_NONE     0      Invalid version
    EV_CURRENT  1      Current version
    +

    +The value 1 signifies the original file format; +extensions will create new versions with higher numbers. +Although the value of EV_CURRENT +is shown as 1 in the previous table, it will +change as necessary to reflect the current version number. +

    e_entry
    +
    This member gives the virtual address to which the +system first transfers +control, thus starting the process. If the file has no associated entry +point, this member holds zero.
    +
    e_phoff
    +
    This member holds the program header table's file offset in bytes. +If the file has no program header table, this member holds zero.
    +
    e_shoff
    +
    This member holds the section header table's file offset in bytes. +If the file has no section header table, this member holds zero.
    +
    e_flags
    +
    This member holds processor-specific flags associated with the file. +Flag names take the form +EF_machine_flag.
    +
    e_ehsize
    +
    This member holds the ELF header's size in bytes.
    +
    e_phentsize
    +
    This member holds the size in bytes of one entry in the file's program +header table; all entries are the same size.
    +
    e_phnum
    +
    This member holds the number of entries in the program header table. +Thus the product of +e_phentsize and e_phnum gives the +table's size in bytes. +If a file has no program header table, e_phnum +holds the value zero.
    +
    e_shentsize
    +
    This member holds a section header's size in bytes. A section header +is one entry in the section header table; all entries are the same size. +
    +
    e_shnum
    + +
    This member holds the number of entries in the section header table. +Thus the product of e_shentsize and +e_shnum gives the +section header table's size in bytes. +If a file has no section header table, +e_shnum holds the value zero. +

    +If the number of sections is greater than or equal to +SHN_LORESERVE (0xff00), this member +has the value zero and the actual number of section header table +entries is contained in the sh_size field of +the section header at index 0. +(Otherwise, the sh_size member of the initial entry +contains 0.) +

    +
    e_shstrndx
    +
    This member holds the section header table index of the +entry associated with the section name string table. +If the file has no section name string +table, this member holds the value SHN_UNDEF. +See ``Sections'' +and ``String Table'' below +for more information. +

    +If the section name string table section index is greater than or equal to +SHN_LORESERVE (0xff00), this member +has the value SHN_XINDEX (0xffff) and the +actual index of the section name string table section +is contained in the sh_link field of +the section header at index 0. +(Otherwise, the sh_link member of the initial entry +contains 0.) +

    +
    +
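The escape rules for e_shnum and e_shstrndx can be sketched as below. This is a minimal illustration, not a full header reader: the two structs are hypothetical cut-down views holding only the fields involved, and the caller is assumed to pass NULL for the index 0 section header when the file has no section header table.

```c
#include <stddef.h>
#include <stdint.h>

#define SHN_UNDEF     0
#define SHN_LORESERVE 0xff00
#define SHN_XINDEX    0xffff

/* Hypothetical cut-down views; real code would use the full
   Elf32_Ehdr/Elf64_Ehdr and Elf32_Shdr/Elf64_Shdr layouts. */
struct ehdr_counts { uint16_t e_shnum; uint16_t e_shstrndx; };
struct shdr0       { uint32_t sh_size; uint32_t sh_link; };

/* Actual number of section header table entries.
   s0 is the section header at index 0, or NULL if there is no table. */
uint32_t section_count(const struct ehdr_counts *eh, const struct shdr0 *s0)
{
    if (eh->e_shnum != 0)
        return eh->e_shnum;
    /* e_shnum == 0: either no table, or the count lives in sh_size. */
    return s0 != NULL ? s0->sh_size : 0;
}

/* Actual index of the section name string table section. */
uint32_t section_name_table_index(const struct ehdr_counts *eh,
                                  const struct shdr0 *s0)
{
    if (eh->e_shstrndx == SHN_XINDEX)
        return s0 != NULL ? s0->sh_link : SHN_UNDEF;
    return eh->e_shstrndx;
}
```

In the common case both functions simply return the ELF header fields; the index 0 section header is consulted only when the header fields carry the escape values.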

    +

    ELF Identification

    +

    +As mentioned above, ELF provides an object file framework to support +multiple processors, multiple data encodings, and multiple +classes of machines. To support this object file family, +the initial bytes of the file specify +how to interpret the file, independent of the processor on +which the inquiry is made and independent of the file's +remaining contents. +

    +The initial bytes of an ELF header (and an object file) correspond to +the e_ident member. +


    +Figure 4-4: e_ident[] Identification Indexes +

    Name            Value   Purpose
    EI_MAG0         0       File identification
    EI_MAG1         1       File identification
    EI_MAG2         2       File identification
    EI_MAG3         3       File identification
    EI_CLASS        4       File class
    EI_DATA         5       Data encoding
    EI_VERSION      6       File version
    EI_OSABI        7       Operating system/ABI identification
    EI_ABIVERSION   8       ABI version
    EI_PAD          9       Start of padding bytes
    EI_NIDENT       16      Size of e_ident[]
    +


    +

    +These indexes access bytes that hold the following values. +

    +
    EI_MAG0 to EI_MAG3
    +
    A file's first 4 bytes hold a ``magic number,'' identifying the file +as an ELF object file.
    +

    Name      Value   Position
    ELFMAG0   0x7f    e_ident[EI_MAG0]
    ELFMAG1   'E'     e_ident[EI_MAG1]
    ELFMAG2   'L'     e_ident[EI_MAG2]
    ELFMAG3   'F'     e_ident[EI_MAG3]
    +

    +

    EI_CLASS
    +
    The next byte, e_ident[EI_CLASS], identifies the +file's class, or capacity.
    +

    Name           Value   Meaning
    ELFCLASSNONE   0       Invalid class
    ELFCLASS32     1       32-bit objects
    ELFCLASS64     2       64-bit objects
    +

    +The file format is designed to be portable among machines of various +sizes, without imposing the sizes of the largest machine on the +smallest. The class of the file defines the basic types +used by the data structures +of the object file container itself. The data contained in object file +sections may follow a different programming model. If so, the processor +supplement describes the model used. +

    +Class ELFCLASS32 supports machines with +32-bit architectures. It +uses the basic types defined in the table +labeled ``32-Bit Data Types.'' +

    +Class ELFCLASS64 supports machines with 64-bit +architectures. It uses the basic types defined in the table +labeled ``64-Bit Data Types.'' +

    +Other classes will be defined as necessary, with different basic types +and sizes for object file data. +

    EI_DATA
    +
    Byte e_ident[EI_DATA] specifies the +encoding of both the data structures used by the object file container +and the data contained in object file sections. +The following encodings are currently defined. +
    +

    Name          Value   Meaning
    ELFDATANONE   0       Invalid data encoding
    ELFDATA2LSB   1       See below
    ELFDATA2MSB   2       See below
    +

    +Other values are reserved and will be assigned to new +encodings as necessary. +

    +


    NOTE: +Primarily for the convenience of code that looks at the ELF +file at runtime, the ELF data structures are intended to have the +same byte order as that of the running program. +
    +
    EI_VERSION
    +
    Byte e_ident[EI_VERSION] specifies the +ELF header version +number. Currently, this value must be EV_CURRENT, +as explained above for e_version.
    +

    +

    EI_OSABI
    +
    Byte e_ident[EI_OSABI] identifies the +OS- or ABI-specific ELF extensions used by this file. +Some fields in other ELF structures have flags and values +that have operating system and/or ABI specific meanings; +the interpretation of those fields is determined by the value of this byte. +If the object file does not use any extensions, +it is recommended that this byte be set to 0. +If the value for this byte is 64 through 255, +its meaning depends on the value of the e_machine header member. +The ABI processor supplement for an architecture +can define its own associated set of values for this byte in this range. +If the processor supplement does not specify a set of values, +one of the following values shall be used, +where 0 can also be taken to mean unspecified. +

    Name               Value    Meaning
    ELFOSABI_NONE      0        No extensions or unspecified
    ELFOSABI_HPUX      1        Hewlett-Packard HP-UX
    ELFOSABI_NETBSD    2        NetBSD
    ELFOSABI_LINUX     3        Linux
    ELFOSABI_SOLARIS   6        Sun Solaris
    ELFOSABI_AIX       7        AIX
    ELFOSABI_IRIX      8        IRIX
    ELFOSABI_FREEBSD   9        FreeBSD
    ELFOSABI_TRU64     10       Compaq TRU64 UNIX
    ELFOSABI_MODESTO   11       Novell Modesto
    ELFOSABI_OPENBSD   12       OpenBSD
    ELFOSABI_OPENVMS   13       OpenVMS
    ELFOSABI_NSK       14       Hewlett-Packard Non-Stop Kernel
                       64-255   Architecture-specific value range
    +

    +

    +

    EI_ABIVERSION
    +
    Byte e_ident[EI_ABIVERSION] identifies the +version of the ABI to which the object is targeted. +This field is used to distinguish among incompatible versions +of an ABI. The interpretation of this version number +is dependent on the ABI identified by the EI_OSABI +field. If no values are specified for the EI_OSABI +field by the processor supplement or no version values are +specified for the ABI determined by a particular value of the +EI_OSABI byte, the value 0 shall +be used for the EI_ABIVERSION byte; it +indicates unspecified.
    +

    +

    EI_PAD
    +
    This value marks the beginning of the unused bytes in +e_ident. These bytes are reserved and set to zero; +programs that read object files +should ignore them. The value of EI_PAD will +change in the future if currently unused bytes are given +meanings.
    +
    +
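A reader can check the identification bytes described above before interpreting anything else in the file. The following is a minimal sketch under stated assumptions: the function name ident_ok is hypothetical, the constants are copied from the tables in this section, and the check accepts only the two classes and two data encodings currently defined.

```c
#include <stddef.h>

#define EI_MAG0     0
#define EI_MAG1     1
#define EI_MAG2     2
#define EI_MAG3     3
#define EI_CLASS    4
#define EI_DATA     5
#define EI_VERSION  6
#define EI_NIDENT   16

#define ELFCLASS32  1
#define ELFCLASS64  2
#define ELFDATA2LSB 1
#define ELFDATA2MSB 2
#define EV_CURRENT  1

/* Returns 1 if buf starts with a plausible e_ident[], otherwise 0.
   The padding bytes from EI_PAD onward are deliberately ignored. */
int ident_ok(const unsigned char *buf, size_t len)
{
    if (len < EI_NIDENT)
        return 0;
    if (buf[EI_MAG0] != 0x7f || buf[EI_MAG1] != 'E' ||
        buf[EI_MAG2] != 'L'  || buf[EI_MAG3] != 'F')
        return 0;
    if (buf[EI_CLASS] != ELFCLASS32 && buf[EI_CLASS] != ELFCLASS64)
        return 0;
    if (buf[EI_DATA] != ELFDATA2LSB && buf[EI_DATA] != ELFDATA2MSB)
        return 0;
    return buf[EI_VERSION] == EV_CURRENT;
}
```

Because these bytes are single-byte values at fixed positions, the check works the same way regardless of the byte order of the machine performing it.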

    +A file's data encoding specifies how to interpret the basic objects +in a file. Class ELFCLASS32 files use objects +that occupy 1, 2, and 4 bytes. Class ELFCLASS64 files +use objects that occupy 1, 2, 4, and 8 bytes. Under the defined +encodings, objects are represented as shown below. +

    +Encoding ELFDATA2LSB specifies 2's complement values, +with the least significant byte occupying the lowest address. +


    +Figure 4-5: Data Encoding ELFDATA2LSB, byte address zero on the left +

    Value                 Bytes, lowest address first
    0x01                  01
    0x0102                02 01
    0x01020304            04 03 02 01
    0x0102030405060708    08 07 06 05 04 03 02 01
    +


    +

    +Encoding ELFDATA2MSB specifies 2's complement values, +with the most significant byte occupying the lowest address. +


    +Figure 4-6: Data Encoding ELFDATA2MSB, byte address zero on the left +

    Value                 Bytes, lowest address first
    0x01                  01
    0x0102                01 02
    0x01020304            01 02 03 04
    0x0102030405060708    01 02 03 04 05 06 07 08
    +


    +
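The two encodings shown in Figures 4-5 and 4-6 can be decoded without relying on the host's byte order. The sketch below assembles a 4-byte object from individual bytes; read_word is a hypothetical helper name, and any encoding other than ELFDATA2LSB is treated as ELFDATA2MSB for brevity.

```c
#include <stdint.h>

#define ELFDATA2LSB 1
#define ELFDATA2MSB 2

/* Decode a 4-byte object stored at p using the given data encoding. */
uint32_t read_word(const unsigned char *p, int encoding)
{
    if (encoding == ELFDATA2LSB)
        /* Least significant byte at the lowest address. */
        return (uint32_t)p[0]
             | (uint32_t)p[1] << 8
             | (uint32_t)p[2] << 16
             | (uint32_t)p[3] << 24;
    /* ELFDATA2MSB: most significant byte at the lowest address. */
    return (uint32_t)p[0] << 24
         | (uint32_t)p[1] << 16
         | (uint32_t)p[2] << 8
         | (uint32_t)p[3];
}
```

Both byte sequences from the figures decode to the same value, 0x01020304, when read with their respective encodings.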

    +

    +

    Machine Information (Processor-Specific)

    +

    +


    NOTE: +This section requires processor-specific information. +The ABI supplement for the desired processor describes the details. +
    + + +© 1997, 1998, 1999, 2000, 2001 The Santa Cruz Operation, Inc. All rights reserved. +© 2002 Caldera International. All rights reserved. + + + diff --git a/specs/sysv-abi-update.html/ch4.intro.html b/specs/sysv-abi-update.html/ch4.intro.html new file mode 100644 index 0000000..ccc81fd --- /dev/null +++ b/specs/sysv-abi-update.html/ch4.intro.html @@ -0,0 +1,252 @@ + +Chapter 4: Object Files +

    Introduction

    +This chapter describes the +object file format, called ELF (Executable and Linking Format). +There are three main types of object files. +
      +

    • +A relocatable file +holds code and data suitable for linking +with other object files to create an executable +or a shared object file. +

    • +An executable file +holds a program suitable for execution; +the file specifies how +exec(BA_OS) +creates a program's process image. +

    • +A +shared object file +holds code and data suitable for linking +in two contexts. +First, the link editor [see ld(BA_OS)] +processes the shared object file with other relocatable +and shared object files to create another object file. +Second, the dynamic linker combines it with an executable file and other +shared objects to create a process image. +
    +

    +Created by the assembler and link editor, object files are binary +representations of programs intended to be executed directly on +a processor. Programs that require other abstract machines, such +as shell scripts, are excluded. +

    +

    +After the introductory material, this chapter focuses on the file +format and how it pertains to building programs. Chapter 5 also +describes parts of the object file, concentrating on the information +necessary to execute a program. +

    + +

    File Format

    +Object files participate in program linking (building a program) +and program execution (running a program). For convenience and +efficiency, the object file format provides parallel views of a file's +contents, reflecting the differing needs of those activities. +Figure 4-1 shows an object file's organization. +


    +Figure 4-1: Object File Format +

    Linking View                       Execution View
    ELF Header                         ELF Header
    Program header table (optional)    Program header table (required)
    Section 1                          Segment 1
    ...                                Segment 2
    Section n                          Segment 3
    ...                                ...
    Section header table (required)    Section header table (optional)
    +


    +

    +An ELF header resides at the beginning and +holds a ``road map'' +describing the file's organization. Sections hold the bulk +of object file information for the linking view: instructions, +data, symbol table, relocation information, and so on. +Descriptions of special sections appear later in the chapter. +Chapter 5 discusses segments and the program execution +view of the file. +

    +

    +A program header table tells the system how to create a process image. +Files used to build a process image (execute a program) +must have a program header table; relocatable files do not need one. +A section header table +contains information describing the file's sections. +Every section has an entry in the table; each entry +gives information such as the section name, the +section size, and so on. +Files used during linking must have a section header table; +other object files may or may not have one. +


    +NOTE: +Although the figure shows the program header table +immediately after the ELF header, and the section header table +following the sections, actual files may differ. +Moreover, sections and segments have no specified order. +Only the ELF header has a fixed position in the file. +

    + +

    Data Representation

    +As described here, the object file +format +supports various processors with 8-bit bytes +and either 32-bit or 64-bit architectures. +Nevertheless, it is intended to be extensible to larger +(or smaller) architectures. +Object files therefore represent some control data +with a machine-independent format, +making it possible to identify object files and +interpret their contents in a common way. +Remaining data in an object file +use the encoding of the target processor, regardless of +the machine on which the file was created. +


    +Figure 4-2: 32-Bit Data Types +

    Name            Size   Alignment   Purpose
    Elf32_Addr      4      4           Unsigned program address
    Elf32_Off       4      4           Unsigned file offset
    Elf32_Half      2      2           Unsigned medium integer
    Elf32_Word      4      4           Unsigned integer
    Elf32_Sword     4      4           Signed integer
    unsigned char   1      1           Unsigned small integer
    +

    +64-Bit Data Types +

    Name            Size   Alignment   Purpose
    Elf64_Addr      8      8           Unsigned program address
    Elf64_Off       8      8           Unsigned file offset
    Elf64_Half      2      2           Unsigned medium integer
    Elf64_Word      4      4           Unsigned integer
    Elf64_Sword     4      4           Signed integer
    Elf64_Xword     8      8           Unsigned long integer
    Elf64_Sxword    8      8           Signed long integer
    unsigned char   1      1           Unsigned small integer
    +

    +


    +All data structures that the object file format +defines follow the ``natural'' size and alignment guidelines +for the relevant class. +If necessary, data structures contain explicit padding to +ensure 8-byte alignment for 8-byte objects, +4-byte alignment for 4-byte objects, to force +structure sizes to a multiple of 4 or 8, and so forth. +Data also have suitable alignment from the beginning of the file. +Thus, for example, a structure containing an +Elf32_Addr +member will be aligned on a 4-byte boundary within the file. +

    +For portability reasons, ELF uses no bit-fields. +
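One way to express the sizes in the two data type tables is with fixed-width integer types. The typedefs below are an assumption about how an implementation might declare them (the text specifies only size, alignment, and signedness), and align_up is a hypothetical helper showing how a file offset is rounded up to a natural alignment boundary.

```c
#include <stdint.h>

/* Fixed-width stand-ins for the 32-bit and 64-bit data types. */
typedef uint32_t Elf32_Addr;
typedef uint32_t Elf32_Off;
typedef uint16_t Elf32_Half;
typedef uint32_t Elf32_Word;
typedef int32_t  Elf32_Sword;

typedef uint64_t Elf64_Addr;
typedef uint64_t Elf64_Off;
typedef uint16_t Elf64_Half;
typedef uint32_t Elf64_Word;
typedef int32_t  Elf64_Sword;
typedef uint64_t Elf64_Xword;
typedef int64_t  Elf64_Sxword;

/* Round a file offset up to the next multiple of align.
   align must be a power of two (1, 2, 4, or 8 for the types above). */
uint64_t align_up(uint64_t offset, uint64_t align)
{
    return (offset + align - 1) & ~(align - 1);
}
```

For example, a structure containing an Elf32_Addr member that would otherwise start at file offset 5 is placed at offset 8 instead.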


    + + +© 1997, 1998, 1999, 2000, 2001 The Santa Cruz Operation, Inc. All rights reserved. + + + diff --git a/specs/sysv-abi-update.html/ch4.reloc.html b/specs/sysv-abi-update.html/ch4.reloc.html new file mode 100644 index 0000000..22996e1 --- /dev/null +++ b/specs/sysv-abi-update.html/ch4.reloc.html @@ -0,0 +1,180 @@ + +Relocation

    +

    Relocation

    +Relocation is the process of connecting symbolic references +with symbolic definitions. +For example, when a program calls a function, the associated call +instruction must transfer control to the proper destination address +at execution. +Relocatable files must have +``relocation entries'', which +describe how to modify their section contents, thus allowing +executable and shared object files to hold +the right information for a process's program image. +


    +Figure 4-21: Relocation Entries +

    +

    +
    +typedef struct {
    +	Elf32_Addr	r_offset;
    +	Elf32_Word	r_info;
    +} Elf32_Rel;
    +
    +typedef struct {
    +	Elf32_Addr	r_offset;
    +	Elf32_Word	r_info;
    +	Elf32_Sword	r_addend;
    +} Elf32_Rela;
    +
    +typedef struct {
    +	Elf64_Addr	r_offset;
    +	Elf64_Xword	r_info;
    +} Elf64_Rel;
    +
    +typedef struct {
    +	Elf64_Addr	r_offset;
    +	Elf64_Xword	r_info;
    +	Elf64_Sxword	r_addend;
    +} Elf64_Rela;
    +
    +
    +
    +
    +

    r_offset
    +This member gives the location at which to apply the +relocation action. +For a relocatable file, +the value is the byte offset from the beginning of the section +to the storage unit affected by the relocation. +For an executable file or a shared object, +the value is the virtual address +of the storage unit affected by the relocation. +

    r_info
    +This member gives both the symbol table index with respect to which +the relocation must be made, and the type of relocation to apply. +For example, a call instruction's relocation entry +would hold the symbol table index of the function being called. +If the index is STN_UNDEF, +the undefined symbol index, +the relocation uses 0 as the ``symbol value''. +Relocation types are processor-specific; +descriptions of their behavior appear in the processor +supplement. +When the text below refers to a relocation entry's +relocation type or symbol table index, it means the result of applying +ELF32_R_TYPE (or ELF64_R_TYPE) or ELF32_R_SYM (or ELF64_R_SYM), +respectively, to the entry's r_info member. +
    +
    +	#define ELF32_R_SYM(i)	((i)>>8)
    +	#define ELF32_R_TYPE(i)   ((unsigned char)(i))
    +	#define ELF32_R_INFO(s,t) (((s)<<8)+(unsigned char)(t))
    +
    +	#define ELF64_R_SYM(i)    ((i)>>32)
    +	#define ELF64_R_TYPE(i)   ((i)&0xffffffffL)
    +	#define ELF64_R_INFO(s,t) (((s)<<32)+((t)&0xffffffffL))
    +
    +
    +

    r_addend
    +This member specifies a constant addend used to +compute the value to be stored into the relocatable field. +
    +
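The r_info macros above pack and unpack the symbol table index and relocation type. The sketch below repeats the macros from the text and adds two hypothetical accessor functions, rel32_sym and rel32_type, to show them applied to an entry; the Elf32_Rel definition uses fixed-width types as an assumption.

```c
#include <stdint.h>

#define ELF32_R_SYM(i)    ((i)>>8)
#define ELF32_R_TYPE(i)   ((unsigned char)(i))
#define ELF32_R_INFO(s,t) (((s)<<8)+(unsigned char)(t))

#define ELF64_R_SYM(i)    ((i)>>32)
#define ELF64_R_TYPE(i)   ((i)&0xffffffffL)
#define ELF64_R_INFO(s,t) (((s)<<32)+((t)&0xffffffffL))

typedef struct {
    uint32_t r_offset;   /* Elf32_Addr */
    uint32_t r_info;     /* Elf32_Word */
} Elf32_Rel;

/* Symbol table index: the upper 24 bits of r_info. */
uint32_t rel32_sym(const Elf32_Rel *r)
{
    return ELF32_R_SYM(r->r_info);
}

/* Relocation type: the low 8 bits of r_info. */
unsigned rel32_type(const Elf32_Rel *r)
{
    return ELF32_R_TYPE(r->r_info);
}
```

Note that the 32-bit form leaves 24 bits for the symbol index and 8 for the type, while the 64-bit form gives each half 32 bits. The 64-bit macros expect a 64-bit operand for s, so callers should cast before shifting.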

    +As specified previously, only +Elf32_Rela and Elf64_Rela +entries contain an explicit addend. +Entries of type Elf32_Rel and Elf64_Rel +store an implicit addend in the location to be modified. +Depending on the processor architecture, one form or the other +might be necessary or more convenient. +Consequently, an implementation for a particular machine +may use one form exclusively or either form depending on context. +

    +A relocation section references two other sections: +a symbol table and a section to modify. +The section header's sh_info and sh_link +members, described in +``Sections'' +above, specify these relationships. +Relocation entries for different object files have +slightly different interpretations for the +r_offset member. +

    +

      +

    • +In relocatable files, r_offset +holds a section offset. +The relocation section itself describes how to +modify another section in the file; relocation offsets +designate a storage unit within the second section. +

    • +In executable and shared object files, +r_offset holds a virtual address. +To make these files' relocation entries more useful +for the dynamic linker, the section offset (file interpretation) +gives way to a virtual address (memory interpretation). +
    +Although the interpretation of r_offset +changes for different object files to +allow efficient access by the relevant programs, +the relocation types' meanings stay the same. +

    + +The typical application of an ELF relocation is to determine the +referenced symbol value, extract the addend (either from the +field to be relocated or from the addend field contained in +the relocation record, as appropriate for the type of relocation +record), apply the expression implied by the relocation type +to the symbol and addend, extract the desired part of the expression +result, and place it in the field to be relocated. +
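The difference between the implicit and explicit addend forms can be sketched as follows. Relocation types are processor-specific, so this example assumes a hypothetical type whose expression is simply "symbol value plus addend" stored as a full 32-bit word; it also assumes the section image uses the host's byte order. Neither assumption comes from the text.

```c
#include <stdint.h>
#include <string.h>

/* Rela form: the addend comes from the relocation record itself. */
void apply_rela_word32(unsigned char *section, uint32_t r_offset,
                       uint32_t sym_value, int32_t r_addend)
{
    uint32_t result = sym_value + (uint32_t)r_addend;
    memcpy(section + r_offset, &result, sizeof result);
}

/* Rel form: the addend is extracted from the field to be relocated. */
void apply_rel_word32(unsigned char *section, uint32_t r_offset,
                      uint32_t sym_value)
{
    uint32_t addend;
    uint32_t result;

    memcpy(&addend, section + r_offset, sizeof addend);
    result = sym_value + addend;
    memcpy(section + r_offset, &result, sizeof result);
}
```

Both functions follow the same steps: determine the symbol value, obtain the addend, evaluate the expression, and store the result in the relocated field. Only the source of the addend differs.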

    +If multiple consecutive relocation records are applied +to the same relocation location (r_offset), +they are composed instead +of being applied independently, as described above. +By consecutive, we mean that the relocation records are +contiguous within a single relocation section. By composed, +we mean that the standard application described above is modified +as follows: +

      +
    • +In all but the last relocation operation of a composed sequence, +the result of the relocation expression is retained, rather +than having part extracted and placed in the relocated field. +The result is retained at full pointer precision of the +applicable ABI processor supplement. +

    • +In all but the first relocation operation of a composed sequence, +the addend used is the retained result of the previous relocation +operation, rather than that implied by the relocation type. +
    +

    +Note that a consequence of the above rules is that the location specified +by a relocation type is relevant for the +first element of a composed sequence (and then only for relocation +records that do not contain an explicit addend field) and for the +last element, where the location determines where the relocated value +will be placed. For all other relocation operands in a composed +sequence, the location specified is ignored. +

    +An ABI processor supplement may specify individual relocation types +that always stop a composition sequence, or always start a new one. + +

    Relocation Types (Processor-Specific)

    +
    +NOTE: +This section requires processor-specific information. The ABI +supplement for the desired processor describes the details. +
    + + +© 1997, 1998, 1999, 2000, 2001 The Santa Cruz Operation, Inc. All rights reserved. + + + diff --git a/specs/sysv-abi-update.html/ch4.sheader.html b/specs/sysv-abi-update.html/ch4.sheader.html new file mode 100644 index 0000000..ca7c737 --- /dev/null +++ b/specs/sysv-abi-update.html/ch4.sheader.html @@ -0,0 +1,1307 @@ + +Sections

    +

    Sections

    +An object file's section header table lets one +locate all the file's sections. +The section header table is an array of Elf32_Shdr +or Elf64_Shdr structures +as described below. +A section header table index is a subscript into this array. +The ELF header's e_shoff +member gives the byte offset from the beginning of the +file to the section header table. +e_shnum normally tells how many entries the section header table contains. +e_shentsize gives the size in bytes of each entry. +

    +If the number of sections is greater than or equal to +SHN_LORESERVE (0xff00), e_shnum +has the value SHN_UNDEF (0) and the +actual number of section header table +entries is contained in the sh_size field of +the section header at index 0 +(otherwise, the sh_size member of the initial entry +contains 0). +
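Since the section header table is an array of equally sized entries, the file offset of any entry follows from e_shoff and e_shentsize. The sketch below is illustrative only; shdr_offset and shdr_fits are hypothetical helper names, and the bounds check assumes the caller knows the file size.

```c
#include <stdint.h>

/* File offset of the section header with the given table index. */
uint64_t shdr_offset(uint64_t e_shoff, uint16_t e_shentsize, uint32_t index)
{
    return e_shoff + (uint64_t)e_shentsize * index;
}

/* Returns 1 if the entry at index lies entirely inside the file. */
int shdr_fits(uint64_t e_shoff, uint16_t e_shentsize, uint32_t index,
              uint64_t file_size)
{
    uint64_t start = shdr_offset(e_shoff, e_shentsize, index);
    return start <= file_size && e_shentsize <= file_size - start;
}
```

Using e_shentsize rather than sizeof on a structure keeps the computation correct even if a file's entries are larger than the structure the reader was compiled with.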

    +Some section header table indexes are reserved in contexts +where index size is restricted, for example, the st_shndx +member of a symbol table entry and the e_shnum and +e_shstrndx members of the ELF header. +In such contexts, the reserved values do not represent actual +sections in the object file. Also in such contexts, an escape +value indicates that the actual section +index is to be found elsewhere, in a larger field. +


    +Figure 4-7: Special Section Indexes +

    Name            Value
    SHN_UNDEF       0
    SHN_LORESERVE   0xff00
    SHN_LOPROC      0xff00
    SHN_HIPROC      0xff1f
    SHN_LOOS        0xff20
    SHN_HIOS        0xff3f
    SHN_ABS         0xfff1
    SHN_COMMON      0xfff2
    SHN_XINDEX      0xffff
    SHN_HIRESERVE   0xffff
    +


    +

    +

    +

    SHN_UNDEF
    +This value marks an undefined, missing, irrelevant, or +otherwise meaningless section reference. +For example, a symbol ``defined'' relative to section number +SHN_UNDEF is an undefined symbol. +
    +
    +NOTE: +Although index 0 is reserved as the undefined value, +the section header table contains an entry for index 0. +If the e_shnum +member of the ELF header says a file has 6 entries +in the section header table, they have the indexes 0 through 5. +The contents of the initial entry are specified later in this +section. +

    +

    +

    SHN_LORESERVE
    +This value specifies the lower bound of the +range of reserved indexes. +

    SHN_LOPROC through SHN_HIPROC
    +Values in this inclusive range +are reserved for processor-specific semantics. +

    SHN_LOOS through SHN_HIOS
    +Values in this inclusive range +are reserved for operating system-specific semantics. +

    SHN_ABS
    +This value specifies absolute values for the corresponding reference. +For example, symbols defined relative to section number SHN_ABS +have absolute values and are not affected by relocation. +

    SHN_COMMON
    +Symbols defined relative to this section are common symbols, +such as FORTRAN +COMMON +or unallocated C external variables. +

    SHN_XINDEX
    +This value is an escape value. +It indicates that the actual section header index is too large to fit +in the containing field and is to be found in another location +(specific to the structure where it appears). +

    SHN_HIRESERVE
    +This value specifies the upper bound of the +range of reserved indexes. +The system reserves indexes between SHN_LORESERVE +and SHN_HIRESERVE, +inclusive; the values do not reference the section header table. +The section header table does not +contain entries for the reserved indexes. +
    +
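The special index ranges in Figure 4-7 can be told apart with a few comparisons. This is a sketch for a 16-bit index field such as st_shndx; the enum and the function name classify_shndx are hypothetical, and the constants are copied from the figure.

```c
#include <stdint.h>

#define SHN_UNDEF     0
#define SHN_LORESERVE 0xff00
#define SHN_LOPROC    0xff00
#define SHN_HIPROC    0xff1f
#define SHN_LOOS      0xff20
#define SHN_HIOS      0xff3f
#define SHN_ABS       0xfff1
#define SHN_COMMON    0xfff2
#define SHN_XINDEX    0xffff
#define SHN_HIRESERVE 0xffff

enum shndx_kind {
    SK_UNDEF, SK_ORDINARY, SK_PROC, SK_OS,
    SK_ABS, SK_COMMON, SK_XINDEX, SK_RESERVED
};

/* Classify a 16-bit section index according to Figure 4-7. */
enum shndx_kind classify_shndx(uint16_t idx)
{
    if (idx == SHN_UNDEF)                       return SK_UNDEF;
    if (idx < SHN_LORESERVE)                    return SK_ORDINARY;
    if (idx >= SHN_LOPROC && idx <= SHN_HIPROC) return SK_PROC;
    if (idx >= SHN_LOOS && idx <= SHN_HIOS)     return SK_OS;
    if (idx == SHN_ABS)                         return SK_ABS;
    if (idx == SHN_COMMON)                      return SK_COMMON;
    if (idx == SHN_XINDEX)                      return SK_XINDEX;
    return SK_RESERVED;  /* reserved, with no meaning assigned here */
}
```

Only SK_ORDINARY results refer to an actual entry in the section header table; every other result must be handled without indexing the table.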

    +Sections contain all information in an object file +except the ELF header, the program header table, +and the section header table. +Moreover, object files' sections satisfy several conditions. +

      +

    • +Every section in an object file has exactly one +section header describing it. +Section headers may exist that do not have a section. +

    • +Each section occupies one contiguous (possibly empty) +sequence of bytes within a file. +

    • +Sections in a file may not overlap. +No byte in a file resides in more than one section. +

    • +An object file may have inactive space. +The various headers and the sections might not +``cover'' every byte in an object file. +The contents of the inactive data are unspecified. +
    +A section header has the following structure. +
    + +
    +Figure 4-8: Section Header
    +

    +typedef struct {
    +	Elf32_Word	sh_name;
    +	Elf32_Word	sh_type;
    +	Elf32_Word	sh_flags;
    +	Elf32_Addr	sh_addr;
    +	Elf32_Off	sh_offset;
    +	Elf32_Word	sh_size;
    +	Elf32_Word	sh_link;
    +	Elf32_Word	sh_info;
    +	Elf32_Word	sh_addralign;
    +	Elf32_Word	sh_entsize;
    +} Elf32_Shdr;
    +
    +typedef struct {
    +	Elf64_Word	sh_name;
    +	Elf64_Word	sh_type;
    +	Elf64_Xword	sh_flags;
    +	Elf64_Addr	sh_addr;
    +	Elf64_Off	sh_offset;
    +	Elf64_Xword	sh_size;
    +	Elf64_Word	sh_link;
    +	Elf64_Word	sh_info;
    +	Elf64_Xword	sh_addralign;
    +	Elf64_Xword	sh_entsize;
    +} Elf64_Shdr;

    +
    +

    +

    +

    sh_name
    +This member specifies the name of the section. +Its value is an index into the section header +string table section [see +``String Table'' below], +giving the location of a null-terminated string. +

    sh_type
    +This member categorizes the section's contents and semantics. +Section types and their descriptions appear +below. +

    +

    sh_flags
    +Sections support 1-bit flags that describe miscellaneous attributes. +Flag definitions appear +below. +

    +

    sh_addr
    +If the section will appear in the memory image of a process, +this member gives the address at which the section's first +byte should reside. +Otherwise, the member contains 0. +

    sh_offset
    +This member's value gives the byte offset from the beginning of the file +to the first byte in the section. +One section type, SHT_NOBITS +described +below, +occupies no space in the file, and its +sh_offset member locates the conceptual placement in the file. +

    sh_size
    +This member gives the section's size in bytes. +Unless the section type is +SHT_NOBITS, the section occupies sh_size +bytes in the file. +A section of type SHT_NOBITS +may have a non-zero size, but it occupies no space in the file. +

    sh_link
    +This member holds a section header table index link, +whose interpretation depends on the section type. +A table below +describes the values. +

    sh_info
    +This member holds extra information, +whose interpretation depends on the section type. +A table below +describes the values. If the sh_flags field for this +section header includes the attribute SHF_INFO_LINK, then this member represents a section header table index. +

    sh_addralign
    +Some sections have address alignment constraints. +For example, if a section holds a doubleword, +the system must ensure doubleword alignment for the entire section. +The value of sh_addr +must be congruent to 0, modulo the value of sh_addralign. +Currently, only 0 and positive integral powers of two are allowed. +Values 0 and 1 mean the section has no alignment constraints. +

    sh_entsize
    +Some sections hold a table of fixed-size entries, +such as a symbol table. +For such a section, this member gives the size in bytes of each entry. +The member contains 0 if the section does not hold a table +of fixed-size entries. +
    +
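The sh_addralign constraint described above can be checked mechanically. The following sketch uses a hypothetical function name, addralign_ok, and takes 64-bit arguments so that it covers both classes.

```c
#include <stdint.h>

/* Returns 1 if sh_addr satisfies the constraint given by sh_addralign. */
int addralign_ok(uint64_t sh_addr, uint64_t sh_addralign)
{
    /* Values 0 and 1 mean the section has no alignment constraint. */
    if (sh_addralign == 0 || sh_addralign == 1)
        return 1;
    /* Only powers of two are allowed. */
    if ((sh_addralign & (sh_addralign - 1)) != 0)
        return 0;
    /* sh_addr must be congruent to 0 modulo sh_addralign. */
    return (sh_addr & (sh_addralign - 1)) == 0;
}
```

Because the alignment is a power of two, the modulo test reduces to masking the low bits of the address.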

    +A section header's sh_type member specifies the section's semantics. +


    + +Figure 4-9: Section Types, sh_type +

    Name                Value
    SHT_NULL            0
    SHT_PROGBITS        1
    SHT_SYMTAB          2
    SHT_STRTAB          3
    SHT_RELA            4
    SHT_HASH            5
    SHT_DYNAMIC         6
    SHT_NOTE            7
    SHT_NOBITS          8
    SHT_REL             9
    SHT_SHLIB           10
    SHT_DYNSYM          11
    SHT_INIT_ARRAY      14
    SHT_FINI_ARRAY      15
    SHT_PREINIT_ARRAY   16
    SHT_GROUP           17
    SHT_SYMTAB_SHNDX    18
    SHT_LOOS            0x60000000
    SHT_HIOS            0x6fffffff
    SHT_LOPROC          0x70000000
    SHT_HIPROC          0x7fffffff
    SHT_LOUSER          0x80000000
    SHT_HIUSER          0xffffffff
    +


    +

    +

    +

    SHT_NULL
    +This value marks the section header as inactive; +it does not have an associated section. +Other members of the section header have undefined values. +

    SHT_PROGBITS
    +The section holds information defined by the program, +whose format and meaning are determined solely by the program. +

    SHT_SYMTAB and SHT_DYNSYM
    +These sections hold a symbol table. +Currently, an object file may have only one section of each type, +but this restriction may be relaxed in the future. +Typically, SHT_SYMTAB +provides symbols for link editing, though it may also be +used for dynamic linking. +As a complete symbol table, it may contain many symbols unnecessary +for dynamic linking. +Consequently, an object file may also contain a SHT_DYNSYM +section, which holds a minimal set of dynamic linking symbols, +to save space. +See ``Symbol Table'' below +for details. +

    SHT_STRTAB
    +The section holds a string table. +An object file may have multiple string table sections. +See ``String Table'' +below for details. +

    SHT_RELA
    +The section holds relocation entries +with explicit addends, such as type +Elf32_Rela for the 32-bit class of object files +or type Elf64_Rela for the 64-bit class of object files. +An object file may have multiple relocation sections. +See ``Relocation'' +below for details. +

    SHT_HASH
    +The section holds a symbol hash table. +Currently, an object file may have only one hash table, +but this restriction may be relaxed in the future. +See +``Hash Table'' +in Chapter 5 for details. +

    SHT_DYNAMIC
    +The section holds information for dynamic linking. +Currently, an object file may have only one dynamic section, +but this restriction may be relaxed in the future. +See +``Dynamic Section'' +in Chapter 5 for details. +

    SHT_NOTE
    +The section holds information that marks the file in some way. +See +``Note Section'' +in Chapter 5 for details. +

    SHT_NOBITS
    +A section of this type occupies no space in the file but +otherwise resembles +SHT_PROGBITS. +Although this section contains no bytes, the sh_offset +member contains the conceptual file offset. +

    SHT_REL
    +The section holds relocation entries +without explicit addends, such as type +Elf32_Rel for the 32-bit class of object files or +type Elf64_Rel for the 64-bit class of object files. +An object file may have multiple relocation sections. +See ``Relocation'' +below for details. +

    SHT_SHLIB
    +This section type is reserved but has unspecified semantics. + +

    SHT_INIT_ARRAY
    +This section contains an array of pointers to initialization functions, +as described in ``Initialization and +Termination Functions'' in Chapter 5. Each pointer in the array +is taken as a parameterless procedure with a void return. +

    SHT_FINI_ARRAY
    +This section contains an array of pointers to termination functions, +as described in ``Initialization and +Termination Functions'' in Chapter 5. Each pointer in the array +is taken as a parameterless procedure with a void return. +

    SHT_PREINIT_ARRAY
    +This section contains an array of pointers to functions that are +invoked before all other initialization functions, +as described in ``Initialization and +Termination Functions'' in Chapter 5. Each pointer in the array +is taken as a parameterless procedure with a void return. +

    SHT_GROUP
    +This section defines a section group. A section group +is a set of sections that are related and that must be treated +specially by the linker (see below for further +details). Sections of type SHT_GROUP may appear only +in relocatable objects (objects with the ELF header e_type +member set to ET_REL). The section header table entry +for a group section must appear in the section header table +before the entries for any of the sections that are members of +the group. +

    SHT_SYMTAB_SHNDX
    + +This section is associated with a section of type SHT_SYMTAB +and is required if any of the section header indexes referenced +by that symbol table contain the escape value SHN_XINDEX. +The section is an array of Elf32_Word values. +Each value corresponds one to one with a symbol table entry +and appears in the same order as those entries. +The values represent the section header indexes against which +the symbol table entries are defined. +Only if the corresponding symbol table entry's st_shndx field +contains the escape value SHN_XINDEX +will the matching Elf32_Word hold the actual section header index; +otherwise, the entry must be SHN_UNDEF (0). +

    SHT_LOOS through SHT_HIOS
    +Values in this inclusive range +are reserved for operating system-specific semantics. +

    SHT_LOPROC through SHT_HIPROC
    +Values in this inclusive range +are reserved for processor-specific semantics. +

    SHT_LOUSER
    +This value specifies the lower bound of the range of +indexes reserved for application programs. +

    SHT_HIUSER
    +This value specifies the upper bound of the range of +indexes reserved for application programs. +Section types between SHT_LOUSER and +SHT_HIUSER may be used by the application, without conflicting with +current or future system-defined section types. +
    +

    +Other section type values are reserved. +As mentioned before, the section header for index 0 (SHN_UNDEF) +exists, even though the index marks undefined section references. +This entry holds the following. +


    +Figure 4-10: Section Header Table Entry:Index 0 +

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Name          Value        Note
    sh_name       0            No name
    sh_type       SHT_NULL     Inactive
    sh_flags      0            No flags
    sh_addr       0            No address
    sh_offset     0            No offset
    sh_size       Unspecified  If non-zero, the actual number of section header entries
    sh_link       Unspecified  If non-zero, the index of the section header string table section
    sh_info       0            No auxiliary information
    sh_addralign  0            No alignment
    sh_entsize    0            No entries
    +
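The two "Unspecified" rows are what a reader consults when a file has more sections than the ELF header fields can express. The following is a minimal sketch of that lookup. It assumes the ELF header convention, described elsewhere in the specification rather than in this section, that e_shnum holds 0 and e_shstrndx holds SHN_XINDEX when the real values live in entry 0. The struct shdr0 type and both function names are hypothetical and exist only for illustration.

```c
#include <stdint.h>

#define SHN_XINDEX 0xffff  /* escape value: the real index is stored elsewhere */

/* Reduced stand-in for a section header: only the two members used here. */
struct shdr0 {
    uint32_t sh_size;
    uint32_t sh_link;
};

/* Number of section header entries. Assumes that e_shnum == 0 defers to
   sh_size of entry 0 whenever that member is non-zero. */
static uint32_t real_section_count(uint16_t e_shnum, const struct shdr0 *zero)
{
    if (e_shnum == 0 && zero->sh_size != 0)
        return zero->sh_size;
    return e_shnum;
}

/* Index of the section header string table. Assumes that
   e_shstrndx == SHN_XINDEX defers to sh_link of entry 0. */
static uint32_t real_shstrndx(uint16_t e_shstrndx, const struct shdr0 *zero)
{
    if (e_shstrndx == SHN_XINDEX)
        return zero->sh_link;
    return e_shstrndx;
}
```

For ordinary files both functions simply return the header value, so callers can use them unconditionally.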


    +

    +A section header's sh_flags +member holds 1-bit flags that describe the section's attributes. +Defined values appear in the following table; +other values are reserved. + +


    +Figure 4-11: Section Attribute Flags +

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Name                  Value
    SHF_WRITE             0x1
    SHF_ALLOC             0x2
    SHF_EXECINSTR         0x4
    SHF_MERGE             0x10
    SHF_STRINGS           0x20
    SHF_INFO_LINK         0x40
    SHF_LINK_ORDER        0x80
    SHF_OS_NONCONFORMING  0x100
    SHF_GROUP             0x200
    SHF_TLS               0x400
    SHF_MASKOS            0x0ff00000
    SHF_MASKPROC          0xf0000000
    +


    +

    +If a flag bit is set in sh_flags, +the attribute is ``on'' for the section. +Otherwise, the attribute is ``off'' or does not apply. +Undefined attributes are set to zero. +

    +

    SHF_WRITE
    +The section contains data that should be writable during +process execution. +

    SHF_ALLOC
    +The section occupies memory during process execution. +Some control sections do not reside in the memory image +of an object file; this attribute is off for those sections. +

    SHF_EXECINSTR
    +The section contains executable machine instructions. + +

    SHF_MERGE
    +The data in the section may be merged to eliminate duplication. +Unless the SHF_STRINGS flag is also set, +the data elements in the section are of a uniform size. +The size of each element is specified in the section +header's sh_entsize field. +If the SHF_STRINGS flag is also set, +the data elements consist of null-terminated character strings. +The size of each character is specified in the section +header's sh_entsize field. +

    +Each element in the section is compared against other elements +in sections with the same name, type and flags. +Elements that would have identical values at program run-time +may be merged. +Relocations referencing elements of such sections must be +resolved to the merged locations of the referenced values. +Note that any relocatable values, including +values that would result in run-time relocations, must be +analyzed to determine whether the run-time values would actually +be identical. An ABI-conforming object file may not depend +on specific elements being merged, and an ABI-conforming +link editor may choose not to merge specific elements. + +

    SHF_STRINGS
    +The data elements in the section consist of null-terminated character +strings. The size of each character is specified in the section +header's sh_entsize field. + +

    SHF_INFO_LINK
    +The sh_info field of this section header holds a section +header table index. + +

    SHF_LINK_ORDER
    +This flag adds special ordering requirements for link editors. +The requirements apply if the +sh_link field of this section's header references +another section (the linked-to section). +If this section is combined with other +sections in the output file, it must appear in the same +relative order with respect to those sections, as the linked-to section +appears with respect to sections the linked-to section is combined with. +

    +


    +NOTE: +A typical use of this flag is to build a table that references text or +data sections in address order. +
    + +

    SHF_OS_NONCONFORMING
    +This section requires special OS-specific processing +(beyond the standard linking rules) +to avoid incorrect behavior. +If this section has either an sh_type value +or contains sh_flags bits in the OS-specific ranges for +those fields, and a link editor processing this section does not +recognize those values, then the link editor should reject +the object file containing this section with an error. +

    SHF_GROUP
    +This section is a member (perhaps the only one) of a section group. +The section must be referenced by a section of type SHT_GROUP. +The SHF_GROUP flag may be set only for sections contained +in relocatable objects (objects with the ELF header e_type +member set to ET_REL). +See below for further details. + +

    SHF_TLS
    +This section holds Thread-Local Storage, +meaning that each separate execution flow +has its own distinct instance of this data. +Implementations need not support this flag. +

    SHF_MASKOS
    +All bits included in this mask +are reserved for operating system-specific semantics. +

    SHF_MASKPROC
    +All bits included in this mask +are reserved for processor-specific semantics. +If meanings are specified, the processor supplement explains +them. +
    +
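Because each attribute is a single bit, testing a section's attributes reduces to masking sh_flags. The sketch below shows one possible way to do that for three of the flags. The letters W, A and X follow the convention of tools such as readelf and are not part of this specification; describe_flags and has_reserved_range_bits are hypothetical names.

```c
#include <stdint.h>

#define SHF_WRITE     0x1u
#define SHF_ALLOC     0x2u
#define SHF_EXECINSTR 0x4u
#define SHF_MASKOS    0x0ff00000u
#define SHF_MASKPROC  0xf0000000u

/* Writes a short key such as "AX" into out, which must hold 4 bytes.
   A set bit means the attribute is on; a clear bit means off. */
static void describe_flags(uint32_t sh_flags, char out[4])
{
    int n = 0;
    if (sh_flags & SHF_WRITE)     out[n++] = 'W';
    if (sh_flags & SHF_ALLOC)     out[n++] = 'A';
    if (sh_flags & SHF_EXECINSTR) out[n++] = 'X';
    out[n] = '\0';
}

/* Non-zero if any bit falls in the OS-specific or processor-specific mask. */
static int has_reserved_range_bits(uint32_t sh_flags)
{
    return (sh_flags & (SHF_MASKOS | SHF_MASKPROC)) != 0;
}
```

A typical .text section, with SHF_ALLOC and SHF_EXECINSTR set, yields the key "AX".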

    +Two members in the section header, +sh_link and sh_info, +hold special information, depending on section type. +


    + +Figure 4-12: sh_link and sh_info Interpretation +

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    sh_type                 sh_link                                                                        sh_info
    SHT_DYNAMIC             The section header index of the string table used by entries in the section.   0
    SHT_HASH                The section header index of the symbol table to which the hash table applies.  0
    SHT_REL, SHT_RELA       The section header index of the associated symbol table.                       The section header index of the section to which the relocation applies.
    SHT_SYMTAB, SHT_DYNSYM  The section header index of the associated string table.                       One greater than the symbol table index of the last local symbol (binding STB_LOCAL).
    SHT_GROUP               The section header index of the associated symbol table.                       The symbol table index of an entry in the associated symbol table. The name of the specified symbol table entry provides a signature for the section group.
    SHT_SYMTAB_SHNDX        The section header index of the associated symbol table section.               0
    +


    + +

    Rules for Linking Unrecognized Sections

    +If a link editor encounters sections whose headers contain OS-specific +values it does not recognize in the sh_type +or sh_flags fields, the link editor should combine those +sections as described below. +

    +If the section's sh_flags bits include the attribute +SHF_OS_NONCONFORMING, then the section requires +special knowledge to be correctly processed, and the link editor should +reject the object containing the section with an error. +

    +Unrecognized sections that do not have the +SHF_OS_NONCONFORMING attribute are combined in a two-phase +process. As the link editor combines sections using this process, +it must honor the alignment constraints of the +input sections (asserted by the sh_addralign field), +padding between sections with zero bytes, if necessary, and producing +a combination with the maximum alignment constraint of its +component input sections. +

    +

      +
    1. +In the first phase, input sections that match in name, type +and attribute flags should be concatenated into single sections. +The concatenation order should satisfy the requirements of +any known input section attributes (e.g., SHF_MERGE +and SHF_LINK_ORDER). When not otherwise constrained, +sections should be emitted in input order. +
    2. +In the second phase, sections should be assigned to segments or +other units based on their attribute flags. Sections of each particular +unrecognized type should be assigned to the same unit unless +prevented by incompatible flags, and within a unit, sections +of the same unrecognized type should be placed together +if possible. +
    +
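The alignment rule for the first phase can be sketched as follows: each input section is placed at the next offset that satisfies its sh_addralign, any gap is understood to be zero padding, and the combined section takes the largest alignment seen. This is an illustrative layout calculation only, not a link editor. The struct and function names are hypothetical, and sh_addralign values of 0 and 1 are both treated as "no constraint".

```c
#include <stddef.h>
#include <stdint.h>

struct input_section { uint32_t size; uint32_t addralign; };
struct layout        { uint32_t size; uint32_t addralign; };

/* Rounds off up to a multiple of align; 0 and 1 impose no constraint. */
static uint32_t align_up(uint32_t off, uint32_t align)
{
    if (align <= 1)
        return off;
    return (off + align - 1) / align * align;
}

/* Lays out n input sections in input order. offsets[i] receives the start
   of input i inside the combined section; gaps are zero padding. */
static struct layout concatenate(const struct input_section *in, size_t n,
                                 uint32_t *offsets)
{
    struct layout out = {0, 1};
    for (size_t i = 0; i < n; i++) {
        uint32_t a = in[i].addralign ? in[i].addralign : 1;
        out.size = align_up(out.size, a);
        offsets[i] = out.size;
        out.size += in[i].size;
        if (a > out.addralign)
            out.addralign = a;   /* keep the maximum constraint */
    }
    return out;
}
```

For inputs of size 5 (alignment 4), size 3 (alignment 8) and size 2 (alignment 1), the offsets are 0, 8 and 11, with three zero bytes of padding after the first input.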

    +Non OS-specific processing (e.g. relocation) should be applied +to unrecognized section types. An output section header table, +if present, should contain entries for unknown sections. +Any unrecognized section attribute flags should be removed. +


    +NOTE: +It is recommended that link editors follow the same two-phase +ordering approach described above when linking sections of +known types. Padding between such sections may have values +different from zero, where appropriate. +
    + +

    Section Groups

    +Some sections occur in interrelated groups. For example, an out-of-line +definition of an inline function might require, in addition to the +section containing its executable instructions, a read-only data +section containing literals referenced, one or more debugging information +sections and other informational sections. Furthermore, there may be +internal references among these sections that would not make sense +if one of the sections were removed or replaced by a duplicate from +another object. Therefore, such groups must be +included or omitted from the linked object as a unit. +A section cannot be a member of more than one group. +

    +A section of type SHT_GROUP defines such a grouping +of sections. The name of a symbol from one of the containing +object's symbol tables provides a signature for the section group. +The section header of the SHT_GROUP section specifies +the identifying symbol entry, as described above: +the sh_link member contains the section header index +of the symbol table section that contains the entry. +The sh_info member contains the symbol table index of +the identifying entry. The sh_flags +member of the section header contains 0. +The name of the section (sh_name) is not specified. +

    +The referenced signature symbol is not restricted. +Its containing symbol table section need not be a member of the group, +for example. +

    +The section data of a SHT_GROUP section is an array +of Elf32_Word entries. The first entry is a flag word. +The remaining entries are a sequence of section header indices. +

    +The following flags are currently defined: +


    + +Figure 4-13: Section Group Flags +

    + + + + + + + + + + + + + + + +
    Name          Value
    GRP_COMDAT    0x1
    GRP_MASKOS    0x0ff00000
    GRP_MASKPROC  0xf0000000
    +


    +
    +

    GRP_COMDAT
    +This is a COMDAT group. It may duplicate another COMDAT group +in another object file, where duplication is defined as having the +same group signature. In such cases, only one of the +duplicate groups may be retained by the linker, and the +members of the remaining groups must be discarded. +

    GRP_MASKOS
    +All bits included in this mask +are reserved for operating system-specific semantics. +

    GRP_MASKPROC
    +All bits included in this mask +are reserved for processor-specific semantics. +If meanings are specified, the processor supplement explains +them. +
    +

    +The section header indices in the SHT_GROUP section +identify the sections that make up the group. Each such section +must have the SHF_GROUP flag set in its sh_flags +section header member. If the linker decides to remove the section +group, it must remove all members of the group. +
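The layout of a group section's data, a flag word followed by section header indices, can be read with a few lines of code. The sketch below assumes the data has already been loaded into memory in host byte order. The Elf32_Word typedef is a local stand-in, and group_view, parse_group and is_comdat are hypothetical names.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Elf32_Word;   /* local stand-in for the ELF type */

#define GRP_COMDAT 0x1u

struct group_view {
    Elf32_Word        flags;         /* first word of the section data */
    const Elf32_Word *members;       /* section header indices */
    size_t            member_count;
};

/* Splits SHT_GROUP section data of sh_size bytes into flags and members.
   Returns 0 on success, -1 if the data cannot hold the flag word. */
static int parse_group(const Elf32_Word *data, size_t sh_size,
                       struct group_view *out)
{
    size_t words = sh_size / sizeof(Elf32_Word);
    if (words < 1)
        return -1;
    out->flags = data[0];
    out->members = data + 1;
    out->member_count = words - 1;
    return 0;
}

static int is_comdat(const struct group_view *g)
{
    return (g->flags & GRP_COMDAT) != 0;
}
```

A caller would then check that every listed member has SHF_GROUP set in its sh_flags, as required above.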


    +NOTE: +This requirement is not intended to imply that special case behavior +like removing debugging information requires removing the sections +to which that information refers, even if they are part of the same +group. +
    +

    + +To facilitate removing a group without leaving dangling references +and with only minimal processing of the symbol table, +the following rules must be followed: +

      +

    • +A symbol table entry with STB_GLOBAL or STB_WEAK +binding that is defined relative to one of a group's sections, +and that is contained in a symbol table section +that is not part of the group, +must be converted to an undefined symbol +(its section index must be changed to SHN_UNDEF) +if the group members are discarded. +References to this symbol table entry from outside the group are allowed. +

    • +A symbol table entry with STB_LOCAL binding +that is defined relative to one of a group's sections, +and that is contained in a symbol table section +that is not part of the group, +must be discarded if the group members are discarded. +References to this symbol table entry from outside the group are not allowed. +

    • +An undefined symbol that is referenced only from one or more sections +that are part of a particular group, +and that is contained in a symbol table section +that is not part of the group, +is not removed when the group members are discarded. +In other words, +the undefined symbol is not removed +even if no references to that symbol remain. +

    • +There may not be non-symbol references to the sections comprising +a group from outside the group, for example, use of a group +member's section header index in an sh_link or +sh_info member. +
    + +

    Special Sections

    +Various sections hold program and control information. +

    +The following table +shows sections that are used by the system +and have the indicated types and attributes. +


    +Figure 4-14: Special Sections +

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Name            Type               Attributes
    .bss            SHT_NOBITS         SHF_ALLOC+SHF_WRITE
    .comment        SHT_PROGBITS       none
    .data           SHT_PROGBITS       SHF_ALLOC+SHF_WRITE
    .data1          SHT_PROGBITS       SHF_ALLOC+SHF_WRITE
    .debug          SHT_PROGBITS       none
    .dynamic        SHT_DYNAMIC        see below
    .dynstr         SHT_STRTAB         SHF_ALLOC
    .dynsym         SHT_DYNSYM         SHF_ALLOC
    .fini           SHT_PROGBITS       SHF_ALLOC+SHF_EXECINSTR
    .fini_array     SHT_FINI_ARRAY     SHF_ALLOC+SHF_WRITE
    .got            SHT_PROGBITS       see below
    .hash           SHT_HASH           SHF_ALLOC
    .init           SHT_PROGBITS       SHF_ALLOC+SHF_EXECINSTR
    .init_array     SHT_INIT_ARRAY     SHF_ALLOC+SHF_WRITE
    .interp         SHT_PROGBITS       see below
    .line           SHT_PROGBITS       none
    .note           SHT_NOTE           none
    .plt            SHT_PROGBITS       see below
    .preinit_array  SHT_PREINIT_ARRAY  SHF_ALLOC+SHF_WRITE
    .relname        SHT_REL            see below
    .relaname       SHT_RELA           see below
    .rodata         SHT_PROGBITS       SHF_ALLOC
    .rodata1        SHT_PROGBITS       SHF_ALLOC
    .shstrtab       SHT_STRTAB         none
    .strtab         SHT_STRTAB         see below
    .symtab         SHT_SYMTAB         see below
    .symtab_shndx   SHT_SYMTAB_SHNDX   see below
    .tbss           SHT_NOBITS         SHF_ALLOC+SHF_WRITE+SHF_TLS
    .tdata          SHT_PROGBITS       SHF_ALLOC+SHF_WRITE+SHF_TLS
    .tdata1         SHT_PROGBITS       SHF_ALLOC+SHF_WRITE+SHF_TLS
    .text           SHT_PROGBITS       SHF_ALLOC+SHF_EXECINSTR
    +


    +

    +

    +

    .bss
    +This section holds uninitialized data that contribute +to the program's memory image. +By definition, the system initializes the data with zeros +when the program begins to run. +The section occupies no file space, as indicated by the section type, +SHT_NOBITS. +

    .comment
    +This section holds version control information. +

    .data and .data1
    +These sections hold initialized data that contribute +to the program's memory image. +

    .debug
    +This section holds information for symbolic debugging. +The contents are unspecified. All section names with the +prefix .debug are reserved for future use in the +ABI. +

    .dynamic
    +This section holds dynamic linking information. +The section's attributes will include the SHF_ALLOC bit. +Whether the SHF_WRITE bit is set is processor specific. +See Chapter 5 for more information. +

    .dynstr
    +This section holds strings needed for dynamic linking, +most commonly the strings +that represent the names associated with symbol table entries. +See Chapter 5 for more information. +

    .dynsym
    +This section holds the dynamic linking symbol table, +as described in +``Symbol Table''. +See Chapter 5 for more information. +

    .fini
    +This section holds executable instructions that contribute +to the process termination code. +That is, when a program exits normally, the system arranges +to execute the code in this section. +

    .fini_array
    +This section holds an array of function pointers that contributes +to a single termination array for the executable or shared +object containing the section. +

    .got
    +This section holds the global offset table. +See ``Coding Examples'' in Chapter 3, ``Special Sections'' in +Chapter 4, and ``Global Offset Table'' in Chapter 5 of the +processor supplement for more information. +

    .hash
    +This section holds a symbol hash table. +See +``Hash Table'' +in Chapter 5 for more information. +

    .init
    +This section holds executable instructions that contribute +to the process initialization code. +When a program starts to run, the system arranges +to execute the code in this section before calling the +main program entry point (called main for C programs). +

    .init_array
    +This section holds an array of function pointers that contributes +to a single initialization array for the executable or shared +object containing the section. +

    .interp
    +This section holds the path name of a program interpreter. +If the file has a loadable segment that includes +relocation, the sections' attributes will include the +SHF_ALLOC bit; otherwise, that bit will be off. +See Chapter 5 for more information. +

    .line
    +This section holds line number information for symbolic +debugging, which describes +the correspondence between the source program and the +machine code. +The contents are unspecified. +

    .note
    +This section holds information in the format that +``Note Section'' +in Chapter 5 describes. +

    .plt
    +This section holds the procedure linkage table. +See ``Special Sections'' in Chapter 4 and ``Procedure Linkage +Table'' in Chapter 5 of the processor supplement for more +information. +

    .preinit_array
    +This section holds an array of function pointers that contributes +to a single pre-initialization array for the executable or shared +object containing the section. +

    .relname and .relaname
    +These sections hold relocation information, as described in +``Relocation''. +If the file has a loadable segment that includes +relocation, the sections' attributes will include the +SHF_ALLOC bit; otherwise, that bit will be off. +Conventionally, name +is supplied by the section to which the relocations apply. +Thus a relocation section for .text +normally would have the name .rel.text or .rela.text. +

    .rodata and .rodata1
    +These sections hold read-only data that +typically contribute to a non-writable segment +in the process image. +See +``Program Header'' +in Chapter 5 for more information. +

    .shstrtab
    +This section holds section names. +

    .strtab
    +This section holds strings, most commonly the strings +that represent the names associated with symbol table entries. +If the file has a loadable segment that includes the +symbol string table, the section's attributes will include the +SHF_ALLOC +bit; otherwise, that bit will be off. +

    .symtab
    +This section holds a symbol table, as +``Symbol Table'' +in this chapter describes. +If the file has a loadable segment that includes the +symbol table, the section's attributes will include the +SHF_ALLOC bit; otherwise, that bit will be off. +

    .symtab_shndx
    +This section holds the special symbol table section index +array, as described above. The section's attributes will include +the SHF_ALLOC bit if the associated symbol table +section does; otherwise that bit will be off. + +

    .tbss
    +This section holds uninitialized thread-local data that contribute +to the program's memory image. +By definition, +the system initializes the data with zeros +when the data is instantiated for each new execution flow. +The section occupies no file space, as indicated by the section type, +SHT_NOBITS. +Implementations need not support thread-local storage. + +

    .tdata
    +This section holds initialized thread-local data that contributes +to the program's memory image. +A copy of its contents is instantiated by the system +for each new execution flow. +Implementations need not support thread-local storage. +

    .text
    +This section holds the ``text,'' or executable +instructions, of a program. +
    +

    +Section names with a dot (.) prefix +are reserved for the system, +although applications may use these sections +if their existing meanings are satisfactory. +Applications may use names without the prefix to +avoid conflicts with system sections. +The object file format lets one define sections not +shown in the previous list. +An object file may have more than one section +with the same name. +

    +Section names reserved for a processor architecture +are formed by placing an abbreviation of the architecture +name ahead of the section name. +The name should be taken from the +architecture names used for e_machine. +For instance .FOO.psect is the psect +section defined by the FOO architecture. +Existing extensions are called by their historical names. +

    + + + + + + + + + + + + + + + + + + + + + +
    Pre-existing Extensions
    .sdata     .tdesc
    .sbss      .lit4
    .lit8      .reginfo
    .gptab     .liblist
    .conflict
    +


    +NOTE: +For information on processor-specific sections, +see the ABI supplement for the desired processor. +
    + + +© 1997, 1998, 1999, 2000, 2001 The Santa Cruz Operation, Inc. All rights reserved. + + + diff --git a/specs/sysv-abi-update.html/ch4.strtab.html b/specs/sysv-abi-update.html/ch4.strtab.html new file mode 100644 index 0000000..6915c0c --- /dev/null +++ b/specs/sysv-abi-update.html/ch4.strtab.html @@ -0,0 +1,124 @@ + +String Table

    +

    String Table

    +String table sections hold null-terminated character sequences, +commonly called strings. +The object file uses these strings to represent symbol and section names. +One references a string as an index into the +string table section. +The first byte, which is index zero, is defined to hold +a null character. +Likewise, a string table's last byte is defined to hold +a null character, ensuring null termination for all strings. +A string whose index is zero specifies +either no name or a null name, depending on the context. +An empty string table section is permitted; its section header's sh_size +member would contain zero. +Non-zero indexes are invalid for an empty string table. +

    +A section header's sh_name +member holds an index into the section header string table +section, as designated by the e_shstrndx +member of the ELF header. +The following figures show a string table with 25 bytes +and the strings associated with various indexes. +

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Index  +0  +1  +2  +3  +4  +5  +6  +7  +8  +9
    0      \0  n   a   m   e   .   \0  V   a   r
    10     i   a   b   l   e   \0  a   b   l   e
    20     \0  \0  x   x   \0
    +


    +Figure 4-15: String Table Indexes +

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Index  String
    0      none
    1      name.
    7      Variable
    11     able
    16     able
    24     null string
    +


    +

    +As the example shows, a string table index may refer +to any byte in the section. +A string may appear more than once; +references to substrings may exist; +and a single string may be referenced multiple times. +Unreferenced strings also are allowed. +
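The index rules can be made concrete with a small lookup routine run against the 25-byte table from the figures. This is a sketch under the assumption that the whole section is already in memory. strtab_get and example_strtab are hypothetical names, and returning NULL for an invalid index is a choice made for this example rather than something the specification prescribes.

```c
#include <stddef.h>

/* Returns the string that starts at index, or NULL if the index is invalid.
   Index 0 always denotes the empty (or absent) name, even for an empty
   table; every other index must lie inside a null-terminated table. */
static const char *strtab_get(const char *tab, size_t size, size_t index)
{
    if (index == 0)
        return "";
    if (index >= size)
        return NULL;          /* includes every non-zero index of an empty table */
    if (tab[size - 1] != '\0')
        return NULL;          /* last byte must hold a null character */
    return tab + index;
}

/* The 25-byte table from the figures; the final byte is the implicit
   terminator of the C string literal. */
static const char example_strtab[25] = "\0name.\0Variable\0able\0\0xx";
```

Indexes 11 and 16 both yield "able": the first points into the tail of "Variable", which is the substring reference the text describes.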


    + + +© 1997, 1998, 1999, 2000, 2001 The Santa Cruz Operation, Inc. All rights reserved. + + + diff --git a/specs/sysv-abi-update.html/ch4.symtab.html b/specs/sysv-abi-update.html/ch4.symtab.html new file mode 100644 index 0000000..c1031c7 --- /dev/null +++ b/specs/sysv-abi-update.html/ch4.symtab.html @@ -0,0 +1,592 @@ + +Symbol Table

    +

    Symbol Table

    +An object file's symbol table holds information +needed to locate and relocate a program's symbolic +definitions and references. +A symbol table index is a subscript into this array. +Index 0 both designates the first entry in the table +and serves as the undefined symbol index. The contents of the +initial entry are specified later in this section. +

    + + + + + + + +
    Name       Value
    STN_UNDEF  0
    +

    +A symbol table entry has the following format. +


    +Figure 4-16: Symbol Table Entry +

    +

    +
    +typedef struct {
    +	Elf32_Word	st_name;
    +	Elf32_Addr	st_value;
    +	Elf32_Word	st_size;
    +	unsigned char	st_info;
    +	unsigned char	st_other;
    +	Elf32_Half	st_shndx;
    +} Elf32_Sym;
    +
    +typedef struct {
    +	Elf64_Word	st_name;
    +	unsigned char	st_info;
    +	unsigned char	st_other;
    +	Elf64_Half	st_shndx;
    +	Elf64_Addr	st_value;
    +	Elf64_Xword	st_size;
    +} Elf64_Sym;
    +
    + +
    +

    +

    +

    st_name
    +This member holds an index into the object file's +symbol string table, which +holds the character representations of the symbol names. +If the value is non-zero, it represents a string table +index that gives the symbol name. +Otherwise, the symbol table entry has no name. +
    +
    +NOTE: +External C symbols have the same names in C +and object files' symbol tables. +

    +

    +

    st_value
    +This member gives the value of the associated symbol. +Depending on the context, this may be an absolute value, +an address, and so on; details appear below. +

    st_size
    +Many symbols have associated sizes. +For example, a data object's size is the number +of bytes contained in the object. +This member holds 0 if the symbol has no size or an unknown size. +

    st_info
    +This member specifies the symbol's type and binding attributes. +A list of the values and meanings appears below. +The following code shows how to manipulate the values for +both 32-bit and 64-bit objects. +
    +
    +   #define ELF32_ST_BIND(i)   ((i)>>4)
    +   #define ELF32_ST_TYPE(i)   ((i)&0xf)
    +   #define ELF32_ST_INFO(b,t) (((b)<<4)+((t)&0xf))
    +
    +   #define ELF64_ST_BIND(i)   ((i)>>4)
    +   #define ELF64_ST_TYPE(i)   ((i)&0xf)
    +   #define ELF64_ST_INFO(b,t) (((b)<<4)+((t)&0xf))
    +
    +
    + +

    st_other
    +This member currently specifies a symbol's visibility. +A list of the values and meanings appears below. +The following code shows how to manipulate the values for +both 32-bit and 64-bit objects. Other bits contain 0 and have +no defined meaning. +
    +
    +   #define ELF32_ST_VISIBILITY(o) ((o)&0x3)
    +   #define ELF64_ST_VISIBILITY(o) ((o)&0x3)
    +
    +
    +

    st_shndx
    +Every symbol table entry is defined in relation +to some section. This member holds the relevant +section header table index. +As the sh_link and sh_info interpretation +table +and the related text describe, +some section indexes indicate special meanings. +

    +If this member contains SHN_XINDEX, +then the actual section header index is too large to fit in this field. +The actual value is contained in the associated +section of type SHT_SYMTAB_SHNDX. +

    +
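The SHN_XINDEX escape described for st_shndx amounts to one conditional lookup. The sketch below assumes the SHT_SYMTAB_SHNDX section data is available as an array parallel to the symbol table, in host byte order. The typedefs are local stand-ins and symbol_section_index is a hypothetical helper name.

```c
#include <stddef.h>
#include <stdint.h>

typedef uint32_t Elf32_Word;   /* local stand-ins for the ELF types */
typedef uint16_t Elf32_Half;

#define SHN_XINDEX 0xffff

/* Returns the section header index for symbol number sym_index.
   shndx_table is the data of the associated SHT_SYMTAB_SHNDX section,
   or NULL if the file has none. */
static Elf32_Word symbol_section_index(Elf32_Half st_shndx,
                                       const Elf32_Word *shndx_table,
                                       size_t sym_index)
{
    if (st_shndx == SHN_XINDEX && shndx_table != NULL)
        return shndx_table[sym_index];   /* real index is too large for st_shndx */
    return st_shndx;
}
```

The extended table is consulted only for the escape value, since its entries are SHN_UNDEF (0) for every other symbol.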

    +A symbol's binding determines the linkage visibility +and behavior. +


    +Figure 4-17: Symbol Binding +

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Name        Value
    STB_LOCAL   0
    STB_GLOBAL  1
    STB_WEAK    2
    STB_LOOS    10
    STB_HIOS    12
    STB_LOPROC  13
    STB_HIPROC  15
    +


    +
    +

    STB_LOCAL
    +Local symbols are not visible outside the object file +containing their definition. +Local symbols of the same name may exist in +multiple files without interfering with each other. +

    STB_GLOBAL
    +Global symbols are visible to all object files being combined. +One file's definition of a global symbol will satisfy +another file's undefined reference to the same global symbol. +

    STB_WEAK
    +Weak symbols resemble global symbols, but their +definitions have lower precedence. +

    STB_LOOS through STB_HIOS
    +Values in this inclusive range +are reserved for operating system-specific semantics. +

    STB_LOPROC through STB_HIPROC
    +Values in this inclusive range +are reserved for processor-specific semantics. If meanings are +specified, the processor supplement explains them. +
    +

    +Global and weak symbols differ in two major ways. +

      +

    • +When the link editor combines several relocatable object files, +it does not allow multiple definitions of STB_GLOBAL +symbols with the same name. +On the other hand, if a defined global symbol exists, +the appearance of a weak symbol with the same name +will not cause an error. +The link editor honors the global definition and ignores +the weak ones. +Similarly, if a common symbol exists +(that is, a symbol whose st_shndx +field holds SHN_COMMON), +the appearance of a weak symbol with the same name will +not cause an error. +The link editor honors the common definition and +ignores the weak ones. +

    • +When the link editor searches archive libraries [see ``Archive File'' +in Chapter 7], +it extracts archive members that contain definitions of +undefined global symbols. +The member's definition may be either a global or a weak symbol. +The link editor does not +extract archive members to resolve undefined weak symbols. +Unresolved weak symbols have a zero value. +
    + +
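The first of the two differences can be summarized as a small decision function over pairs of definitions with the same name. This is only a sketch of the rules stated above: combinations that the passage does not address (weak against weak, or anything involving two non-weak definitions other than global against global) are reported as NOT_COVERED rather than guessed. The enum and function names are hypothetical.

```c
enum def_kind { DEF_NONE, DEF_WEAK, DEF_COMMON, DEF_GLOBAL };
enum resolve_result { KEEP_OLD, TAKE_NEW, ERROR_MULTIPLE, NOT_COVERED };

/* Decides what happens when a definition new_def meets an existing
   definition old_def of the same symbol name. */
static enum resolve_result resolve(enum def_kind old_def, enum def_kind new_def)
{
    if (old_def == DEF_NONE)
        return TAKE_NEW;
    if (new_def == DEF_NONE)
        return KEEP_OLD;
    if (old_def == DEF_GLOBAL && new_def == DEF_GLOBAL)
        return ERROR_MULTIPLE;   /* multiple STB_GLOBAL definitions */
    if (new_def == DEF_WEAK && (old_def == DEF_GLOBAL || old_def == DEF_COMMON))
        return KEEP_OLD;         /* the weak definition is ignored */
    if (old_def == DEF_WEAK && (new_def == DEF_GLOBAL || new_def == DEF_COMMON))
        return TAKE_NEW;         /* global or common is honored over weak */
    return NOT_COVERED;          /* not specified by the rules above */
}
```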
    +NOTE: +The behavior of weak symbols in areas not specified by this document is +implementation defined. +Weak symbols are intended primarily for use in system software. +Applications using weak symbols are unreliable +since changes in the runtime environment +might cause the execution to fail. +

    +In each symbol table, all symbols with STB_LOCAL +binding precede the weak and global symbols. +As +``Sections'' +above describes, +a symbol table section's sh_info +section header member holds the symbol table index +for the first non-local symbol. +
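The ordering rule gives a simple consistency check: every entry below sh_info must have STB_LOCAL binding, and every entry at or above it must not. The sketch below works on an array holding just the st_info byte of each symbol, which is a simplification for illustration. locals_precede_globals is a hypothetical name.

```c
#include <stddef.h>
#include <stdint.h>

#define STB_LOCAL  0
#define STB_GLOBAL 1
#define STB_WEAK   2

#define ELF32_ST_BIND(i)    ((i) >> 4)
#define ELF32_ST_INFO(b, t) (((b) << 4) + ((t) & 0xf))

/* Returns 1 if sh_info is the index of the first non-local symbol and all
   local symbols come before it, 0 otherwise. st_info holds one st_info
   byte per symbol table entry. */
static int locals_precede_globals(const unsigned char *st_info, size_t count,
                                  uint32_t sh_info)
{
    if (sh_info > count)
        return 0;
    for (size_t i = 0; i < count; i++) {
        int is_local = ELF32_ST_BIND(st_info[i]) == STB_LOCAL;
        if (is_local != (i < sh_info))
            return 0;
    }
    return 1;
}
```

A tool that needs only global and weak symbols can therefore start scanning at index sh_info and skip the local entries.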

    +A symbol's type provides a general classification for +the associated entity. +


    +Figure 4-18: Symbol Types +

    + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Name         Value
    STT_NOTYPE   0
    STT_OBJECT   1
    STT_FUNC     2
    STT_SECTION  3
    STT_FILE     4
    STT_COMMON   5
    STT_TLS      6
    STT_LOOS     10
    STT_HIOS     12
    STT_LOPROC   13
    STT_HIPROC   15
    +


    +

    +

    +

    STT_NOTYPE
    +The symbol's type is not specified. +

    STT_OBJECT
    +The symbol is associated with a data object, +such as a variable, an array, and so on. +

    STT_FUNC
    +The symbol is associated with a function or other executable code. +

    STT_SECTION
    +The symbol is associated with a section. +Symbol table entries of this type exist primarily for relocation +and normally have STB_LOCAL binding. +

    STT_FILE
    +Conventionally, the symbol's name gives the name of +the source file associated with the object file. +A file symbol has STB_LOCAL +binding, its section index is SHN_ABS, +and it precedes the other STB_LOCAL +symbols for the file, if it is present. +

    STT_COMMON
    +The symbol labels an uninitialized common block. +See below for details. + +

    STT_TLS
    +The symbol specifies a Thread-Local Storage entity. +When defined, it gives the assigned offset for the symbol, +not the actual address. +Symbols of type STT_TLS can be referenced +only by special thread-local storage relocations, +and thread-local storage relocations can only reference +symbols with type STT_TLS. +Implementations need not support thread-local storage. +

    STT_LOOS through STT_HIOS
    +Values in this inclusive range +are reserved for operating system-specific semantics. +

    STT_LOPROC through STT_HIPROC
    +Values in this inclusive range +are reserved for processor-specific semantics. +If meanings are specified, the processor supplement explains +them. +
    +
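As a minimal sketch of how a tool might use the type values above, the following classifies the type held in the low four bits of `st_info`. The `ELF32_ST_*` macros repeat the definitions given with the symbol table entry earlier in this chapter; `classify_symbol_type` and the `TYPE_*` names are hypothetical helpers introduced only for illustration.

```c
/* st_info packing, as defined with the symbol table entry. */
#define ELF32_ST_BIND(i)    ((i) >> 4)
#define ELF32_ST_TYPE(i)    ((i) & 0xf)
#define ELF32_ST_INFO(b, t) (((b) << 4) + ((t) & 0xf))

/* Symbol type values from Figure 4-18. */
enum {
    STT_NOTYPE = 0, STT_OBJECT = 1, STT_FUNC = 2, STT_SECTION = 3,
    STT_FILE = 4, STT_COMMON = 5, STT_TLS = 6,
    STT_LOOS = 10, STT_HIOS = 12, STT_LOPROC = 13, STT_HIPROC = 15
};

enum type_class {
    TYPE_GENERIC,       /* STT_NOTYPE through STT_TLS */
    TYPE_OS_SPECIFIC,   /* STT_LOOS through STT_HIOS, inclusive */
    TYPE_PROC_SPECIFIC, /* STT_LOPROC through STT_HIPROC, inclusive */
    TYPE_UNASSIGNED     /* values not listed in Figure 4-18 (7 through 9) */
};

static enum type_class classify_symbol_type(unsigned char st_info)
{
    int type = ELF32_ST_TYPE(st_info);

    if (type <= STT_TLS)
        return TYPE_GENERIC;
    if (type >= STT_LOOS && type <= STT_HIOS)
        return TYPE_OS_SPECIFIC;
    if (type >= STT_LOPROC && type <= STT_HIPROC)
        return TYPE_PROC_SPECIFIC;
    return TYPE_UNASSIGNED;
}
```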

    +Function symbols (those with type +STT_FUNC) in shared object files have special significance. +When another object file references a function from +a shared object, the link editor automatically creates a procedure +linkage table entry for the referenced symbol. +Shared object symbols with types other than +STT_FUNC will not +be referenced automatically through the procedure linkage table. + +

    +Symbols with type STT_COMMON label uninitialized +common blocks. In relocatable objects, these symbols are +not allocated and must have the special section index +SHN_COMMON (see below). +In shared objects and executables these symbols must be +allocated to some section in the defining object. +

    +In relocatable objects, symbols with type STT_COMMON +are treated just as other symbols with index SHN_COMMON. +If the link-editor allocates space for the SHN_COMMON +symbol in an output section of the object it is producing, it +must preserve the type of the output symbol as STT_COMMON. +

    +When the dynamic linker encounters a reference to a symbol +that resolves to a definition of type STT_COMMON, +it may (but is not required to) change its symbol resolution +rules as follows: instead of binding the reference to +the first symbol found with the given name, the dynamic linker searches +for the first symbol with that name with type other +than STT_COMMON. If no such symbol is found, +it looks for the STT_COMMON definition of that +name that has the largest size. + +
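The optional resolution rule for STT_COMMON can be sketched as a single pass over the candidate definitions in search order. The `struct candidate` layout and the function name are hypothetical; how ties between equally sized common definitions are broken is not specified above, so this sketch simply keeps the earliest one.

```c
#include <stddef.h>
#include <string.h>

enum { STT_COMMON = 5 };

/* Hypothetical view of one definition, listed in dynamic linker search order. */
struct candidate {
    const char *name;
    int type;           /* STT_* value */
    unsigned long size; /* st_size */
};

/*
 * Returns the index of the chosen definition, or -1 if the name is not
 * defined. A definition whose type is not STT_COMMON wins as soon as it is
 * found; otherwise the largest STT_COMMON definition is chosen.
 */
static long resolve_with_common_rule(const struct candidate *defs, size_t n,
                                     const char *name)
{
    long best_common = -1;

    for (size_t i = 0; i < n; i++) {
        if (strcmp(defs[i].name, name) != 0)
            continue;
        if (defs[i].type != STT_COMMON)
            return (long)i;
        if (best_common < 0 || defs[i].size > defs[best_common].size)
            best_common = (long)i;
    }
    return best_common;
}
```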

    +A symbol's visibility, although it may be specified in a relocatable +object, defines how that symbol may be accessed once it has +become part of an executable or shared object. +


    +Figure 4-19: Symbol Visibility +

    Name            Value
    STV_DEFAULT     0
    STV_INTERNAL    1
    STV_HIDDEN      2
    STV_PROTECTED   3
    +


    +

    +

    +

    STV_DEFAULT
    +The visibility of symbols with the STV_DEFAULT +attribute is as specified by the symbol's binding type. +That is, global and weak symbols are visible +outside of their defining component +(executable file or shared object). +Local symbols are hidden, as described below. +Global and weak symbols are also preemptable, +that is, they may be preempted by definitions of the same +name in another component. +
    +NOTE: +An implementation may restrict the set of global and weak +symbols that are externally visible. +

    +

    STV_PROTECTED
    +A symbol defined in the current component is protected +if it is visible in other components but not preemptable, +meaning that any reference to such a symbol from within the +defining component must be resolved to the definition in +that component, even if there is a definition in another +component that would preempt by the default rules. +A symbol with STB_LOCAL binding may not have +STV_PROTECTED visibility. + +If a symbol definition with STV_PROTECTED visibility +from a shared object is taken as resolving a reference +from an executable or another shared object, +the SHN_UNDEF symbol table entry created +has STV_DEFAULT visibility. +
    +NOTE: + +The presence of the STV_PROTECTED flag on a symbol +in a given load module does not affect the symbol resolution +rules for references to that symbol from outside the containing +load module. +

    +

    STV_HIDDEN
    +A symbol defined in the current component is hidden +if its name is not visible to other components. Such a symbol +is necessarily protected. This attribute may be used to +control the external interface of a component. Note that +an object named by such a symbol may still be referenced +from another component if its address is passed outside. +

    +A hidden symbol contained in a relocatable object must be +either removed or converted to STB_LOCAL binding +by the link-editor when the relocatable object is included in an +executable file or shared object. +

    STV_INTERNAL
    +The meaning of this visibility attribute may be defined by processor +supplements to further constrain hidden symbols. A processor +supplement's definition should be such that generic tools +can safely treat internal symbols as hidden. +

    +An internal symbol contained in a relocatable object must be +either removed or converted to STB_LOCAL binding +by the link-editor when the relocatable object is included in an +executable file or shared object. +

    +

    +None of the visibility attributes affects resolution of symbols +within an executable or shared object during link-editing -- such +resolution is controlled by the binding type. Once the link-editor +has chosen its resolution, these attributes impose two requirements, +both based on the fact that references in the code being linked may +have been optimized to take advantage of the attributes. +

      +
    • +First, all of the non-default visibility attributes, when applied +to a symbol reference, imply that a definition to satisfy that +reference must be provided within the current executable or +shared object. If such a symbol reference has no definition within the +component being linked, then the reference must have +STB_WEAK binding and is resolved to zero. +
    • +Second, if any reference to or definition of a name is a symbol with +a non-default visibility attribute, the visibility attribute +must be propagated to the resolving symbol in the linked object. +If different visibility attributes are specified for distinct +references to or definitions of a symbol, the most constraining +visibility attribute must be propagated to the resolving symbol +in the linked object. The attributes, ordered from least +to most constraining, are: STV_PROTECTED, +STV_HIDDEN and STV_INTERNAL. +
    +
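Because the numeric visibility values do not follow the constraint ordering (STV_INTERNAL is 1 but is the most constraining), propagating the "most constraining" attribute needs an explicit ranking. The sketch below shows one way to do that; `visibility_rank` and `merge_visibility` are hypothetical names, and `ELF32_ST_VISIBILITY` repeats the macro given with the `st_other` member.

```c
/* Visibility values from Figure 4-19. */
enum { STV_DEFAULT = 0, STV_INTERNAL = 1, STV_HIDDEN = 2, STV_PROTECTED = 3 };

/* Visibility occupies the low two bits of st_other. */
#define ELF32_ST_VISIBILITY(o) ((o) & 0x3)

/* Rank from least to most constraining: default, protected, hidden, internal. */
static int visibility_rank(int vis)
{
    switch (vis) {
    case STV_PROTECTED: return 1;
    case STV_HIDDEN:    return 2;
    case STV_INTERNAL:  return 3;
    default:            return 0; /* STV_DEFAULT */
    }
}

/* Returns the visibility to record on the resolving symbol. */
static int merge_visibility(unsigned char st_other_a, unsigned char st_other_b)
{
    int a = ELF32_ST_VISIBILITY(st_other_a);
    int b = ELF32_ST_VISIBILITY(st_other_b);

    return visibility_rank(a) >= visibility_rank(b) ? a : b;
}
```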

    +If a symbol's value refers to a +specific location within a section, +its section index member, st_shndx, +holds an index into the section header table. +As the section moves during relocation, the symbol's value +changes as well, and references to the symbol +continue to ``point'' to the same location in the program. +Some special section index values give other semantics. +

    +

    SHN_ABS
    +The symbol has an absolute value that will not change +because of relocation. + +

    SHN_COMMON
    +The symbol labels a common block that has not yet been allocated. +The symbol's value gives alignment constraints, +similar to a section's +sh_addralign member. +The link editor will allocate the storage for the symbol +at an address that is a multiple of +st_value. +The symbol's size tells how many bytes are required. +Symbols with section index SHN_COMMON may +appear only in relocatable objects. +

    SHN_UNDEF
    +This section table index means the symbol is undefined. +When the link editor combines this object file with +another that defines the indicated symbol, +this file's references to the symbol will be linked +to the actual definition. +

    SHN_XINDEX
    + +This value is an escape value. +It indicates that the symbol refers to a specific location within a section, +but that the section header index for that section is too large to be +represented directly in the symbol table entry. +The actual section header index is found in the associated +SHT_SYMTAB_SHNDX section. +The entries in that section correspond one to one +with the entries in the symbol table. +Only those entries in SHT_SYMTAB_SHNDX +that correspond to symbol table entries with SHN_XINDEX +will hold valid section header indexes; +all other entries will have value 0. +
    +
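The SHN_XINDEX escape can be sketched as a lookup that falls back to the parallel SHT_SYMTAB_SHNDX table. The function name and the idea of passing the two tables as plain arrays are assumptions made for illustration; other special values such as SHN_ABS or SHN_COMMON are returned unchanged and remain the caller's responsibility.

```c
#include <stddef.h>
#include <stdint.h>

enum { SHN_UNDEF = 0, SHN_XINDEX = 0xffff };

/*
 * st_shndx:    the st_shndx member of each symbol table entry, in order.
 * shndx_table: contents of the associated SHT_SYMTAB_SHNDX section (one
 *              32-bit word per symbol), or NULL if the file has none.
 * i:           symbol table index.
 */
static uint32_t symbol_section_index(const uint16_t *st_shndx,
                                     const uint32_t *shndx_table, size_t i)
{
    if (st_shndx[i] == SHN_XINDEX && shndx_table != NULL)
        return shndx_table[i]; /* real index is too large for st_shndx */
    return st_shndx[i];
}
```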

    +The symbol table entry for index 0 (STN_UNDEF) +is reserved; it holds the following. +


    +Figure 4-20: Symbol Table Entry: Index 0 +

    Name       Value       Note
    st_name    0           No name
    st_value   0           Zero value
    st_size    0           No size
    st_info    0           No type, local binding
    st_other   0           Default visibility
    st_shndx   SHN_UNDEF   No section
    +


    + +

    Symbol Values

    +Symbol table entries for different object file types have +slightly different interpretations for the st_value member. +
      +

    • +In relocatable files, st_value holds alignment constraints for a symbol +whose section index is SHN_COMMON. +

    • +In relocatable files, st_value holds +a section offset for a defined symbol. +st_value is an offset from the beginning of the section that +st_shndx identifies. +

    • +In executable and shared object files, +st_value holds a virtual address. +To make these files' symbols more useful +for the dynamic linker, the section offset (file interpretation) +gives way to a virtual address (memory interpretation) +for which the section number is irrelevant. +
    +Although the symbol table values have similar meanings +for different object files, the data allows +efficient access by the appropriate programs. +
    + + +© 1997, 1998, 1999, 2000, 2001 The Santa Cruz Operation, Inc. All rights reserved. + + + diff --git a/specs/sysv-abi-update.html/ch5.dynamic.html b/specs/sysv-abi-update.html/ch5.dynamic.html new file mode 100644 index 0000000..da7bfb6 --- /dev/null +++ b/specs/sysv-abi-update.html/ch5.dynamic.html @@ -0,0 +1,1250 @@ + +Dynamic Linking

    +

    Dynamic Linking

    + +

    Program Interpreter

    +An executable file that participates in +dynamic linking shall have one +PT_INTERP program header element. +During +exec(BA_OS), +the system retrieves a path name from the PT_INTERP +segment and creates the initial process image from +the interpreter file's segments. That is, +instead of using the original executable file's +segment images, the system composes a memory +image for the interpreter. +It then is the interpreter's responsibility to +receive control from the system and provide an +environment for the application program. +
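As a rough sketch of the first step, the following locates the interpreter path name in a file image by scanning for a PT_INTERP program header. `struct phdr_view` is a hypothetical, simplified stand-in that carries only the three program header fields used here (it is not the layout of `Elf32_Phdr` or `Elf64_Phdr`), and `find_interp_path` is an illustrative name.

```c
#include <stddef.h>
#include <stdint.h>

enum { PT_INTERP = 3 };

/* Hypothetical, simplified program header view: only the fields used here. */
struct phdr_view {
    uint32_t p_type;
    uint64_t p_offset; /* file offset of the segment */
    uint64_t p_filesz; /* size of the segment in the file */
};

/*
 * Returns a pointer to the null-terminated interpreter path inside the file
 * image, or NULL if there is no PT_INTERP element or it is malformed.
 */
static const char *find_interp_path(const unsigned char *image,
                                    size_t image_size,
                                    const struct phdr_view *phdrs,
                                    size_t phnum)
{
    for (size_t i = 0; i < phnum; i++) {
        if (phdrs[i].p_type != PT_INTERP)
            continue;

        uint64_t off = phdrs[i].p_offset;
        uint64_t len = phdrs[i].p_filesz;

        /* The segment must lie inside the image and end with a null byte. */
        if (len == 0 || off > image_size || len > image_size - off)
            return NULL;
        if (image[off + len - 1] != '\0')
            return NULL;
        return (const char *)(image + off);
    }
    return NULL;
}
```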

    +As ``Process Initialization'' in Chapter 3 of the +processor supplement mentions, +the interpreter receives control in one of two ways. +First, it may receive a file descriptor +to read the executable file, positioned at the beginning. +It can use this file descriptor to read and/or map the executable +file's segments into memory. +Second, depending on the executable file format, the system +may load the executable file into memory instead of giving the +interpreter an open file descriptor. +With the possible exception of the file descriptor, +the interpreter's initial process state matches +what the executable file would have received. +The interpreter itself may not require a second interpreter. +An interpreter may be either a shared object +or an executable file. +

      +

    • +A shared object (the normal case) is loaded as +position-independent, with addresses that may vary +from one process to another; the system creates its segments +in the dynamic segment area used by mmap(KE_OS) and related services +[see ``Virtual Address Space'' in Chapter 3 of the processor +supplement]. +Consequently, a shared object interpreter typically will +not conflict with the original executable file's +segment addresses. +

    • +An executable file may be loaded at fixed addresses; +if so, the system creates its segments +using the virtual addresses from the program header table. +Consequently, an executable file interpreter's +virtual addresses may collide with the +first executable file; the interpreter is responsible +for resolving conflicts. +
    + +

    Dynamic Linker

    +When building an executable file that uses dynamic linking, +the link editor adds a program header element of type +PT_INTERP to an executable file, telling the system to invoke +the dynamic linker as the program interpreter. +
    +NOTE: +The locations of the system provided dynamic +linkers are processor specific. +

    +Exec(BA_OS) +and the dynamic linker cooperate to +create the process image for the program, which entails +the following actions: +

      +

    • +Adding the executable file's memory segments to the process image; +

    • +Adding shared object memory segments to the process image; +

    • +Performing relocations for the executable file and its +shared objects; +

    • +Closing the file descriptor that was used to read the executable file, +if one was given to the dynamic linker; +

    • +Transferring control to the program, making it look as if +the program had received control directly from +exec(BA_OS). +
    +

    +The link editor also constructs various data +that assist the dynamic linker +for executable and shared object files. +As shown above in +``Program Header'', +this data resides +in loadable segments, making it available during execution. +(Once again, recall that the exact segment contents are processor-specific. +See the processor supplement for complete information.) +

      +

    • +A .dynamic section with type SHT_DYNAMIC +holds various data. +The structure residing at the +beginning of the section holds the addresses +of other dynamic linking information. +

    • +The .hash section with type SHT_HASH +holds a symbol hash table. +

    • +The .got and .plt sections with type +SHT_PROGBITS +hold two separate tables: +the global offset table and the procedure linkage table. +Chapter 3 discusses how programs use the global offset table +for position-independent code. +Sections below explain how the dynamic linker uses +and changes the tables to create memory images for object files. +
    +

    +Because every ABI-conforming program imports the basic system +services from a shared object library [See ``System Library'' +in Chapter 6], the dynamic linker participates in every +ABI-conforming program execution. +

    +As +``Program Loading'' explains in the processor supplement, +shared objects may occupy +virtual memory addresses that are different from the addresses recorded +in the file's program header table. +The dynamic linker relocates the memory image, updating +absolute addresses before the application gains control. +Although the absolute address values would be correct +if the library were loaded at +the addresses specified in the program header table, this normally +is not the case. +

    +If the process environment [see exec(BA_OS)] +contains a variable named LD_BIND_NOW +with a non-null value, the dynamic linker processes +all relocations before transferring control to the program. +For example, all the following environment entries +would specify this behavior. +

      +

    • +LD_BIND_NOW=1 +

    • +LD_BIND_NOW=on +

    • +LD_BIND_NOW=off +
    +Otherwise, LD_BIND_NOW either +does not occur in the environment or has a null value. +The dynamic linker is permitted to evaluate procedure linkage table +entries lazily, thus avoiding symbol resolution and relocation +overhead for functions that are not called. +See ``Procedure Linkage Table'' in this chapter of the processor +supplement for more information. + +
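The test on LD_BIND_NOW depends only on whether the variable has a non-null (non-empty) value, not on what the value says, which is why `LD_BIND_NOW=off` still requests immediate binding. A minimal sketch, with hypothetical function names:

```c
#include <stdlib.h>

/* Nonzero if the given LD_BIND_NOW value requests immediate binding. */
static int bind_now_value_requests_eager(const char *value)
{
    /* Absent (NULL) or empty means the dynamic linker may bind lazily. */
    return value != NULL && value[0] != '\0';
}

/* Same check applied to the current process environment. */
static int bind_now_from_environment(void)
{
    return bind_now_value_requests_eager(getenv("LD_BIND_NOW"));
}
```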

    Dynamic Section

    +If an object file participates in dynamic linking, +its program header table will have an element of type +PT_DYNAMIC. +This ``segment'' contains the .dynamic section. +A special symbol, _DYNAMIC, +labels the section, which contains +an array of the following structures. +


    +Figure 5-9: Dynamic Structure +

    +

    +
    +typedef struct {
    +	Elf32_Sword	d_tag;
    +   	union {
    +   		Elf32_Word	d_val;
    +   		Elf32_Addr	d_ptr;
    +	} d_un;
    +} Elf32_Dyn;
    +
    +extern Elf32_Dyn	_DYNAMIC[];
    +
    +typedef struct {
    +	Elf64_Sxword	d_tag;
    +   	union {
    +   		Elf64_Xword	d_val;
    +   		Elf64_Addr	d_ptr;
    +	} d_un;
    +} Elf64_Dyn;
    +
    +extern Elf64_Dyn	_DYNAMIC[];
    +
    +
    +
    +

    +For each object with this type, d_tag +controls the interpretation of d_un. +

    +

    d_val
    +These objects represent integer values with various +interpretations. +

    d_ptr
    +These objects represent program virtual addresses. +As mentioned previously, a file's virtual addresses +might not match the memory virtual addresses during execution. +When interpreting addresses contained in the dynamic +structure, the dynamic linker computes actual addresses, +based on the original file value and the memory base address. +For consistency, files do not +contain relocation entries to ``correct'' addresses in the dynamic +structure. +
    +

    + +To make it simpler for tools to interpret the contents of +dynamic section entries, the value of each tag, except for those in +two special compatibility ranges, +will determine the interpretation of the d_un +union. A tag whose value is an even number +indicates a dynamic section entry that uses d_ptr. +A tag whose value is an odd number indicates a dynamic section entry +that uses d_val or that uses neither d_ptr +nor d_val. Tags whose values are less +than the special value DT_ENCODING and tags +whose values fall between DT_HIOS and +DT_LOPROC do not follow these rules. +
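The even/odd convention lets a tool interpret tags it does not recognize. The sketch below applies it, deferring to the tag table for the two ranges that are exempt; the enum and function names are hypothetical, and the constants are the values listed in Figure 5-10.

```c
#include <stdint.h>

/* Tag values from Figure 5-10. */
enum {
    DT_ENCODING = 32,
    DT_LOOS     = 0x6000000D,
    DT_HIOS     = 0x6ffff000,
    DT_LOPROC   = 0x70000000,
    DT_HIPROC   = 0x7fffffff
};

enum dyn_union_use {
    DYN_CONSULT_TABLE,  /* parity rule does not apply; look the tag up */
    DYN_USE_PTR,        /* even tag: entry uses d_ptr */
    DYN_USE_VAL_OR_NONE /* odd tag: entry uses d_val, or neither member */
};

static enum dyn_union_use dyn_union_by_parity(int64_t tag)
{
    /* Tags below DT_ENCODING predate the rule. */
    if (tag < DT_ENCODING)
        return DYN_CONSULT_TABLE;
    /* Tags strictly between DT_HIOS and DT_LOPROC are also exempt. */
    if (tag > DT_HIOS && tag < DT_LOPROC)
        return DYN_CONSULT_TABLE;
    return (tag % 2 == 0) ? DYN_USE_PTR : DYN_USE_VAL_OR_NONE;
}
```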

    +The following table summarizes the tag requirements +for executable and shared object files. +If a tag is marked ``mandatory'', the dynamic linking +array for an ABI-conforming file must have an entry of that type. +Likewise, ``optional'' means an entry for the tag may appear +but is not required. +


    +Figure 5-10: Dynamic Array Tags, d_tag +

    Name                 Value        d_un          Executable    Shared Object
    DT_NULL              0            ignored       mandatory     mandatory
    DT_NEEDED            1            d_val         optional      optional
    DT_PLTRELSZ          2            d_val         optional      optional
    DT_PLTGOT            3            d_ptr         optional      optional
    DT_HASH              4            d_ptr         mandatory     mandatory
    DT_STRTAB            5            d_ptr         mandatory     mandatory
    DT_SYMTAB            6            d_ptr         mandatory     mandatory
    DT_RELA              7            d_ptr         mandatory     optional
    DT_RELASZ            8            d_val         mandatory     optional
    DT_RELAENT           9            d_val         mandatory     optional
    DT_STRSZ             10           d_val         mandatory     mandatory
    DT_SYMENT            11           d_val         mandatory     mandatory
    DT_INIT              12           d_ptr         optional      optional
    DT_FINI              13           d_ptr         optional      optional
    DT_SONAME            14           d_val         ignored       optional
    DT_RPATH*            15           d_val         optional      ignored
    DT_SYMBOLIC*         16           ignored       ignored       optional
    DT_REL               17           d_ptr         mandatory     optional
    DT_RELSZ             18           d_val         mandatory     optional
    DT_RELENT            19           d_val         mandatory     optional
    DT_PLTREL            20           d_val         optional      optional
    DT_DEBUG             21           d_ptr         optional      ignored
    DT_TEXTREL*          22           ignored       optional      optional
    DT_JMPREL            23           d_ptr         optional      optional
    DT_BIND_NOW*         24           ignored       optional      optional
    DT_INIT_ARRAY        25           d_ptr         optional      optional
    DT_FINI_ARRAY        26           d_ptr         optional      optional
    DT_INIT_ARRAYSZ      27           d_val         optional      optional
    DT_FINI_ARRAYSZ      28           d_val         optional      optional
    DT_RUNPATH           29           d_val         optional      optional
    DT_FLAGS             30           d_val         optional      optional
    DT_ENCODING          32           unspecified   unspecified   unspecified
    DT_PREINIT_ARRAY     32           d_ptr         optional      ignored
    DT_PREINIT_ARRAYSZ   33           d_val         optional      ignored
    DT_LOOS              0x6000000D   unspecified   unspecified   unspecified
    DT_HIOS              0x6ffff000   unspecified   unspecified   unspecified
    DT_LOPROC            0x70000000   unspecified   unspecified   unspecified
    DT_HIPROC            0x7fffffff   unspecified   unspecified   unspecified
    +

    +* Signifies an entry that is at level 2. +


    +
    +

    DT_NULL
    +An entry with a DT_NULL tag marks the end of the +_DYNAMIC array. +

    DT_NEEDED
    +This element holds the string table offset of a null-terminated string, +giving the name of a needed library. +The offset is an index into the table recorded in the DT_STRTAB entry. +See +``Shared Object Dependencies'' +for more +information about these names. +The dynamic array may contain multiple entries with +this type. +These entries' relative order is significant, though their +relation to entries of other types is not. +

    DT_PLTRELSZ
    +This element holds the total size, in bytes, +of the relocation entries associated with the procedure linkage table. +If an entry of type DT_JMPREL is present, a +DT_PLTRELSZ must accompany it. +

    DT_PLTGOT
    +This element holds an address associated with the procedure linkage table +and/or the global offset table. +See this section in the processor supplement for details. +

    DT_HASH
    +This element holds the address of the symbol hash table, +described in +``Hash Table''. +This hash table refers to the symbol table referenced by the DT_SYMTAB +element. +

    DT_STRTAB
    +This element holds the address of the string table, +described in Chapter 4. +Symbol names, library names, and other strings reside +in this table. +

    DT_SYMTAB
    +This element holds the address of the symbol table, +described in Chapter 4, with Elf32_Sym +entries for the 32-bit class of files and Elf64_Sym +entries for the 64-bit class of files. +

    DT_RELA
    +This element holds the address of a relocation table, +described in Chapter 4. +Entries in the table have explicit addends, such as +Elf32_Rela for the 32-bit file class +or Elf64_Rela for the 64-bit file class. +An object file may have multiple relocation sections. +When building the relocation table for an +executable or shared object file, the link editor +catenates those sections to form a single table. +Although the sections remain independent in the object file, +the dynamic linker sees a single table. +When the dynamic linker creates the process image for +an executable file or adds a shared object to the +process image, it reads the relocation table and performs +the associated actions. +If this element is present, the dynamic structure must also have +DT_RELASZ and DT_RELAENT elements. +When relocation is ``mandatory'' for a file, either +DT_RELA or DT_REL may occur (both are permitted but not required). +

    DT_RELASZ
    +This element holds the total size, in bytes, of the +DT_RELA relocation table. +

    DT_RELAENT
    +This element holds the size, in bytes, of the +DT_RELA relocation entry. +

    DT_STRSZ
    +This element holds the size, in bytes, of the string table. +

    DT_SYMENT
    +This element holds the size, in bytes, of a symbol table entry. +

    DT_INIT
    +This element holds the address of the initialization function, +discussed in +``Initialization and Termination Functions'' +below. +

    DT_FINI
    +This element holds the address of the termination function, +discussed in +``Initialization and Termination Functions'' +below. +

    DT_SONAME
    +This element holds the string table offset of a null-terminated string, +giving the name of the shared object. +The offset is an index into the table recorded in the DT_STRTAB entry. +See +``Shared Object Dependencies'' +below for more +information about these names. + +

    DT_RPATH
    +This element holds the string table offset of a null-terminated +library search path string discussed in +``Shared Object Dependencies''. +The offset is an index into the table recorded in the +DT_STRTAB entry. This entry is at level 2. Its +use has been superseded by DT_RUNPATH. +

    DT_SYMBOLIC
    +This element's presence in a shared object library alters +the dynamic linker's symbol resolution algorithm for +references within the library. +Instead of starting a symbol search with the +executable file, the dynamic linker starts from the +shared object itself. +If the shared object fails to supply the referenced +symbol, the dynamic linker then searches the +executable file and other shared objects as usual. +This entry is at level 2. Its use has been superseded +by the DF_SYMBOLIC flag. +

    DT_REL
    +This element is similar to DT_RELA, +except its table has implicit addends, such as +Elf32_Rel for the 32-bit file class +or Elf64_Rel for the 64-bit file class. +If this element is present, the dynamic structure must also have +DT_RELSZ and DT_RELENT elements. +

    DT_RELSZ
    +This element holds the total size, in bytes, of the +DT_REL relocation table. +

    DT_RELENT
    +This element holds the size, in bytes, of the +DT_REL relocation entry. +

    DT_PLTREL
    +This member specifies the type of relocation entry +to which the procedure linkage table refers. +The d_val member holds DT_REL or DT_RELA, +as appropriate. +All relocations in a procedure linkage table must use +the same relocation. +

    DT_DEBUG
    +This member is used for debugging. Its contents are not specified +for the ABI; programs that access this entry are not +ABI-conforming. +

    DT_TEXTREL
    +This member's absence signifies that no +relocation entry should cause a modification to a non-writable +segment, as specified by the segment permissions in the program +header table. +If this member is present, one or more relocation entries might +request modifications to a non-writable segment, and the dynamic +linker can prepare accordingly. +This entry is at level 2. Its use has been superseded +by the DF_TEXTREL flag. +

    DT_JMPREL
    +If present, this entry's d_ptr +member holds the address of relocation entries associated solely +with the procedure linkage table. +Separating these relocation entries lets the dynamic linker ignore +them during process initialization, if lazy binding is enabled. +If this entry is present, the related entries of types +DT_PLTRELSZ and DT_PLTREL +must also be present. +

    DT_BIND_NOW
    +If present in a shared object or executable, this entry +instructs the dynamic linker to process all relocations +for the object containing this entry before transferring +control to the program. +The presence of this entry takes +precedence over a directive to use lazy binding for this object when +specified through the environment or via dlopen(BA_LIB). +This entry is at level 2. Its use has been superseded +by the DF_BIND_NOW flag. +

    DT_INIT_ARRAY
    + +This element holds the address of the array of pointers to initialization +functions, +discussed in +``Initialization and Termination Functions'' +below. +

    DT_FINI_ARRAY
    +This element holds the address of the array of pointers to termination +functions, +discussed in +``Initialization and Termination Functions'' +below. +

    DT_INIT_ARRAYSZ
    +This element holds the size in bytes of the array of initialization +functions pointed to by the DT_INIT_ARRAY entry. +If an object has a DT_INIT_ARRAY entry, it must +also have a DT_INIT_ARRAYSZ entry. +

    DT_FINI_ARRAYSZ
    +This element holds the size in bytes of the array of termination +functions pointed to by the DT_FINI_ARRAY entry. +If an object has a DT_FINI_ARRAY entry, it must +also have a DT_FINI_ARRAYSZ entry. + +

    DT_RUNPATH
    +This element holds the string table offset of a null-terminated +library search path string discussed in +``Shared Object Dependencies''. +The offset is an index into the table recorded in the +DT_STRTAB entry. +

    DT_FLAGS
    +This element holds flag values specific to the object being +loaded. Each flag value will have the name DF_flag_name. +Defined values and their meanings are described below. +All other values are reserved. +

    DT_PREINIT_ARRAY
    +This element holds the address of the array of pointers to pre-initialization +functions, +discussed in +``Initialization and Termination Functions'' +below. The DT_PREINIT_ARRAY table is processed only +in an executable file; it is ignored if contained in a shared object. +

    DT_PREINIT_ARRAYSZ
    +This element holds the size in bytes of the array of pre-initialization +functions pointed to by the DT_PREINIT_ARRAY entry. +If an object has a DT_PREINIT_ARRAY entry, it must +also have a DT_PREINIT_ARRAYSZ entry. As with +DT_PREINIT_ARRAY, this entry is ignored if it appears +in a shared object. +

    DT_ENCODING
    +Values greater than or equal to DT_ENCODING +and less than DT_LOOS +follow the rules for the interpretation of the d_un union +described above. +

    DT_LOOS through DT_HIOS
    +Values in this inclusive range +are reserved for operating system-specific semantics. +All such values follow the rules for the interpretation of the +d_un union described above. +

    DT_LOPROC through DT_HIPROC
    +Values in this inclusive range +are reserved for processor-specific semantics. If meanings +are specified, the processor supplement explains them. +All such values follow the rules for the interpretation of the +d_un union described above. +
    +

    +Except for the DT_NULL element at the end of the array, +and the relative order of DT_NEEDED +elements, entries may appear in any order. +Tag values not appearing in the table are reserved. + +
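Since DT_NULL terminates the array and DT_NEEDED entries keep their relative order, a consumer can walk the array once and pick out dependency names by position. The following is a sketch under stated assumptions: `Dyn64` is a local stand-in mirroring `Elf64_Dyn` from Figure 5-9, `needed_name` is a hypothetical helper, and the caller is assumed to have already located the string table that DT_STRTAB refers to.

```c
#include <stddef.h>
#include <stdint.h>

enum { DT_NULL = 0, DT_NEEDED = 1, DT_STRTAB = 5 };

/* Local stand-in with the same shape as Elf64_Dyn in Figure 5-9. */
typedef struct {
    int64_t d_tag;
    union {
        uint64_t d_val;
        uint64_t d_ptr;
    } d_un;
} Dyn64;

/*
 * Returns the name of the k-th DT_NEEDED entry (counting from 0, in array
 * order), or NULL if there are fewer than k + 1 such entries.
 * strtab must point at the start of the dynamic string table.
 */
static const char *needed_name(const Dyn64 *dyn, const char *strtab, size_t k)
{
    size_t seen = 0;

    for (const Dyn64 *d = dyn; d->d_tag != DT_NULL; d++) {
        if (d->d_tag != DT_NEEDED)
            continue; /* entries of other types may be interleaved freely */
        if (seen == k)
            return strtab + d->d_un.d_val; /* d_val is a string table offset */
        seen++;
    }
    return NULL;
}
```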


    +Figure 5-11: DT_FLAGS values +

    Name            Value
    DF_ORIGIN       0x1
    DF_SYMBOLIC     0x2
    DF_TEXTREL      0x4
    DF_BIND_NOW     0x8
    DF_STATIC_TLS   0x10
    +


    +
    +

    DF_ORIGIN
    +This flag signifies that the object being loaded may make reference +to the $ORIGIN substitution string (see ``Substitution Sequences''). +The dynamic linker must determine the pathname of the object +containing this entry when the object is loaded. + +

    DF_SYMBOLIC
    +If this flag is set in a shared object library, +the dynamic linker's symbol resolution algorithm for +references within the library is changed. +Instead of starting a symbol search with the +executable file, the dynamic linker starts from the +shared object itself. +If the shared object fails to supply the referenced +symbol, the dynamic linker then searches the +executable file and other shared objects as usual. + +

    DF_TEXTREL
    +If this flag is not set, no +relocation entry should cause a modification to a non-writable +segment, as specified by the segment permissions in the program +header table. +If this flag is set, one or more relocation entries might +request modifications to a non-writable segment, and the dynamic +linker can prepare accordingly. + +

    DF_BIND_NOW
    +If set in a shared object or executable, this flag +instructs the dynamic linker to process all relocations +for the object containing this entry before transferring +control to the program. +The presence of this entry takes +precedence over a directive to use lazy binding for this object when +specified through the environment or via dlopen(BA_LIB). + +

    DF_STATIC_TLS
    +If set in a shared object or executable, +this flag instructs the dynamic linker to reject +attempts to load this file dynamically. +It indicates that the shared object or executable +contains code using a static thread-local storage scheme. +Implementations need not support any form of thread-local storage. +
    + +

    Shared Object Dependencies

    +When the link editor processes an archive library, it extracts library members and copies them into the output object file. These statically linked services are available during execution without involving the dynamic linker. Shared objects also provide services, and the dynamic linker must attach the proper shared object files to the process image for execution.

    +When the dynamic linker creates the memory segments for an object file, the dependencies (recorded in DT_NEEDED entries of the dynamic structure) tell what shared objects are needed to supply the program's services. By repeatedly connecting referenced shared objects and their dependencies, the dynamic linker builds a complete process image. When resolving symbolic references, the dynamic linker examines the symbol tables with a breadth-first search. That is, it first looks at the symbol table of the executable program itself, then at the symbol tables of the DT_NEEDED entries (in order), and then at the second-level DT_NEEDED entries, and so on. Shared object files must be readable by the process; other permissions are not required.


    +NOTE: Even when a shared object is referenced multiple times in the dependency list, the dynamic linker will connect the object only once to the process.
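The breadth-first search order and the connect-only-once rule described above can be sketched as a queue walk over the DT_NEEDED graph. This is an illustrative model only: struct object, the fixed-size limits, and the function search_order are hypothetical names, and real dynamic linkers read the dependencies from each object's dynamic section rather than from an in-memory table.

```c
#include <assert.h>
#include <stddef.h>

#define MAX_OBJECTS 16          /* Assumed limit for this sketch. */
#define MAX_NEEDED 4            /* Assumed limit for this sketch. */

/* Hypothetical in-memory model of an object and its DT_NEEDED entries. */
struct object
  {
    const char *name;
    int needed[MAX_NEEDED];     /* Indexes into the object table. */
    int needed_cnt;
  };

/* Fills ORDER with the indexes of the objects reachable from ROOT in
   breadth-first order, listing each object once, and returns the
   number of entries written.  ORDER must have room for OBJ_CNT
   entries; it doubles as the work queue. */
size_t
search_order (const struct object *objs, size_t obj_cnt, int root,
              int *order)
{
  int seen[MAX_OBJECTS] = { 0 };
  size_t head = 0, tail = 0;

  assert (obj_cnt <= MAX_OBJECTS);
  assert (root >= 0 && (size_t) root < obj_cnt);

  order[tail++] = root;
  seen[root] = 1;
  while (head < tail)
    {
      const struct object *o = &objs[order[head++]];
      int i;

      for (i = 0; i < o->needed_cnt; i++)
        {
          int dep = o->needed[i];

          assert (dep >= 0 && (size_t) dep < obj_cnt);
          if (!seen[dep])
            {
              /* First reference: connect the object.  Later
                 references to the same object are skipped. */
              seen[dep] = 1;
              order[tail++] = dep;
            }
        }
    }
  return tail;
}
```

For an executable that needs liba and libb, where liba in turn needs libd, the resulting order is the executable, liba, libb, libd: both first-level dependencies come before any second-level one.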

    +

    +Names in the dependency list are copies either of the DT_SONAME strings or the path names of the shared objects used to build the object file. For example, if the link editor builds an executable file using one shared object with a DT_SONAME entry of lib1 and another shared object library with the path name /usr/lib/lib2, the executable file will contain lib1 and /usr/lib/lib2 in its dependency list.
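The choice of dependency name in the example above can be sketched as a small selection function. The sketch assumes that a link editor records the DT_SONAME string when the shared object has one and falls back to the path name otherwise; struct shared_input and dependency_name are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical description of a shared object seen at link time. */
struct shared_input
  {
    const char *path;           /* Path name given to the link editor. */
    const char *soname;         /* DT_SONAME string, or NULL if absent. */
  };

/* Returns the string to record in the dependency list: the DT_SONAME
   string if the object has one (an assumption of this sketch),
   otherwise the path name used to build the object file. */
const char *
dependency_name (const struct shared_input *in)
{
  assert (in != NULL && in->path != NULL);
  return in->soname != NULL ? in->soname : in->path;
}
```

With this rule, an input whose DT_SONAME is lib1 yields lib1, and an input with no DT_SONAME and the path name /usr/lib/lib2 yields /usr/lib/lib2.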

    +If a shared object name has one or more slash (/) characters anywhere in the name, such as /usr/lib/lib2 or directory/file, the dynamic linker uses that string directly as the path name. If the name has no slashes, such as lib1, three facilities specify shared object path searching.