Skip to content
Browse files

Import flex-2.5.4

  • Loading branch information...
1 parent 4087210 commit c166d6db498ff5ddfbd4bf535d8179bd6f4b04bb mikel committed Dec 10, 1996
Showing with 16,897 additions and 7,414 deletions.
  1. +29 −43 usr.bin/lex/COPYING
  2. +185 −0 usr.bin/lex/FlexLexer.h
  3. +1,233 −0 usr.bin/lex/NEWS
  4. +87 −126 usr.bin/lex/ccl.c
  5. +26 −0 usr.bin/lex/config.h
  6. +816 −808 usr.bin/lex/dfa.c
  7. +145 −282 usr.bin/lex/ecs.c
  8. +4,060 −0 usr.bin/lex/flex.1
  9. +1,541 −0 usr.bin/lex/flex.skl
  10. +474 −274 usr.bin/lex/flexdef.h
  11. +1,290 −1,012 usr.bin/lex/gen.c
  12. +3,019 −1,614 usr.bin/lex/initscan.c
  13. +15 −0 usr.bin/lex/libmain.c
  14. +8 −0 usr.bin/lex/libyywrap.c
  15. +984 −587 usr.bin/lex/main.c
  16. +630 −538 usr.bin/lex/misc.c
  17. +16 −0 usr.bin/lex/mkskel.sh
  18. +381 −401 usr.bin/lex/nfa.c
  19. +541 −343 usr.bin/lex/parse.y
  20. +508 −350 usr.bin/lex/scan.l
  21. +151 −216 usr.bin/lex/sym.c
  22. +567 −617 usr.bin/lex/tblcmp.c
  23. +1 −0 usr.bin/lex/version.h
  24. +190 −203 usr.bin/lex/yylex.c
View
72 usr.bin/lex/COPYING
@@ -2,51 +2,37 @@ Flex carries the copyright used for BSD software, slightly modified
because it originated at the Lawrence Berkeley (not Livermore!) Laboratory,
which operates under a contract with the Department of Energy:
-/*-
- * Copyright (c) 1991 The Regents of the University of California.
- * All rights reserved.
- *
- * This code is derived from software contributed to Berkeley by
- * Vern Paxson of Lawrence Berkeley Laboratory.
- *
- * The United States Government has rights in this work pursuant
- * to contract no. DE-AC03-76SF00098 between the United States
- * Department of Energy and the University of California.
- *
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- * 1. Redistributions of source code must retain the above copyright
- * notice, this list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright
- * notice, this list of conditions and the following disclaimer in the
- * documentation and/or other materials provided with the distribution.
- * 3. All advertising materials mentioning features or use of this software
- * must display the following acknowledgement:
- * This product includes software developed by the University of
- * California, Berkeley and its contributors.
- * 4. Neither the name of the University nor the names of its contributors
- * may be used to endorse or promote products derived from this software
- * without specific prior written permission.
- *
- * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
- * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
- * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
- * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
- * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
- * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
- * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
- * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
- * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
- * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
- * SUCH DAMAGE.
- *
- * @(#)COPYING 5.2 (Berkeley) 4/12/91
- */
+ Copyright (c) 1990 The Regents of the University of California.
+ All rights reserved.
+
+ This code is derived from software contributed to Berkeley by
+ Vern Paxson.
+
+ The United States Government has rights in this work pursuant
+ to contract no. DE-AC03-76SF00098 between the United States
+ Department of Energy and the University of California.
+
+ Redistribution and use in source and binary forms are permitted
+ provided that: (1) source distributions retain this entire
+ copyright notice and comment, and (2) distributions including
+ binaries display the following acknowledgement: ``This product
+ includes software developed by the University of California,
+ Berkeley and its contributors'' in the documentation or other
+ materials provided with the distribution and in all advertising
+ materials mentioning features or use of this software. Neither the
+ name of the University nor the names of its contributors may be
+ used to endorse or promote products derived from this software
+ without specific prior written permission.
+
+ THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR
+ IMPLIED WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED
+ WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR
+ PURPOSE.
This basically says "do whatever you please with this software except
-remove this notice".
+remove this notice or take advantage of the University's (or the flex
+authors') name".
-Note that the "flex.skel" scanner skeleton carries no copyright notice.
+Note that the "flex.skl" scanner skeleton carries no copyright notice.
You are free to do whatever you please with scanners generated using flex;
for them, you are not even bound by the above copyright.
View
185 usr.bin/lex/FlexLexer.h
@@ -0,0 +1,185 @@
+// $Header: /cvsroot/src/usr.bin/lex/Attic/FlexLexer.h,v 1.1.1.1 1996/12/10 06:06:27 mikel Exp $
+
+// FlexLexer.h -- define interfaces for lexical analyzer classes generated
+// by flex
+
+// Copyright (c) 1993 The Regents of the University of California.
+// All rights reserved.
+//
+// This code is derived from software contributed to Berkeley by
+// Kent Williams and Tom Epperly.
+//
+// Redistribution and use in source and binary forms are permitted provided
+// that: (1) source distributions retain this entire copyright notice and
+// comment, and (2) distributions including binaries display the following
+// acknowledgement: ``This product includes software developed by the
+// University of California, Berkeley and its contributors'' in the
+// documentation or other materials provided with the distribution and in
+// all advertising materials mentioning features or use of this software.
+// Neither the name of the University nor the names of its contributors may
+// be used to endorse or promote products derived from this software without
+// specific prior written permission.
+// THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
+// WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
+// MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
+
+// This file defines FlexLexer, an abstract class which specifies the
+// external interface provided to flex C++ lexer objects, and yyFlexLexer,
+// which defines a particular lexer class.
+//
+// If you want to create multiple lexer classes, you use the -P flag
+// to rename each yyFlexLexer to some other xxFlexLexer. You then
+// include <FlexLexer.h> in your other sources once per lexer class:
+//
+// #undef yyFlexLexer
+// #define yyFlexLexer xxFlexLexer
+// #include <FlexLexer.h>
+//
+// #undef yyFlexLexer
+// #define yyFlexLexer zzFlexLexer
+// #include <FlexLexer.h>
+// ...
+
+#ifndef __FLEX_LEXER_H
+// Never included before - need to define base class.
+#define __FLEX_LEXER_H
+#include <iostream.h>
+
+extern "C++" {
+
+struct yy_buffer_state;
+typedef int yy_state_type;
+
+class FlexLexer {
+public:
+ virtual ~FlexLexer() { }
+
+ const char* YYText() { return yytext; }
+ int YYLeng() { return yyleng; }
+
+ virtual void
+ yy_switch_to_buffer( struct yy_buffer_state* new_buffer ) = 0;
+ virtual struct yy_buffer_state*
+ yy_create_buffer( istream* s, int size ) = 0;
+ virtual void yy_delete_buffer( struct yy_buffer_state* b ) = 0;
+ virtual void yyrestart( istream* s ) = 0;
+
+ virtual int yylex() = 0;
+
+ // Call yylex with new input/output sources.
+ int yylex( istream* new_in, ostream* new_out = 0 )
+ {
+ switch_streams( new_in, new_out );
+ return yylex();
+ }
+
+ // Switch to new input/output streams. A nil stream pointer
+ // indicates "keep the current one".
+ virtual void switch_streams( istream* new_in = 0,
+ ostream* new_out = 0 ) = 0;
+
+ int lineno() const { return yylineno; }
+
+ int debug() const { return yy_flex_debug; }
+ void set_debug( int flag ) { yy_flex_debug = flag; }
+
+protected:
+ char* yytext;
+ int yyleng;
+ int yylineno; // only maintained if you use %option yylineno
+ int yy_flex_debug; // only has effect with -d or "%option debug"
+};
+
+}
+#endif
+
+#if defined(yyFlexLexer) || ! defined(yyFlexLexerOnce)
+// Either this is the first time through (yyFlexLexerOnce not defined),
+// or this is a repeated include to define a different flavor of
+// yyFlexLexer, as discussed in the flex man page.
+#define yyFlexLexerOnce
+
+class yyFlexLexer : public FlexLexer {
+public:
+ // arg_yyin and arg_yyout default to the cin and cout, but we
+ // only make that assignment when initializing in yylex().
+ yyFlexLexer( istream* arg_yyin = 0, ostream* arg_yyout = 0 );
+
+ virtual ~yyFlexLexer();
+
+ void yy_switch_to_buffer( struct yy_buffer_state* new_buffer );
+ struct yy_buffer_state* yy_create_buffer( istream* s, int size );
+ void yy_delete_buffer( struct yy_buffer_state* b );
+ void yyrestart( istream* s );
+
+ virtual int yylex();
+ virtual void switch_streams( istream* new_in, ostream* new_out );
+
+protected:
+ virtual int LexerInput( char* buf, int max_size );
+ virtual void LexerOutput( const char* buf, int size );
+ virtual void LexerError( const char* msg );
+
+ void yyunput( int c, char* buf_ptr );
+ int yyinput();
+
+ void yy_load_buffer_state();
+ void yy_init_buffer( struct yy_buffer_state* b, istream* s );
+ void yy_flush_buffer( struct yy_buffer_state* b );
+
+ int yy_start_stack_ptr;
+ int yy_start_stack_depth;
+ int* yy_start_stack;
+
+ void yy_push_state( int new_state );
+ void yy_pop_state();
+ int yy_top_state();
+
+ yy_state_type yy_get_previous_state();
+ yy_state_type yy_try_NUL_trans( yy_state_type current_state );
+ int yy_get_next_buffer();
+
+ istream* yyin; // input source for default LexerInput
+ ostream* yyout; // output sink for default LexerOutput
+
+ struct yy_buffer_state* yy_current_buffer;
+
+ // yy_hold_char holds the character lost when yytext is formed.
+ char yy_hold_char;
+
+ // Number of characters read into yy_ch_buf.
+ int yy_n_chars;
+
+ // Points to current character in buffer.
+ char* yy_c_buf_p;
+
+ int yy_init; // whether we need to initialize
+ int yy_start; // start state number
+
+ // Flag which is used to allow yywrap()'s to do buffer switches
+ // instead of setting up a fresh yyin. A bit of a hack ...
+ int yy_did_buffer_switch_on_eof;
+
+ // The following are not always needed, but may be depending
+ // on use of certain flex features (like REJECT or yymore()).
+
+ yy_state_type yy_last_accepting_state;
+ char* yy_last_accepting_cpos;
+
+ yy_state_type* yy_state_buf;
+ yy_state_type* yy_state_ptr;
+
+ char* yy_full_match;
+ int* yy_full_state;
+ int yy_full_lp;
+
+ int yy_lp;
+ int yy_looking_for_trail_begin;
+
+ int yy_more_flag;
+ int yy_more_len;
+ int yy_more_offset;
+ int yy_prev_more_offset;
+};
+
+#endif
View
1,233 usr.bin/lex/NEWS
@@ -0,0 +1,1233 @@
+Changes between release 2.5.4 (11Sep96) and release 2.5.3:
+
+ - Fixed a bug introduced in 2.5.3 that blew it when a call
+ to input() occurred at the end of an input file.
+
+ - Fixed scanner skeleton so the example in the man page of
+ scanning strings using exclusive start conditions works.
+
+ - Minor Makefile tweaks.
+
+
+Changes between release 2.5.3 (29May96) and release 2.5.2:
+
+ - Some serious bugs in yymore() have been fixed. In particular,
+ when using AT&T-lex-compatibility or %array, you can intermix
+ calls to input(), unput(), and yymore(). (This still doesn't
+ work for %pointer, and isn't likely to in the future.)
+
+ - A bug in handling NUL's in the input stream of scanners using
+ REJECT has been fixed.
+
+ - The default main() in libfl.a now repeatedly calls yylex() until
+ it returns 0, rather than just calling it once.
+
+ - Minor tweak for Windows NT Makefile, MISC/NT/Makefile.
+
+
+Changes between release 2.5.2 (25Apr95) and release 2.5.1:
+
+ - The --prefix configuration option now works.
+
+ - A bug that completely broke the "-Cf" table compression
+ option has been fixed.
+
+ - A major headache involving "const" declarators and Solaris
+ systems has been fixed.
+
+ - An octal escape sequence in a flex regular expression must
+ now contain only the digits 0-7.
+
+ - You can now use "--" on the flex command line to mark the
+ end of flex options.
+
+ - You can now specify the filename '-' as a synonym for stdin.
+
+ - By default, the scanners generated by flex no longer
+ statically initialize yyin and yyout to stdin and stdout.
+ This change is necessary because in some ANSI environments,
+ stdin and stdout are not compile-time constant. You can
+ force the initialization using "%option stdinit" in the first
+ section of your flex input.
+
+ - "%option nounput" now correctly omits the unput() routine
+ from the output.
+
+ - "make clean" now removes config.log, config.cache, and the
+ flex binary. The fact that it removes the flex binary means
+ you should take care if making changes to scan.l, to make
+ sure you don't wind up in a bootstrap problem.
+
+ - In general, the Makefile has been reworked somewhat (thanks
+ to Francois Pinard) for added flexibility - more changes will
+ follow in subsequent releases.
+
+ - The .texi and .info files in MISC/texinfo/ have been updated,
+ thanks also to Francois Pinard.
+
+ - The FlexLexer::yylex(istream* new_in, ostream* new_out) method
+ now does not have a default for the first argument, to disambiguate
+ it from FlexLexer::yylex().
+
+ - A bug in destructing a FlexLexer object before doing any scanning
+ with it has been fixed.
+
+ - A problem with including FlexLexer.h multiple times has been fixed.
+
+ - The alloca() chud necessary to accommodate bison has grown
+ even uglier, but hopefully more correct.
+
+ - A portability tweak has been added to accommodate compilers that
+ use char* generic pointers.
+
+ - EBCDIC contact information in the file MISC/EBCDIC has been updated.
+
+ - An OS/2 Makefile and config.h for flex 2.5 is now available in
+ MISC/OS2/, contributed by Kai Uwe Rommel.
+
+ - The descrip.mms file for building flex under VMS has been updated,
+ thanks to Pat Rankin.
+
+ - The notes on building flex for the Amiga have been updated for
+ flex 2.5, contributed by Andreas Scherer.
+
+
+Changes between release 2.5.1 (28Mar95) and release 2.4.7:
+
+ - A new concept of "start condition" scope has been introduced.
+ A start condition scope is begun with:
+
+ <SCs>{
+
+ where SCs is a list of one or more start conditions. Inside
+ the start condition scope, every rule automatically has the
+ prefix <SCs> applied to it, until a '}' which matches the
+ initial '{'. So, for example:
+
+ <ESC>{
+ "\\n" return '\n';
+ "\\r" return '\r';
+ "\\f" return '\f';
+ "\\0" return '\0';
+ }
+
+ is equivalent to:
+
+ <ESC>"\\n" return '\n';
+ <ESC>"\\r" return '\r';
+ <ESC>"\\f" return '\f';
+ <ESC>"\\0" return '\0';
+
+ As indicated in this example, rules inside start condition scopes
+ (and any rule, actually, other than the first) can be indented,
+ to better show the extent of the scope.
+
+ Start condition scopes may be nested.
+
+ - The new %option directive can be used in the first section of
+ a flex scanner to control scanner-generation options. Most
+ options are given simply as names, optionally preceded by the
+ word "no" (with no intervening whitespace) to negate their
+ meaning. Some are equivalent to flex flags, so putting them
+ in your scanner source is equivalent to always specifying
+ the flag (%option's take precedence over flags):
+
+ 7bit -7 option
+ 8bit -8 option
+ align -Ca option
+ backup -b option
+ batch -B option
+ c++ -+ option
+ caseful opposite of -i option (caseful is the default);
+ case-sensitive same as above
+ caseless -i option;
+ case-insensitive same as above
+ debug -d option
+ default opposite of -s option
+ ecs -Ce option
+ fast -F option
+ full -f option
+ interactive -I option
+ lex-compat -l option
+ meta-ecs -Cm option
+ perf-report -p option
+ read -Cr option
+ stdout -t option
+ verbose -v option
+ warn opposite of -w option (so use "%option nowarn" for -w)
+
+ array equivalent to "%array"
+ pointer equivalent to "%pointer" (default)
+
+ Some provide new features:
+
+ always-interactive generate a scanner which always
+ considers its input "interactive" (no call to isatty()
+ will be made when the scanner runs)
+ main supply a main program for the scanner, which
+ simply calls yylex(). Implies %option noyywrap.
+ never-interactive generate a scanner which never
+ considers its input "interactive" (no call to isatty()
+ will be made when the scanner runs)
+ stack if set, enable start condition stacks (see below)
+ stdinit if unset ("%option nostdinit"), initialize yyin
+ and yyout statically to nil FILE* pointers, instead
+ of stdin and stdout
+ yylineno if set, keep track of the current line
+ number in global yylineno (this option is expensive
+ in terms of performance). The line number is available
+ to C++ scanning objects via the new member function
+ lineno().
+ yywrap if unset ("%option noyywrap"), scanner does not
+ call yywrap() upon EOF but simply assumes there
+ are no more files to scan
+
+ Flex scans your rule actions to determine whether you use the
+ REJECT or yymore features (this is not new). Two %options can be
+ used to override its decision, either by setting them to indicate
+ the feature is indeed used, or unsetting them to indicate it
+ actually is not used:
+
+ reject
+ yymore
+
+ Three %option's take string-delimited values, offset with '=':
+
+ outfile="<name>" equivalent to -o<name>
+ prefix="<name>" equivalent to -P<name>
+ yyclass="<name>" set the name of the C++ scanning class
+ (see below)
+
+ A number of %option's are available for lint purists who
+ want to suppress the appearance of unneeded routines in
+ the generated scanner. Each of the following, if unset,
+ results in the corresponding routine not appearing in the
+ generated scanner:
+
+ input, unput
+ yy_push_state, yy_pop_state, yy_top_state
+ yy_scan_buffer, yy_scan_bytes, yy_scan_string
+
+ You can specify multiple options with a single %option directive,
+ and multiple directives in the first section of your flex input file.
+
+ - The new function:
+
+ YY_BUFFER_STATE yy_scan_string( const char *str )
+
+ returns a YY_BUFFER_STATE (which also becomes the current input
+ buffer) for scanning the given string, which occurs starting
+ with the next call to yylex(). The string must be NUL-terminated.
+ A related function:
+
+ YY_BUFFER_STATE yy_scan_bytes( const char *bytes, int len )
+
+ creates a buffer for scanning "len" bytes (including possibly NUL's)
+ starting at location "bytes".
+
+ Note that both of these functions create and scan a *copy* of
+ the string/bytes. (This may be desirable, since yylex() modifies
+ the contents of the buffer it is scanning.) You can avoid the
+ copy by using:
+
+ YY_BUFFER_STATE yy_scan_buffer( char *base, yy_size_t size )
+
+ which scans in place the buffer starting at "base", consisting
+ of "size" bytes, the last two bytes of which *must* be
+ YY_END_OF_BUFFER_CHAR (these bytes are not scanned; thus, scanning
+ consists of base[0] through base[size-2], inclusive). If you
+ fail to set up "base" in this manner, yy_scan_buffer returns a
+ nil pointer instead of creating a new input buffer.
+
+ The type yy_size_t is an integral type to which you can cast
+ an integer expression reflecting the size of the buffer.
+
+ - Three new routines are available for manipulating stacks of
+ start conditions:
+
+ void yy_push_state( int new_state )
+
+ pushes the current start condition onto the top of the stack
+ and BEGIN's "new_state" (recall that start condition names are
+ also integers).
+
+ void yy_pop_state()
+
+ pops the top of the stack and BEGIN's to it, and
+
+ int yy_top_state()
+
+ returns the top of the stack without altering the stack's
+ contents.
+
+ The start condition stack grows dynamically and so has no built-in
+ size limitation. If memory is exhausted, program execution
+ is aborted.
+
+ To use start condition stacks, your scanner must include
+ a "%option stack" directive.
+
+ - flex now supports POSIX character class expressions. These
+ are expressions enclosed inside "[:" and ":]" delimiters (which
+ themselves must appear between the '[' and ']' of a character
+ class; other elements may occur inside the character class, too).
+ The expressions flex recognizes are:
+
+ [:alnum:] [:alpha:] [:blank:] [:cntrl:] [:digit:] [:graph:]
+ [:lower:] [:print:] [:punct:] [:space:] [:upper:] [:xdigit:]
+
+ These expressions all designate a set of characters equivalent to
+ the corresponding isXXX function (for example, [:alnum:] designates
+ those characters for which isalnum() returns true - i.e., any
+ alphabetic or numeric). Some systems don't provide isblank(),
+ so flex defines [:blank:] as a blank or a tab.
+
+ For example, the following character classes are all equivalent:
+
+ [[:alnum:]]
+ [[:alpha:][:digit:]
+ [[:alpha:]0-9]
+ [a-zA-Z0-9]
+
+ If your scanner is case-insensitive (-i flag), then [:upper:]
+ and [:lower:] are equivalent to [:alpha:].
+
+ - The promised rewrite of the C++ FlexLexer class has not yet
+ been done. Support for FlexLexer is limited at the moment to
+ fixing show-stopper bugs, so, for example, the new functions
+ yy_scan_string() & friends are not available to FlexLexer
+ objects.
+
+ - The new macro
+
+ yy_set_interactive(is_interactive)
+
+ can be used to control whether the current buffer is considered
+ "interactive". An interactive buffer is processed more slowly,
+ but must be used when the scanner's input source is indeed
+ interactive to avoid problems due to waiting to fill buffers
+ (see the discussion of the -I flag in flex.1). A non-zero value
+ in the macro invocation marks the buffer as interactive, a zero
+ value as non-interactive. Note that use of this macro overrides
+ "%option always-interactive" or "%option never-interactive".
+
+ yy_set_interactive() must be invoked prior to beginning to
+ scan the buffer.
+
+ - The new macro
+
+ yy_set_bol(at_bol)
+
+ can be used to control whether the current buffer's scanning
+ context for the next token match is done as though at the
+ beginning of a line (non-zero macro argument; makes '^' anchored
+ rules active) or not at the beginning of a line (zero argument,
+ '^' rules inactive).
+
+ - Related to this change, the mechanism for determining when a scan is
+ starting at the beginning of a line has changed. It used to be
+ that '^' was active iff the character prior to that at which the
+ scan started was a newline. The mechanism now is that '^' is
+ active iff the last token ended in a newline (or the last call to
+ input() returned a newline). For most users, the difference in
+ mechanisms is negligible. Where it will make a difference,
+ however, is if unput() or yyless() is used to alter the input
+ stream. When in doubt, use yy_set_bol().
+
+ - The new beginning-of-line mechanism involved changing some fairly
+ twisted code, so it may have introduced bugs - beware ...
+
+ - The macro YY_AT_BOL() returns true if the next token scanned from
+ the current buffer will have '^' rules active, false otherwise.
+
+ - The new function
+
+ void yy_flush_buffer( struct yy_buffer_state* b )
+
+ flushes the contents of the current buffer (i.e., next time
+ the scanner attempts to match a token using b as the current
+ buffer, it will begin by invoking YY_INPUT to fill the buffer).
+ This routine is also available to C++ scanners (unlike some
+ of the other new routines).
+
+ The related macro
+
+ YY_FLUSH_BUFFER
+
+ flushes the contents of the current buffer.
+
+ - A new "-ooutput" option writes the generated scanner to "output".
+ If used with -t, the scanner is still written to stdout, but
+ its internal #line directives (see previous item) use "output".
+
+ - Flex now generates #line directives relating the code it
+ produces to the output file; this means that error messages
+ in the flex-generated code should be correctly pinpointed.
+
+ - When generating #line directives, filenames with embedded '\'s
+ have those characters escaped (i.e., turned into '\\'). This
+ feature helps with reporting filenames for some MS-DOS and OS/2
+ systems.
+
+ - The FlexLexer class includes two new public member functions:
+
+ virtual void switch_streams( istream* new_in = 0,
+ ostream* new_out = 0 )
+
+ reassigns yyin to new_in (if non-nil) and yyout to new_out
+ (ditto), deleting the previous input buffer if yyin is
+ reassigned. It is used by:
+
+ int yylex( istream* new_in = 0, ostream* new_out = 0 )
+
+ which first calls switch_streams() and then returns the value
+ of calling yylex().
+
+ - C++ scanners now have yy_flex_debug as a member variable of
+ FlexLexer rather than a global, and member functions for testing
+ and setting it.
+
+ - When generating a C++ scanning class, you can now use
+
+ %option yyclass="foo"
+
+ to inform flex that you have derived "foo" as a subclass of
+ yyFlexLexer, so flex will place your actions in the member
+ function foo::yylex() instead of yyFlexLexer::yylex(). It also
+ generates a yyFlexLexer::yylex() member function that generates a
+ run-time error if called (by invoking yyFlexLexer::LexerError()).
+ This feature is necessary if your subclass "foo" introduces some
+ additional member functions or variables that you need to access
+ from yylex().
+
+ - Current texinfo files in MISC/texinfo, contributed by Francois
+ Pinard.
+
+ - You can now change the name "flex" to something else (e.g., "lex")
+ by redefining $(FLEX) in the Makefile.
+
+ - Two bugs (one serious) that could cause "bigcheck" to fail have
+ been fixed.
+
+ - A number of portability/configuration changes have been made
+ for easier portability.
+
+ - You can use "YYSTATE" in your scanner as an alias for YY_START
+ (for AT&T lex compatibility).
+
+ - input() now maintains yylineno.
+
+ - input() no longer trashes yytext.
+
+ - interactive scanners now read characters in YY_INPUT up to a
+ newline, a large performance gain.
+
+ - C++ scanner objects now work with the -P option. You include
+ <FlexLexer.h> once per scanner - see comments in <FlexLexer.h>
+ (or flex.1) for details.
+
+ - C++ FlexLexer objects now use the "cerr" stream to report -d output
+ instead of stdio.
+
+ - The -c flag now has its full glorious POSIX interpretation (do
+ nothing), rather than being interpreted as an old-style -C flag.
+
+ - Scanners generated by flex now include two #define's giving
+ the major and minor version numbers (YY_FLEX_MAJOR_VERSION,
+ YY_FLEX_MINOR_VERSION). These can then be tested to see
+ whether certain flex features are available.
+
+ - Scanners generated using -l lex compatibility now have the symbol
+ YY_FLEX_LEX_COMPAT #define'd.
+
+ - When initializing (i.e., yy_init is non-zero on entry to yylex()),
+ generated scanners now set yy_init to zero before executing
+ YY_USER_INIT. This means that you can set yy_init back to a
+ non-zero value in YY_USER_INIT if you need the scanner to be
+ reinitialized on the next call.
+
+ - You can now use "#line" directives in the first section of your
+ scanner specification.
+
+ - When generating full-table scanners (-Cf), flex now puts braces
+ around each row of the 2-d array initialization, to silence warnings
+ on over-zealous compilers.
+
+ - Improved support for MS-DOS. The flex sources have been successfully
+ built, unmodified, for Borland 4.02 (all that's required is a
+ Borland Makefile and config.h file, which are supplied in
+ MISC/Borland - contributed by Terrence O Kane).
+
+ - Improved support for Macintosh using Think C - the sources should
+ build for this platform "out of the box". Contributed by Scott
+ Hofmann.
+
+ - Improved support for VMS, in MISC/VMS/, contributed by Pat Rankin.
+
+ - Support for the Amiga, in MISC/Amiga/, contributed by Andreas
+ Scherer. Note that the contributed files were developed for
+ flex 2.4 and have not been tested with flex 2.5.
+
+ - Some notes on support for the NeXT, in MISC/NeXT, contributed
+ by Raf Schietekat.
+
+ - The MISC/ directory now includes a preformatted version of flex.1
+ in flex.man, and pre-yacc'd versions of parse.y in parse.{c,h}.
+
+ - The flex.1 and flexdoc.1 manual pages have been merged. There
+ is now just one document, flex.1, which includes an overview
+ at the beginning to help you find the section you need.
+
+ - Documentation now clarifies that start conditions persist across
+ switches to new input files or different input buffers. If you
+ want to e.g., return to INITIAL, you must explicitly do so.
+
+ - The "Performance Considerations" section of the manual has been
+ updated.
+
+ - Documented the "yy_act" variable, which when YY_USER_ACTION is
+ invoked holds the number of the matched rule, and added an
+ example of using yy_act to profile how often each rule is matched.
+
+ - Added YY_NUM_RULES, a definition that gives the total number
+ of rules in the file, including the default rule (even if you
+ use -s).
+
+ - Documentation now clarifies that you can pass a nil FILE* pointer
+ to yy_create_buffer() or yyrestart() if you've arrange YY_INPUT
+ to not need yyin.
+
+ - Documentation now clarifies that YY_BUFFER_STATE is a pointer to
+ an opaque "struct yy_buffer_state".
+
+ - Documentation now stresses that you gain the benefits of removing
+ backing-up states only if you remove *all* of them.
+
+ - Documentation now points out that traditional lex allows you
+ to put the action on a separate line from the rule pattern if
+ the pattern has trailing whitespace (ugh!), but flex doesn't
+ support this.
+
+ - A broken example in documentation of the difference between
+ inclusive and exclusive start conditions is now fixed.
+
+ - Usage (-h) report now goes to stdout.
+
+ - Version (-V) info now goes to stdout.
+
+ - More #ifdef chud has been added to the parser in attempt to
+ deal with bison's use of alloca().
+
+ - "make clean" no longer deletes emacs backup files (*~).
+
+ - Some memory leaks have been fixed.
+
+ - A bug was fixed in which dynamically-expanded buffers were
+ reallocated a couple of bytes too small.
+
+ - A bug was fixed which could cause flex to read and write beyond
+ the end of the input buffer.
+
+ - -S will not be going away.
+
+
+Changes between release 2.4.7 (03Aug94) and release 2.4.6:
+
+ - Fixed serious bug in reading multiple files.
+
+ - Fixed bug in scanning NUL's.
+
+ - Fixed bug in input() returning 8-bit characters.
+
+ - Fixed bug in matching text with embedded NUL's when
+ using %array or lex compatibility.
+
+ - Fixed multiple invocations of YY_USER_ACTION when using '|'
+ continuation action.
+
+ - Minor prototyping fixes.
+
+Changes between release 2.4.6 (04Jan94) and release 2.4.5:
+
+ - Linking with -lfl no longer required if your program includes
+ its own yywrap() and main() functions. (This change will cause
+ problems if you have a non-ANSI compiler on a system for which
+ sizeof(int) != sizeof(void*) or sizeof(int) != sizeof(size_t).)
+
+ - The use of 'extern "C++"' in FlexLexer.h has been modified to
+ get around an incompatibility with g++'s header files.
+
+Changes between release 2.4.5 (11Dec93) and release 2.4.4:
+
+ - Fixed bug breaking C++ scanners that use REJECT or variable
+ trailing context.
+
+ - Fixed serious input problem for interactive scanners on
+ systems for which char is unsigned.
+
+ - Fixed bug in incorrectly treating '$' operator as variable
+ trailing context.
+
+ - Fixed bug in -CF table representation that could lead to
+ corrupt tables.
+
+ - Fixed fairly benign memory leak.
+
+ - Added `extern "C++"' wrapper to FlexLexer.h header. This
+ should overcome the g++ 2.5.X problems mentioned in the
+ NEWS for release 2.4.3.
+
+ - Changed #include of FlexLexer.h to use <> instead of "".
+
+ - Added feature to control whether the scanner attempts to
+ refill the input buffer once it's exhausted. This feature
+ will be documented in the 2.5 release.
+
+
+Changes between release 2.4.4 (07Dec93) and release 2.4.3:
+
+ - Fixed two serious bugs in scanning 8-bit characters.
+
+ - Fixed bug in YY_USER_ACTION that caused it to be executed
+ inappropriately (on the scanner's own internal actions, and
+ with incorrect yytext/yyleng values).
+
+ - Fixed bug in pointing yyin at a new file and resuming scanning.
+
+ - Portability fix regarding min/max/abs macros conflicting with
+ function definitions in standard header files.
+
+ - Added a virtual LexerError() method to the C++ yyFlexLexer class
+ for reporting error messages instead of always using cerr.
+
+ - Added warning in flexdoc that the C++ scanning class is presently
+ experimental and subject to considerable change between major
+ releases.
+
+
+Changes between release 2.4.3 (03Dec93) and release 2.4.2:
+
+ - Fixed bug causing fatal scanner messages to fail to print.
+
+ - Fixed things so FlexLexer.h can be included in other C++
+ sources. One side-effect of this change is that -+ and -CF
+ are now incompatible.
+
+ - libfl.a now supplies private versions of the the <string.h>/
+ <strings.h> string routines needed by flex and the scanners
+ it generates, to enhance portability to some BSD systems.
+
+ - More robust solution to 2.4.2's flexfatal() bug fix.
+
+ - Added ranlib of installed libfl.a.
+
+ - Some lint tweaks.
+
+ - NOTE: problems have been encountered attempting to build flex
+ C++ scanners using g++ version 2.5.X. The problem is due to an
+ unfortunate heuristic in g++ 2.5.X that attempts to discern between
+ C and C++ headers. Because FlexLexer.h is installed (by default)
+ in /usr/local/include and not /usr/local/lib/g++-include, g++ 2.5.X
+ decides that it's a C header :-(. So if you have problems, install
+ the header in /usr/local/lib/g++-include instead.
+
+
+Changes between release 2.4.2 (01Dec93) and release 2.4.1:
+
+ - Fixed bug in libfl.a referring to non-existent "flexfatal" function.
+
+ - Modified to produce both compress'd and gzip'd tar files for
+ distributions (you probably don't care about this change!).
+
+
+Changes between release 2.4.1 (30Nov93) and release 2.3.8:
+
+ - The new '-+' flag instructs flex to generate a C++ scanner class
+ (thanks to Kent Williams). flex writes an implementation of the
+ class defined in FlexLexer.h to lex.yy.cc. You may include
+ multiple scanner classes in your program using the -P flag. Note
+ that the scanner class also provides a mechanism for creating
+ reentrant scanners. The scanner class uses C++ streams for I/O
+ instead of FILE*'s (thanks to Tom Epperly). If the flex executable's
+ name ends in '+' then the '-+' flag is automatically on, so creating
+ a symlink or copy of "flex" to "flex++" results in a version of
+ flex that can be used exclusively for C++ scanners.
+
+ Note that without the '-+' flag, flex-generated scanners can still
+ be compiled using C++ compilers, though they use FILE*'s for I/O
+ instead of streams.
+
+ See the "GENERATING C++ SCANNERS" section of flexdoc for details.
+
+ - The new '-l' flag turns on maximum AT&T lex compatibility. In
+ particular, -l includes support for "yylineno" and makes yytext
+ be an array instead of a pointer. It does not, however, do away
+ with all incompatibilities. See the "INCOMPATIBILITIES WITH LEX
+ AND POSIX" section of flexdoc for details.
+
+ - The new '-P' option specifies a prefix to use other than "yy"
+ for the scanner's globally-visible variables, and for the
+ "lex.yy.c" filename. Using -P you can link together multiple
+ flex scanners in the same executable.
+
+ - The distribution includes a "texinfo" version of flexdoc.1,
+ contributed by Roland Pesch (thanks also to Marq Kole, who
+ contributed another version). It has not been brought up to
+ date, but reflects version 2.3. See MISC/flex.texinfo.
+
+ The flex distribution will soon include G.T. Nicol's flex
+ manual; he is presently bringing it up-to-date for version 2.4.
+
+ - yywrap() is now a function, and you now *must* link flex scanners
+ with libfl.a.
+
+ - Site-configuration is now done via an autoconf-generated
+ "configure" script contributed by Francois Pinard.
+
+ - Scanners now use fread() (or getc(), if interactive) and not
+ read() for input. A new "table compression" option, -Cr,
+ overrides this change and causes the scanner to use read()
+ (because read() is a bit faster than fread()). -f and -F
+ are now equivalent to -Cfr and -CFr; i.e., they imply the
+ -Cr option.
+
+ - In the blessed name of POSIX compliance, flex supports "%array"
+ and "%pointer" directives in the definitions (first) section of
+ the scanner specification. The former specifies that yytext
+ should be an array (of size YYLMAX), the latter, that it should
+ be a pointer. The array version of yytext is universally slower
+ than the pointer version, but has the advantage that its contents
+ remain unmodified across calls to input() and unput() (the pointer
+ version of yytext is, still, trashed by such calls).
+
+ "%array" cannot be used with the '-+' C++ scanner class option.
+
+ - The new '-Ca' option directs flex to trade off memory for
+ natural alignment when generating a scanner's tables. In
+ particular, table entries that would otherwise be "short"
+ become "long".
+
+ - The new '-h' option produces a summary of the flex flags.
+
+ - The new '-V' option reports the flex version number and exits.
+
+ - The new scanner macro YY_START returns an integer value
+ corresponding to the current start condition. You can return
+ to that start condition by passing the value to a subsequent
+ "BEGIN" action. You also can implement "start condition stacks"
+ by storing the values in an integer stack.
+
+ - You can now redefine macros such as YY_INPUT by just #define'ing
+ them to some other value in the first section of the flex input;
+ no need to first #undef them.
+
+ - flex now generates warnings for rules that can't be matched.
+ These warnings can be turned off using the new '-w' flag. If
+ your scanner uses REJECT then you will not get these warnings.
+
+ - If you specify the '-s' flag but the default rule can be matched,
+ flex now generates a warning.
+
+ - "yyleng" is now a global, and may be modified by the user (though
+ doing so and then using yymore() will yield weird results).
+
+ - Name definitions in the first section of a scanner specification
+ can now include a leading '^' or trailing '$' operator. In this
+ case, the definition is *not* pushed back inside of parentheses.
+
+ - Scanners with compressed tables are now "interactive" (-I option)
+ by default. You can suppress this attribute (which makes them
+ run slightly slower) using the new '-B' flag.
+
+ - Flex now generates 8-bit scanners by default, unless you use the
+ -Cf or -CF compression options (-Cfe and -CFe result in 8-bit
+ scanners). You can force it to generate a 7-bit scanner using
+ the new '-7' flag. You can build flex to generate 8-bit scanners
+ for -Cf and -CF, too, by adding -DDEFAULT_CSIZE=256 to CFLAGS
+ in the Makefile.
+
+ - You no longer need to call the scanner routine yyrestart() to
+ inform the scanner that you have switched to a new file after
+ having seen an EOF on the current input file. Instead, just
+ point yyin at the new file and continue scanning.
+
+ - You no longer need to invoke YY_NEW_FILE in an <<EOF>> action
+ to indicate you wish to continue scanning. Simply point yyin
+ at a new file.
+
+ - A leading '#' no longer introduces a comment in a flex input.
+
+ - flex no longer considers formfeed ('\f') a whitespace character.
+
+ - %t, I'm happy to report, has been nuked.
+
+ - The '-p' option may be given twice ('-pp') to instruct flex to
+ report minor performance problems as well as major ones.
+
+ - The '-v' verbose output no longer includes start/finish time
+ information.
+
+ - Newlines in flex inputs can optionally include leading or
+ trailing carriage-returns ('\r'), in support of several PC/Mac
+ run-time libraries that automatically include these.
+
+ - A start condition of the form "<*>" makes the following rule
+ active in every start condition, whether exclusive or inclusive.
+
+ - The following items have been corrected in the flex documentation:
+
+ - '-C' table compression options *are* cumulative.
+
+ - You may modify yytext but not lengthen it by appending
+ characters to the end. Modifying its final character
+ will affect '^' anchoring for the next rule matched
+ if the character is changed to or from a newline.
+
+ - The term "backtracking" has been renamed "backing up",
+ since it is a one-time repositioning and not a repeated
+ search. What used to be the "lex.backtrack" file is now
+ "lex.backup".
+
+ - Unindented "/* ... */" comments are allowed in the first
+ flex input section, but not in the second.
+
+ - yyless() can only be used in the flex input source, not
+ externally.
+
+ - You can use "yyrestart(yyin)" to throw away the
+ current contents of the input buffer.
+
+ - To write high-speed scanners, attempt to match as much
+ text as possible with each rule. See MISC/fastwc/README
+ for more information.
+
+ - Using the beginning-of-line operator ('^') is fairly
+ cheap. Using unput() is expensive. Using yyless() is
+ cheap.
+
+ - An example of scanning strings with embedded escape
+ sequences has been added.
+
+ - The example of backing-up in flexdoc was erroneous; it
+ has been corrected.
+
+ - A flex scanner's internal buffer now dynamically grows if needed
+ to match large tokens. Note that growing the buffer presently
+ requires rescanning the (large) token, so consuming a lot of
+ text this way is a slow process. Also note that presently the
+ buffer does *not* grow if you unput() more text than can fit
+ into the buffer.
+
+ - The MISC/ directory has been reorganized; see MISC/README for
+ details.
+
+ - yyless() can now be used in the third (user action) section
+ of a scanner specification, thanks to Ceriel Jacobs. yyless()
+ remains a macro and cannot be used outside of the scanner source.
+
+ - The skeleton file is no longer opened at run-time, but instead
+ compiled into a large string array (thanks to John Gilmore and
+ friends at Cygnus). You can still use the -S flag to point flex
+ at a different skeleton file.
+
+ - flex no longer uses a temporary file to store the scanner's
+ actions.
+
+ - A number of changes have been made to decrease porting headaches.
+ In particular, flex no longer uses memset() or ctime(), and
+ provides a single simple mechanism for dealing with C compilers
+ that still define malloc() as returning char* instead of void*.
+
+ - Flex now detects if the scanner specification requires the -8 flag
+ but the flag was not given or on by default.
+
+ - A number of table-expansion fencepost bugs have been fixed,
+ making flex more robust for generating large scanners.
+
+ - flex more consistently identifies the location of errors in
+ its input.
+
+ - YY_USER_ACTION is now invoked only for "real" actions, not for
+ internal actions used by the scanner for things like filling
+ the buffer or handling EOF.
+
+ - The rule "[^]]" now matches any character other than a ']';
+ formerly it matched any character at all followed by a ']'.
+ This change was made for compatibility with AT&T lex.
+
+ - A large number of miscellaneous bugs have been found and fixed
+ thanks to Gerhard Wilhelms.
+
+ - The source code has been heavily reformatted, making patches
+ relative to previous flex releases no longer accurate.
+
+
+Changes between 2.3 Patch #8 (21Feb93) and 2.3 Patch #7:
+
+ - Fixed bugs in dynamic memory allocation leading to grievous
+ fencepost problems when generating large scanners.
+ - Fixed bug causing infinite loops on character classes with 8-bit
+ characters in them.
+ - Fixed bug in matching repetitions with a lower bound of 0.
+ - Fixed bug in scanning NUL characters using an "interactive" scanner.
+ - Fixed bug in using yymore() at the end of a file.
+ - Fixed bug in misrecognizing rules with variable trailing context.
+ - Fixed bug compiling flex on Suns using gcc 2.
+ - Fixed bug in not recognizing that input files with the character
+ ASCII 128 in them require the -8 flag.
+ - Fixed bug that could cause an infinite loop writing out
+ error messages.
+ - Fixed bug in not recognizing old-style lex % declarations if
+ followed by a tab instead of a space.
+ - Fixed potential crash when flex terminated early (usually due
+ to a bad flag) and the -v flag had been given.
+ - Added some missing declarations of void functions.
+ - Changed to only use '\a' for __STDC__ compilers.
+ - Updated mailing addresses.
+
+
+Changes between 2.3 Patch #7 (28Mar91) and 2.3 Patch #6:
+
+ - Fixed out-of-bounds array access that caused bad tables
+ to be produced on machines where the bad reference happened
+ to yield a 1. This caused problems installing or running
+ flex on some Suns, in particular.
+
+
+Changes between 2.3 Patch #6 (29Aug90) and 2.3 Patch #5:
+
+ - Fixed a serious bug in yymore() which basically made it
+ completely broken. Thanks goes to Jean Christophe of
+ the Nethack development team for finding the problem
+ and passing along the fix.
+
+
+Changes between 2.3 Patch #5 (16Aug90) and 2.3 Patch #4:
+
+ - An up-to-date version of initscan.c so "make test" will
+ work after applying the previous patches
+
+
+Changes between 2.3 Patch #4 (14Aug90) and 2.3 Patch #3:
+
+ - Fixed bug in hexadecimal escapes which allowed only digits,
+ not letters, in escapes
+ - Fixed bug in previous "Changes" file!
+
+
+Changes between 2.3 Patch #3 (03Aug90) and 2.3 Patch #2:
+
+ - Correction to patch #2 for gcc compilation; thanks goes to
+ Paul Eggert for catching this.
+
+
+Changes between 2.3 Patch #2 (02Aug90) and original 2.3 release:
+
+ - Fixed (hopefully) headaches involving declaring malloc()
+ and free() for gcc, which defines __STDC__ but (often) doesn't
+ come with the standard include files such as <stdlib.h>.
+ Reordered #ifdef maze in the scanner skeleton in the hope of
+ getting the declarations right for cfront and g++, too.
+
+ - Note that this patch supercedes patch #1 for release 2.3,
+ which was never announced but was available briefly for
+ anonymous ftp.
+
+
+Changes between 2.3 (full) release of 28Jun90 and 2.2 (alpha) release:
+
+ User-visible:
+
+ - A lone <<EOF>> rule (that is, one which is not qualified with
+ a list of start conditions) now specifies the EOF action for
+ *all* start conditions which haven't already had <<EOF>> actions
+ given. To specify an end-of-file action for just the initial
+ state, use <INITIAL><<EOF>>.
+
+ - -d debug output is now contigent on the global yy_flex_debug
+ being set to a non-zero value, which it is by default.
+
+ - A new macro, YY_USER_INIT, is provided for the user to specify
+ initialization action to be taken on the first call to the
+ scanner. This action is done before the scanner does its
+ own initialization.
+
+ - yy_new_buffer() has been added as an alias for yy_create_buffer()
+
+ - Comments beginning with '#' and extending to the end of the line
+ now work, but have been deprecated (in anticipation of making
+ flex recognize #line directives).
+
+ - The funky restrictions on when semi-colons could follow the
+ YY_NEW_FILE and yyless macros have been removed. They now
+ behave identically to functions.
+
+ - A bug in the sample redefinition of YY_INPUT in the documentation
+ has been corrected.
+
+ - A bug in the sample simple tokener in the documentation has
+ been corrected.
+
+ - The documentation on the incompatibilities between flex and
+ lex has been reordered so that the discussion of yylineno
+ and input() come first, as it's anticipated that these will
+ be the most common source of headaches.
+
+
+ Things which didn't used to be documented but now are:
+
+ - flex interprets "^foo|bar" differently from lex. flex interprets
+ it as "match either a 'foo' or a 'bar', providing it comes at the
+ beginning of a line", whereas lex interprets it as "match either
+ a 'foo' at the beginning of a line, or a 'bar' anywhere".
+
+ - flex initializes the global "yyin" on the first call to the
+ scanner, while lex initializes it at compile-time.
+
+ - yy_switch_to_buffer() can be used in the yywrap() macro/routine.
+
+ - flex scanners do not use stdio for their input, and hence when
+ writing an interactive scanner one must explictly call fflush()
+ after writing out a prompt.
+
+ - flex scanner can be made reentrant (after a fashion) by using
+ "yyrestart( yyin );". This is useful for interactive scanners
+ which have interrupt handlers that long-jump out of the scanner.
+
+ - a defense of why yylineno is not supported is included, along
+ with a suggestion on how to convert scanners which rely on it.
+
+
+ Other changes:
+
+ - Prototypes and proper declarations of void routines have
+ been added to the flex source code, courtesy of Kevin B. Kenny.
+
+ - Routines dealing with memory allocation now use void* pointers
+ instead of char* - see Makefile for porting implications.
+
+ - Error-checking is now done when flex closes a file.
+
+ - Various lint tweaks were added to reduce the number of gripes.
+
+ - Makefile has been further parameterized to aid in porting.
+
+ - Support for SCO Unix added.
+
+ - Flex now sports the latest & greatest UC copyright notice
+ (which is only slightly different from the previous one).
+
+ - A note has been added to flexdoc.1 mentioning work in progress
+ on modifying flex to generate straight C code rather than a
+ table-driven automaton, with an email address of whom to contact
+ if you are working along similar lines.
+
+
+Changes between 2.2 Patch #3 (30Mar90) and 2.2 Patch #2:
+
+ - fixed bug which caused -I scanners to bomb
+
+
+Changes between 2.2 Patch #2 (27Mar90) and 2.2 Patch #1:
+
+ - fixed bug writing past end of input buffer in yyunput()
+ - fixed bug detecting NUL's at the end of a buffer
+
+
+Changes between 2.2 Patch #1 (23Mar90) and 2.2 (alpha) release:
+
+ - Makefile fixes: definition of MAKE variable for systems
+ which don't have it; installation of flexdoc.1 along with
+ flex.1; fixed two bugs which could cause "bigtest" to fail.
+
+ - flex.skel fix for compiling with g++.
+
+ - README and flexdoc.1 no longer list an out-of-date BITNET address
+ for contacting me.
+
+ - minor typos and formatting changes to flex.1 and flexdoc.1.
+
+
+Changes between 2.2 (alpha) release of March '90 and previous release:
+
+ User-visible:
+
+ - Full user documentation now available.
+
+ - Support for 8-bit scanners.
+
+ - Scanners now accept NUL's.
+
+ - A facility has been added for dealing with multiple
+ input buffers.
+
+ - Two manual entries now. One which fully describes flex
+ (rather than just its differences from lex), and the
+ other for quick(er) reference.
+
+ - A number of changes to bring flex closer into compliance
+ with the latest POSIX lex draft:
+
+ %t support
+ flex now accepts multiple input files and concatenates
+ them together to form its input
+ previous -c (compress) flag renamed -C
+ do-nothing -c and -n flags added
+ Any indented code or code within %{}'s in section 2 is
+ now copied to the output
+
+ - yyleng is now a bona fide global integer.
+
+ - -d debug information now gives the line number of the
+ matched rule instead of which number rule it was from
+ the beginning of the file.
+
+ - -v output now includes a summary of the flags used to generate
+ the scanner.
+
+ - unput() and yyrestart() are now globally callable.
+
+ - yyrestart() no longer closes the previous value of yyin.
+
+ - C++ support; generated scanners can be compiled with C++ compiler.
+
+ - Primitive -lfl library added, containing default main()
+ which calls yylex(). A number of routines currently living
+ in the scanner skeleton will probably migrate to here
+ in the future (in particular, yywrap() will probably cease
+ to be a macro and instead be a function in the -lfl library).
+
+ - Hexadecimal (\x) escape sequences added.
+
+ - Support for MS-DOS, VMS, and Turbo-C integrated.
+
+ - The %used/%unused operators have been deprecated. They
+ may go away soon.
+
+
+ Other changes:
+
+ - Makefile enhanced for easier testing and installation.
+ - The parser has been tweaked to detect some erroneous
+ constructions which previously were missed.
+ - Scanner input buffer overflow is now detected.
+ - Bugs with missing "const" declarations fixed.
+ - Out-of-date Minix/Atari patches provided.
+ - Scanners no longer require printf() unless FLEX_DEBUG is being used.
+ - A subtle input() bug has been fixed.
+ - Line numbers for "continued action" rules (those following
+ the special '|' action) are now correct.
+ - unput() bug fixed; had been causing problems porting flex to VMS.
+ - yymore() handling rewritten to fix bug with interaction
+ between yymore() and trailing context.
+ - EOF in actions now generates an error message.
+ - Bug involving -CFe and generating equivalence classes fixed.
+ - Bug which made -CF be treated as -Cf fixed.
+ - Support for SysV tmpnam() added.
+ - Unused #define's for scanner no longer generated.
+ - Error messages which are associated with a particular input
+ line are now all identified with their input line in standard
+ format.
+ - % directives which are valid to lex but not to flex are
+ now ignored instead of generating warnings.
+ - -DSYS_V flag can now also be specified -DUSG for System V
+ compilation.
+
+
+Changes between 2.1 beta-test release of June '89 and previous release:
+
+ User-visible:
+
+ - -p flag generates a performance report to stderr. The report
+ consists of comments regarding features of the scanner rules
+ which result in slower scanners.
+
+ - -b flag generates backtracking information to lex.backtrack.
+ This is a list of scanner states which require backtracking
+ and the characters on which they do so. By adding rules
+ one can remove backtracking states. If all backtracking states
+ are eliminated, the generated scanner will run faster.
+ Backtracking is not yet documented in the manual entry.
+
+ - Variable trailing context now works, i.e., one can have
+ rules like "(foo)*/[ \t]*bletch". Some trailing context
+ patterns still cannot be properly matched and generate
+ error messages. These are patterns where the ending of the
+ first part of the rule matches the beginning of the second
+ part, such as "zx*/xy*", where the 'x*' matches the 'x' at
+ the beginning of the trailing context. Lex won't get these
+ patterns right either.
+
+ - Faster scanners.
+
+ - End-of-file rules. The special rule "<<EOF>>" indicates
+ actions which are to be taken when an end-of-file is
+ encountered and yywrap() returns non-zero (i.e., indicates
+ no further files to process). See manual entry for example.
+
+ - The -r (reject used) flag is gone. flex now scans the input
+ for occurrences of the string "REJECT" to determine if the
+ action is needed. It tries to be intelligent about this but
+ can be fooled. One can force the presence or absence of
+ REJECT by adding a line in the first section of the form
+ "%used REJECT" or "%unused REJECT".
+
+ - yymore() has been implemented. Similarly to REJECT, flex
+ detects the use of yymore(), which can be overridden using
+ "%used" or "%unused".
+
+ - Patterns like "x{0,3}" now work (i.e., with lower-limit == 0).
+
+ - Removed '\^x' for ctrl-x misfeature.
+
+ - Added '\a' and '\v' escape sequences.
+
+ - \<digits> now works for octal escape sequences; previously
+ \0<digits> was required.
+
+ - Better error reporting; line numbers are associated with rules.
+
+ - yyleng is a macro; it cannot be accessed outside of the
+ scanner source file.
+
+ - yytext and yyleng should not be modified within a flex action.
+
+ - Generated scanners #define the name FLEX_SCANNER.
+
+ - Rules are internally separated by YY_BREAK in lex.yy.c rather
+ than break, to allow redefinition.
+
+ - The macro YY_USER_ACTION can be redefined to provide an action
+ which is always executed prior to the matched rule's action.
+
+ - yyrestart() is a new action which can be used to restart
+ the scanner after it has seen an end-of-file (a "real" one,
+ that is, one for which yywrap() returned non-zero). It takes
+ a FILE* argument indicating a new file to scan and sets
+ things up so that a subsequent call to yylex() will start
+ scanning that file.
+
+ - Internal scanner names all preceded by "yy_"
+
+ - lex.yy.c is deleted if errors are encountered during processing.
+
+ - Comments may be put in the first section of the input by preceding
+ them with '#'.
+
+
+
+ Other changes:
+
+ - Some portability-related bugs fixed, in particular for machines
+ with unsigned characters or sizeof( int* ) != sizeof( int ).
+ Also, tweaks for VMS and Microsoft C (MS-DOS), and identifiers all
+ trimmed to be 31 or fewer characters. Shortened file names
+ for dinosaur OS's. Checks for allocating > 64K memory
+ on 16 bit'ers. Amiga tweaks. Compiles using gcc on a Sun-3.
+ - Compressed and fast scanner skeletons merged.
+ - Skeleton header files done away with.
+ - Generated scanner uses prototypes and "const" for __STDC__.
+ - -DSV flag is now -DSYS_V for System V compilation.
+ - Removed all references to FTL language.
+ - Software now covered by BSD Copyright.
+ - flex will replace lex in subsequent BSD releases.
View
213 usr.bin/lex/ccl.c
@@ -1,188 +1,149 @@
+/* ccl - routines for character classes */
+
/*-
* Copyright (c) 1990 The Regents of the University of California.
* All rights reserved.
*
* This code is derived from software contributed to Berkeley by
- * Vern Paxson of Lawrence Berkeley Laboratory.
+ * Vern Paxson.
*
* The United States Government has rights in this work pursuant
* to contract no. DE-AC03-76SF00098 between the United States
* Department of Energy and the University of California.
*
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- * 1. Redistributions of source code must retain the above copyright
- * notice, this list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright
- * notice, this list of conditions and the following disclaimer in the
- * documentation and/or other materials provided with the distribution.
- * 3. All advertising materials mentioning features or use of this software
- * must display the following acknowledgement:
- * This product includes software developed by the University of
- * California, Berkeley and its contributors.
- * 4. Neither the name of the University nor the names of its contributors
- * may be used to endorse or promote products derived from this software
- * without specific prior written permission.
- *
- * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
- * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
- * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
- * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
- * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
- * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
- * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
- * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
- * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
- * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
- * SUCH DAMAGE.
+ * Redistribution and use in source and binary forms are permitted provided
+ * that: (1) source distributions retain this entire copyright notice and
+ * comment, and (2) distributions including binaries display the following
+ * acknowledgement: ``This product includes software developed by the
+ * University of California, Berkeley and its contributors'' in the
+ * documentation or other materials provided with the distribution and in
+ * all advertising materials mentioning features or use of this software.
+ * Neither the name of the University nor the names of its contributors may
+ * be used to endorse or promote products derived from this software without
+ * specific prior written permission.
+ * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
+ * WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
*/
-#ifndef lint
-static char sccsid[] = "@(#)ccl.c 5.2 (Berkeley) 6/18/90";
-#endif /* not lint */
-
-/* ccl - routines for character classes */
+/* $Header: /cvsroot/src/usr.bin/lex/Attic/ccl.c,v 1.1.1.2 1996/12/10 06:06:32 mikel Exp $ */
#include "flexdef.h"
-/* ccladd - add a single character to a ccl
- *
- * synopsis
- * int cclp;
- * int ch;
- * ccladd( cclp, ch );
- */
+/* ccladd - add a single character to a ccl */
void ccladd( cclp, ch )
int cclp;
int ch;
+ {
+ int ind, len, newpos, i;
- {
- int ind, len, newpos, i;
+ check_char( ch );
- len = ccllen[cclp];
- ind = cclmap[cclp];
+ len = ccllen[cclp];
+ ind = cclmap[cclp];
- /* check to see if the character is already in the ccl */
+ /* check to see if the character is already in the ccl */
- for ( i = 0; i < len; ++i )
- if ( ccltbl[ind + i] == ch )
- return;
+ for ( i = 0; i < len; ++i )
+ if ( ccltbl[ind + i] == ch )
+ return;
- newpos = ind + len;
+ newpos = ind + len;
- if ( newpos >= current_max_ccl_tbl_size )
- {
- current_max_ccl_tbl_size += MAX_CCL_TBL_SIZE_INCREMENT;
+ if ( newpos >= current_max_ccl_tbl_size )
+ {
+ current_max_ccl_tbl_size += MAX_CCL_TBL_SIZE_INCREMENT;
- ++num_reallocs;
+ ++num_reallocs;
- ccltbl = reallocate_character_array( ccltbl, current_max_ccl_tbl_size );
- }
+ ccltbl = reallocate_Character_array( ccltbl,
+ current_max_ccl_tbl_size );
+ }
- ccllen[cclp] = len + 1;
- ccltbl[newpos] = ch;
- }
+ ccllen[cclp] = len + 1;
+ ccltbl[newpos] = ch;
+ }
-/* cclinit - make an empty ccl
- *
- * synopsis
- * int cclinit();
- * new_ccl = cclinit();
- */
+/* cclinit - return an empty ccl */
int cclinit()
-
- {
- if ( ++lastccl >= current_maxccls )
{
- current_maxccls += MAX_CCLS_INCREMENT;
+ if ( ++lastccl >= current_maxccls )
+ {
+ current_maxccls += MAX_CCLS_INCREMENT;
- ++num_reallocs;
+ ++num_reallocs;
- cclmap = reallocate_integer_array( cclmap, current_maxccls );
- ccllen = reallocate_integer_array( ccllen, current_maxccls );
- cclng = reallocate_integer_array( cclng, current_maxccls );
- }
+ cclmap = reallocate_integer_array( cclmap, current_maxccls );
+ ccllen = reallocate_integer_array( ccllen, current_maxccls );
+ cclng = reallocate_integer_array( cclng, current_maxccls );
+ }
- if ( lastccl == 1 )
- /* we're making the first ccl */
- cclmap[lastccl] = 0;
+ if ( lastccl == 1 )
+ /* we're making the first ccl */
+ cclmap[lastccl] = 0;
- else
- /* the new pointer is just past the end of the last ccl. Since
- * the cclmap points to the \first/ character of a ccl, adding the
- * length of the ccl to the cclmap pointer will produce a cursor
- * to the first free space
- */
- cclmap[lastccl] = cclmap[lastccl - 1] + ccllen[lastccl - 1];
+ else
+ /* The new pointer is just past the end of the last ccl.
+ * Since the cclmap points to the \first/ character of a
+ * ccl, adding the length of the ccl to the cclmap pointer
+ * will produce a cursor to the first free space.
+ */
+ cclmap[lastccl] = cclmap[lastccl - 1] + ccllen[lastccl - 1];
- ccllen[lastccl] = 0;
- cclng[lastccl] = 0; /* ccl's start out life un-negated */
+ ccllen[lastccl] = 0;
+ cclng[lastccl] = 0; /* ccl's start out life un-negated */
- return ( lastccl );
- }
+ return lastccl;
+ }
-/* cclnegate - negate a ccl
- *
- * synopsis
- * int cclp;
- * cclnegate( ccl );
- */
+/* cclnegate - negate the given ccl */
void cclnegate( cclp )
int cclp;
-
- {
- cclng[cclp] = 1;
- }
+ {
+ cclng[cclp] = 1;
+ }
/* list_character_set - list the members of a set of characters in CCL form
*
- * synopsis
- * int cset[CSIZE];
- * FILE *file;
- * list_character_set( cset );
- *
- * writes to the given file a character-class representation of those
- * characters present in the given set. A character is present if it
- * has a non-zero value in the set array.
+ * Writes to the given file a character-class representation of those
+ * characters present in the given CCL. A character is present if it
+ * has a non-zero value in the cset array.
*/
void list_character_set( file, cset )
FILE *file;
int cset[];
+ {
+ register int i;
- {
- register int i;
- char *readable_form();
+ putc( '[', file );
- putc( '[', file );
+ for ( i = 0; i < csize; ++i )
+ {
+ if ( cset[i] )
+ {
+ register int start_char = i;
- for ( i = 0; i < csize; ++i )
- {
- if ( cset[i] )
- {
- register int start_char = i;
+ putc( ' ', file );
- putc( ' ', file );
+ fputs( readable_form( i ), file );
- fputs( readable_form( i ), file );
+ while ( ++i < csize && cset[i] )
+ ;
- while ( ++i < csize && cset[i] )
- ;
+ if ( i - 1 > start_char )
+ /* this was a run */
+ fprintf( file, "-%s", readable_form( i - 1 ) );
- if ( i - 1 > start_char )
- /* this was a run */
- fprintf( file, "-%s", readable_form( i - 1 ) );
+ putc( ' ', file );
+ }
+ }
- putc( ' ', file );
- }
+ putc( ']', file );
}
-
- putc( ']', file );
- }
View
26 usr.bin/lex/config.h
@@ -0,0 +1,26 @@
+/* config.h. Generated automatically by configure. */
+/* $Header: /cvsroot/src/usr.bin/lex/Attic/config.h,v 1.1.1.1 1996/12/10 06:06:55 mikel Exp $ */
+
+/* Define to empty if the keyword does not work. */
+/* #undef const */
+
+/* Define to `unsigned' if <sys/types.h> doesn't define. */
+/* #undef size_t */
+
+/* Define if you have the ANSI C header files. */
+#define STDC_HEADERS 1
+
+/* Define if you have the <malloc.h> header file. */
+#define HAVE_MALLOC_H 1
+
+/* Define if you have the <string.h> header file. */
+#define HAVE_STRING_H 1
+
+/* Define if you have the <sys/types.h> header file. */
+#define HAVE_SYS_TYPES_H 1
+
+/* Define if you have <alloca.h> and it should be used (not on Ultrix). */
+/* #undef HAVE_ALLOCA_H */
+
+/* Define if platform-specific command line handling is necessary. */
+/* #undef NEED_ARGV_FIXUP */
View
1,624 usr.bin/lex/dfa.c
@@ -1,51 +1,36 @@
+/* dfa - DFA construction routines */
+
/*-
* Copyright (c) 1990 The Regents of the University of California.
* All rights reserved.
*
* This code is derived from software contributed to Berkeley by
- * Vern Paxson of Lawrence Berkeley Laboratory.
+ * Vern Paxson.
*
* The United States Government has rights in this work pursuant
* to contract no. DE-AC03-76SF00098 between the United States
* Department of Energy and the University of California.
*
- * Redistribution and use in source and binary forms, with or without
- * modification, are permitted provided that the following conditions
- * are met:
- * 1. Redistributions of source code must retain the above copyright
- * notice, this list of conditions and the following disclaimer.
- * 2. Redistributions in binary form must reproduce the above copyright
- * notice, this list of conditions and the following disclaimer in the
- * documentation and/or other materials provided with the distribution.
- * 3. All advertising materials mentioning features or use of this software
- * must display the following acknowledgement:
- * This product includes software developed by the University of
- * California, Berkeley and its contributors.
- * 4. Neither the name of the University nor the names of its contributors
- * may be used to endorse or promote products derived from this software
- * without specific prior written permission.
- *
- * THIS SOFTWARE IS PROVIDED BY THE REGENTS AND CONTRIBUTORS ``AS IS'' AND
- * ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE
- * IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE
- * ARE DISCLAIMED. IN NO EVENT SHALL THE REGENTS OR CONTRIBUTORS BE LIABLE
- * FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, EXEMPLARY, OR CONSEQUENTIAL
- * DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS
- * OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION)
- * HOWEVER CAUSED AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT
- * LIABILITY, OR TORT (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY
- * OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF
- * SUCH DAMAGE.
+ * Redistribution and use in source and binary forms are permitted provided
+ * that: (1) source distributions retain this entire copyright notice and
+ * comment, and (2) distributions including binaries display the following
+ * acknowledgement: ``This product includes software developed by the
+ * University of California, Berkeley and its contributors'' in the
+ * documentation or other materials provided with the distribution and in
+ * all advertising materials mentioning features or use of this software.
+ * Neither the name of the University nor the names of its contributors may
+ * be used to endorse or promote products derived from this software without
+ * specific prior written permission.
+ * THIS SOFTWARE IS PROVIDED ``AS IS'' AND WITHOUT ANY EXPRESS OR IMPLIED
+ * WARRANTIES, INCLUDING, WITHOUT LIMITATION, THE IMPLIED WARRANTIES OF
+ * MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE.
*/
-#ifndef lint
-static char sccsid[] = "@(#)dfa.c 5.2 (Berkeley) 6/18/90";
-#endif /* not lint */
-
-/* dfa - DFA construction routines */
+/* $Header: /cvsroot/src/usr.bin/lex/Attic/dfa.c,v 1.1.1.2 1996/12/10 06:06:33 mikel Exp $ */
#include "flexdef.h"
+
/* declare functions that have forward references */
void dump_associated_rules PROTO((FILE*, int));
@@ -54,58 +39,59 @@ void sympartition PROTO((int[], int, int[], int[]));
int symfollowset PROTO((int[], int, int, int[]));
-/* check_for_backtracking - check a DFA state for backtracking
+/* check_for_backing_up - check a DFA state for backing up
*
* synopsis
- * int ds, state[numecs];
- * check_for_backtracking( ds, state );
+ * void check_for_backing_up( int ds, int state[numecs] );
*
* ds is the number of the state to check and state[] is its out-transitions,
- * indexed by equivalence class, and state_rules[] is the set of rules
- * associated with this state
+ * indexed by equivalence class.
*/
-void check_for_backtracking( ds, state )
+void check_for_backing_up( ds, state )
int ds;
int state[];
+ {
+ if ( (reject && ! dfaacc[ds].dfaacc_set) ||
+ (! reject && ! dfaacc[ds].dfaacc_state) )
+ { /* state is non-accepting */
+ ++num_backing_up;
- {
- if ( (reject && ! dfaacc[ds].dfaacc_set) || ! dfaacc[ds].dfaacc_state )
- { /* state is non-accepting */
- ++num_backtracking;
-
- if ( backtrack_report )
- {
- fprintf( backtrack_file, "State #%d is non-accepting -\n", ds );
+ if ( backing_up_report )
+ {
+ fprintf( backing_up_file,
+ _( "State #%d is non-accepting -\n" ), ds );
- /* identify the state */
- dump_associated_rules( backtrack_file, ds );
+ /* identify the state */
+ dump_associated_rules( backing_up_file, ds );
- /* now identify it further using the out- and jam-transitions */
- dump_transitions( backtrack_file, state );
+ /* Now identify it further using the out- and
+ * jam-transitions.
+ */
+ dump_transitions( backing_up_file, state );
- putc( '\n', backtrack_file );
- }
+ putc( '\n', backing_up_file );
+ }
+ }
}
- }
/* check_trailing_context - check to see if NFA state set constitutes
* "dangerous" trailing context
*
* synopsis
- * int nfa_states[num_states+1], num_states;
- * int accset[nacc+1], nacc;
- * check_trailing_context( nfa_states, num_states, accset, nacc );
+ * void check_trailing_context( int nfa_states[num_states+1], int num_states,
+ * int accset[nacc+1], int nacc );
*
* NOTES
- * Trailing context is "dangerous" if both the head and the trailing
+ * Trailing context is "dangerous" if both the head and the trailing
* part are of variable size \and/ there's a DFA state which contains
* both an accepting state for the head part of the rule and NFA states
* which occur after the beginning of the trailing context.
+ *
* When such a rule is matched, it's impossible to tell if having been
- * in the DFA state indicates the beginning of the trailing context
- * or further-along scanning of the pattern. In these cases, a warning
+ * in the DFA state indicates the beginning of the trailing context or
+ * further-along scanning of the pattern. In these cases, a warning
* message is issued.
*
* nfa_states[1 .. num_states] is the list of NFA states in the DFA.
@@ -115,103 +101,95 @@ int state[];
void check_trailing_context( nfa_states, num_states, accset, nacc )
int *nfa_states, num_states;
int *accset;
-register int nacc;
+int nacc;
+ {
+ register int i, j;
+
+ for ( i = 1; i <= num_states; ++i )
+ {
+ int ns = nfa_states[i];
+ register int type = state_type[ns];
+ register int ar = assoc_rule[ns];
- {
- register int i, j;
+ if ( type == STATE_NORMAL || rule_type[ar] != RULE_VARIABLE )
+ { /* do nothing */
+ }
- for ( i = 1; i <= num_states; ++i )
- {
- int ns = nfa_states[i];
- register int type = state_type[ns];
- register int ar = assoc_rule[ns];
-
- if ( type == STATE_NORMAL || rule_type[ar] != RULE_VARIABLE )
- { /* do nothing */
- }
-
- else if ( type == STATE_TRAILING_CONTEXT )
- {
- /* potential trouble. Scan set of accepting numbers for
- * the one marking the end of the "head". We assume that
- * this looping will be fairly cheap since it's rare that
- * an accepting number set is large.
- */
- for ( j = 1; j <= nacc; ++j )
- if ( accset[j] & YY_TRAILING_HEAD_MASK )
- {
- fprintf( stderr,
- "%s: Dangerous trailing context in rule at line %d\n",
- program_name, rule_linenum[ar] );
- return;
- }
- }
+ else if ( type == STATE_TRAILING_CONTEXT )
+ {
+ /* Potential trouble. Scan set of accepting numbers
+ * for the one marking the end of the "head". We
+ * assume that this looping will be fairly cheap
+ * since it's rare that an accepting number set
+ * is large.
+ */
+ for ( j = 1; j <= nacc; ++j )
+ if ( accset[j] & YY_TRAILING_HEAD_MASK )
+ {
+ line_warning(
+ _( "dangerous trailing context" ),
+ rule_linenum[ar] );
+ return;
+ }
+ }
+ }
}
- }
/* dump_associated_rules - list the rules associated with a DFA state
*
- * synopisis
- * int ds;
- * FILE *file;
- * dump_associated_rules( file, ds );
- *
- * goes through the set of NFA states associated with the DFA and
+ * Goes through the set of NFA states associated with the DFA and
* extracts the first MAX_ASSOC_RULES unique rules, sorts them,
- * and writes a report to the given file
+ * and writes a report to the given file.
*/
void dump_associated_rules( file, ds )
FILE *file;
int ds;
-
- {
- register int i, j;
- register int num_associated_rules = 0;
- int rule_set[MAX_ASSOC_RULES + 1];
- int *dset = dss[ds];
- int size = dfasiz[ds];
-
- for ( i = 1; i <= size; ++i )
{
- register rule_num = rule_linenum[assoc_rule[dset[i]]];
+ register int i, j;
+ register int num_associated_rules = 0;
+ int rule_set[MAX_ASSOC_RULES + 1];
+ int *dset = dss[ds];
+ int size = dfasiz[ds];
- for ( j = 1; j <= num_associated_rules; ++j )
- if ( rule_num == rule_set[j] )
- break;
+ for ( i = 1; i <= size; ++i )
+ {
+ register int rule_num = rule_linenum[assoc_rule[dset[i]]];
- if ( j > num_associated_rules )
- { /* new rule */
- if ( num_associated_rules < MAX_ASSOC_RULES )
- rule_set[++num_associated_rules] = rule_num;
- }
- }
+ for ( j = 1; j <= num_associated_rules; ++j )
+ if ( rule_num == rule_set[j] )
+ break;
- bubble( rule_set, num_associated_rules );
+ if ( j > num_associated_rules )
+ { /* new rule */
+ if ( num_associated_rules < MAX_ASSOC_RULES )
+ rule_set[++num_associated_rules] = rule_num;
+ }
+ }
- fprintf( file, " associated rule line numbers:" );
+ bubble( rule_set, num_associated_rules );
- for ( i = 1; i <= num_associated_rules; ++i )
- {
- if ( i % 8 == 1 )
- putc( '\n', file );
-
- fprintf( file, "\t%d", rule_set[i] );
+ fprintf( file, _( " associated rule line numbers:" ) );
+
+ for ( i = 1; i <= num_associated_rules; ++i )
+ {
+ if ( i % 8 == 1 )
+ putc( '\n', file );
+
+ fprintf( file, "\t%d", rule_set[i] );
+ }
+
+ putc( '\n', file );
}
-
- putc( '\n', file );
- }
/* dump_transitions - list the transitions associated with a DFA state
*
- * synopisis
- * int state[numecs];
- * FILE *file;
- * dump_transitions( file, state );
+ * synopsis
+ * dump_transitions( FILE *file, int state[numecs] );
*
- * goes through the set of out-transitions and lists them in human-readable
+ * Goes through the set of out-transitions and lists them in human-readable
* form (i.e., not as equivalence classes); also lists jam transitions
* (i.e., all those which are not out-transitions, plus EOF). The dump
* is done to the given file.
@@ -220,868 +198,898 @@ int ds;
void dump_transitions( file, state )
FILE *file;
int state[];
+ {
+ register int i, ec;
+ int out_char_set[CSIZE];
- {
- register int i, ec;
- int out_char_set[CSIZE];
+ for ( i = 0; i < csize; ++i )
+ {
+ ec = ABS( ecgroup[i] );
+ out_char_set[i] = state[ec];
+ }
- for ( i = 0; i < csize; ++i )
- {
- ec = abs( ecgroup[i] );
- out_char_set[i] = state[ec];
- }
-
- fprintf( file, " out-transitions: " );
+ fprintf( file, _( " out-transitions: " ) );
- list_character_set( file, out_char_set );
+ list_character_set( file, out_char_set );
- /* now invert the members of the set to get the jam transitions */
- for ( i = 0; i < csize; ++i )
- out_char_set[i] = ! out_char_set[i];
+ /* now invert the members of the set to get the jam transitions */
+ for ( i = 0; i < csize; ++i )
+ out_char_set[i] = ! out_char_set[i];
- fprintf( file, "\n jam-transitions: EOF " );
+ fprintf( file, _( "\n jam-transitions: EOF " ) );
- list_character_set( file, out_char_set );
+ list_character_set( file, out_char_set );
- putc( '\n', file );
- }
+ putc( '\n', file );
+ }
/* epsclosure - construct the epsilon closure of a set of ndfa states
*
* synopsis
- * int t[current_max_dfa_size], numstates, accset[num_rules + 1], nacc;
- * int hashval;
- * int *epsclosure();
- * t = epsclosure( t, &numstates, accset, &nacc, &hashval );
+ * int *epsclosure( int t[num_states], int *numstates_addr,
+ * int accset[num_rules+1], int *nacc_addr,
+ * int *hashval_addr );
*
* NOTES
- * the epsilon closure is the set of all states reachable by an arbitrary
- * number of epsilon transitions which themselves do not have epsilon
+ * The epsilon closure is the set of all states reachable by an arbitrary
+ * number of epsilon transitions, which themselves do not have epsilon
* transitions going out, unioned with the set of states which have non-null
* accepting numbers. t is an array of size numstates of nfa state numbers.
- * Upon return, t holds the epsilon closure and numstates is updated. accset
- * holds a list of the accepting numbers, and the size of accset is given
- * by nacc. t may be subjected to reallocation if it is not large enough
- * to hold the epsilon closure.
+ * Upon return, t holds the epsilon closure and *numstates_addr is updated.
+ * accset holds a list of the accepting numbers, and the size of accset is
+ * given by *nacc_addr. t may be subjected to reallocation if it is not
+ * large enough to hold the epsilon closure.
*
- * hashval is the hash value for the dfa corresponding to the state set
+ * hashval is the hash value for the dfa corresponding to the state set.
*/
int *epsclosure( t, ns_addr, accset, nacc_addr, hv_addr )
int *t, *ns_addr, accset[], *nacc_addr, *hv_addr;
-
- {
- register int stkpos, ns, tsp;
- int numstates = *ns_addr, nacc, hashval, transsym, nfaccnum;
- int stkend, nstate;
- static int did_stk_init = false, *stk;
+ {
+ register int stkpos, ns, tsp;
+ int numstates = *ns_addr, nacc, hashval, transsym, nfaccnum;
+ int stkend, nstate;
+ static int did_stk_init = false, *stk;
#define MARK_STATE(state) \
- trans1[state] = trans1[state] - MARKER_DIFFERENCE;
+trans1[state] = trans1[state] - MARKER_DIFFERENCE;
#define IS_MARKED(state) (trans1[state] < 0)
#define UNMARK_STATE(state) \
- trans1[state] = trans1[state] + MARKER_DIFFERENCE;
+trans1[state] = trans1[state] + MARKER_DIFFERENCE;
#define CHECK_ACCEPT(state) \
- { \
- nfaccnum = accptnum[state]; \
- if ( nfaccnum != NIL ) \
- accset[++nacc] = nfaccnum; \
- }
+{ \
+nfaccnum = accptnum[state]; \
+if ( nfaccnum != NIL ) \
+accset[++nacc] = nfaccnum; \
+}
#define DO_REALLOCATION \
- { \
- current_max_dfa_size += MAX_DFA_SIZE_INCREMENT; \
- ++num_reallocs; \
- t = reallocate_integer_array( t, current_max_dfa_size ); \
- stk = reallocate_integer_array( stk, current_max_dfa_size ); \
- } \
+{ \
+current_max_dfa_size += MAX_DFA_SIZE_INCREMENT; \
+++num_reallocs; \
+t = reallocate_integer_array( t, current_max_dfa_size ); \
+stk = reallocate_integer_array( stk, current_max_dfa_size ); \
+} \
#define PUT_ON_STACK(state) \
- { \
- if ( ++stkend >= current_max_dfa_size ) \
- DO_REALLOCATION \
- stk[stkend] = state; \
- MARK_STATE(state) \
- }
+{ \
+if ( ++stkend >= current_max_dfa_size ) \
+DO_REALLOCATION \
+stk[stkend] = state; \
+MARK_STATE(state) \
+}
#define ADD_STATE(state) \
- { \
- if ( ++numstates >= current_max_dfa_size ) \
- DO_REALLOCATION \
- t[numstates] = state; \
- hashval = hashval + state; \
- }
+{ \
+if ( ++numstates >= current_max_dfa_size ) \
+DO_REALLOCATION \
+t[numstates] = state; \
+hashval += state; \
+}
#define STACK_STATE(state) \
- { \
- PUT_ON_STACK(state) \
- CHECK_ACCEPT(state) \
- if ( nfaccnum != NIL || transchar[state] != SYM_EPSILON ) \
- ADD_STATE(state) \
- }
+{ \
+PUT_ON_STACK(state) \
+CHECK_ACCEPT(state) \
+if ( nfaccnum != NIL || transchar[state] != SYM_EPSILON ) \
+ADD_STATE(state) \
+}
- if ( ! did_stk_init )
- {
- stk = allocate_integer_array( current_max_dfa_size );
- did_stk_init = true;
- }
- nacc = stkend = hashval = 0;
+ if ( ! did_stk_init )
+ {
+ stk = allocate_integer_array( current_max_dfa_size );
+ did_stk_init = true;
+ }
- for ( nstate = 1; nstate <= numstates; ++nstate )
- {
- ns = t[nstate];
+ nacc = stkend = hashval = 0;
- /* the state could be marked if we've already pushed it onto
- * the stack
- */
- if ( ! IS_MARKED(ns) )
- PUT_ON_STACK(ns)
+ for ( nstate = 1; nstate <= numstates; ++nstate )
+ {
+ ns = t[nstate];
- CHECK_ACCEPT(ns)
- hashval = hashval + ns;
- }
+ /* The state could be marked if we've already pushed it onto
+ * the stack.
+ */
+ if ( ! IS_MARKED(ns) )
+ {
+ PUT_ON_STACK(ns)
+ CHECK_ACCEPT(ns)
+ hashval += ns;
+ }
+ }
- for ( stkpos = 1; stkpos <= stkend; ++stkpos )
- {
- ns = stk[stkpos];
- transsym = transchar[ns];
+ for ( stkpos = 1; stkpos <= stkend; ++stkpos )
+ {
+ ns = stk[stkpos];
+ transsym = transchar[ns];
- if ( transsym == SYM_EPSILON )
- {
- tsp = trans1[ns] + MARKER_DIFFERENCE;
+ if ( transsym == SYM_EPSILON )
+ {
+ tsp = trans1[ns] + MARKER_DIFFERENCE;
- if ( tsp != NO_TRANSITION )
- {
- if ( ! IS_MARKED(tsp) )
- STACK_STATE(tsp)
+ if ( tsp != NO_TRANSITION )
+ {
+ if ( ! IS_MARKED(tsp) )
+ STACK_STATE(tsp)
- tsp = trans2[ns];
+ tsp = trans2[ns];
- if ( tsp != NO_TRANSITION )
- if ( ! IS_MARKED(tsp) )
- STACK_STATE(tsp)
+ if ( tsp != NO_TRANSITION && ! IS_MARKED(tsp) )
+ STACK_STATE(tsp)
+ }
+ }
}
- }
- }
- /* clear out "visit" markers */
+ /* Clear out "visit" markers. */
- for ( stkpos = 1; stkpos <= stkend; ++stkpos )
- {
- if ( IS_MARKED(stk[stkpos]) )
- {
- UNMARK_STATE(stk[stkpos])
- }
- else
- flexfatal( "consistency check failed in epsclosure()" );
- }
+ for ( stkpos = 1; stkpos <= stkend; ++stkpos )
+ {
+ if ( IS_MARKED(stk[stkpos]) )
+ UNMARK_STATE(stk[stkpos])
+ else
+ flexfatal(
+ _( "consistency check failed in epsclosure()" ) );
+ }
- *ns_addr = numstates;
- *hv_addr = hashval;
- *nacc_addr = nacc;
+ *ns_addr = numstates;
+ *hv_addr = hashval;
+ *nacc_addr = nacc;
- return ( t );
- }
+ return t;
+ }
/* increase_max_dfas - increase the maximum number of DFAs */
void increase_max_dfas()
+ {
+ current_max_dfas += MAX_DFAS_INCREMENT;
- {
- current_max_dfas += MAX_DFAS_INCREMENT;
-
- ++num_reallocs;
+ ++num_reallocs;
- base = reallocate_integer_array( base, current_max_dfas );
- def = reallocate_integer_array( def, current_max_dfas );
- dfasiz = reallocate_integer_array( dfasiz, current_max_dfas );
- accsiz = reallocate_integer_array( accsiz, current_max_dfas );
- dhash = reallocate_integer_array( dhash, current_max_dfas );
- dss = reallocate_int_ptr_array( dss, current_max_dfas );
- dfaacc = reallocate_dfaacc_union( dfaacc, current_max_dfas );
+ base = reallocate_integer_array( base, current_max_dfas );
+ def = reallocate_integer_array( def, current_max_dfas );
+ dfasiz = reallocate_integer_array( dfasiz, current_max_dfas );
+ accsiz = reallocate_integer_array( accsiz, current_max_dfas );
+ dhash = reallocate_integer_array( dhash, current_max_dfas );
+ dss = reallocate_int_ptr_array( dss, current_max_dfas );
+ dfaacc = reallocate_dfaacc_union( dfaacc, current_max_dfas );
- if ( nultrans )
- nultrans = reallocate_integer_array( nultrans, current_max_dfas );
- }
+ if ( nultrans )
+ nultrans =
+ reallocate_integer_array( nultrans, current_max_dfas );
+ }
/* ntod - convert an ndfa to a dfa
*
- * synopsis
- * ntod();
- *
- * creates the dfa corresponding to the ndfa we've constructed. the
- * dfa starts out in state #1.
+ * Creates the dfa corresponding to the ndfa we've constructed. The
+ * dfa starts out in state #1.
*/
void ntod()
-
- {
- int *accset, ds, nacc, newds;
- int sym, hashval, numstates, dsize;
- int num_full_table_rows; /* used only for -f */
- int *nset, *dset;
- int targptr, totaltrans, i, comstate, comfreq, targ;
- int *epsclosure(), snstods(), symlist[CSIZE + 1];
- int num_start_states;
- int todo_head, todo_next;
-
- /* note that the following are indexed by *equivalence classes*
- * and not by characters. Since equivalence classes are indexed
- * beginning with 1, even if the scanner accepts NUL's, this
- * means that (since every character is potentially in its own
- * equivalence class) these arrays must have room for indices
- * from 1 to CSIZE, so their size must be CSIZE + 1.
- */
- int duplist[CSIZE + 1], state[CSIZE + 1];
- int targfreq[CSIZE + 1], targstate[CSIZE + 1];
-
- /* this is so find_table_space(...) will know where to start looking in
- * chk/nxt for unused records for space to put in the state
- */
- if ( fullspd )
- firstfree = 0;
-
- accset = allocate_integer_array( num_rules + 1 );
- nset = allocate_integer_array( current_max_dfa_size );
-
- /* the "todo" queue is represented by the head, which is the DFA
- * state currently being processed, and the "next", which is the
- * next DFA state number available (not in use). We depend on the
- * fact that snstods() returns DFA's \in increasing order/, and thus
- * need only know the bounds of the dfas to be processed.
- */
- todo_head = todo_next = 0;
-
- for ( i = 0; i <= csize; ++i )
{
- duplist[i] = NIL;
- symlist[i] = false;
- }
+ int *accset, ds, nacc, newds;
+ int sym, hashval, numstates, dsize;
+ int num_full_table_rows; /* used only for -f */
+ int *nset, *dset;
+ int targptr, totaltrans, i, comstate, comfreq, targ;
+ int symlist[CSIZE + 1];
+ int num_start_states;
+ int todo_head, todo_next;
+
+ /* Note that the following are indexed by *equivalence classes*
+ * and not by characters. Since equivalence classes are indexed
+ * beginning with 1, even if the scanner accepts NUL's, this
+ * means that (since every character is potentially in its own
+ * equivalence class) these arrays must have room for indices
+ * from 1 to CSIZE, so their size must be CSIZE + 1.
+ */
+ int duplist[CSIZE + 1], state[CSIZE + 1];
+ int targfreq[CSIZE + 1], targstate[CSIZE + 1];
- for ( i = 0; i <= num_rules; ++i )
- accset[i] = NIL;
+ accset = allocate_integer_array( num_rules + 1 );
+ nset = allocate_integer_array( current_max_dfa_size );
- if ( trace )
- {
- dumpnfa( scset[1] );
- fputs( "\n\nDFA Dump:\n\n", stderr );
- }
+ /* The "todo" queue is represented by the head, which is the DFA
+ * state currently being processed, and the "next", which is the
+ * next DFA state number available (not in use). We depend on the
+ * fact that snstods() returns DFA's \in increasing order/, and thus
+ * need only know the bounds of the dfas to be processed.
+ */
+ todo_head = todo_next = 0;
- inittbl();
-
- /* check to see whether we should build a separate table for transitions
- * on NUL characters. We don't do this for full-speed (-F) scanners,
- * since for them we don't have a simple state number lying around with
- * which to index the table. We also don't bother doing it for scanners
- * unless (1) NUL is in its own equivalence class (indicated by a
- * positive value of ecgroup[NUL]), (2) NUL's equilvalence class is
- * the last equivalence class, and (3) the number of equivalence classes
- * is the same as the number of characters. This latter case comes about
- * when useecs is false or when its true but every character still
- * manages to land in its own class (unlikely, but it's cheap to check
- * for). If all these things are true then the character code needed
- * to represent NUL's equivalence class for indexing the tables is
- * going to take one more bit than the number of characters, and therefore
- * we won't be assured of being able to fit it into a YY_CHAR variable.
- * This rules out storing the transitions in a compressed table, since
- * the code for interpreting them uses a YY_CHAR variable (perhaps it
- * should just use an integer, though; this is worth pondering ... ###).
- *
- * Finally, for full tables, we want the number of entries in the
- * table to be a power of two so the array references go fast (it
- * will just take a shift to compute the major index). If encoding
- * NUL's transitions in the table will spoil this, we give it its
- * own table (note that this will be the case if we're not using
- * equivalence classes).
- */
-
- /* note that the test for ecgroup[0] == numecs below accomplishes
- * both (1) and (2) above
- */
- if ( ! fullspd && ec