Unexpected exception encountered in JFlex #974

Closed
JefMihael opened this issue Nov 10, 2022 · 3 comments · Fixed by #985

@JefMihael

scanner.flex

error.txt

I ran into a problem trying to generate a lexer from this .flex file (attached above, together with the error output).
JFlex version: jflex-1.8.2

@lsf37
Member

lsf37 commented Nov 10, 2022

Thanks for reporting this, definitely looks like a bug. For discussion later, I've pasted the files inline here:

package sintatico;


import sintatico.sym;
//import static lexico.Tokens.*;
%%

//%public
%cup
%full
%line
%char
%ignorecase 
%eofval{
    return new Symbol(sym.EOF, new String("Fim do Arquivo"));
%eofval}
//%class AnalisadorLexico
//%type Tokens   
BRANCO = [ \n\t\r]
/*%{
public String lexema;
/*public int ContaToken;*/
%}*/

ID = [_|a-z|A-Z][a-z|A-Z|0-9_]*
INTEIRO = 0|[1-9][0-9]*
%%

("se")		{return new Symbol(sym.SE, yychar, yyline, yytext());}
("senao")	{return new Symbol(sym.SENAO, yychar, yyline, yytext());}
("senaose")	{return new Symbol(sym.SENAOSE, yychar, yyline, yytext());}
("enquanto")	{return new Symbol(sym.ENQUANTO, yychar, yyline, yytext());}
("faca")	{return new Symbol(sym.FACA, yychar, yyline, yytext());}
("para")	{return new Symbol(sym.PARA, yychar, yyline, yytext());}
("+")		{return new Symbol(sym.MAIS, yychar, yyline, yytext());}
("-")		{return new Symbol(sym.MENOS, yychar, yyline, yytext());}
("*")		{return new Symbol(sym.VEZES, yychar, yyline, yytext());}
("/")		{return new Symbol(sym.DIVISAO, yychar, yyline, yytext());}
("=")		{return new Symbol(sym.IGUAL, yychar, yyline, yytext());}
("<")		{return new Symbol(sym.MENOR, yychar, yyline, yytext());}
(">")		{return new Symbol(sym.MAIOR, yychar, yyline, yytext());}
("<=")          {return new Symbol(sym.MENORIGUAL, yychar, yyline, yytext());}
(">=")		{return new Symbol(sym.MAIORIGUAL, yychar, yyline, yytext());}
(";")		{return new Symbol(sym.PTVIRG, yychar, yyline, yytext());}
("{")           {return new Symbol(sym.ACHAVE, yychar, yyline, yytext());}
("}")           {return new Symbol(sym.FCHAVE, yychar, yyline, yytext());}
("(")           {return new Symbol(sym.APARENT, yychar, yyline, yytext());}
(")")           {return new Symbol(sym.FPARENT, yychar, yyline, yytext());}
{INTEIRO}	{return new Symbol(sym.INTEIRO, yychar, yyline, yytext());}
{ID}		{return new Symbol(sym.ID, yychar, yyline, yytext());}
{BRANCO}	{}
. {System.err.println("Caractere Ilegal: " + yytext());}

Error:

PS C:\Users\jefh_\OneDrive\Área de Trabalho\Compiladores\Compilador\src\sintatico> jflex scanner.flex
Reading "scanner.flex"
Constructing NFA :
Unexpected exception encountered. This indicates a bug in JFlex.
Please consider filing an issue at http://github.com/jflex-de/jflex/issues/new


Index 31 out of bounds for length 31
java.lang.IndexOutOfBoundsException: Index 31 out of bounds for length 31
        at java.base/jdk.internal.util.Preconditions.outOfBounds(Preconditions.java:100)
        at java.base/jdk.internal.util.Preconditions.outOfBoundsCheckIndex(Preconditions.java:106)
        at java.base/jdk.internal.util.Preconditions.checkIndex(Preconditions.java:302)
        at java.base/java.util.Objects.checkIndex(Objects.java:385)
        at java.base/java.util.ArrayList.get(ArrayList.java:427)
        at jflex.core.unicode.CharClasses.getClassCode(CharClasses.java:191)
        at jflex.core.NFA.insertStringNFA(NFA.java:544)
        at jflex.core.NFA.insertNFA(NFA.java:912)
        at jflex.core.NFA.addRegExp(NFA.java:187)
        at jflex.core.LexParse$CUP$LexParse$actions.CUP$LexParse$do_action_part00000000(LexParse.java:1057)
        at jflex.core.LexParse$CUP$LexParse$actions.CUP$LexParse$do_action(LexParse.java:2257)
        at jflex.core.LexParse.do_action(LexParse.java:598)
        at java_cup.runtime.lr_parser.parse(lr_parser.java:699)
        at jflex.generator.LexGenerator.generate(LexGenerator.java:74)
        at jflex.Main.generate(Main.java:320)
        at jflex.Main.main(Main.java:336)

@lsf37 lsf37 added the bug Not working as intended label Nov 10, 2022
@lsf37 lsf37 self-assigned this Nov 10, 2022
@lsf37
Member

lsf37 commented Dec 29, 2022

A small spec that produces the same exception:

%%

%full
%ignorecase

%%

s {}
sn {}

@lsf37
Member

lsf37 commented Dec 29, 2022

And even smaller:

%%

%full
%ignorecase

%%

sb {}

It looks like one of the case-insensitive variants of s is not in the character set allowed by %full.
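
To make that observation concrete, here is a minimal Java sketch. It is not JFlex code, and the class and method names are made up for illustration. It lists every BMP code point whose simple case mapping lands on the same letter as s, and flags those above 0xFF, assuming that %full limits the input character set to codes 0..255. On a standard JDK the list is expected to include U+017F (LATIN SMALL LETTER LONG S), whose uppercase mapping is S.

```java
import java.util.ArrayList;
import java.util.List;

public class CaselessVariants {
    // Assumption: %full / %8bit limits the input character set to codes 0..255.
    static final int MAX_8BIT = 0xFF;

    // Collect all BMP code points that case-map to the same letter as c.
    // This only approximates a caseless class via Java's simple case mappings.
    static List<Integer> caselessVariants(char c) {
        List<Integer> result = new ArrayList<>();
        int upper = Character.toUpperCase((int) c);
        int lower = Character.toLowerCase((int) c);
        for (int cp = 0; cp <= 0xFFFF; cp++) {
            if (Character.toUpperCase(cp) == upper || Character.toLowerCase(cp) == lower) {
                result.add(cp);
            }
        }
        return result;
    }

    public static void main(String[] args) {
        for (int cp : caselessVariants('s')) {
            String note = cp > MAX_8BIT ? " (outside 8-bit range)" : "";
            System.out.printf("U+%04X%s%n", cp, note);
        }
    }
}
```

Any variant flagged as outside the 8-bit range is a candidate for the character that triggers the out-of-bounds lookup under %full with %ignorecase.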

@lsf37 lsf37 added this to the 1.9.0 milestone Dec 30, 2022
lsf37 added a commit that referenced this issue Dec 30, 2022
The lexer spec can mention characters that are not in the input set
(e.g. for %7bit or %8bit). In particular, in caseless matching, the
caseless class might contain such characters.

Make getClassCode() robust against this situation, and ignore such
characters when we add transitions.

Fixes #974
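
The idea in this commit message can be sketched roughly as follows. This is a simplified illustration, not the actual JFlex patch: the class, the NO_CLASS sentinel, the per-code-point table, and the placeholder class assignment are all assumptions made for the example. The lookup returns a sentinel for characters outside the input set instead of throwing, and the caller skips those characters when collecting transitions.

```java
import java.util.ArrayList;
import java.util.List;

public class RobustClassLookup {
    // Hypothetical sentinel meaning "character is not in the input set".
    static final int NO_CLASS = -1;

    // Hypothetical table: one class code per code point in the input set.
    private final List<Integer> classCodes = new ArrayList<>();

    RobustClassLookup(int maxCharCode) {
        for (int cp = 0; cp <= maxCharCode; cp++) {
            classCodes.add(cp % 4); // placeholder class assignment for the sketch
        }
    }

    // Bounds-checked lookup: no exception for characters outside the input set.
    int getClassCode(int codePoint) {
        if (codePoint < 0 || codePoint >= classCodes.size()) {
            return NO_CLASS;
        }
        return classCodes.get(codePoint);
    }

    // Collect the class codes that would get transitions, ignoring
    // caseless variants that fall outside the input set.
    List<Integer> transitionClasses(int[] caselessVariants) {
        List<Integer> classes = new ArrayList<>();
        for (int cp : caselessVariants) {
            int code = getClassCode(cp);
            if (code != NO_CLASS) {
                classes.add(code);
            }
        }
        return classes;
    }

    public static void main(String[] args) {
        RobustClassLookup lookup = new RobustClassLookup(0xFF); // 8-bit input set
        int[] variantsOfS = {'S', 's', 0x17F};                  // U+017F is outside 0..255
        System.out.println(lookup.transitionClasses(variantsOfS).size() + " transitions added");
    }
}
```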