Skip to content
This repository has been archived by the owner on Sep 24, 2019. It is now read-only.

Function ragel FSM is not used #64

Closed
abargnesi opened this issue May 17, 2016 · 3 comments
Closed

Function ragel FSM is not used #64

abargnesi opened this issue May 17, 2016 · 3 comments

Comments

@abargnesi
Copy link
Member

abargnesi commented May 17, 2016

The function ragel FSM is not used anywhere.

Additionally it defines term as alpha+ excluding underscore. This is at odds with parsing terms.

$ rlwrap ruby -I lib lib/bel_parser/parsers/common/function.rb
protein_abundance
s(:function,
  s(:identifier, "protein_"))

$ rlwrap ruby -I lib lib/bel_parser/parsers/expression/term.rb
protein_abundance(HGNC:AKT1)
s(:term,
  s(:function,
    s(:identifier, "protein_abundance")),
  s(:argument,
    s(:parameter,
      s(:prefix,
        s(:identifier, "HGNC")),
      s(:value,
        s(:identifier, "AKT1")))))

Should we update the definition in common/function.rl or just remove it?

/cc @nbargnesi

@abargnesi abargnesi changed the title Function machine definition use of "alpha" may be to constrained. Function ragel FSM is not used May 17, 2016
@nbargnesi
Copy link
Member

It was being used in the rewritten term machines at one point. It made those term machines much easier to follow. If it still has a place there, we can keep it. If not, drop it.

@abargnesi
Copy link
Member Author

I agree that a specific parse type makes it easier to follow. We should keep the definition and reuse it in term. However, we should change the common.rl FUNCTION definition from the build-in machine alpha+ to something more liberal.

@abargnesi abargnesi self-assigned this May 17, 2016
@abargnesi
Copy link
Member Author

Redefined function to a specific ASCII range (excluding parenthesis chars), but compiling statement machines hangs. The compilation consistently consumes memory and never completes.

Diff:

diff --git a/lib/bel_parser/parsers/common/common.rl b/lib/bel_parser/parsers/common/common.rl
index de2f49b..54dd305 100644
--- a/lib/bel_parser/parsers/common/common.rl
+++ b/lib/bel_parser/parsers/common/common.rl
@@ -7,16 +7,16 @@ machine bel;
   WS          = space;
   EQL         = '=';
   NUMBER_SIGN = '#';
-  SQ             = "'";
+  SQ          = "'";
   DQ          = '"';
   COLON       = ':';
   ESCAPED     = /\\./;
-  NOT_SQESC      = [^'\\];
-  NOT_DQESC      = [^"\\];
+  NOT_SQESC   = [^'\\];
+  NOT_DQESC   = [^"\\];
   COMMA_DELIM = SP* ',' SP*;
   SS          = '//';
-  ID_CHARS   = [a-zA-Z0-9_]+;
-  LP         = '(';
+  ID_CHARS    = [a-zA-Z0-9_]+;
+  LP          = '(';
   RP          = ')';

   KW_SET = /SET/i;
@@ -63,9 +63,10 @@ machine bel;
     alpha+
     ;

-   FUNCTION =
-       alpha+
-       ;
+  # Excludes 0x28 (open parenthesis) and 0x29 (close parenthesis).
+  FUNCTION =
+    (0x21..0x27 | 0x2a..0x7e)+
+    ;
 }%%
 =end
 # vim: ts=2 sw=2 ft=ragel:
diff --git a/lib/bel_parser/parsers/expression/term.rl b/lib/bel_parser/parsers/expression/term.rl
index ac28863..f89cd69 100644
--- a/lib/bel_parser/parsers/expression/term.rl
+++ b/lib/bel_parser/parsers/expression/term.rl
@@ -3,23 +3,9 @@
 %%{
   machine bel;

+  include 'function.rl';
   include 'parameter.rl';

-  action start_function {
-    $stderr.puts 'TERM start_function'
-    @buffers[:function] = []
-  }
-
-  action append_function {
-    $stderr.puts 'TERM append_function'
-    @buffers[:function] << fc
-  }
-
-  action finish_function {
-    $stderr.puts 'TERM finish_function'
-    @buffers[:function] = identifier(utf8_string(@buffers[:function]))
-  }
-
   action term_init {
     $stderr.puts 'TERM term_init'
     @buffers[:term_stack] = [ term() ]
@@ -32,9 +18,7 @@

   action term_fx {
     $stderr.puts 'TERM term_fx'
-    fx = @buffers[:function]
-    fx_node = function(fx)
-    new_term = @buffers[:term_stack][-1] << fx_node
+    new_term = @buffers[:term_stack][-1] << @buffers[:function]
     @buffers[:term_stack][-1] = new_term
   }

@@ -48,7 +32,8 @@

   action fxbt {
     $stderr.puts 'TERM fxbt'
-    fpc -= @buffers[:function].length + 1
+    function_string = @buffers[:function].identifier.string_literal
+    fpc -= function_string.length + 1
     fcall inner_term;
   }

@@ -78,35 +63,35 @@
   }

   inner_term :=
-    an_ident >inner_term_init >start_function $append_function %finish_function
+    a_function >inner_term_init
     SP*
     '(' @term_fx $eof{ $stderr.puts "EOF!" }
       (
         a_parameter %term_argument |
-        an_ident >start_function $append_function '(' @fxbt
+        a_function '(' @fxbt
       )
       (
         COMMA_DELIM
         (
           a_parameter %term_argument |
-          an_ident >start_function $append_function '(' @fxbt
+          a_function '(' @fxbt
         )
       )*
     ')' @fxret;

   outer_term =
-    an_ident >term_init >start_function $append_function %finish_function
+    a_function >term_init
     SP*
     '(' @term_fx
       (
         a_parameter %term_argument $eof(eof_parameter_argument) |
-        an_ident >start_function $append_function '(' @fxbt
+        a_function '(' @fxbt
       )
       (
         COMMA_DELIM
         (
           a_parameter %term_argument $eof(eof_parameter_argument) |
-          an_ident >start_function $append_function '(' @fxbt
+          a_function '(' @fxbt
         )
       )*
     ')';

@abargnesi abargnesi removed the ready label May 17, 2016
abargnesi pushed a commit that referenced this issue May 17, 2016
Excluded open paren, (, due to ambiguity with term.

Excluded single and double quote due to ragel machine compilation
hanging.

Excluded space due to leading indentation feature (refs #36).

refs #64
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants