[mlir] Change end of OperationDefinition. #77273

jpienaar · 2024-01-08T05:58:12Z

Store the last token parsed in the parser state so that the parsed range can use that token's end rather than the start of the token that follows it. This results in a tighter range (especially when comments follow the operation; see https://gist.github.com/jpienaar/9598339b504157b189c3a3c38314a703 for an example of the effect of the change).

Discovered while working on a little textual post processing tool.
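
To illustrate why the end of the last consumed token gives a tighter range than the start of the next token, here is a minimal standalone sketch. It does not use the MLIR parser: `Tok`, `ToyLexer`, `ToyParser`, `Ends`, and `parseOperation` are hypothetical names made up for this example, and the lexer assumes whitespace-separated tokens with `//` line comments. Only the bookkeeping in `consumeToken` mirrors the idea of this change.

```cpp
#include <cassert>
#include <cctype>
#include <cstddef>
#include <string>
#include <string_view>

// Toy token: a half-open byte range [begin, end) into the source buffer.
struct Tok {
  std::size_t begin = 0;
  std::size_t end = 0;
  bool isEof = false;
};

// Toy lexer (an assumption for this sketch): tokens are whitespace-separated
// words, and `//` comments run to the end of the line and are skipped.
class ToyLexer {
public:
  explicit ToyLexer(std::string_view src) : src(src) {}

  Tok lex() {
    skipTrivia();
    if (pos >= src.size())
      return Tok{src.size(), src.size(), true};
    std::size_t start = pos;
    while (pos < src.size() &&
           !std::isspace(static_cast<unsigned char>(src[pos])))
      ++pos;
    return Tok{start, pos, false};
  }

private:
  // Skip whitespace and line comments preceding the next token.
  void skipTrivia() {
    while (pos < src.size()) {
      if (std::isspace(static_cast<unsigned char>(src[pos]))) {
        ++pos;
      } else if (src.compare(pos, 2, "//") == 0) {
        while (pos < src.size() && src[pos] != '\n')
          ++pos;
      } else {
        break;
      }
    }
  }

  std::string_view src;
  std::size_t pos = 0;
};

// Toy parser that remembers the last consumed token next to the current one.
class ToyParser {
public:
  explicit ToyParser(std::string_view src) : lexer(src), cur(lexer.lex()) {}

  const Tok &getToken() const { return cur; }
  const Tok &getLastToken() const { return last; }

  void consumeToken() {
    assert(!cur.isEof && "shouldn't advance past EOF");
    last = cur; // Remember the token being consumed.
    cur = lexer.lex();
  }

private:
  ToyLexer lexer;
  Tok cur;
  Tok last;
};

// The two candidate end offsets for a parsed range.
struct Ends {
  std::size_t nextTokenStart; // End taken from the start of the next token.
  std::size_t lastTokenEnd;   // End taken from the last consumed token.
};

// Consume `numTokens` tokens as one "operation" and report both end offsets.
Ends parseOperation(ToyParser &parser, int numTokens) {
  for (int i = 0; i < numTokens; ++i)
    parser.consumeToken();
  return Ends{parser.getToken().begin, parser.getLastToken().end};
}
```

For a source such as `%c4 = arith.constant 4 : index // Foo` followed by a second operation on the next line, consuming the six tokens of the first operation leaves `lastTokenEnd` immediately after `index`, while `nextTokenStart` points at the beginning of the second operation and therefore includes the trailing comment and newline.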

llvmbot · 2024-01-08T05:58:40Z

@llvm/pr-subscribers-mlir

@llvm/pr-subscribers-mlir-core

Author: Jacques Pienaar (jpienaar)

Full diff: https://github.com/llvm/llvm-project/pull/77273.diff

3 Files Affected:

(modified) mlir/lib/AsmParser/Parser.cpp (+7-5)
(modified) mlir/lib/AsmParser/Parser.h (+5)
(modified) mlir/lib/AsmParser/ParserState.h (+5-2)

diff --git a/mlir/lib/AsmParser/Parser.cpp b/mlir/lib/AsmParser/Parser.cpp
index 3aa9adcbe1c5df..00f2b0c0c2f12f 100644
--- a/mlir/lib/AsmParser/Parser.cpp
+++ b/mlir/lib/AsmParser/Parser.cpp
@@ -1209,7 +1209,7 @@ ParseResult OperationParser::parseOperation() {
         resultIt += std::get<1>(record);
       }
       state.asmState->finalizeOperationDefinition(
-          op, nameTok.getLocRange(), /*endLoc=*/getToken().getLoc(),
+          op, nameTok.getLocRange(), /*endLoc=*/getLastToken().getEndLoc(),
           asmResultGroups);
     }
 
@@ -1225,8 +1225,9 @@ ParseResult OperationParser::parseOperation() {
 
     // Add this operation to the assembly state if it was provided to populate.
   } else if (state.asmState) {
-    state.asmState->finalizeOperationDefinition(op, nameTok.getLocRange(),
-                                                /*endLoc=*/getToken().getLoc());
+    state.asmState->finalizeOperationDefinition(
+        op, nameTok.getLocRange(),
+        /*endLoc=*/getLastToken().getEndLoc());
   }
 
   return success();
@@ -1500,8 +1501,9 @@ Operation *OperationParser::parseGenericOperation(Block *insertBlock,
   // If we are populating the parser asm state, finalize this operation
   // definition.
   if (state.asmState)
-    state.asmState->finalizeOperationDefinition(op, nameToken.getLocRange(),
-                                                /*endLoc=*/getToken().getLoc());
+    state.asmState->finalizeOperationDefinition(
+        op, nameToken.getLocRange(),
+        /*endLoc=*/getLastToken().getEndLoc());
   return op;
 }
 
diff --git a/mlir/lib/AsmParser/Parser.h b/mlir/lib/AsmParser/Parser.h
index 01c55f97a08c2c..b959e67b8e2583 100644
--- a/mlir/lib/AsmParser/Parser.h
+++ b/mlir/lib/AsmParser/Parser.h
@@ -102,6 +102,9 @@ class Parser {
   const Token &getToken() const { return state.curToken; }
   StringRef getTokenSpelling() const { return state.curToken.getSpelling(); }
 
+  /// Return the last parsed token.
+  const Token &getLastToken() const { return state.lastToken; }
+
   /// If the current token has the specified kind, consume it and return true.
   /// If not, return false.
   bool consumeIf(Token::Kind kind) {
@@ -115,6 +118,7 @@ class Parser {
   void consumeToken() {
     assert(state.curToken.isNot(Token::eof, Token::error) &&
            "shouldn't advance past EOF or errors");
+    state.lastToken = state.curToken;
     state.curToken = state.lex.lexToken();
   }
 
@@ -129,6 +133,7 @@ class Parser {
   /// Reset the parser to the given lexer position.
   void resetToken(const char *tokPos) {
     state.lex.resetPointer(tokPos);
+    state.lastToken = state.curToken;
     state.curToken = state.lex.lexToken();
   }
 
diff --git a/mlir/lib/AsmParser/ParserState.h b/mlir/lib/AsmParser/ParserState.h
index 1428ea3a82cee9..159058a18fa4e1 100644
--- a/mlir/lib/AsmParser/ParserState.h
+++ b/mlir/lib/AsmParser/ParserState.h
@@ -54,8 +54,8 @@ struct ParserState {
               AsmParserCodeCompleteContext *codeCompleteContext)
       : config(config),
         lex(sourceMgr, config.getContext(), codeCompleteContext),
-        curToken(lex.lexToken()), symbols(symbols), asmState(asmState),
-        codeCompleteContext(codeCompleteContext) {}
+        curToken(lex.lexToken()), lastToken(Token::error, ""), symbols(symbols),
+        asmState(asmState), codeCompleteContext(codeCompleteContext) {}
   ParserState(const ParserState &) = delete;
   void operator=(const ParserState &) = delete;
 
@@ -68,6 +68,9 @@ struct ParserState {
   /// This is the next token that hasn't been consumed yet.
   Token curToken;
 
+  /// This is the last token that has been consumed.
+  Token lastToken;
+
   /// The current state for symbol parsing.
   SymbolState &symbols;

joker-eph · 2024-01-08T10:38:23Z

No easy way to see this on a test at the moment I guess?

Mogball · 2024-01-08T18:22:18Z

Should help the LSP, no?

jpienaar · 2024-01-10T18:38:41Z

Correct, yes, but no test exercises this. I have a follow-up change where I'll verify this in a simple way.

Store the last token parsed in the parser state so that the parsed range can use that token's end rather than the start of the token that follows it. This results in a tighter range (especially when comments follow the operation), compare

```mlir
|%c4 = arith.constant 4 : index // Foo |
```

vs

```mlir
|%c4 = arith.constant 4 : index|
```

Discovered while working on a little textual post processing tool.

jpienaar requested review from joker-eph, Mogball and River707 January 8, 2024 05:58

llvmbot added the mlir:core (MLIR Core Infrastructure) and mlir labels Jan 8, 2024

joker-eph approved these changes Jan 8, 2024

Mogball approved these changes Jan 8, 2024

jpienaar merged commit c1d02bd into llvm:main Jan 10, 2024
6 checks passed
