Add a lexer for untokenised BBC BASIC files #1280

Merged
merged 21 commits into from
Aug 2, 2019
21 commits
23a6c68
Add a lexer for untokenised BBC BASIC files
bavison Jan 14, 2019
f559c65
[bbcbasic] only colour the leading * of inline CLI command as Generic…
bavison Jul 30, 2019
c3eefd0
[bbcbasic] use correct method for control keywords
bavison Jul 30, 2019
b8b3aeb
[bbcbasic] remove unnecessary escapes within character ranges
bavison Jul 30, 2019
96c5780
[bbcbasic] use dedicated token for binary numbers
bavison Jul 30, 2019
1629f21
[bbcbasic] deduplicate some rules between :root and :assembly2 using …
bavison Jul 31, 2019
8662327
[bbcbasic] where one keyword is a substring of another, list longer o…
bavison Jul 31, 2019
bae4a5f
[bbcbasic] exercise more rules in visual spec
bavison Jul 31, 2019
eb0390b
[bbcbasic] colour CLI command introducer as Keyword
bavison Jul 31, 2019
db27bd2
[bbcbasic] imperative ERROR keyword needs to be captured at higher pr…
bavison Jul 31, 2019
918e1ec
[bbcbasic] attempt to reduce indcidences of * operator matching CLI c…
bavison Jul 31, 2019
46171d4
[bbcbasic] simplify expression states
bavison Aug 1, 2019
a8861a1
[bbcbasic] fix `*` operators being misidentified as inline commands
bavison Aug 1, 2019
b030693
[bbcbasic] add `o` modifiers to all regexps
bavison Aug 1, 2019
40f57d3
[bbcbasic] stricter checking of control flow statements
bavison Aug 1, 2019
33d500f
[bbcbasic] further simplification
bavison Aug 1, 2019
50f3d0f
[bbcbasic] improvements to handling of `PROC`
bavison Aug 1, 2019
3d84b20
[bbcbasic] treat `FN` as a built-in function
bavison Aug 1, 2019
cce0cf4
[bbcbasic] use a different approach for detecting CLI commands
bavison Aug 1, 2019
62819fd
[bbcbasic] handle multiple or newlines leading up to a command
bavison Aug 1, 2019
d70c214
[bbcbasic] fix whitespace requirement after keyword
bavison Aug 1, 2019
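
Note on commit 8662327: the lexer builds its keyword patterns by joining word lists with `|`, and Ruby tries the alternatives left to right, so a keyword that is a prefix of a longer one shadows it unless the longer one comes first. The following is a minimal sketch in plain Ruby, not code from this PR; the trimmed lists `SHORTER_FIRST` and `LONGER_FIRST` and the helper `first_keyword` are hypothetical names used only for illustration.

```ruby
# Hypothetical, trimmed keyword lists; the lexer's real tables are much longer.
SHORTER_FIRST = %w(END ENDPROC ENDWHILE).freeze
LONGER_FIRST  = %w(ENDPROC ENDWHILE END).freeze

# Build an alternation by joining the words with '|', as the lexer does,
# and return the first keyword the pattern finds in the source text.
def first_keyword(words, source)
  source[/#{words.join('|')}/]
end

puts first_keyword(SHORTER_FIRST, "ENDPROC") # END (the shorter word wins)
puts first_keyword(LONGER_FIRST, "ENDPROC")  # ENDPROC
```

This is presumably why lists such as `ENDCASE ENDIF ENDPROC ENDWHILE END` in `bbcbasic.rb` place `END` after the longer keywords.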
6 changes: 6 additions & 0 deletions lib/rouge/demos/bbcbasic
@@ -0,0 +1,6 @@
REM > DefaultFilename
REM Ordinary comment
FOR n=1 TO 10
PRINTTAB(n)"Hello there ";FNnumber(n)DIV3+1
NEXT:END
DEFFNnumber(x%)=ABS(x%-4)
112 changes: 112 additions & 0 deletions lib/rouge/lexers/bbcbasic.rb
@@ -0,0 +1,112 @@
# -*- coding: utf-8 -*- #
# frozen_string_literal: true

module Rouge
  module Lexers
    class BBCBASIC < RegexLexer
      title "BBCBASIC"
      desc "BBC BASIC syntax"
      tag 'bbcbasic'
      filenames '*,fd1'

      def self.punctuation
        @punctuation ||= %w(
          [,;'~] SPC TAB
        )
      end

      def self.function
        @function ||= %w(
          ABS ACS ADVAL ASC ASN ATN BEATS BEAT BGET# CHR\$ COS COUNT DEG DIM
          EOF# ERL ERR EVAL EXP EXT# FN GET\$# GET\$ GET HIMEM INKEY\$ INKEY
          INSTR INT LEFT\$ LEN LN LOG LOMEM MID\$ OPENIN OPENOUT OPENUP PAGE
          POINT POS PTR# RAD REPORT\$ RIGHT\$ RND SGN SIN SQR STR\$ STRING\$ SUM
          SUMLEN TAN TEMPO TIME\$ TIME TOP USR VAL VPOS
        )
      end

      def self.statement
        @statement ||= %w(
          BEATS BPUT# CALL CASE CHAIN CLEAR CLG CLOSE# CLS COLOR COLOUR DATA
          ELSE ENDCASE ENDIF ENDPROC ENDWHILE END ENVELOPE FOR GCOL GOSUB GOTO
          IF INSTALL LET LIBRARY MODE NEXT OFF OF ON ORIGIN OSCI OTHERWISE
          OVERLAY PLOT PRINT# PRINT PROC QUIT READ REPEAT REPORT RETURN SOUND
          STEP STEREO STOP SWAP SYS THEN TINT TO VDU VOICES VOICE UNTIL WAIT
          WHEN WHILE WIDTH
        )
      end

      def self.operator
        @operator ||= %w(
          << <= <> < >= >>> >> > [-!$()*+/=?^|] AND DIV EOR MOD NOT OR
        )
      end

      def self.constant
        @constant ||= %w(
          FALSE TRUE
        )
      end

      state :expression do
        rule %r/#{BBCBASIC.function.join('|')}/o, Name::Builtin # function or pseudo-variable
        rule %r/#{BBCBASIC.operator.join('|')}/o, Operator
        rule %r/#{BBCBASIC.constant.join('|')}/o, Name::Constant
        rule %r/"[^"]*"/o, Literal::String
        rule %r/[a-z_`][\w`]*[$%]?/io, Name::Variable
        rule %r/@%/o, Name::Variable
        rule %r/[\d.]+/o, Literal::Number
        rule %r/%[01]+/o, Literal::Number::Bin
        rule %r/&[\h]+/o, Literal::Number::Hex
      end

      state :root do
        rule %r/(:+)( *)(\*)(.*)/ do
          groups Punctuation, Text, Keyword, Text # CLI command
        end
        rule %r/(\n+ *)(\*)(.*)/ do
          groups Text, Keyword, Text # CLI command
        end
        rule %r/(ELSE|OTHERWISE|REPEAT|THEN)( *)(\*)(.*)/ do
          groups Keyword, Text, Keyword, Text # CLI command
        end
        rule %r/[ \n]+/o, Text
        rule %r/:+/o, Punctuation
        rule %r/[\[]/o, Keyword, :assembly1
        rule %r/REM *>.*/o, Comment::Special
        rule %r/REM.*/o, Comment
        rule %r/(?:#{BBCBASIC.statement.join('|')}|CIRCLE(?: *FILL)?|DEF *(?:FN|PROC)|DRAW(?: *BY)?|DIM(?!\()|ELLIPSE(?: *FILL)?|ERROR(?: *EXT)?|FILL(?: *BY)?|INPUT(?:#| *LINE)?|LINE(?: *INPUT)?|LOCAL(?: *DATA| *ERROR)?|MOUSE(?: *COLOUR| *OFF| *ON| *RECTANGLE| *STEP| *TO)?|MOVE(?: *BY)?|ON(?! *ERROR)|ON *ERROR *(?:LOCAL|OFF)?|POINT(?: *BY)?(?!\()|RECTANGLE(?: *FILL)?|RESTORE(?: *DATA| *ERROR)?|TRACE(?: *CLOSE| *ENDPROC| *OFF| *STEP(?: *FN| *ON| *PROC)?| *TO)?)/o, Keyword
        mixin :expression
        rule %r/#{BBCBASIC.punctuation.join('|')}/o, Punctuation
      end

      # Assembly statements are parsed as
      #   {label} {directive|opcode |']' {expressions}} {comment}
      # Technically, you don't need whitespace between opcodes and arguments,
      # but this is rare in uncrunched source and trying to enumerate all
      # possible opcodes here is impractical so we colour it as though
      # the whitespace is required. Opcodes and directives can only easily be
      # distinguished from the symbols that make up expressions by looking at
      # their position within the statement. Similarly, ']' is treated as a
      # keyword at the start of a statement or as punctuation elsewhere. This
      # requires a two-state state machine.

      state :assembly1 do
        rule %r/ +/o, Text
        rule %r/]/o, Keyword, :pop!
        rule %r/[:\n]/o, Punctuation
        rule %r/\.[a-z_`][\w`]*%? */io, Name::Label
        rule %r/(?:REM|;)[^:\n]*/o, Comment
        rule %r/[^ :\n]+/o, Keyword, :assembly2
      end

      state :assembly2 do
        rule %r/ +/o, Text
        rule %r/[:\n]/o, Punctuation, :pop!
        rule %r/(?:REM|;)[^:\n]*/o, Comment, :pop!
        mixin :expression
        rule %r/[!#,@\[\]^{}]/, Punctuation
      end
    end
  end
end
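
The comment above the `:assembly1` state describes telling opcodes from operands purely by their position in the statement. The following is a rough standalone sketch of that idea using Ruby's `StringScanner`. It is not the Rouge implementation: the helper `classify_assembly` and the token names `:label`, `:opcode`, `:operand` and `:comment` are hypothetical, and the sketch treats each operand run as a single chunk instead of lexing the expression inside it.

```ruby
require 'strscan'

# Hypothetical helper, not part of Rouge: classify the pieces of an assembly
# line by position, using two states like :assembly1 and :assembly2.
def classify_assembly(line)
  scanner = StringScanner.new(line)
  state = :assembly1 # start of statement: label, opcode or directive expected
  tokens = []
  until scanner.eos?
    if scanner.skip(/ +/)
      # whitespace separates fields but is not recorded
    elsif scanner.scan(/[:\n]/)
      state = :assembly1 # statement separator: the next word is an opcode again
    elsif scanner.scan(/(?:REM|;)[^:\n]*/)
      tokens << [:comment, scanner.matched]
      state = :assembly1 # a comment runs up to the next colon or newline
    elsif state == :assembly1 && scanner.scan(/\.[a-z_`][\w`]*%?/i)
      tokens << [:label, scanner.matched]
    elsif state == :assembly1
      tokens << [:opcode, scanner.scan(/[^ :\n]+/)]
      state = :assembly2 # whatever follows the opcode is an operand
    else
      tokens << [:operand, scanner.scan(/[^ ;:\n]+/)]
    end
  end
  tokens
end

p classify_assembly('.label2% LDR r0,[P%MOD2,#0]')
# [[:label, ".label2%"], [:opcode, "LDR"], [:operand, "r0,[P%MOD2,#0]"]]
```

Because the first non-label word of each statement is always taken as the opcode, no table of instruction mnemonics is needed, which matches the trade-off the comment describes.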
14 changes: 14 additions & 0 deletions spec/lexers/bbcbasic_spec.rb
@@ -0,0 +1,14 @@
# -*- coding: utf-8 -*- #
# frozen_string_literal: true

describe Rouge::Lexers::BBCBASIC do
  let(:subject) { Rouge::Lexers::BBCBASIC.new }

  describe 'guessing' do
    include Support::Guessing

    it 'guesses by filename' do
      assert_guess :filename => 'foo,fd1'
    end
  end
end
44 changes: 44 additions & 0 deletions spec/visual/samples/bbcbasic
@@ -0,0 +1,44 @@
REM > DefaultFilename
PRINT:REM Ordinary comment: This is still a comment
ONERRORPRINTERR:END
ERROR1,"error"
IF2*2=4ELSE*Cat:4.$
FOR n=1 TO 10
PRINTTAB(n)"Hello there ";FNnumber(n)DIV3+1'INKEY$(500)INKEY(500)
NEXT
DIM code% 100
FOR opt=0 TO 3 STEP 3
P%=code%
[OPT opt
.label1%
.label2% LDR r0,[P%MOD2,#0]
MOV pc,r14:REM comments in assembly terminate at colon:EQUD -1
.label3;comment
ALIGN
.label4
]
NEXT
CALL code%:PRINT USR(code%)
DIM`(100,100):PRINTDIM(`())
X=&FEDCBA98:PRINT X,~X,X>>>1,X>>1,X>1
PRINT %1010%1111:REM Prints "10" then "15"
A%=@%:@%="F10.2":PRINT~@%:@%=A%
ELLIPSE FILL 100,100,100,50,45
MOUSEOFF
DIM block% 100
block%!2*4=0
CASE thing OF
WHEN 1:PRINT"one"
OTHERWISE *echo not one
ENDCASE
*Help
ON
ONA%PROCa,PROCb ELSEPROCc
ONERRORONERROROFF:OFF:REPORT:END
IF A% PRINT"Problem"
IF A$ THEN PRINT"Not a problem"
WHILENOTEOF#f PRINTBGET#f:ENDWHILE
REPEATUNTILFALSE
END
DEFPROCa PRINT"Hello world":ENDPROC
DEFFNnumber(x%)=ABS(x%-4)
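
Several lines in this sample (`IF2*2=4ELSE*Cat:4.$`, `OTHERWISE *echo not one`, `*Help`) check that `*` is coloured as a CLI command introducer only at the start of a statement and as a multiplication operator elsewhere. The following is a simplified standalone sketch of that distinction, not the lexer itself. The patterns are adapted from the `:root` state with a named group `command` added for convenience, and the helper `cli_command` is hypothetical. It steps through the line one character at a time, whereas the real lexer would consume strings, comments and keywords with its other rules.

```ruby
require 'strscan'

# Adapted from the three CLI-command rules in the :root state; the named
# group `command` is an addition made for this sketch.
CLI_PATTERNS = [
  /:+ *\*(?<command>.*)/,
  /\n+ *\*(?<command>.*)/,
  /(?:ELSE|OTHERWISE|REPEAT|THEN) *\*(?<command>.*)/
].freeze

# Hypothetical helper: return the text after the first `*` that appears in
# a statement-start position, or nil if every `*` is an ordinary operator.
def cli_command(line)
  scanner = StringScanner.new(line)
  until scanner.eos?
    CLI_PATTERNS.each do |pattern|
      # scan only matches at the current position, like a lexer rule
      return scanner[:command] if scanner.scan(pattern)
    end
    scanner.getch # nothing matched here; move on by one character
  end
  nil
end

p cli_command('IF2*2=4ELSE*Cat:4.$') # "Cat:4.$"
p cli_command('X=2*3')               # nil
```

None of the patterns begins with `*` itself, so the `*` in `2*2` can never start a match and is left for the operator rule.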