Supporting space and tab delimited files

Bryan Oakley edited this page Aug 12, 2014 · 2 revisions

A couple of people have asked whether the extension supports space-separated files. At the moment, it doesn't.

I've been thinking about this, however, and I think it would be fairly easy to implement. Tedious, maybe, but not particularly difficult. I wanted to jot down some of my thoughts in case I decide to do this, or if someone else wants to contribute.

Robot's own parser uses a very simple algorithm: if there's a tab anywhere in the line, it uses tabs as separators. If the line begins with "| ", it uses pipes. If neither of those is true, it uses spaces. Pretty simple strategy.
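
As a rough illustration of that rule, here is a minimal sketch written as a standalone function. The name detectSeparator and the mode strings are my own placeholders; this is not code from Robot or from the extension.

```javascript
// Hypothetical helper, not part of Robot Framework or the extension.
// Applies the three rules described above, in the same order.
function detectSeparator(line) {
    if (line.indexOf("\t") !== -1) {
        return "tabs";    // a tab anywhere in the line wins
    }
    if (line.indexOf("| ") === 0) {
        return "pipes";   // the line begins with a pipe and a space
    }
    return "spaces";      // neither of the above
}
```

For example, detectSeparator("| Log | hello |") gives "pipes", while the same cells joined with a tab give "tabs".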

So, I'm thinking the first thing to do would be to set a state variable for each line that defines which mode to use. I would need to change the startState function to initialize this new variable:

startState: function() {
    return {
        // default mode; recomputed at the start of each line
        separator: "pipes"
    };
}

The next step would be to set this state variable every time we are at the beginning of a line. Setting it for a space or pipe is easy enough, but scanning ahead in the stream for a tab seems inefficient (and maybe not worth supporting?). Something like this could be added to the start of the token function:

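token: function(stream, state) {
    if (stream.sol()) {
        // stream.match only matches a regex at the current position,
        // so allow any leading characters before the tab.
        if (stream.match(/^.*\t/, false)) {
            state.separator = "tabs";
        } else if (stream.peek() === "|") {
            state.separator = "pipes";
        } else {
            state.separator = "spaces";
        }
    }
    // ...the rest of the existing token function follows
}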

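I don't know if that's precisely right. The regex for finding a tab on the current line may need to be tweaked: CodeMirror's stream.match only matches a regex at the current stream position, so the pattern has to allow for any characters before the tab (something like /^.*\t/).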

Once that's done, the next step is to modify the isSeparator function to take the state into consideration when deciding if the next token is a separator.
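
A minimal sketch of what a state-aware check might look like. The real isSeparator may well take different arguments; here I assume it gets the remaining text of the line plus the state object, and that space-separated mode needs two or more spaces between cells. The SEPARATOR_PATTERNS table is a made-up name.

```javascript
// Hypothetical sketch: the real isSeparator in the extension may take
// different arguments. `rest` is the text from the current position to
// the end of the line; `state.separator` is the mode chosen for the line.
var SEPARATOR_PATTERNS = {
    pipes: /^\s+\|(\s+|$)/,   // whitespace, a pipe, then whitespace or end of line
    tabs: /^\t+/,             // one or more tabs
    spaces: /^ {2,}/          // assumption: two or more spaces separate cells
};

function isSeparator(rest, state) {
    var pattern = SEPARATOR_PATTERNS[state.separator];
    // An unknown mode never matches.
    return pattern !== undefined && pattern.test(rest);
}
```

Note that this does not cover the leading pipe at the very start of a pipe-separated line; that would still need its own check.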

That might be almost all there is to getting the syntax highlighting to work. However, I know there are other places, such as on_tab, rangeFinder and auto_indent, that assume pipes for separators. Like isSeparator, these other functions need to be modified to look at the state before doing whatever it is they do. Perhaps they can be generalized, or perhaps there need to be separate functions for each mode.
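
One way the generalizing could go is a small lookup table keyed by mode, so that code which inserts or joins cells asks the table instead of hard-coding pipes. This is only a sketch: MODES, joinCells, and the choice of four spaces for space-separated mode are all my own placeholders, not anything in the extension today.

```javascript
// Hypothetical lookup table; names and values are illustrative only.
var MODES = {
    pipes:  { linePrefix: "| ", cellSeparator: " | " },
    tabs:   { linePrefix: "",   cellSeparator: "\t" },
    // Four spaces is an arbitrary choice; assumes two or more are required.
    spaces: { linePrefix: "",   cellSeparator: "    " }
};

// Build a line of text from a list of cells in the given mode.
function joinCells(cells, separator) {
    var mode = MODES[separator];
    return mode.linePrefix + cells.join(mode.cellSeparator);
}
```

With a table like this, something like on_tab could insert MODES[state.separator].cellSeparator rather than a literal " | ".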

There also has to be some fiddling around with counting columns. In space- and tab-separated files, the first column starts at the first character; in pipe-separated files, the first column starts after the first pipe and space.
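
To make the column counting concrete, here is a rough sketch that returns the zero-based column for a character offset in a line. columnAt is a hypothetical helper, and it again assumes that two or more spaces act as a separator in space-separated mode.

```javascript
// Hypothetical helper. Returns the zero-based index of the cell that
// contains character offset `pos` in `line`, by counting the separators
// that appear before that offset.
function columnAt(line, pos, separator) {
    var before = line.slice(0, pos);
    var pattern;
    if (separator === "pipes") {
        // Drop the leading pipe and space so the first cell is column 0.
        before = before.replace(/^\|\s?/, "");
        pattern = /\s\|(?=\s|$)/g;   // a pipe with whitespace on both sides
    } else if (separator === "tabs") {
        pattern = /\t/g;
    } else {
        pattern = / {2,}/g;          // assumption: two or more spaces
    }
    return (before.match(pattern) || []).length;
}
```

In "| Log | hello |" the "L" at offset 2 is in column 0 and the "h" at offset 8 is in column 1. In "Log    hello" the first character is already in column 0.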