MarshalX/libprisma


This is a C++ port of the prism.js library. The code depends on Boost.Regex, as it is faster and more comprehensive than the standard library's std::regex.

The grammars file is generated from the prism.js source code itself; instructions are given later in this file.

Key concepts:

string text = ReadFile("grammars.dat");
m_highlighter = std::make_shared<SyntaxHighlighter>(text);
TokenList tokens = m_highlighter->tokenize(code, language);

for (auto it = tokens.begin(); it != tokens.end(); ++it)
{
    auto& node = *it;
    if (node.isSyntax())
    {
        const auto& child = dynamic_cast<const Syntax&>(node);
        // child.type() <- main token type (e.g. "include")
        // child.alias() <- "base" token type (e.g. "keyword")
        // child.begin() + child.end() <- list of nested tokens
    }
    else
    {
        const auto& child = dynamic_cast<const Text&>(node);
        // child.value() <- the actual text to highlight
    }
}
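The snippet above calls ReadFile without defining it. As a minimal sketch, assuming it is simply a user-supplied helper that loads the whole grammars file into a string (the name comes from the snippet; the signature, the binary open mode and the error handling are assumptions, not part of the library), it could look like this:

```cpp
#include <fstream>
#include <sstream>
#include <stdexcept>
#include <string>

// Hypothetical helper, not part of libprisma: reads an entire file
// (e.g. "grammars.dat") into memory and returns its contents.
std::string ReadFile(const std::string& path)
{
    // Binary mode is an assumption; it avoids newline translation on Windows.
    std::ifstream in(path, std::ios::binary);
    if (!in)
    {
        throw std::runtime_error("cannot open file: " + path);
    }

    // Stream the whole file buffer into a string.
    std::ostringstream buffer;
    buffer << in.rdbuf();
    return buffer.str();
}
```

Throwing on a missing file is one possible design choice; it surfaces a wrong path immediately instead of handing an empty string to the SyntaxHighlighter constructor.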

How to update

As mentioned, the grammars dictionary is generated from the prism.js source code. Currently this is done manually by visiting Prism's test drive page. Once there, select all the languages, open the browser console, and paste in both isEqual.js and generate.js. After a few seconds, the file grammars.dat is downloaded.

TODO: it would be great to automate this step, or at least to make the script run on its own rather than require all of this user input.

About

Code highlight tokenizer written in C++
