Support syntax highlighting #67

mgrabovsky · 2014-06-27T17:36:35Z

It would be nice to have at least basic support for highlighting code, i.e., some kind of polyglot catch-all that would highlight the most frequently used keywords, string and number literals, symbols, etc.
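To make the catch-all idea concrete, here is a rough sketch in Python (not the project's PHP, purely for illustration). The keyword set, the CSS class names and the `highlight` function are all made up for this example; it only recognizes quoted strings, plain numbers and a handful of common keywords.

```python
import html
import re

# Hypothetical, deliberately tiny keyword set shared by many languages.
KEYWORDS = {"if", "else", "for", "while", "return", "def", "function", "class", "import"}

# One alternation, so each character is claimed by at most one token kind.
TOKEN = re.compile(
    r'(?P<string>"(?:\\.|[^"\\])*"|' r"'(?:\\.|[^'\\])*')"
    r"|(?P<number>\b\d+(?:\.\d+)?\b)"
    r"|(?P<word>\b[A-Za-z_]\w*\b)"
)


def highlight(code: str) -> str:
    """Escape the code and wrap strings, numbers and keywords in <span> tags."""
    out = []
    pos = 0
    for m in TOKEN.finditer(code):
        # Text between tokens (operators, whitespace) is only escaped.
        out.append(html.escape(code[pos:m.start()]))
        text = html.escape(m.group())
        kind = m.lastgroup
        if kind == "word":
            # Ordinary identifiers stay unstyled; only known keywords get a span.
            kind = "keyword" if m.group() in KEYWORDS else None
        out.append(f'<span class="{kind}">{text}</span>' if kind else text)
        pos = m.end()
    out.append(html.escape(code[pos:]))
    return "".join(out)
```

Something like `highlight('return x + 42')` wraps `return` and `42` and leaves `x` alone. Comments and multi-line strings are not handled, which is where a real library would start to pay off.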

A more robust option would be to use a third-party highlighting library with either (a) supplying the language in the markup (e.g. [code lang=ruby]puts "hello world"[/code]), or (b) employing a probabilistic language recognition algorithm. This may not go well with the current light-weight, few-dependencies philosophy, though.

infamousbutterly · 2014-06-27T18:23:45Z

I second this.

Admin-Kaf · 2014-06-29T20:50:54Z

I planned to do that on my fork some day™ using the markup {{code goes here}}, since tinyboard doesn't use [tags] like that. Here are three important points:

We/I have to make sure that any markup inside the code markup is escaped including the {{ }} itself.
I don't really know how to choose the language… Maybe {{(python) code}} or {{[python] code}}.
For the syntax highlighter plugin, I know this one: http://alexgorbatchev.com/SyntaxHighlighter/ but maybe someone knows a better one.

Completely unrelated but I also plan to find some way to allow latex with the markup $$latex$$.

mgrabovsky · 2014-06-30T10:02:55Z

What's wrong with BBCode-like tags? I don't understand how the current parser works, but it should be straightforward to integrate it into a reasonably well-written one. Personally, I think syntax like {{sys.exit()}} is ugly and, as you've pointed out, doesn't easily allow for specifying the language.

If the BBCode-like option is deemed not feasible, would something like Markdown's syntax (indenting) be possible? If not, then at least something that's easily discernible within the bounds of the system. Such as the following (requiring {{{ and }}} to be alone on their lines, aside from the optional language specifier):

Some comment regarding the following code here, pay good attention:
{{{ haskell
main :: IO ()
main = putStrLn $ show $ 9958431258 * 15432274
}}}
And that's it.
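A block syntax like this should be easy to pick out even with regular expressions. A minimal sketch in Python, assuming the delimiters sit alone on their lines as described; the function name `find_code_blocks` and the set of characters allowed in a language name are assumptions for this example only.

```python
import re

# {{{ and }}} must each be alone on a line; the opener may carry a language name.
BLOCK = re.compile(
    r"^\{\{\{[ \t]*(?P<lang>[\w+#-]*)[ \t]*\n(?P<code>.*?)\n\}\}\}[ \t]*$",
    re.MULTILINE | re.DOTALL,
)


def find_code_blocks(post: str) -> list[tuple[str, str]]:
    """Return (language, code) pairs; language is '' when none was given."""
    return [(m.group("lang"), m.group("code")) for m in BLOCK.finditer(post)]
```

Because the opener has to start a line and be followed by a newline, braces in the middle of a sentence are never treated as a block.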

As for highlighting, I found highlight.js in my bookmarks which features automatic language recognition (that would eliminate the need for a language specifier and allow for simpler syntax) and seems to be in active development. See also its test page.

One more thing to consider: Should only blocks of code be allowed, or should inline code be supported, as well?

czaks · 2014-06-30T12:56:23Z

The current parser is just a set of simple, dumb regular expressions (i.e. a real parser doesn't exist).

There's an ongoing issue of how to handle, e.g., this bit of code (assuming [code] is a code tag and [b] is a bold tag): [code=php]print("I like [b]bbcode[/b]!");[/code]

I actually got some idea about it now, will reply later :)

czaks · 2014-06-30T13:20:43Z

OK, let's suppose that we start processing. We do a regexp search for [code]text[/code], save the text into a variable called “0” and replace it with [postprocess]0[/postprocess].

Now we can add color codes etc. to the variables “0”, “1”, etc., and wrap them in the proper markup, e.g. <code> or <pre>.

Then we run all the other markup passes: doing bolds, replacing -- with –, etc.

At the end, we search for every [postprocess]number[/postprocess] and replace it with the variable of the given number. This way the code doesn't get messed up.
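The steps above could look roughly like this. It's a Python sketch rather than the actual PHP, and the `render` function, the single [b] rule and the HTML escaping are stand-ins for whatever markup passes really exist.

```python
import html
import re

CODE_TAG = re.compile(r"\[code\](.+?)\[/code\]", re.DOTALL)
PLACEHOLDER = re.compile(r"\[postprocess\](\d+)\[/postprocess\]")


def render(post: str) -> str:
    stash: list[str] = []

    def stash_code(m: re.Match) -> str:
        # Step 1: set the code aside, escaped and wrapped, and leave a number behind.
        stash.append(f"<pre><code>{html.escape(m.group(1))}</code></pre>")
        return f"[postprocess]{len(stash) - 1}[/postprocess]"

    text = CODE_TAG.sub(stash_code, post)

    # Step 2: the ordinary markup passes, which never see the stashed code.
    text = html.escape(text)
    text = re.sub(r"\[b\](.+?)\[/b\]", r"<b>\1</b>", text)
    text = text.replace("--", "\u2013")

    def restore(m: re.Match) -> str:
        # Step 3: put the stashed code back; unknown numbers are left as typed.
        i = int(m.group(1))
        return stash[i] if i < len(stash) else m.group(0)

    return PLACEHOLDER.sub(restore, text)
```

One caveat with this sketch: a user can type [postprocess]0[/postprocess] by hand, so a real implementation would probably want a placeholder that cannot appear in a post.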

About a library for highlighting, I know GeSHi: http://qbnz.com/highlighter/ . I used it, but it has a licensing problem: it's released under the copyleft GNU GPL license (which may or may not be a problem, since we already have a GPL-licensed php-gettext library, which is conditionally loaded when a native gettext library doesn't exist). Maybe there exists some parser for the Kate (KDE Advanced Text Editor) syntax definitions for PHP (I know that one exists for Haskell).

We can also offload the highlighting to Javascript (moving all the syntax definitions to be downloaded by the client).

ctrlcctrlv · 2014-06-30T14:04:45Z

8chan.co already has this through my own patch that uses JavaScript highlighting.

$config['additional_javascript'][] = 'js/code_tags/run_prettify.js'; // https://code.google.com/p/google-code-prettify/
$config['markup'][] = array("/\[code\](.+?)\[\/code\]/ms", "<code><pre class='prettyprint' style='display:inline-block'>\$1</pre></code>");

Produces markup like https://8chan.co/b/res/2009.html#2043

ghost · 2014-06-30T16:29:35Z

GNU-licensed software must not be a problem - I'd rather not use a piece of software at all than seeing it become anti-GNU only because the original developer is currently a StopNerd.

On topic: I discourage using a JavaScript library if it's not vastly better than a PHP solution, and GeSHi looks pretty good to me.

Admin-Kaf · 2014-07-05T19:06:26Z

GeSHi looks awesome.

For the markup, I don't find it really logical to use [code][/code] when everything else uses '''this''' or ''this''. We should follow the existing markup (inspired by the wiki one: https://en.wikipedia.org/wiki/Help:Wiki_markup#Text_formatting ) or replace everything with BBCode or Markdown. Except that the wiki uses <source> tags, which is completely ridiculous.

{{_}} is a block tag anyway, so I see no problem with requiring it to be alone on a line, as mgrabovsky suggested. Like this:
{{\n
{{\n
code{{with shitty {'''syntax'''}}}\n
}}\n
}}\n

It's important to match only the outermost {{ }}; we must be able to use them inside the code.
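Matching only the outermost pair is awkward with a plain regexp but simple with a depth counter. A Python sketch, assuming a delimiter only counts when it is alone on its line; `outermost_blocks` is a name invented for this example.

```python
def outermost_blocks(post: str) -> list[str]:
    """Return the contents of each outermost {{ ... }} block.

    A delimiter counts only when it is alone on its line, so braces in the
    middle of a code line are left untouched.
    """
    blocks: list[str] = []
    current: list[str] = []
    depth = 0
    for line in post.split("\n"):
        marker = line.strip()
        if marker == "{{":
            if depth > 0:
                # Nested opener: it belongs to the code.
                current.append(line)
            depth += 1
        elif marker == "}}" and depth > 0:
            depth -= 1
            if depth == 0:
                # Outermost closer: the block is complete.
                blocks.append("\n".join(current))
                current = []
            else:
                # Nested closer: it belongs to the code.
                current.append(line)
        elif depth > 0:
            current.append(line)
    return blocks
```

In this sketch a block that is never closed is silently dropped, and code containing an unbalanced }} alone on a line would still end the block early.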

mgrabovsky · 2014-07-05T20:59:52Z

GeSHi is all right, Wikimedia sites use it, too. As for the syntax, I still think that [code] or Markdown's triple-backquote are the only viable options.

@czaks — Personally, I'm not happy about the architecture of the parser, so I'd rather leave that to the more experienced people.

@Admin-Kaf — You're trying to enforce consistency where there's none. Let's look at the current markup:

'''strong''', ''italic'' and ==heading== from MediaWiki (presumably),
**spoiler** from somewhere else.

As you pointed out, Wikipedia uses XML-like tags <source lang="..."> (or <syntaxhighlight>), which, I too believe, would not be entirely appropriate in this setting.

Also, I don't understand your statement about {{...}} being a block tag, but my point was that the two braces themselves are barely visible in a text, therefore ugly. Moreover, as you've also been so kind to show, it might become a chore to even parse the syntax.

Admin-Kaf · 2014-07-05T21:18:49Z

{{ }} is a block (as in block vs. inline) tag because you will never have code and normal text on the same line. The code will start on a new line in its own block (with numbered lines and so on), and then the text will continue on the line after.

My point was that every tinyboard/vichan markup uses the same symbol twice for opening and closing, which is cool because it's very quick to type and kind of logical sometimes (… is commented out in the config file and can be added for underline, and I added --…-- myself for ). So I tried to find the same for the code. {} are the characters that make me think the most of programming. It could have been [[…]] or //…// or \…\ or anything really. And $$…$$ for LaTeX because it's basically what's used for formulas in LaTeX.

I know it's disturbing for people from 4chan or forums but it's already the way it's made and when you know it, it's very convenient. (And it's also very funny to see newfags trying to make strong text.)

mgrabovsky · 2014-07-06T09:55:14Z

Oh, I understand. It still doesn't change the fact that it's hard to see in text and might be hard to parse.

By the way, just a correction: LaTeX encourages \(...\) and \[...\] for math mode; the plain TeX form $$...$$ is discouraged.

ctrlcctrlv · 2014-07-06T14:55:52Z

why not

```lang
 ex
```

like github?

mgrabovsky · 2014-07-06T16:45:26Z

Yes, that's what I was suggesting by “Markdown's triple-backquote”.

Admin-Kaf · 2014-07-06T21:43:07Z

Or a double one: …

Also, since the opening tag is the same as the closing one, how can it differentiate:

code

code bis

and this? :
code with
shit
inside

@mgrabovsky:
Close enough: it's the same character twice, so it suits LaTeX perfectly.

Admin-Kaf · 2014-07-06T21:46:48Z

OK, GitHub answered that itself. (It seems that a double one is enough for GitHub to treat it as a code tag.)
However, we should require the `` to be alone on a line, because backquotes can appear in SQL, for example.

ctrlcctrlv · 2014-07-07T20:28:12Z

If any of you wants to submit a PR that would be good. I would support GeSHi and Github style markup by default.

This is not high on my todo list right now.

DirectorCleese · 2014-10-16T12:07:39Z

Easiest solution: Have a page that lists what the markup is. I don't care if it is in Cantonese. I am finding your github more easily than a markup list, you might be doing it wrong.

On that note, while harder to remember for new users, I agree short character sets are best since they are less verbose.

czaks · 2015-03-31T02:35:33Z

The problem with that is that we still don't have a proper parser, just a bunch of regular expressions. I have a workaround in mind.

For example, let's suppose that [b]...[/b] means bold text and [code=Lisp]...[/code] means code markup. Not that I'm suggesting those; let's just take them for granted.

How about code like this:

[code=Lisp][b]LOL DONGS[/b][/code]

What would our current “parser” do? It would probably encode it as:

<code><font color='red'>&lt;b&gt;LOL DONGS&lt;/b&gt;</font></code>

...if markup was run first, or:

<code><font color='red'><b>LOL DONGS</b></font></code>

...if the regexp things were run first.
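Here is a small Python sketch of that ordering problem, with two independent regexp passes standing in for the real ones. The pass functions are hypothetical and the coloring is left out; the code pass just escapes its body, as a highlighter would before coloring it.

```python
import html
import re


def bold_pass(text: str) -> str:
    return re.sub(r"\[b\](.+?)\[/b\]", r"<b>\1</b>", text)


def code_pass(text: str) -> str:
    # Escapes the code body and wraps it; no coloring in this sketch.
    return re.sub(
        r"\[code=\w+\](.+?)\[/code\]",
        lambda m: f"<code>{html.escape(m.group(1))}</code>",
        text,
        flags=re.DOTALL,
    )


post = "[code=Lisp][b]hello[/b][/code]"

# Bold first: the generated <b> tag ends up escaped inside the code.
bold_then_code = code_pass(bold_pass(post))
# Code first: the bold pass still reaches into the code body afterwards.
code_then_bold = bold_pass(code_pass(post))
```

Neither order leaves the literal [b]hello[/b] in the code block, which is what the poster would actually expect to see.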

@ctrlcctrlv how do you solve it in infinity?

czaks · 2015-04-12T03:06:20Z

OK, one of the recent commits introduced one of the syntaxes: triple backquote, like GitHub, and [code], to be manually enabled by the board admin. We still don't have a code highlighting solution, but it should be easy.

ctrlcctrlv closed this as completed Nov 5, 2017
