Text formatting #853

Open
Exagone313 opened this issue Apr 4, 2017 · 70 comments

@Exagone313 Exagone313 commented Apr 4, 2017

Hi,

I worked a bit to support text formatting for ** to <strong> and * to <em>. You can find a demo here.

I'm wondering whether such a feature would be acceptable. I would submit a PR if so. Here is my current diff, which only works for messages sent on the current instance (this code isn't ready for prod; I'd need some help with that):

diff --git a/app/lib/formatter.rb b/app/lib/formatter.rb
index da7ad202..9cf07400 100644
--- a/app/lib/formatter.rb
+++ b/app/lib/formatter.rb
@@ -14,6 +14,7 @@ class Formatter
 
     html = status.text
     html = encode(html)
+    html = strong_em_html(html)
     html = simple_format(html, {}, sanitize: false)
     html = html.gsub(/\n/, '')
     html = link_urls(html)
@@ -36,6 +37,7 @@ class Formatter
     return reformat(account.note) unless account.local?
 
     html = encode(account.note)
+    html = strong_em_html(html)
     html = link_urls(html)
     html = link_accounts(html)
     html = link_hashtags(html)
@@ -98,4 +100,8 @@ class Formatter
   def mention_html(match, account)
     "#{match.split('@').first}<a href=\"#{TagManager.instance.url_for(account)}\" class=\"h-card u-url p-nickname mention\">@<span>#{account.username}</span></a>"
   end
+
+  def strong_em_html(html)
+    html.gsub(/(^| )[*][*]([^*]+)[*][*]( |$)/, '\1<strong>\2</strong>\3').gsub(/(^| )[*]([^*]+)[*]( |$)/, '\1<em>\2</em>\3')
+  end
 end
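
A note on the regexes in `strong_em_html`: each pattern consumes the space before and after the markers, and `gsub` matches cannot overlap, so in `**a** **b**` only the first word is converted. The single-asterisk pattern also fires on spaced-out text such as `2 * 3 * 4`. Below is a minimal sketch of one way around both, using lookarounds so the surrounding whitespace is asserted rather than consumed. It assumes `CGI.escapeHTML` as a stand-in for the `encode` step in `Formatter`, and `strong_em_html_lookaround` is a made-up name for illustration, not part of the diff.

```ruby
require 'cgi'

# Patterns from the diff above, kept unchanged for comparison.
def strong_em_html_original(html)
  html.gsub(/(^| )[*][*]([^*]+)[*][*]( |$)/, '\1<strong>\2</strong>\3')
      .gsub(/(^| )[*]([^*]+)[*]( |$)/, '\1<em>\2</em>\3')
end

# Hypothetical variant: whitespace (or string boundary) around the markers is
# checked with lookarounds, and the marked text must not start or end with
# whitespace.
STRONG_PATTERN = /(?<!\S)\*\*(?=\S)([^*]+)(?<=\S)\*\*(?!\S)/
EM_PATTERN     = /(?<!\S)\*(?=\S)([^*]+)(?<=\S)\*(?!\S)/

def strong_em_html_lookaround(text)
  html = CGI.escapeHTML(text) # assumption: stands in for Formatter#encode
  html.gsub(STRONG_PATTERN, '<strong>\1</strong>')
      .gsub(EM_PATTERN, '<em>\1</em>')
end

puts strong_em_html_original('**a** **b**')   # second word is left untouched
puts strong_em_html_lookaround('**a** **b**') # both words are converted
```

Strong has to run before em in both versions, otherwise the single-asterisk pattern would have to cope with leftover `**` markers.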

  • I searched or browsed the repo’s other issues to ensure this is not a duplicate.

@sjml sjml commented Apr 4, 2017

Why not do a more complete support of Markdown? Ultimately this is something that could live at the UI level instead of the server so different clients (including the web) could adapt as needed.

Member

@krainboltgreene krainboltgreene commented Apr 4, 2017

Agreed, let's put this at the UI level at least and then perhaps handle Markdown.

Contributor

@valentin2105 valentin2105 commented Apr 6, 2017

It would be awesome to have code support too (without syntax highlighting for now).

Contributor

@diomed diomed commented Apr 6, 2017

Send a PR. Can't hurt.
There are also some other people interested in this, including me:

#761

Author

@Exagone313 Exagone313 commented Apr 6, 2017

I don't agree with any different format for links. It's not handy on a phone, and someone could use it for phishing.

@sjml sjml commented Apr 6, 2017

That's a good point. Would it be confusing, though, to have Markdown minus the linking syntax?

Member

@krainboltgreene krainboltgreene commented Apr 6, 2017

@sjml sjml commented Apr 6, 2017

Right; I think @Exagone313 was implying, though, that the link syntax could be used for phishing and thus we may want to avoid it.

Lots of other sites use Markdown, though, without this being an urgent issue. Maybe not worth the hassle at this phase.

Member

@krainboltgreene krainboltgreene commented Apr 9, 2017

Let's use this as the central point for any sort of formatting of messages in the UI.

Duplicates/Similar: #1305 #761 #1031

Member

@krainboltgreene krainboltgreene commented Apr 9, 2017

@Gargron We get a lot of requests for this and there are a lot of ways to solve this issue, so this is going to need your approval if mastodon core is going to support this natively.

@Spunkie Spunkie commented Apr 9, 2017

This was previously posted in #761; moving it here.

I just want to second @diomed & @valentin2105 here: inline code, code blocks, and quote blocks would be of daily use to me.

From the conversations I've had with maintainers of other large open source projects (like Chocolatey), a piecemeal approach to implementing Markdown will only lead to headaches down the line. Either support Markdown in its entirety or not at all.

If you are going to implement Markdown, it would be wise to offload the work to a library that is already CommonMark compliant. Most large services seem to be moving towards CommonMark; even GitHub's own spec is based on it now.

Also, some have mentioned in the various threads that supporting Markdown will lead to visual clutter, but from my experience using gitter.im I've seen that a splash of Markdown can increase the readability of small-format content a hundredfold. I've been made a Markdown believer, if you will. Sure, there will be some people who abuse it or go overboard, but that is why we have the mute/block function.

Contributor

@zacanger zacanger commented Apr 10, 2017

If anyone's interested, I have a small (purely client-side) renderer working here. It's a very small subset of Markdown, because of comments made on this issue and the other related ones. Justification for the decisions I made is in the code comments. I'm planning to try it out as a Chromium extension sometime in the next few days and see how it goes.

@nykh nykh commented Apr 10, 2017

By the way, I haven't fully understood @Exagone313's suggestion that link syntax can be used for phishing. The site I mentioned (Plurk) has had links for a long time and has not had a problem with that at all. Just render the link so that it is distinctive enough. People should be savvy enough to hover and peek at the URL before clicking. But I kind of understand the caution.

@zacanger's implementation looks quite good, although I feel eating up the \n (for Markdown compliance, obviously) is not very user-friendly in this context...

Contributor

@zacanger zacanger commented Apr 11, 2017

If you're talking about the last line (.replace(/\n\s*\n/g, '\n')), consecutive newlines are already collapsed in toots (not sure where in the code that happens, but I tested it out here). If you're talking about something else, I misunderstood :$.
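
For reference, a rough Ruby equivalent of that last `.replace` call, as a sketch only. `collapse_blank_lines` is a hypothetical name, and this makes no claim about where or how Mastodon itself collapses newlines. Single line breaks are left alone; only runs of blank or whitespace-only lines are reduced to one newline.

```ruby
# Collapse any run of blank (or whitespace-only) lines into a single newline,
# mirroring the JavaScript `.replace(/\n\s*\n/g, '\n')` quoted above.
# String#gsub replaces every match, which corresponds to the `g` flag.
def collapse_blank_lines(text)
  text.gsub(/\n\s*\n/, "\n")
end

puts collapse_blank_lines("first\n\n\nsecond\n   \nthird")
```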

Contributor

@zacanger zacanger commented Apr 13, 2017

Something that I hadn't thought about that might be important: I have no idea how GNU Social and other apps consume posts. RSS? Would this break things (or make things look really funky) for them? Does it matter (would small amounts of Markdown syntax showing up in feeds be a big deal to anyone)?

Member

@krainboltgreene krainboltgreene commented Apr 13, 2017

It's the OStatus protocol.

Here's my bid on this issue:

  1. Markdown components, ignore []() syntax (easy)
  2. Plain text pushes (easy)
  3. Syntax highlighting for code (hard)

Contributor

@zacanger zacanger commented Apr 15, 2017

Sorry, I meant the specific format, not the protocol. I looked it up, seems to be Atom-ish underneath?

I have a POC here using a tiny renderer I put together that has a shorthand option (* _ ~ instead of ** __ ~~) and doesn't allow links and images. ol, ul, and blockquote work in the renderer but are broken in Mastodon because of inserted brs, but I can work on that some more if it seems like a good direction to be going in. Thoughts? I'm especially interested in how this might affect GS/other apps ( @rainyday would you be interested in reviewing if this makes it to a PR? ).

Contributor

@rainyday rainyday commented Apr 16, 2017

I'm just some rando who has submitted a few PRs, but in my opinion a limited subset of Markdown, like the features dealing with strong, em, ul, ol, and blockquote, wouldn't be problematic on its own, since the Markdown syntax is pretty much identical to how those features are normally represented in plaintext. However, to my knowledge, existing OStatus applications only parse br, a, and span (and p in the case of Mastodon, which is the cause of the current formatting issues between Mastodon and other servers). So formatting would probably have to be done either client-side, in the frontend and every other client application, or by adding an additional Mastodon-specific element to the Atom document (like it currently does with mastodon:scope) that contains a rendered version of the post content, which would increase the size of activity streams.

Both solutions have their downsides, so it kind of depends on the goals of the Mastodon devs which, if any, of the solutions they'd want to implement.
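
To make the receiving side concrete, here is a sketch of an allowlist filter that keeps only the elements mentioned above (`br`, `a`, `span`, `p`) and drops any other tag while keeping its text, so `<strong>` degrades to plain text. This is a regex-based illustration with a hypothetical `strip_unknown_tags` helper, not how any OStatus implementation actually sanitizes input; real code should use a proper HTML sanitizer.

```ruby
require 'set'

# Elements that, per the comment above, existing applications are said to parse.
ALLOWED_TAGS = Set.new(%w[br a span p]).freeze

# Matches an opening or closing tag and captures its element name.
TAG_PATTERN = %r{</?([A-Za-z][A-Za-z0-9]*)\b[^>]*>}

# Hypothetical helper: remove tags outside the allowlist, keep their inner text.
def strip_unknown_tags(html)
  html.gsub(TAG_PATTERN) do |tag|
    ALLOWED_TAGS.include?(Regexp.last_match(1).downcase) ? tag : ''
  end
end

puts strip_unknown_tags('<p>I <strong>really</strong> like <a href="https://example.com">this</a><br></p>')
```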

@denysvitali denysvitali commented Apr 19, 2017

For anyone wondering: I'm working on this issue
[screenshot from 2017-04-20 01-00-37]

Member

@Gargron Gargron commented Apr 19, 2017

@denysvitali Are you using anything like Kramdown or Redcarpet? I don't know if those implementations allow selectively disabling Markdown features, but using an established library would be preferable.

@denysvitali denysvitali commented Apr 20, 2017

@Gargron Actually I'm not using any library, just regexes, which IMHO are smaller and faster. But I agree with your point about stability. I'm testing this method atm; we'll see how it goes.

Contributor

@zacanger zacanger commented Apr 20, 2017

Is it preferred that this happens on the server instead of client?

@denysvitali denysvitali commented Apr 20, 2017

@Gargron We're testing it right now at dev.mastodonti.co, if you want you can join and test it. (Registrations are open for a few minutes)

Edit:
You can also see the results here

Contributor

@ThisIsMissEm ThisIsMissEm commented Apr 15, 2018

Honestly, I fundamentally disagree with this proposal for a few core reasons:

  1. It adds a lot of visual clutter to the UI
  2. It increases the attack surface when posting a toot (phishing links are just one thing, but there are also frequently security issues in Markdown parsers/renderers)
  3. It’s yet more for non-technical users to have to learn, which adds friction to new user onboarding. The user writes a toot with some of the special formatting syntax, then they post, their toot comes out weird, and they complain, going “<instance/Mastodon> is broken!! It won’t let me post how I want to”

If this is implemented, it should be:

  • able to be disabled by instance admin
  • able to be disabled by users

If enabled, it should also give a user a preview of how the toot will look, as there’s no edit toot afterwards.
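
A sketch of how the two switches above might compose, assuming a hypothetical `FormattingSettings` value object and `render_status` helper; none of these names exist in Mastodon. Formatting is applied only when the instance admin allows it and the author has not turned it off. Because the function depends only on the text and the settings, the same call could in principle back a compose-box preview.

```ruby
require 'cgi'

# Hypothetical settings object: one flag set by the instance admin,
# one chosen by the user (for example via a per-toot toggle).
FormattingSettings = Struct.new(:instance_enabled, :user_enabled, keyword_init: true) do
  def formatting_enabled?
    instance_enabled && user_enabled
  end
end

# Always HTML-escape; only convert **bold** when both switches are on.
def render_status(text, settings)
  html = CGI.escapeHTML(text)
  return html unless settings.formatting_enabled?

  html.gsub(/(?<!\S)\*\*(?=\S)([^*]+)(?<=\S)\*\*(?!\S)/, '<strong>\1</strong>')
end

both_on  = FormattingSettings.new(instance_enabled: true, user_enabled: true)
user_off = FormattingSettings.new(instance_enabled: true, user_enabled: false)

puts render_status('**hi** there', both_on)
puts render_status('**hi** there', user_off)
```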

I fully understand that technical power users (who make up a decent percentage of the voices in the development community) understand how to use Markdown, but you also have to see that most other people don’t: have you ever tried teaching a non-technical user Markdown? It’s pretty hard, and it confuses them.

@denysvitali denysvitali commented Apr 15, 2018

@ThisIsMissEm

> phishing links

This "feature" is already disabled in most Markdown parsers.

> The user writes a toot with some of the special formatting syntax and then they post & their toot comes out weird & they complain going “<instance/Mastodon> is broken!! It won’t let me post how I want to”

I'd call that the user's fault. I don't think Mastodon's purpose is to share specially formatted text; if that's the case, then we need code fences.

> If this is implemented, it should be:
>
>   • able to be disabled by instance admin
>   • able to be disabled by users

Agreed, a simple [MD] button may be sufficient when tooting.

> have you ever tried teaching a non-technical user markdown? It’s pretty hard, and confuses them.

Seriously? There is nothing difficult about understanding that **bold** means bold and _italic_ means italic. Markdown is currently supported in IM clients like Telegram and WhatsApp, meaning that their user base is already familiar with the syntax, plus Markdown has been around since 2004.

I find MD to be more user-friendly than BBCode (or any other formatting syntax I know of), and it falls back nicely, as already stated in this thread.

Member

@Gargron Gargron commented Apr 15, 2018

Yeah, I'm not in favour of this either. Which I now find I never mentioned in this thread, oops. I feel like it's scope creep. Mastodon is not macroblogging, it doesn't need rich formatting. There are single cases where formatting would be nice, such as sharing code snippets or memeing with bold italic but they're not crucial imo.

@mftrhu mftrhu commented Apr 15, 2018

My two cents: an [MD] button would be a good compromise, but a lot of people already know Markdown from IM and comment systems. As with IM, a full implementation of Markdown is not required, as 500 characters are not enough to need, say, tables or even headings.

Strong, italics, inline code and autolinks (<...>) would be plenty while not giving much of an attack surface if at all. Admittedly, I don't know if an implementation this stripped down - both Ruby & JS for the preview - exists, which means possibly introducing bugs while writing it.
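
For a sense of scale, here is a rough sketch of what such a stripped-down renderer could look like in Ruby: code spans, `<https://...>` autolinks, strong, and emphasis, with `[text](url)` links deliberately left as literal text. `render_inline` and the pattern names are hypothetical, the input is HTML-escaped with the standard `CGI.escapeHTML`, and this has not been checked against any Markdown spec, so treat it as an illustration rather than a vetted implementation.

```ruby
require 'cgi'

# A code span: backticks around text with no backticks or newlines inside.
CODE_SPAN_PATTERN = /(`[^`\n]+`)/
# Autolink, matched after escaping, so the angle brackets appear as entities.
AUTOLINK_PATTERN  = %r{&lt;(https?://(?:[^\s&]|&amp;)+)&gt;}
STRONG_INLINE     = /(?<!\S)\*\*(?=\S)([^*]+)(?<=\S)\*\*(?!\S)/
EM_INLINE         = /(?<!\S)\*(?=\S)([^*]+)(?<=\S)\*(?!\S)/

def render_inline(text)
  # Splitting on a pattern with a capture group keeps the code spans in the
  # result, so each piece can be handled separately.
  text.split(CODE_SPAN_PATTERN).map do |part|
    if part.match?(/\A`[^`\n]+`\z/)
      # Inside a code span nothing is formatted, only escaped.
      "<code>#{CGI.escapeHTML(part[1..-2])}</code>"
    else
      html = CGI.escapeHTML(part)
      html = html.gsub(AUTOLINK_PATTERN, '<a href="\1">\1</a>')
      html = html.gsub(STRONG_INLINE, '<strong>\1</strong>')
      html.gsub(EM_INLINE, '<em>\1</em>')
    end
  end.join
end

puts render_inline('use `a*b*` and **bold**')
puts render_inline('see <https://example.com/?a=1&b=2>')
puts render_inline('[click](https://evil.example)') # left as literal text
```

Escaping before matching means quotes in a URL become entities and never match the autolink pattern, which keeps the generated `href` attribute from being broken out of.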

@ghost ghost commented Apr 15, 2018

> Mastodon is not macroblogging, it doesn't need rich formatting.

What is and isn't needed for micro-blogging is unknown. I guess you mean it's not done on Twitter, but we are not Twitter, and if we want to prove we aren't a Twitter clone (which we are not), it's a good thing to offer some features Twitter doesn't have.

I agree Mastodon shouldn't have very rich formatting, just a simple subset of Markdown, like bold, italic, links, code, quotes and lists.

Edit: And I agree that an MD button and a preview are not needed, because not much can go wrong. Although a preview would be nice to have, with or without Markdown.

Contributor

@zacanger zacanger commented Apr 15, 2018

I disagree with the concerns about phishing and education. More than one option that only includes a subset of markdown has been proposed, and as other folks have mentioned, several other apps already use markdown or markdown-like formatting (also including Slack and Discord). Whether it's scope creep or not... ¯\_(ツ)_/¯

@timmc timmc commented Apr 15, 2018

Is it correct to say that OStatus and ActivityPub support a subset of HTML, and that we're currently only using it for links? Or are URLs linkified from plaintext on the client/receiving server side?

If the former, I don't see a reason that we have to force Markdown on anyone. The default can be plaintext posting, and users can opt in to Markdown posting, or any other kind of markup, even filtered HTML: whatever the sending server and client want to support. It's not going to confuse people to see italics and code blocks in their timeline, and they don't have to be surprised by asterisks and backticks doing weird things to their text if they've never encountered Markdown.

(This is a separate question from whether formatting is desirable overall, or which subset should be allowed if it is desired...)

Contributor

@trwnh trwnh commented Apr 18, 2018

Generally agreeing with the sentiments that Mastodon should be for plaintext microblogs. If you want rich formatting like bold/italics/code/etc, then it definitely feels out of scope here -- seems more appropriate for something like Friendica or Hubzilla maybe, or at least apps that are more oriented toward longform posts. Bringing up examples of Whatsapp and Telegram supporting Markdown doesn't really mean that it's appropriate there, either -- sure, it's another feature, but are you really going to be formatting your texts to your friends? Plaintext should work fine there, and it should work fine here too.

I guess the analogy to use is the difference between an ActivityStreams Vocabulary "Note" vs. "Article" -- semantically, toots are short and simple (Notes), and not rich or longform (Articles).

Author

@Exagone313 Exagone313 commented Apr 18, 2018

Guys, it's been a year since I submitted this feature request, and you're still divided. For that reason alone, I'd be fine with it being closed.
I was thinking of basic text formatting, em/strong/code, with a simple syntax like Markdown. I understand that one might want to disable this feature for a post, if escaping is not wanted.
I don't agree with wider text formatting, for the reasons given above against text formatting in general (pick the right medium for that).

It would at least be nice to have ActivityPub support some additional HTML tags, but with a fallback on standard Mastodon.

@voronoipotato voronoipotato commented Apr 18, 2018

You can already use Markdown on your instance via https://github.com/showdownjs/showdown . Some instances already are. Someone should do a pull request including markdown support via javascript, and then if it gets rejected that's the end of that. If someone doesn't do it by the end of the week I guess I will start on it. Right now I'm working on adding subscriptions to peertube so that's why I pushed it out to the end of the week because the only time I'm available is after work.

Support requires:

  • Markdown
  • Write / Preview tabs
  • Formatting buttons for headers
  • Formatting button for bold
  • Formatting button for italics
  • Formatting button for quotes
  • Formatting button for code blocks
  • Formatting button for strikethrough
  • Formatting button for links (to prevent the proliferation of tinyurls which are the same thing but worse)
  • Formatting button for bulleted lists
  • Formatting button for numbered lists
  • Formatting button for task list
  • Links should show the full url on hover

Bonus requirements:

  • Shortcut for bold (ctrl-b)
  • Shortcut for italics (ctrl-i)
  • Shortcut for links (ctrl-k)
  • Simple way to disable markdown for your instance

Member

@Gargron Gargron commented Apr 20, 2018

> It would be at least nice to have ActivityPub to support some additional HTML tags, but with a fallback on standard Mastodon.

ActivityPub supports any HTML tags in content, though Mastodon sanitizes this input on reception. That is to say, other ActivityPub servers or modified Mastodon instances can use whatever formatting they like, and adjust their sanitizers; vanilla Mastodon users will not be affected by strong/h3/em/code tags because it will look like plaintext.

> You can already use Markdown on your instance via https://github.com/showdownjs/showdown . Some instances already are.

Mmmmm if you submit HTML into the API, it will rightfully be HTML-encoded, so no, you cannot do that, and if some instances do this they either do it differently or are taking risks.
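
As an illustration of that encoding step (a sketch using Ruby's standard `CGI.escapeHTML`, not Mastodon's actual code path; `encode_status_text` is a hypothetical name): markup submitted as status text comes out as inert, visible text rather than as tags.

```ruby
require 'cgi'

# Hypothetical stand-in for the server-side encoding of submitted status text.
def encode_status_text(text)
  CGI.escapeHTML(text)
end

puts encode_status_text('<strong>hello</strong> & <script>alert(1)</script>')
```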

> Links should show the full url on hover

They already do that though?


Anyway, rich formatting can have a place in the fediverse (it's not even for me to say what does and doesn't, it's merely an observation on my part). Macroblogging and microblogging can be compatible. For example, Mastodon uses the Note type for toots. These are expected to be short notes. A blog article would be an Article instead. What Mastodon does with Article objects, it takes the title and the URL, and shows it as a toot with title and link to the post, rather than displaying the full article content in the toot.
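
The Note/Article distinction described above can be sketched roughly as follows. This assumes the incoming object has already been parsed into a Ruby hash with the ActivityStreams `type`, `name`, `url`, and `content` keys; `status_text_for` is a hypothetical helper for illustration, not Mastodon's actual processing code.

```ruby
# Hypothetical helper: decide what text a toot shows for an incoming object.
def status_text_for(object)
  case object['type']
  when 'Article'
    # Long-form post: show only the title and a link, not the full content.
    [object['name'], object['url']].compact.join(' ')
  else
    # Note (and anything else in this sketch): show the content itself.
    object['content'].to_s
  end
end

note    = { 'type' => 'Note', 'content' => '<p>Short toot</p>' }
article = { 'type' => 'Article', 'name' => 'My long post',
            'url' => 'https://blog.example/long-post',
            'content' => '<h1>My long post</h1><p>Many paragraphs</p>' }

puts status_text_for(note)
puts status_text_for(article)
```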

@joyeusenoelle joyeusenoelle referenced this issue May 10, 2018

Contributor

@wiktor-k wiktor-k commented May 11, 2018

Just for reference, ActivityPub has a special field (source) for the markup/source text used by the author:

{
  "@context": ["https://www.w3.org/ns/activitystreams",
               {"@language": "en"}],
  "type": "Note",
  "id": "http://postparty.example/p/2415",
  "content": "<p>I <em>really</em> like strawberries!</p>",
  "source": {
    "content": "I *really* like strawberries!",
    "mediaType": "text/markdown"}
}

The entire section is worth reading: https://www.w3.org/TR/activitypub/#source-property
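
A minimal sketch of producing such an object from Ruby, keeping the author's original markup in `source` and the rendered HTML in `content`. `build_note` is a hypothetical helper, the rendering handles only `*emphasis*` for brevity, and `@context` is simplified to a single URL compared with the example above.

```ruby
require 'cgi'
require 'json'

# Hypothetical helper: build a Note hash with both rendered and source text.
def build_note(id:, markdown:)
  # Escape first, then convert *emphasis* only (deliberately tiny subset).
  html = CGI.escapeHTML(markdown)
            .gsub(/(?<!\S)\*(?=\S)([^*]+)(?<=\S)\*(?!\S)/, '<em>\1</em>')
  {
    '@context' => 'https://www.w3.org/ns/activitystreams',
    'type' => 'Note',
    'id' => id,
    'content' => "<p>#{html}</p>",
    'source' => { 'content' => markdown, 'mediaType' => 'text/markdown' }
  }
end

note = build_note(id: 'http://postparty.example/p/2415',
                  markdown: 'I *really* like strawberries!')
puts JSON.pretty_generate(note)
```

A receiving server that does not understand `source` can ignore it and display `content`, while a client that offers editing could show the original Markdown again.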

abcang added a commit to pixiv/mastodon that referenced this issue Aug 23, 2018
Add secure option to additional cookie (tootsuite#8069)
@trwnh trwnh referenced this issue Aug 29, 2018

@kizu kizu commented Sep 3, 2018

I'd want a subset of Markdown on Mastodon. While I agree that it doesn't need to have all of its features, there are things that can be useful even in microblogging: bold, italic, inline code blocks, lists, quotes.

One reason they're good (which I haven't seen mentioned here) is accessibility: screen readers would read them with proper semantics.

And the subset mentioned above has perfect fallbacks to plaintext, so those who don't like it could have plaintext as an option without losing anything.

Implementation-wise, I'd say that it should be stored as is, as the implementation of a subset shouldn't be really hard.

Mastodon already makes paragraphs out of toots with line breaks, and making proper semantic lists, blockquotes and other lighter formatting would be very nice and accessible as well.

@Gargron Gargron added suggestion and removed enhancement labels Oct 20, 2018

Contributor

@wiktor-k wiktor-k commented Dec 19, 2018

Another thing worth linking here: XMPP clients like Conversations use a very limited subset of Markdown called Message Styling that covers most use cases (at least for me): code blocks and code spans, bold and italics.

Gargron added a commit that referenced this issue May 19, 2019
Fix #853
Gargron added a commit that referenced this issue May 21, 2019
Fix #853