What should uploaded content look like on IRC channels? #258

Closed
Kegsay opened this Issue Oct 26, 2016 · 29 comments

7 participants
Contributor

Kegsay commented Oct 26, 2016

When a Matrix user uploads an image/file/etc to a room, the IRC bridge needs to convert that into text to send into the IRC channel. What should this text look like?

Related: When a Matrix user sends a block of text into a room, the IRC bridge "pastebins" the content by uploading it as a .txt file, which then is sent into the IRC channel. What should this text look like?

There seem to be two camps: people who want just the raw link, and people who want a wordier "so-and-so uploaded a thing". I am using this issue to build consensus.

See:

Collaborator

Mikaela commented Oct 26, 2016

I prefer having text too, as a random URL without context can look shady and be left unclicked.

jansol commented Oct 26, 2016

No IRC user I know sends anything other than the URL when pastebinning stuff. And for convenience reasons the entire message should fit on a single line in a standard 80-column terminal.

Of course here the pastebinning is not always intended so some explanation might be warranted. How "smart" do we want to be? Pastebinning triple-backtick-quoted text rarely needs an explanation, so triple backticks could just be replaced with URLs in most cases.

random URL without context can look shady

Well, it's a URL on your HS...

Collaborator

Mikaela commented Oct 26, 2016

Pastebins are generally easy to recognise as pastebins based on the URL (which doesn't apply to https://matrix.org/...), and while I know the URL belongs to my homeserver, does a random IRC user know what a homeserver is, or even what Matrix is other than a movie trilogy?

jansol commented Oct 26, 2016

I hope your random IRC user at least knows where they are telling their client to connect. Especially if they are the paranoid sort... It shouldn't be difficult to recognize the URL based on that.

Another problem with adding unnecessary text is that it'll mess up non-english channels.

I am personally fine with just a link, but maybe there's a third option? I know a lot of channels have bots that look up links that are pasted and provide some information about them. In one channel I'm in on Rizon, the channel bot will speak and provide some context about links posted, e.g. for a YouTube video:

ThatGeoGuy+> https://youtu.be/some-youtube-link
Ultrabot&> [URI 1191 by "ThatGeoGuy"] Title: My Awesome Video - Youtube
ThatGeoGuy+> https://thatgeoguy.ca/some-image.png
Ultrabot&> [URI 1192 by "ThatGeoGuy"] PNG Image file. Size: 9.39 Kio.

Personally I think this sort of feature is best left to bots on the IRC side, since it can be customized to do what the users in a channel want, and depending on how you program the bot you can have it post different things depending on the MIME type associated with the URI or not. Note that this gets annoying if you have two separate services / bots that provide this behaviour, which is why I think URLs should be left alone, and handled by bots IRC side, since there will be some variance from channel to channel whether or not this feature is useful or desired. Imagine what that would look like with the above example:

ThatGeoGuy[m]+> [PNG Image (9.39 Kio) by "ThatGeoGuy[m]"] https://matrix.org/some-hash-of-file.png
Ultrabot&> [URI 1193 by "ThatGeoGuy[m]"] PNG Image file. Size 9.39 Kio.

There's a lot of information there, and most of it is redundant. In my experience most IRC users don't want to see all of that, which is why IGNORE is often used on everything from joins, parts, etc so that there's a higher signal to noise ratio in the channel. Of course if you could /ignore matrixbot then sure, that'd be great, but because the bridge posts everything through a ghost user you don't really want people using IGNORE on you.

Just my 2 cents, I'm happy to clarify anything or provide more context if necessary.

Collaborator

Mikaela commented Oct 27, 2016

Most of the IRC bots I have encountered will just say Title: whatever (at domain.tld) (Supybot & derivatives), which would require the bridge to say what the title is so those bots can do anything. And that isn't possible with formats other than HTML, is it?

jansol commented Oct 27, 2016

In this case, since the bridge is actually the one who does the pastebinning AND puppets the bot, it knows a lot more about the link than 3rd party bots would. So no problem with the lack of metadata in the link itself? Also, 3rd party bots would also be bridged back to matrix, further clobbering the discussion and confusing the potato out of matrix users who have no clue what link the bot is referring to, since the matrix side doesn't see it.

Also, 3rd party bots would also be bridged back to matrix, further clobbering the discussion and confusing the potato out of matrix users who have no clue what link the bot is referring to, since the matrix side doesn't see it.

I actually didn't even consider this, but this is a good point to bring up. Maybe the point is to ask that bot-writers ignore URIs to Matrix, but I doubt that would get much traction (e.g. why should I have to re-write my bot because some user from Matrix joined my channel?). That said, if you upload a file and a URI bot describes a "link" that isn't seen Matrix-side, it should be pretty obvious what the bot is describing. The only confusion I can see here is if you accidentally trigger uploading to a pastebin-like service, in which case you'll wonder why the bot is describing your message.

TJuberg commented Nov 1, 2016

Pastebinning should at minimum be configurable so it can be disabled on a per room basis.
Personally I don't believe any text output should be automatically converted to a pastebin.

Contributor

Kegsay commented Nov 2, 2016

Pastebinning should at minimum be configurable so it can be disabled on a per room basis.

Pastebinning support isn't for convenience, it's to protect the bridge from getting k-lined due to flooding channels over a prolonged period. See the relevant issue for background on this. We do now use IPv6 so the impact would just be the client's connection rather than the entire bridge, which helps, but given how people can naively trigger this by putting code blocks into rooms, I'm not willing to make this configurable by users.

Back on topic, it looks like IRCCloud just puts the URL for images and the like, without any sugar-coating with "so-and-so uploaded..":

<daviddias> dignifiedquire: but keep the contribution
<dignifiedquire> ?
<dignifiedquire> what contribution?
<daviddias> https://usercontent.irccloud-cdn.com/file/zCDQ5H6Z/
<daviddias> seems like you did it already :) 

This may set some precedent if people are already familiar with how IRCCloud does things.

Contributor

Kegsay commented Nov 9, 2016

@Mikaela can you help me out here? I seem to recall that we had to change the IRC bridge from sending NOTICE to sending PRIVMSG because IRC users were complaining about it. Can you please remind me why IRC users dislike NOTICE for uploads? Something about IRC clients being dumb about where the text goes or something? I forget.

jansol commented Nov 9, 2016

IIRC it was because apparently some clients think it's a grand idea to ring the system bell every time a NOTICE is received.

Collaborator

Mikaela commented Nov 9, 2016

I hope you don't mind me pasting logs.

[16:22:41] <Yaniel> Mikaela: are we in agreement on #258 now?
[16:22:41] -Github[m]- https://github.com/matrix-org/matrix-appservice-irc/issues/258 : What should uploaded content look like on IRC channels?
[16:24:30] <Riotela> Yaniel: are you jansol? I agree that Matrix has more metadata than the bot and 3rd party bots bridging into Matrix clobbering and confusing.
[16:25:06] <Yaniel> but would the bridge running a separate bot to inform the IRC side of link metadata solve the problem?
[16:25:57] <Yaniel> and yeah
[16:26:26] <Riotela> Oh, I guess that could work, but what if the IRC channel again doesn't want a ghost user only saying something when Matrix users say something? And I understood that Matrix users aren't all on same server, so how does it behave in case of netsplit? Tell wrong side of the split the metadata it has?
[16:27:50] <Yaniel> oh, right
[16:28:31] <Yaniel> well, netsplits are rather unfortunate and problematic
[16:29:00] <Yaniel> and if the channel doesn't want the bot to hang around it could probably just send /notices without joining?
[16:29:45] <Riotela> It cannot sends /notice unless the channel is mode -n while freenode defaults to +cnt and so does ZNC and possibly other bigger IRC clients/bouncers
[16:31:50] <Yaniel> hmm actually, what about having the bridge send the notice as whoever did the upload
[16:32:10] <Yaniel> then people can ignore notices from said person if they want, while still seeing their normal msgs
[16:32:25] <Yaniel> OTOH they'd have to ignore notices from every matrix user separately
[16:36:33] <Riotela> I would support that, but it will make Matrix users target of harassment by users of stupid clients which developers haven't read the IRC RFC and AUDIBLE BEEP on every NOTICE they receive no matter from whom
[16:37:35] <kegan[m]> so ftr in the past we used to send file uploads as notices and basically got harassed due to ^ so we ended up making it actual messages
[16:38:04] <kegan[m]> so notices wouldn't really be acceptable - we tried it before
[16:38:26] <Yaniel> sad
[16:39:55] <kegan[m]> tell me about it -_-

4.4.2 Notice

      Command: NOTICE
   Parameters: <nickname> <text>

   The NOTICE message is used similarly to PRIVMSG.  The difference
   between NOTICE and PRIVMSG is that automatic replies must never be
   sent in response to a NOTICE message.  This rule applies to servers
   too - they must not send any error reply back to the client on
   receipt of a notice.  The object of this rule is to avoid loops
   between a client automatically sending something in response to
   something it received.  This is typically used by automatons (clients
   with either an AI or other interactive program controlling their
   actions) which are always seen to be replying lest they end up in a
   loop with another automaton.

   See PRIVMSG for more details on replies and examples.
Contributor

Kegsay commented Nov 9, 2016 edited

Right, that was it, thanks. So that rules out /me NOTICE style responses. It still leaves the question of what the message text should look like though. The choices seem to be:

  • Plain URL: https://matrix.org/content/download/whatever
  • Plain URL with metadata: https://matrix.org/content/download/whatever (floop.jpg,1457KB)
  • Wordy with URL: Uploaded an image: https://matrix.org/content/download/whatever
  • Wordy with URL with metadata: Uploaded an image: https://matrix.org/content/download/whatever (floop.jpg,1457KB)

And we need to decide for:

  • Image uploads
  • Video uploads
  • File uploads (e.g. .doc files)
  • Pastebin messages (long text)
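
To make the four candidates easier to compare side by side, here is a minimal Python sketch that produces each of them from the same inputs. The function name `format_upload` and its parameters are hypothetical and only illustrate the options listed above; this is not the bridge's actual code.

```python
def format_upload(url, filename=None, size_kb=None, kind=None):
    """Build one of the four candidate texts for an uploaded file.

    kind (e.g. "an image") switches on the wordy prefix;
    filename together with size_kb switches on the trailing metadata.
    All names here are illustrative assumptions.
    """
    text = url
    if kind is not None:
        # "Wordy" variants prefix the URL with a short description.
        text = f"Uploaded {kind}: {text}"
    if filename is not None and size_kb is not None:
        # "Metadata" variants append the file name and size in KB.
        text = f"{text} ({filename},{size_kb}KB)"
    return text
```

Calling it with only a URL gives the plain form, and passing `filename`, `size_kb` and `kind` together gives the wordiest form, so the same helper could be reused for image, video and file uploads by changing `kind`.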
Collaborator

Mikaela commented Nov 9, 2016

I would go with Plain URL with metadata, as the metadata can make the URL less weird and threatening even if someone didn't trust the metadata. It wouldn't take much space and would also work for channels that don't have those bots.

@Kegsay: Hm, not sure it does rule out "/me style responses" - is there a particular reason that ACTION is unacceptable here? In particular, most phrasings I've seen are, in fact, actions.

Contributor

Kegsay commented Nov 9, 2016

@eternaleye My bad, wrong words.

Contributor

Kegsay commented Nov 24, 2016

So are people generally happy with:

/me uploaded an image|file: $url

What about pastebinning long messages?

msackman commented Nov 26, 2016 edited

If it's plain text content, please just pass it straight through to IRC. IRC users do not expect any fancy formatting or highlighting and will be annoyed by having to copy and paste out a URL. People who choose to use IRC are perfectly happy to read non-syntax-highlighted code if it's pasted in. If it's some sort of richer media, then sure, do something behind a URL and come up with something sensible to add to the IRC channel (I would very much be in favour of whatever is pasted including the MIME type of the content behind the URL).

Just read #258 (comment) so that rather rules out the above sadly. In which case, please make it clear to the sender that a multiline message will end up behind a URL, which will likely annoy IRC users.

eternaleye commented Nov 26, 2016 edited

@msackman: Not viable. Matrix supports multi-line messages, which will get users disconnected for flooding on IRC. There needs to be some manner of resolving the impedance mismatch, and pastebinning long messages is what has been chosen.

EDIT: Ah, missed your edit.

@eternaleye Having read more about this issue, I understand that. My thought then is that when someone is sending a message, if it's known it's going to an IRC channel and it's known that the message is going to have to be replaced by a URL, then the user sending the message should be warned about this.

eternaleye commented Nov 27, 2016 edited

@msackman: That's also nonviable. Consider the case of a bridged room, with a Matrix user connected via federation. That is:

User <-> HS1 <-> HS2 <-> Bridge <-> IRC

The user's message may have been sent, accepted, and federated a significant amount of time before (though 30s suffices for explanation).

The bridge cannot prevent the user from submitting it to HS1, nor HS1 federating it to HS2. At that point, it has four choices:

  1. Bridge the message as-is. Risk flooding. Die.
  2. Suppress the message. Lose data. Incur anger.
  3. Temp-suppress the message, somehow(?) prompt the user, who may no longer be online, to approve the message being pastebinned, or allow it to be fully suppressed. May arbitrarily delay the message, until the user notices.
  4. Just pastebin it.

Of the four, 1 and 2 are completely nonviable, 3 is nonviable in realistic circumstances around federation (and is a huge pain for users), and 4 is the current approach.

In addition, clients may very well not be aware that a room is bridged - any client side approach is thus also nonviable, as well as significantly increasing the load on client implementors (something Matrix works quite hard to minimize)
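
The decision the bridge is left with (option 4 versus bridging as-is) can be sketched roughly as follows. This is only an illustration in Python: the threshold `MAX_LINES`, the function `bridge_text` and the `upload` callback are assumptions made for the example, not the bridge's real settings or API.

```python
MAX_LINES = 3  # hypothetical threshold, not the bridge's real setting


def bridge_text(body, upload, max_lines=MAX_LINES):
    """Return the list of lines to send to IRC for one Matrix text message.

    upload is a caller-supplied function that stores the text somewhere
    and returns a URL for it (an assumption for this sketch).
    """
    # Blank lines are dropped before counting, since they carry no content on IRC.
    lines = [line for line in body.splitlines() if line.strip()]
    if len(lines) <= max_lines:
        # Short enough to send line by line without risking a flood.
        return lines
    # Option 4: replace the whole message with a single pastebin URL.
    return [upload(body)]
```

Because the check runs on the bridge after the message has already been federated, nothing in this sketch needs the sending client to know that the room is bridged.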

@eternaleye Thank you for the explanation. From a technical pov, I can appreciate the issue. I just hope you do take into account the fact that IRC users are stakeholders in this issue too. Currently I'm in the situation where members of some of my channels are using matrix.org because it allows them to get through company firewalls, so the fact they can be there at all is great. However, when sharing code, the fact that it uses a non-standard pastebin, does not do any syntax highlighting (and so offers no advantage to IRC users), and so just hides the plain content behind a URL is frustrating. Especially as the matrix users don't realise what's happening, so at least half the room is quite confused.

@msackman: Thank you for the additional information on the issues - one thing that might improve the situation is that there are ways in which the pastebin itself could be improved to handle the problems you describe.

For one, Matrix messages are rich-text, and the pastebin for long messages taking advantage of that (and syntax-highlighting code elements) is something that may very well be worth doing. I'll submit a separate ticket for that.

As for it being a nonstandard pastebin, there's a tradeoff there - specifically, it uses the file-upload API of the Matrix homeserver itself, rather than any dedicated pastebin service. Relying on a pastebin service would be an additional dependency (and a new API to support), while extending file-upload to have a pastebin UI would be a significant effort (and bloat an API into a UI).

On the topic of hiding it behind a plain URL... well, that's what the discussion here is about: How to better structure the actual message sent to IRC, so the behavior is clear to users.

Contributor

Kegsay commented Nov 28, 2016

@eternaleye: I agree with everything you've said. What do you think we should do for pastebinning long messages? Do you think /me style emotes for uploads are sensible? I'd like to resolve this sooner rather than later, rather than let this issue fester.

Contributor

Kegsay commented Jan 5, 2017

Well no one has said anything to the contrary, so /me style it is.
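
For reference, a /me message on IRC is a PRIVMSG whose text is wrapped as a CTCP ACTION, so it avoids the NOTICE problems discussed earlier. A minimal Python sketch of the raw line follows; the helper names `action_line` and `upload_action` are hypothetical, and the bridge's real implementation may build the message differently.

```python
def action_line(target, text):
    """Return the raw IRC line for a /me-style message to a channel or nick."""
    # CTCP ACTION: the payload sits between two \x01 bytes inside a PRIVMSG.
    return f"PRIVMSG {target} :\x01ACTION {text}\x01\r\n"


def upload_action(target, kind, url):
    """Build the agreed "/me uploaded an image|file: $url" text.

    kind is expected to be "an image" or "a file".
    """
    return action_line(target, f"uploaded {kind}: {url}")
```

IRC clients typically render such a line as `* nick uploaded an image: ...`.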

Contributor

Kegsay commented Mar 23, 2017

Totally happened too.

@Kegsay Kegsay closed this Mar 23, 2017

Collaborator

Mikaela commented Mar 24, 2017

I found an old todo about this which I forgot to comment on earlier.

The current format is: * Michaela uploaded an image: screenshot.9.jpg (71KB) - https://matrix.org/_matrix/media/v1/download/disroot.org/mtewvcPGqPsWmYCQmZKEhdQW

And while this isn't an issue with the IRC bridge, being /me TeleIRC converts it to

Would it be possible to enclose the link in <>? That might also be clearer for IRC users who have been confused by Matrix-uploaded content.

  • This would make the format * Michaela: uploaded an image: screenshot.9.jpg (71KB) - <https://matrix.org/_matrix/media/v1/download/disroot.org/mtewvcPGqPsWmYCQmZKEhdQW>.
    • But could the dash be dropped in this case, resulting in * Michaela: uploaded an image: screenshot.9.jpg (71KB) <https://matrix.org/_matrix/media/v1/download/disroot.org/mtewvcPGqPsWmYCQmZKEhdQW>?
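
The variants under discussion differ only in the separator and in whether the link is bracketed, which a small Python sketch can make concrete. The function `upload_text` and its flags are hypothetical; it builds only the action body (the part after the nick) and takes the size in KB as given rather than guessing how the bridge computes it.

```python
def upload_text(kind, filename, size_kb, url, brackets=False, dash=True):
    """Build the body of the upload action in the current or proposed format."""
    # Proposed change: wrap the link in <> so it stands out from the text.
    link = f"<{url}>" if brackets else url
    # Follow-up proposal: drop the dash once the brackets delimit the link.
    sep = " - " if dash else " "
    return f"uploaded {kind}: {filename} ({size_kb}KB){sep}{link}"
```

With the defaults it reproduces the current body; `brackets=True` gives the first proposal, and `brackets=True, dash=False` gives the second.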
Contributor

Kegsay commented Mar 24, 2017

It's a shame that the * messes it all up 😞

As an aside, please open a new issue for this and xref this issue, thanks! (This makes it easier to see what actions need to be done without having to wade through historical comments).
