-
-
Notifications
You must be signed in to change notification settings - Fork 7.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
composebox_typeahead: Avoid generating broken links. #30071
base: main
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have made a preliminary implementation and highlighted the reasons behind some technical choices.
Once we decide the correct logic to detect faulty topic names, I'll refactor the functions and put them in a sensible place.
web/src/composebox_typeahead.js
Outdated
@@ -995,8 +997,29 @@ export function content_typeahead_selected(item, query, input_element, event) { | |||
// Stream + topic mention typeahead; close the stream+topic mention syntax | |||
// with the topic and the final **. Note that token.length can be 0 | |||
// if we are completing from `**streamname>`. | |||
function will_produce_broken_link(topic) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Currently, I have considered only two cases which produced buggy links:
- When there are two or more backticks
- When the topic name begins or ends with asterisks.
There can still be topic names which cause broken links. I had earlier tried to parse the syntax text using the frontend markdown parser and see if the link in the parsed html points to the correct stream and topic. But double-backtick topic names gave different urls through frontend and backend markdown processors (in fact, the frontend parser correctly parsed the double-backtick topic name, while the backend processor generated an incorrect url).
This means that checking if the frontend markdown processor parses the syntax correctly gives no guarantee about the correctness of the link in the final message. So I thought the best option would be to check for the individual reported cases which produce broken urls.
I infer from CZO discussions that we will eventually move to a different markdown processor, and this will no longer be an issue then.
@timabbott Do you want to take a look and provide some feedback here? |
07cab3f
to
dbcdfa0
Compare
web/src/composebox_typeahead.ts
Outdated
beginning.slice(0, syntax_start_index) + | ||
url_syntax(get_stream_name(beginning), item.topic) + | ||
" "; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you extract this logic to a function and/or add unit tests for it? It's complicated enough that it's very much worth having a test for.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have extracted the logic and added some unit tests.
dbcdfa0
to
0a026be
Compare
0a026be
to
ff0ba14
Compare
@timabbott Addressed the suggestion. Please take a look. |
ff0ba14
to
7455824
Compare
); | ||
assert.equal( | ||
ct.stream_topic_link_syntax("#**stream>t", "*asterisk"), | ||
"[#stream>*asterisk](#narrow/stream/-1-unknown/topic/*asterisk) ", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why do all of these have -1
as the stream ID in the URLs? The first couple don't, which makes me think you've got some sort of weird bug in either one test case or the other.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
stream ids are only required when we generate the markdown link syntax (since we need to generate the url ourselves).
That is why the first two didn't need them.
web/src/composebox_typeahead.ts
Outdated
@@ -1180,6 +1185,31 @@ export function content_typeahead_selected( | |||
return beginning + rest; | |||
} | |||
|
|||
function will_produce_broken_stream_topic_link(topic_name: string): boolean { | |||
return /(\*+)|(.*`.*`.*)/.test(topic_name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The links for topic names with double backticks are now generated correctly, but the double backticks cause the enclosed part of the topic name to render as an inline code block.
I don't know what to do in this case since backslash escaping is disabled.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think there's no great answer for that at present. This is #18202.
As long as the link works correctly, though, that's a much less severe issue than #19873 is.
A possible workaround is to use HTML character references, as suggested here: #18202 (comment)
I'd definitely leave that as a separate later commit, though.
a3c5613
to
8c243b1
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @kuv2707! This is an old issue people have repeatedly run into, and I'm happy to see progress being made on it.
web/src/composebox_typeahead.ts
Outdated
@@ -1180,6 +1185,31 @@ export function content_typeahead_selected( | |||
return beginning + rest; | |||
} | |||
|
|||
function will_produce_broken_stream_topic_link(topic_name: string): boolean { | |||
return /(\*+)|(.*`.*`.*)/.test(topic_name); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think there's no great answer for that at present. This is #18202.
As long as the link works correctly, though, that's a much less severe issue than #19873 is.
A possible workaround is to use HTML character references, as suggested here: #18202 (comment)
I'd definitely leave that as a separate later commit, though.
7704b04
to
d15f3fd
Compare
@alya Does this PR have to go through the buddy and mentor review stages? (Since it's already reviewed by maintainers once) |
I think it's OK to skip those at this stage. Has all the feedback above been addressed? |
d15f3fd
to
cbc52bb
Compare
@alya I have addresed all the feedback. Changes:
I have updated the PR description with the screenshots. I've initiated a discussion about some decisions regarding corner cases here |
web/src/composebox_typeahead.ts
Outdated
const end = syntax.lastIndexOf(">"); | ||
return syntax.slice(start + 3, end); | ||
} | ||
export function replace_markdown_characters(text: string): string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You should have a blank line between functions.
Also, usually it's better to add new functions above the one function that uses them.
I also think a better name would be something like html_escape_markdown_syntax_characters
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, done
web/src/composebox_typeahead.ts
Outdated
) { | ||
return url_syntax(stream_name, topic_name); | ||
} | ||
return `#**${stream_name}>${topic_name}** `; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is there a trailing space here? Seems like it should be added by the caller if desired.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I noticed in the module that a trailing space was always put after inserting any sort of syntax, so I included it here itself.
Moved the spaces to the caller.
web/src/composebox_typeahead.ts
Outdated
return invalid_stream_topic_regex.test(word); | ||
} | ||
|
||
function get_stream_name(syntax: string): string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This function name is too generic; something like get_stream_name_from_topic_link_syntax
would be more appropriate, though I've not checked exactly how it is used.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This name matches what it does.
Changed.
The #**stream>topic** syntax generates broken links for topics containing two backticks or ending with *, because of architectural flaws in the backend markdown processor. So we avoid generating the syntax for such topics and instead generate the normal link syntax in markdown. Fixes zulip#19873
cbc52bb
to
a436ce8
Compare
@timabbott I have addressed the feedback. |
The
#**stream>topic**
syntax generates broken links for topics containing two backticks or starting/ending with *, because of architectural flaws in the backend markdown processor. So we avoid generating the syntax for such topics and instead generate the normal link syntax in markdown.Fixes #19873
CZO discussion
Screenshots and Screen Captures
Self-review checklist
(variable names, code reuse, readability, etc.).
Communicate decisions, questions, and potential concerns.
Individual commits are ready for review (see commit discipline).
Completed manual review and testing of the following: