New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Issue 9731: implement $(DDOC_PARAGRAPH) #5315
Conversation
#4338 is the previous take on this. |
The I prefer the use of the closing tag, it makes parsing easier (a fair chunk of code in my dom.d could be removed if all p's were closed consistently..) and it is required in xhtml but you're right that there's no need for it in html.... but still, the other PR kinda looks better to me. I gotta look closer. |
PS I actually came close to writing this myself today. It'd be really nice to have, even if imperfect. |
I think the other PR you referenced has better behavior and left some comments there... |
Although closing some tags may be "optional" in some contexts, it makes sensibly validating the HTML much more difficult. I've spent some effort fixing this recently (see my HTML fixes PRs), please don't undo my work. |
|
Thanks for reviewing. As I feared this is getting sidetracked into a discussion about |
BTW could please anyone double check the code? I'm not sure whether I've left some corner cases uncovered. FWIW: to the best of my knowledge this PR mimics what TeX does: the processor replaces runs of whitespace containing two or more newlines into one |
} | ||
// We got to a non-whitespace, time to insert the paragraph | ||
// break. | ||
static __gshared immutable ps = |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is there any need for all those qualifiers in new D code? You're just referring to a string literal, use ordinary immutable ps = "it";
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok
@adamdruppe: thanks, read your comments on #4338. I think #5315 is the right approach (just insert a paragraph separator). With #4338, I'm unclear on what happens e.g. if there's a macro call that has two newlines inside of it. Consider:
I wonder how this whole nesting could be properly wrapped in paragraphs. With the terminator there's no such issue - insert a paragraph separator wherever there's two or more newlines. |
We really should understand what HTML is when writing code to generate it. I'll grant that ddoc is going to be a leaky abstraction at some level, but let's try not to be gratuitously wrong just because we're too lazy to think through the semantics.
That's just totally wrong. What we're trying to do is to achieve a more perfect mapping to native, semantic paragraphs. |
Instead of a macro that is inserted in place of blank lines, how about a macro that wraps paragraphs (and is defined as Edit: Oh, that's what #4338 does. I think that's the better direction to pursue then. |
This problem applies to this PR as well, because we'll just be starting a paragraph in the middle of Some options:
Option 3 seems most practical... however, then it has the issue that given a list, the list items which contain 1 paragraph will have no So... #5315 (this PR) does have the advantage in consistency and ease of use, however we can't use it with |
@CyberShadow it's totally fine to have a paragraph break inside a macro expansion. @adamdruppe I think you're making this quite a bit more than it really is. |
Why do you think so? What if
|
I'm not sure just what the other PR's code would do, but given:
The output I would expect is (assuming the macros are defined to eponymous tags):
You'll notice that My preferred rule is: naked text that begins a section or immediately follows "Naked text" is defined as any text that is not in a code example section, not in a macro definition, and not entirely contained inside a macro itself (if naked text starts with This "naked text" might be determined before any macro expansion is done; perhaps the insertion of paragraph macros is done in a pre-processing stage on the raw text before other transformations and is run once and only once in the transformation process. I'm not sure that's necessary but it might be. A paragraph can never be empty or consist only of whitespace. Finally, let us not forget that For example:
I'm actually not 100% convinced of the |
@adamdruppe What about the case I mentioned, a list where one item has one paragraph and another item has two paragraphs? |
@CyberShadow : by "totally fine" I mean "it won't lead to unmatched parens". It will of course look like paragraphs do because that's what the user asked for! @adamdruppe: I understand what you're saying but the amount of analysis (parens on lines etc) required by that semantics is not matched by what you get from it. Again: this is what we need - collect 2 or more newlines into one paragraph separator. It's the simple, meaningful, motivated solution. I'll hop on IRC to address any questions. |
I agree. Unfortunately, HTML does not have a paragraph separator. We can invent one, but we'll be throwing semantics out the window. This might still be the best option though.
I'm not home (in the US) right now so can't come to IRC. |
I just realized too that I forgot to address the issue of macros with multiple arguments. Ugh. |
@CyberShadow: yes, that kind of DIV is what I had in mind. I'm not worried about Far as I can tell the formatting that one would add to the So after researching the matter I figured it's not worth trying to do some complicated processing that would support the use of |
On Tue, Dec 22, 2015 at 09:13:38AM -0800, Vladimir Panteleev wrote:
I don't see the post even scrolling up... But
would that cover your concern? |
@adamdruppe you need to show how to wrap general paragraphs that contain unbalanced parens. |
Thing is that it has to be, otherwise the spacing between list items will be inconsistent. |
OK, then I'm on board with this PR. However, the implementation seems to need more work. I can't judge the code but looking at the documentation tester's diffs, it looks like it's eating whitespace where it shouldn't. If I understand correctly, by itself this patch should not affect documentation output. |
Let's get an example:
Putting a block in the middle of a paragraph doesn't make sense anyway... it would by nature split the paragraph into two! But there's two solutions: the paragraph would probably end with the Or we could define the whole macro to be inside the one paragraph which means it would span those BTW
is technically broken html.... but it is good enough to parse... |
} | ||
if (c == '\n') | ||
{ | ||
iLineStart = scout + 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is probably what is breaking the autotester, since iLineStart is also used for the DDOC_BLANKLINE test up on line 2150: https://github.com/andralex/dmd/blob/DDOC_PARAGRAPH_SEPARATOR/src/doc.d#L2150
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
actually, playing with it, it is the buf.insert call that is hosing it. It advances i up beyond the blank lines.
yo i forked your fork and did a simpler implementation that is more correct in my little tests. God willing, it will pass the autotester. https://github.com/D-Programming-Language/dmd/pull/5316/files You should try it on your test cases too to make sure the paragraph separators actually go where you want them to. This is what I tried:
The output was the same before and after my patch. Kewl, the blank lines were all still there (that's what was missing in your version, the buf.insert skipped over them so they wouldn't be inserted with the old blankline macro). Now add some macro action:
Here's the code:
The I'm not gonna say this is correct across all possible cases, I opened my PR just to give you some code to try that hopefully saves you some time. I don't want to take ownership of this issue. |
@adamdruppe I think your PR is better. But I have trouble figuring how it doesn't emit one paragraph separator every two blank lines. Say you have 100 blank lines, it seems to me every 2 lines it'll emit one separator then reset the |
On Tue, Dec 22, 2015 at 05:39:31PM -0800, Andrei Alexandrescu wrote:
Since the output is at the default case, it won't write out a Though I think it might need to be done in some of the other cases |
And merging my PR automatically updated this one! I'd love to see PRs against PRs become a common thing. |
d571623
to
450e065
Compare
450e065
to
f4968e9
Compare
Alrighty, this looks like the ticket. |
thx @adamdruppe !! |
The auto tester seems to be complaining about whitespace again.... though I think the pull request I did didn't delete your old code. Lines 2175 to 2201 can be removed. Lines 2427 to 2438 are the new way that do it better. |
You can also look at the doc tester diffs, e.g. currently this patch is messing with code blocks' indentation. |
Yeah, it is doing that because lines 2175-2201 are still there. |
Well I did things a different way, which is simpler because it doesn't use nested loops. |
@@ -2301,100 +2330,103 @@ extern (C++) void highlightText(Scope* sc, Dsymbols* a, OutBuffer* buf, size_t o | |||
/* A line beginning with --- delimits a code section. | |||
* inCode tells us if it is start or end of a code section. | |||
*/ | |||
if (leadingBlank) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I found fit to remove this Ridiculously Long If Statement. 90 lines!
09d7a40
to
33bcb40
Compare
one more empty line to fix... brb... |
I got this to work but it's just too much for too little. I decided instead to go with #5344 and dlang/dlang.org#1186, which simplify things for everyone. |
This feature inserts
$(DDOC_PARAGRAPH_SEPARATOR)
whenever encountering two newlines or more in the source text.After some research I figured that this is the right behavior instead of wrapping each paragraph in a DDOC_PARAGRAPH macro. That approach does not work well or at all whenever nesting of macros is present, with two newlines in between. In contrast, a paragraph break is always insertable.
One interesting thing is the
<P>
tag is somewhat special in HTML, due to it being a "mistake" of sorts (it should be a self-closing tag, and it seems to not be due to historical accident). Anyhow HTML does not require</p>
so we should be fine with paragraph separators as introduced by this feature.Alternatively, instead of
<p>
we should insert a<div class="DDOC_PARAGRAPH_SEPARATOR></div>
and let the css choose. For now the macro is defined to nothing and is backwards compatible.