Skip to content

Parser: Introduce <p>-as-text-block and soup block#1133

Closed
dmsnell wants to merge 1 commit intomasterfrom
parser/pea-gazpacho
Closed

Parser: Introduce <p>-as-text-block and soup block#1133
dmsnell wants to merge 1 commit intomasterfrom
parser/pea-gazpacho

Conversation

@dmsnell
Copy link
Member

@dmsnell dmsnell commented Jun 11, 2017

Basic text blocks don't need much indication of what they are. In this
patch we augment the parser to treat a top-level <p> tag as a text
block (if it has a matching closing </p>).

Further, anything that isn't one of these or a well-formed block is
going into a new freeform block.

Finally, to support this work I have allowed for the parser to ignore
interblock characters since they shouldn't indicate separate blocks.

@dmsnell dmsnell added the [Feature] Parsing Related to efforts to improving the parsing of a string of data and converting it into a different f label Jun 11, 2017
@dmsnell dmsnell force-pushed the parser/pea-gazpacho branch 3 times, most recently from 00771b1 to 813245b Compare June 11, 2017 14:45
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually no they weren't included. I'm not sure what \s actually gave us. I have updated with the proper definition of "whitespace" from JS though in 8fde722. we may want to throttle back on this too since it includes things like Unicode whitespace characters as well

post-content.js Outdated
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That's an interesting addition, because I'm not sure if we should accept a text block with a code block in it. That's a job for validation I guess. cc @nylen.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(Not saying no <code>, but no <pre><code>.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well I don't see how we can prevent it if we have something like TinyMCE in the picture of any kind of "do whatever you want" block. personally I'm wondering if it'a just a bad idea to break out of the block boundaries at all.

@ellatrix
Copy link
Member

Works great!

@youknowriad
Copy link
Contributor

I'm a bit concerned about introducing this special case in the grammar itself, I'd much rather opt for describing what legacy content a block can handle in the block API itself. See #590

Alos a good question is whether we should validate the content of the p or not to check for only text and formatting markup.

@dmsnell dmsnell force-pushed the parser/pea-gazpacho branch from 76d5a0c to 4eef37c Compare June 12, 2017 16:26
@dmsnell
Copy link
Member Author

dmsnell commented Jun 12, 2017

I messed up a rebase without realizing it. If you have the old version of this branch could you force push it? Thanks! never mind, git reflog to the rescue

@youknowriad I also have my reservations about what it implies when we strip out the block structure and make implicit assumptions about the other content. This PR was built on request though to explore the idea.

@dmsnell dmsnell force-pushed the parser/pea-gazpacho branch 3 times, most recently from f405727 to 60253da Compare June 12, 2017 17:32
@nylen
Copy link
Member

nylen commented Jun 12, 2017

I'm a bit concerned about introducing this special case in the grammar itself, I'd much rather opt for describing what legacy content a block can handle in the block API itself. See #590

+1, while the grammar should parse and return a full tree of HTML nodes, deciding what to do with them should not be the responsibilty of the parsing layer

Also a good question is whether we should validate the content of the p or not to check for only text and formatting markup.

+1, when building software, you should validate your inputs... though this also should not be the responsibility of the parsing layer.

@dmsnell dmsnell force-pushed the parser/pea-gazpacho branch 7 times, most recently from 751d7f7 to a267276 Compare June 16, 2017 13:36
Basic text blocks don't need much indication of what they are. In this
patch we augment the parser to treat a top-level <p> tag as a text
block (if it has a matching closing </p>).

Further, anything that isn't one of these or a well-formed block is
going into a new freeform block.

Finally, to support this work I have allowed for the parser to ignore
interblock characters since they shouldn't indicate separate blocks.
@dmsnell
Copy link
Member Author

dmsnell commented Jul 4, 2017

Closing because several of us don't think this is a good idea.

@dmsnell dmsnell closed this Jul 4, 2017
@dmsnell dmsnell deleted the parser/pea-gazpacho branch July 4, 2017 15:47
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

[Feature] Parsing Related to efforts to improving the parsing of a string of data and converting it into a different f

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants