mdast-util-split: start of work #315

artragis · 2019-02-21T12:45:47Z

Context

this PR aims to add a new mdast-util project to prepare some work on zmarkdown that will try to import a flat markdown text to a fully hierarchized tutorial/article.

For that I need an util to traverse all headings and extract "part" from what follow those headings.

What is done

the basic documentation
the canonical case (presented in doc & in tests)

What remains before merging

test of all configuration (for example I'm not sure my first commit properly handles the conclusion parameter)
test edge cases : what if a sub header is inside a quote? At the beginning I was using unist-util-find but as it finds everything in depth I changed the logic to a custom function that I haven't fully tested yet
test edge cases : what if there is no header? don't forget to mention mdast-util-normalize
implement the "sub header parsing" create a more splitted document (example with
part > subpart > chapter)

Why doing such a PR?

I'd like to have some comment about the functionality
Is it a good way to perform what I want?
Is my documentation good enough?
Have you ideas of edge cases?
Have you comment on my code style?
Hound please help me improve code quality.

packages/mdast-util-split/src/index.js

packages/mdast-util-split/__tests__/index.js

vhf · 2019-02-21T18:45:59Z

Nice! I'll do a few rounds of reviews, the first one will only be obvious things, later on I might PR against your branch to fix trickier things. [edit]cf. artragis#7 [/edit]

I'm not sure about the naming of this package. mdast-util-split sounds very generic while this package seems to solve a specific thing.

vhf

Please add tests for other depths and test a file with no heading.

vhf · 2019-02-21T18:46:23Z

packages/mdast-util-split/README.md

+# mdast-util-split [![Build Status][build-badge]][build-status] [![Coverage Status][coverage-badge]][coverage-status]
+
+
+**rebber** is a LaTeX stringifier for [remark][]


Suggested change

**rebber** is a LaTeX stringifier for [remark][]

vhf · 2019-02-21T18:46:59Z

packages/mdast-util-split/src/index.js

+module.exports = split
+import visit from 'unist-util-visit'
+
+function split (tree, {splitDepth = 1,


splitDepth is not used.

not yet implemented

Now it's implemented, but I'm not happy at all with my namings, I always mix "trees" and "children" and it's quite complex when splitDepth is greater than 1.

It's confusing I agree. I think this.subTrees actually refers to new root trees, not subtrees?

Try drawing a simple schema of the output for a simple md doc at various splitDepth, it should make it easier to see what is a root tree, what is a subtree (i.e. a subpart of mdast root tree)?

the introduction # chapter 1 a paragraph > a quote to *ensure this is parsed* ## section 1.1 content 1.1 ## subsection 1.1.1 content 1.1.1 ## section 1.2 content 1.2 # chapter 2 ## subsection 2.1.1 content 2.1.1 # the conclusion paragraph

For instance what is expected here with splitDepth 1 and splitDepth 2?

My guess for splitDepth 1:

{ introduction: {type: root, children: [/* from doc root to `chapter 1` (not included) */]}, trees: [ {type: root, children: [/* from `chapter 1` to `chapter 2` (not included) */]}, {type: root, children: [/* from `chapter 2` to `conclusion` (not included) */]}, ], conclusion: {type: root, children: [/* from `chapter 2` (not included) to document end */]}, }

For this one, it is what I want.

In fact, not, the conclusion must be from "conclusion" (not included)

I guess this is what you mean:

{ introduction: {type: root, children: [/* from doc root to `chapter 1` (not included) */]}, trees: [ {type: root, children: [/* from `chapter 1` to `chapter 2` (not included) */]}, {type: root, children: [/* from `chapter 2` to `conclusion` (not included) */]}, ], conclusion: {type: root, children: [/* from `conclusion` (not included) to document end */]}, }

What about splitDepth 2 for this document?

this is what I mean. For splitDepth 2 here it what i would like :

{ introduction: {type: root, children: [/* from doc root to `chapter 1` (not included) */]}, title: { type: root, children: [/*the heading node*/], trees: [ {type: root, title: {type: 'root', children:[/*chapter 1 heading*/]}, trees: [/** trees for section1.1=>section 1.2], conclusion: {type: root, children:[/* content of section 1.2, but heading discarded*/]}, {type: root, children: [/* from `chapter 2` to `conclusion` (not included) */]}, ], conclusion: {type: root, children: [/* from `conclusion` (not included) to document end */]}, }

Why does chapter 1 have a conclusion but no introduction? Could you please explain how all of this works in the README?

packages/mdast-util-split/src/index.js

vhf · 2019-02-21T18:49:13Z

packages/mdast-util-split/src/index.js

+    })
+  }
+  extractIntroductions () {
+    const self = this


No need for this trick

How can I do it then? if I use this in the lambda, this is undifined
.

No, this will be the object, not undefined.

packages/mdast-util-split/src/index.js

vhf · 2019-02-22T18:10:14Z

packages/mdast-util-split/README.md

+A boolean driving the way we want to process introduction :
+- if `true`, all MDAST elements between the split header and the first sub header will be extracted and put in the `introduction` property of the result
+property of the returned object
+- if `false` they will just be added as MDAST elements of the `children` property.


I don't understand this line, could you please clarify?

if the property set to false, we do not extract the introduction from the tree. In though I had changed like for the conclusion in if false, the part tree will not be changed

vhf · 2019-02-22T18:10:29Z

packages/mdast-util-split/README.md

+property of the returned object
+- if `false` they will just be added as MDAST elements of the `children` property.
+
+#### `options.conclusionAsProperty`


Suggested change

#### `options.conclusionAsProperty`

#### `options.conclusionAsProperty = false`

vhf · 2019-02-22T18:10:41Z

packages/mdast-util-split/README.md

+
+A boolean driving the way we want to process the last sub header if there are more than one.
+- if `true`, the last sub header is removed from tree and all following elements are stored in the `conclusion` property of the result
+- if `false` (default), nothing is changed.


Suggested change

- if `false` (default), nothing is changed.

- if `false`, nothing is changed.

vhf · 2019-02-22T18:14:43Z

packages/mdast-util-split/src/index.js

+    this.depth = depth
+    this._introduction = newRootTree()
+  }
+  set introduction (nodes) {


Using a getter/setter here makes it harder to understand IMO but I might be failing to see the benefits it brings.

_introduction is already initialized in the constructor, and later on (for instance l.120) when we assign to introduction it's hard to tell whether it enter the setter or not. I would refactor to not use these.

I think I wanted to use a "simple" get/set at the beginning, failed and instead of reworking the whole code kept the get/set. if you want to remove this, you clearly can, it's an antipattern now.

Good, go ahead then

vhf · 2019-02-22T18:14:59Z

packages/mdast-util-split/src/index.js

+  }
+  extractConclusions () {
+    this.subTrees.forEach(splittedPart => {
+      const firstHeading = find(splittedPart.children, {


splitPart, not splitted

vhf · 2019-02-22T18:15:41Z

packages/mdast-util-split/src/index.js

+      splittedPart.conclusion = newRootTree(rootContent)
+    })
+  }
+  extractIntroductions () {


In which cases do we have several introductions?

when all chapters have introductions.

Do we still extract it when only one chapter has one?

we extract it for every chapter that has one introduction.

artragis · 2019-03-04T11:59:38Z

Thanks to your first review and PR I could have a better view on my work and I was wondering if I was not trying to create a too complex thing.

An idea that showed up was to say :

just extract one level with two parametters :

which depth is considered as "root" depth for titles, default to 1
do we extract the last paragraph as a conclusion?

the result would just be :

{
    introduction: {type: 'root', children: [everything before first heading]},
    chapters: [all trees],
    conclusion: /* nothing if extractConclusion is false, else the last tree of chapters*/
}

thanks to that we can use all the existing utils on subtrees if we want a deeper splitting or a specific treatment.

vhf · 2019-03-04T19:22:41Z

Nice! I agree, the initial code was quite complicated and the purpose unclear, it's good we now see things more clearly.

{
    introduction: {type: 'root', children: [everything before first heading]},
    chapters: [all trees],
    conclusion: /* nothing if extractConclusion is false, else the last tree of chapters*/
}

If I understand correctly, when extractConclusion is true conclusion contains chapters.pop(). If that's correct I wouldn't bother, it's not worth it. If anyone wants the last chapter it's easy enough to get it, no need for an extra config option.

Other than that I like the new idea. Mapping the splitter to its result (with depth+1) is good enough when a deeper split is required.

artragis · 2019-03-04T20:46:33Z

If I understand correctly, when extractConclusion is true conclusion contains chapters.pop(). If that's correct I wouldn't bother, it's not worth it. If anyone wants the last chapter it's easy enough to get it, no need for an extra config option.

in fact I would say newTree(chapters.pop()) but you are probably right, it's perhaps too much.

Mapping the splitter to its result (with depth+1) is good enough when a deeper split is required.

perhaps you were rephrasing my message but I do not get it here. What do you mean by "mapping the splitter to its result"?

vhf · 2019-03-05T16:48:13Z

const splitAtDepth = (depth) => (tree) => split(tree, { depth })

const level1 = splitAtDepth(1)(doc)
const level2 = level1.chapters.map(splitAtDepth(2))

packages/mdast-util-split/src/index.js

vhf · 2019-04-05T14:55:32Z

I'm not sure about the naming of this package. mdast-util-split sounds very generic while this package seems to solve a specific thing.

What about mdast-util-split-by-heading?

artragis · 2019-04-06T12:42:22Z

About module name : I do not know what is better and having a look at unified documentation I saw you're part of the kind-of-board of unified/remark ecosystem so I think you can put the name you want without negociating it will be the best thing for remark ecosystem.

houndci-bot reviewed Feb 21, 2019

View reviewed changes

artragis force-pushed the add_split_utils branch from 961dfef to 6ea2466 Compare February 21, 2019 13:00

vhf reviewed Feb 21, 2019

View reviewed changes

vhf reviewed Feb 22, 2019

View reviewed changes

fdambrine and others added 6 commits March 6, 2019 12:52

mdast-util-split: start of work

e181d53

Add tests, implement in depth split

9f0c78c

mdast-util-split: code review

85eea3b

mdast-util-split: refactor root tree generation

1b04e12

mdast-util-split: readme first review

2d76dba

remark-util-split: simplify the goal

1368878

artragis force-pushed the add_split_utils branch from beb8f63 to 1368878 Compare March 6, 2019 11:53

houndci-bot reviewed Mar 6, 2019

View reviewed changes

packages/mdast-util-split/src/index.js Outdated Show resolved Hide resolved

mdast-util-split: fix depth-other-than-one bug

33c487c

artragis force-pushed the add_split_utils branch from 11e7593 to 33c487c Compare March 6, 2019 12:09

vhf and others added 3 commits March 18, 2019 12:59

mdast-util-split: remove extract conclusion tests

b269798

mdast-util-split: refactor to get simpler code

99faf10

mdast-util-split: update documentation

469214a

vhf mentioned this pull request Apr 14, 2019

mdast-util-split-by-heading #322

Merged

vhf closed this Apr 14, 2019

artragis deleted the add_split_utils branch December 19, 2020 11:19

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

mdast-util-split: start of work #315

mdast-util-split: start of work #315

artragis commented Feb 21, 2019 •

edited

vhf commented Feb 21, 2019 •

edited

vhf left a comment

vhf Feb 21, 2019

vhf Feb 21, 2019

artragis Feb 21, 2019

artragis Feb 22, 2019

vhf Feb 22, 2019

vhf Feb 22, 2019 •

edited

artragis Feb 22, 2019

artragis Feb 22, 2019

vhf Feb 22, 2019

artragis Feb 22, 2019

vhf Feb 22, 2019

vhf Feb 21, 2019

artragis Feb 22, 2019 •

edited

vhf Feb 22, 2019

vhf Feb 22, 2019

artragis Feb 22, 2019

vhf Feb 22, 2019

vhf Feb 22, 2019

vhf Feb 22, 2019

artragis Feb 22, 2019

vhf Feb 22, 2019

vhf Feb 22, 2019

vhf Feb 22, 2019

artragis Feb 22, 2019

vhf Feb 22, 2019

artragis Feb 22, 2019

artragis commented Mar 4, 2019

vhf commented Mar 4, 2019 •

edited

artragis commented Mar 4, 2019

vhf commented Mar 5, 2019

vhf commented Apr 5, 2019

artragis commented Apr 6, 2019

		# mdast-util-split [![Build Status][build-badge]][build-status] [![Coverage Status][coverage-badge]][coverage-status]


		rebber is a LaTeX stringifier for [remark][]

	#### `options.conclusionAsProperty`
	#### `options.conclusionAsProperty = false`

	- if `false` (default), nothing is changed.
	- if `false`, nothing is changed.

mdast-util-split: start of work #315

mdast-util-split: start of work #315

Conversation

artragis commented Feb 21, 2019 • edited

Context

What is done

What remains before merging

Why doing such a PR?

vhf commented Feb 21, 2019 • edited

vhf left a comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

vhf Feb 22, 2019 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

artragis Feb 22, 2019 • edited

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

Choose a reason for hiding this comment

artragis commented Mar 4, 2019

vhf commented Mar 4, 2019 • edited

artragis commented Mar 4, 2019

vhf commented Mar 5, 2019

vhf commented Apr 5, 2019

artragis commented Apr 6, 2019

artragis commented Feb 21, 2019 •

edited

vhf commented Feb 21, 2019 •

edited

vhf Feb 22, 2019 •

edited

artragis Feb 22, 2019 •

edited

vhf commented Mar 4, 2019 •

edited